MapReduce

From NoSQLZoo

Revision as of 19:50, 2 August 2016

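# Encoding: re-wrap stdout so that output is written as UTF-16
import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-16')
# MongoDB: connect with credentials for the progzoo database
# (Database.authenticate() was removed in PyMongo 4; MongoClient takes the credentials)
from pymongo import MongoClient
client = MongoClient(username='scott', password='tiger', authSource='progzoo')
db = client['progzoo']
# Pretty-printing helper for query results
import pprint
pp = pprint.PrettyPrinter(indent=4)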

Introducing the MapReduce function

The MapReduce function is an aggregate function that consists of two functions: Map and Reduce.

The map is always performed before the reduce.

The map function examines every document in the collection and emits (key,value) pairs.

The map function takes no arguments; the current document is available as this.

The reduce function takes two inputs: a key and a list of values. For every distinct key emitted by map, reduce is called with the list of the corresponding values.
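
The flow described above can be sketched in plain JavaScript, without a database. This is only an illustration: simulateMapReduce, the module-level emit and the sample documents are hypothetical names made up for this sketch, not part of MongoDB. The real server also sorts keys and skips reduce for keys that have a single value, which this sketch does not do.

```javascript
// Hypothetical in-memory stand-in for mapReduce; not MongoDB's implementation.
let emitted = new Map(); // key -> list of values emitted for that key

// Plays the role of MongoDB's emit(): file a value under its key.
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}

function simulateMapReduce(docs, map, reduce) {
  emitted = new Map();
  // Map phase: call map once per document, with the document bound to `this`.
  for (const doc of docs) map.call(doc);
  // Reduce phase: one call per distinct key, with all values for that key.
  const results = [];
  for (const [key, values] of emitted) {
    results.push({ _id: key, value: reduce(key, values) });
  }
  return results;
}

// Made-up sample documents, shaped like those in the world collection.
const sampleDocs = [
  { name: 'France', continent: 'Europe' },
  { name: 'Germany', continent: 'Europe' },
  { name: 'Japan', continent: 'Asia' },
];

const counts = simulateMapReduce(
  sampleDocs,
  function () { emit(this.continent, 1); },
  function (key, values) { return values.reduce((a, b) => a + b, 0); }
);
console.log(counts);
```

The map function has to be an ordinary function rather than an arrow function, because it relies on this being bound to the current document.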

Emit all continents

This example returns the number of countries in each continent.

db.world.mapReduce(
  // map: emit one (continent, 1) pair per country
  function(){ emit(this.continent, 1); },
  // reduce: add up the 1s; a sum stays correct if MongoDB re-reduces partial results
  function(k, v){ return Array.sum(v); },
  // return the results inline instead of writing them to a collection
  {out:{inline:1}}
)
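
MongoDB may call reduce more than once for the same key, feeding the results of earlier calls back in as values, so a reduce function should return the same answer however its input is batched. The sketch below compares two candidate reducers for counting under that kind of re-reduction. The helper reduceInBatches and both reducer names are hypothetical and exist only for this illustration; the batch size of 3 is arbitrary.

```javascript
// Two candidate reducers for counting emitted 1s.
const countByLength = (key, values) => values.length;
const countBySum = (key, values) => values.reduce((a, b) => a + b, 0);

// Hypothetical helper: reduce the values in batches, then reduce the
// partial results again, the way a re-reduce step would.
function reduceInBatches(reduce, key, values, batchSize) {
  const partials = [];
  for (let i = 0; i < values.length; i += batchSize) {
    partials.push(reduce(key, values.slice(i, i + batchSize)));
  }
  return reduce(key, partials);
}

const ones = [1, 1, 1, 1, 1, 1]; // six documents, each emitted as 1

const byLength = reduceInBatches(countByLength, 'Europe', ones, 3);
const bySum = reduceInBatches(countBySum, 'Europe', ones, 3);

console.log(byLength); // 2: the second pass counts the batches, not the documents
console.log(bySum);    // 6: summing partial sums still gives the document count
```

Inside MongoDB the same summing reducer can be written with the built-in Array.sum helper.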

Emit only some continents

The map function does not have to emit for every document; it can emit conditionally.

This example counts only the countries with a population above 100 million.

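db.world.mapReduce(
  // map: emit only for countries with more than 100 million people
  function(){
    if (this.population > 100000000)
      emit(this.continent, 1);
  },
  // reduce: sum the emitted 1s to count the countries per continent
  function(k, v){ return Array.sum(v); },
  {out:{"inline":1}}
)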
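
One consequence of a conditional emit is that a key never emitted does not appear in the output at all. The plain JavaScript sketch below shows this with a handful of made-up documents. The sample array, the local emit and the rounded population figures are assumptions for illustration only, not the contents of the world collection.

```javascript
// Hypothetical sample documents; populations are rough, rounded figures.
const sample = [
  { name: 'China', continent: 'Asia', population: 1400000000 },
  { name: 'India', continent: 'Asia', population: 1300000000 },
  { name: 'Nigeria', continent: 'Africa', population: 180000000 },
  { name: 'France', continent: 'Europe', population: 66000000 },
];

// Local stand-in for MongoDB's emit(): record each (key, value) pair.
const pairs = [];
function emit(key, value) {
  pairs.push([key, value]);
}

// Same shape as the map function in the example: emit only for large countries.
function map() {
  if (this.population > 100000000) emit(this.continent, 1);
}
sample.forEach((doc) => map.call(doc));

// Group by key and sum the values, as a summing reduce would.
const largeCountries = {};
for (const [key, value] of pairs) {
  largeCountries[key] = (largeCountries[key] || 0) + value;
}

console.log(largeCountries); // Europe is absent: nothing was emitted for it
```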