MapReduce
Latest revision as of 08:47, 26 June 2018
Introducing the MapReduce function
The MapReduce function is an aggregate function that consists of two functions: Map and Reduce.
The map is always performed before the reduce.
The map function examines every document in the collection and emits (key,value) pairs.
The map function takes no arguments; the current document is available as this.
The reduce function takes two inputs, a key and a list of values: for every distinct key emitted by map, reduce is called with the list of the corresponding values.
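The flow described above can be modelled in a few lines of plain JavaScript. This is only a minimal in-memory sketch, not how MongoDB implements it: the helper runMapReduce and the sample documents are made up for illustration, and emit is passed to the map function as an argument here, whereas in MongoDB it is a global.

```javascript
// Hypothetical helper: an in-memory model of the map and reduce phases.
// It is not MongoDB's implementation.
function runMapReduce(docs, map, reduce) {
  const groups = new Map(); // key -> list of values emitted for that key

  // Assumption: emit is handed to map as an argument here,
  // whereas in MongoDB it is available as a global function.
  const emit = (key, value) => {
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(value);
  };

  // Map phase: call map once per document, with the document bound to `this`.
  for (const doc of docs) {
    map.call(doc, emit);
  }

  // Reduce phase: in this model, call reduce once per distinct key.
  const results = [];
  for (const [key, values] of groups) {
    results.push({ _id: key, value: reduce(key, values) });
  }
  return results;
}

// Made-up documents, not real data.
const sampleDocs = [
  { name: "Aland", continent: "North", population: 10 },
  { name: "Bland", continent: "North", population: 20 },
  { name: "Cland", continent: "South", population: 5 },
];

// Sum the populations per continent: North -> 30, South -> 5.
const totals = runMapReduce(
  sampleDocs,
  function (emit) { emit(this.continent, this.population); },
  function (k, v) { return v.reduce((a, b) => a + b, 0); }
);
console.log(totals);
```

Each result is shaped as a key under _id and the reduced value under value, which mirrors the layout of inline map-reduce results.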
Population of each continent
Here the map function emits the continent and the population for each country.
The reduce function uses Array.sum to add the populations. Array.sum is a helper provided by MongoDB's JavaScript environment; it is not part of standard JavaScript.
    db.world.mapReduce(
        function () { emit(this.continent, this.population); },
        function (k, v) { return Array.sum(v); },
        {out: {inline: 1}}
    );
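Because Array.sum is supplied by MongoDB rather than by the language, the reduce function above will not run unchanged in a browser or in Node.js. A minimal sketch of a stand-in for a non-empty list of numbers, using only standard JavaScript, might look like this; the name sumValues is hypothetical.

```javascript
// Hypothetical stand-in for Array.sum, written in standard JavaScript.
// Assumes `values` is a list of numbers.
function sumValues(values) {
  return values.reduce((total, n) => total + n, 0);
}

// A reduce function with the same (key, values) signature as in the example.
const reducePopulations = function (k, v) { return sumValues(v); };

console.log(reducePopulations("North", [10, 20, 5])); // 35
```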
Number of countries in each continent
Instead of sending populations you can send a list of 1s to the reduce function.
The reduce function will now create a count of the number of countries in each continent.
    db.world.mapReduce(
        function () { emit(this.continent, 1); },
        function (k, v) { return Array.sum(v); },
        {out: {inline: 1}}
    );
Count only some countries
The map function does not need to emit once for every entry.
In this example we count only the countries with a population above 100 million.
    db.world.mapReduce(
        function () {
            if (this.population > 100000000) {
                emit(this.continent, 1);
            }
        },
        function (k, v) { return Array.sum(v); },
        {out: {"inline": 1}}
    );
Examine the reduce function
Here we emit the continent and the name, and in the reduce function we return v.join(',') to see a comma-separated list of the values passed to reduce.
    db.world.mapReduce(
        function () {
            if (this.population > 100000000) {
                emit(this.continent, this.name);
            }
        },
        function (k, v) { return v.join(','); },
        {out: {"inline": 1}}
    );
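One point worth keeping in mind when examining reduce: the MongoDB documentation notes that reduce may be invoked more than once for the same key, with an earlier result fed back in as one of the values, and that it is not called at all for a key that has only a single value. The sketch below, in plain JavaScript with made-up values, simulates one such second pass to show why summing and joining survive it while counting with v.length would not. The names sumReduce, joinReduce, lengthReduce and reReduce are hypothetical.

```javascript
// Three candidate reduce functions with the usual (key, values) signature.
const sumReduce = (key, values) => values.reduce((a, b) => a + b, 0);
const joinReduce = (key, values) => values.join(",");
const lengthReduce = (key, values) => values.length;

// Simulate a second pass: reduce the first `split` values, then feed that
// partial result back in alongside the remaining values.
function reReduce(reduce, key, values, split) {
  const partial = reduce(key, values.slice(0, split));
  return reduce(key, [partial, ...values.slice(split)]);
}

const ones = [1, 1, 1, 1, 1];
const names = ["Alpha", "Beta", "Gamma"]; // made-up names

// Summing gives the same answer in one pass or two.
console.log(sumReduce("k", ones), reReduce(sumReduce, "k", ones, 3)); // 5 5

// Joining also gives the same string either way.
console.log(joinReduce("k", names), reReduce(joinReduce, "k", names, 2));

// Counting by list length does not: the partial count is treated as one item.
console.log(lengthReduce("k", ones), reReduce(lengthReduce, "k", ones, 3)); // 5 3
```

This is why the counting examples emit 1 for each document and sum the 1s, rather than returning the length of the list.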
Reduce to a single value
If you emit the same key every time you will get exactly one result from your query.
Here we emit the value 1 as the key and 1 as the value. The reduce function sums those 1s to get a count of the total number of countries.
    db.world.mapReduce(
        function () {
            emit(1, 1);
        },
        function (k, v) { return Array.sum(v); },
        {out: {"inline": 1}}
    );
Emit a name
The reduce function can use the list of values it is given in any way it likes.
Here we emit the key this.continent and the value this.name. The reduce function returns the first element of the collected list, so the result is one country name per continent.
    db.world.mapReduce(
        function () {
            emit(this.continent, this.name);
        },
        function (k, v) { return v[0]; },
        {out: {"inline": 1}}
    );