
Difference between revisions of "MAPREDUCE Elite"

From NoSQLZoo
 
(38 intermediate revisions by 3 users not shown)
Line 1: Line 1:
<pre class=setup>
+
==Introducing the elite database==
#ENCODING
+
These questions will introduce the "elite" database, which contains data about the video game [https://www.elitedangerous.com/ Elite Dangerous]<br/><br/><br/>
import io
+
There are two collections, <code>commodities</code> and <code>systems</code>.<br/>Inside <code>systems</code> there are nested documents called <code>stations</code>.<br/>
import sys
+
A <b>system</b> has many <b>stations</b>, and a <b>station</b> has many trade <code>listings</code>.<br/><br/>
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-16')
 
#MONGO
 
from pymongo import MongoClient
 
client = MongoClient()
 
client.elite.authenticate('scott','tiger')
 
db = client['elite']
 
#PRETTY
 
import pprint
 
pp = pprint.PrettyPrinter(indent=4)
 
#JS
 
from bson.code import Code
 
</pre>
 
==Introducing the elite database **WORK IN PROGRESS==
 
These questions will introduce the "elite" database, which contains data about the video game [https://www.elitedangerous.com/ Elite Dangerous]<br/>
 
<code>from bson.code import Code</code> has been added to the setup so that you no longer need to include it in your answer.<br/><br/>
 
There are two collections, <code>commodities</code> and <code>systems</code>. Inside <code>systems</code> there is are nested documents called <code>stations</code><br/>
 
A <b>system</b> has many <b>stations</b>, and a <b>station</b> has many trade <code>listings</code><br/><br/>
 
 
Keys used in this database.
 
Keys used in this database.
 
<pre>
 
<pre>
     commodities:  
+
     commodities:
 
         _id, average_price, category, name
 
         _id, average_price, category, name
     systems:  
+
     systems:
 
         _id, allegiance, faction, government, name, population, primary_economy, security, state, stations, updated_at, x, y, z
 
         _id, allegiance, faction, government, name, population, primary_economy, security, state, stations, updated_at, x, y, z
 
 
     systems.stations:  
 
     systems.stations:  
         allegiance, distance_to_star, economies, export_commodities,has_blackmarket, has_commodities, has_rearm, has_repair,
+
         _id, allegiance, distance_to_star, economies, export_commodities,has_blackmarket, has_commodities, has_rearm, has_repair,
 
         has_shipyard, has_outfitting, faction, government, listings, max_landing_pad, name, state, type, updated_at
 
         has_shipyard, has_outfitting, faction, government, listings, max_landing_pad, name, state, type, updated_at
 
 
     systems.stations.listings:  
 
     systems.stations.listings:  
         buy_price, collected_at, demand, commodity, sell_price, supply, update_count
+
         _id, buy_price, collected_at, demand, commodity, sell_price, supply, update_count
       
 
 
</pre>
 
</pre>
 
Read more about the structure here: [[Elite Document Structure]]
 
Read more about the structure here: [[Elite Document Structure]]
Line 39: Line 19:
  
 
==Questions==
 
==Questions==
<div class=q data-lang="py3">The <code>commodities</code> collection contains the <code>name</code> and <code>average_price</code> of each commodity.<br/>
+
<div class="q" data-lang="mongo" data-switches='elite'>The <code>commodities</code> collection contains the <code>name</code> and <code>average_price</code> of each commodity.<br/>
 
There are 99 unique commodities and 15 categories.
 
There are 99 unique commodities and 15 categories.
<p class="strong">Find the average price of each category, round to the nearest whole number</p>
+
<p class="strong">Find the average price of each category, round to the nearest whole number.</p>
<pre class=def>
+
<pre class="def"><nowiki>
pp.pprint(
+
db.commodities.mapReduce(
     db.commodities.find_one()
+
  function(){
)
+
    emit(1, 1);
</pre>
+
  },
<div class="ans">
+
  function(k, v){
from bson.code import Code;temp = db.commodities.map_reduce( map=Code("function(){emit(this.category,this.average_price)}"), reduce=Code("""function(key,values){var total = 0;for (var i = 0; i < values.length; i++){total += values[i];}return Math.round(total/values.length);} """),out={"inline":1} );pp.pprint(temp['results'])
+
     return Array.sum(v);
</div>
+
  },
 +
  {out: {inline: 1}}
 +
);</nowiki></pre>
 +
<pre class="ans"><nowiki>db.commodities.mapReduce(function(){emit(this.category,this.average_price);},function(k,v){return Math.round(Array.sum(v)/v.length);},{out:{inline:1}});</nowiki></pre>
 
</div>
 
</div>
 
+
<div class="q" data-lang="mongo" data-switches='elite'>Each system has an <code>allegiance</code>. There are three main factions: <b>The Federation</b>, <b>The Empire</b>, and <b>The Alliance</b>.<br/>
<div class=q data-lang="py3">Each system has an <code>allegiance</code>. There are three main factions: <b>The Federation</b>, <b>The Empire</b>, and <b>The Alliance</b>.<br/>
 
 
<p>Non-populated systems without stations do not have an allegiance, and should be ignored.</p>  
 
<p>Non-populated systems without stations do not have an allegiance, and should be ignored.</p>  
<p class=strong>Show the amount of systems following each type of allegiance.</p>
+
<p class="strong">Show the amount of systems following each type of allegiance.</p>
<pre class=def>
+
<pre class="def"><nowiki>
</pre>
+
db.systems.mapReduce(
<div class="ans">
+
  function(){
temp = db.systems.map_reduce( query={"allegiance": {"$ne":None}}, map=Code("function(){emit(this.allegiance, 1)}"), reduce=Code("""function(key,values){var total = 0;values.forEach(function(value){total += value;});return total;} """), out={"inline":1} );pp.pprint(temp['results']) 
+
    emit(1, 1);
 +
  },
 +
  function(k, v){
 +
    return Array.sum(v);
 +
  },
 +
  {out: {inline: 1}}
 +
);</nowiki></pre>
 +
<pre class="ans"><nowiki>db.systems.mapReduce(function(){if (this.allegiance!=null){emit(this.allegiance,1);}},function(k,v){return Array.sum(v);},{out:{inline:1}});</nowiki></pre>
 
</div>
 
</div>
</div>
+
<div class="q" data-lang="mongo" data-switches='elite'>
 
 
<div class=q data-lang="py3">
 
 
<p class="strong">What are the populations of the three main factions?</p>
 
<p class="strong">What are the populations of the three main factions?</p>
<div class=hint title="Three main factions">["Alliance","Federation","Empire"]</div>
+
<div class="hint" title="Three main factions">["Alliance","Federation","Empire"]</div>
<pre class=def>
+
<div class="hint" title="NaN?">Some systems are not populated and will have '''null''' population fields, make sure to exclude them using <code>!isNaN()</code>.</div>
</pre>
+
<pre class="def"><nowiki>
<div class="ans">
+
db.systems.mapReduce(
temp = db.systems.map_reduce(query={"allegiance":{"$in":["Alliance","Empire","Federation"]}}, map=Code("function(){emit(this.allegiance,this.population)}"), reduce=Code("""function(key,values){var total = 0;values.forEach(function(value){total += value;});return total;} """), out={"inline":1} );pp.pprint(temp['results'])
+
  function(){
</div>
+
    emit(1, 1);
 +
  },
 +
  function(k, v){
 +
    return Array.sum(v);
 +
  },
 +
  {out: {inline: 1}}
 +
);</nowiki></pre>
 +
<pre class="ans"><nowiki>db.systems.mapReduce(function(){if(!isNaN(this.population)&&this.allegiance!=null&&this.allegiance!="Independent"&&this.allegiance!="Anarchy"){emit(this.allegiance,this.population);}},function(k,v){return Array.sum(v);},{out:{inline:1}});</nowiki></pre>
 
</div>
 
</div>
  
 
==Harder Questions==
 
==Harder Questions==
<div class=q data-lang="py3">How much Hydrogen Fuel is owned by stations in systems that are allied with the three main factions? Limit your query to the first 5000 stations.
+
<div class="q" data-lang="mongo" data-switches='elite'>
 +
<p class="strong">
 +
How much Hydrogen Fuel is owned by each faction? Limit your query to the first 5000 stations.
 +
</p>
 
<div class="hint" title="hint">
 
<div class="hint" title="hint">
The amount of stations <b>and</b> the amount of listings aren't fixed, you'll need to <code>query</code> to ensure that they exist and find a way of iterating through them in your <code>map</code> stage.
+
The amount of stations in a system <b>and</b> the amount of listings to a station aren't fixed. <code>query</code> can be used to ensure that they exist.
 
</div>
 
</div>
<pre class=def>
+
<pre class="def"><nowiki>
</pre>
+
db.systems.mapReduce(
<div class="ans">
+
  function(){
temp = db.systems.map_reduce( limit=5000, query={"allegiance":{"$in":["Alliance","Empire","Federation"]},"stations.listings.commodity":"Hydrogen Fuel","stations.listings.supply":{"$exists":1}}, map=Code("""function(){ for(var i in this.stations) for(var j in this.stations[i].listings) emit(this.allegiance,this.stations[i].listings[j].supply) }"""), reduce=Code("""function(k,vs){var t = 0;vs.forEach(function(v){t += v});return t;} """), out={"inline":1} );  
+
    emit(1, 1);
pp.pprint(temp['results']);
+
  },
 +
  function(k, v){
 +
    return Array.sum(v);
 +
  },
 +
  {out: {inline: 1}}
 +
);</nowiki></pre>
 +
<pre class="ans"><nowiki>db.systems.mapReduce(function(){if(this.stations)for(let i=0;i<this.stations.length;i++){let t=this.stations[i];if(t.listings&&t.allegiance)for(let s=0;s<t.listings.length;s++){let n=t.listings[s];"Hydrogen Fuel"===n.commodity&&emit(t.allegiance,n.supply)}}},function(i,t){return Array.sum(t)},{out:{inline:1},limit:5e3});</nowiki></pre>
 
</div>
 
</div>
</div>
+
<div class="q" data-lang="mongo" data-switches='elite'>A <code>power_control_faction</code> or <b>Power</b> is an individual or organisation who is in control of a system.<br/>
 
+
These powers have allegiance to a faction, but the systems they control do not necessarily have the same allegiance that they do.
<div class=q data-lang="py3">A <code>power_control_faction</code> or <b>Power</b> is an individual or organisation who is in control of a system.<br/>
 
These powers have allegiances, but the systems they control do not nescessarily have the same allegiance as they do.
 
 
<div class="hint" title="Example"> At the time of writing <b>Zemina Torval</b> is allied with the <b>Empire</b> and controls <b>47</b> systems.<br/>
 
<div class="hint" title="Example"> At the time of writing <b>Zemina Torval</b> is allied with the <b>Empire</b> and controls <b>47</b> systems.<br/>
 
<pre>
 
<pre>
 
     {  '_id': 'Zemina Torval',
 
     {  '_id': 'Zemina Torval',
         'value': {  'alliance': 0.0,
+
         'value': {  'Alliance': 0.0,
                     'anarchy': 0.0,
+
                     'Anarchy': 0.0,
                     'empire': 39.0,
+
                     'Empire': 39.0,
                     'federation': 3.0,
+
                     'Federation': 3.0,
                     'independent': 5.0}}]
+
                     'Independent': 5.0}}]
 
</pre>
 
</pre>
 
</div>
 
</div>
 
<p class="strong">Show the allegiance of each of the power's systems</p>
 
<p class="strong">Show the allegiance of each of the power's systems</p>
<pre class=def>
+
<pre class="def"><nowiki>
</pre>
+
db.systems.mapReduce(
<div class="ans">
+
  function(){
temp = db.systems.map_reduce(query={"power_control_faction":{"$exists":1}},map=Code("""function(){switch(this.allegiance){case "Alliance":emit(this.power_control_faction,{alliance:1,anarchy:0,empire:0,federation:0,independent:0});break;case "Anarchy":emit(this.power_control_faction,{alliance:0,anarchy:1,empire:0,federation:0,independent:0});break; case "Empire":emit(this.power_control_faction,{alliance:0,anarchy:0,empire:1,federation:0,independent:0});break;case "Federation":emit(this.power_control_faction,{alliance:0,anarchy:0,empire:0,federation:1,independent:0});break;case "Independent":emit(this.power_control_faction,{alliance:0,anarchy:0,empire:0,federation:0,independent:1});break;}}"""),reduce=Code("""function(k,vs){var a=vs[0];for(var i=1;i<vs.length;i++){var b=vs[i];a.alliance+=b.alliance;a.anarchy+=b.anarchy;a.empire+=b.empire;a.federation+=b.federation;a.independent+=b.independent;}return a}"""),out={"inline":1});  
+
    emit(1, 1);
pp.pprint(temp['results']);
+
  },
</div>
+
  function(k,v){
 +
    return Array.sum(v);
 +
  },
 +
  {
 +
    query: {"power_control_faction": {"$exists": 1}},
 +
    out: {inline: 1}
 +
  }
 +
);</nowiki></pre>
 +
<pre class="ans"><nowiki>db.systems.mapReduce(function(){emit(this.power_control_faction,{[this.allegiance]:1});},function(_,v){let a={"Alliance":0,"Anarchy":0,"Empire":0,"Federation":0,"Independent":0};for(let i=0;i<v.length;i++){let b=v[i];a.Alliance+=b.Alliance||0;a.Anarchy+=b.Anarchy||0;a.Empire+=b.Empire||0;a.Federation+=b.Federation||0;a.Independent+=b.Independent||0;}return a;},{out:{"inline":1},query:{"power_control_faction":{"$exists":1}},sort:{"_id":1}});</nowiki></pre>
 
</div>
 
</div>
 
+
<div class="q" data-lang="mongo" data-switches='elite'>Our dataset doesn't contain the allegiance of a power:
<div class=q data-lang="py3">Our dataset doesn't contain the allegiance of a power:
 
 
<p class="strong">Using the result from the previous question, guess the power's allegiance by the faction that the majority of their systems follow.</p>
 
<p class="strong">Using the result from the previous question, guess the power's allegiance by the faction that the majority of their systems follow.</p>
<div class="hint" title="Example"><code>Zemina Torval: Empire(39.0)</code></div>
+
<p>To achieve this, you'll need to use the <code>finalize: function(k, v){}</code> in the third argument to find the key with the largest value.</p>
<pre class=def>
+
<div class="hint" title="Example">
</pre>
+
<pre>
<div class="ans">
+
{
temp = db.systems.map_reduce(query={"power_control_faction":{"$exists":1}},map=Code("""function(){switch(this.allegiance){case "Alliance":emit(this.power_control_faction,{alliance:1,anarchy:0,empire:0,federation:0,independent:0});break;case "Anarchy":emit(this.power_control_faction,{alliance:0,anarchy:1,empire:0,federation:0,independent:0});break; case "Empire":emit(this.power_control_faction,{alliance:0,anarchy:0,empire:1,federation:0,independent:0});break;case "Federation":emit(this.power_control_faction,{alliance:0,anarchy:0,empire:0,federation:1,independent:0});break;case "Independent":emit(this.power_control_faction,{alliance:0,anarchy:0,empire:0,federation:0,independent:1});break;}}"""),reduce=Code("""function(k,vs){var a=vs[0];for(var i=1;i<vs.length;i++){var b=vs[i];a.alliance+=b.alliance;a.anarchy+=b.anarchy;a.empire+=b.empire;a.federation+=b.federation;a.independent+=b.independent;}return a}"""),out={"inline":1});
+
    "_id" : "Zemina Torval",
for power in temp['results']:
+
    "value" : "Empire"
    max = 0;
+
}
    key = "";
+
</pre></div>
    #print(power['value'])
+
<pre class="def"><nowiki>
    for id in power['value']:
+
db.systems.mapReduce(
        if (power['value'][id] > max):  
+
  function(){
            max = power['value'][id]
+
    emit(1, 1);
            key = id
+
  },
    print(power['_id']+": "+key+"("+str(max)+")")
+
  function(k, v){
</div>
+
    return Array.sum(v);
 +
  },
 +
  {
 +
    finalize: function(k, v){
 +
      return v;
 +
    },
 +
    query: {"power_control_faction": {"$exists": 1}},
 +
    out: {inline: 1}
 +
  }
 +
);</nowiki></pre>
 +
<pre class="ans"><nowiki>db.systems.mapReduce(function(){emit(this.power_control_faction,{[this.allegiance]:1});},function(k,v){let a={"Alliance":0,"Anarchy":0,"Empire":0,"Federation":0,"Independent":0};for(let i=0;i<v.length;i++){let b=v[i];a.Alliance+=b.Alliance||0;a.Anarchy+=b.Anarchy||0;a.Empire+=b.Empire||0;a.Federation+=b.Federation||0;a.Independent+=b.Independent||0;}return a;},{finalize:function(k,v){return Object.keys(v).reduce((a,b)=>v[a]>v[b]?a:b);},out:{"inline":1},query:{"power_control_faction":{"$exists":1}},sort:{"_id":1}});</nowiki></pre>
 
</div>
 
</div>
 +
[https://goo.gl/forms/ep8rBbCQSa381ic82 {{huge| Survey}}] <br/>
 +
Do you have thoughts about this website that you would like to share? Help improve NoSQLZoo!

Latest revision as of 13:48, 17 October 2018

Introducing the elite database

These questions will introduce the "elite" database, which contains data about the video game Elite Dangerous.


There are two collections, commodities and systems.
Inside systems there are nested documents called stations.
A system has many stations, and a station has many trade listings.

Keys used in this database.

    commodities:
        _id, average_price, category, name
    systems:
        _id, allegiance, faction, government, name, population, primary_economy, security, state, stations, updated_at, x, y, z
    systems.stations: 
        _id, allegiance, distance_to_star, economies, export_commodities,has_blackmarket, has_commodities, has_rearm, has_repair,
        has_shipyard, has_outfitting, faction, government, listings, max_landing_pad, name, state, type, updated_at
    systems.stations.listings: 
        _id, buy_price, collected_at, demand, commodity, sell_price, supply, update_count

Read more about the structure here: Elite Document Structure
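
As a rough illustration of this nesting, the sketch below builds one hypothetical system document in plain JavaScript and walks down to its listings. Only the key names come from the list above; every value, including the station and commodity names, is invented for illustration.

```javascript
// Hypothetical system document: key names follow the list above, values are invented.
const system = {
  _id: 1,
  name: "Example System",
  allegiance: "Federation",
  population: 1000,
  stations: [
    {
      _id: 10,
      name: "Example Station",
      allegiance: "Federation",
      listings: [
        { _id: 100, commodity: "Hydrogen Fuel", supply: 250, buy_price: 90, sell_price: 85 },
        { _id: 101, commodity: "Gold", supply: 12, buy_price: 9000, sell_price: 8900 }
      ]
    },
    // A station with no trade listings.
    { _id: 11, name: "Empty Outpost", allegiance: "Federation", listings: [] }
  ]
};

// Collect "station name: commodity" strings by walking stations, then listings.
function listCommodities(sys) {
  const out = [];
  for (const station of sys.stations || []) {
    for (const listing of station.listings || []) {
      out.push(station.name + ": " + listing.commodity);
    }
  }
  return out;
}

const commodityLines = listCommodities(system);
console.log(commodityLines);
```

The `|| []` guards reflect that a system may have no stations and a station may have no listings.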

Questions

The commodities collection contains the name and average_price of each commodity.

There are 99 unique commodities and 15 categories.

Find the average price of each category, rounded to the nearest whole number.

db.commodities.mapReduce(
  function(){
    emit(1, 1);
  },
  function(k, v){
    return Array.sum(v);
  },
  {out: {inline: 1}}
);
db.commodities.mapReduce(function(){emit(this.category,this.average_price);},function(k,v){return Math.round(Array.sum(v)/v.length);},{out:{inline:1}});
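
The one-line answer above computes the average inside reduce. MongoDB may call reduce more than once for the same key, feeding earlier outputs back in as inputs, so an average computed there can drift when a key has many values. A minimal sketch of a common alternative, assuming plain JavaScript outside the shell and invented prices: carry a {sum, count} pair through reduce and divide once at the end, as a finalize function would. The helper runMapReduce is a hypothetical in-memory stand-in, not a MongoDB API.

```javascript
// Invented sample rows using the commodities key names.
const commodities = [
  { category: "Metals", average_price: 100 },
  { category: "Metals", average_price: 200 },
  { category: "Metals", average_price: 600 },
  { category: "Foods", average_price: 51 }
];

// map: emit a partial aggregate instead of a bare price.
const mapFn = (doc) => [doc.category, { sum: doc.average_price, count: 1 }];

// reduce: the output has the same shape as the inputs, so re-reducing is safe.
const reduceFn = (key, values) =>
  values.reduce((a, b) => ({ sum: a.sum + b.sum, count: a.count + b.count }));

// finalize: divide once, after all reducing is done.
const finalizeFn = (key, v) => Math.round(v.sum / v.count);

// Hypothetical in-memory stand-in for the map, group, reduce, finalize pipeline.
function runMapReduce(docs) {
  const groups = {};
  for (const doc of docs) {
    const [k, v] = mapFn(doc);
    (groups[k] = groups[k] || []).push(v);
  }
  const results = {};
  for (const k of Object.keys(groups)) {
    results[k] = finalizeFn(k, reduceFn(k, groups[k]));
  }
  return results;
}

const averages = runMapReduce(commodities);
console.log(averages);

// Re-reduce check: a partial result combined with the remaining value gives the same average.
const partial = reduceFn("Metals", [{ sum: 100, count: 1 }, { sum: 200, count: 1 }]);
const rereduced = finalizeFn("Metals", reduceFn("Metals", [partial, { sum: 600, count: 1 }]));
console.log(rereduced);
```
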
Each system has an allegiance. There are three main factions: The Federation, The Empire, and The Alliance.

Non-populated systems without stations do not have an allegiance, and should be ignored.

Show the number of systems following each type of allegiance.

db.systems.mapReduce(
  function(){
    emit(1, 1);
  },
  function(k, v){
    return Array.sum(v);
  },
  {out: {inline: 1}}
);
db.systems.mapReduce(function(){if (this.allegiance!=null){emit(this.allegiance,1);}},function(k,v){return Array.sum(v);},{out:{inline:1}});

What are the populations of the three main factions?

["Alliance","Federation","Empire"]
Some systems are not populated and will have null population fields; make sure to exclude them using !isNaN().
db.systems.mapReduce(
  function(){
    emit(1, 1);
  },
  function(k, v){
    return Array.sum(v);
  },
  {out: {inline: 1}}
);
db.systems.mapReduce(function(){if(!isNaN(this.population)&&this.allegiance!=null&&this.allegiance!="Independent"&&this.allegiance!="Anarchy"){emit(this.allegiance,this.population);}},function(k,v){return Array.sum(v);},{out:{inline:1}});
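
To see what the !isNaN() filter does and does not catch, here is a small check in plain JavaScript. isNaN converts its argument to a number first, so null (which converts to 0) passes the filter while undefined and non-numeric strings do not. The sample values are illustrative only, and the stricter filter is an assumption about what "not populated" should mean rather than part of the exercise.

```javascript
// isNaN converts its argument to a number before testing it.
const samples = [1000, 0, null, undefined, "n/a"];

// Values that the hint's filter lets through.
const kept = samples.filter((p) => !isNaN(p));
console.log(kept);

// A stricter filter that also rejects null and other non-number values.
const strict = samples.filter((p) => typeof p === "number" && !isNaN(p));
console.log(strict);
```

Because a null that slips through adds nothing to a sum, the totals in this question are unaffected either way.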

Harder Questions

How much Hydrogen Fuel is owned by each faction? Limit your query to the first 5000 stations.

The number of stations in a system and the number of listings in a station aren't fixed. query can be used to ensure that they exist.

db.systems.mapReduce(
  function(){
    emit(1, 1);
  },
  function(k, v){
    return Array.sum(v);
  },
  {out: {inline: 1}}
);
db.systems.mapReduce(function(){if(this.stations)for(let i=0;i<this.stations.length;i++){let t=this.stations[i];if(t.listings&&t.allegiance)for(let s=0;s<t.listings.length;s++){let n=t.listings[s];"Hydrogen Fuel"===n.commodity&&emit(t.allegiance,n.supply)}}},function(i,t){return Array.sum(t)},{out:{inline:1},limit:5e3});
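
The minified answer above is easier to follow when spread out. The sketch below runs the same map logic over three invented system documents in plain JavaScript, with emit replaced by a local function that collects pairs. It illustrates the nested iteration only and is not a replacement for running the query in the shell.

```javascript
// Invented sample data using the key names from this page.
const systems = [
  { stations: [
      { allegiance: "Empire", listings: [
          { commodity: "Hydrogen Fuel", supply: 40 },
          { commodity: "Gold", supply: 7 } ] },
      { allegiance: "Empire" } // station without listings
  ] },
  { stations: [
      { allegiance: "Alliance", listings: [{ commodity: "Hydrogen Fuel", supply: 5 }] },
      { allegiance: "Empire", listings: [{ commodity: "Hydrogen Fuel", supply: 10 }] }
  ] },
  { name: "No stations here" }
];

// Local stand-in for the shell's emit: record each key/value pair.
const emitted = [];
const emit = (key, value) => emitted.push([key, value]);

// Same logic as the map function in the answer, called with `this` bound to a document.
function mapFn() {
  if (!this.stations) return;
  for (const station of this.stations) {
    if (!station.listings || !station.allegiance) continue;
    for (const listing of station.listings) {
      if (listing.commodity === "Hydrogen Fuel") emit(station.allegiance, listing.supply);
    }
  }
}

systems.forEach((doc) => mapFn.call(doc));

// Group by key and sum, standing in for the reduce stage.
const fuelByAllegiance = {};
for (const [k, v] of emitted) fuelByAllegiance[k] = (fuelByAllegiance[k] || 0) + v;
console.log(fuelByAllegiance);
```
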
A power_control_faction or Power is an individual or organisation who is in control of a system.

These powers have allegiance to a faction, but the systems they control do not necessarily have the same allegiance that they do.

At the time of writing Zemina Torval is allied with the Empire and controls 47 systems.
    {   '_id': 'Zemina Torval',
        'value': {   'Alliance': 0.0,
                     'Anarchy': 0.0,
                     'Empire': 39.0,
                     'Federation': 3.0,
                     'Independent': 5.0}}

For each power, show the allegiance of the systems it controls.

db.systems.mapReduce(
  function(){
    emit(1, 1);
  },
  function(k,v){
    return Array.sum(v);
  },
  {
    query: {"power_control_faction": {"$exists": 1}},
    out: {inline: 1}
  }
);
db.systems.mapReduce(function(){emit(this.power_control_faction,{[this.allegiance]:1});},function(_,v){let a={"Alliance":0,"Anarchy":0,"Empire":0,"Federation":0,"Independent":0};for(let i=0;i<v.length;i++){let b=v[i];a.Alliance+=b.Alliance||0;a.Anarchy+=b.Anarchy||0;a.Empire+=b.Empire||0;a.Federation+=b.Federation||0;a.Independent+=b.Independent||0;}return a;},{out:{"inline":1},query:{"power_control_faction":{"$exists":1}},sort:{"_id":1}});
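
The reduce step in the answer above merges sparse objects such as {Empire: 1} into one object with a count per allegiance. A minimal sketch of that merge in plain JavaScript, assuming invented emissions for a hypothetical power named "Example Power":

```javascript
// reduce from the answer above, written out: sum sparse per-allegiance counts.
function reduceFn(key, values) {
  const total = { Alliance: 0, Anarchy: 0, Empire: 0, Federation: 0, Independent: 0 };
  for (const v of values) {
    // Missing keys count as zero, so sparse and full objects can be mixed.
    for (const name of Object.keys(total)) total[name] += v[name] || 0;
  }
  return total;
}

// Invented emissions: one object per controlled system.
const emittedValues = [{ Empire: 1 }, { Empire: 1 }, { Federation: 1 }, { Independent: 1 }];
const merged = reduceFn("Example Power", emittedValues);
console.log(merged);

// Feeding a previous output back in (a re-reduce) still gives consistent totals.
const again = reduceFn("Example Power", [merged, { Empire: 1 }]);
console.log(again);
```

One caveat: MongoDB does not call reduce for a key that has only a single emitted value, so a power controlling exactly one system would keep its sparse object unless map emits all five keys.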
Our dataset doesn't contain the allegiance of a power:

Using the result from the previous question, guess the power's allegiance by the faction that the majority of their systems follow.

To achieve this, you'll need to use the finalize: function(k, v){} option in the third argument to find the key with the largest value.

{
    "_id" : "Zemina Torval",
    "value" : "Empire"
}
db.systems.mapReduce(
  function(){
    emit(1, 1);
  },
  function(k, v){
    return Array.sum(v);
  },
  {
    finalize: function(k, v){
      return v;
    },
    query: {"power_control_faction": {"$exists": 1}},
    out: {inline: 1}
  }
);
db.systems.mapReduce(function(){emit(this.power_control_faction,{[this.allegiance]:1});},function(k,v){let a={"Alliance":0,"Anarchy":0,"Empire":0,"Federation":0,"Independent":0};for(let i=0;i<v.length;i++){let b=v[i];a.Alliance+=b.Alliance||0;a.Anarchy+=b.Anarchy||0;a.Empire+=b.Empire||0;a.Federation+=b.Federation||0;a.Independent+=b.Independent||0;}return a;},{finalize:function(k,v){return Object.keys(v).reduce((a,b)=>v[a]>v[b]?a:b);},out:{"inline":1},query:{"power_control_faction":{"$exists":1}},sort:{"_id":1}});
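
The finalize step in the answer above can be tried on its own. The sketch below applies the same expression in plain JavaScript to the Zemina Torval counts shown earlier on this page; the second call uses invented counts to show how ties behave.

```javascript
// Return the key with the largest count; on a tie the later key wins, as in the answer above.
function finalizeFn(key, counts) {
  return Object.keys(counts).reduce((a, b) => (counts[a] > counts[b] ? a : b));
}

// Counts taken from the Zemina Torval example on this page.
const zemina = { Alliance: 0, Anarchy: 0, Empire: 39, Federation: 3, Independent: 5 };
const guess = finalizeFn("Zemina Torval", zemina);
console.log(guess);

// Invented tie between two allegiances for a hypothetical power.
const tied = finalizeFn("Example Power", { Empire: 2, Federation: 2 });
console.log(tied);
```
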
