Difference between revisions of "AGGREGATE world"

Revision as of 12:36, 16 July 2015

#ENCODING
import io
import sys
sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-16')
#MONGO
from pymongo import MongoClient
client = MongoClient()
client.progzoo.authenticate('scott','tiger')
db = client['progzoo']
#PRETTY
import pprint
pp = pprint.PrettyPrinter(indent=4)

Country Profile

For these questions you should use find() or aggregate() on the collection world.

Show the name and population for the countries that have a population of at least 200 million.

# Starting point: lists every country name, with no population filter yet
pp.pprint(list(
    db.world.find({},{"name":1,"_id":0})
))

pp.pprint(list(
    db.world.find({"population":{"$gte":200000000}},{"name":1,"population":1,"_id":0})
))

Give the name and the per capita GDP for those countries with a population of at least 200 million.

Per capita GDP is the GDP divided by the population: GDP/population.
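As a quick check of the arithmetic, here is a minimal sketch in plain Python. The figures are hypothetical round numbers chosen for illustration, not values from the world collection, and `per_capita_gdp` is a helper name introduced here.

```python
def per_capita_gdp(gdp, population):
    """Return GDP divided by population, or None if population is missing or zero."""
    if not population:
        return None
    return gdp / population

# Hypothetical figures: a GDP of 3 trillion shared by 250 million people.
example = per_capita_gdp(3_000_000_000_000, 250_000_000)
print(example)  # 12000.0
```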

The aggregation framework is a data processing pipeline: documents pass through a sequence of stages, each defined by an operator.
$match uses a query to filter the documents that are passed to the next stage of the pipeline.
$project is used to "shape" documents by adding or removing fields. It also lets you refer to the value of an existing field with the syntax $<fieldname>, for example to compute a new field from two others.
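To make the two stages concrete, the sketch below imitates them in plain Python on a small in-memory list, without a MongoDB server. The sample documents and the helper names `match_stage`, `project_stage` and `run_pipeline` are hypothetical and exist only for this illustration. It approximates what the stages do and is not how MongoDB implements them.

```python
# Hypothetical sample documents shaped like those in the world collection.
sample_world = [
    {"name": "Alphaland", "population": 300_000_000, "gdp": 6_000_000_000_000},
    {"name": "Betaland", "population": 50_000_000, "gdp": 1_000_000_000_000},
    {"name": "Gammaland", "population": 200_000_000, "gdp": 500_000_000_000},
]

def match_stage(docs, min_population):
    """Imitate {"$match": {"population": {"$gte": min_population}}}."""
    return [
        d for d in docs
        if isinstance(d.get("population"), (int, float))
        and d["population"] >= min_population
    ]

def project_stage(docs):
    """Imitate a $project that keeps name and adds gdp divided by population."""
    return [
        {"name": d["name"], "per capita GDP": d["gdp"] / d["population"]}
        for d in docs
    ]

def run_pipeline(docs, min_population=200_000_000):
    """Feed the output of the match stage into the project stage."""
    return project_stage(match_stage(docs, min_population))

result = run_pipeline(sample_world)
for row in result:
    print(row)
```

Each stage receives only the documents that the previous stage produced, so filtering first keeps the division from running on documents that would be discarded anyway.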

# Starting point: divides GDP by a fixed 1000000 rather than by the population
pp.pprint(list(
    db.world.aggregate([
        {"$match":{
            "population":{"$gte":200000000}
        }},
        {"$project":{
            "_id":0,
            "name":1,
            "per capita GDP": {"$divide": ["$gdp",1000000]}
        }}
    ])
))

pp.pprint(list(
    db.world.aggregate([
        {"$match":{
            "population":{"$gte":200000000}
        }},
        {"$project":{
            "_id":0,
            "name":1,
            "per capita GDP": {"$divide": ["$gdp","$population"]}
        }}
    ])
))