Cookies help us deliver our services. By using our services, you agree to our use of cookies. More information

Difference between revisions of "Join"

From NoSQLZoo
Jump to: navigation, search
(Created page with "{{TopTenTips}} The examples in this page are taken from the [https://docs.mongodb.com/manual/reference/ MongoDB reference documentation]. ==Modelling Relations in Document Dat...")
 
 
(6 intermediate revisions by the same user not shown)
Line 3: Line 3:
 
==Modelling Relations in Document Databases==
 
==Modelling Relations in Document Databases==
 
Document databases typically do not implement [https://sqlzoo.net/wiki/The_JOIN_operation JOIN operations]. Unlike relational databases, data is usually stored in a '''denormalised''' format and document structure is not enforced by default.<br/>
 
Document databases typically do not implement [https://sqlzoo.net/wiki/The_JOIN_operation JOIN operations]. Unlike relational databases, data is usually stored in a '''denormalised''' format and document structure is not enforced by default.<br/>
Instead, One-to-One and One-to-Many relationships between documents are typically done using what is known as '''embedded''' or '''nested''' documents, e.g:<br/>
+
Instead, One-to-One and One-to-Many relationships between documents are typically done using what are known as '''embedded''' or '''nested''' documents, e.g:<br/>
 
<p class="strong">One-to-One</p>
 
<p class="strong">One-to-One</p>
 
<pre>
 
<pre>
 
{
 
{
  _id: "joe",
+
    _id: "joe",
  name: "Joe Bookreader",
+
    name: "Joe Bookreader",
  address: {
+
    address: {
              street: "123 Fake Street",
+
        street: "123 Fake Street",
              city: "Faketon",
+
        city: "Faketon",
              state: "MA",
+
        state: "MA",
              zip: "12345"
+
        zip: "12345"
            }
+
    }
 
}
 
}
 
</pre>
 
</pre>
Line 20: Line 20:
 
<pre>
 
<pre>
 
{
 
{
  _id: "joe",
+
    _id: "joe",
  name: "Joe Bookreader",
+
    name: "Joe Bookreader",
  addresses: [
+
    addresses: [{
                {
+
            street: "123 Fake Street",
                  street: "123 Fake Street",
+
            city: "Faketon",
                  city: "Faketon",
+
            state: "MA",
                  state: "MA",
+
            zip: "12345"
                  zip: "12345"
+
        },
                },
+
        {
                {
+
            street: "1 Some Other Street",
                  street: "1 Some Other Street",
+
            city: "Boston",
                  city: "Boston",
+
            state: "MA",
                  state: "MA",
+
            zip: "12345"
                  zip: "12345"
+
        }
                }
+
    ]
              ]
+
}
}
 
 
</pre>
 
</pre>
  
Some document databases also allow the use of '''references''', where documents can '''point''' to other documents in the collection, allowing for Many-to-Many relationships. In the following example concerning publishers and publications, the 'books' field contains the IDs of related documents - books published by O'Reilly Media.  
+
Some document databases also allow the use of '''references''', where documents can '''point''' to other documents in the collection, allowing for Many-to-Many relationships.<br/> In the following example concerning publishers and publications, the 'books' field contains the IDs of related documents - books published by O'Reilly Media.  
 
<pre>
 
<pre>
 
{
 
{
  name: "O'Reilly Media",
+
    name: "O'Reilly Media",
  founded: 1980,
+
    founded: 1980,
  location: "CA",
+
    location: "CA",
  books: [123456789, 234567890, ...]
+
    books: [123456789, 234567890, ...]
}
+
}, {
 
 
{
 
 
     _id: 123456789,
 
     _id: 123456789,
 
     title: "MongoDB: The Definitive Guide",
 
     title: "MongoDB: The Definitive Guide",
     author: [ "Kristina Chodorow", "Mike Dirolf" ],
+
     author: ["Kristina Chodorow", "Mike Dirolf"],
 
     published_date: ISODate("2010-09-24"),
 
     published_date: ISODate("2010-09-24"),
 
     pages: 216,
 
     pages: 216,
 
     language: "English"
 
     language: "English"
}
+
}, {
 
+
    _id: 234567890,
{
+
    title: "50 Tips and Tricks for MongoDB Developer",
  _id: 234567890,
+
    author: "Kristina Chodorow",
  title: "50 Tips and Tricks for MongoDB Developer",
+
    published_date: ISODate("2011-05-06"),
  author: "Kristina Chodorow",
+
    pages: 68,
  published_date: ISODate("2011-05-06"),
+
    language: "English"
  pages: 68,
 
  language: "English"
 
 
}
 
}
 
</pre>
 
</pre>
It would be possible to embed the publisher information into each book document but this would lead to replication of data and any changes to O'Reilly's details would require updating several documents rather than just one, which in turn may lead to inconsistent data.<br/> An alternative would be to embed each book in a list inside the publisher document, but over time this would lead to large document growth which may cause the document to be relocated on disk when it outgrows its allocated space - resulting in a performance hit.<br/><br/>
+
It would be possible to embed the publisher information into each book document but this would lead to replication of data and any changes to O'Reilly's details would require updating several documents rather than just one, which in turn may lead to inconsistent data.<br/> An alternative would be to embed each book in a list inside the publisher document, but over time this would lead to large document growth which may cause the document to be relocated on disk when it outgrows its allocated space - resulting in a drop in performance.<br/><br/>
  
Using a list of references reduces this problem but does not remove it entirely. To avoid document growth entirely an additional field can be added to each book document:
+
Using a list of references reduces this problem but does not remove it entirely. To avoid this an additional field can be added to each book document:
 
<pre>
 
<pre>
 
{
 
{
  _id: "oreilly",
+
    _id: "oreilly",
  name: "O'Reilly Media",
+
    name: "O'Reilly Media",
  founded: 1980,
+
    founded: 1980,
  location: "CA"
+
    location: "CA"
}
+
}, {
 
+
    _id: 123456789,
{
+
    title: "MongoDB: The Definitive Guide",
  _id: 123456789,
+
    author: ["Kristina Chodorow", "Mike Dirolf"],
  title: "MongoDB: The Definitive Guide",
+
    published_date: ISODate("2010-09-24"),
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
+
    pages: 216,
  published_date: ISODate("2010-09-24"),
+
    language: "English",
  pages: 216,
+
    publisher_id: "oreilly"
  language: "English",
+
}, {
  publisher_id: "oreilly"
+
    _id: 234567890,
}
+
    title: "50 Tips and Tricks for MongoDB Developer",
 
+
    author: "Kristina Chodorow",
{
+
    published_date: ISODate("2011-05-06"),
  _id: 234567890,
+
    pages: 68,
  title: "50 Tips and Tricks for MongoDB Developer",
+
    language: "English",
  author: "Kristina Chodorow",
+
    publisher_id: "oreilly"
  published_date: ISODate("2011-05-06"),
 
  pages: 68,
 
  language: "English",
 
  publisher_id: "oreilly"
 
 
}
 
}
 
</pre>
 
</pre>
 
Unlike relational databases, there are no primary-foreign key constraints in document databases. Therefore, operations that preserve database integrity such as  [https://www.ibm.com/support/knowledgecenter/en/SSGU8G_12.1.0/com.ibm.sqls.doc/ids_sqs_0304.htm '''cascading delete'''] must be implemented by the developer rather than the underlying database software.
 
Unlike relational databases, there are no primary-foreign key constraints in document databases. Therefore, operations that preserve database integrity such as  [https://www.ibm.com/support/knowledgecenter/en/SSGU8G_12.1.0/com.ibm.sqls.doc/ids_sqs_0304.htm '''cascading delete'''] must be implemented by the developer rather than the underlying database software.
 +
 
==MongoDB's <code>$lookup</code>==
 
==MongoDB's <code>$lookup</code>==
 
MongoDB's <code>$lookup</code> stage of the aggregation pipeline can be used to provide similar behaviour to a '''LEFT OUTER JOIN''' in SQL.<br/>
 
MongoDB's <code>$lookup</code> stage of the aggregation pipeline can be used to provide similar behaviour to a '''LEFT OUTER JOIN''' in SQL.<br/>
Line 105: Line 97:
 
<pre>
 
<pre>
 
SELECT *, inventory_docs
 
SELECT *, inventory_docs
FROM orders
+
  FROM orders
WHERE inventory_docs IN (SELECT *
+
WHERE inventory_docs IN (
FROM inventory
+
    SELECT *
WHERE sku= orders.item);
+
      FROM inventory
 +
    WHERE sku = orders.item
 +
);
 
</pre>
 
</pre>
 
<div class="hint" title="'orders' collection">
 
<div class="hint" title="'orders' collection">
 
<pre>
 
<pre>
{ "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 }
+
{"_id": 1, "item": "almonds", "price": 12, "quantity": 2}
{ "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 }
+
{"_id": 2, "item": "pecans", "price": 20, "quantity": 1}
{ "_id" : 3 }
+
{"_id": 3}
 
</pre>
 
</pre>
 
</div>
 
</div>
Line 142: Line 136:
 
</pre>
 
</pre>
 
</div>
 
</div>
<pre class=def>
+
<pre class="def"><nowiki>
 
db.orders.aggregate([
 
db.orders.aggregate([
   {
+
   {$lookup: {
    $lookup:
+
       from: "inventory",
       {
+
      localField: "item",
        from: "inventory",
+
      foreignField: "sku",
        localField: "item",
+
      as: "inventory_docs"
        foreignField: "sku",
+
    }
        as: "inventory_docs"
+
}]);</nowiki></pre>
      }
 
  }
 
])
 
</pre>
 
 
</div>
 
</div>

Latest revision as of 16:54, 18 July 2018

The examples in this page are taken from the MongoDB reference documentation.

Modelling Relations in Document Databases

Document databases typically do not implement JOIN operations. Unlike relational databases, data is usually stored in a denormalised format and document structure is not enforced by default.
Instead, One-to-One and One-to-Many relationships between documents are typically done using what are known as embedded or nested documents, e.g:

One-to-One

{
    _id: "joe",
    name: "Joe Bookreader",
    address: {
        street: "123 Fake Street",
        city: "Faketon",
        state: "MA",
        zip: "12345"
    }
}

One-to-Many

{
    _id: "joe",
    name: "Joe Bookreader",
    addresses: [{
            street: "123 Fake Street",
            city: "Faketon",
            state: "MA",
            zip: "12345"
        },
        {
            street: "1 Some Other Street",
            city: "Boston",
            state: "MA",
            zip: "12345"
        }
    ]
}

Some document databases also allow the use of references, where documents can point to other documents in the collection, allowing for Many-to-Many relationships.
In the following example concerning publishers and publications, the 'books' field contains the IDs of related documents - books published by O'Reilly Media.

{
    name: "O'Reilly Media",
    founded: 1980,
    location: "CA",
    books: [123456789, 234567890, ...]
}, {
    _id: 123456789,
    title: "MongoDB: The Definitive Guide",
    author: ["Kristina Chodorow", "Mike Dirolf"],
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English"
}, {
    _id: 234567890,
    title: "50 Tips and Tricks for MongoDB Developer",
    author: "Kristina Chodorow",
    published_date: ISODate("2011-05-06"),
    pages: 68,
    language: "English"
}

It would be possible to embed the publisher information into each book document but this would lead to replication of data and any changes to O'Reilly's details would require updating several documents rather than just one, which in turn may lead to inconsistent data.
An alternative would be to embed each book in a list inside the publisher document, but over time this would lead to large document growth which may cause the document to be relocated on disk when it outgrows its allocated space - resulting in a drop in performance.

Using a list of references reduces this problem but does not remove it entirely. To avoid this an additional field can be added to each book document:

{
    _id: "oreilly",
    name: "O'Reilly Media",
    founded: 1980,
    location: "CA"
}, {
    _id: 123456789,
    title: "MongoDB: The Definitive Guide",
    author: ["Kristina Chodorow", "Mike Dirolf"],
    published_date: ISODate("2010-09-24"),
    pages: 216,
    language: "English",
    publisher_id: "oreilly"
}, {
    _id: 234567890,
    title: "50 Tips and Tricks for MongoDB Developer",
    author: "Kristina Chodorow",
    published_date: ISODate("2011-05-06"),
    pages: 68,
    language: "English",
    publisher_id: "oreilly"
}

Unlike relational databases, there are no primary-foreign key constraints in document databases. Therefore, operations that preserve database integrity such as cascading delete must be implemented by the developer rather than the underlying database software.

MongoDB's $lookup

MongoDB's $lookup stage of the aggregation pipeline can be used to provide similar behaviour to a LEFT OUTER JOIN in SQL.

The following example joins documents the orders and inventory collections and is equivalent to the following SQL statement:

SELECT *, inventory_docs
  FROM orders
 WHERE inventory_docs IN (
    SELECT *
      FROM inventory
     WHERE sku = orders.item
);
{"_id": 1, "item": "almonds", "price": 12, "quantity": 2}
{"_id": 2, "item": "pecans", "price": 20, "quantity": 1}
{"_id": 3}
{
        "_id" : 1,
        "sku" : "almonds",
        "description" : "product 1",
        "instock" : 120
}
{ "_id" : 2, "sku" : "bread", "description" : "product 2", "instock" : 80 }
{
        "_id" : 3,
        "sku" : "cashews",
        "description" : "product 3",
        "instock" : 60
}
{
        "_id" : 4,
        "sku" : "pecans",
        "description" : "product 4",
        "instock" : 70
}
{ "_id" : 5, "sku" : null, "description" : "Incomplete" }
{ "_id" : 6 }
db.orders.aggregate([
   {$lookup: {
       from: "inventory",
       localField: "item",
       foreignField: "sku",
       as: "inventory_docs"
    }
}]);