
Performance analysis of joins in reactive queries

Amy Pavel
University of California, Berkeley
[email protected]

Mitar Milutinovic
University of California, Berkeley
[email protected]

ABSTRACT
Simple document-oriented databases like MongoDB forgo many traditional features of relational databases in favor of better performance. This means that applications often need to build additional features on top of the simple provided features. For example, applications often have to resolve relationships between stored documents. The recursive queries incurred by resolving relationships introduce delays. To mitigate such delays, we developed PeerDB, a modification to MongoDB. PeerDB optimizes for the common use case where the main document only requires a subset of fields from any related documents. PeerDB stores subsets of related fields as subdocuments of the main document. In theory, this modification should reduce existing recursive read delays, as applications will no longer need to recursively query documents in many cases. However, we have not previously evaluated PeerDB quantitatively to find out whether it improves the performance of MongoDB. In addition, there is no method for automatically selecting which fields to embed using PeerDB.

We contribute a comparison of read and write times between three systems: PeerDB, MongoDB, and PostgreSQL. In addition, we make each comparison with both low-level queries and high-level web applications. We find that although PeerDB provides faster reads than the other systems under some conditions, this performance gain comes with trade-offs (e.g., write time and traffic).

To help developers manage these trade-offs when choosing which fields to embed, we propose and evaluate a new algorithm for automatically selecting which fields to embed based on a given read and write workload. Our algorithm compares many possible configurations using a novel cost model to predict expected impact, then returns the lowest-cost configuration. We test our algorithm under different workloads and parameter settings.

Author Keywords
Document stores, Reactive queries, Embedding subdocuments, MongoDB, PeerDB, PostgreSQL, Meteor

INTRODUCTION
In the traditional SQL world of relational databases, you do joins between related records every time you read them from the database. This makes reading slower: your database management system redoes the same join computation for every read, and horizontal scaling of a database to many instances is harder because every read might potentially have to talk to other instances.

UC Berkeley CS262A Class Project, Fall 2014.
PeerDB source code is available at: https://github.com/peerlibrary/meteor-peerdb
Benchmarking source code is available at: https://github.com/mitar/peerdb-benchmark
Docker images used when running benchmarks are available at: https://registry.hub.docker.com/u/mitar/peerdb-benchmark/

[Figure 1 diagram: client-database timelines for SQL (one query Q1, one large response R1), MongoDB (queries Q1 through Q7 with responses R1 through R7), and PeerDB (one query Q1, one response R1).]

Figure 1. In traditional SQL, one query is needed to obtain data spanning multiple relations, but the results can be potentially large and time is needed to compute the joins. In MongoDB, one has to issue many subsequent queries to resolve relations, especially for many-to-many relations. In PeerDB, one simply fetches one document with all relations already resolved, without any computation or communication among instances needed.


NoSQL databases like MongoDB remove relations between documents and leave it to users to resolve relations on their own. This often means fetching one document, observing which other documents it references, and recursively fetching those as well. For many-to-many relations this leads to a potentially large number of queries. Because each of those documents is stand-alone and static, it is relatively easy and quick for a database management system like MongoDB to find and return them. Such an approach is quick and scales horizontally easily, but the downside is the multiple round trips your code has to make to get all the documents you are interested in. In modern web applications, where program logic often runs directly in the browser and accesses data through a web API, those round trips become even worse because the queries travel over the Internet, where latency is much higher.

Modern web frameworks like Meteor and DerbyJS introduced another family of database queries to the web developer community: reactive queries. Using those frameworks, you can make a database query where you receive results as you would for a normal query, but afterwards the query stays open and any changes to the result data are pushed to you in a reactive manner. This allows you (or a framework you use) to respond to these changes in data, for example by updating a rendered template in the browser. Programming in this way becomes much more declarative.

The downside is the complexity a programmer faces when making queries with joins. While frameworks provide a simple-to-use primitive for making reactive queries against one MongoDB collection and receiving updates to documents in that collection as they are made for the open query, programming how manually resolved relations should respond to those updates is far from straightforward.


The community is slowly building libraries aiming to address these issues, but current solutions are made for specific types of queries and all share a common issue stemming from the previously mentioned downside of manually resolving relations: high latency and a large number of queries. As the query result data changes, queries for related documents have to be rerun again and again, even if the results of those new queries might not differ from the previous ones, or even if a smaller query for just a subset of documents would suffice.

Thus, the combination of joins in NoSQL databases with reactive queries poses a challenge to programmers. Luckily, we can observe that in many cases we are mostly interested in only a few fields of a related document, again and again. Instead of recomputing joins every time we read, we could use MongoDB's subdocuments feature to embed those fields along with the reference. Instead of storing just the _id of a related document, we could also store those few often-used fields. For example, if you are displaying blog posts, you want to display the author's name together with the blog post; you will rarely need the blog post without the author's name. If an example blog post document looks like:

{"_id": "frqejWeGWjDTPMj7P","body": "A simple blog post","author": "yeK7R5Lws6MSeRQad","tags": [

"k7cgWtxQpPQ3gLgxa","KMYNwr7TsZvEboXCw"

]}

Then the blog post with embedded common fields would look like:

{"_id": "frqejWeGWjDTPMj7P","body": "A simple blog post","author": {

"_id": "yeK7R5Lws6MSeRQad","name": "Wesley Crusher","picture": "/img/yeK7R5Lws6MSeRQad.jpg"

},"tags": [

{"_id": "k7cgWtxQpPQ3gLgxa","name": "announcement","description": "Public announcements."

},{

"_id": "KMYNwr7TsZvEboXCw""name": "test","description": "Tests of the system."

}],"comments": [

{"_id": "tMgj8mF2zF3gjCftS",

"body": "A test comment."}

]}

Now we have to fetch only this one document, and we have everything needed to display the blog post. It is easy to query it and use it as any other document, with direct access to all subfields, over which we can query as well. For example, it is easy to query for all blog posts tagged with a given tag, provided that we have only its textual name to begin with: the MongoDB query {"tags.name": "test"} returns all the documents tagged with the tag test. The query can be reactive, and we can easily get updates to it as data changes, without having to redo any queries for related documents.
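For illustration, here is a minimal sketch of such a query using pymongo against the embedded layout above; the database and collection names are our own assumptions, not part of the paper's benchmark code:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    posts = client["blog"]["posts"]

    # One query returns posts with the author, tags, and comments already
    # embedded; no follow-up queries per related document are needed.
    for post in posts.find({"tags.name": "test"}):
        print(post["body"], "by", post["author"]["name"])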

Observe that the blog post also contains comment data. Each comment has a reference to the blog post it belongs to; information about the relation is thus stored in comment documents and not in the posts themselves. But we can embed this relation in the reverse direction, as a list of subdocuments representing each comment that references the blog post.

From the latency perspective, such embedding of related documents makes queries much faster when reading. Reactive queries do not have to requery related documents. From the programmer's perspective this is much simpler to use, because they can use familiar MongoDB queries directly.

Now, storing the author's name along with every blog post document brings an issue common when dealing with denormalized data. What if a user changes their name? Then we have to update all those fields in documents referencing the user. So we would have to make sure that anywhere our code changes the name, it also updates the fields in references. And what about changes to the database coming from outside our code?

To address this issue and make it easier for a programmer to embed subdocuments of related documents, we designed and developed a layer on top of MongoDB called PeerDB. A programmer can now specify, in a declarative way, relations between documents and which fields they would like to embed. They have to define those references once, and then PeerDB makes sure the data stays in sync as it is changed across documents. It does not matter where the changes come from; PeerDB will detect them and update fields in referenced subdocuments accordingly. Instead of having to make recursive queries every time you read documents together with their related documents, PeerDB makes those recursive queries once and stores the results into embedded documents. The downside is that every modification to the data in the database triggers updates to documents in other collections. While modifications to the document itself are stored quickly, changes have to propagate across all other documents.

To better understand the performance characteristics of PeerDB, we first present its implementation and then our performance analysis approach and results. We find that PeerDB performs better for reads than MongoDB alone and PostgreSQL on our benchmarks, when we manually select which fields to embed.


Next, we present an algorithm to automatically pick which fields to embed based on the expected impact of reads and writes under a user-specified workload. We find that our algorithm produces results that agree with intuition. We conclude with our observations and a discussion of possible future work.

RELATED WORK
PeerDB is related to existing research in database cracking, materialized views, and customizing key-value stores, and to discussions in the MongoDB development community. Our automatic embedding algorithm is related to work in automatically tuning databases.

Materialized views
A view is a function from a set of base tables to a derived table. A materialized view stores the tuples of a view in the database itself. This way, database accesses to the materialized view can be faster than recomputing the view, especially when computing the view is expensive [9]. Materialized views for expensive queries would not work very well if the materialized view had to be recomputed whenever records were added or deleted, so past work has addressed how to efficiently maintain materialized views with incremental updates [14, 4, 9, 20, 21].

In general, materialized views relate to our work because they create and update copies of data to decrease response time for expensive queries. While materialized views can be constructed for an arbitrary SQL query, PeerDB limits embedding to fields for natural joins between documents. This simplifies the logic and addresses the most common use case.

Column store architecture and database cracking
As we are restructuring our data to optimize read time, column store databases and database cracking are related to our project. A column store relational database stores the values for each attribute contiguously, unlike traditional row store databases that store the attributes of a record contiguously [16]. Column store databases allow the DBMS to read only the values of the columns required for processing a given query, so they perform better in read-mostly applications. Products such as Sybase IQ and KDB have demonstrated that this architecture improves performance [16, 7].

Idreos et al. provide an optimization for column stores called database cracking [13]. In database cracking, a column A is copied as A_CRK when A is first queried. Then, A_CRK is physically organized so that the values that satisfy the query are stored in contiguous space. Thus, database cracking speeds up subsequent queries for similar values. Follow-up work proposed algorithms for updating cracked databases under high-volume insertions/deletions [11], and algorithms for increasing the efficiency of tuple reconstruction for multi-attribute queries [12].

We are inspired by these successful methods for copying and reorganizing data to speed up read-heavy applications, but those methods target optimizations at the level of the database itself, in how data is stored on permanent storage, while we address how the data itself is structured and sent to the client, to lower the number of queries and the query time.

Adding functionality to non-relational databases
NoSQL-style databases sacrifice "one-size-fits-all" functionality for speed [17]. Thus, programmers build extra functionality on top of the simple database to satisfy application-specific needs. Many research papers detail systems that supplement NoSQL databases to support complex queries, ACID properties, and SLAs [6, 5, 3, 2].

To our knowledge, no existing academic work addresses how to decrease the number of database round trips required to resolve recursive object relations in document store databases such as MongoDB. The MongoDB manual [15] and a NoSQL survey paper [17] note application categories that inform when to (a) embed related objects and when to (b) reference related objects. However, many applications do not comfortably fit either category, so application developers have written guides for normalizing objects so that you can both embed and reference objects [18]. Unfortunately, following these guides requires a lot of effort and introduces opportunities for error. Further, no one has quantified the benefits or downsides of denormalizing objects in MongoDB.

Automatically tuning databases
Our work provides an algorithm to automatically suggest which fields to embed for a given workload. Therefore, work on automatically tuning databases is relevant.

Several past works suggest methods for automatically selecting database indexes and materialized views [8, 22, 10, 19, 1]. We were particularly inspired by the work of Agrawal et al., which enumerated many configurations of materialized views and then compared the configurations based on their expected impact on the sum of the costs of queries in a workload [1]. Many follow-up papers used a similar methodology [8, 22, 10].

As these works are intended for specific SQL configurations, we could not directly apply their methods. However, our algorithm is structured similarly to the algorithm proposed in [1]: we enumerate many configurations of embedded fields, then compare these configurations based on their expected impact on the sum of the costs of reads and writes of a given workload. We propose and evaluate a novel cost model to evaluate proposed configurations of embedded vs. referenced fields based on a given workload.

PEERDB IMPLEMENTATION
PeerDB is implemented as a library for the Meteor web framework. It provides a declarative way to define a database schema for documents and relations between documents for your Meteor web application. Together with specifying relations, you can also specify which fields should be embedded as subdocuments, for both forward and reverse relations. Currently, PeerDB requires the programmer to decide whether and which fields to embed. That decision can often be made based on the program's logic and profiling of queries, much as one would decide whether to have an index on a field. Both embedding and indexes trade faster read times for slower write times. In later sections we analyze this trade-off and present some general guidelines and an automatic algorithm to help decide whether to embed fields.
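As a purely illustrative sketch (PeerDB's actual API is a CoffeeScript class-based DSL; the Python dictionary structure and key names below are our invention for exposition), a declarative specification for the blog example might convey:

    EMBEDDED_SCHEMA = {
        "posts": {
            # Forward one-to-many relation: embed two author fields.
            "author": {"collection": "persons",
                       "fields": ["name", "picture"]},
            # Forward many-to-many relation: embed two fields per tag.
            "tags": {"collection": "tags",
                     "fields": ["name", "description"], "many": True},
            # Reverse relation: embed the body of each comment on the post.
            "comments": {"collection": "comments", "reverse": "post",
                         "fields": ["body"], "many": True},
        }
    }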


Additionally, PeerDB provides an easy way to define client-side generator fields (fields whose value is computed based on the values of other fields) and generic triggers, which can be implemented in the client-side language and are run in the PeerDB process instead of inside the database management system. This allows easy integration and code reuse with the rest of the application.

PeerDB uses an abstraction over the MongoDB oplog to implement all of the above features. The MongoDB oplog originally serves replication among multiple MongoDB instances, sending a stream of all changes from the master to slaves. We can connect to this same oplog to observe all changes in the database and determine whether a change touches a field which is embedded in a related document, or is otherwise observed as part of generated fields and triggers. If so, PeerDB issues update queries which update all those fields in subdocuments in related documents.
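The underlying technique can be sketched with a tailable cursor in pymongo (PeerDB itself is written in CoffeeScript; the database name and the choice of field below follow the blog example and are our assumptions):

    import pymongo

    client = pymongo.MongoClient("mongodb://localhost:27017")
    oplog = client.local["oplog.rs"]  # the oplog exists on replica set members

    # A tailable cursor stays open and yields new oplog entries as the
    # primary applies writes, similar to `tail -f` on a file.
    cursor = oplog.find(
        {"ns": "blog.persons", "op": "u"},  # updates to the persons collection
        cursor_type=pymongo.CursorType.TAILABLE_AWAIT,
    )
    for entry in cursor:
        person_id = entry["o2"]["_id"]           # which person changed
        set_fields = entry["o"].get("$set", {})  # what the update set
        if "name" in set_fields:
            # Propagate the new name into every post embedding this author.
            client.blog.posts.update_many(
                {"author._id": person_id},
                {"$set": {"author.name": set_fields["name"]}},
            )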

In this way the original queries return immediately: writes finish as soon as they would without PeerDB. PeerDB then makes the data consistent asynchronously; more precisely, eventually consistent. So while writes themselves are still as fast as they can be under MongoDB, an important metric to observe is the time to consistency, the point at which PeerDB stops sending any more queries to update the data after a modification is made.

Such an asynchronous and decoupled architecture allows us to use PeerDB alongside other programs modifying the database. We can run PeerDB as a dedicated instance while the rest of the program is written in another language or system. Multiple different systems can all work on the same database, and PeerDB will still make sure the data is consistent. This allows us to support legacy applications and modern cloud web applications, which are often composed of multiple separate services working together on one database through various APIs.

Because PeerDB is decoupled from the database management system itself, it does not interfere with the system's internal operations. This makes the whole system more robust because the interfaces between components have clean borders, but it leaves some optimization opportunities for future work, where PeerDB could be smarter about the update queries it makes based on information available internally to the database management system.

Asynchronous operation addresses issues with possible infinite loops which might arise in some other implementations, for example one where updates would be made synchronously inside a post-query hook. Updates to data made by PeerDB can lead to further updates by PeerDB if they modify data observed in other relations, generated fields, or triggers. A loop could occur in a post-query hook implementation if PeerDB updates ran synchronously and unconditionally: queries against other observed fields (but not necessarily modifying them) would immediately trigger more queries, which would trigger more, and so on. By observing modifications of the data directly, instead of hooking into queries, PeerDB does not have to deal with parsing and understanding queries, and updates are run only when data really changes, not when a query merely uses an observed field. Loops can still occur if a programmer links triggers in a way that mutually modifies the data, for example, a field which increments its value based on another field which is itself generated by incrementing the value of the first field. Detection of such loops is left for future work.

As currently implemented, PeerDB issues update queries in a straightforward way, without any optimizations which might reduce the number of unnecessary update queries. Queries are simply issued, and it is left to MongoDB to determine that a particular query does not have anything to update.

PeerDB runs as a background process inside a Meteor application, observing changes in the database and issuing the updates. To scale, it provides a way to run as multiple separate processes/instances to distribute the load, with each instance observing and reacting to just a subset of all documents based on their ID.
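A minimal sketch of such ID-based partitioning (our own illustration; PeerDB's actual bucketing scheme may differ):

    import hashlib

    def handles(doc_id: str, instance: int, num_instances: int) -> bool:
        # Hash the document ID into one of num_instances buckets; an
        # instance processes a change only if it owns that bucket.
        digest = hashlib.md5(doc_id.encode("utf-8")).digest()
        return digest[0] % num_instances == instance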

PeerDB is implemented in CoffeeScript and available as an open-source library at https://github.com/peerlibrary/meteor-peerdb.

PERFORMANCE ANALYSIS
In this section we present the performance analysis and a comparison of PeerDB-enabled queries vs. traditional approaches using a relational DBMS and a NoSQL database.

Setup
We approached the performance analysis by designing a simple data schema which consists of a one-to-many relation, a many-to-many relation, and a reverse relation. To make the schema easier to understand and discuss, we give its entities the meaningful names of a simple blog application. Figure 2 shows the entities and the relations between them.

[Figure 2 diagram, blog application schema: Person (name, bio, picture); Tag (name, description); Post (body, author (person_id), tags (tag ids)); Comment (body, post (post_id)).]

Figure 2. A blog post is the main entity we are querying. It has a one-to-many author relation to the person entity and a many-to-many tags relation to the tag entity. The comment entity has a one-to-many post relation to the post entity; for the post this is a reverse relation, where we are interested in all comments made for a given post.

To measure performance we decided to use a query which uses all relations in the schema. By querying for a blog post document, in addition to the post data itself, we want to get:

• name and picture of the author

• name and description of all tags of the blog post

• the bodies of all of the blog post's comments


Entity     Number of documents
Person     100
Tags       100
Posts      1000
Comments   10000

Table 1. Basic numbers of documents used in the benchmark.

Field             Size in bytes
Person name       11
Person bio        1000
Person picture    10
Tag name          11
Tag description   10
Post body         1000
Comment body      10

Table 2. Basic sizes of fields in documents used in the benchmark. The person picture, tag description, and comment body fields are the ones we varied in size.

The concrete query we used is as follows: given a string tag name, obtain all blog posts which are tagged with that tag, such that for each blog post the additional data mentioned above is available.
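As a rough sketch of what the plain-MongoDB variant has to do (an approximation of our low-level Python implementation; collection names follow Figure 2, and the actual benchmark code may differ in detail):

    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017")["blog"]

    def posts_for_tag(tag_name):
        # First round trip: resolve the tag name to its _id.
        tag = db.tags.find_one({"name": tag_name})
        results = []
        for post in db.posts.find({"tags": tag["_id"]}):
            # One extra round trip per relation, for every post.
            post["author"] = db.persons.find_one(
                {"_id": post["author"]}, {"name": 1, "picture": 1})
            post["tags"] = list(db.tags.find(
                {"_id": {"$in": post["tags"]}}, {"name": 1, "description": 1}))
            post["comments"] = list(db.comments.find(
                {"post": post["_id"]}, {"body": 1}))
            results.append(post)
        return results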

To be able to compare the measurements we implemented thisschema and query in multiple systems:

• using the PostgreSQL relational DBMS with low-level queries in Python

• using the PostgreSQL relational DBMS with high-level queries in Django, a Python-based web framework

• using MongoDB with low-level queries in Python

• using MongoDB with high-level queries in Meteor

• using MongoDB with PeerDB in Meteor

• using MongoDB with PeerDB in Python

The motivation was to measure both low-level and high-level database interfaces, to compare the difference between using a high-level web framework and not; to measure both a traditional relational DBMS and a NoSQL one; and, of course, to see how the system behaves with and without PeerDB.

The basic numbers of documents used in the benchmark are shown in Table 1, and the basic sizes of fields in documents in Table 2. The basic number of tags per post was 10. We uniformly distributed all comments across all blog posts; for future work we could consider using a long-tailed distribution.

We scaled these basic numbers and sizes to vary:

• the number of documents, using multiplication factors 1, 2, 4, 6, 8, and 10

• the size of the person picture, tag description, and comment body fields, using sizes of 10, 100, 1000, 10000, and 100000 bytes

When varying one dimension we kept the other at its basicvalue.

In addition, for the two PeerDB-based systems (Meteor and Python), we additionally varied how many PeerDB instances were observing and handling the changes in the database, running all the above variations on 1, 2, 4, 6, 8, and 10 instances.

[Figure 3 plot: read time (s) vs. number-of-documents multiplier, one line per system.]

Figure 3. Read time vs. number of documents. The number of documents is shown as a multiplier of the basic numbers shown in Table 1.


Execution
For the execution of the benchmark we prepared a Docker image with Ubuntu 14.04.1 Linux as the base system, PostgreSQL 9.3.5, and MongoDB 2.4.9. We ran that image on three Linux-based 64-bit systems, each with 32 GB of memory and two Intel Xeon X5670 2.93 GHz processors with 6 cores per processor. We varied the type of program run and its parameters when running an image to obtain the following measurements.

Results
Results of the measurements based on varying the number of related documents are shown in Figures 3, 4, and 5. One can see that reads are in general much faster in MongoDB than in PostgreSQL, and that low-level approaches are faster as well. It is clear that reading is fastest using Meteor with PeerDB and stays constant no matter the number of related documents, as only one document ever needs to be read instead of multiple documents for the relations.

Write times show that Django does not scale well as the number of inserts increases. There is unexplained growth for Meteor with PeerDB once we grow past a certain threshold, but in general writes are quick.

On the other side, Figure 5 shows the downside of using PeerDB. Convergence times are huge, as PeerDB takes time to make sure everything is consistent. The more changes are made, the longer this time is. In this case we made many changes at once and then waited for all updates to finish.

Figures 6, 7, and 8 show measurements when we varied the size of embedded documents. When reading, size matters initially, but eventually all systems except Django converge to approximately the same times, as the time to transfer the data overpowers everything else.


[Figure 4 plot: write time (s) vs. number-of-documents multiplier, one line per system.]

Figure 4. Write time vs. number of documents. The number of documents is shown as a multiplier of the basic numbers shown in Table 1.

[Figure 5 plot: convergence time (s) vs. number-of-documents multiplier, for the two PeerDB variants.]

Figure 5. Convergence time vs. number of documents. The number of documents is shown as a multiplier of the basic numbers shown in Table 1.

[Figure 6 plot: read time (s) vs. size of embedded fields, one line per system.]

Figure 6. Read time vs. size of embedded fields. Size is shown in bytes.

[Figure 7 plot: write time (s) vs. size of embedded fields, one line per system.]

Figure 7. Write time vs. size of embedded fields. Size is shown in bytes.


[Figure 8 plot: convergence time (s) vs. size of embedded fields, for MongoDB Meteor PeerDB.]

Figure 8. Convergence time vs. size of embedded fields. Size is shown in bytes.

Similarly, when writing, it seems the main time is taken by transmitting the data once fields become large. Writing appears to be slow with Meteor, probably because of all the serialization and deserialization taking place internally.

Convergence time grows, but not as drastically as when varying the number of documents, because the number of PeerDB update queries stays the same; the number of those queries seems to be the main reason convergence takes time.

AUTOMATED EMBEDDED FIELD SELECTION
PeerDB lets you embed fields to obtain faster reads on those fields at the expense of more time for writes and higher network traffic. For the benchmarks, we manually selected which fields to embed in the post document. Here we present a general algorithm for picking fields to embed in a document, based on the expected impact of embedding vs. referencing a set of fields under a given workload.

Overview
Our algorithm optimizes the configuration of embedded fields in a main document D according to a workload specification. For instance, if a post document D references an author document, we could choose to embed all of the author's fields, none of the author's fields, or some combination. To choose among the many possible configurations, our algorithm takes into account the expected field read and field write workloads specified by the user, along with the expected field sizes. The algorithm assigns a cost to each configuration based on an approximate cost model and returns the lowest-cost configuration to the user, who uses it to specify which fields to embed under the expected workload.

Workload specification
For queries to a given document D, users need to supply how often they will query each document field and each field in related documents D_R. Specifically, for all fields f_i \in D \cup D_R, users specify an expected frequency of reads originating from D for each field, E_R(f_i), an expected frequency of writes for each field, E_W(f_i), and an expected size of each field, E_S(f_i).
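In code form, such a specification can be as simple as a table of per-field expectations. The following sketch mirrors the sample workload in Table 3 (the dictionary layout is our own illustration, not a format the algorithm prescribes):

    WORKLOAD = {
        # field: expected size (KB), reads/day from D, writes/day
        "author.name":     {"ES": 0.1,  "ER": 1000, "EW": 1},
        "author.bio":      {"ES": 1,    "ER": 0,    "EW": 5},
        "author.picture":  {"ES": 1000, "ER": 50,   "EW": 5},
        "tag.name":        {"ES": 0.1,  "ER": 1000, "EW": 1},
        "tag.description": {"ES": 1,    "ER": 0,    "EW": 2},
        "post.body":       {"ES": 10,   "ER": 1000, "EW": 1},
    }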

Cost model
Our cost model assigns a cost to a document configuration C = \{c_1, c_2, \ldots, c_n \mid c_i = 1 \text{ if field } f_i \in D_R \text{ is embedded}\}, considering the expected read time R, write time W, and traffic T under a given workload specification S:

Cost(C, S) = w_R \cdot R + w_W \cdot W + w_T \cdot T

The parameters w_R and w_W allow the user to set the importance of read time and write time respectively. For instance, the user may set w_R higher than w_W in the case that website visitors read documents, but only admins write documents. As we do not consider the network traffic of calls other than those for the given document and its fields, w_T can be used to adjust the importance of keeping the number of messages passed as a result of document calls to a minimum.

Computing the read-time term, R
The read-time term takes into account the read times for all document fields and referenced document fields in a given configuration C, for a given workload S. F denotes all fields f_i. We define F_0 as all fields f_i where c_i = 0, and F_1 as all fields f_i where c_i = 1. F_D encompasses all fields in the base document, which are not assigned in the configuration. The equation for R also takes into account the time to request data, M, and the time to read one KB of data, K:

R = \Big[ M + \max_{f_i \in F_1 \cup F_D} E_R(f_i) \cdot \sum_{f_i \in F_1 \cup F_D} K \cdot E_S(f_i) + \sum_{D_{R_i} \in D_R} \Big\{ M + \max_{f_i \in D_{R_i} \cap F_0} E_R(f_i) \cdot \sum_{f_i \in D_{R_i} \cap F_0} K \cdot E_S(f_i) \Big\} + P \Big]

The first summation calculates the expected read time for all document fields (F_D) and embedded fields (F_1). The second summation adds the expected read time for all secondary requests for referenced documents and all the time required to read the fields (F_0) of those documents. Because we assume we read all data from each document when we fetch it, the max terms calculate how many times we need to fetch each document to match the required field reads. Overall, this equation shows that embedding a field saves read time on the secondary requests incurred by referenced documents and their fields. We can see that embedding fields will decrease read time in cases where the cost of a secondary request is high, or where the referenced documents are large.

Finally, P is the penalty for inconsistency. If a read occurs before the written data is consistent, the read must recursively fetch the referred document. P accounts for the additional time for recursively fetching the referred document, as that time is not covered by the other summations. For each embedded document, we add back the time to retrieve the referenced document multiplied by the chance that the field is read while the data is inconsistent, I(f_i), which depends on the frequency of writes and reads. Here D_{R_{F_1}} refers to all documents that have an embedded field.

P = I \cdot \sum_{D_{R_i} \in D_{R_{F_1}}} \Big[ M + \sum_{f_i \in D_{R_i}} K \cdot E_S(f_i) \cdot E_R(f_i) \Big]

In our implementation we set I(f_i) = 0 for all f_i \in F. In future work, we will use a detailed analysis to figure out how to relate the frequency of writes and reads to the chance that the read data is inconsistent. Like M and K, this term will be unique to each system.

Computing the write-time term, W
The write-time term takes into account the write times for all f_i \in D \cup D_R for a given configuration C and workload S:

W = \sum_{f_i \in F_1} K \cdot E_S(f_i) \cdot E_W(f_i) + \sum_{f_i \in F} K \cdot E_S(f_i) \cdot E_W(f_i)

The first summation counts time for embedded fields, and the second summation counts time for all fields. This reflects that a write to an embedded field requires a second write to update the embedding document, whereas a write to a referenced-only field requires just one. The write-time term W shows that embedding fields will not be beneficial in write-heavy workloads.

Computing the traffic term, T
We add the term T, with its weight w_T, to allow the user to adjust how important it is to keep the network traffic from PeerDB low. T is the number of messages passed as a result of reads and writes:

T = 2 \cdot |D \text{ reads}| + 2 \cdot |D_R \text{ reads}| + 4 \cdot |\text{embedded writes}| + 2 \cdot |\text{referenced writes}|

Picking a configuration
The total number of possible configurations for any schema is 2^n, where n represents the number of fields in referenced documents. In our implementation n is small, so we use a brute-force approach to find the configuration with the lowest cost. In cases where n is large, we could apply a greedy algorithm or simulated annealing to find a near-optimal answer.
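The search itself is simple to sketch in Python. The following is a simplified rendition of the idea rather than our actual implementation: it assumes one related document per relation, I = 0, illustrative M and K constants, the WORKLOAD dictionary from the earlier sketch, and it omits the T term:

    from itertools import product

    M, K = 5.0, 0.01   # assumed request latency and per-KB read time
    wR, wW = 1.0, 1.0  # user-chosen weights (wT omitted in this sketch)

    RELATED = ["author.name", "author.bio", "author.picture",
               "tag.name", "tag.description"]

    def cost(config, workload):
        embedded = {f for f, c in zip(RELATED, config) if c}
        # Read term: one fetch for the main document plus its embedded
        # fields, then one secondary fetch per relation that still has
        # referenced (non-embedded) fields being read.
        main = [f for f in workload if f.startswith("post.")] + sorted(embedded)
        R = M + max(workload[f]["ER"] for f in main) * sum(
            K * workload[f]["ES"] for f in main)
        for rel in ("author.", "tag."):
            refd = [f for f in RELATED
                    if f.startswith(rel) and f not in embedded]
            if refd and max(workload[f]["ER"] for f in refd) > 0:
                R += M + max(workload[f]["ER"] for f in refd) * sum(
                    K * workload[f]["ES"] for f in refd)
        # Write term: every field is written once; embedded fields once more.
        W = sum(K * v["ES"] * v["EW"] for v in workload.values())
        W += sum(K * workload[f]["ES"] * workload[f]["EW"] for f in embedded)
        return wR * R + wW * W

    def best_configuration(workload):
        # Enumerate all 2^n embeddings and keep the cheapest one.
        return min(product([0, 1], repeat=len(RELATED)),
                   key=lambda c: cost(c, workload))

With the read-heavy workload above, this search should favor embedding small, frequently read fields such as the author and tag names, in line with the evaluation below.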

Evaluation
We implemented our model and tested it on one schema (Figure 9) under several different workload specifications and parameter weightings.

First, we ran our algorithm with the illustrated schema (Figure 9) and workload specification (Table 3), with the parameter settings w_R = 1, w_W = 0, w_T = 0. Under these settings, the algorithm tells us that the lowest-cost configuration embeds the author name and tag name, but no other fields. This is not surprising: considering only the read-time term, we expect fields that are often read with the post and are not large to be embedded. The author picture field is large and only occasionally read with D. As a result, the algorithm does not embed the author picture, to avoid fetching it on every post query.

[Figure 9 diagram: Post (author, body, tags) referencing Author (name, bio, picture) and Tag (name, description).]

Figure 9. We tested our algorithm with a schema where post was the main document D, and the post referenced the related documents author and tags. The possible configurations of post embed all related document fields, none of the related document fields, or any combination of the related document fields.

Field             E_S (KB)   E_R (reads/day)   E_W (writes/day)
Author name       .1         1000              1
Author bio        1          0                 5
Author picture    1000       50                5
Tag name          .1         1000              1
Tag description   1          0                 2
Post body         10         1000              1

Table 3. A sample workload specification for our schema. We quantify expected reads and writes per day over fields from all post documents. In this case, we have a read-heavy workload where most views include the author name, post body, and tag names with the post, but only some views include the author picture.


Setting the parameters to w_R = 0, w_W = 1, w_T = 0 (only considering writes), the algorithm tells us not to embed any fields, because every embedded field incurs extra write time. Finally, setting the parameters to only pay attention to the messages passed, w_R = 0, w_W = 0, w_T = 1, we get that we should embed all fields except author bio and tag description. The algorithm picks this configuration because we provided a read-heavy workload, and each non-embedded field requires extra messages to read.

For this schema and workload, if we set the parameters to w_R = 1, w_W = 1, w_T = 0, we unsurprisingly get the same configuration as suggested by considering the read time alone. However, if we swap the expected read values in the workload specification with the expected write values and leave the parameters at w_R = 1, w_W = 1, w_T = 0, the algorithm tells us not to embed any fields.

Overall, our algorithm agrees with intuition in these cases. Such an algorithm can pick the lowest-cost fields to embed based on a schema and an expected workload specification. In future work we will rigorously test our model to make sure it matches the benchmark expectations.

Implementation
The algorithm was implemented in unoptimized Python code and run on a MacBook Pro. Each run of the algorithm, considering all embeddings and returning the lowest-cost configuration, took around 2 ms.

Discussion
Although useful for many applications, our model makes several simplifications. For instance, our model does not take into account the reverse-field feature of PeerDB. Further, our model does not take into account the possibility of embedding an entire referenced document instead of using PeerDB. Embedding the whole document, instead of providing a reference, is helpful in cases where you only reference the embedded document from the parent document [15].


In the future, we will add this option by augmenting the model and workload specification to include querying documents other than the main document D. Our model also assumes that you read all the information in a document when you fetch it. In the future we could let the user specify which combinations of fields they would read and adjust our cost model to handle such specifications.

We also made other minor simplifications, such as using the same constants K and M for all writes and reads. In reality, these numbers differ between writes and reads. In the future, we could provide an application to learn these constants for any given system.

FUTURE WORK
Our work leads to three main directions for future work: improvements to PeerDB, additional benchmarking, and improvements to the automatic embedding algorithm and its evaluation. This section addresses each of these areas in turn.

Improving PeerDB
One way to improve PeerDB is to further optimize its queries. As PeerDB is decoupled from the database management system, it is unable to take advantage of the system's internal information to optimize queries. In the future, we could more closely integrate PeerDB with the underlying database to better optimize queries. Further, PeerDB does not handle detection of update loops. In the future, we could apply existing loop detection algorithms to figure out when a loop occurs in the system.

Benchmarking PeerDB
We benchmarked PeerDB and compared it to other databases across several variables and implementations. In the future, we will run more real-world benchmarks based on modeling our data to reflect real-world trends (e.g., long-tailed comment distributions). We will also collect data from users of PeerDB to conduct comparisons of PeerDB to MongoDB and PostgreSQL using real-world data.

Improving the automatic algorithm
One direction for improving the automatic algorithm is extending its applicability to more use cases. For instance, the algorithm does not currently consider all possible sets of queries (only those that strictly augment the smallest set). In future work, we will extend the model to encompass more query sets. As mentioned before, future work may also extend the algorithm to handle cases where we embed entire subdocuments without having the document separately referenced (no help from PeerDB). In addition, we want to extend it to handle reverse queries. Another direction for improving the automatic algorithm is verifying and enhancing the accuracy of the cost model and its input parameters. Future work could learn the parameters, simulate possible configurations for greater accuracy, and compare our current cost model closely to actual system performance. We could also evaluate low-cost configurations against high-cost configurations using our benchmarking system, to see whether the relative performance of these configurations confirms intuition. One final direction for improving the automatic algorithm is assuring that it works at scale. Currently, our algorithm uses brute force to calculate the lowest-cost embeddings. In cases where there are more possible combinations of embeddings, we could use an existing optimization technique to find an approximate configuration; for instance, we could apply simulated annealing or a greedy approach.

CONCLUSIONS
We have described and evaluated the PeerDB system. In particular, we have shown that PeerDB outperforms MongoDB and PostgreSQL on read-mostly workloads under some conditions. We also showed that this performance increase comes at the expense of higher write times and more network traffic. To help developers manage these trade-offs, we have presented and evaluated an automatic algorithm to suggest a low-cost configuration for a given workload. Overall, this paper contributes an evaluation which suggests benefits of embedding fields with PeerDB, and an algorithm to automatically pick which fields to embed.

REFERENCES
1. Agrawal, S., Chaudhuri, S., and Narasayya, V. Automated Selection of Materialized Views and Indexes for SQL Databases. VLDB (2000).
2. Baker, J., Bond, C., Corbett, J. C., Furman, J. J., Khorlin, A., Larson, J., Léon, J.-M., Li, Y., Lloyd, A., and Yushprakh, V. Megastore: Providing Scalable, Highly Available Storage for Interactive Services. 223–234.
3. Beaver, D., Kumar, S., Li, H. C., Sobel, J., and Vajgel, P. Finding a Needle in Haystack: Facebook's Photo Storage. OSDI (2010).
4. Blakeley, J. A., Larson, P.-A., and Tompa, F. W. Efficiently Updating Materialized Views. SIGMOD (1986), 61–71.
5. Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., and Gruber, R. E. Bigtable: A Distributed Storage System for Structured Data. OSDI (2006).
6. DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., and Vogels, W. Dynamo: Amazon's Highly Available Key-value Store. SOSP (2007), 205–220.
7. French, C. "One Size Fits All" Database Architectures Do Not Work for DSS. SIGMOD (1995), 449–450.
8. Goldstein, J., and Larson, P.-A. Optimizing Queries Using Materialized Views: A Practical, Scalable Solution. SIGMOD (2001), 331–342.
9. Gupta, A., and Mumick, I. S. Maintenance of Materialized Views: Problems, Techniques, and Applications. 1–16.
10. Gupta, H., and Mumick, I. Selection of Views to Materialize in a Data Warehouse. IEEE Transactions on Knowledge and Data Engineering 17, 1 (Jan. 2005), 24–43.
11. Idreos, S., Kersten, M. L., and Manegold, S. Updating a Cracked Database. SIGMOD (2007), 413–424.
12. Idreos, S., Kersten, M. L., and Manegold, S. Self-organizing Tuple Reconstruction in Column-stores. SIGMOD (2009), 297–309.
13. Idreos, S., Manegold, S., and Kersten, M. Database Cracking. CIDR (2007), 1–8.
14. Larson, P.-A., and Yang, H. Z. Computing Queries from Derived Relations. VLDB (1985), 259–269.
15. MongoDB. Data Modeling Introduction, 2014.
16. Stonebraker, M., Abadi, D. J., Batkin, A., Chen, X., Cherniack, M., Ferreira, M., Lau, E., Lin, A., Madden, S., O'Neil, E., O'Neil, P., Rasin, A., Tran, N., and Zdonik, S. C-Store: A Column-oriented DBMS. VLDB (2005), 553–564.
17. Strauch, C. NoSQL Databases.
18. Wanschik, T. JOINs via Denormalization for NoSQL Coders, 2010.
19. Yang, J., Karlapalem, K., and Li, Q. Algorithms for Materialized View Design in Data Warehousing Environment. VLDB (1997).
20. Zhou, J., Larson, P.-A., and Elmongui, H. G. Lazy Maintenance of Materialized Views. VLDB (2007), 231–242.
21. Zhou, J., Larson, P.-A., Goldstein, J., and Ding, L. Dynamic Materialized Views. ICDE (2007), 526–535.
22. Zilio, D. C., Zuzarte, C., Lohman, G. M., Cochrane, R. J., Colby, L., and Valentin, G. Recommending Materialized Views and Indexes with the IBM DB2 Design Advisor. IEEE ICAC (2004).