![Page 1: PostgreSQL - The BuildPostgreSQL as a Schemaless Database. Christophe Pettus PostgreSQL Experts, Inc. OSCON 2013](https://reader035.vdocument.in/reader035/viewer/2022070217/612261748554f00d0274e0ea/html5/thumbnails/1.jpg)
PostgreSQL as a Schemaless Database.
Christophe Pettus
PostgreSQL Experts, Inc.
OSCON 2013
Welcome!
• I’m Christophe.
• PostgreSQL person since 1997.
• Consultant with PostgreSQL Experts, Inc.
• thebuild.com
• @xof on Twitter.
What’s on the menu?
• What is a schemaless database?
• How can you use PostgreSQL to store schemaless data?
• How do the various schemaless options perform?
A note on NoSQL.
• Worst. Term. Ever.
• It’s true that none of the modern schemaless databases use SQL, but…
• Neither did Postgres before it became PostgreSQL. (Remember QUEL?)
• The defining characteristic is the lack of a fixed schema.
Schematic.
• A schema is a fixed (although mutable over time) definition of the data.
• Database to schema (unfortunate term) to table to field/column/attribute.
• Individual fields can be optional (NULL).
• Adding new columns requires a schema change.
Rock-n-Roll!
• Schemaless databases store “documents” rather than rows.
• They have internal structure, but…
• … that structure is per document.
• No fields! No schemas! Make up whatever you like!
We are not amused.
• Culturally, very different from the glass house data warehouse model.
• Grew out of the need for persistent object storage…
• … and impatience with the (perceived) limitations of relational databases and object-relational managers.
Let us never speak of this again.
• There’s a lot to talk about in schemaless vs. traditional relational databases.
• But let’s not.
• Today’s topic: If you want to store schemaless data in PostgreSQL, how can you?
• And what can you expect?
What is schemaless data?
• Schemaless does not mean unstructured.
• Each “document” (=record/row) is a hierarchical structure of arrays and key-value pairs.
• The application knows what to expect in one of these…
• … and how to react if it doesn’t get it.
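A minimal Python sketch of that idea, with hypothetical field names chosen only for illustration: two documents with different shapes live side by side, and the application decides what to do when a key is missing.

```python
# Hypothetical documents for illustration; the field names are assumptions.
documents = [
    {"id": 1, "name": "Ada", "tags": ["admin", "staff"],
     "address": {"city": "Portland", "state": "OR"}},
    {"id": 2, "name": "Grace"},  # no tags, no address: still a valid document
]


def city_of(document, default="unknown"):
    """Return the nested city, or a default when the document lacks one."""
    return document.get("address", {}).get("city", default)


cities = [city_of(d) for d in documents]
print(cities)
```

The structure lives in each document; the knowledge of what to expect lives in `city_of`, not in the database.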
PostgreSQL has you covered.
• Not one, not two, but three different document types:
• XML
• hstore
• JSON
• Let’s see what they’ve got.
XML
It seemed like a good idea at the time.
XML
• Been around since the mid-1990s.
• Hierarchical structured data based on SGML.
• Underlying technology for SOAP and a lot of other stuff that was really popular for a while.
• Still super-popular in the Java world.
XML, your dad’s document language.
• Can specify XML schemas using DTDs.
• No one does this.
• Can do automatic transformations of XML into other markups using XSLT.
• Only the masochistic do this.
• Let’s not forget the most important use of XML!
<Server port="8005" shutdown="SHUTDOWN" debug="0">
  <Service name="Tomcat-Standalone">
    <Connector className="org.apache.catalina.connector.http.HttpConnector"
               port="8080" minProcessors="5" maxProcessors="75"
               enableLookups="true" redirectPort="8443"
               acceptCount="10" debug="0" connectionTimeout="60000"/>
    <Engine name="Standalone" defaultHost="localhost" debug="0">
      <Logger className="org.apache.catalina.logger.FileLogger"
              prefix="catalina_log." suffix=".txt" timestamp="true"/>
      <Realm className="org.apache.catalina.realm.MemoryRealm" />
      <Host name="localhost" debug="0" appBase="webapps" unpackWARs="true">
        <Valve className="org.apache.catalina.valves.AccessLogValve"
               directory="logs" prefix="localhost_access_log." suffix=".txt"
               pattern="common"/>
        <Logger className="org.apache.catalina.logger.FileLogger"
                directory="logs" prefix="localhost_log." suffix=".txt"
                timestamp="true"/>
        <Context path="/examples" docBase="examples" debug="0" reloadable="true">
          <Logger className="org.apache.catalina.logger.FileLogger"
                  prefix="localhost_examples_log." suffix=".txt"
                  timestamp="true"/>
        </Context>
      </Host>
    </Engine>
  </Service>
</Server>
Tomcat Configuration Files.
XML Support in PostgreSQL.
• Built-in type.
• Can handle documents up to 2 gigabytes.
• A healthy selection of XML operators.
• xpath in particular.
• Very convenient XML export functions.
• Great for external XML requirements.
XML Indexing.
• There isn’t any.
• Unless you build it yourself with an expression index.
• Functionality is great.
• Performance is… we’ll talk about this later.
hstore
The hidden gem of contrib/
hstore
• A key/value storage type specific to PostgreSQL.
• Maps string keys to string values.
• Contrib module; not part of the PostgreSQL core.
hstore functions
• Lots and lots and lots of hstore functions.
• h -> 'a' (get value for key a).
• h ? 'a' (does h contain key a?).
• h @> 'a=>2' (does h contain key a with value 2?).
• Many others.
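As a rough analogy only, the semantics of a few hstore operators can be sketched with a plain Python dict. The helper names below are hypothetical; real hstore evaluates these operators inside PostgreSQL, and each comment shows the SQL operator being imitated.

```python
# Rough dict-based analogy; real hstore runs inside PostgreSQL.
h = {"a": "2", "b": "x"}          # hstore keys and values are both strings


def get(h, key):                   # h -> 'a'
    return h.get(key)              # a missing key gives None (SQL NULL)


def has_key(h, key):               # h ? 'a'
    return key in h


def contains(h, other):            # h @> 'a=>2'
    return all(k in h and h[k] == v for k, v in other.items())


def has_all(h, keys):              # h ?& ARRAY['a', 'b']
    return all(k in h for k in keys)


def has_any(h, keys):              # h ?| ARRAY['a', 'z']
    return any(k in h for k in keys)
```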
hstore indexing.
• Can create GiST and GIN indexes over hstore values.
• Indexes the whole hierarchy, not just one key.
• Accelerates @>, ?, ?& and ?| operators.
• Can also build expression indexes.
JSON
All the cool kids are doing it.
JSON
• JavaScript Object Notation.
• JavaScript’s data structure declaration format, turned into a protocol.
• Dictionaries, arrays, primitive types.
• Originally designed to just be passed into eval() in JavaScript.
• Please don’t do this.
JSON, the new hotness
• The de facto standard API data format for REST web services.
• Very comfortable for Python and Ruby programmers.
• MongoDB’s native data storage type.
JSON? Yeah, we got that.
• JSON type in core as of 9.2.
• Validates JSON going in.
• And… not much else right now.
• array_to_json, row_to_json.
• Lots more coming in 9.3 (offer subject to committer approval).
JSON Indexing.
• Expression indexing.
• Can also treat as a text string for strict comparison…
• … which is kind of a weird idea and I’m not sure why you’d do that.
• But the coolest part of JSON in core is!
PL/V8!
• The V8 JavaScript engine from Google is available as an embedded language.
• JavaScript deals with JSON very well, as you’d expect.
• Not part of core or contrib; needs to be built and installed separately.
PL/V8 ProTips
• Use the static V8 engine that comes with PL/V8.
• Function is compiled by V8 on first use.
• Now that we got rid of SQL injection attacks, we now have JSON injection attacks.
• PL invocation overhead is non-trivial.
Schemaless Strategies
• Create single-field tables with only a hierarchical type.
• Wrap up the (very simple) SQL to provide an object API.
• Create indexes to taste.
• Maybe extract fields if you need to JOIN.
• Profit!
CREATE OR REPLACE FUNCTION get_json_key(structure JSON, key TEXT) RETURNS TEXT
AS $get_json_key$
    var js_object = structure;
    // Only objects have keys to look up; anything else yields SQL NULL.
    if (typeof js_object != 'object')
        return null;
    return JSON.stringify(js_object[key]);
$get_json_key$
IMMUTABLE STRICT LANGUAGE plv8;
CREATE TABLE blog (post json);
CREATE INDEX post_pk_idx ON blog((get_json_key(post, 'post_id')::BIGINT));
CREATE INDEX post_date_idx ON blog((get_json_key(post, 'post_date')::TIMESTAMPTZ));
But but but…
• PostgreSQL was not designed to be a schemaless database.
• Wouldn’t it be better to use a bespoke database designed for this kind of data?
• Well, let’s find out!
Some Numbers.
When all else fails, measure.
Schemaless Shootout!
• A very basic document structure:
• id, name, company, address1, address2, city, state, postal code.
• address2 and company are optional (NULL in relational version).
• id 64-bit integer, all others text.
• 1,780,000 records, average 63 bytes each.
The Competitors!
• Traditional relational schema.
• hstore (GiST and GIN indexes).
• XML
• JSON
• One column per table for these.
• MongoDB
Timing Harness.
• Scripts written in Python.
• psycopg2 2.4.6 for PostgreSQL interface.
• pymongo 2.4.2 for MongoDB interface.
The Test Track.
• This laptop.
• OS X 10.7.5.
• 2.8GHz Intel Core i7.
• 7200 RPM disk.
• 8GB (never comes close to using a fraction of it).
Indexing Philosophy
• For relational, index on primary key.
• For hstore, index using GiST and GIN (and none).
• For JSON and XML, expression index on primary key.
• For MongoDB, index on primary key.
• Indexes created before records loaded.
Your Methodology Sucks.
• Documents are not particularly large.
• No deep hierarchies.
• Hot cache.
• Only one index.
• No joins.
• No updates.
The Sophisticated Database Tuning Philosophy.
• None.
• Stock PostgreSQL 9.2.2, from source.
• No changes to postgresql.conf
• Stock MongoDB 2.2, from MacPorts.
• Fire it up, let it go.
First Test: Bulk Load
• Scripts read a CSV file, parse it into the appropriate format, INSERT it into the database.
• We measure total load time, including parsing time.
• (COPY will be much much much faster.)
• mongoimport too, most likely.
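The parsing half of such a load script might look roughly like the sketch below. It is not the harness used for these measurements: the field order, the `person` root tag, and the helper names are assumptions, and the database INSERT itself is left out so the code runs without a server.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Assumed column order, following the test document's field list.
FIELDS = ["id", "name", "company", "address1", "address2",
          "city", "state", "postal_code"]


def parse_row(row):
    """Map one CSV row onto field names, dropping empty optional fields."""
    record = {k: v for k, v in zip(FIELDS, row) if v != ""}
    record["id"] = int(record["id"])
    return record


def to_json(record):
    return json.dumps(record, sort_keys=True)


def _hstore_quote(text):
    # hstore literals use double quotes; backslash-escape \ and " inside them.
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_hstore(record):
    """Render a record as hstore literal text: "key"=>"value", ..."""
    return ", ".join(
        f"{_hstore_quote(k)}=>{_hstore_quote(str(v))}"
        for k, v in sorted(record.items())
    )


def to_xml(record, root_tag="person"):
    root = ET.Element(root_tag)
    for key, value in sorted(record.items()):
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# One made-up CSV line; company and address2 are empty, so they are omitted.
sample = io.StringIO('42,"Jane ""JD"" Doe",,1 Main St,,Portland,OR,97201\n')
record = parse_row(next(csv.reader(sample)))
print(to_json(record))
print(to_hstore(record))
print(to_xml(record))
```

Each resulting string would then be passed as a query parameter to a single-column INSERT, one row at a time, which is why the measured load time includes parsing.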
[Chart: Records/Second, axis 0 to 6000. Series: Relational, hstore, hstore (GiST), hstore (GIN), XML, JSON, MongoDB.]
Observations.
• No attempt made to speed up PostgreSQL.
• Synchronous commit, checkpoint tuning, etc.
• GIN indexes are really slow to build.
• The XML xpath function is probably the culprit for its load time.
Next Test: Disk Footprint.
• Final disk footprint once data is loaded.
• For PostgreSQL, reported database sizes from the pg_*_size functions.
• For MongoDB, reported by db.stats().
[Chart: Disk Footprint in Megabytes, axis 0 to 2250, split into Data and Index. Series: Relational, hstore, hstore (GiST), hstore (GIN), XML, JSON, MongoDB.]
Observations.
• GIN indexes are really big on disk.
• PostgreSQL’s relational data storage is very efficient.
• None of these records are TOAST-able.
• MongoDB certainly likes its disk space.
• padding factor was 1, so it wasn’t that.
Next Test: Query on Primary Key
• For a sample of 100 documents, query a single document based on the primary key.
• Results not fetched.
• For PostgreSQL, time of .execute() method from Python.
• For MongoDB, time of .fetch() method.
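A timing loop of this kind could be sketched as follows. This is not the original harness: `time_queries` and `fake_query` are hypothetical names, and a dictionary lookup stands in for the database call so the example runs without a server.

```python
import statistics
import time


def time_queries(run_query, keys):
    """Time run_query(key) for each key; return per-call milliseconds."""
    timings_ms = []
    for key in keys:
        start = time.perf_counter()
        run_query(key)  # in a real harness, a wrapper around the driver call
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


# Stand-in for a database call so the sketch runs without a server.
lookup_table = {i: f"doc-{i}" for i in range(1000)}
calls = []


def fake_query(key):
    calls.append(key)
    return lookup_table[key]


sample_keys = list(range(0, 1000, 10))  # a sample of 100 keys
timings = time_queries(fake_query, sample_keys)
print(f"median {statistics.median(timings):.4f} ms over {len(timings)} queries")
```

Only the call itself sits inside the timed region, which matches the "results not fetched" rule: whatever the driver does during that one call is what gets measured.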
[Chart: Fetch Time in Milliseconds, axis 0 to 400. Series: Relational, hstore, hstore (GiST), hstore (GIN), XML, JSON, MongoDB.]
[Chart: Fetch Time in Milliseconds (<100ms), axis 0 to 13. Series: Relational, XML, JSON, MongoDB.]
[Chart: Fetch Time in Milliseconds (>100ms), axis 0 to 400. Series: hstore, hstore (GiST), hstore (GIN).]
Observations.
• B-tree indexes kick ass.
• GiST and GIN not even in same league for simple key retrieval.
• Difference between relational, XML and JSON is not statistically significant.
• Wait, I thought MongoDB was supposed to be super-performant. Huh.
Next Test: Query on Name
• For a sample of 100 names, query all documents with that name.
• Results not fetched.
• Required a full-table scan (except for hstore with GiST and GIN indexes).
• Same timing methodology.
[Chart: Fetch Time in Milliseconds, axis 0 to 50000. Series: Relational, hstore, hstore (GiST), hstore (GIN), XML, JSON, MongoDB.]
[Chart: Fetch Time in Milliseconds (<500ms), axis 0 to 500. Series: Relational, hstore, hstore (GiST), hstore (GIN), MongoDB.]
[Chart: Fetch Time in Milliseconds (>500ms), axis 0 to 50000. Series: XML, JSON.]
Observations.
• GiST and GIN accelerate every field, not just the “primary” key.
• Wow, executing the accessor function on each XML and JSON field is slow.
• MongoDB’s grotesquely bloated disk footprint hurts it here.
• Not that there’s anything wrong with that.
Now that we know this, what do we know?
Some Conclusions.
• PostgreSQL does pretty well as a schemaless database.
• Build indexes using expressions on commonly-queried fields…
• … or use GiST and hstore if you want full flexibility.
• GIN might well be worth it for other cases.
Some Conclusions, 2.
• Avoid doing full-table scans if you need to use an accessor function.
• Although hstore’s are not bad compared to xpath or a PL.
• Seriously consider hstore if you have the flexibility.
• It’s really fast.
Flame Bait!
• MongoDB doesn’t seem to be more performant than PostgreSQL.
• And you still get all of PostgreSQL’s goodies.
• Larger documents will probably continue to favor PostgreSQL.
• As will larger tables.
Fire Extinguisher.
• You can find workloads that “prove” any data storage technology is the right answer.
• dBase II included.
• Be very realistic about your workload and data model, now and in the future.
• Test, and test fairly with real-world data in real-world volumes.
The Future is Now!
• Many more JSON functions in PostgreSQL 9.3, now in beta.
• Better, faster support for indexing.
• No GiST or GIN indexes yet, sadly.
mongres!
• Direct Mongo wire-protocol support in PostgreSQL.
• Uses PostgreSQL’s custom background process feature.
• Prototype, but check it out.
Thank you!
thebuild.com
@xof