practical json - pgconf india · 2020-03-16 · ivan panchenko postgres professional . about myself...
TRANSCRIPT
![Page 1: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/1.jpg)
Practical JSON
Ivan Panchenko
Postgres Professional
![Page 2: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/2.jpg)
About myself
• Astronomer
• Application developer since 1996
• Use PostgreSQL since 1999
• Use JSON in PostgreSQL since 2009
(yes, before it was supported)
• Use PL/Perl since 2009
• Cofounder of Postgres Professional Co.
![Page 3: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/3.jpg)
About this talk
• Too short time to cover full JSON
• So the talk does not cover JSON completely
• The documentation does not, either
• So I will give the main ideas
• And I hope it would be useful
![Page 4: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/4.jpg)
How to store polymorphic rows?
EAV and brothers? No!
2003 (Pg 8.2) : HStore extension (T.Sigaev, O.Bartunov)
– a date type to store sets of key-value pairs 'a=>1, b=>2'::hstore
Operators: • Retrieve value by key : hstore -> text
• Key exists? : hstore ? text
• Hstore contains other one ? : hstore @ hstore
Index support with GiST, GIN (since 2006)
![Page 5: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/5.jpg)
Jsonb comes to help
JSON:
• Standard
• Multi-level
• JSONB (better)
HStore:
• Fast
• Multi-level Hstore (proposal)
• 2011: Joseph Adams proposed a JSON extension for 9.1
(not accepted)
• 2012: 9.2: json data type (storage and verification)
• 2012: Nested HStore proposal
• 2014: 9.4 JSONB (Bartunov, Korotkov, Sigaev)
• 2016: SQL:2016 standard: JSON support (Technical Report)
• 2019: 12 SQL/JSON (Bartunov, Korotkov, Glukhov)
https://commitfest.postgresql.org/17/1471/
![Page 6: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/6.jpg)
Google trends:
jsonb vs hstore
![Page 7: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/7.jpg)
Db-Engines.com ranking
(relative)
![Page 8: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/8.jpg)
JSON or JSONB ?
JSON:
• Text
• Preserves key order
• Allows duplicate keys
• Keeps spaces
• Compact (if no spaces), May be
2 times smaller than JSONB, for
short values.
• Fast INSERT
• No @> (contains) operator
JSONB:
• Fast values retrieval (up to
1000 times!)
• Sorted and unique keys.
• Index support
JSON was not
designed for
storage !!!
![Page 9: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/9.jpg)
Start practicing….
![Page 10: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/10.jpg)
Creating JSON(B)
1. From textual representation: '{}'::jsonb
2. Constructing functions: jsonb_build_object(), array_to_json(), …
https://postgrespro.com/docs/postgresql/12/functions-json
3. From database objects:
SELECT row_to_json(pg_class) FROM pg_class;
![Page 11: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/11.jpg)
Values retrieval
1. Operators: json->text, json->>text:
SELECT row_to_json(pg_class)->>'relname'
FROM pg_class;
2. Operators: json #> text[], json #>>text[]
3. Iterator functions: json_each, json_each_text,
json_array_elements, json_array_elements_text
SELECT * FROM
json_each_text('{"a":"foo","b":"bar"}')
key | value
-----+-------
a | foo
b | bar
![Page 12: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/12.jpg)
Simple search
1. Операторы json->text, json->>text:
WHERE js ->>'key' = 'value';
2. Операторы json #> text[], json #>>text[]
WHERE js #>>ARRAY['key'] = 'value';
3. Функции json_each, json_each_text
WHERE EXISTS (
SELECT * FROM json_each_text(js)
WHERE key='key' AND value='value'
);
B
Tree
![Page 13: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/13.jpg)
Test data { "company_name": "Postgres Vr ̥ttipara", "offices": [
{ "name": "Main",
"city": "Bengaluru",
"area": 1000
},
{ "name": "Secondary",
"city": "Pune",
"area" : 200
}
]
}
CREATE TABLE tbl (js jsonb); -- Bad practice!
![Page 14: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/14.jpg)
“Contains” search (@>) Which companies have offices in Bengaluru?
WHERE js @> '{"offices":[ {"city" : "Bengaluru"} ]}'
CREATE INDEX idx ON tbl USING GIN (js);
Bitmap Heap Scan on tbl (cost=68.77..411.01 rows=100 width=194)
Recheck Cond: (js @> '{"offices": [{"city": "Bengaluru"}]}'::jsonb)
-> Bitmap Index Scan on idx (cost=0.00..68.75 rows=100 width=0)
Index Cond: (js @> '{"offices": [{"city": "Bengaluru"}]}'::jsonb)
GIN
![Page 15: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/15.jpg)
More complicated search
Which companies have offices in any of the two cities?
WHERE
js @>
'{"offices":[ {"city" : "Bengaluru"} ]}'
OR
js @>
'{"offices":[ {"city" : "Pune"} ]}'
Other (pure SQL) way:
WHERE EXISTS (
SELECT *
FROM jsonb_array_elements(js->'offices') t
WHERE t->>'city' IN ('Bengaluru', 'Pune')
);
GIN
![Page 16: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/16.jpg)
Even more complicated search
Which companies have MAIN office in any of the two cities?
WHERE
js @>
'{"offices":[
{"city": "Bengaluru", "name": "Main" }
]}'
OR
js @>
'{"offices":[
{"city": "Pune", "name": "Main" }
]}'
GIN
![Page 17: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/17.jpg)
SQL style…
Which companies have MAIN office in any of the two cities?
WHERE EXISTS(
SELECT *
FROM jsonb_array_elements(js->'offices') t
WHERE t->>'city' IN ('Bengaluru', 'Pune')
AND t->>'name' = 'Main'
);
No index acceleration …
![Page 18: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/18.jpg)
Most complicated search for today
Which companies have MAIN office with area > 500 sq.m in
any of the two cities?
WHERE EXISTS(
SELECT *
FROM jsonb_array_elements(js->'offices') t
WHERE t->>'city' IN ('Bengaluru', 'Pune')
AND t->>'name' = 'Main'
AND (t->>'area')::float > 500
);
@> operator cannot help for >…
(in fact, it can, if used for preselection)
![Page 19: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/19.jpg)
Some conclusions
Search with -> is good and can be fast with B-Tree
functional indexes, but has poor flexibility.
Search with @> is more flexible, and fast with GIN, but
not effective fo complicated queries.
SQL-style search is most flexible, but not effective.
???????
![Page 20: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/20.jpg)
Advanced JSONB search solutions
JSquery extension (Alexander Korotkov)
JSONPATH – SQL:2016 features in PostgreSQL 12
(O.Bartunov, N.Glukov, A.Korotkov, L.Mantrova,
T.Sigaev)
JSONTABLE – continuation in PostgreSQL 13
![Page 21: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/21.jpg)
JSQuery
Defines a data type for a language for JSON search queries
Jsonb @@ jsquery
GIN index support (2 types of GIN indexes optimized for
different types of queries)
![Page 22: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/22.jpg)
JSONPATH
• Defines a JSONPATH data type for a jsonb query
language.
• Defines functions and operators.
• Defines an opclass for index search.
• Implements SQL:2016.
Why a datatype?
• Because we can (this is a Postgres way)
• To enable operators and indexes, leveraging
PostgreSQL extendability
![Page 23: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/23.jpg)
JSONPATH: retrieval
City of the first office:
SELECT jsonb_path_query_first(
js, '$.offices[0].city'
) FROM tbl;
City of the Main office with area > 500
SELECT jsonb_path_query_first(
js,
'$.offices[*]
? (@.area > 500 && @.name == "Main")
.city'
) FROM tbl;
![Page 24: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/24.jpg)
JSONPATH: search
SELECT * FROM tbl WHERE
js @?
'$.offices[*]
? (@.area > 500 && @.name == "Main")
.city'
CREATE INDEX idx1 ON tbl
USING gin (js jsonb_path_ops);
GIN
Details on JSONPATH:
https://github.com/obartunov/sqljsondoc/blob/master/jsonpath.md
More opclasses in JSQuery
![Page 25: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/25.jpg)
What else can we do with JSON ?
![Page 26: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/26.jpg)
JSON enthusiast’s table:
-- ( not recommended! )
CREATE TABLE data (
fields jsonb
);
![Page 27: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/27.jpg)
JSON enthusiast’s table:
(next day)
-- ( not recommended! )
CREATE TABLE data (
fields jsonb
);
CREATE INDEX data_pk
ON data (fields->>'id');
![Page 28: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/28.jpg)
JSON enthusiast’s table:
(2 days later)
CREATE TABLE foo(
fields jsonb
);
CREATE UNIQUE INDEX data_pk
ON foo((fields->>'id'));
CREATE UNIQUE INDEX data_pki
ON foo (((fields->>'id'):int));
-- which is better?
INSERT INTO foo
SELECT jsonb_build_object('q',random(),'id', x)
FROM generate_series(10000000,20000000);
![Page 29: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/29.jpg)
Storing all fields in a single JSONB
1. Impossible to define referential constraints
CREATE TABLE data_smth (
data_id int4 REFERENCES ……… ???? WTF
);
Part 1. Storing primary key in JSON
2. Do not forget to check it for NOT NULL
CHECK (fields->>'id' IS NOT NULL);
3. Still it is not a PK. Define IDENTITY for logical
replcation…
( not recommended! )
![Page 30: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/30.jpg)
Storing all fields in a single JSONB
Part 2. Storing all other data in JSON
( not recommended! )
• Needs more space (keys + JSONB overhead) not always!!
• Slower retrieval (specially with TOAST)
• index scan ~ same performance, Seq scan is 2-3 times
slower even without TOAST
• No statistics: Planning is less effective.
• No types. Even dates missing!
• Less readability (sometimes)
A good article by Dan Robinson.
![Page 31: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/31.jpg)
How to save space with JSON ?
• Merge several rows into one row
• Which rows to merge?
• Close in space-time, or semantically
• Example: part of a time series.
![Page 32: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/32.jpg)
Which fields to store in JSON?
• Really polymorphic data
• Better to store schema metadata too
• Use something to restrict from chaos
• Data with internal structure
• Specially if you plan to use JSQuery or JSONPATH
• If you need to store history, regardless of the schema
changes
![Page 33: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/33.jpg)
Polymorphic structure with versions
CREATE TABLE foo (
id int primary key,
subclass … ,
schema_version …
/*
SUPERCLASS_FIELDS
*/
subclass_fields jsonb
);
CREATE TRIGGER …
-- ensure schema
foo
Subclass1 Subclass2 Subclass3
subclasses
Metadata
subclass_versions
subclass_fields
![Page 34: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/34.jpg)
Fields with internal structure
• Image
{
file_name: '...',
file_path: '...',
width:
height:
}
• Query plan
a big tree returned by
EXPLAIN (FORMAT JSON)
![Page 35: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/35.jpg)
Obtaining and storing query plans
• Cannot obtain it in pure SQL: SELECT FROM (EXPLAIN …)
• Also impossible: EXPLAIN INTO ….
• Need to make a function, like that:
CREATE FUNCTION explain (query text)
RETURNS jsonb
LANGUAGE plperl TRANSFORM FOR TYPE jsonb AS
$$
my ($sql) = @_;
my $res = spi_exec_query(
"EXPLAIN(FORMAT JSON) $sql", 1);
return $res->{rows}[0]->{"QUERY PLAN"};
$$;
![Page 36: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/36.jpg)
JSON aggregates
• Problem: Obtain a book list together with authors
book person
authorship
pos int
Field theory // Landau, L.D., Lifshitz, E.M.
The C Programming Language // Kernigan, B.V., Ritchie, D.M.
![Page 37: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/37.jpg)
JSON aggregates
Field Theory
Langau, L.D. Lifshitz, E.M.
The C Programming Language
Kernigan, B.V., Ritchie, D.M.
![Page 38: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/38.jpg)
JSON aggregates
SELECT book.*,
( SELECT json_agg(row_to_json(x))
FROM (
SELECT person.*
FROM person
JOIN authorship
ON person.id = person_id
WHERE book_id = book.id
ORDER BY pos
) x
) authors
FROM book;
![Page 39: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/39.jpg)
More JSON aggregates
• Expanding attributes from referenced tables
book publisher
series
SELECT book.*,
row_to_json(series) AS $series,
row_to_json(publisher) AS $publisher
FROM book
LEFT JOIN publisher
ON published.id = book.publisher
LEFT JOIN series
ON series.id = book.series
![Page 40: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/40.jpg)
More complex JSON aggregates
• Retrieve a book with hierarchical classification
CREATE TABLE sections ( /* Section tree*/
id int PRIMARY KEY,
parent int REFERENCES sections(id),
title text
);
CREATE TABLE book ( /* Books */
…
sections[] int
…
);
![Page 41: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/41.jpg)
More complex JSON aggregates
• Retrieve a book with hierarchical classification
Output:
{ sections: [ /* Paths of all sections of this
book*/
[ {id:..., title:...},
{id:....} …
], -- 1st section path
[ {id:...
...
] -- 2nd section path
...
}
![Page 42: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/42.jpg)
Test data
INSERT INTO section VALUES
(1, NULL, 'DBMS'),
(2, 1, 'PostgreSQL'),
(3, NULL, 'OS'),
(4, 3, 'Solaris');
INSERT INTO book VALUES (
'Postgres for Solaris OS',
ARRAY[2,4]
);
![Page 43: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/43.jpg)
Step 1: compute the section path
WITH RECURSIVE path AS (
SELECT s.id, s.title, s.parent , 0 AS level
FROM section s WHERE s.id = 4
UNION ALL
SELECT s.id, s.title, s.parent,
path.level+1 AS level
FROM section s
JOIN path ON s.id = path.parent
)
SELECT id,title FROM path ORDER BY level
DESC;
![Page 44: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/44.jpg)
Step 2: aggregate the section path
SELECT json_agg(row_to_json(path)) AS path
FROM ( WITH RECURSIVE path AS (
SELECT s.id, s.title, s.parent , 0 AS level
FROM section s WHERE s.id = 4
UNION ALL
SELECT s.id, s.title, s.parent, path.level+1 AS level
FROM section s JOIN path ON s.id = path.parent
) SELECT id,title FROM path ORDER BY level DESC
) path;
path
-----------------------------------------------------
[{"id":3,"title":"OS"}, {"id":4,"title":"Solaris"}]
(1 row)
![Page 45: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/45.jpg)
Step 3: aggregate several paths SELECT * FROM (
SELECT unnest(ARRAY[2,4]) node_id
) nodes
JOIN LATERAL ( SELECT json_agg(row_to_json(path)) AS path
FROM (
WITH RECURSIVE path AS ( SELECT s.id, s.title, s.parent , 0 AS level FROM section s WHERE s.id = node_id UNION ALL SELECT s.id, s.title, s.parent, path.level+1 AS level FROM section s JOIN path ON s.id = path.parent
) SELECT id,title FROM path ORDER BY level DESC
) path
) AS path ON true;
node_id | path
---------+----------------------------------------------------------
2 | [{"id":1,"title":"DBMS"}, {"id":2,"title":"PostgreSQL"}]
4 | [{"id":3,"title":"OS"}, {"id":4,"title":"Solaris"}]
(2 rows)
![Page 46: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/46.jpg)
Step 4: final result SELECT title, (
SELECT json_agg(path.path)
FROM ( SELECT unnest(book.sections) node_id
) nodes
JOIN LATERAL (
SELECT json_agg(row_to_json(path)) path
FROM (
WITH RECURSIVE path AS (
SELECT s.id, s.title, s.parent , 0 AS level
FROM section s WHERE s.id = node_id
UNION ALL
SELECT s.id, s.title, s.parent, path.level+1 AS level
FROM section s JOIN path ON s.id = path.parent
)
SELECT id,title FROM path ORDER BY level DESC
) path
) path ON true
) paths FROM book;
![Page 47: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/47.jpg)
The output -[ RECORD 1 ]-----------------------------------------------
title | Postgres for Solaris OS
paths |
[
[{"id":1,"title":"DBMS"}, {"id":3,"title":"PostgreSQL"}],
[{"id":2,"title":"OS"}, {"id":4,"title":"Solaris"}]
]
Disclaimer:
• The more complex is the query,
the more sick is the optimizer
Extracting too
much data
increases your
carbon footprint
![Page 48: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/48.jpg)
Several queries in one Problem: return several result sets as one.
SELECT
( SELECT json_agg(row_to_json(book)) books
FROM book )::jsonb ||
( SELECT json_agg(row_to_json(section)) sections
FROM section)::jsonb;
or
SELECT json_build_object(
'books',
(SELECT json_agg(row_to_json(book)) books FROM
book ),
'sections',
(SELECT json_agg(row_to_json(section)) sections
FROM section));
![Page 49: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/49.jpg)
Make a tree Problem: build a multi-level hierarchical tree in JSON [{ id: 1, title: 'DBMS', nodes: [
{ id: 2, title: 'Relational DBMS',
nodes: [
……
etc.
• Of course we need a recursive CTE, but…
– CTE adds rows, but we need to reduce number of
rows…
– Wow! Only aggregates do this! We need an
aggregate.
![Page 50: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/50.jpg)
Make a tree: plan
1) Preprocess the tree with CTE
2) Make a smart aggregate, which puts each node on its place
in the tree.
![Page 51: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/51.jpg)
Step 1: Compute node paths WITH RECURSIVE positioned_section AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY parent ORDER BY id) -1 AS pos FROM section ), t AS ( SELECT id,title, ARRAY['nodes', pos::text] AS path FROM positioned_section WHERE parent IS NULL
UNION ALL
SELECT s.id, s.title, t.path || ARRAY['nodes',pos::text] AS path FROM positioned_section s JOIN t ON s.parent = t.id
)
SELECT row_to_json(t)::jsonb node FROM t ORDER BY path;
![Page 52: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/52.jpg)
Step 1: Result
node
---------------------------------------------------------
{"id": 1, "path": ["nodes", "0"], "title": "DBMS"}
{"id": 3, "path": ["nodes", "0", "nodes", "0"], "title":
"PostgreSQL"}
{"id": 2, "path": ["nodes", "1"], "title": "ОС"}
{"id": 4, "path": ["nodes", "1", "nodes", "0"], "title":
"Solaris"}
(4 rows)
![Page 53: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/53.jpg)
Step 2: Aggreate CREATE OR REPLACE FUNCTION tree_agg_f (state jsonb, item jsonb)
RETURNS jsonb LANGUAGE sql IMMUTABLE STRICT AS $$
SELECT jsonb_set(
state,
(SELECT array_agg(x)
FROM jsonb_array_elements_text(item->'path') x
),
jsonb_build_object(
'id', item->'id',
'title', item->'title',
'nodes','[]'::jsonb
),
true);
$$ ;
CREATE AGGREGATE tree_agg (item jsonb) (
SFUNC=tree_agg_f, STYPE=jsonb, INITCOND='{"nodes":[]}');
![Page 54: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/54.jpg)
Step 3: 1+2 SELECT jsonb_pretty(tree_agg(x.node)) FROM (
WITH RECURSIVE
positioned_section AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY parent ORDER BY id)
-1 AS pos
FROM section
),
t AS (
SELECT id,title,
ARRAY['nodes', pos::text] AS path
FROM positioned_section WHERE parent IS NULL
UNION ALL
SELECT s.id, s.title,
t.path || ARRAY['nodes',pos::text] AS path
FROM positioned_section s JOIN t ON s.parent = t.id
)
SELECT row_to_json(t)::jsonb node FROM t ORDER BY path)
x;
![Page 55: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/55.jpg)
One more example: Histogram
A single aggregate for calculating complex statistics.
For example:
Compute frequences of random integers from 0 to 20.
Obtaining random numbers:
SELECT( random()*20)::int
FROM generate_series(1,20);
![Page 56: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/56.jpg)
Aggregate for frequencies
CREATE OR REPLACE FUNCTION freq_agg_f (
state jsonb, item int)
RETURNS jsonb LANGUAGE sql IMMUTABLE STRICT AS
$$
SELECT jsonb_set(
state,
ARRAY[item]::text[],
to_jsonb(
COALESCE((state->>(item::text))::int,0)+1
)
);
$$;
CREATE AGGREGATE freq_agg (int) (
SFUNC=freq_agg_f, STYPE=jsonb, INITCOND='{}');
![Page 57: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/57.jpg)
The result
SELECT freq_agg ( (random()*20)::int ) FROM
generate_series(1,20);
freq_agg
----------------------------------------------------
{"2": 1, "3": 2, "5": 1, "6": 2, "10": 2, "11": 2,
"13": 1, "15": 3, "16": 3, "17": 1, "18": 2}
(1 row)
![Page 58: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/58.jpg)
JSON in PL/* SQL and PL/PgSQL are not always effective with JSON.
• The code is not always readable
• The object is copied for any operation
What about PL/Perl, PL/Python, PL/v8 ?
• PL/Perl and PL/Python get JSON as text, unless
TRANSFORM is explicitly used
• PL/v8 gets JSON transformed into internal
representation automatically
• PL/v8 is not included in PostgreSQL distribution and has
a version lag
![Page 59: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/59.jpg)
Aggregate function in PL/v8
CREATE EXTENSION plv8;
CREATE FUNCTION v8_agg_f (state jsonb, item
int) RETURNS jsonb AS $$
state[item] ++;
return state;
$$
LANGUAGE plv8 IMMUTABLE STRICT;
CREATE AGGREGATE v8_agg (int)
(SFUNC=v8_agg_f, STYPE=jsonb,
INITCOND='{}');
-- 4 times faster!
![Page 60: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/60.jpg)
Aggregate function in PL/Perl
CREATE EXTENSION plperl;
CREATE EXTENSION jsonb_plperl; -- defines TRANSFORM
CREATE FUNCTION pl_agg_f (state jsonb, item int)
RETURNS jsonb AS $$
my ($state, $item) = @_;
$state->{$item} ++;
return $state;
$$
LANGUAGE plperl TRANSFORM FOR TYPE jsonb
IMMUTABLE STRICT;
CREATE AGGREGATE pl_agg (int)( SFUNC=pl_agg_f, STYPE=jsonb,
INITCOND='{}');
-- 2.5 times faster than SQL
![Page 61: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/61.jpg)
For a snack: NULL pitfall
• jsonb_set function is STRICT
SELECT jsonb_set (some_large_json, path, NULL)
gives NULL. Usually people do not expect this.
Correct way:
SELECT jsonb_set (some_large_json, path,
'NULL'::jsonb)
![Page 62: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/62.jpg)
Future
• Single data type for JSON
• Storage optimization (TOAST modification, compression, etc)
• Lazy transform
• Faster search inside
• JSONPATH search engine optimization
![Page 63: Practical JSON - PGConf India · 2020-03-16 · Ivan Panchenko Postgres Professional . About myself • Astronomer • Application developer since 1996 ... JSquery extension (Alexander](https://reader033.vdocument.in/reader033/viewer/2022042909/5f3c41d810a98b1e1a3d1f0e/html5/thumbnails/63.jpg)
More info
• Rethinking JSONB (PgCon 2015)
• JSON, JSONB, JSQuery (PgConf.Russia 2017)
• PL/{Perl,Python,V8}
• JSONB by example (PgConf.EU 2019)
• JSONPATH manual