PostgreSQL worst practices, version FOSDEM PGDay 2017, by Ilya Kosmodemiansky
Best practices are just boring
• Never follow them, try worst practices
• Only those practices can really help you to screw things up most effectively
• PostgreSQL consultants are nice people, so try to make them happy
How does it work?
• I have a list of a little more than 100 worst practices
• I do not make this stuff up, all of them are real-life examples
• I reshuffle my list every time before presenting and extract some number of examples
• Well, there are some things I like more or less, so it is not a very honest shuffle
0. Do not use indexes (a test one!)
• Basically, there is no difference between a full table scan and an index scan
• You can check that. Just insert 10 rows into a test table on your test server and compare.
• Nobody deals with tables of more than 10 rows in production!
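Why the 10-row test proves nothing can be sketched in a few lines. This is only an illustration, not how PostgreSQL executes queries: a linear pass over a list stands in for a full table scan, and a binary search over sorted keys stands in for an index lookup. The function names are made up for this sketch.

```python
def seq_scan_steps(keys, target):
    """Count the rows examined by a linear scan until target is found."""
    steps = 0
    for key in keys:
        steps += 1
        if key == target:
            break
    return steps


def index_scan_steps(sorted_keys, target):
    """Count the probes made by a binary search over sorted keys."""
    steps = 0
    lo, hi = 0, len(sorted_keys)
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps


def compare(n):
    """Return (linear steps, binary-search steps) for the last of n keys."""
    keys = list(range(n))
    target = n - 1  # worst case for the linear scan
    return seq_scan_steps(keys, target), index_scan_steps(keys, target)


for n in (10, 1_000_000):
    seq, idx = compare(n)
    print(f"{n} rows: linear scan {seq} steps, sorted lookup {idx} steps")
```

At 10 rows the two counts are within a handful of steps of each other; at a million rows the linear scan touches every row while the sorted lookup needs about 20 probes.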
1. Use ORM
• All databases share the same syntax
• You must write database-independent code
• Are there any benefits that are based on database-specific features?
• It is always good to learn a new complicated technology
2. Move joins to your application
• Just select * from a couple of tables into the application written in your favorite programming language
• Then join them at the application level
• Now you only need to implement nested loop join, hash join and merge join, as well as a query optimizer and a page cache
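To give a sense of what "implementing the join yourself" means, here is a minimal sketch of just one of the three algorithms, an in-memory hash join over lists of dicts. The table and column names (`users`, `orders`, `user_id`, `buyer_id`) are hypothetical. The sketch leaves out everything the database would otherwise handle: choosing the join strategy, spilling to disk, caching and statistics.

```python
from collections import defaultdict


def hash_join(left_rows, right_rows, left_key, right_key):
    """Inner-join two lists of dicts using an in-memory hash table."""
    # Build phase: index the right-hand rows by their join key.
    buckets = defaultdict(list)
    for row in right_rows:
        buckets[row[right_key]].append(row)

    # Probe phase: look up each left-hand row and emit merged rows.
    # On a column-name clash the right-hand value wins.
    joined = []
    for row in left_rows:
        for match in buckets.get(row[left_key], []):
            joined.append({**row, **match})
    return joined


# Hypothetical data that was first fetched with two separate "select *" calls.
users = [{"user_id": 1, "name": "alice"}, {"user_id": 2, "name": "bob"}]
orders = [
    {"order_id": 10, "buyer_id": 1},
    {"order_id": 11, "buyer_id": 1},
    {"order_id": 12, "buyer_id": 3},
]

result = hash_join(users, orders, "user_id", "buyer_id")
for row in result:
    print(row)
```

Both tables have to be transferred in full and held in application memory before the first joined row appears.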
3. Be in trend, be schema-less
• You do not need to design the schema
• You need only one table with two columns: id bigserial and extra jsonb
• The JSONB datatype is pretty effective in PostgreSQL, you can search in it just like in a well-structured table
• Even if you put 100M of JSON in it
• Even if you have 1000+ tps
4. Be agile, use EAV
• You need only 3 tables: entity, attribute, value
• At some point add the 4th: attribute_type
• When it starts to work slowly, just call those four tables The Core and add 1000+ tables with denormalized data
• If it is not enough, you can always add value_version
5. Try to create as many indexes as you can
• Indexes consume no disk space
• Indexes consume no shared_buffers
• There is no overhead on DML if each and every column in a table is covered with a bunch of indexes
• The optimizer will definitely choose your index once you have created it
• Keep calm and create more indexes
6. Always keep all your time series data
• Time series data, like tables with logs or session history, should never be deleted, aggregated or archived, you always need to keep it all
• You will always know where to check if you run out of disk space
• You can always call that Big Data
• Solve the problem using partitioning... one partition per hour or per minute
7. Turn autovacuum off
• It is quite an auxiliary process, you can easily stop it
• There is no problem at all with having 100 GB of data in a database which is 1 TB in size
• Servers with 2-3 TB of RAM are cheap, and IO is the fastest thing in modern computing
• Besides that, everyone likes Big Data
8. Keep master and slave on different hardware
• That will maximize the chance of an unsuccessful failover
• To make things worse, you can change only slave-related parameters on the slave, leaving defaults for shared_buffers etc.
9. Put a synchronous replica in a remote DC
• Indeed! That will maximize availability!
• Especially if you put the replica on another continent
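A rough back-of-the-envelope sketch of the cost: with synchronous replication, a commit has to wait for the replica's confirmation, so each commit takes at least one network round trip. The round-trip times below are hypothetical figures chosen for illustration, and the function name is made up for this sketch.

```python
def max_serial_commits_per_second(round_trip_ms):
    """Upper bound on back-to-back commits from one session when every
    commit must wait for at least one network round trip."""
    if round_trip_ms <= 0:
        raise ValueError("round trip time must be positive")
    return 1000.0 / round_trip_ms


# Hypothetical round-trip times, for illustration only.
same_dc = max_serial_commits_per_second(0.5)
other_continent = max_serial_commits_per_second(150.0)

print(f"replica in the same DC:       at most {same_dc:.0f} commits/s per session")
print(f"replica on another continent: at most {other_continent:.1f} commits/s per session")
```

This is only an upper bound for a single session committing one transaction after another. It ignores disk flushes on both sides, and it says nothing about what happens to commits when the remote link goes down.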
10. Reinvent Slony
• If you need some data replication to another database, try to implement it from scratch
• That allows you to run into all the problems that PostgreSQL has had since Slony was introduced
11. Use as many count(*) as you can
• The figure 301083021830123921 is very informative for the end user
• If it changes in a second to 30108302894839434020, it is still informative
• select count(*) from sometable is quite a lightweight query
• Tuple estimation from pg_catalog can never be precise enough for you
12. Never use graphical monitoring
• You do not need graphs
• Because it is an easy task to guess what happened yesterday at 2 a.m. using the command line and grep only
13. Never use Foreign Keys (use locally produced ones instead!)
• Consistency control at the application level always works as expected
• You will never get data inconsistency without constraints
• Even if you already have a bulletproof framework to maintain consistency, could it be a good enough reason to use it?
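What a missing constraint lets through can be shown in a small sketch. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a server, which is an assumption of this example and not part of the talk; the table names are hypothetical. The same idea applies to a REFERENCES constraint in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE orders_no_fk (id INTEGER PRIMARY KEY, customer_id INTEGER)"
)
conn.execute(
    "CREATE TABLE orders_fk ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customer(id))"
)
conn.execute("INSERT INTO customer (id) VALUES (1)")

# Without a constraint, a row pointing at a missing customer is accepted.
conn.execute("INSERT INTO orders_no_fk (id, customer_id) VALUES (1, 999)")
orphans = conn.execute(
    "SELECT count(*) FROM orders_no_fk o "
    "LEFT JOIN customer c ON c.id = o.customer_id "
    "WHERE c.id IS NULL"
).fetchone()[0]

# With the constraint, the same insert is rejected by the database.
try:
    conn.execute("INSERT INTO orders_fk (id, customer_id) VALUES (1, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(f"orphan rows without a foreign key: {orphans}")
print(f"insert rejected with a foreign key: {rejected}")
```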
14. Always use text type for all columns
• It is always fun to reimplement date or IP validation in your code
• You will never mistakenly convert "12-31-2015 03:01AM" to "15:01 12 of undef 2015" using text fields
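How a date stored as text goes wrong can be sketched as follows. The strings and format patterns are assumptions picked for illustration: the same text yields two different dates depending on which convention the reading code assumes, and nothing stops a value from being stored that one of the readers cannot parse at all.

```python
from datetime import datetime

US_FORMAT = "%m-%d-%Y %I:%M%p"  # month first
EU_FORMAT = "%d-%m-%Y %I:%M%p"  # day first

# The same text, read by two pieces of code with different assumptions.
raw = "01-02-2015 03:01AM"
as_us = datetime.strptime(raw, US_FORMAT)
as_eu = datetime.strptime(raw, EU_FORMAT)
print(as_us.date(), "vs", as_eu.date())


def parses(text, fmt):
    """Return True if text is a valid timestamp under fmt."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


# The value from the slide is fine month-first and invalid day-first.
slide_value = "12-31-2015 03:01AM"
print(parses(slide_value, US_FORMAT), parses(slide_value, EU_FORMAT))
```

A column with a real date or timestamp type rejects invalid input at write time, so this kind of disagreement cannot be stored in the first place.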
15. Always use improved "PostgreSQL"
• Postgres is not a perfect database and you are smart
• All that annoying MVCC stuff, 32-bit xid and autovacuum nightmare look the way they do because hackers are old-school and lazy
• Hack it the hard way, do not bother submitting your patch to the community, just put it into production
• It is easy to maintain such a production system and keep it compatible with upcoming versions of "not perfect" PostgreSQL
16. Postgres likes long transactions
• Always call external services from stored procedures (like sending emails)
• Oh, it is arguable... It could be, if 100% of developers were familiar with the word timeout
• Anyway, you can just start a transaction and go away for the weekend
17. Load your data to PostgreSQL in a smart manner
• Write your own loader, 100 parallel threads minimum
• Never use COPY - it is specially designed for the task
18. Even if you want to back up your database...
• Use replication instead of backup
• Use pg_dump instead of backup
• Write your own backup script
• As complicated as possible, combine all external tools you know
• Never perform a test recovery
Do not forget
That was a WORST practices talk
Questions or ideas? Share your story!
[email protected] (I am preparing this talk to be open sourced)