database wizardry for legacy applications
Post on 17-Jul-2015
834 Views
Preview:
TRANSCRIPT
Database Wizardry for Legacy Applications
Gabriela D’Ávila
gabriela.io#PHPUK15
FEBRUARY, 2015The Brewery At London 19th-20th
WHOAMI
Huge fan… really!
Almost 25k pieces. I told you I was a fan.
Majored in Digital Game Development
Sr. Software Engineer
x
What to expect?• Waging options
• Spaghetti code x Legacy code
• Real life examples
Respect
Legacy & Spaghetti
Legacy code zone
Spaghetti code zone
Turn programmer crazy zone
How can one be different from another?
Legacy x Spaghetti• No longer supported
• Usually old technology
• Development freeze
• Tangled structure
• May have current technology
• Too many bugs to track
How to optimize your database?
Wait 10 years.paraphrasing Joe Armstrong, erlang creator
Moore’s Law
“The number of transistors per square inch on integrated circuits doubles every year”.
Use structures correctly
JSON,Serialized Array/Objectoption_id option_name option_value option_value
1 24 time_format "g:i a" yes2 43 comment_whitelist 1 yes3 45 comment_registration 0 yes4 76 default_comments_page newest yes5 95 sidebars_widgets "a:3:{s:19:""wp_inactive_widgets"";a:0:{}s:9:""sidebar-1"";a:6:{i:0;s:8:""search-2"";i:1;s:14:""recent-posts-2"";i:2;s:17:""recent-comments-2"";i:3;s:10:""archives-2"";i:4;s:12:""categories-2"";i:5;s:6:""meta-2"";}s:13:""array_version"";i:3;}"yes6 96 cron "a:3:{i:1423456489;a:3:{s:16:""wp_version_check"";a:1:{s:32:""40cd750bba9870f18aada2478b24840a"";a:3:{s:8:""schedule"";s:10:""twicedaily"";s:4:""args"";a:0:{}s:8:""interval"";i:43200;}}s:17:""wp_update_plugins"";a:1:{s:32:""40cd750bba9870f18aada2478b24840a"";a:3:{s:8:""schedule"";s:10:""twicedaily"";s:4:""args"";a:0:{}s:8:""interval"";i:43200;}}s:16:""wp_update_themes"";a:1:{s:32:""40cd750bba9870f18aada2478b24840a"";a:3:{s:8:""schedule"";s:10:""twicedaily"";s:4:""args"";a:0:{}s:8:""interval"";i:43200;}}}i:1423468440;a:1:{s:20:""wp_maybe_auto_update"";a:1:{s:32:""40cd750bba9870f18aada2478b24840a"";a:3:{s:8:""schedule"";s:10:""twicedaily"";s:4:""args"";a:0:{}s:8:""interval"";i:43200;}}}s:7:""version"";i:2;}"yes7 100 _site_transient_update_plugins "O:8:""stdClass"":4:{s:12:""last_checked"";i:1423413299;s:8:""response"";a:0:{}s:12:""translations"";a:0:{}s:9:""no_update"";a:2:{s:19:""akismet/akismet.php"";O:8:""stdClass"":6:{s:2:""id"";s:2:""15"";s:4:""slug"";s:7:""akismet"";s:6:""plugin"";s:19:""akismet/akismet.php"";s:11:""new_version"";s:5:""3.0.4"";s:3:""url"";s:38:""https://wordpress.org/plugins/akismet/"";s:7:""package"";s:56:""https://downloads.wordpress.org/plugin/akismet.3.0.4.zip"";}s:9:""hello.php"";O:8:""stdClass"":6:{s:2:""id"";s:4:""3564"";s:4:""slug"";s:11:""hello-dolly"";s:6:""plugin"";s:9:""hello.php"";s:11:""new_version"";s:3:""1.6"";s:3:""url"";s:42:""https://wordpress.org/plugins/hello-dolly/"";s:7:""package"";s:58:""https://downloads.wordpress.org/plugin/hello-dolly.1.6.zip"";}}}"yes
Another exampleLet’s see another example
Binary Tree• Materialized Path
• Field size limit
• Low performance on search
• TEXT field Index:
• Length limited
• FULLTEXT only in MyISAM
parent_id user_id level asc_path node1 0 1 1 <null> <null>2 1 2 2 ,1, L3 1 13 2 ,1, R4 2 3 3 ,2,1, L5 2 61 3 ,2,1, R6 13 23 3 ,13,1, L7 13 22 3 ,13,1, R8 3 4 4 ,3,2,1, L9 3 156 4 ,3,2,1, R
10 22 1568 4 ,22,13,1, L11 22 26 4 ,22,13,1, R12 23 1476 4 ,23,13,1, L13 23 690716 4 ,23,13,1, R14 61 1051 4 ,61,2,1, L15 61 62 4 ,61,2,1, R
sql> SELECT b.parent_id, b.user_id, b.level, b.asc_path, b.node FROM binary_tree b JOIN users u ON b.user_id = u.id WHERE b.asc_path LIKE ‘%,1,%’
OR b.user_id = 1 ORDER BY b.level, b.parent_id, b.node LIMIT 0, 16
[2015-02-10 21:27:43] 15 row(s) retrieved starting from 1 in 16905/16937 ms
sql> SELECT b.parent_id, b.user_id, b.level, b.asc_path, b.node FROM binary_tree b JOIN users u ON b.user_id = u.id WHERE (b.asc_path LIKE ‘%,1,%’
OR b.user_id = 1) AND (b.level BETWEEN 1 AND 4)
ORDER BY b.level, b.parent_id, b.node
[2015-02-13 23:39:27] 15 row(s) retrieved starting from 1 in 326/348 ms
51x faster
Hierarchical Data Storage
Adjacent List (RDBMS)
user_id lft rgt parent_id1 1 2 13 <null>2 2 3 61 13 13 23 22 14 3 4 156 25 61 1052 62 26 23 1476 690716 137 22 1568 26 13
Graph Database
1
2
361
13
23 22
4
1561051
62
1476
690716 1568
26
What is to learn?• Use the right structures
• Avoid LIKE statements on a TEXT field
• Review your query, the problem may be easier to solve than it seems
• Different tools come with a trade-off
Let’s see a particular query
SELECT trans.id, trans.transaction_type, trans.value, trans.user_receiver_id, user_receiver.username, trans.user_sender_id, user_sender.username, trans.status, CASE trans.status WHEN 1 THEN 'COMPLETED' WHEN 2 THEN 'CANCELED' END AS transaction_status, trans.date_transactionFROM transactions AS trans INNER JOIN users AS user_receiver ON trans.user_receiver_id = user_receiver.id INNER JOIN users AS user_sender ON trans.user_sender_id = user_sender.idWHERE (trans.user_sender_id = 1 OR trans.user_receiver_id = 1) AND (trans.transaction_type IN ('withdraw', ‘cancelWithdraw', ‘transactionFee', 'refundOderPaidWithBalance', ‘transferBetweenUsers', ‘creditcardFee', ‘creditcardWithdraw', 'creditDeposit', ‘creditWithdraw', ‘orderPayment’, ‘monthlyPayment', 'shippingFee')) AND trans.date_transaction >= '2013-05-01 00:00:00' AND trans.date_transaction <= '2013-05-31 23:59:59'ORDER BY trans.date_transaction DESC LIMIT 100;
How long did that query take?
• 2 INNER JOIN
• Using foreign-key primary key relationship
• Using varchar field as filter
• This field is indexed
• ORDER BY using a datetime field DESC
• This field is a B-tree index, created DESC
DESC Index
http://dev.mysql.com/doc/refman/5.5/en/create-index.html
An index_col_name specification can end with ASC or DESC. These keywords are permitted for future extensions for specifying ascending or descending index value storage. Currently, they are parsed but ignored; index values are always stored in ascending order.
Same query in MongoDB
db.financial_report.find({ $or: [{ user_receiver_id: 1 }, { user_sender_id: 1 }], transaction_type: { $in: [ 'withdraw', 'cancelWithdraw', 'transactionFee', 'refundOderPaidWithBalance', 'transferBetweenUsers', 'creditcardFee', 'creditcardWithdraw', 'creditDeposit', 'creditWithdraw', 'orderPayment', 'monthlyPayment', 'shippingFee' ] }, transaction_date: { $gte: new ISODate('2013-03-01 00:00:00'), $lte: new ISODate('2013-03-31 23:59:59') } }).sort({ transaction_date: -1 }).limit(100)
SELECT [FIELDS] FROM transactions AS trans INNER JOIN users AS user_receiver ON trans.user_receiver_id = user_receiver.id INNER JOIN users AS user_sender ON trans.user_sender_id = user_sender.idWHERE (trans.user_sender_id = 1 OR trans.user_receiver_id = 1) AND (trans.transaction_type IN ('withdraw', ‘cancelWithdraw', ‘transactionFee', 'refundOderPaidWithBalance', ‘transferBetweenUsers', ‘creditcardFee', ‘creditcardWithdraw', 'creditDeposit', ‘creditWithdraw', ‘orderPayment’, ‘monthlyPayment', 'shippingFee')) AND trans.date_transaction >= '2013-05-01 00:00:00' AND trans.date_transaction <= '2013-05-31 23:59:59'ORDER BY trans.date_transaction DESC LIMIT 100;
db.financial_report.find({ $or: [{ user_receiver_id: 1 }, { user_sender_id: 1 }], transaction_type: { $in: [ 'withdraw', 'cancelWithdraw', 'transactionFee', 'refundOderPaidWithBalance', 'transferBetweenUsers', 'creditcardFee', 'creditcardWithdraw', 'creditDeposit', 'creditWithdraw', 'orderPayment', 'monthlyPayment', 'shippingFee' ] }, transaction_date: { $gte: new ISODate('2013-03-01 00:00:00'), $lte: new ISODate('2013-03-31 23:59:59') } }).sort({ transaction_date: -1 }).limit(100)
What is to learn?• MySQL has faulty DESC indexes
• Sometimes you need to use another tool to solve the problem
How does it work?
Passive version
App
datatransactions
pending_sync_transactions
App
pending_sync_transactions
transactions
data
App
pending_sync_transactions
transactions
Triggerdata
App
pending_sync_transactions
transactions
data
App
pending_sync_transactions
transactions
data
Active version
Queueing
App
data
Queueing
App
Queueing
Appdata
data
Queueing
data
App
data
data
Queueing
data
App
data
data
Queueing
App
data
data
Queueing
App
data
data
Queueing
App
data
data
Queueing
App
datadata
Queueing
App
What is to learn?• The best solution may not be the text-book solution
• The fancy solution may be overkill
Making data searchable
App
AWS DynamoDB
Getting from the database
App
Getting from a text file
App
S3
What is to learn?• Add layer for search purposes
• Using text-files as a manner of storing data to make it searchable
Summary
Summary• Always analyse the trade-off
• Use the structures correctly
• Simple solutions can be the perfect ones
Thank you!• Please rate this talk at joind.in: https://joind.in/13377
• Twitter: @gabidavila
• Blog: gabriela.io
top related