Wringing Performance out of Perl

Upload: leonard-budney

Post on 15-Nov-2014

DESCRIPTION

I gave this lightning talk at YAPC 2011. My company uses Perl in a variety of products, some of which have serious performance implications. Here I give a quick overview of some of the tricks we use to squeeze extra performance out of Perl.

TRANSCRIPT

Wringing Performance out of Perl

Grant Street Group

• Began as a financial advisor group

• Discovered the Internet in 1997

• Online auctions of property tax liens
• Web-based billing system for tax collectors
• Conversion of legacy tax-collector databases
• Online license / vehicle tag renewals
• Online payment processing
• Auctions of all types of bonds
• And lots, lots more!

Tax Lien Auctions

• Absolute feeding frenzy
  – Our bidders threatened to exhaust TIN numbers
  – 20 million bidders in 2011
  – More than 30 billion bids altogether
  – Average was a 500,000-way tie
  – About 2,000 auctions closing simultaneously

• How do we award auctions performantly?
  – Random tie-breaking with Crypt::Random
  – Random row-ID plus MySQL = S L O W
  – Turns out we can do it much faster in Perl
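
The slides do not show the tie-breaking code itself, so the following is only a minimal sketch of the idea: hold the tied bidder IDs in a Perl array and draw one index at random, rather than asking the database to order rows randomly. The function name `pick_winner`, the bidder IDs, and the pluggable `$rand_int` code ref are all hypothetical. The default uses Perl's built-in `rand`, which is not cryptographically strong; a stronger source such as Crypt::Random could presumably be passed in through the code ref.

```perl
use strict;
use warnings;

# Hypothetical helper: choose one winner uniformly at random from a list of
# tied bidder IDs. $rand_int is a code ref that takes $n and returns an
# integer in [0, $n). The default wraps the built-in rand(), which is an
# assumption made for this sketch and is NOT cryptographically strong.
sub pick_winner {
    my ($tied, $rand_int) = @_;
    die "no bidders to choose from\n" unless @$tied;
    $rand_int ||= sub { int(rand($_[0])) };
    return $tied->[ $rand_int->(scalar @$tied) ];
}

# Made-up data: one 500,000-way tie, matching the average quoted in the talk.
my @tied   = (1 .. 500_000);
my $winner = pick_winner(\@tied);
print "winner: bidder $winner\n";
```

Picking an index is a constant-time operation once the IDs are in memory, which is one plausible reason the pure-Perl approach beats a random ordering done inside the database.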

• Net result: auction closing takes 20 seconds
  – Breaking 2,000 ties, each 500,000-way
  – Stress testing indicates we can scale by 4x
  – The IRS definitely cannot scale by 4x

Property Tax Online Payments

• Florida residents can pay their property tax
• Hosted, customized sites per county
• Largest counties have ~1,000,000 parcels
• Users are typical Florida residents

• Backend is MySQL and Sphinx
• Lightning-fast searches with Perl
  – Mapping IDs to table, column, PK
  – Parsing SHOW STATUS LIKE 'sphinx%'
• Lots of useful metadata!
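
As an illustration of the status-parsing bullet, here is a rough sketch that turns the rows of such a status query into a Perl hash. It is not the talk's code. It assumes the rows arrive as `[Variable_name, Value]` pairs, that the variables carry a `sphinx_` prefix, and that the `sphinx_words` value is a space-separated list of `word:docs:hits` triples; the function name `parse_sphinx_status` and the sample rows are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: convert rows of [Variable_name, Value] into a hash of
# search metadata. Assumes names look like "sphinx_<key>" and that the
# "words" value holds space-separated "word:docs:hits" triples.
sub parse_sphinx_status {
    my ($rows) = @_;
    my %meta;
    for my $row (@$rows) {
        my ($name, $value) = @$row;
        next unless $name =~ /^sphinx_(\w+)$/;
        my $key = $1;
        if ($key eq 'words') {
            for my $entry (split ' ', $value) {
                my ($word, $docs, $hits) = split /:/, $entry;
                $meta{words}{$word} = { docs => $docs, hits => $hits };
            }
        }
        else {
            $meta{$key} = $value;
        }
    }
    return \%meta;
}

# Made-up sample rows, standing in for what a database handle would return.
my $meta = parse_sphinx_status([
    [ sphinx_total       => 25 ],
    [ sphinx_total_found => 25 ],
    [ sphinx_time        => 126 ],
    [ sphinx_word_count  => 2 ],
    [ sphinx_words       => 'parcel:591:1256 tax:11076:15945' ],
]);
printf "%d of %d matches; 'tax' appears in %d documents\n",
    $meta->{total}, $meta->{total_found}, $meta->{words}{tax}{docs};
```

Per-keyword document and hit counts are the kind of metadata that can drive result summaries or "did you mean" hints without a second query.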

• Net results:
  – Sub-second turnaround times
  – 9-minute average time on site by payers
  – 4-minute average time on site overall

Customer Data Conversion

• Largest county in FL is a customer
  – Population ~2.4M people
  – Tax roll of ~900K parcels
  – History of ~5.6M bills across 6 years

• Full database is large (by our standards)
  – Data files are ~30-50GB
  – Full conversion is ~160 hours, using Perl
  – Might be ~8 hours using pure SQL

• Problem is we can’t use pure SQL
  – Ridiculous amounts of business logic
  – Utterly different data models

• We’re a Perl shop; Perl is our hammer

• Hugely parallel data conversion
  – Subdivide conversion into smaller steps
  – Build hash of dependencies between steps
  – Construct DAG of work units in MongoDB

• Distribute the actual work
  – Run lots of Perl worker processes
  – Workers grab ready work units
  – Perform the work unit sequentially
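
The scheduling idea on this slide can be sketched in a few lines of Perl. This is an illustration under simplifying assumptions, not the production system: the step names are invented, a plain in-memory hash stands in for the MongoDB collection, and a single loop stands in for the pool of worker processes. A step counts as "ready" once every step it depends on is done.

```perl
use strict;
use warnings;

# Hypothetical conversion steps: step name => list of steps it depends on.
my %depends_on = (
    load_owners  => [],
    load_parcels => [],
    load_bills   => ['load_parcels'],
    link_owners  => ['load_owners', 'load_parcels'],
    verify       => ['load_bills', 'link_owners'],
);

# Return the steps that are not done yet and whose prerequisites are all done.
sub ready_units {
    my ($deps, $done) = @_;
    my @ready = grep {
        my $step    = $_;
        my @pending = grep { !$done->{$_} } @{ $deps->{$step} };
        !$done->{$step} && !@pending;
    } keys %$deps;
    return sort @ready;
}

# Run every step in dependency order; $work is called once per step.
# Returns the order in which the steps were executed.
sub run_all {
    my ($deps, $work) = @_;
    my (%done, @order);
    while (scalar(keys %done) < scalar(keys %$deps)) {
        my @ready = ready_units($deps, \%done);
        die "dependency cycle detected\n" unless @ready;
        for my $step (@ready) {    # real workers would claim these in parallel
            $work->($step);
            $done{$step} = 1;
            push @order, $step;
        }
    }
    return @order;
}

print join(' -> ', run_all(\%depends_on, sub { })), "\n";
```

If the set of completed steps is stored somewhere durable, restarting the loop simply skips finished work, which is one way an interrupted load could be resumed.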

• The end result
  – Total conversion time ~3 hours with 80 workers
  – Nightly reloads now very practical
  – Able to resume incomplete loads

We’re Hiring Telecommuters