A sneak preview at Marktplaats.nl
PFCongrez
April 12th 2008
JA. [email protected]
eBay Inc. Proprietary & Confidential
Who am I?
• Jilles Oldenbeuving
• Working for Marktplaats since early 2003
• Responsible for application development
• Lots of fun:
  – Great technical and infrastructural challenges
  – Top 3 Dutch website, 59% reach!
  – Real business, where product drives success
  – World class team!
Content
• eBay’s classifieds portfolio
• Marktplaats statistics
• Marktplaats production environment
• Scaling databases
• Marktplaats and PHP
eBay’s classifieds portfolio
Marktplaats statistics
• At peak:
  – Over 71M page views/day
  – 10 new listings/s, 6M total listings
  – 600 search queries/s
  – 900 MB/s uplink traffic
  – 120 user generated emails/s (sent from user to user)
• Collection of 20M user images (2TB)
• Utilizing 600+ servers across 3 datacenters
Marktplaats Production Environment
[Diagram, simplified: LB/firewalls and load balancers; application servers; search engine; NetCaches; Mogile tracker and storage nodes; Memcache; CS backend; MySQL databases (Ads/Users, Hitcounters, AdMarkt, etc.) with read slaves. Labels: LAMP, MySQL.]
MySQL Replication
• Slaves upon slaves don’t scale well…
  – They only spread reads
w/ 1 server: 500 reads/s, 200 writes/s
w/ 2 servers: 250 reads/s, 200 writes/s on each server
As your site grows…
• Databases are eventually consumed by writes
• This cannot be solved by caching reads
[Diagram: six servers, each handling 3 reads/s and 400 writes/s]
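The arithmetic behind the replication slides can be sketched in a few lines of PHP. This is an illustrative model only: the function name perServerLoad is made up for this example, and it assumes that reads are spread evenly over all servers while every server has to apply every write.

```php
<?php
// Illustrative model (not code from the talk): with plain replication,
// reads are divided over the servers, but each server applies all writes.
function perServerLoad(float $readsPerSec, float $writesPerSec, int $servers): array
{
    if ($servers < 1) {
        throw new InvalidArgumentException('Need at least one server');
    }
    $reads  = $readsPerSec / $servers;  // reads are spread out
    $writes = $writesPerSec;            // writes are replicated to every server
    return [
        'reads'  => $reads,
        'writes' => $writes,
        'total'  => $reads + $writes,
    ];
}

// Numbers from the slide: 500 reads/s and 200 writes/s in total.
foreach ([1, 2, 4] as $n) {
    $load = perServerLoad(500, 200, $n);
    printf(
        "%d server(s): %.0f reads/s + %.0f writes/s each\n",
        $n,
        $load['reads'],
        $load['writes']
    );
}
```

However many servers are added, the write column never shrinks, which is why write-heavy workloads call for partitioning rather than more slaves.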
Generic MySQL database pool setup
[Diagram: an active master and a failover master, accessed through a VIP; slave pool 1 and slave pool 2; a lag slave]
Provides:
• High availability for both writes and reads
• Scales reads
• Writes need to be scaled by partitioning (either functionally or by modulo)
• Prevents human disasters
• Long term backups
• A way to change database schemas without downtime
[Diagram label: offsite backup]
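The two partitioning styles named in the list (functional and modulo) might look roughly like this in PHP. Everything here is a hypothetical sketch: the pool names, the FUNCTIONAL_POOLS constant and both helper functions are invented for illustration and are not taken from the Marktplaats codebase.

```php
<?php
// Hypothetical helpers illustrating the two partitioning styles.

// Functional partitioning: each kind of data lives in its own database pool.
// The pool names are invented for this example.
const FUNCTIONAL_POOLS = [
    'ads'         => 'pool_ads',
    'users'       => 'pool_users',
    'hitcounters' => 'pool_hitcounters',
];

function poolForFunction(string $function): string
{
    if (!array_key_exists($function, FUNCTIONAL_POOLS)) {
        throw new InvalidArgumentException("Unknown function: $function");
    }
    return FUNCTIONAL_POOLS[$function];
}

// Modulo partitioning: rows are spread over N shards by their numeric key.
function shardForId(int $id, int $shardCount): int
{
    if ($shardCount < 1) {
        throw new InvalidArgumentException('Need at least one shard');
    }
    // The extra "+ $shardCount" step keeps the result non-negative
    // even when $id is negative.
    return (($id % $shardCount) + $shardCount) % $shardCount;
}

echo poolForFunction('ads'), "\n";
echo 'ad 1234567 lives on shard ', shardForId(1234567, 4), "\n";
```

One design consideration with the modulo approach: changing the shard count moves most keys to a different shard, so the number of shards is usually chosen generously up front.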
Slaves can vary:
• Different replication sets (for really high read:write ratios)
• Different indexes
• Different access patterns/impact separation (e.g. cronjobs; for key buffers)
TIP: Abstract this in the code, both the configuration and the physical vs. logical mapping, or look into MySQL Proxy
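One way to read the tip is a small lookup layer that turns a logical request ("write to the ads pool", "read from the cron slaves") into a physical host. The sketch below is an assumption about what such an abstraction could look like; the configuration layout, the host names and the resolveHost function are all hypothetical.

```php
<?php
// Hypothetical configuration: logical pool names mapped to physical hosts.
// All host names below are placeholders.
$dbConfig = [
    'ads' => [
        'master_vip' => 'ads-master-vip.example',   // active/failover master behind a VIP
        'slaves' => [
            'web'  => ['ads-slave1.example', 'ads-slave2.example'],
            'cron' => ['ads-cronslave1.example'],   // separate slaves for cronjobs
        ],
    ],
];

/**
 * Resolve a logical (pool, intent, slave set) request to a physical host.
 * Writes always go to the master VIP; reads pick a slave from the named set.
 */
function resolveHost(array $config, string $pool, bool $forWrite, string $slaveSet = 'web'): string
{
    if (!isset($config[$pool])) {
        throw new InvalidArgumentException("Unknown pool: $pool");
    }
    if ($forWrite) {
        return $config[$pool]['master_vip'];
    }
    $slaves = $config[$pool]['slaves'][$slaveSet] ?? [];
    if ($slaves === []) {
        // No slave configured for this set: fall back to the master.
        return $config[$pool]['master_vip'];
    }
    return $slaves[array_rand($slaves)];   // naive random choice among the slaves
}

echo resolveHost($dbConfig, 'ads', true), "\n";
echo resolveHost($dbConfig, 'ads', false, 'cron'), "\n";
```

Application code then only names the logical pool and intent, so hosts can be moved or slave sets reshuffled by editing configuration alone.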
How to manage database schemas?
• The problem:
  – Hundreds of database instances across Marktplaats
    • Each development environment has its own database
    • Each QA and staging environment
    • Production environment
    • With 10-20 separate database pools each
    • In total 500+ databases
  – The application needs to be consistent with the database version too!
Enter DBC
• In-house developed tool in PHP
• Inspects your current database version and application version and brings them in sync
• Is aware of our database setup
  – Physical
  – Logical
• DBC is integrated with the build system
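The slides do not show how DBC works internally, so the following is not DBC. It is a generic sketch of the idea of bringing a database version in sync with the version the application expects. The change list, the SQL statements and the planSync function are all invented for illustration.

```php
<?php
// NOT the actual DBC tool: a generic sketch of syncing a schema version.
// Each change has a version number plus an "up" and a "down" statement.
$changes = [
    1 => ['up' => 'CREATE TABLE ads (id INT PRIMARY KEY)',
          'down' => 'DROP TABLE ads'],
    2 => ['up' => 'ALTER TABLE ads ADD title VARCHAR(255)',
          'down' => 'ALTER TABLE ads DROP COLUMN title'],
    3 => ['up' => 'CREATE INDEX idx_title ON ads (title)',
          'down' => 'DROP INDEX idx_title ON ads'],
];

/**
 * Return the SQL statements needed to move from $dbVersion to $appVersion.
 * Moving forward applies "up" steps in ascending order; moving back applies
 * "down" steps in descending order.
 */
function planSync(array $changes, int $dbVersion, int $appVersion): array
{
    $plan = [];
    if ($appVersion > $dbVersion) {
        for ($v = $dbVersion + 1; $v <= $appVersion; $v++) {
            if (!isset($changes[$v])) {
                throw new RuntimeException("Missing change for version $v");
            }
            $plan[] = $changes[$v]['up'];
        }
    } else {
        for ($v = $dbVersion; $v > $appVersion; $v--) {
            if (!isset($changes[$v])) {
                throw new RuntimeException("Missing change for version $v");
            }
            $plan[] = $changes[$v]['down'];
        }
    }
    return $plan;
}

// A database at version 1 and an application that expects version 3:
foreach (planSync($changes, 1, 3) as $sql) {
    echo $sql, "\n";
}
```

Producing a plan before executing anything also leaves room for the kind of review and audit trail that a schema tool needs.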
DBC
• Benefits:
  – Allows you to “branch” database changes, but share them within the project team until the feature is finished
  – Leaves an audit trail of database changes
  – Allows review by a DBA before propagating a change into a release
  – Consistent and safe rollout of database changes to production
    • Checks the target system before and after
We are trying to gauge interest in DBC to decide whether to open source it.
If you are interested, let me know.
Marktplaats and PHP
• Started out as a PHP-only shop in 1999
• PHP worked great, and scaled well up to a certain point
  – Usage of Marktplaats keeps on growing
  – Application grew immensely in complexity
  – Number of developers quadrupled
• Java and SOA are gaining more ground
• One example of the limits of PHP:
APC Autofilter issue
• PHP’s speed can be improved by using an opcode cache such as Zend or APC
• The examples below bypass this, since the path is variable
• No constants in PHP!
include('foo.php');
include($path . '/foo.php');
include(MY_INCL . '/foo.php');
if ($a) include('/path/foo.php');
include('/path/foo.php');
APC Autofilter issue
• This file is cached, but parent.php is not since includes are only done at runtime.
• Child is actually created as an incomplete “mangled” class definition
• What if parent.php was already cached?
include_once "parent.php";
class Child extends ParentClass {}  // ParentClass is defined in parent.php ("Parent" itself is a reserved class name in PHP)
APC Autofilter issue
• By the time child.php includes parent.php, parent.php is already cached
• Zend changes the opcodes for child.php, removing the include of parent.php; this speeds up execution
• APC cannot use this version of child.php for caching
• APC will stop caching child.php altogether (called “Autofilter”)
• …forever!
• One of the reasons why Java is gaining more traction within Marktplaats
include_once "parent.php";
include_once "child.php";
$c = new Child();
Given the thousands of files in the Marktplaats codebase, this costs ~30% runtime performance!