Overview of eBay Marketplace Architecture
DESCRIPTION
A brief, high-level overview of the eBay marketplace architecture as of mid-2013
TRANSCRIPT
Architecture Overview
Farhang Kassaei
fkassaei@eBay
@FarhangKassaei
SoftwareForAllSeasons
• 120 Million Users
• 1+ Million full-time sellers
• 2 Billion Page Views
• 350 Million Searches
• 10s of Millions of user sessions
• 400 Million live listings, 21 Million new listings
• 25 Million bids
• $175B in GMV (~$5549/sec)
• 9 Million purchases
• 180 Billion SQL executions
• 14 Billion Service calls (internal and external)
• 100s of Millions of internal asynchronous events
• 10s of Petabytes of data
• 50MM Lines of Code
• 100s of Services
eBay MP at a Glance
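The per-second GMV figure on this slide can be checked with simple arithmetic. The sketch below assumes that $175B is an annual total and that a year has 365 days; neither assumption is stated on the slide, and the helper name `per_second` is purely illustrative.

```python
# Assumption: the $175B GMV figure is an annual total over a 365-day year.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def per_second(annual_total: float) -> float:
    """Average rate per second for a quantity accumulated over one year."""
    return annual_total / SECONDS_PER_YEAR


gmv_per_second = per_second(175e9)
print(f"GMV per second: ${gmv_per_second:,.0f}")  # prints "GMV per second: $5,549"
```

Under those assumptions the result agrees with the ~$5549/sec quoted on the slide.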
Digitizing Commerce Value Chain
Merchant
Sourcing
Inbound LogX
Warehouse Inventory Mgmt
Order Fulfillment
Market Research
Demand Analysis
Pricing
Marketing Traffic
Distribution
Promotion Personalization
Engagement
Outbound LogX
Brand Order Mgmt
Post Sales
Markets
Global
Any Device
Legally
Competitively
This Stuff Matters …
Economic Growth
Employment
Reducing Poverty
Creating Economic Opportunity
Real World Applications
• Classification and Taxonomy
• Search and Discovery
• Pricing & Incentives
• Trend Analysis
• Demand and Supply Analysis
• Promotion and Discounting
• Loyalty
• Invoicing, Billing and Subscription
• Personalization
• Merchandising
• Payment Processing
• Order Management
• Inbound Logistics
• Outbound Logistics
• Same Day Delivery
• Delivery Valet
• Distributed Inventory Management
• Ship Cost and Time estimation
• Risk Management
• Experimentation
• Traffic Management & Optimization
• Demand Generation
• Fraud Analysis
• Fraud Detection
• Customer Support
Challenging & Cutting Edge Tech
• Data Center Design
• Infrastructure Automation
• Network and Storage design
• Virtualization & Cloud Computing
• Self Aware Computing & System Mgmt
• Distributed Systems, Software
• Data Architecture and Infrastructure
• Processing very large data sets and files
• Event Driven Architecture
• Commerce Search
• Query Understanding, Expansions
• Natural Language Processing
• Domain Modeling and Service Design
• API Management
• Application Framework Design
• Big Data Processing Infrastructure
• Big Data Analysis
• Near Real Time Analytics
• Complex Event Processing
• Decision Making
• Contextual Computing
• Image and Scene Analysis and Construction
• Meaning Extraction
• Planning and Optimization
• Federated Identity & Profile
• Entity Resolution
• Security and Secure Computing
• Intrusion Detection
• Human Computer Interaction
• Software Development Methodologies and Processes
• Enterprise Integration
• Content Management
• Translation, I18N and L10N
• Accessibility
Network
Application
Web Service
Data
Operating System
Hardware
Events (BES)
The Stack
Remember, We Are Simplifying
Search
Cart MyeBay
Replication
Write, Sharded Read
Sharded Read Sharded Write
Sharded Write, Read
Journaled Write, Replicated Read
Data Center A
Data Center B
MP, DC
DB
ETL
System Business Science Behavior Learned State
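The diagram above names sharded read and sharded write as access patterns. As a minimal sketch of what key-based sharding can look like, the example below routes each key to one of several in-memory shards using a stable hash. The class names `ShardRouter` and `ShardedStore`, the hash choice, and the shard count are all assumptions made for illustration; the slides do not describe eBay's actual sharding scheme.

```python
import hashlib
from typing import Any, Dict, List, Optional


class ShardRouter:
    """Maps a key to one of N logical shards using a stable hash (hypothetical)."""

    def __init__(self, shard_count: int) -> None:
        if shard_count < 1:
            raise ValueError("shard_count must be at least 1")
        self.shard_count = shard_count

    def shard_for(self, key: str) -> int:
        # A cryptographic hash is used only for its stable, well-spread output.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.shard_count


class ShardedStore:
    """Toy key/value store: each shard is a dict standing in for a database."""

    def __init__(self, shard_count: int) -> None:
        self.router = ShardRouter(shard_count)
        self.shards: List[Dict[str, Any]] = [{} for _ in range(shard_count)]

    def write(self, key: str, value: Any) -> None:
        # Sharded write: only the shard that owns the key is touched.
        self.shards[self.router.shard_for(key)][key] = value

    def read(self, key: str) -> Optional[Any]:
        # Sharded read: the same routing rule finds the owning shard.
        return self.shards[self.router.shard_for(key)].get(key)


store = ShardedStore(shard_count=4)
store.write("user:42", {"name": "example buyer"})
print(store.read("user:42"))
```

Because reads and writes share one routing rule, a key is always looked up on the shard it was written to. A simple modulo scheme like this one remaps most keys when the shard count changes, which is one reason production systems often add a level of indirection between keys and physical hosts.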
Internet
ComGateway
Gateway
Presentation
Application
Services
DW
Initialization
Req/Res Framework
Security & Policy
Tracking and Exp
Content
Core UI Widget Set
DAL/ORM Service Invocation
Identification & Config & Metadata Management
Application Event Bus Adaptor
Kernel & Core
Deliverable
DAL
Presentation in browsers
Application
Data Access and ORM
Service
Language
JVM
OS
API
Registry
MetaData
ID
Build Time Meta Data
Release
Core Services
App Services
Infrastructure Services
Build
Source
Source
Source
Build
Release
Web Apps
Device Apps
Source
ID
Build Time Meta Data
Release
API
Core Services
Source
Source
Build
App Services
Release
Registry
Run Time Meta Data
Infrastructure Services
Build
Web Apps
Device Apps
Checkout Return Claim Selling MerchXing Reg
Seller Buyer Dev CS Affiliate Traffic ALX
OMS Search Identity Inventory
Monitoring
I18N
Security
…
Bridge
Tracking
Application Services
Core Services
Infrastructure Services
APIs
Intermediation
Merchant
Data
…
…
Cache
Real Time Analytics & Monitoring
Integration
BES
HBase
Caty’s
Query Annotator
Software-based load balancer
Top Level Aggregator
Low Level Aggregator
Low Level Aggregator
Low Level Aggregator
cassini
mini-grid
mini-grid
mini-grid
Index Repos
Cassini Client
Search
Cassini Grid (m x n)
rows (mini-grids)   columns                      aggregators
row 1               QN11  QN12  QN13  …  QN1n    Agg1
row 2               QN21  QN22  QN23  …  QN2n    Agg2
row 3               QN31  QN32  QN33  …  QN3n    Agg3
…                   …                            …
row m               QNm1  QNm2  QNm3  …  QNmn    Aggm
col2
Index
Traffic
Request
Search Grid
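The grid labels above (columns, rows, query nodes, aggregators, load balancer) suggest a scatter-gather search. The sketch below is one possible reading, not a description of Cassini itself: it assumes each column holds one slice of the index, each row (mini-grid) holds a full copy of all slices, a load balancer picks one row per request, and that row's aggregator merges partial results. The class names, the round-robin balancing, and the toy term-count scoring are all hypothetical.

```python
import heapq
import itertools
from typing import Dict, List, Tuple

Hit = Tuple[float, str]  # (score, item id)


class QueryNode:
    """Holds one column's slice of the index: item id -> title (assumed layout)."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self.docs = docs

    def search(self, term: str, k: int) -> List[Hit]:
        # Toy scoring: how many times the term occurs in the title.
        needle = term.lower()
        hits = [
            (float(title.lower().split().count(needle)), item_id)
            for item_id, title in self.docs.items()
        ]
        return heapq.nlargest(k, (h for h in hits if h[0] > 0))


class RowAggregator:
    """Fans a query out to every query node in one row and merges the top k."""

    def __init__(self, nodes: List[QueryNode]) -> None:
        self.nodes = nodes

    def search(self, term: str, k: int) -> List[Hit]:
        partials = (node.search(term, k) for node in self.nodes)
        return heapq.nlargest(k, itertools.chain.from_iterable(partials))


class SearchGrid:
    """Round-robin load balancer over rows; each row can answer any query."""

    def __init__(self, rows: List[RowAggregator]) -> None:
        self.rows = rows
        self._next_row = itertools.cycle(range(len(rows)))

    def search(self, term: str, k: int = 3) -> List[Hit]:
        return self.rows[next(self._next_row)].search(term, k)


# Two columns (index slices), replicated across two rows.
columns = [
    {"a1": "red bike", "a2": "blue bike bike"},
    {"b1": "red car"},
]
grid = SearchGrid(
    [RowAggregator([QueryNode(dict(c)) for c in columns]) for _ in range(2)]
)
print(grid.search("bike", k=2))  # [(2.0, 'a2'), (1.0, 'a1')]
```

In this reading, adding columns spreads a larger index over more nodes, while adding rows adds capacity for more request traffic, since any row can serve any query.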
Cassini Function Super Title
Network Bubble
Network Bubble
Cassini Search Grid – ( Equivalent to pool ex: Super Title )
QN
Agg
Mini-Grid – Row#1
QN
Agg
Mini-Grid – Row#2
QN
Agg
Mini-Grid – Row#3
QN
Agg
Mini-Grid – Row#N
Software Load Balancer
Top Level Aggregator
Cassini Search Grid – ( Equivalent to pool ex: Super Title )
QN
Agg
Mini-Grid – Row#1
QN
Agg
Mini-Grid – Row#2
QN
Agg
Mini-Grid – Row#3
QN
Agg
Mini-Grid – Row#N
Software Load Balancer
Top Level Aggregator
PHX
LVS
Search Query Service
SRP
Description   Specifics
Processors    (2) 2.9 GHz 6 Core Westmere
Disk          (4) 300 GB SAS Drives; RAID-10 (535GB Usable)
RAM           72 GB RAM
NICS          1GB eth0 = App/ILO, 1GB eth1 = App
Infrastructure
What We Learned
- Horizontal Scaling: Code per function, Data per nature
- Prefer asynchrony over synchrony
- Most use of databases can be reduced to <K,V> pair access
- Hardware fails, recover gracefully.
- Cache Everything at all Levels
- Operation of Software Systems is a first class engineering requirement
- Event Stream is like blood stream
- Define, Declare, Defend and Reinforce Boundaries
- Knowledge Sharing is the bottleneck to scalability
- Use Open Source, Develop Open Source Style
- Devices Change Everything.
- Software fails, has bugs … recover gracefully.
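Three of the lessons on this slide (key/value access, caching at every level, and graceful recovery from failure) can be combined in one small sketch. The example below is a read-through cache in front of a key/value lookup that serves a stale value when the backing store is unavailable. Every name in it (`ReadThroughCache`, `BackendUnavailable`, `load_from_store`) is hypothetical, and the in-memory dict merely stands in for a real database.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class BackendUnavailable(Exception):
    """Raised by the loader when the backing store cannot be reached."""


class ReadThroughCache:
    """Caches <K,V> lookups and degrades to stale data if the backend fails."""

    def __init__(
        self,
        load: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, stored_at)

    def get(self, key: str) -> Optional[str]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]  # fresh cache hit, no backend call
        try:
            value = self._load(key)
        except BackendUnavailable:
            # Recover gracefully: a stale answer beats an error page.
            return entry[0] if entry is not None else None
        self._entries[key] = (value, now)
        return value


# Stand-in backend with a switch to simulate an outage.
backend = {"item:1": "vintage camera"}
state = {"up": True}


def load_from_store(key: str) -> str:
    if not state["up"]:
        raise BackendUnavailable(key)
    return backend[key]


cache = ReadThroughCache(load_from_store, ttl_seconds=0.0)  # TTL 0: always refresh
print(cache.get("item:1"))   # loaded from the backend: vintage camera
state["up"] = False
print(cache.get("item:1"))   # backend down, stale value served: vintage camera
print(cache.get("item:2"))   # never cached and backend down: None
```

With a TTL of zero every call tries the backend first, which makes the fallback path easy to see; a real deployment would choose a TTL that balances freshness against backend load.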
$300B GMV
Where Are We Going?
2X Users
Global, Regional and Local
Devices and Pervasive Computing
Digitized/ing The Entire Value Chain
All Depends on Technology
Thank You!