Hugh E. Williams
Vice President, Experience, Search, and Platforms
@hughewilliams, [email protected]
Challenges in Commerce Search
eBay Today
50+ petabytes of data in our Hadoop and Teradata clusters
2+ billion page views each day
75+ billion database calls each day
250 million queries per day
The opportunity ahead is huge
Commerce: $10 trillion
Online Commerce: $1 trillion
Source: Economist Intelligence Unit, Morgan Stanley
Note: Market sizes as of 2012, Compounded Annual Growth Rates from 2012 to 2015
Today’s Search
Turnaround contributor
Series of improvements
Ten year old technology
Conversion up 13%
[Chart, 2010 to 2012. Contributors: Better Search, Simple Flows, Better Images, Merch’ing, Other]
Improving Search from 2009 to 2012
– User experience changes
  • Imagery
  • Reorganization
  • Optimization
  • Major page refresh
  • Speed
– Search science
  • Query understanding and rewriting
  • Understanding user intent
  • Behavioral measurement
  • Substantial ranking improvements (particularly to Fixed Price ranking)
– And all on a 10+ year old platform named Voyager
Query Understanding and Rewriting
• Our search engine was literal
• We’re on a journey to make it more intuitive
• Idea: Mine our query-session data, look for patterns, and use these to map words in user queries to synonyms and structured data
[Diagram: User Query → Query Rewrite → Search Query → Search → eBay Results]
PATTERNS: QUERY REWRITES …
pilzlampe
How do buyers purchase the pilzlampe?
• It turns out, they do one of a few things:
  – Type pilzlampe, and purchase
  – Type pilzlampe, … , pilz lampe, and purchase
  – Type pilzlampe, … , pilzlampen, and purchase
  – Type pilz lampen, … , pilzlampe, and purchase
  – …
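The session patterns above suggest one simple way to surface rewrite candidates: count how often two different queries appear in the same session that ends in a purchase. The following is a minimal sketch, not eBay's actual pipeline. The session shape `(queries, purchased)`, the function name `mine_rewrite_pairs`, and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rewrite_pairs(sessions, min_support=2):
    """Count pairs of distinct queries that co-occur in purchasing sessions.

    `sessions` is a list of (queries, purchased) tuples. This shape is a
    hypothetical stand-in for real query-session logs.
    """
    pair_counts = Counter()
    for queries, purchased in sessions:
        if not purchased:
            continue  # only sessions that end in a purchase count as evidence
        distinct = sorted(set(queries))
        for a, b in combinations(distinct, 2):
            pair_counts[(a, b)] += 1
    # Keep only pairs seen often enough to be worth considering as rewrites.
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Toy sessions modeled on the pilzlampe example (made-up data).
sessions = [
    (["pilzlampe"], True),
    (["pilzlampe", "pilz lampe"], True),
    (["pilzlampe", "pilz lampe"], True),
    (["pilzlampe", "pilzlampen"], True),
    (["pilz lampen", "pilzlampe"], False),
]
candidates = mine_rewrite_pairs(sessions)
print(candidates)
```

With this toy data only the pair `("pilz lampe", "pilzlampe")` reaches the support threshold. A real system would need far more data and additional signals before trusting a pair.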
How do buyers purchase the pilzlampe?
• From our data mining:
  – We automatically discover that pilz lampe and pilzlampe are the same
  – We also discover that pilz and pilze are the same, and lampe and lampen are the same
• From these patterns, we rewrite the user’s query pilzlampe as:
pilzlampe OR “pilz lampe” OR “pilz lampen” OR pilzlampen OR “pilze lampe” OR pilzelampe OR “pilze lampen” OR pilzelampen
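The expanded query above can be reproduced by combining the equivalent forms of each word part, emitting both the joined compound and the quoted phrase for every combination. This is a sketch under stated assumptions: the function names `expand_compound` and `build_or_query` are hypothetical, and the list of variants is typed in by hand rather than mined.

```python
from itertools import product


def expand_compound(parts_variants):
    """Expand a compound word into joined and quoted-phrase variants.

    `parts_variants` holds one list of equivalent forms per word part,
    for example [["pilz", "pilze"], ["lampe", "lampen"]].
    """
    terms = []
    for combo in product(*parts_variants):
        terms.append("".join(combo))               # joined form: pilzlampe
        terms.append('"' + " ".join(combo) + '"')  # phrase form: "pilz lampe"
    return terms


def build_or_query(terms):
    """Join the variants into a single OR query string."""
    return " OR ".join(terms)


variants = [["pilz", "pilze"], ["lampe", "lampen"]]
terms = expand_compound(variants)
query = build_or_query(terms)
print(query)
```

Two forms of each part, times two ways of writing the compound, gives the eight alternatives shown on the slide, although the order of the terms differs.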
Are Query Rewrites easy?
• Nothing is easy at scale
  – Incorrect strong signals:
    • CMU is not Central Michigan University
    • Mariners is not the same as Marines
  – Context matters
    • Correcting Seattle Marines to Seattle Mariners is (generally) right
    • Denver Nuggets is not Denver in the Jewelry & Watches category
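One way to picture "context matters" is a rewrite rule that fires only when other words in the query support it. The sketch below is illustrative only: the `ContextRule` type, the single rule in `RULES`, and the `rewrite` function are assumptions, not eBay's implementation. It rewrites marines to mariners only when seattle also appears in the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextRule:
    source: str                  # token to replace
    target: str                  # replacement token
    required_context: frozenset  # other tokens that must be in the query


# Hypothetical rule table; a real one would be mined and much larger.
RULES = [ContextRule("marines", "mariners", frozenset({"seattle"}))]


def rewrite(query, rules=RULES):
    """Apply each rule only when its required context tokens are present."""
    tokens = query.lower().split()
    out = []
    for tok in tokens:
        others = set(tokens) - {tok}
        replacement = tok
        for rule in rules:
            if tok == rule.source and rule.required_context <= others:
                replacement = rule.target
                break
        out.append(replacement)
    return " ".join(out)


print(rewrite("Seattle Marines"))   # context present: rule fires
print(rewrite("marines uniform"))   # no context: query left alone
```

The same gating idea could take the browsing category as context instead of neighboring words, which is closer to the Denver Nuggets case.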
An even bigger opportunity
Next Gen Search
Cassini: Reengineering eBay Search
Top-to-Bottom View
How hard is it to ship a new search engine?
• Voyager is used for much more than the obvious. It’s multi-tenant:
  – “Default Search” search (already migrated to Cassini in the US)
  – Completed, null and low (already migrated to Cassini worldwide)
  – Description search
  – Deterministic sorts
  – Query rewrite
  – Merchandising
  – The Feed
  – Selling (for example, allowing sellers to create listings from similar items)
  – Category browsing
  – Motors and other verticals
  – Many fast “item lookup” scenarios for other teams
  – Many scenarios we don’t even know about…
What else is hard about eBay search?
• eBay has over 400 million items listed in multiple languages
• Our collection of items changes fast
• You can find just about anything on eBay. We have to optimize for every type of item
• Not everybody follows the same listing practices, or uses the same keywords or units
  – Examples include:
    • Units of measure: centimeter versus cm, gigabytes versus gb
    • Colors: Blue versus Aqua, Rojo is the same as Red
    • Synonyms: laptop and notebook, mobile phone and cell phone
    • Abbreviations: SGA means Stadium Giveaway
    • Spelling errors
• Our goal is to help both buyers and sellers find items even when they use different ways of expressing the same things
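A small normalization step illustrates how different ways of expressing the same thing can be mapped to one canonical form before matching. This is a sketch only: the alias tables and the `normalize` function are hypothetical, hold just a few of the examples from the slide, and ignore ambiguity (a notebook can also be a paper notebook).

```python
import re

# Hypothetical alias tables built from the examples on the slide.
UNIT_ALIASES = {"centimeter": "cm", "centimeters": "cm",
                "gigabyte": "gb", "gigabytes": "gb"}
TERM_ALIASES = {"rojo": "red", "notebook": "laptop"}
PHRASE_ALIASES = {"cell phone": "mobile phone"}


def normalize(text):
    """Lowercase, map phrases and tokens to canonical forms, and tokenize."""
    text = text.lower()
    for phrase, canonical in PHRASE_ALIASES.items():
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", canonical, text)
    # Separate a number from a unit written without a space: "500gb" -> "500 gb".
    text = re.sub(r"(\d)([a-z])", r"\1 \2", text)
    tokens = re.findall(r"[a-z0-9]+", text)
    return [UNIT_ALIASES.get(t, TERM_ALIASES.get(t, t)) for t in tokens]


print(normalize("500 Gigabytes notebook"))
print(normalize("500GB laptop"))
print(normalize("Rojo cell phone"))
```

A buyer query and a seller title that normalize to the same tokens can then match even though they were written differently.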
Technology Deep dive: Infrastructure
• What’s hard at eBay?
  – Multi-tenant system
  – Document additions and deletions
  – Document modifications
  – Index updates
  – Result caching
  – Data center automation
  – …
Technology Deep dive: Ranking
• What’s hard at eBay?
  – Mix of items: good ’til canceled multi-quantity vs. single quantity
  – Gaps in catalog data
  – A very different problem: different ranking signals from Web search
  – The deterministic sort:
    • Recall versus precision
    • Consistency with best match
  – Spam
  – Result blending
But What Comes Next?
21% of eBay multiscreen users
44% of GMV share
Q&A?