search science @ ebay

Search Science @ eBay … from a Web Search perspective Tim Converse Senior Director Head of Search Science Engineering


Post on 02-Jan-2016



Page 1: Search Science @ eBay

Search Science @ eBay
… from a Web Search perspective

Tim Converse
Senior Director

Head of Search Science Engineering

Page 2: Search Science @ eBay

Outline

• The eBay Search problem (starting from Web search)
– What’s the same? What’s different?

• Search Science @ eBay
– Who we are
– What we do

• Current frontier

Page 3: Search Science @ eBay

My background

• Inktomi => Yahoo! Web Search
– Spam detection / doc classification

• Powerset => Bing Web Search (Microsoft)
– ML Ranking and metrics for NLP-driven engine
– Led Search Summaries group for Bing

• Jybe => Yahoo!
– Co-founded small mobile personalization co.
– Personalization of main Yahoo! page

• Joined eBay November 2013

Page 4: Search Science @ eBay

The eBay Search Problem

Page 5: Search Science @ eBay

The eBay Search Results Page (SRP)

Page 6: Search Science @ eBay

The eBay Search Results Page (SRP)

Auction listing

Page 7: Search Science @ eBay

The eBay Search Results Page (SRP)

Fixed-price listing

Page 8: Search Science @ eBay

The eBay Search Results Page (SRP)

Category refinements

Page 9: Search Science @ eBay

The eBay Search Results Page (SRP)

Aspect refinements

Page 10: Search Science @ eBay

The eBay Search Results Page (SRP)

Related Searches

Page 11: Search Science @ eBay

The eBay Search Results Page (SRP)

Sort type:

Page 12: Search Science @ eBay

Search Intents

Web Search (Broder ’02)
• Informational
– Info in document or summary
• Navigational
– Get me to a known site or page
• Transactional
– I want to buy something

eBay Search
• Transactional
– Most common: want to buy something specific
– Less common: browsing/window-shopping

Page 13: Search Science @ eBay

Search inputs

Web | eBay
Documents from Web | Listings from sellers (not products!)
Mostly unstructured text | Semi-structured
Link graph | N/A
Anchortext | N/A
N/A | Seller history, reputation
User clicks, dwell time | Clicks
N/A | Conversions(!)

Page 14: Search Science @ eBay

Search challenges

Web | eBay
Crawling for discovery | N/A
Index freshness | Freshness + short-lived inventory
Query understanding/rewriting | Query understanding/rewriting; Mapping to categories/aspects
Web graph analysis | N/A
Authoritativeness | Seller trustworthiness
Result ranking | Auction ranking; Pricing and value

Page 15: Search Science @ eBay

eBay Scale!
[Numbers from 2012 / 2013]
• ~130 million buyers and sellers
• 250 million queries/day
• ~1 billion listings
• 100 petabytes of data (Hadoop / Terabyte)
• ~2 billion pageviews / day
• $75 billion of merchandise sold (2012) (mostly as a result of searches)

Page 16: Search Science @ eBay

Search ecosystem

Web | eBay
Publishers mostly passively crawled, indexed | Sellers must actively list; “Paying customer”; Implications: eBay can impose requirements, sellers can complain
Advertising is primary model | Transaction fees primary
Policies enforced by removal or demotion | Policies enforced at listing time and by later sanctions

Page 17: Search Science @ eBay

Types of spam

Web | eBay
Keyword stuffing | Keyword stuffing (title)
Duplication | Duplication
Link spam | Shill bidding and buying
Cloaking/redirects/invisible text | N/A (no links, eBay controls presentation)
N/A | Fitment spam
N/A | Multi-SKU spam (targeting sorts)

Page 18: Search Science @ eBay

Unique e-commerce challenges

eBay point of difference, each followed by its consequences:

• Items, not products
– Ranking bags of words
– Hard to compare relative value of items

• Open door selling policy
– Limited standardisation of seller behaviour, more dimensions of choice: New vs. Refurb vs. Used vs. Broken; Shipping time/cost/CBT/custom/Returns/Service; C2C / b2C / B2C
– Incomplete/inaccurate structured data
– Spam / Trust

• Many options for selling: solving for edge cases
– Complexity in search. For example: 1/3/5/7/10/30 day FP, 1/3/5/7/10 day auctions, ABIN, BIN with Best offer, live auctions, pickup only
– Auction ranking and intermingling is required

• Broad and deep inventory
– Buyers need to refine / disambiguate before they buy

• Long history / heavy buyers
– Existing workflows are deeply ingrained. Resistance to change

Page 19: Search Science @ eBay

Search Science @ eBay

Page 20: Search Science @ eBay

Search Science Mission

• Mission: Help the user find the item they want as quickly and easily as possible
• Users typically care about three things in search results
– Relevance
– Trust
– Value
• Success Measures:
– Revenue per user (primary)
– Human-judged relevance tests
• Search is a product of Search Front End, Search Back End, and Search Science

Page 21: Search Science @ eBay

Search Science Areas

Recall
• Which items match the user query

Ranking
• Which items are you most likely to want to buy

Universal Search
• Relevance of left nav, related searches and inline elements

Metrics
• Tracking performance, delivering insights and developing internal tools

Spam
• Spotting deliberate search manipulation, actioning via CS & neutralizing on site

Page 22: Search Science @ eBay

Recall

• Responsible for mapping a query to a set of items to return
• Primarily accomplished by query rewriting
– User query transformed into much larger back-end query
– Terms mapped to: synonyms, plural/singular, categories, aspects
  • e.g. [red dress] => [category = dresses, color = red]
– Phrases enforced ([iphone 5])
– Whole-query expansions for popular queries
– Process driven by data-mining on queries/clicks/purchases
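
The term-mapping step on the Recall slide can be sketched roughly as follows. This is a minimal illustration, not eBay's rewriter: the lookup tables `SYNONYMS`, `ASPECTS` and `CATEGORIES`, the naive pluralization rule, and the shape of the back-end query are all assumptions made up for the example. In practice such tables would come from the data-mining on queries/clicks/purchases that the slide mentions.

```python
# Hypothetical lookup tables; a real system would mine these from logs.
SYNONYMS = {"gown": ["dress"]}
ASPECTS = {"red": ("color", "red")}
CATEGORIES = {"dress": "dresses"}


def plural_or_singular(term):
    """Return a naive plural/singular counterpart of an English term."""
    if term.endswith("s") and not term.endswith("ss"):
        return term[:-1]
    if term.endswith(("s", "x", "ch", "sh")):
        return term + "es"
    return term + "s"


def variants(term):
    """Return the term, its plural/singular form, and any known synonyms."""
    forms = {term, plural_or_singular(term)}
    forms.update(SYNONYMS.get(term, []))
    return sorted(forms)


def rewrite(query):
    """Expand a user query into a larger, structured back-end query."""
    backend = {"category": None, "aspects": {}, "keywords": []}
    for term in query.lower().split():
        if term in ASPECTS:
            name, value = ASPECTS[term]
            backend["aspects"][name] = value
        elif term in CATEGORIES:
            backend["category"] = CATEGORIES[term]
        else:
            # Unmapped terms stay as keyword alternatives (an OR-group).
            backend["keywords"].append(variants(term))
    return backend
```

With these toy tables, `rewrite("red dress")` maps both terms to structured constraints (category dresses, color red), mirroring the slide's example, while an unmapped term such as "silk" is kept as a keyword with its plural variant.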

Page 23: Search Science @ eBay

Ranking

• Given a query and a recall set, rank the set
• Largely machine-learned (with business rules on top)
– Separate machine-learned models for auction and fixed-price
– Auction / fixed-price interleaving handled separately
• A number of targets for machine learning
– Clicks
– Revenue per query/item (Why this is a good idea)
• Features:
– Query/item match features, clicks & sales, seller metrics ….
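
The overall shape of the Ranking slide (one model per listing format, business rules on top, interleaving as a separate step) might look like the sketch below. Everything concrete here is an assumption: the linear weights in `WEIGHTS` stand in for the machine-learned models, the `flagged` demotion is an invented business rule, and round-robin merging is only a placeholder, since the slides do not say how interleaving is actually done.

```python
from itertools import zip_longest

# Hypothetical linear "models"; the slides only say the real ones are machine-learned.
WEIGHTS = {
    "auction": {"title_match": 2.0, "clicks": 0.5, "seller_rating": 1.0},
    "fixed_price": {"title_match": 2.0, "sales": 0.8, "seller_rating": 1.0},
}


def score(item):
    """Score an item with the model for its listing format."""
    weights = WEIGHTS[item["format"]]
    raw = sum(w * item["features"].get(name, 0.0) for name, w in weights.items())
    # Assumed business rule layered on top of the model: demote flagged listings.
    return raw * 0.1 if item.get("flagged") else raw


def rank(recall_set):
    """Rank each format separately, then interleave the two ranked lists."""
    by_format = {"auction": [], "fixed_price": []}
    for item in recall_set:
        by_format[item["format"]].append(item)
    for items in by_format.values():
        items.sort(key=score, reverse=True)
    merged = []
    # Placeholder interleaving: alternate fixed-price and auction results.
    for fixed, auction in zip_longest(by_format["fixed_price"], by_format["auction"]):
        merged.extend(x for x in (fixed, auction) if x is not None)
    return merged
```

Keeping the interleaving step separate from the per-format scorers means the two models never need to produce comparable scores, which is one plausible reason to split the problem this way.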

Page 24: Search Science @ eBay

Refinements and Universal Search

• Responsible for most non-item SRP elements:
– Aspect refinements
– Category refinements
– Related searches
– Inline elements (universal search)
– Snippets
• Mostly driven by offline analysis of user behavior

Page 25: Search Science @ eBay

Refinements and Universal Search

Category refinements

Page 26: Search Science @ eBay

Refinements and Universal Search

Aspect refinements

Page 27: Search Science @ eBay

Refinements and Universal Search

Related Searches

Page 28: Search Science @ eBay

Metrics, Tools, Monitoring

• Human judgments for training sets, internal metrics

• Tools for scraping, studying queries and result sets

• Intelligent alerting on relevance and system problems

Page 29: Search Science @ eBay

Spam

• Effort broader than Search Science
• Alternatives:
– Educate (explain to sellers why they shouldn’t game)
– Block (don’t allow sellers to add spammy listings in first place)
– Neutralize (algorithmically remove advantage of abuse)
– Enforce (remove bad listings and sellers)
• Search Science helps neutralize and flag for enforcement

Page 30: Search Science @ eBay

Details of our anti-spam algorithms

Page 31: Search Science @ eBay

Details of our anti-spam algorithms

[This page intentionally left blank]

Page 32: Search Science @ eBay

Current Frontier

Page 33: Search Science @ eBay

Query segmentation by frequency

Segment | Unique queries | Result set size
Head | Thousands | Millions
Torso | Hundreds of thousands | Tens of thousands
Tail | Tens of millions | Hundreds

Immediate Conversion / Immediate Filtering
Head; Torso (1/3 Head); Tail (1/5 Head)
Head; Torso (3x Head); Tail (5x Head)
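
Splitting a query log into head, torso and tail by frequency could be done along these lines. This is only a sketch: the thresholds `head_min` and `torso_min` are hypothetical parameters, and the slides do not state where eBay draws the boundaries.

```python
from collections import Counter


def segment_queries(query_log, head_min=1000, torso_min=10):
    """Bucket distinct queries into head/torso/tail by how often they occur.

    head_min and torso_min are illustrative cut-offs, not eBay's.
    """
    counts = Counter(q.strip().lower() for q in query_log)
    segments = {"head": [], "torso": [], "tail": []}
    for query, n in counts.items():
        if n >= head_min:
            segments["head"].append(query)
        elif n >= torso_min:
            segments["torso"].append(query)
        else:
            segments["tail"].append(query)
    return segments
```

Because the tail holds by far the most distinct queries, any per-query storage scheme sized for the head or torso will not extend to it, which motivates the per-segment treatment on the next slide.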

Page 34: Search Science @ eBay

Different opportunities by segment

Head
• Store per-query info (likely categories)
• Browse-oriented experience
• Forefront refinements

Torso
• Store per-query info (likely categories)

Tail
• No per-query info (match similar head/torso)?
• Category prediction as classification?
• Null/Low recovery (broaden result set)
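
One simple way to picture null/low recovery for tail queries is to relax the query until enough items match. The sketch below assumes a toy AND-match `search` over in-memory titles and drops trailing terms first; both choices are illustrative assumptions, and a real system would more plausibly decide which term to drop from query understanding.

```python
def search(terms, titles):
    """Return titles containing every term (toy AND-match recall)."""
    return [t for t in titles if all(term in t.lower().split() for term in terms)]


def recover(query, titles, min_results=1):
    """Drop trailing terms until at least min_results titles match.

    Dropping from the end is an arbitrary choice made for illustration.
    """
    terms = query.lower().split()
    results = search(terms, titles)
    while len(results) < min_results and len(terms) > 1:
        terms = terms[:-1]
        results = search(terms, titles)
    return terms, results
```

Raising `min_results` broadens further, at the cost of drifting away from what the user typed.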

Page 35: Search Science @ eBay

Query understanding (annotation)

• Query: [white gold hoop earrings]
• Stage 1 (bag of words): hoop, gold, earrings, white
• Stage 2 (phrasing and rewriting):
– “white gold” hoop earrings, or
– color=gold, color=white, hoop, earrings category=jewelry
• Stage 3:
– ProductType = earrings, style=hoop, material=[white gold], color=white, category=jewelry
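
The three stages for [white gold hoop earrings] can be mimicked with a small dictionary-driven annotator. This is a toy sketch under stated assumptions: `PHRASES` and `ANNOTATIONS` are hand-written for this one query, and the greedy two-word phrase match is not claimed to be how eBay segments queries.

```python
# Hypothetical dictionaries covering only the slide's example query.
PHRASES = {("white", "gold")}
ANNOTATIONS = {
    "white gold": {"material": "white gold", "color": "white"},
    "hoop": {"style": "hoop"},
    "earrings": {"ProductType": "earrings", "category": "jewelry"},
}


def bag_of_words(query):
    """Stage 1: unordered, de-duplicated terms."""
    return sorted(set(query.lower().split()))


def phrase_segments(query):
    """Stage 2: group known two-word phrases, keep other tokens as-is."""
    tokens = query.lower().split()
    segments, i = [], 0
    while i < len(tokens):
        if tuple(tokens[i:i + 2]) in PHRASES:
            segments.append(" ".join(tokens[i:i + 2]))
            i += 2
        else:
            segments.append(tokens[i])
            i += 1
    return segments


def annotate(query):
    """Stage 3: map each segment to structured attributes."""
    result = {}
    for segment in phrase_segments(query):
        if segment in ANNOTATIONS:
            result.update(ANNOTATIONS[segment])
        else:
            # Segments with no known annotation remain plain keywords.
            result.setdefault("keywords", []).append(segment)
    return result
```

Recognizing "white gold" as a phrase before annotating is what prevents the Stage 2 failure mode of reading the query as color=gold plus color=white.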

Page 36: Search Science @ eBay

Deterministic sorts

• For sorting, we offer
– Best Match (relevance sort)
– Time: ending soonest
– Time: newly listed
– Price + Shipping: lowest first
– Price + Shipping: highest first
– Distance: nearest first

• Deterministic (non-best-match) sorts can surface very irrelevant items
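
A small example makes the problem concrete. The sketch below sorts by price plus shipping, optionally after dropping items under a relevance floor. The `relevance` field, the `min_relevance` parameter, and the floor itself are assumptions: the floor is one possible mitigation, not something the slides describe eBay doing.

```python
def price_sort(items, min_relevance=0.0):
    """Sort by price + shipping, lowest first, after an optional relevance floor."""
    kept = [it for it in items if it["relevance"] >= min_relevance]
    return sorted(kept, key=lambda it: it["price"] + it["shipping"])


# Invented listings for the query [wedding dress].
listings = [
    {"title": "wedding dress", "price": 200.0, "shipping": 10.0, "relevance": 0.9},
    {"title": "wedding dress garment bag", "price": 5.0, "shipping": 2.0, "relevance": 0.2},
    {"title": "wedding dress sewing pattern", "price": 3.0, "shipping": 1.0, "relevance": 0.3},
]
```

Here `price_sort(listings)` puts the sewing pattern and garment bag ahead of any actual dress, while `price_sort(listings, min_relevance=0.5)` keeps only the dress.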

Page 37: Search Science @ eBay

Wedding dress – Best Match

Page 38: Search Science @ eBay

Wedding dress – Price + shipping (low)

Page 39: Search Science @ eBay

Explicit result set construction for diversity

• “3rd-phase ranking” (after recall and ranking)

• Examine result set, enforce diverse mixes of products/interpretations
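
A third-phase pass of this kind can be illustrated with a greedy cap on how many items from one group may appear near the top. This is a generic diversification sketch, not eBay's method: the `key` function, `max_per_group` and `top_k` are hypothetical parameters.

```python
from collections import Counter


def diversify(ranked, key, max_per_group, top_k):
    """Re-order a ranked list so the first top_k slots mix groups.

    Items that would exceed max_per_group within the top slots are deferred,
    keeping their original relative order.
    """
    head, deferred, seen = [], [], Counter()
    for item in ranked:
        group = key(item)
        if len(head) < top_k and seen[group] < max_per_group:
            head.append(item)
            seen[group] += 1
        else:
            deferred.append(item)
    return head + deferred
```

For an ambiguous query, `key` could return the product type or the query interpretation an item belongs to, so that no single interpretation fills the first screen.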

Page 40: Search Science @ eBay

Personalization and contextualization

• Personalization
– What does this person like to {click, buy}
– Product types, price ranges, condition, shipping
• Session contextualization
– What [product type, price range, condition, etc] item did this person {click, buy} in the immediately-preceding query?
• Contextualization > personalization
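
Session contextualization could be sketched as a re-ranking step that favors items sharing attributes with whatever was clicked in the immediately preceding query. The multiplicative `boost`, the attribute names, and the `score` field are all assumptions for illustration; the slides do not describe the actual mechanism.

```python
def contextualize(ranked, last_clicked, boost=0.2,
                  attributes=("product_type", "condition")):
    """Re-rank items, boosting those that match the previously clicked item."""

    def adjusted(item):
        matches = sum(
            1
            for attr in attributes
            if last_clicked.get(attr) is not None
            and item.get(attr) == last_clicked.get(attr)
        )
        # Each matching attribute raises the base score by `boost` (assumed form).
        return item["score"] * (1 + boost * matches)

    return sorted(ranked, key=adjusted, reverse=True)
```

Passing an empty `last_clicked` leaves the original order unchanged, so the same code path serves the first query of a session.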

Page 41: Search Science @ eBay

Image understanding for ranking

Query

“red shoes”

Search Results

Page 42: Search Science @ eBay

Technologies we like and use

• Hadoop ecosystem

• Scala (Scoobi, Scalding)

• R (gbm)

Page 43: Search Science @ eBay

Q & A