Efficient Instant-Fuzzy Search with Proximity Ranking

Authors: Inci Cetindil, Jamshid Esmaelnezhad, Taewoo Kim, and Chen Li

ICDE Conference 2014

Presented by: Priagung Khusumanegara

Abstract

• In instant search, the system finds answers to a query instantly while the user types in keywords character by character.

• Fuzzy search improves the user's search experience by finding relevant answers with keywords similar to the query keywords.

• A main computational challenge in this paradigm is the high-speed requirement.

• At the same time, we also need good ranking functions that consider the proximity of keywords to compute relevance scores.

Problem Statement & Proposed Solution

• Problem Statement:
o Achieving efficient time & space complexities.

• Solution:
o Index phrases with a proper indexing scheme, and
o Develop an incremental-computation algorithm for efficiently segmenting a query into phrases and computing relevant answers.

• Result Metrics: Experimental study on real data sets to show the tradeoffs between time, space, and quality of these solutions.

General Idea of Instant Search

• Instant search returns answers immediately, based on the partial query a user has typed in.

• Many users prefer seeing the search results instantly and formulating their queries accordingly, instead of being left in the dark until they hit the search button.

Architecture

Phrase Validator:

• When a search server receives a request, it first identifies all the valid phrases in the query that are in the dictionary D, and intersects their inverted lists.

• The Phrase Validator identifies the phrases (called “valid phrases”) in the query that are similar to a term in the dictionary D.
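The idea of a valid phrase can be illustrated with a brute-force sketch. It is not the trie-based method of the paper: it simply enumerates every contiguous keyword sub-sequence of the query and compares it against each dictionary term by edit distance. The names `edit_distance`, `valid_phrases`, and `max_dist`, and the sample dictionary, are assumptions made for illustration.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def valid_phrases(keywords: List[str], dictionary: List[str],
                  max_dist: int = 1) -> List[Tuple[int, int, str]]:
    """Return (start, end, term) for every span keywords[start:end]
    whose text is within max_dist edits of a dictionary term."""
    found = []
    for start in range(len(keywords)):
        for end in range(start + 1, len(keywords) + 1):
            phrase = " ".join(keywords[start:end])
            for term in dictionary:
                if edit_distance(phrase, term) <= max_dist:
                    found.append((start, end, term))
    return found


# Hypothetical dictionary D containing both single terms and a phrase.
D = ["power", "point", "power point", "slides"]
print(valid_phrases(["powr", "point"], D))
```

A production system would avoid the scan over all dictionary terms; the sketch only shows what "similar to a term in D" means for each candidate phrase.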

Architecture (Cont’d)

Query Plan Builder:

• After identifying the valid phrases, the Query Plan Builder generates a Query Plan Q, which contains all the possible valid segmentations in a specific order.

• The ranking of the segmentations in Q determines the order in which they will be executed.
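A minimal sketch of how valid segmentations could be enumerated from the valid phrases follows. Each valid phrase is represented as a `(start, end)` span over the query keywords, and a segmentation is a chain of spans covering the whole query. The ordering used here (fewer phrases first) is an assumption for illustration, not necessarily the ranking used by the authors, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Span = Tuple[int, int]  # keywords[start:end] is a valid phrase


def segmentations(n: int, spans: Iterable[Span]) -> List[List[Span]]:
    """Enumerate every way to cover keywords 0..n with consecutive valid spans."""
    by_start: Dict[int, List[int]] = {}
    for start, end in spans:
        by_start.setdefault(start, []).append(end)

    results: List[List[Span]] = []

    def extend(pos: int, current: List[Span]) -> None:
        if pos == n:                       # whole query is covered
            results.append(list(current))
            return
        for end in sorted(by_start.get(pos, [])):
            current.append((pos, end))
            extend(end, current)
            current.pop()

    extend(0, [])
    return results


def build_query_plan(n: int, spans: Iterable[Span]) -> List[List[Span]]:
    """Order segmentations so that those with fewer (longer) phrases run first.
    This ordering is an assumed heuristic."""
    return sorted(segmentations(n, spans), key=len)


# Three-keyword query where every single keyword and both two-keyword
# phrases are valid, but the full three-keyword phrase is not.
plan = build_query_plan(3, {(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)})
for seg in plan:
    print(seg)
```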

Architecture (Cont’d)

Index Searcher:

• After Q is generated, the segmentations are passed into the Index Searcher one by one until the top-k answers are computed, or all the segmentations in the plan are used.
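The early-termination loop described here can be sketched as follows, assuming each phrase maps to an inverted list of record IDs. The function name `search_top_k`, the sample lists, and the use of record-ID order as a stand-in for relevance ranking are all assumptions for illustration.

```python
from typing import Dict, List


def search_top_k(plan: List[List[str]],
                 inverted_lists: Dict[str, List[int]],
                 k: int) -> List[int]:
    """Execute segmentations in plan order, stopping once k answers are found."""
    answers: List[int] = []
    seen = set()
    for segmentation in plan:
        if not segmentation:
            continue
        # A record answers a segmentation if it appears in the
        # inverted list of every phrase in that segmentation.
        lists = [set(inverted_lists.get(phrase, ())) for phrase in segmentation]
        matches = set.intersection(*lists)
        for record_id in sorted(matches):  # placeholder for a real ranking
            if record_id not in seen:
                seen.add(record_id)
                answers.append(record_id)
        if len(answers) >= k:
            break                          # top-k reached, skip the rest of the plan
    return answers[:k]


# Hypothetical inverted lists: phrase -> IDs of records containing it.
lists = {
    "power point": [3, 7],
    "power": [1, 3, 5, 7],
    "point": [3, 5, 9],
}
plan = [["power point"], ["power", "point"]]
print(search_top_k(plan, lists, k=3))
```

With `k=2` the first segmentation alone supplies enough answers, so the second one is never executed.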

Architecture (Cont’d)

Cache Module:

• The Phrase Validator uses the Cache module to validate a phrase without traversing the trie from scratch.

• The Index Searcher benefits from the Cache by being able to retrieve the answers to an earlier query to reduce the computational cost.
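The benefit of not traversing the trie from scratch can be shown with a simplified sketch. It handles exact prefixes only, whereas a fuzzy system would cache richer state per prefix; the class `PrefixCache`, its `steps` counter, and the sample terms are assumptions made for illustration.

```python
from typing import Dict, List, Optional

END = ""  # terminal marker; never collides with a single-character edge label


class PrefixCache:
    """Trie lookup that remembers the node reached for each prefix seen so far."""

    def __init__(self, terms: List[str]) -> None:
        self.root: dict = {}
        for term in terms:
            node = self.root
            for ch in term:
                node = node.setdefault(ch, {})
            node[END] = True
        self.cache: Dict[str, Optional[dict]] = {"": self.root}
        self.steps = 0  # number of trie edges followed, for illustration

    def locate(self, prefix: str) -> Optional[dict]:
        """Return the trie node for prefix, or None if no term starts with it."""
        # Resume from the longest prefix that was already resolved.
        i = len(prefix)
        while prefix[:i] not in self.cache:
            i -= 1
        node = self.cache[prefix[:i]]
        for j in range(i, len(prefix)):
            if node is None:
                break
            self.steps += 1
            node = node.get(prefix[j])
            self.cache[prefix[:j + 1]] = node
        return node


pc = PrefixCache(["search", "server"])
pc.locate("se")    # follows 2 edges from the root
pc.locate("sea")   # follows only 1 more edge, starting from the cached "se" node
print(pc.steps)
```

As the user types one more character, only one additional trie edge is followed instead of the whole prefix being re-read.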

Computing Valid Phrases

Generating Valid Segmentations

Incremental Computation of Valid Phrases

Example Table for Architecture

Explanation

• This data is structured in indexed format. Two types of indices are used to structure this data:

1. Trie Indices
2. Forward Indices

Index Structure

• Indices
o Trie
o Forward

Experiments

In the experiments, they implemented the following methods:

1. FindAll (“FA”)
2. QuerySegmentation (“QS”)
3. Term Pair (“TP”)

Efficiency of Computing Valid Phrases

Query Time

Cache Hit Rate

Scalability

Conclusion

• They studied how to improve the ranking of an instant-fuzzy search system by considering proximity information when computing top-k answers.

• They presented an incremental-computation algorithm for finding the indexed phrases in a query efficiently.

• The experiments on real data showed the efficiency of the proposed technique for 2-keyword and 3-keyword queries that are common in search applications.
