fficiency-effectiveness trade offs in modern web …theses.di.unipi.it/slides/trani.pdf · lucchese...
TRANSCRIPT
Roberto Traniroberto.trani@{isti.cnr.it, di.unipi.it}
December 5th, 2019
EFFICIENCY-EFFECTIVENESS TRADE-OFFS IN MODERN WEB SEARCH
EFFICIENCY-EFFECTIVENESS TRADE-OFFS IN MODERN WEB SEARCH
http://hpc.lab.isti.cnr.it
Search…
Search…
1014 WEB PAGES 109 PRODUCTS
1011 NEWS 1012 POSTS
Search…
1014 WEB PAGES
+100ms ⇒ -1% sales
109 PRODUCTS
1011 NEWS 1012 POSTS
+500ms ⇒ -20% traffic+100ms ⇒ -0,6% profit
EFFICIENT SCORING
EFFICIENT SCORING
‣ State-of-the-art Learning to Rank models use forests of thousands decision trees e.g., 5000 trees, 32 leaves per tree, 700 features 34ms to score 1000 documents
F4 50.1
F1 0.57
F2 5.0
F3 -3.8
7.1
1.8 -3.1
0.4 -1.4
x[F3] ≤ − 3.8?
s1,w1 s2,w2 sN,wN Score = siwi!2
M=�
DOCUMENT
QUERY DOCUMENT FEATURE EXTRACTION DOCUMENT FEATURES
T1 T2 TN…
f1f0
increasing values
f|F|�1
num. leaves num. leaves num. leaves
num.leaves � num. trees
num. leaves
num. leaves
x[f0] �0
x[f0] �1 x[f0] �2 x[f0] �3
x[f1] �4
x[f1] �5
�
h
leafvalues
mask
leafindexes
EFFICIENT SCORING
Novel traversal of the entire forest: linear scan of thresholds and succinct storage of partial results of the computation
18x with SIMD 120x with Multithread 600x with GPU
up to 6 times faster than state-of-the-art competitors
Highly (auto) vectorizableLow branch misprediction ++Cache friendly
BEST FINDING
BEST FINDING
Several applications need to find the best element of a set
- Question-Answering
- Recommender systems
- Route suggestion
- Advertising
- Machine Translation
- Key term extraction
BEST FINDING
▸ ML algorithms for ranking to assign scores to each item individually
▸ ML pairwise classifier to perform a round robin tournament
3 3 2 14 4
4
3
1
2
1
2Find the
Champion!
New Machine Learning solutions to find the Best
Efficient algorithms to find a Tournament Champion
FILTERING OF ATTRIBUTE-SORTED RESULTS
FILTERING OF ATTRIBUTE-SORTED RESULTS
FILTERING OF ATTRIBUTE-SORTED RESULTS
up to 150 times faster than state-of-the-art competitor
Filter out some of the results to improve the quality of the list Preserving the sort by attribute
Dynamic ProgrammingCombinatorics ++Approximation algorithms
IMPACT IN INDUSTRY AND TECHNOLOGY TRANSFER
OUR TEAM
THANK YOU !ROBERTO TRANI [email protected]
RAFFAELE PEREGO (LAB HEAD) [email protected] http://hpc.isti.cnr.it
Beretta L., et al. "An Optimal Algorithm to Find Champions of Tournament Graphs”. International Symposium on String Processing and Information Retrieval (SPIRE). Springer, 2019.
Lucchese C., et al. “Quickscorer: A fast algorithm to rank documents with additive ensembles of regression trees”. Proceedings of the 38th International ACM Conference on Research and Development in Information Retrieval (SIGIR). ACM, 2015. [Best Paper Award]
Nardini F.M., et al. "Fast Approximate Filtering of Search Results Sorted by Attribute”. Proceedings of the 42nd International ACM Conference on Research and Development in Information Retrieval (SIGIR). ACM, 2019.