Large-scale Retrieval & Language Model · 2021. 7. 8.
Large-scale Retrieval & Language Model
Jimblin
Overview
• Context-Aware Document Term Weighting for Ad-Hoc Search (WWW 2020)
• Pre-training Tasks for Embedding-based Large-scale Retrieval (ICLR 2020)
• ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT (SIGIR 2020)
Motivation & Contribution
• Bag-of-words document representations are limited by shallow frequency-based term weighting schemes (e.g., TF-IDF, BM25).
• This paper uses the contextual word representations from BERT to generate more effective document term weights.
Methodology
Passage-Level Term Weighting
A document is split into multiple passages, and a vector representation is obtained for each passage.
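The passage-level step can be sketched as follows; the BERT encoder is simulated with random vectors, and `passage_term_weights` with parameters `w`, `b` is an illustrative stand-in for HDCT's learned per-token regression head, not the paper's exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def passage_term_weights(token_embeddings, w, b):
    """Map each contextualized token embedding to a scalar term weight.

    HDCT regresses a per-token importance score from BERT's contextual
    embeddings; a ReLU here keeps the sketch's weights non-negative.
    """
    scores = token_embeddings @ w + b   # one scalar per token
    return np.maximum(scores, 0.0)

# Simulated BERT output for a 6-token passage (hidden size 8).
tokens = ["deep", "learning", "for", "ad", "hoc", "search"]
emb = rng.normal(size=(6, 8))
w, b = rng.normal(size=8), 0.1

weights = passage_term_weights(emb, w, b)
assert weights.shape == (6,) and (weights >= 0).all()
```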
Document-Level Term Weighting
pw_i = 1: every passage is equally important. pw_i = 1/i: passage importance decays with position, so the first passage matters most and the last matters least.
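The positional weighting can be sketched directly; `doc_term_weight` and its toy inputs are illustrative, not from the paper:

```python
def doc_term_weight(passage_weights, decay=False):
    """Aggregate per-passage term weights into document-level weights.

    passage_weights: one {term: weight} dict per passage, in order.
    pw_i = 1 treats all passages equally; pw_i = 1/i lets importance
    decay with position, favoring early passages.
    """
    doc = {}
    for i, pw in enumerate(passage_weights, start=1):
        scale = 1.0 / i if decay else 1.0
        for term, w in pw.items():
            doc[term] = doc.get(term, 0.0) + scale * w
    return doc

passages = [{"bert": 0.9, "search": 0.5}, {"search": 0.8}]
uniform = doc_term_weight(passages)              # pw_i = 1
decayed = doc_term_weight(passages, decay=True)  # pw_i = 1/i
assert abs(uniform["search"] - 1.3) < 1e-9       # 0.5 + 0.8
assert abs(decayed["search"] - 0.9) < 1e-9       # 0.5 + 0.8/2
```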
Retrieval with HDCT Index
The term-frequency (tf) statistic in the BM25 formula is replaced with the learned HDCT term weights stored in the inverted index.
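Retrieval then runs ordinary BM25 over that index. A runnable sketch; the tiny index, the helper `bm25_with_hdct`, and the parameter choices are illustrative, not from the paper:

```python
import math

def bm25_with_hdct(query, index, doc_lens, N, k1=1.2, b=0.75):
    """Score every document for a query, with HDCT weights in place of tf.

    index: {doc_id: {term: hdct_weight}}, i.e. the HDCT inverted index.
    """
    avg_len = sum(doc_lens.values()) / len(doc_lens)
    df = {}                                  # document frequencies for idf
    for terms in index.values():
        for t in terms:
            df[t] = df.get(t, 0) + 1
    scores = {}
    for doc_id, terms in index.items():
        norm = k1 * (1 - b + b * doc_lens[doc_id] / avg_len)
        s = 0.0
        for t in query:
            w = terms.get(t, 0.0)            # HDCT weight replaces raw tf
            if w > 0:
                idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
                s += idf * w * (k1 + 1) / (w + norm)
        scores[doc_id] = s
    return scores

index = {"d1": {"bert": 2.0, "ranking": 1.0}, "d2": {"ranking": 3.0}}
scores = bm25_with_hdct(["bert"], index, {"d1": 30, "d2": 40}, N=2)
assert scores["d1"] > scores["d2"] == 0.0
```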
Training Strategy For HDCT
1. Supervision from Document Content
2. Supervision from Relevance
3. Supervision from Pseudo-Relevance Feedback
Experiment
BM25FE is an ensemble of BM25 rankers on different document fields. BM25+RM3 is a popular query-expansion technique.
LeToR is a popular feature-based learning-to-rank method.
BERT-FirstP is a neural BERT-based re-ranker.
HDCT-PRFmarco was trained with the PRF-based weak supervision strategy using BM25FE.
Conclusion
• HDCT better captures key terms in a passage than raw term frequency (tf).
• HDCT allows efficient and effective retrieval from an inverted index.
• A content-based weak-supervision strategy is presented to train HDCT without using relevance labels.
• Widely used in online systems.
Motivation & Contribution
• BERT-style models have succeeded in re-ranking retrieved documents.
• Using BERT-style models in the retrieval phase remains less well studied.
Methodology
Inference & Learning
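The two-tower setup behind both phases can be sketched with a simulated encoder; the real towers are Transformers, and `encode` here is a random stand-in. Inference scores by inner product of query and document embeddings; learning uses a softmax cross-entropy loss with the other documents in the batch as negatives:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(texts, dim=16):
    """Stand-in for a Transformer tower; maps each text to a unit vector."""
    vecs = rng.normal(size=(len(texts), dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

queries = ["who wrote hamlet", "capital of france"]
docs = ["hamlet was written by shakespeare", "paris is the capital of france"]

q, d = encode(queries), encode(docs)
logits = q @ d.T                       # inner-product relevance scores

# In-batch softmax cross-entropy: doc i is the positive for query i,
# the remaining docs in the batch serve as sampled negatives.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss = -np.log(np.diag(probs)).mean()
assert logits.shape == (2, 2) and loss > 0
```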
Three Pre-training Tasks
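The three tasks in the paper are the Inverse Cloze Task (ICT), Body First Selection (BFS), and Wiki Link Prediction (WLP). A minimal sketch of how an ICT training pair can be built; the function name and toy sentences are illustrative:

```python
import random

def inverse_cloze_task(passage_sentences, rng=random.Random(0)):
    """Build one ICT training pair: one sentence is held out as the
    pseudo query, and the remaining sentences form the pseudo document."""
    i = rng.randrange(len(passage_sentences))
    query = passage_sentences[i]
    document = passage_sentences[:i] + passage_sentences[i + 1:]
    return query, document

sents = ["BERT is a transformer.", "It is pretrained on text.",
         "It is fine-tuned for tasks."]
q, d = inverse_cloze_task(sents)
assert q in sents and len(d) == 2 and q not in d
```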
Experiment
Conclusion
• Models with random initialization (No Pretraining) or an unsuitable token-level pre-training task (MLM) are no better than the robust IR baseline BM25 in most cases.
Motivation & Contribution
• A novel late interaction paradigm for estimating relevance between a query and a document
Methodology
Late Interaction
Learning
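Late interaction scores a document by summing, over query tokens, the maximum cosine similarity to any document token (MaxSim). A minimal numpy sketch with random embeddings standing in for the BERT token outputs:

```python
import numpy as np

def colbert_score(q_emb, d_emb):
    """ColBERT late interaction: for each query token, take the max
    cosine similarity against all document tokens, then sum (MaxSim)."""
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    d = d_emb / np.linalg.norm(d_emb, axis=1, keepdims=True)
    sim = q @ d.T                 # (query tokens) x (document tokens)
    return sim.max(axis=1).sum()

rng = np.random.default_rng(0)
q_emb = rng.normal(size=(4, 8))   # 4 query tokens, dim 8
d_emb = rng.normal(size=(20, 8))  # 20 document tokens
score = colbert_score(q_emb, d_emb)
assert score.shape == () and score <= 4.0   # each MaxSim term is <= 1
```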
Two stages to retrieve the top-k documents:
• the first is an approximate stage aimed at filtering (faiss, Facebook)
• the second is a refinement stage
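A numpy sketch of the two stages; the paper indexes token embeddings with faiss, so the brute-force nearest-neighbor search below stands in for the ANN index, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def maxsim(q, d):
    """Exact ColBERT late-interaction score."""
    return (normalize(q) @ normalize(d).T).max(axis=1).sum()

# Corpus of per-document token embeddings, flattened into one matrix
# with a token -> document lookup (what the faiss index would hold).
docs = [rng.normal(size=(int(rng.integers(5, 15)), 8)) for _ in range(50)]
all_tokens = normalize(np.vstack(docs))
token_to_doc = np.repeat(np.arange(50), [len(d) for d in docs])

q = rng.normal(size=(4, 8))       # 4 query token embeddings

# Stage 1: approximate filtering -- nearest document tokens per query token.
sims = normalize(q) @ all_tokens.T
nearest = np.argsort(-sims, axis=1)[:, :10]
candidates = set(token_to_doc[nearest.ravel()])

# Stage 2: refinement -- exact MaxSim over the candidate documents only.
ranked = sorted(candidates, key=lambda i: maxsim(q, docs[i]), reverse=True)
top_k = ranked[:5]
assert 1 <= len(top_k) <= 5 and all(0 <= i < 50 for i in top_k)
```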
Experiment
Conclusion
• ColBERT can leverage the expressiveness of deep LMs while greatly speeding up query processing.
• Easy to implement
• Achieves state-of-the-art (SOTA) effectiveness.
END!