Leveraging Conceptual Lexicon: Query Disambiguation using Proximity Information for Patent Retrieval. Date: 2013/10/30. Authors: Parvaz Mahdabi, Shima Gerani, Jimmy Xiangji Huang and Fabio Crestani. Source: SIGIR’13. Advisor: Jia-ling Koh. Speaker: Yi-hsuan Yeh.
![Page 1: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/1.jpg)
Leveraging Conceptual Lexicon: Query Disambiguation using Proximity Information for Patent Retrieval
Date: 2013/10/30
Authors: Parvaz Mahdabi, Shima Gerani, Jimmy Xiangji Huang and Fabio Crestani
Source: SIGIR’13
Advisor: Jia-ling Koh
Speaker: Yi-hsuan Yeh
![Page 2: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/2.jpg)
Outline
- Introduction
- Method
- Experiments
- Conclusion
![Page 3: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/3.jpg)
Introduction
- Patent prior art search is a task in patent retrieval where the goal is to rank documents that describe prior art related to a patent application.
- Challenges:
1. Find a focused information need and remove ambiguous and noisy terms.
2. Disambiguate the query (e.g., the term "bus").
![Page 4: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/4.jpg)
Introduction
- Previous work has not fully studied the effect of using proximity information and exploiting domain-specific resources for query disambiguation.
1. Terms closer to query terms are more likely to be related to the query topic.
2. Using a domain-dependent resource leads to the extraction of more relevant expansion concepts.
- Proposal: a proximity-based framework for query expansion that utilizes a conceptual lexicon for patent retrieval.
![Page 5: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/5.jpg)
Framework
[Figure: pipeline diagram with the components: query patent document, query, query-specific lexicon, proximity-based method, query expansion terms, re-ranked result list]
![Page 6: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/6.jpg)
Outline
- Introduction
- Method
   - Query document reduction
   - Building conceptual lexicon
   - Proximity-based framework
   - Document relevance score
   - Expansion concept selection strategies
- Experiments
- Conclusion
![Page 7: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/7.jpg)
Query document reduction
- A query patent consists of a title, abstract, description, and claims.
- Example:
1. A chair having only two legs.
2. The chair of claim 1, further comprising at least one leg made of wood.
- Claim 1 is independent because it does not reference any other claim.
- Use the terms in the first independent claim as the initial query.
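The reduction step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `first_independent_claim` and the regular expression are hypothetical, and the sketch assumes a claim is dependent whenever its text mentions another claim by number.

```python
import re

# Assumed heuristic: a claim that mentions "claim N" or "claims N" is dependent.
_CLAIM_REF = re.compile(r"\bclaims?\s+\d+", re.IGNORECASE)


def first_independent_claim(claims):
    """Return the first claim that references no other claim, or None."""
    for text in claims:
        if not _CLAIM_REF.search(text):
            return text
    return None


claims = [
    "A chair having only two legs.",
    "The chair of claim 1, further comprising at least one leg made of wood.",
]
initial_query = first_independent_claim(claims)
```

Real claim sets also use phrasings such as "according to any of the preceding claims", which this pattern does not catch.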
![Page 8: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/8.jpg)
Building conceptual lexicon
- From: IPC (International Patent Classification) definition pages
- Stop-word removal
- Filter out terms with document frequency > 10
- The IPC class of the query is looked up in the lexicon, and the terms matching this class are taken as candidate expansion terms.
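A rough sketch of the lexicon construction and lookup is given below. The names `build_lexicon`, `candidate_terms`, and the tiny stop-word list are hypothetical, and the sketch assumes that document frequency is counted over the IPC definition pages themselves, which the slide does not state.

```python
from collections import Counter

# Toy stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "for", "and", "or", "in", "to", "with"}


def build_lexicon(definitions, max_df=10):
    """Map each IPC class to the terms of its definition page.

    definitions: dict of IPC class -> definition text.
    Terms whose document frequency exceeds max_df are dropped
    (assumption: frequency is counted over the definition pages).
    """
    tokenized = {
        ipc: {w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS}
        for ipc, text in definitions.items()
    }
    df = Counter(term for terms in tokenized.values() for term in terms)
    return {
        ipc: {t for t in terms if df[t] <= max_df}
        for ipc, terms in tokenized.items()
    }


def candidate_terms(lexicon, ipc_class):
    """Look up the candidate expansion terms for the IPC class of a query."""
    return lexicon.get(ipc_class, set())


definitions = {
    "A47C": "chairs and seats for sitting",
    "B60N": "seats for vehicles",
}
lexicon = build_lexicon(definitions, max_df=1)
```

With `max_df=1` the shared term "seats" is filtered out of both toy classes, which mirrors the intent of removing terms that are too common to discriminate between classes.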
![Page 9: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/9.jpg)
Proximity-based framework
- Assumption: an expansion term refers with higher probability to the query terms closer to its position.
[Figure: document d shown as a sequence of positions (labels 1, 12, 20, 32), marking an expansion term and query terms; the legend defines the query term at a given position in document d.]
![Page 10: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/10.jpg)
Example: $p(q \mid t_j) = \frac{2}{5}$
[Figure: documents $d_1$ to $d_5$ with markers for the query and the term; example term: chair]
![Page 11: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/11.jpg)
Kernel functions $k(j, i)$ over positions $i$ and $j$:
- Gaussian kernel
- Laplace kernel
- Rectangle kernel
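The kernel formulas themselves are not preserved in this transcript. The sketch below uses the standard density forms of the three kernels, parameterized so that the bandwidth `sigma` is the standard deviation in each case; this parameterization is an assumption, chosen because it reproduces the value 0.144 shown for the rectangle kernel with bandwidth 2 elsewhere in the deck.

```python
import math


def gaussian_kernel(i, j, sigma):
    """Gaussian density centered at i, evaluated at j."""
    return math.exp(-((i - j) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)


def laplace_kernel(i, j, sigma):
    """Laplace density with scale b chosen so that the variance 2*b^2 equals sigma^2."""
    b = sigma / math.sqrt(2)
    return math.exp(-abs(i - j) / b) / (2 * b)


def rectangle_kernel(i, j, sigma):
    """Uniform density on [i - a, i + a], with a chosen so that the variance a^2/3 equals sigma^2."""
    a = math.sqrt(3) * sigma
    return 1 / (2 * a) if abs(i - j) <= a else 0.0
```

All three functions depend only on the distance between the two positions, so `k(i, j)` and `k(j, i)` give the same value.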
![Page 12: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/12.jpg)
Example: Rectangle kernel
- $\sigma$: bandwidth parameter
- Assume bandwidth = 2
[Figure: plot of $k(i, j)$ against position, with ticks from $i-3$ to $i+3$ and the value 0.144]
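The value 0.144 is consistent with a rectangle kernel written as a uniform density whose standard deviation equals the bandwidth. The slide's own formula is not preserved in the transcript, so the form below is an assumption:

$$
k(i, j) =
\begin{cases}
\dfrac{1}{2a} & \text{if } |i - j| \le a \\
0 & \text{otherwise}
\end{cases}
\qquad a = \sqrt{3}\,\sigma
$$

With $\sigma = 2$, $a = 2\sqrt{3} \approx 3.46$ and $\frac{1}{2a} \approx 0.144$, so the integer positions $i-3, \dots, i+3$ receive weight 0.144 and all other positions receive 0.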
![Page 13: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/13.jpg)
Document relevance score
1. Avg position strategy
2. Max position strategy
[Figure: expansion terms $t_1$, $t_2$, $t_3$ and documents]
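The scoring formulas for the two strategies are not preserved in this transcript. The sketch below is one plausible reading, not the authors' definition: each occurrence of an expansion term is scored by its kernel-weighted closeness to all query-term positions, and the per-position scores are then aggregated by average or by maximum. The function names, the Gaussian kernel choice, and the bandwidth value are all assumptions.

```python
import math


def gaussian_kernel(i, j, sigma):
    """Gaussian density centered at i, evaluated at j (assumed kernel)."""
    return math.exp(-((i - j) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)


def position_scores(term_positions, query_positions, sigma):
    """Score each occurrence of an expansion term by its closeness to the query terms."""
    return [
        sum(gaussian_kernel(j, i, sigma) for i in query_positions)
        for j in term_positions
    ]


def avg_strategy(scores):
    """Average over all positions of the expansion term."""
    return sum(scores) / len(scores) if scores else 0.0


def max_strategy(scores):
    """Score of the best position of the expansion term."""
    return max(scores, default=0.0)


# Hypothetical positions: the expansion term occurs at 12 and 100,
# query terms occur at 1, 20 and 32.
scores = position_scores([12, 100], [1, 20, 32], sigma=10)
```

Under this reading the max strategy rewards a single occurrence close to the query terms, while the avg strategy is pulled down by distant occurrences such as the one at position 100.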
![Page 14: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/14.jpg)
Expansion concept selection strategies
1. Explicit expansion concepts (EEC)
   Restrict the expansion terms to those that appear in the query document.
2. Implicit expansion concepts (IEC)
   Use all expansion terms.
![Page 15: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/15.jpg)
3. Combine search strategies (CSS)
   Linearly combine the query result list and the IPC expansion concepts result list.
4. Proximity-based pseudo relevance feedback (PPRF)
   Extract expansion concepts from the feedback documents.
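The difference between the first two strategies, EEC and IEC, can be sketched as a simple filter. The function name `select_expansion_terms` and its signature are hypothetical; the sketch only illustrates which candidate terms each strategy keeps.

```python
def select_expansion_terms(candidates, query_doc_terms, strategy="IEC"):
    """Select expansion terms from lexicon candidates.

    EEC keeps only the candidates that also occur in the query document;
    IEC keeps every candidate.
    """
    if strategy == "EEC":
        return [t for t in candidates if t in query_doc_terms]
    if strategy == "IEC":
        return list(candidates)
    raise ValueError(f"unknown strategy: {strategy}")


candidates = ["seat", "backrest", "armrest"]
query_doc_terms = {"chair", "leg", "seat", "wood"}
eec_terms = select_expansion_terms(candidates, query_doc_terms, "EEC")
iec_terms = select_expansion_terms(candidates, query_doc_terms, "IEC")
```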
![Page 16: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/16.jpg)
Outline
- Introduction
- Method
- Experiments
- Conclusion
![Page 17: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/17.jpg)
Experiments
- Dataset: CLEF-IP 2010, CLEF-IP 2011
- Evaluation: top 1000 results; MAP, Recall and PRES (patent retrieval evaluation score)
- Baseline: language modeling with Dirichlet smoothing + language model re-rank
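For reference, query likelihood with Dirichlet smoothing estimates $p(w \mid d) = \frac{c(w, d) + \mu\, p(w \mid C)}{|d| + \mu}$. The sketch below shows that standard formula only; the function name, the toy collection statistics, and the value of `mu` are assumptions, and the re-ranking step of the baseline is not covered.

```python
import math
from collections import Counter


def dirichlet_log_likelihood(query_terms, doc_terms, collection_tf, collection_len, mu=2000.0):
    """Log query likelihood of a document under Dirichlet-smoothed unigram estimates."""
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for w in query_terms:
        p_wc = collection_tf.get(w, 0) / collection_len  # collection language model
        p_wd = (tf[w] + mu * p_wc) / (doc_len + mu)       # smoothed document model
        if p_wd == 0.0:
            return float("-inf")  # term unseen in both document and collection
        score += math.log(p_wd)
    return score


# Toy collection statistics (hypothetical).
collection_tf = {"chair": 5, "leg": 5, "bus": 10}
collection_len = 20
d1 = ["chair", "leg", "chair"]
d2 = ["bus", "bus", "bus"]
s1 = dirichlet_log_likelihood(["chair"], d1, collection_tf, collection_len)
s2 = dirichlet_log_likelihood(["chair"], d2, collection_tf, collection_len)
```

The document that actually contains the query term scores higher, while the smoothing term keeps the other document's score finite.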
![Page 18: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/18.jpg)
Motivation for Using Proximity Information
- CLEF-IP 2010
- 100 random queries, top 100 documents
![Page 19: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/19.jpg)
Effect of Density Kernel
![Page 20: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/20.jpg)
Comparison of Max and Avg Strategy
- CLEF-IP 2010
- Gaussian kernel
- IEC
![Page 21: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/21.jpg)
Number of Expansion Terms
![Page 22: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/22.jpg)
Effect of Combination
- λ = 0: only the query expansion model is used.
- λ = 1: only the initial query is used.
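The role of λ can be sketched as a linear interpolation of two score lists. The function name `combine_scores` is hypothetical, and the sketch assumes the two score sets are already on comparable scales and that a document missing from one list contributes 0 for that list.

```python
def combine_scores(query_scores, expansion_scores, lam):
    """Interpolate per-document scores: lam * initial query + (1 - lam) * expansion model."""
    docs = set(query_scores) | set(expansion_scores)
    return {
        d: lam * query_scores.get(d, 0.0) + (1 - lam) * expansion_scores.get(d, 0.0)
        for d in docs
    }


query_scores = {"d1": 0.8, "d2": 0.4}
expansion_scores = {"d1": 0.2, "d2": 0.6}
only_expansion = combine_scores(query_scores, expansion_scores, lam=0.0)
only_query = combine_scores(query_scores, expansion_scores, lam=1.0)
mixed = combine_scores(query_scores, expansion_scores, lam=0.5)
```

At the two endpoints the combined ranking reduces to the expansion-only and query-only rankings, matching the two cases listed on the slide.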
![Page 23: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/23.jpg)
Effect of Query Reformulation
- Gaussian kernel
- Max strategy
- 40 expansion terms
![Page 24: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/24.jpg)
Outline
- Introduction
- Method
- Experiments
- Conclusion
![Page 25: Date : 2013/10/30 Author : Parvaz Mahdabi , Shima Gerani ,](https://reader036.vdocument.in/reader036/viewer/2022062811/56816056550346895dcf7f81/html5/thumbnails/25.jpg)
Conclusion
- We constructed a domain-dependent conceptual lexicon that can be used as an external resource for query expansion.
- The proximity-based retrieval framework provides a principled way to calculate the importance weight of expansion terms selected from the conceptual lexicon.
- We showed that the proximity of expansion terms to query terms is a good indicator of the importance of the expansion terms.