Experiments with the Negotiated Boolean Queries of the TREC 2007 Legal Discovery Track
Stephen Tomlinson
Open Text Corporation
2007 Nov 8
Overview
• who won the boolean query “negotiations”?
• can dropping the boolean operators improve on the boolean run’s Recall@B?
• did the boolean keywords (synonyms) improve on the natural language request text?
• can just relaxing the proximity constraints improve Recall@B?
• can blind feedback improve Recall@B?
• can a fusion of vector and boolean approaches improve Recall@B?
3 Boolean Queries
• Defendant – initial boolean query proposed by the defendant
• Plaintiff – rejoinder boolean query from the plaintiff
• Final – final negotiated boolean query
Topic 74: “All scientific studies expressly referencing health effects
tied to indoor air quality.”
Defendant: "health effect!" w/10 "air quality"
Plaintiff: (scien! OR stud! OR research) AND ("air quality" OR health)
Final: (scien! OR stud! OR research) AND ("air quality" w/15 health)
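To make the operators in these queries concrete, here is a minimal sketch of how the Topic 74 final query could be evaluated against one document. It assumes that a trailing `!` is a prefix wildcard and that `w/15` means the two operands start within 15 token positions of each other; the tokenizer and all function names are hypothetical, and the matching rules actually used in the track may differ.

```python
import re

def tokenize(text):
    # Lowercased alphanumeric tokens; a simplification of real tokenization.
    return re.findall(r"[a-z0-9]+", text.lower())

def positions(tokens, term):
    """Start positions where a word or phrase matches.

    A trailing "!" on a word is treated as a prefix wildcard (assumed semantics).
    """
    words = term.lower().split()
    hits = []
    for i in range(len(tokens) - len(words) + 1):
        ok = True
        for j, w in enumerate(words):
            tok = tokens[i + j]
            ok = tok.startswith(w[:-1]) if w.endswith("!") else tok == w
            if not ok:
                break
        if ok:
            hits.append(i)
    return hits

def within(tokens, a, b, n):
    # True if some match of a starts within n token positions of some match of b.
    pa, pb = positions(tokens, a), positions(tokens, b)
    return any(abs(x - y) <= n for x in pa for y in pb)

def topic74_final(text):
    # (scien! OR stud! OR research) AND ("air quality" w/15 health)
    toks = tokenize(text)
    left = any(positions(toks, t) for t in ("scien!", "stud!", "research"))
    return left and within(toks, "air quality", "health", 15)
```

For example, `topic74_final("A study of indoor air quality and its effects on health")` is true, while a passage that mentions a study and indoor air but never the phrase "air quality" is rejected.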
Topic 74 Boolean Results
Defendant: "health effect!" w/10 "air quality"
– 2,691 matches, 82% precision, 3% recall
Plaintiff: (scien! OR stud! OR research) AND ("air quality" OR health)
– 858,700 matches, 64% precision@25000 (ranked), 25% recall@25000 (ranked)
Final: (scien! OR stud! OR research) AND ("air quality" w/15 health)
– 20,516 matches, 77% precision, 22% recall
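The match counts, precision and recall on this slide are tied together by simple arithmetic, sketched below for the final query. The helper names are hypothetical, and because the slide's percentages are rounded (and are themselves estimates), the derived numbers are only rough.

```python
def relevant_retrieved(matches, precision):
    # precision = relevant retrieved / retrieved
    return matches * precision

def estimated_total_relevant(matches, precision, recall):
    # recall = relevant retrieved / total relevant
    return relevant_retrieved(matches, precision) / recall

# Final query, using the rounded figures from the slide.
final_hits = relevant_retrieved(20516, 0.77)
final_total = estimated_total_relevant(20516, 0.77, 0.22)
```

Applying the same arithmetic to the defendant and plaintiff rows gives somewhat different totals, which is expected when the inputs are rounded to whole percentages.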
Topic 74: Missed Relevant Documents
Final Boolean:(scien! OR stud! OR research) AND ("air quality" w/15 health)
Passages in Missed Relevant Documents:
• “… Lowrey A.H. (1980). Indoor air pollution …”
• “assessment … entitled “Respiratory Health Effects of Passive Smoking …”
• “study … funded by the Center for Indoor Air Research”
Defendant vs. Final Boolean: Precision
[Plot: per-topic differences in Prec across the 43 topics; y-axis from -1 to 1]
• Def. Boolean won 20
• Boolean won 22
• (1 tied)
Mean in (-0.09, 0.15)
Topic 63: 1.00 vs. 0.02 (sugar contract)
Topic 69: 0.00 vs. 0.97 (indoor smoke ventilation)
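The win, loss and tie counts and the "Mean in (…)" interval on comparison slides like this one can be sketched as follows. The slides do not say how the interval was computed, so the normal-approximation 95% interval used here is an assumption, and `compare` is a hypothetical name.

```python
import math
import statistics

def compare(scores_a, scores_b, z=1.96):
    """Per-topic comparison of two runs on the same topics.

    Returns (wins for a, wins for b, ties, interval for the mean difference).
    The interval is a normal approximation (an assumption, not the slides' method).
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    wins = sum(d > 0 for d in diffs)
    losses = sum(d < 0 for d in diffs)
    ties = sum(d == 0 for d in diffs)
    mean = statistics.mean(diffs)
    half = z * statistics.stdev(diffs) / math.sqrt(len(diffs))
    return wins, losses, ties, (mean - half, mean + half)
```

An interval that contains 0, as in the precision comparison above, means this kind of test does not separate the two runs on average, even when individual topics differ sharply.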
Defendant vs. Final Boolean: Recall
[Plot: per-topic differences in R@B across the 43 topics; y-axis from -1 to 0]
• Def. Boolean won 0
• Boolean won 42
• (1 tied)
Mean in (-0.27, -0.11)
Topic 77: 0.00 vs. 0.00 (smoke NOT tobacco)
Topic 52: 0.00 vs. 0.98 (boosting crop yields)
Plaintiff vs. Final Boolean: Recall@25000
[Plot: per-topic differences in R25000 across the 43 topics; y-axis from -0.8 to 0.8]
• Pl. Boolean won 35
• Boolean won 6
• (2 tied)
Mean in (0.03, 0.19)
Topic 59: 0.76 vs. 0.01 (limestone treatment)
Topic 58: 0.24 vs. 0.94 (phosphates and health)
Plaintiff vs. Final Boolean: Recall@B
[Plot: per-topic differences in R@B across the 43 topics; y-axis from -0.8 to 0.6]
• Pl. Boolean won 15
• Boolean won 27
• (1 tied)
Mean in (-0.09, 0.04)
Topic 63: 0.73 vs. 0.27 (sugar contract)
Topic 58: 0.18 vs. 0.94 (phosphates and health)
Vector vs. Boolean (Example)
Boolean: (scien! OR stud! OR research) AND ("air quality" w/15 health)
Vector: scien! OR stud! OR research OR air OR quality OR health
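The conversion shown in this example can be sketched as a small rewrite: drop AND and the proximity operators, break phrases into single words, and OR everything together. The function name is hypothetical, and since the slides do not say how NOT was handled, the sketch refuses such queries rather than guess.

```python
import re

def boolean_to_vector(query):
    # Remove parentheses and quotes so that phrases fall apart into single words.
    flat = re.sub(r'[()"]', " ", query)
    terms = []
    for tok in flat.split():
        if tok.upper() == "NOT":
            # Handling of negation is not described in the source.
            raise NotImplementedError("NOT is not handled in this sketch")
        # Skip the boolean and proximity operators themselves.
        if tok.upper() in {"AND", "OR"} or re.fullmatch(r"w/\d+", tok, re.I):
            continue
        if tok not in terms:
            terms.append(tok)
    return " OR ".join(terms)
```

Applied to the boolean query above, this reproduces the vector form shown on the slide.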
Relevance Ranking
• term frequency dampening (BM25)
– wildcard variants treated as same term
– for boolean proximity constraints, only count term occurrences satisfying proximity
– metadata + OCR included in document length
• inverse document frequency (log)
– based on most common variant for wildcards
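The ranking ingredients listed above can be illustrated with a minimal sketch of a BM25-style term weight for a wildcard term. The textbook BM25 saturation formula, the parameter values `k1=1.2` and `b=0.75`, the plain `log(N/df)` form of idf and all function names are assumptions; the slides do not give the exact formula used.

```python
import math

def dampened_tf(tf, doc_len, avg_len, k1=1.2, b=0.75):
    # BM25-style saturation: grows with tf but levels off near k1 + 1,
    # and is discounted for documents longer than average.
    return tf * (k1 + 1) / (tf + k1 * (1 - b + b * doc_len / avg_len))

def idf(num_docs, doc_freq):
    # Logarithmic inverse document frequency (assumed form).
    return math.log(num_docs / doc_freq)

def wildcard_weight(prefix, doc_tokens, doc_freqs, num_docs, avg_len):
    # All variants of the wildcard count as occurrences of one term.
    tf = sum(1 for t in doc_tokens if t.startswith(prefix))
    if tf == 0:
        return 0.0
    # idf is taken from the most common variant, as the slide describes.
    df = max(d for term, d in doc_freqs.items() if term.startswith(prefix))
    # doc_tokens stands in for the full document (metadata plus OCR text).
    return dampened_tf(tf, len(doc_tokens), avg_len) * idf(num_docs, df)
```

For a boolean query with a proximity constraint, `tf` would instead count only the occurrences that satisfy the constraint.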
Vector vs. Boolean: Recall@B
[Plot: per-topic differences in R@B across the 43 topics; y-axis from -1 to 0.6]
• Vector won 16
• Boolean won 26
• (1 tied)
Mean in (-0.13, 0.02)
Topic 63: 0.79 vs. 0.27 (sugar contract)
Topic 58: 0.08 vs. 0.94 (phosphates and health)
Topic 58: “… health problems caused by HPF …”
Vector R@B=0.08, Boolean R@B=0.94
• (B=8183, estRel = 1151)
Phosphat! w/75 (caus! OR relat! OR assoc! OR derive! OR correlat!) w/75 (health OR disorder! OR toxic! OR "chronic fatigue" OR dysfunction! OR irregular OR memor! OR immun! OR myopath! OR liver! OR kidney! OR heart! OR depress! OR loss OR lost)
• vector matches often didn’t mention “Phosphat!”
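Recall@B is recall measured at depth B, where B is the number of documents matched by the topic's final boolean query. A minimal sketch follows; it assumes a complete set of relevance judgments, whereas the track worked with estimates (hence "estRel"), and `recall_at` is a hypothetical name.

```python
def recall_at(ranked_ids, relevant_ids, depth):
    # Fraction of the relevant documents found in the top `depth` results.
    if not relevant_ids:
        return 0.0
    top = ranked_ids[:depth]
    found = sum(1 for d in top if d in relevant_ids)
    return found / len(relevant_ids)
```

Cutting every run at the same depth B puts a ranked run on equal footing with the unranked boolean result set, which is what makes the per-topic comparisons on these slides possible.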
Topic 72: “… chemical process(es) which result in onions … making persons cry”
Vector R@B=0.03, Boolean R@B=0.78
• (B=119, estRel = 98)
((scien! OR research! OR chemical) w/25 onion!) AND (cries OR cry! OR tear!)
• proximity clause found some long documents with just one reference to onions’ effects
Topic 63: “… exclusivity clause in a sugar contract …”
Vector R@B=0.79, Boolean R@B=0.27
• (B=294, estRel = 18)
(Sugar w/20 (contract! OR agreement! OR deal!)) AND exclusiv!
• boolean missed “U.S. sugar quota law”
Request vs. Vector: R@25000
[Plot: per-topic differences in R25000 across the 43 topics; y-axis from -0.4 to 1]
• Req. Vector won 21
• Vector won 22
• (0 tied)
Mean in (0.00, 0.13)
Topic 87: 1.00 vs. 0.13 (SEC reporting)
Topic 84: 0.64 vs. 0.91 (1960s films)
Impact of Doubling Proximity Distances: Recall@B
[Plot: per-topic differences in R@B across the 43 topics; y-axis from -0.4 to 0.1]
• 2x-Prox Boolean won 14
• Boolean won 8
• (21 tied)
Mean in (-0.03, 0.02)
Topic 61: 0.49 vs. 0.44 (waste treatment)
Topic 72: 0.39 vs. 0.78 (onions effect)
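Doubling the proximity distances amounts to a mechanical rewrite of each `w/N` operator into `w/2N`. A minimal sketch, assuming the operator always has the form `w/` followed by digits (the function name is hypothetical):

```python
import re

def double_proximity(query):
    # Rewrite every w/N operator as w/2N, leaving the rest of the query alone.
    return re.sub(
        r"\b(w/)(\d+)",
        lambda m: f"{m.group(1)}{2 * int(m.group(2))}",
        query,
        flags=re.IGNORECASE,
    )
```

Queries without a proximity operator come back unchanged, which would account for at least part of the large number of tied topics.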
Impact of Blind Feedback: Recall@B
[Plot: per-topic differences in R@B across the 43 topics; y-axis from -1 to 0.6]
• Boolean+BF won 16
• Boolean won 21
• (6 tied)
Mean in (-0.12, 0.03)
Topic 90: 0.64 vs. 0.10 (sales in England)
Topic 58: 0.01 vs. 0.94 (phosphates and health)
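The slides do not describe how blind feedback was implemented. As a generic illustration of the idea only, the sketch below picks expansion terms from the top-ranked documents of an initial run without any relevance judgments; the selection rule (document frequency within the top documents) and all names and defaults are assumptions.

```python
from collections import Counter

def expansion_terms(ranked_docs, query_terms, top_docs=5, num_terms=3):
    """Pick new terms that occur in the most top-ranked documents.

    ranked_docs is a list of token lists, best document first.
    """
    counts = Counter()
    for tokens in ranked_docs[:top_docs]:
        # Count each candidate term once per document, skipping query terms.
        counts.update(set(tokens) - set(query_terms))
    # Most frequent first; ties broken alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ordered[:num_terms]]
```

The selected terms would then be added to the query and the collection searched again.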
Fusion of Boolean, Request and Vector: Recall@B
[Plot: per-topic differences in R@B across the 43 topics; y-axis from -1 to 0.4]
• Fusion won 20
• Boolean won 20
• (3 tied)
Mean in (-0.08, 0.03)
Topic 65: 0.88 vs. 0.67 (candy packaging)
Topic 58: 0.10 vs. 0.94 (phosphates and health)
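The slides do not say which fusion method combined the boolean, request and vector runs. As a stand-in, the sketch below uses reciprocal rank fusion, a common way to merge ranked lists; the method choice, the constant `k=60` and the function name are all assumptions rather than the method behind these results.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores the sum of 1 / (k + rank) over the lists it appears in.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Documents ranked highly by several runs rise to the top, while a document found by only one run can still appear lower down.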
Conclusions
• final negotiated boolean query often had substantially lower recall than the plaintiff boolean query
• boolean operators (AND, proximity) often have value
• blind feedback and fusion did not improve the boolean run’s Recall@B (on average)