![Page 1: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/1.jpg)
Machine Reading the Web
Estevam R. Hruschka Jr. Federal University of São Carlos
![Page 2: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/2.jpg)
Disclaimers
• A previous version of this tutorial was presented at IBERAMIA 2012 (http://iberamia2012.dsic.upv.es/tutorials/).
• Feel free to e-mail me ([email protected]) with questions about this tutorial or any feedback, suggestions, or criticisms. Your feedback can help improve the quality of these slides, so it is very welcome.
• As with many tutorial slides, these were prepared to be presented and later studied. Thus, they are meant to be more self-contained than slides from a paper presentation.
![Page 3: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/3.jpg)
Disclaimers
• Due to time constraints, I do not intend to cover all the algorithms and publications related to YAGO, KnowItAll and NELL. Instead, I intend to give an overview of the three projects and of the main approach each one uses to “Read the Web”.
• YAGO, KnowItAll and NELL are not the only research efforts focusing on “Reading the Web”. They were selected for this tutorial because they show three different and very relevant approaches to this problem; this does not mean they are necessarily the best ones.
![Page 4: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/4.jpg)
Outline
• Machine Learning
• Machine Reading
• Reading the Web
  – YAGO
  – KnowItAll
  – NELL
![Page 5: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/5.jpg)
![Page 6: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/6.jpg)
Picture taken from [Fern, 2008]
![Page 7: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/7.jpg)
![Page 8: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/8.jpg)
![Page 9: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/9.jpg)
Picture taken from [DARPA, 2012]
![Page 10: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/10.jpg)
Picture taken from [DARPA, 2012]
![Page 11: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/11.jpg)
![Page 12: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/12.jpg)
![Page 13: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/13.jpg)
![Page 14: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/14.jpg)
The YAGO-NAGA Project: Harvesting, Searching, and Ranking Knowledge from the Web
![Page 15: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/15.jpg)
![Page 16: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/16.jpg)
![Page 17: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/17.jpg)
KnowItAll
![Page 18: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/18.jpg)
KnowItAll: Open Information Extraction
![Page 19: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/19.jpg)
![Page 20: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/20.jpg)
![Page 21: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/21.jpg)
NELL
![Page 22: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/22.jpg)
![Page 23: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/23.jpg)
Machine Learning
• What is Machine Learning?
The field of Machine Learning seeks to answer the question “How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?” [Mitchell, 2006]
![Page 24: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/24.jpg)
Machine Learning
• What is Machine Learning?
A machine learns with respect to a particular:
- task T
- performance metric P
- type of experience E
if the system reliably improves its performance P at task T, following experience E. [Mitchell, 1997]
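The T/P/E definition can be made concrete with a toy sketch. Everything here is an assumption made for illustration and not part of the slides: the one-dimensional data, the nearest-centroid learner, and the function names. T is classifying a number as "A" or "B", P is accuracy on a fixed test set, and E is the set of labeled training examples.

```python
# Task T: classify a number as class "A" or "B".
# Performance P: accuracy on a fixed test set.
# Experience E: labeled training examples (all values are made up).

def centroids(examples):
    """Return the mean value of each class in the labeled examples."""
    groups = {}
    for value, label in examples:
        groups.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

def predict(model, value):
    """Assign the class whose centroid is closest to the value."""
    return min(model, key=lambda label: abs(value - model[label]))

def accuracy(train, test):
    """Train on `train`, then measure the fraction of correct test predictions."""
    model = centroids(train)
    hits = sum(predict(model, value) == label for value, label in test)
    return hits / len(test)

train = [(9, "A"), (19, "B"), (4, "A"), (14, "B"), (5, "A"), (12, "B")]
test = [(3, "A"), (6, "A"), (8, "A"), (12, "B"), (13, "B"), (18, "B")]

p_small = accuracy(train[:2], test)  # little experience E
p_full = accuracy(train, test)       # more experience E
print(p_small, p_full)
```

On this toy data the performance P at task T is higher after more experience E, which is what the definition requires of a learning system.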
![Page 25: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/25.jpg)
Machine Learning
• Examples of Machine Learning approaches for different tasks (T), performance metrics (P) and experiences (E)
- data mining
- autonomous discovery
- database updating
- programming by example
- pattern recognition
![Page 26: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/26.jpg)
Machine Learning
• Supervised Learning
• Unsupervised Learning
• Semi-Supervised Learning
![Page 27: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/27.jpg)
Supervised Learning
![Page 28: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/28.jpg)
Supervised Learning
[Plot: two labeled series, Series1 and Series2; both axes range from 0 to 25]
![Page 29: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/29.jpg)
Supervised Learning
![Page 30: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/30.jpg)
Supervised Learning
![Page 31: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/31.jpg)
Supervised Learning
![Page 32: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/32.jpg)
Supervised Learning
![Page 33: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/33.jpg)
Supervised Learning
![Page 34: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/34.jpg)
Supervised Learning
![Page 35: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/35.jpg)
Supervised Learning
![Page 36: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/36.jpg)
Supervised Learning
![Page 37: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/37.jpg)
Supervised Learning
![Page 38: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/38.jpg)
Supervised Learning
![Page 39: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/39.jpg)
Supervised Learning
![Page 40: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/40.jpg)
Unsupervised Learning
[Plot: both axes range from 0 to 25; no series legend]
![Page 41: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/41.jpg)
Unsupervised Learning
![Page 42: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/42.jpg)
Unsupervised Learning
![Page 43: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/43.jpg)
Unsupervised Learning
![Page 44: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/44.jpg)
Semi-supervised Learning
[Plot: Series1, Series2, and Unlabeled points; both axes range from 0 to 25]
![Page 45: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/45.jpg)
Semi-supervised Learning
![Page 46: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/46.jpg)
Semi-supervised Learning
![Page 47: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/47.jpg)
Semi-supervised Learning
![Page 48: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/48.jpg)
Semi-supervised Learning
![Page 49: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/49.jpg)
Semi-supervised Learning
![Page 50: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/50.jpg)
Semi-supervised Learning
![Page 51: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/51.jpg)
Semi-supervised Learning
![Page 52: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/52.jpg)
Semi-supervised Learning
![Page 53: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/53.jpg)
Semi-supervised Learning
![Page 54: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/54.jpg)
Semi-supervised Learning
![Page 55: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/55.jpg)
Semi-supervised Learning
![Page 56: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/56.jpg)
Semi-supervised Learning
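The semi-supervised slides are purely graphical, so the following is only a sketch of one common semi-supervised strategy, self-training. The slides do not state which strategy the animation depicts. The data, the nearest-centroid model, and all names are assumptions made for illustration. Starting from two labeled points, the learner repeatedly labels the unlabeled point it is most confident about and then retrains.

```python
def centroids(labeled):
    """Mean value per class over the currently labeled points."""
    groups = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}

def self_train(labeled, unlabeled):
    """Label one unlabeled point per round, most confident first, then retrain."""
    labeled = list(labeled)
    pool = list(unlabeled)
    while pool:
        model = centroids(labeled)
        # Confidence heuristic (an assumption): the point closest to any centroid.
        best = min(pool, key=lambda v: min(abs(v - c) for c in model.values()))
        label = min(model, key=lambda lab: abs(best - model[lab]))
        labeled.append((best, label))
        pool.remove(best)
    return labeled

seed = [(2, "A"), (20, "B")]    # the few labeled examples
unlabeled = [4, 6, 16, 18, 9]   # the many unlabeled examples
result = self_train(seed, unlabeled)
print(result)
```

The same bootstrapping idea, starting from a few seeds and growing the labeled set iteratively, reappears in the pattern-based extraction approaches discussed later in the tutorial.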
![Page 57: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/57.jpg)
![Page 58: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/58.jpg)
Machine Reading
• “The autonomous understanding of text” [Etzioni et al., 2007]
• “One of the most important methods by which human beings learn is by reading” [Clark et al., 2007]. So why not build machines capable of learning by reading?
![Page 59: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/59.jpg)
Machine Reading
• “The problem of deciding what was implied by a written text, of reading between the lines is the problem of inference.” [Norvig, 2007]
• Typically, Machine Reading is different from Natural Language Processing alone
![Page 60: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/60.jpg)
Machine Reading
![Page 61: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/61.jpg)
Machine Reading
![Page 62: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/62.jpg)
Machine Reading
![Page 63: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/63.jpg)
Machine Reading
![Page 64: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/64.jpg)
Machine Reading
• One important approach to machine reading is to extract facts from text and store them in a structured form.
• Facts can be seen as entities and their relations
• An ontology is one of the most common representations for the extracted facts
![Page 65: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/65.jpg)
It’s about the disappearance forty years ago of Harriet Vanger, a young scion of one of the wealthiest families in Sweden, and about her uncle, determined to know the truth about what he believes was her murder. Blomkvist visits Henrik Vanger at his estate on the tiny island of Hedeby. The old man draws Blomkvist in by promising solid evidence against Wennerström. Blomkvist agrees to spend a year writing the Vanger family history as a cover for the real assignment: the disappearance of Vanger's niece Harriet some 40 years earlier. Hedeby is home to several generations of Vangers, all part owners in Vanger Enterprises. Blomkvist becomes acquainted with the members of the extended Vanger family, most of whom resent his presence. He does, however, start a short lived affair with Cecilia, the niece of Henrik. After discovering that Salander has hacked into his computer, he persuades her to assist him with research. They eventually become lovers, but Blomkvist has trouble getting close to Lisbeth who treats virtually everyone she meets with hostility. Ultimately the two discover that Harriet's brother Martin, CEO of Vanger Industries, is secretly a serial killer. A 24-year-old computer hacker sporting an assortment of tattoos and body piercings supports herself by doing deep background investigations for Dragan Armansky, who, in turn, worries that Lisbeth Salander is “the perfect victim for anyone who wished her ill."
Machine Reading
This slide was adapted from [Hady et al., 2011]
![Page 66: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/66.jpg)
Machine Reading
[Figure: the passage from the previous slide, annotated with one link labeled “same”]
This slide was adapted from [Hady et al., 2011]
![Page 67: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/67.jpg)
Machine Reading
[Figure: the same passage, annotated with six links labeled “same”]
This slide was adapted from [Hady et al., 2011]
![Page 68: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/68.jpg)
Machine Reading
[Figure: the same passage, annotated with six links labeled “same” and the relation labels uncleOf, owns, hires, headOf]
This slide was adapted from [Hady et al., 2011]
![Page 69: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/69.jpg)
Machine Reading
[Figure: the same passage, annotated with six links labeled “same” and the relation labels uncleOf, owns, hires, headOf, affairWith (twice), enemyOf]
This slide was adapted from [Hady et al., 2011]
![Page 70: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/70.jpg)
Machine Reading
• Ontology Representation
• Named Entity Resolution/Extraction
• Relation Extraction
![Page 71: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/71.jpg)
Machine Reading
• Ontology Representation
Facts (RDF triples):
1: (Jim, hasAdvisor, Mike)
2: (Surajit, hasAdvisor, Jeff)
3: (Madonna, marriedTo, GuyRitchie)
4: (Nicolas, marriedTo, Carla)
5: (ManchesterU, wonCup, ChampionsLeague)
Reification (“Facts about Facts”):
6: (1, inYear, 1968)
7: (2, inYear, 2006)
8: (3, validFrom, 22-Dec-2000)
9: (3, validUntil, Nov-2008)
10: (4, validFrom, 2-Feb-2008)
11: (2, source, SigmodRecord)
12: (5, inYear, 1999)
13: (5, location, CampNou)
14: (5, source, Wikipedia)
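A minimal sketch of how numbered facts and reified “facts about facts” could be held in memory. The `Fact` type, the `facts` dictionary, and `facts_about` are hypothetical names chosen for illustration, not an API of YAGO or of any RDF library. The data is a subset of the slide's facts. A reified fact simply uses the id of another fact as its subject.

```python
from typing import NamedTuple, Union

class Fact(NamedTuple):
    subject: Union[str, int]  # an entity name, or the id of another fact
    predicate: str
    obj: str

# Fact ids follow the numbering used on the slide.
facts = {
    1: Fact("Jim", "hasAdvisor", "Mike"),
    2: Fact("Surajit", "hasAdvisor", "Jeff"),
    3: Fact("Madonna", "marriedTo", "GuyRitchie"),
    6: Fact(1, "inYear", "1968"),
    7: Fact(2, "inYear", "2006"),
    8: Fact(3, "validFrom", "22-Dec-2000"),
    11: Fact(2, "source", "SigmodRecord"),
}

def facts_about(fact_id):
    """Collect the reified statements whose subject is the given fact id."""
    return {f.predicate: f.obj for f in facts.values() if f.subject == fact_id}

print(facts[2], facts_about(2))
```

Keeping metadata such as year, validity interval, and source as separate facts means the base triple format never has to change when a new kind of annotation is needed.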
![Page 72: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/72.jpg)
Machine Reading
• Named Entity Resolution [Theobald & Weikum, 2012]
  – Which individual entities belong to which classes?
    • instanceOf (Surajit Chaudhuri, computer scientists),
    • instanceOf (Barbara Liskov, computer scientists),
    • instanceOf (Barbara Liskov, female humans), …
  – Which names denote which entities?
    • means (“Lady Di”, Diana Spencer),
    • means (“Diana Frances Mountbatten-Windsor”, Diana Spencer), …
    • means (“Madonna”, Madonna Louise Ciccone),
    • means (“Madonna”, Madonna (painting by Edvard Munch)), …
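The two questions above can be sketched with two lookup tables. This is only an illustration: the dictionaries and helper functions are hypothetical, and a real resolver would also need context to choose among the candidates. The point is that `means` is many-to-many, so one name can denote several entities and one entity can have several names.

```python
from collections import defaultdict

means = defaultdict(set)        # surface name -> candidate entities
instance_of = defaultdict(set)  # entity -> classes

def add_means(name, entity):
    means[name].add(entity)

add_means("Lady Di", "Diana Spencer")
add_means("Diana Frances Mountbatten-Windsor", "Diana Spencer")
add_means("Madonna", "Madonna Louise Ciccone")
add_means("Madonna", "Madonna (painting by Edvard Munch)")

instance_of["Surajit Chaudhuri"].add("computer scientists")
instance_of["Barbara Liskov"].update({"computer scientists", "female humans"})

def candidates(name):
    """All entities a name may denote, in a stable order."""
    return sorted(means.get(name, ()))

def is_ambiguous(name):
    """A name is ambiguous when it denotes more than one entity."""
    return len(means.get(name, ())) > 1

print(candidates("Madonna"), is_ambiguous("Lady Di"))
```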
![Page 73: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/73.jpg)
Machine Reading
• Relation Extraction [Theobald & Weikum, 2012]
  – Which instances (pairs of individual entities) are there for given binary relations with specific type signatures?
    • hasAdvisor (JimGray, MikeHarrison)
    • hasAdvisor (HectorGarcia-Molina, Gio Wiederhold)
    • hasAdvisor (Susan Davidson, Hector Garcia-Molina)
    • graduatedAt (JimGray, Berkeley)
    • graduatedAt (HectorGarcia-Molina, Stanford)
    • hasWonPrize (JimGray, TuringAward)
    • bornOn (JohnLennon, 9Oct1940)
    • diedOn (JohnLennon, 8Dec1980)
    • marriedTo (JohnLennon, YokoOno)
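A type signature restricts which pairs of entities can be instances of a relation. The sketch below shows one way such a check might look. The type names (`person`, `university`, `date`), the two tables, and the `accept` function are assumptions made for illustration; the slides only state that relations have type signatures.

```python
# Hypothetical signatures: relation -> (subject type, object type).
signatures = {
    "hasAdvisor": ("person", "person"),
    "graduatedAt": ("person", "university"),
    "bornOn": ("person", "date"),
}

# Hypothetical entity typing, e.g. the output of named entity resolution.
entity_types = {
    "JimGray": "person",
    "MikeHarrison": "person",
    "JohnLennon": "person",
    "Berkeley": "university",
    "9Oct1940": "date",
}

def accept(relation, subject, obj):
    """Keep a candidate fact only if both arguments match the signature."""
    signature = signatures.get(relation)
    if signature is None:
        return False
    return (entity_types.get(subject), entity_types.get(obj)) == signature

print(accept("hasAdvisor", "JimGray", "MikeHarrison"))
print(accept("graduatedAt", "JimGray", "MikeHarrison"))
```

Such a filter discards candidates like graduatedAt (JimGray, MikeHarrison) before any statistical evidence is even considered.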
![Page 74: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/74.jpg)
Machine Reading
• Relation Discovery
  – Which new relations are there for a given pair of entities?
    • hasAdvisor (JimGray, MikeHarrison)
![Page 75: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/75.jpg)
Machine Reading
• Relation Discovery
  – Which new relations are there for a given pair of entities?
    • hasAdvisor (JimGray, MikeHarrison)
    • hasCoAuthor (HectorGarcia-Molina, Gio Wiederhold)
![Page 76: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/76.jpg)
Machine Reading
• Relation Discovery
  – Which new relations are there for a given pair of entities?
    • hasAdvisor (JimGray, MikeHarrison)
    • hasCoAuthor (HectorGarcia-Molina, Gio Wiederhold)
    • graduatedAt (JimGray, Berkeley)
![Page 77: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/77.jpg)
Machine Reading
• Relation Discovery
  – Which new relations are there for a given pair of entities?
    • hasAdvisor (JimGray, MikeHarrison)
    • hasCoAuthor (HectorGarcia-Molina, Gio Wiederhold)
    • graduatedAt (JimGray, Berkeley)
    • studiedAt (HectorGarcia-Molina, Stanford)
    • bornOn (JohnLennon, 9Oct1940)
    • releasedAlbum (JohnLennon, 10Dec1965)
![Page 78: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/78.jpg)
Machine Reading
• Named Entity Resolution/Extraction and Relation Extraction
  – Semi-structured data: the “Low-Hanging Fruit”
    • Wikipedia infoboxes & categories
    • HTML lists & tables, etc.
  – Free text
    • Hearst patterns; clustering by verbal phrases
    • Natural-language processing
    • Advanced patterns & iterative bootstrapping (“Dual Iterative Pattern Relation Expansion”)
  – POS tagging and NP chunking
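Hearst patterns are lexico-syntactic templates such as “X such as Y1, Y2 and Y3”, from which (Y, X) pairs are read off as instance/class candidates. The sketch below handles only that one template. As a simplifying assumption it treats a single word as a noun phrase, whereas a real extractor would rely on the POS tagging and NP chunking mentioned above. All names are hypothetical.

```python
import re

# "<hypernym> such as <list of hyponyms>", up to the end of the clause.
SUCH_AS = re.compile(r"(?P<hypernym>\w+),? such as (?P<hyponyms>[^.;]+)")
# Separators inside the hyponym list: commas, "and", "or".
SEPARATOR = re.compile(r",\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")

def hearst_such_as(sentence):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group("hypernym")
        for hyponym in SEPARATOR.split(match.group("hyponyms")):
            hyponym = hyponym.strip()
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs

pairs = hearst_such_as("He has visited cities such as Paris, Rome and Madrid.")
print(pairs)
```

Each extracted pair is only a candidate. Pattern-based systems typically score candidates by how often and how consistently they are extracted before accepting them as facts.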
![Page 79: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/79.jpg)
![Page 80: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/80.jpg)
![Page 81: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/81.jpg)
The YAGO-NAGA Project: Harvesting, Searching, and Ranking Knowledge from the Web
![Page 82: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/82.jpg)
![Page 83: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/83.jpg)
YAGO
• Yet Another Great Ontology (YAGO)
• Main goal: building a conveniently searchable, large-scale, highly accurate knowledge base of common facts in a machine-processable representation
![Page 84: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/84.jpg)
YAGO
• Turn Web into Knowledge Base [Weikum et al., 2009]
  – Building a comprehensive Knowledge Base of human knowledge
  – Knowledge comes from Wikipedia and WordNet
  – The ontology checks itself for precision
![Page 85: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/85.jpg)
YAGO
• The knowledge base is automatically constructed from Wikipedia
• Each article in Wikipedia becomes an entity in the knowledge base (e.g., since Leonard Cohen has an article in Wikipedia, LeonardCohen becomes an entity in YAGO).
![Page 86: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/86.jpg)
YAGO
![Page 87: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/87.jpg)
YAGO Free Text
![Page 88: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/88.jpg)
YAGO Free Text
![Page 89: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/89.jpg)
YAGO Free Text
InfoBox
![Page 90: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/90.jpg)
YAGO Wikipedia InfoBox
![Page 91: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/91.jpg)
YAGO Wikipedia InfoBox
Semi-structured data: the “Low-Hanging Fruit”
![Page 92: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/92.jpg)
YAGO Wikipedia InfoBox
Semi-structured data: the “Low-Hanging Fruit”
![Page 93: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/93.jpg)
YAGO
• Certain categories are exploited to deliver type information (e.g., the article about Leonard Cohen is in the category Canadian poets, so he becomes a Canadian poet).
![Page 94: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/94.jpg)
YAGO
![Page 95: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/95.jpg)
YAGO
![Page 96: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/96.jpg)
YAGO
• For each category of a page [Hoffart et al., 2012]:
– Using shallow parsing, determine the head word of the category name. In the example of Canadian poets, the head word is poets.
– If the head word is plural, propose the category as a class and the article entity as an instance.
– Link the class to the WordNet taxonomy (most frequent sense of the head word in WordNet).
• Only countable nouns can appear in plural form.
• Only countable nouns can be ontological classes.
• Thematic categories (such as Canadian poetry) are different from conceptual categories.
![Page 97: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/97.jpg)
YAGO
• Head words that are not conceptual even though they appear in plural (such as stubs in Canadian poetry stubs) are in the first list of exceptions.
• Words that map to a sense other than their most frequent one are in the second exception list.
– The word capital, e.g., refers to the main city of a country in the majority of cases and not to the financial amount, which is the most frequent sense in WordNet.
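The category heuristic and its two exception lists can be sketched as below. This is a minimal illustration, not YAGO's implementation: the tiny exception lists, the preposition-based `head_word` stand-in for shallow parsing, the naive plural test and the sense labels are all assumptions made for the example.

```python
# Hypothetical, tiny exception lists (the real lists are maintained by hand).
NON_CONCEPTUAL_HEADS = {"stubs"}                               # first exception list
SENSE_OVERRIDES = {"capital": "capital (seat of government)"}  # second exception list

PREPOSITIONS = {"of", "in", "from", "by"}

def head_word(category):
    """Crude stand-in for shallow parsing: the head is the token just before
    the first preposition, or the last token if there is none."""
    tokens = category.split()
    for i, token in enumerate(tokens):
        if i > 0 and token.lower() in PREPOSITIONS:
            return tokens[i - 1].lower()
    return tokens[-1].lower()

def is_plural(word):
    """Naive plural test; a real system would use a morphological analyzer."""
    return word.endswith("s") and not word.endswith("ss")

def category_to_class(category):
    """Return a class proposal for a conceptual category, or None otherwise."""
    head = head_word(category)
    if head in NON_CONCEPTUAL_HEADS or not is_plural(head):
        return None  # thematic or administrative category
    singular = head[:-1]  # naive singularization, good enough for the sketch
    sense = SENSE_OVERRIDES.get(singular, singular + " (most frequent sense)")
    return {"class": category, "head": head, "wordnet_sense": sense}

for name in ["Canadian poets", "Canadian poetry", "Canadian poetry stubs"]:
    print(name, "->", category_to_class(name))
```

Only "Canadian poets" yields a class here; the other two are rejected by the plural test and by the first exception list, respectively.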
![Page 98: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/98.jpg)
YAGO
• About 100 manually defined relations
– wasBornOnDate
– locatedIn
– hasPopulation
• Categories and infoboxes are exploited to deliver facts (instances of relations).
• Manually defined patterns map categories and infobox attributes to fact templates
– infobox attribute born=Montreal, thus wasBornIn(LeonardCohen, Montreal)
• Pattern-based extractions resulted in 2 million extracted entities and 20 million facts
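As a rough sketch of how an attribute pattern turns an infobox into facts: the pattern table below is hypothetical (only the born example comes from the slide), and the infobox is assumed to be already parsed into a dictionary.

```python
# Hypothetical pattern table: infobox attribute -> relation name.
ATTRIBUTE_PATTERNS = {
    "born": "wasBornIn",
    "population": "hasPopulation",  # assumed mapping, for illustration only
}

def extract_facts(entity, infobox):
    """Return (relation, subject, object) facts for attributes with a known pattern."""
    facts = []
    for attribute, value in infobox.items():
        relation = ATTRIBUTE_PATTERNS.get(attribute.strip().lower())
        if relation is not None:
            facts.append((relation, entity, value.strip()))
    return facts

print(extract_facts("LeonardCohen", {"born": "Montreal", "genre": "Folk"}))
```

Attributes without a pattern (genre in this example) are simply skipped, which is why the set of manually defined patterns bounds what this kind of extraction can find.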
![Page 99: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/99.jpg)
YAGO
• Based on declarative rules (stored in text files)
• The rules take the form of subject-predicate-object triples, so that they are basically additional facts
• There are different types of rules
![Page 100: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/100.jpg)
YAGO
• Factual rules: definition of all relations, their domains and ranges, and the definition of the classes that make up the YAGO hierarchy of literal types.
• Implication rules: express that if certain facts appear in the knowledge base, then another fact shall be added. Horn clause rules.
• Replacement rules: for interpreting micro-formats, cleaning up HTML tags, and normalizing numbers.
• Extraction rules: apply primarily to patterns found in the Wikipedia infoboxes, but also to Wikipedia categories, article titles, and even other regular elements in the source such as headings, links, or references.
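To make the implication rules concrete, here is a small forward-chaining sketch over subject-predicate-object triples. The rule syntax (variables written as "?x"), the isMarriedTo symmetry rule and the entity names are assumptions for illustration; they are not YAGO's actual rule format.

```python
def match(pattern, fact, binding):
    """Extend `binding` so that `pattern` equals `fact`; return None on failure."""
    binding = dict(binding)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            if binding.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return binding

def apply_rules(facts, rules):
    """Apply Horn rules (premises, conclusion) until no new fact is derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            bindings = [{}]
            for premise in premises:
                bindings = [b2 for b in bindings for fact in facts
                            if (b2 := match(premise, fact, b)) is not None]
            for b in bindings:
                new_fact = tuple(b.get(term, term) for term in conclusion)
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts

# Hypothetical rule: isMarriedTo is symmetric.
rules = [((("?x", "isMarriedTo", "?y"),), ("?y", "isMarriedTo", "?x"))]
print(apply_rules({("A", "isMarriedTo", "B")}, rules))
```

Because rules and facts share the triple form, a rule set like this can be stored next to the facts themselves, which matches the idea that the rules are basically additional facts.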
![Page 101: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/101.jpg)
YAGO
• Automatically verifies consistency
– Check uniqueness of functional arguments
• spouse(x,y) ∧ diff(y,z) ⇒ ¬spouse(x,z)
– Check domains and ranges of relations
• spouse(x,y) ⇒ female(x)
• spouse(x,y) ⇒ male(y)
• spouse(x,y) ⇒ (f(x)∧m(y)) ∨ (m(x)∧f(y))
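The uniqueness check for functional arguments can be sketched as follows, assuming facts are stored as (relation, subject, object) tuples and that the set of functional relations is given; both are assumptions of this example.

```python
from collections import defaultdict

def functional_violations(facts, functional_relations):
    """Map (relation, subject) to its objects wherever a functional
    relation has more than one distinct object."""
    seen = defaultdict(set)
    for relation, subject, obj in facts:
        if relation in functional_relations:
            seen[(relation, subject)].add(obj)
    return {key: objs for key, objs in seen.items() if len(objs) > 1}

facts = [("spouse", "x", "y"), ("spouse", "x", "z"), ("spouse", "a", "b")]
print(functional_violations(facts, {"spouse"}))
```

The sketch only reports conflicts; deciding which of the conflicting facts to keep is a separate step.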
![Page 102: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/102.jpg)
YAGO
• Automatically verifies consistency
– Hard Constraint
• hasAdvisor(x,y) ∧ graduatedInYear(x,t) ∧ graduatedInYear(y,s) ⇒ s < t
– Soft Constraint
• firstPaper(x,p) ∧ firstPaper(y,q) ∧ author(p,x) ∧ author(p,y) ∧ inYear(p) > inYear(q) + 5years ⇒ hasAdvisor(x,y) [0.6]
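The hard constraint above can be checked directly once the facts are available. The sketch below assumes hasAdvisor is a list of (student, advisor) pairs and graduatedInYear a dictionary from person to year; the names and years are made up.

```python
def advisor_violations(has_advisor, graduated_in_year):
    """Return (student, advisor) pairs that break the hard constraint s < t,
    where t is the student's and s the advisor's graduation year."""
    violations = []
    for student, advisor in has_advisor:
        t = graduated_in_year.get(student)
        s = graduated_in_year.get(advisor)
        if t is not None and s is not None and not s < t:
            violations.append((student, advisor))
    return violations

years = {"ann": 2005, "bob": 1990, "cat": 2000, "dan": 2003}
print(advisor_violations([("ann", "bob"), ("cat", "dan")], years))
```

A soft constraint would not reject a fact outright; its weight (0.6 in the slide) would instead contribute to a score, which this sketch does not model.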
![Page 103: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/103.jpg)
YAGO
• Ontology Representation
– Entities and Relations of public interest
– Format: TSV, RDF, XML, N3, Web Interface
– Learns
• Instances and patterns from Wikipedia;
• Taxonomy from WordNet;
• Geotag information from GeoNames.
![Page 104: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/104.jpg)
YAGO
• Named Entity Resolution/Extraction [Theobald & Weikum, 2012]
– Based on rules and patterns extracted from Wikipedia
– Disambiguation is a relevant issue
– Semi-structured data: The “Low-Hanging Fruit”
• Wikipedia infoboxes & categories
• HTML lists & tables, etc.
![Page 105: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/105.jpg)
It’s about the disappearance forty years ago of Harriet Vanger, a young scion of one of the wealthiest families in Sweden, and about her uncle, determined to know the truth about what he believes was her murder. Blomkvist visits Henrik Vanger at his estate on the tiny island of Hedeby. The old man draws Blomkvist in by promising solid evidence against Wennerström. Blomkvist agrees to spend a year writing the Vanger family history as a cover for the real assignment: the disappearance of Vanger's niece Harriet some 40 years earlier. Hedeby is home to several generations of Vangers, all part owners in Vanger Enterprises. Blomkvist becomes acquainted with the members of the extended Vanger family, most of whom resent his presence. He does, however, start a short lived affair with Cecilia, the niece of Henrik. After discovering that Salander has hacked into his computer, he persuades her to assist him with research. They eventually become lovers, but Blomkvist has trouble getting close to Lisbeth who treats virtually everyone she meets with hostility. Ultimately the two discover that Harriet's brother Martin, CEO of Vanger Industries, is secretly a serial killer. A 24-year-old computer hacker sporting an assortment of tattoos and body piercings supports herself by doing deep background investigations for Dragan Armansky, who, in turn, worries that Lisbeth Salander is “the perfect victim for anyone who wished her ill."
Machine Reading
This slide was adapted from [Hady et al., 2011]
![Page 106: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/106.jpg)
Machine Reading
This slide was adapted from [Hady et al., 2011]
![Page 107: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/107.jpg)
YAGO
• Relation Extraction [Theobald & Weikum, 2012]
– Based on rules and patterns extracted from Wikipedia
– Semi-structured data: The “Low-Hanging Fruit”
• Wikipedia infoboxes & categories
• HTML lists & tables, etc.
![Page 108: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/108.jpg)
Machine Reading
This slide was adapted from [Hady et al., 2011]
![Page 109: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/109.jpg)
Machine Reading
same
This slide was adapted from [Hady et al., 2011]
![Page 110: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/110.jpg)
Machine Reading
Passage annotations: “same” (6 links)
This slide was adapted from [Hady et al., 2011]
![Page 111: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/111.jpg)
Machine Reading
Passage annotations: “same” (6 links); relations: uncleOf, owns, hires, headOf
This slide was adapted from [Hady et al., 2011]
![Page 112: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/112.jpg)
YAGO
• YAGO2: Exploring and Querying World Knowledge in Time, Space, Context, and Many Languages
– New relations specifically designed to cover time, space and context
– Wikipedia translated pages as sources for other languages
![Page 113: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/113.jpg)
YAGO
• More on YAGO:
– Very nice tutorials:
• "Semantic Knowledge Bases from Web Sources" at IJCAI 2011, Barcelona, July 2011
• "Harvesting Knowledge from Web Data and Text" at CIKM 2010, Toronto, October 2010
• "From Information to Knowledge: Harvesting Entities and Relationships from Web Sources" at PODS 2010, Indianapolis, June 2010
– Project Website:
• http://www.mpi-inf.mpg.de/yago-naga/
![Page 114: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/114.jpg)
YAGO
• More on YAGO (http://www.mpi-inf.mpg.de/yago-naga/)
![Page 115: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/115.jpg)
YAGO
• More on YAGO (http://www.mpi-inf.mpg.de/yago-naga/)
![Page 116: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/116.jpg)
Outline
• Machine Learning • Machine Reading
• Reading the Web – YAGO – KnowItAll – NELL
![Page 117: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/117.jpg)
Outline
• Machine Learning • Machine Reading
• Reading the Web – YAGO – KnowItAll – NELL
![Page 118: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/118.jpg)
KnowItAll
![Page 119: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/119.jpg)
KnowItAll: Open Information Extraction
![Page 120: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/120.jpg)
KnowItAll: Open Information Extraction
![Page 121: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/121.jpg)
KnowItAll
• Motivation: New Paradigm for Search [Etzioni, 2008]
– The future of Web Search
– Read the Web instead of retrieving Web pages to perform Web Search
![Page 122: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/122.jpg)
KnowItAll
• Information Extraction (IE) + tractable inference
– IE(sentence) = who did what?
• speaker(P. Smith, ECMLPKDD2012)
– Inference = uncover implicit information
• Will Pittsburgh Steelers be champions again?
• Open Information Extraction [Banko et al., 2007]
![Page 123: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/123.jpg)
Open Information Extraction [Banko et al., 2007]
• Open IE systems avoid specific nouns and verbs
• Extractors are unlexicalized, formulated only in terms of:
– syntactic tokens (e.g., part-of-speech tags)
– closed-word classes (e.g., of, in, such as).
• Open IE extractors focus on generic ways in which relationships are expressed in English
– naturally generalizing across domains.
![Page 124: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/124.jpg)
Open Information Extraction
• Open IE systems are traditionally based on three steps [Etzioni et al., 2011]:
– 1. Label: Sentences are automatically labeled with extractions using heuristics or distant supervision.
– 2. Learn: A relation phrase extractor is learned using a sequence-labeling graphical model (e.g., CRF).
– 3. Extract: Given a sentence as input, the system identifies a candidate pair of NP arguments (Arg1, Arg2) from the sentence, and then uses the learned extractor to label each word between the two arguments as part of the relation phrase or not.
![Page 125: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/125.jpg)
Open Information Extraction
• TextRunner [Banko et al., 2007] was the first OIE system;
• OIE became the main focus of the KnowItAll project;
• Two main problems:
– incoherent extractions;
– uninformative relations
![Page 126: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/126.jpg)
Open Information Extraction
• incoherent extractions
![Page 127: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/127.jpg)
Open Information Extraction
• uninformative relations
![Page 128: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/128.jpg)
Open Information Extraction
• TextRunner was based on
![Page 129: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/129.jpg)
OIE: the second generation
• New syntactic constraint based on POS tag patterns
– simple verb phrase (e.g., invented)
– verb phrase followed immediately by a preposition or particle (e.g., located in)
– verb phrase followed by a simple noun phrase and ending in a preposition or particle (e.g., has atomic weight of)
• if there are multiple possible matches, the longest possible match is chosen.
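One way to picture the syntactic constraint is as a regular expression over coarse POS classes. The sketch below is a simplification under stated assumptions: the input is already tagged with Penn Treebank tags (supplied by hand here), and the mapping of tags to the classes V, W and P is an approximation chosen for the example rather than the exact pattern used by the KnowItAll systems.

```python
import re

def tag_class(tag):
    """Collapse a Penn Treebank tag into one letter:
    V (verb), P (preposition/particle), W (word inside a simple NP), O (other)."""
    if tag.startswith("VB"):
        return "V"
    if tag in {"IN", "RP", "TO"}:
        return "P"
    if tag.startswith(("NN", "JJ", "RB", "PRP")) or tag == "DT":
        return "W"
    return "O"

# V | V P | V W* P ; greedy quantifiers make the regex prefer the longest match.
RELATION_PATTERN = re.compile(r"V+(?:W*P)?")

def relation_phrases(tagged):
    """Return candidate relation phrases from a list of (word, tag) pairs."""
    classes = "".join(tag_class(tag) for _, tag in tagged)
    return [" ".join(word for word, _ in tagged[m.start():m.end()])
            for m in RELATION_PATTERN.finditer(classes)]

sentence = [("Hydrogen", "NNP"), ("has", "VBZ"), ("atomic", "JJ"),
            ("weight", "NN"), ("of", "IN"), ("1.008", "CD")]
print(relation_phrases(sentence))
```

Encoding each token as one character keeps match offsets aligned with token positions, so the matched span can be mapped straight back to words.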
![Page 130: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/130.jpg)
OIE: the second generation
• New lexical constraint to separate valid relation phrases from over-specified relation phrases
• The lexical constraint is based on the intuition that a valid relation phrase should take many distinct arguments in a large corpus.
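The intuition behind the lexical constraint can be sketched as a simple count of distinct argument pairs per relation phrase. The threshold parameter `min_distinct_args` and the toy extractions are assumptions for illustration.

```python
from collections import defaultdict

def frequent_relation_phrases(extractions, min_distinct_args):
    """Keep relation phrases that occur with at least `min_distinct_args`
    distinct (arg1, arg2) pairs in the given (arg1, relation, arg2) triples."""
    args_by_relation = defaultdict(set)
    for arg1, relation, arg2 in extractions:
        args_by_relation[relation].add((arg1, arg2))
    return {relation for relation, args in args_by_relation.items()
            if len(args) >= min_distinct_args}

extractions = [
    ("Paris", "is located in", "France"),
    ("Lima", "is located in", "Peru"),
    ("Paris", "is located in", "France"),  # repeated pair counts once
    ("X", "is offering only modest reduction targets at", "Y"),
]
print(frequent_relation_phrases(extractions, 2))
```

An over-specified phrase tends to appear with a single argument pair, so it falls below the threshold no matter how often that one sentence is repeated.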
![Page 131: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/131.jpg)
OIE: the second generation
• New OIE System: ReVerb [Fader et al., 2011]
– Input: a POS-tagged and NP-chunked sentence
– Output: a set of (x,r,y) extraction triples
– Based on two extraction algorithms:
• 1. Relation Extraction: based on the new constraints
• 2. Argument Extraction: For each relation phrase r identified in Step 1, find the nearest noun phrase x to the left and the nearest noun phrase y to the right of r in the sentence s.
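The argument extraction step can be sketched as a nearest-noun-phrase search on each side of the relation phrase. The chunk representation, a list of (text, label) pairs with the labels 'NP', 'REL' and 'O', is an assumption of this example and not ReVerb's internal format.

```python
def extract_arguments(chunks, relation_index):
    """Return (x, r, y) using the nearest NP to the left and to the right of
    the relation phrase at `relation_index`, or None if either is missing."""
    relation = chunks[relation_index][0]
    left = next((text for text, label in reversed(chunks[:relation_index])
                 if label == "NP"), None)
    right = next((text for text, label in chunks[relation_index + 1:]
                  if label == "NP"), None)
    if left is None or right is None:
        return None
    return (left, relation, right)

chunks = [("Edison", "NP"), ("invented", "REL"), ("the phonograph", "NP"),
          ("in", "O"), ("1877", "NP")]
print(extract_arguments(chunks, 1))
```

Taking only the nearest noun phrases keeps the step cheap, but it also means that an argument separated from the relation by another noun phrase is missed.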
![Page 132: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/132.jpg)
OIE: the second generation
• New OIE System: ReVerb [Fader et al., 2011]
![Page 133: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/133.jpg)
OIE: the second generation
![Page 134: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/134.jpg)
OIE: the second generation
Table extracted from [Etzioni et al., 2011]
![Page 135: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/135.jpg)
OIE: the second generation
• New OIE System: ArgLearner [Etzioni et al., 2011]
![Page 136: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/136.jpg)
OIE: the second generation
• New OIE System:
• ReVerb + ArgLearner = R2A2 [Etzioni et al., 2011]
![Page 137: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/137.jpg)
OIE: the second generation
• New OIE System:
• ReVerb + ArgLearner = R2A2 [Etzioni et al., 2011]
Free text
Hearst patterns; clustering by verbal phrases
Natural-language processing
Advanced patterns & iterative bootstrapping (“Dual Iterative Pattern Relation Extraction”)
POS tagging and NP chunking:
![Page 138: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/138.jpg)
It’s about the disappearance forty years ago of Harriet Vanger, a young scion of one of the wealthiest families in Sweden, and about her uncle, determined to know the truth about what he believes was her murder. Blomkvist visits Henrik Vanger at his estate on the tiny island of Hedeby. The old man draws Blomkvist in by promising solid evidence against Wennerström. Blomkvist agrees to spend a year writing the Vanger family history as a cover for the real assignment: the disappearance of Vanger's niece Harriet some 40 years earlier. Hedeby is home to several generations of Vangers, all part owners in Vanger Enterprises. Blomkvist becomes acquainted with the members of the extended Vanger family, most of whom resent his presence. He does, however, start a short lived affair with Cecilia, the niece of Henrik. After discovering that Salander has hacked into his computer, he persuades her to assist him with research. They eventually become lovers, but Blomkvist has trouble getting close to Lisbeth who treats virtually everyone she meets with hostility. Ultimately the two discover that Harriet's brother Martin, CEO of Vanger Industries, is secretly a serial killer. A 24-year-old computer hacker sporting an assortment of tattoos and body piercings supports herself by doing deep background investigations for Dragan Armansky, who, in turn, worries that Lisbeth Salander is “the perfect victim for anyone who wished her ill."
Machine Reading with OIE
This slide was adapted from [Hady et al., 2011]
![Page 139: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/139.jpg)
Machine Reading with OIE
Passage annotations: “same” (6 links)
This slide was adapted from [Hady et al., 2011]
![Page 140: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/140.jpg)
Machine Reading with OIE
Passage annotations: “same” (6 links)
This slide was adapted from [Hady et al., 2011]
![Page 141: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/141.jpg)
Machine Reading with OIE
Passage annotations: “same” (6 links); relations: uncleOf, owns, hires, headOf
This slide was adapted from [Hady et al., 2011]
![Page 142: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/142.jpg)
Machine Reading with OIE
Passage annotations: “same” (6 links); relations: uncleOf, owns, hires, headOf, affairWith, affairWith, enemyOf
This slide was adapted from [Hady et al., 2011]
![Page 143: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/143.jpg)
More on KnowItAll
WWW2013 Machine Reading the Web Estevam R. Hruschka Jr.
• http://homes.cs.washington.edu/~etzioni/index.html
![Page 144: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/144.jpg)
Outline
• Machine Learning • Machine Reading
• Reading the Web – YAGO – KnowItAll – NELL
![Page 145: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/145.jpg)
![Page 146: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/146.jpg)
Never-Ending Language Learning
![Page 147: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/147.jpg)
![Page 148: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/148.jpg)
Never-Ending Learning
• Main task: acquire a growing competence without asymptote
  – over years
  – multiple functions
  – where learning one thing improves ability to learn the next
  – acquiring data from humans, environment
• Many candidate domains:
  – Robots
  – Softbots
  – Game players
![Page 149: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/149.jpg)
NELL: Never-Ending Language Learner
Inputs:
• initial ontology
• handful of examples of each predicate in ontology
• the web
• occasional interaction with human trainers
The task:
• run 24x7, forever
• each day:
  1. extract more facts from the web to populate the initial ontology
  2. learn to read (perform #1) better than yesterday
![Page 150: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/150.jpg)
NELL: Never-Ending Language Learner
Goal:
• run 24x7, forever
• each day:
  1. extract more facts from the web to populate the given ontology
  2. learn to read better than yesterday
Today... Running 24 x 7, since January, 2010
Input:
• ontology defining ~800 categories and relations
• 10-20 seed examples of each
• 1 billion web pages (ClueWeb – Jamie Callan)
Result:
• continuously growing KB with +1,400,000 extracted beliefs
![Page 151: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/151.jpg)
http://rtw.ml.cmu.edu
![Page 152: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/152.jpg)
NELL: Never-Ending Language Learner
![Page 153: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/153.jpg)
The Problem with Semi-Supervised Bootstrap Learning
Paris Pittsburgh Seattle Cupertino
![Page 154: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/154.jpg)
The Problem with Semi-Supervised Bootstrap Learning
Paris Pittsburgh Seattle Cupertino
mayor of arg1 live in arg1
![Page 155: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/155.jpg)
The Problem with Semi-Supervised Bootstrap Learning
Paris Pittsburgh Seattle Cupertino
mayor of arg1 live in arg1
San Francisco Austin denial
![Page 156: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/156.jpg)
The Problem with Semi-Supervised Bootstrap Learning
Paris Pittsburgh Seattle Cupertino
mayor of arg1 live in arg1
San Francisco Austin denial
arg1 is home of traits such as arg1
![Page 157: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/157.jpg)
The Problem with Semi-Supervised Bootstrap Learning
Paris Pittsburgh Seattle Cupertino
mayor of arg1 live in arg1
…
San Francisco Austin denial
arg1 is home of traits such as arg1
it’s underconstrained!!
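The drift shown on these slides can be reproduced with a toy sketch. The corpus below is invented for illustration (only the seed cities, the four patterns, and the word "denial" come from the slides), and the acceptance rule is deliberately naive: anything that co-occurs with something already believed is accepted.

```python
# Hypothetical toy corpus of (context pattern, noun phrase) co-occurrences.
CORPUS = [
    ("mayor of arg1", "Paris"),
    ("mayor of arg1", "Seattle"),
    ("mayor of arg1", "San Francisco"),
    ("live in arg1", "Pittsburgh"),
    ("live in arg1", "Austin"),
    ("live in arg1", "denial"),
    ("arg1 is home of", "San Francisco"),
    ("arg1 is home of", "Austin"),
    ("traits such as arg1", "denial"),
    ("traits such as arg1", "selfishness"),
]


def bootstrap(seeds, corpus, iterations):
    """Alternate between learning patterns from instances and instances from patterns."""
    instances = set(seeds)
    patterns = set()
    for _ in range(iterations):
        # Any pattern seen with a believed instance is accepted (no constraint).
        patterns |= {ctx for ctx, np in corpus if np in instances}
        # Any noun phrase seen with an accepted pattern is accepted too.
        instances |= {np for ctx, np in corpus if ctx in patterns}
    return instances, patterns


seeds = {"Paris", "Pittsburgh", "Seattle", "Cupertino"}
cities, city_patterns = bootstrap(seeds, CORPUS, iterations=2)
```

After one iteration "denial" is already believed to be a city (people "live in denial"); after two, the pattern "traits such as arg1" has been learned and pulls in further non-cities. Nothing in a single-predicate learner penalizes this, which is the sense in which the problem is underconstrained.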
![Page 158: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/158.jpg)
Key Idea 1: Coupled semi-supervised training of many functions
![Page 159: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/159.jpg)
Coupled Training Type 1: Co-training, Multiview, Co-regularization
![Page 160: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/160.jpg)
Coupled Training Type 1: Co-training, Multiview, Co-regularization
![Page 161: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/161.jpg)
Coupled Training Type 1: Co-training, Multiview, Co-regularization
![Page 162: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/162.jpg)
Type 1 Coupling Constraints in NELL
![Page 163: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/163.jpg)
Type 1 Coupling Constraints in NELL
Semi-structured data The “Low-Hanging Fruit”
![Page 164: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/164.jpg)
Type 1 Coupling Constraints in NELL
Semi-structured data The “Low-Hanging Fruit”
Free text
• Hearst patterns; clustering by verbal phrases
• Natural-language processing
• Advanced patterns & iterative bootstrapping (“Dual Iterative Pattern Relation Extraction”)
POS tagging and NP chunking:
Coupled Training Type 2: Structured Outputs, Multitask, Posterior Regularization, Multilabel
Learn functions with the same input, different outputs, where we know some constraint
![Page 166: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/166.jpg)
Coupled Training Type 2: Structured Outputs, Multitask, Posterior Regularization, Multilabel
Learn functions with the same input, different outputs, where we know some constraint
![Page 167: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/167.jpg)
Coupled Training Type 2: Structured Outputs, Multitask, Posterior Regularization, Multilabel
Learn functions with the same input, different outputs, where we know some constraint
![Page 168: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/168.jpg)
Type 2 Coupling Constraints in NELL
![Page 169: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/169.jpg)
Multi-view, Multi-Task Coupling
C categories, V views, CV ≈ 250*3 = 750 coupled functions
pairwise constraints on functions ≈ 10^5
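The slide does not say how the ≈ 10^5 figure is obtained. One plausible reading, offered here only as an assumption, is that it is the order of magnitude of the number of distinct pairs among the 750 coupled functions:

```python
from math import comb

categories = 250  # C, from the slide
views = 3         # V, from the slide

functions = categories * views   # number of coupled functions
max_pairs = comb(functions, 2)   # upper bound on pairwise constraints
```

Under that reading the upper bound is a few times 10^5, which matches the slide's estimate in order of magnitude.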
![Page 170: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/170.jpg)
Learning Relations between NP’s
![Page 171: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/171.jpg)
Learning Relations between NP’s
![Page 172: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/172.jpg)
Type 3 Coupling: Argument Types
![Page 173: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/173.jpg)
Pure EM Approach to Coupled Training
E: jointly estimate latent labels for each function of each unlabeled example
M: retrain all functions, based on these probabilistic labels
Scaling problem:
• E step: 20M NP’s, 10^14 NP pairs to label
• M step: 50M text contexts to consider for each function → 10^10 parameters to retrain
• even more URL-HTML contexts...
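A quick back-of-the-envelope check of these magnitudes. The counting conventions are assumptions, not stated on the slide: pairs are taken as ordered pairs of noun phrases, and the model is taken to have one parameter per (text context, function) combination, with either 250 category functions or all 750 coupled functions.

```python
def pair_count(noun_phrases):
    """Number of ordered (NP, NP) pairs a relation could in principle label."""
    return noun_phrases ** 2


def parameter_count(contexts, functions):
    """One weight per (text context, function) combination."""
    return contexts * functions


np_pairs = pair_count(20_000_000)               # 4 x 10^14
params_low = parameter_count(50_000_000, 250)   # 1.25 x 10^10
params_high = parameter_count(50_000_000, 750)  # 3.75 x 10^10
```

Both results agree with the slide's 10^14 and 10^10 in order of magnitude, which is why an exact EM over all latent variables is impractical here.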
![Page 174: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/174.jpg)
NELL’s Approximation to EM
E’ step:
• Consider only a growing subset of the latent variable assignments
  – category variables: up to 250 NP’s per category per iteration
  – relation variables: add only if confident and args of correct type
  – this set of explicit latent assignments *IS* the knowledge base
M’ step:
• Each view-based learner retrains itself from the updated KB
• “context” methods create growing subsets of contexts
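The E’ step can be sketched as a bounded promotion of candidates into the KB. This is a minimal illustration, not NELL's implementation: the dictionary layout, the function name `e_prime_step`, and the 0.9 confidence threshold are all hypothetical; only the 250-per-category limit and the "confident and correctly typed" condition come from the slide.

```python
def e_prime_step(kb, category_candidates, relation_candidates, relation_types,
                 per_category_limit=250, min_confidence=0.9):
    """Promote a bounded set of candidates into the KB (hypothetical data layout)."""
    # Category variables: keep only the best-scoring new NPs for each category.
    for category, scored in category_candidates.items():
        known = kb["categories"].setdefault(category, set())
        fresh = [(np, s) for np, s in scored.items() if np not in known]
        fresh.sort(key=lambda item: item[1], reverse=True)
        known.update(np for np, _ in fresh[:per_category_limit])
    # Relation variables: require confidence and correctly typed arguments.
    for (relation, arg1, arg2), score in relation_candidates.items():
        type1, type2 = relation_types[relation]
        typed = (arg1 in kb["categories"].get(type1, set())
                 and arg2 in kb["categories"].get(type2, set()))
        if score >= min_confidence and typed:
            kb["relations"].add((relation, arg1, arg2))
    return kb
```

The M’ step would then retrain each learner on the enlarged `kb`, so the KB itself plays the role of the explicit latent assignments.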
![Page 175: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/175.jpg)
![Page 176: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/176.jpg)
Key Idea 2: Discover New Coupling Constraints
• first-order, probabilistic Horn clause constraints
0.93 athletePlaysSport(?x,?y) :- athletePlaysForTeam(?x,?z), teamPlaysSport(?z,?y)
– connects previously uncoupled relation predicates
– infers new beliefs for KB
![Page 177: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/177.jpg)
Example Learned Horn Clauses
0.95 athletePlaysSport(?x,basketball) :- athleteInLeague(?x,NBA)
0.93 athletePlaysSport(?x,?y) :- athletePlaysForTeam(?x,?z), teamPlaysSport(?z,?y)
0.91 teamPlaysInLeague(?x,NHL) :- teamWonTrophy(?x,Stanley_Cup)
0.90 athleteInLeague(?x,?y) :- athletePlaysForTeam(?x,?z), teamPlaysInLeague(?z,?y)
0.88 cityInState(?x,?y) :- cityCapitalOfState(?x,?y), cityInCountry(?y,USA)
0.62* newspaperInCity(?x,New_York) :- companyEconomicSector(?x,media), generalizations(?x,blog)
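To make "infers new beliefs for KB" concrete, here is a minimal sketch that applies one two-literal chain rule, such as the 0.93 athletePlaysSport rule, to a small KB of triples. The function name, the KB contents, and the choice to copy the rule weight directly onto each inferred belief are illustrative assumptions; how NELL actually combines rule weights with other evidence is not shown on the slides.

```python
def apply_chain_rule(kb, head, body1, body2, weight):
    """Infer head(x, y) with the rule's weight whenever body1(x, z) and body2(z, y) hold."""
    inferred = {}
    for rel1, x, z in kb:
        if rel1 != body1:
            continue
        for rel2, z2, y in kb:
            # Join on the shared variable ?z and skip beliefs already in the KB.
            if rel2 == body2 and z2 == z and (head, x, y) not in kb:
                inferred[(head, x, y)] = weight
    return inferred


# Hypothetical KB of (relation, arg1, arg2) triples.
KB = {
    ("athletePlaysForTeam", "Kobe Bryant", "Lakers"),
    ("teamPlaysSport", "Lakers", "basketball"),
    ("athletePlaysForTeam", "Sidney Crosby", "Penguins"),
}

new_beliefs = apply_chain_rule(
    KB, "athletePlaysSport", "athletePlaysForTeam", "teamPlaysSport", 0.93
)
```

Only the first athlete gets a new belief, because the KB holds no teamPlaysSport fact for the second team; the rule couples two relation predicates that were previously learned independently.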
![Page 178: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/178.jpg)
Learned Probabilistic Horn Clause Rules
![Page 179: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/179.jpg)
Learned Probabilistic Horn Clause Rules
![Page 180: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/180.jpg)
![Page 181: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/181.jpg)
Ontology Extension (1)
![Page 182: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/182.jpg)
OntExt (Ontology Extension)
Everything
Person Company City Sport
WorksFor PlayedIn
![Page 183: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/183.jpg)
OntExt (Ontology Extension)
Everything
Person Company City Sport
WorksFor PlayedIn Plays
![Page 184: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/184.jpg)
OntExt (Ontology Extension)
Everything
Person Company City Sport
WorksFor PlayedIn
LocatedIn
Plays
![Page 185: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/185.jpg)
Ontology Extension (1)
[Mohamed & Hruschka, 2011]
Goal:
• Discover frequently stated relations among ontology categories
Approach:
• For each pair of categories C1, C2,
• co-cluster pairs of known instances, and text contexts that connect them
* additional experiments with Etzioni & Soderland using TextRunner
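The approach can be illustrated with a much simpler stand-in for co-clustering: for one pair of categories, collect the text contexts that connect known instances and keep those supported by several distinct instance pairs. This frequency filter is not the co-clustering used in OntExt; the function name, the triple format, and the `min_pairs` threshold are hypothetical.

```python
from collections import defaultdict


def candidate_relations(sentences, instances_c1, instances_c2, min_pairs=2):
    """Group connecting contexts by the set of instance pairs they link."""
    support = defaultdict(set)
    for arg1, context, arg2 in sentences:
        # Only contexts that connect a known C1 instance to a known C2 instance count.
        if arg1 in instances_c1 and arg2 in instances_c2:
            support[context].add((arg1, arg2))
    return {ctx: pairs for ctx, pairs in support.items() if len(pairs) >= min_pairs}
```

For the Person and Sport categories, a context such as "plays" that links many known (person, sport) pairs would surface as a candidate for a new relation like Plays.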
![Page 186: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/186.jpg)
![Page 187: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/187.jpg)
Prophet
• Mining the graph representing NELL’s KB to:
  1. Extend the KB by predicting new relations (edges) that might exist between pairs of nodes;
  2. Induce inference rules;
  3. Identify misplaced edges, which can be used by NELL as hints to identify wrong connections between nodes (wrong facts).
Appel & Hruschka, 2012
![Page 188: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/188.jpg)
Prophet
• Find open triangles in the Graph
Appel & Hruschka
![Page 189: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/189.jpg)
Prophet
sport sportsLeague
sportsTeam
Appel & Hruschka
![Page 190: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/190.jpg)
Prophet
• If > ξ then create the new relation
• ξ = 10 (empirically)
sport sportsLeague
sportsTeam
Appel & Hruschka
![Page 191: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/191.jpg)
Prophet
• If > ξ then create the new relation
• ξ = 10 (empirically)
• Name the new relation based on ReVerb
sport sportsLeague
sportsTeam
isPlayedIn
Appel & Hruschka
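A rough sketch of the open-triangle idea follows. The quantity compared with ξ did not survive extraction, so this sketch assumes it is the number of open triangles (pairs of nodes that share a neighbor but are not directly connected) observed for a given pair of categories. The function name, the edge format, and that counting rule are assumptions, not the published Prophet algorithm.

```python
from collections import Counter
from itertools import combinations


def propose_relations(edges, category_of, xi=10):
    """Return category pairs whose open-triangle count exceeds the threshold xi."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # An open triangle: a and c share a neighbor (hub) but have no edge between them.
    open_pairs = set()
    for hub, adjacent in neighbors.items():
        for a, c in combinations(sorted(adjacent), 2):
            if c not in neighbors[a]:
                open_pairs.add((a, c))
    counts = Counter(
        tuple(sorted((category_of[a], category_of[c]))) for a, c in open_pairs
    )
    return {pair for pair, n in counts.items() if n > xi}
```

With sport and sportsLeague instances that are each linked to a common sportsTeam but not to each other, enough open triangles would lead to proposing a new sport/sportsLeague relation, which the slides then name isPlayedIn using ReVerb.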
![Page 192: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/192.jpg)
Conversing Learning Pedro & Hruschka
![Page 193: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/193.jpg)
Conversing Learning
• Help to supervise NELL by automatically asking questions on Web communities
Pedro & Hruschka
![Page 194: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/194.jpg)
Conversing Learning
• Help to supervise NELL by automatically asking questions on Web communities
• Currently: validate first-order rules coming from the Rule Learner
Pedro & Hruschka
![Page 195: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/195.jpg)
![Page 196: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/196.jpg)
![Page 197: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/197.jpg)
Conversing Learning
• Uses an agent (SS-Crowd) capable of:
  – building questions;
  – posting questions in Web communities;
  – fetching answers;
  – understanding the answers;
  – deciding on the truth of the first-order rule
Pedro & Hruschka
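Two of these steps, building a question and deciding on the rule, can be sketched without any network access. Everything here is hypothetical: the question wording, the keyword-based reading of answers, and the `min_votes` threshold are illustrative choices, not the behavior of SS-Crowd. Posting and fetching are omitted because they depend on the specific Web community.

```python
def rule_to_question(head, body):
    """Render a rule as a yes/no question (wording is hypothetical)."""
    return f"Is it true that if {' and '.join(body)} then {head}?"


def decide(answers, min_votes=3):
    """Return True/False for the rule, or None when the answers are inconclusive."""
    # Naive "understanding": an answer counts only if it starts with yes or no.
    yes = sum(1 for a in answers if a.strip().lower().startswith("yes"))
    no = sum(1 for a in answers if a.strip().lower().startswith("no"))
    if yes + no < min_votes or yes == no:
        return None
    return yes > no
```

The prefix test is crude (an answer beginning "not sure" would count as a "no"), so a real agent would need a more careful reading of free-text replies before accepting or rejecting a rule.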
![Page 198: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/198.jpg)
Conversing Learning Pedro & Hruschka
![Page 199: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/199.jpg)
It’s about the disappearance forty years ago of Harriet Vanger, a young scion of one of the wealthiest families in Sweden, and about her uncle, determined to know the truth about what he believes was her murder. Blomkvist visits Henrik Vanger at his estate on the tiny island of Hedeby. The old man draws Blomkvist in by promising solid evidence against Wennerström. Blomkvist agrees to spend a year writing the Vanger family history as a cover for the real assignment: the disappearance of Vanger's niece Harriet some 40 years earlier. Hedeby is home to several generations of Vangers, all part owners in Vanger Enterprises. Blomkvist becomes acquainted with the members of the extended Vanger family, most of whom resent his presence. He does, however, start a short lived affair with Cecilia, the niece of Henrik. After discovering that Salander has hacked into his computer, he persuades her to assist him with research. They eventually become lovers, but Blomkvist has trouble getting close to Lisbeth who treats virtually everyone she meets with hostility. Ultimately the two discover that Harriet's brother Martin, CEO of Vanger Industries, is secretly a serial killer. A 24-year-old computer hacker sporting an assortment of tattoos and body piercings supports herself by doing deep background investigations for Dragan Armansky, who, in turn, worries that Lisbeth Salander is “the perfect victim for anyone who wished her ill."
Machine Reading with NELL
This slide was adapted from [Hady et al., 2011]
![Page 200: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/200.jpg)
Machine Reading with NELL
same
This slide was adapted from [Hady et al., 2011]
![Page 201: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/201.jpg)
Machine Reading with NELL
same same same
same same
same
This slide was adapted from [Hady et al., 2011]
![Page 202: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/202.jpg)
Machine Reading with NELL
same same same
same same
same
uncleOf
owns
hires
headOf
This slide was adapted from [Hady et al., 2011]
![Page 203: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/203.jpg)
Machine Reading with NELL
same same same
same same
same
uncleOf
owns
hires
headOf
affairWith
affairWith enemyOf
This slide was adapted from [Hady et al., 2011]
![Page 204: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/204.jpg)
More on NELL • http://rtw.ml.cmu.edu/rtw/publications
![Page 205: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/205.jpg)
[email protected] Thank you very much! And thanks to all the people from the NELL, KnowItAll and YAGO projects for the very nice discussions and suggestions for this tutorial.
![Page 206: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/206.jpg)
References
• [Fern, 2008] Xiaoli Z. Fern, CS 434: Machine Learning and Data Mining, School of Electrical Engineering and Computer Science, Oregon State University, Fall 2008.
• [DARPA, 2012] DARPA Machine Reading Program, http://www.darpa.mil/Our_Work/I2O/Programs/Machine_Reading.aspx.
• [Mitchell, 2006] Tom M. Mitchell, The Discipline of Machine Learning, my perspective on this research field, July 2006 (http://www.cs.cmu.edu/~tom/pubs/MachineLearning.pdf).
• [Mitchell, 1997] Tom M. Mitchell, Machine Learning. McGraw-Hill, 1997.
• [Etzioni et al., 2007] Oren Etzioni, Michele Banko, and Michael J. Cafarella, Machine Reading. The 2007 AAAI Spring Symposium. Published by The AAAI Press, Menlo Park, California, 2007.
• [Clark et al., 2007] Peter Clark, Phil Harrison, John Thompson, Rick Wojcik, Tom Jenkins, David Israel, Reading to Learn: An Investigation into Language Understanding. The 2007 AAAI Spring Symposium. Published by The AAAI Press, Menlo Park, California, 2007.
• [Norvig, 2007] Peter Norvig, Inference in Text Understanding. The 2007 AAAI Spring Symposium. Published by The AAAI Press, Menlo Park, California, 2007.
• [Wang & Cohen, 2007] Richard C. Wang and William W. Cohen: Language-Independent Set Expansion of Named Entities using the Web. In Proceedings of IEEE International Conference on Data Mining (ICDM 2007), Omaha, NE, USA. 2007.
• [Etzioni, 2008] Oren Etzioni. 2008. Machine reading at web scale. In Proceedings of the international conference on Web search and web data mining (WSDM '08). ACM, New York, NY, USA, 2-2.
• [Banko, et al., 2007] Michele Banko, Michael J. Cafarella, Stephen Soderland, Matthew Broadhead, Oren Etzioni: Open Information Extraction from the Web. IJCAI 2007: 2670-2676
IBERAMIA2012 Machine Learning, Machine Reading and the Web Estevam R. Hruschka Jr.
![Page 207: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/207.jpg)
References
• [Weikum et al., 2009] G. Weikum, G. Kasneci, M. Ramanath, F. Suchanek. DB & IR methods for knowledge discovery. Communications of the ACM 52(4), 2009.
• [Theobald & Weikum, 2012] Martin Theobald and Gerhard Weikum. From Information to Knowledge: Harvesting Entities and Relationships from Web Sources. Tutorial at PODS 2012
• [Hoffart et al., 2012] Johannes Hoffart, Fabian Suchanek, Klaus Berberich, Gerhard Weikum. YAGO2: A Spatially and Temporally Enhanced Knowledge Base from Wikipedia. Special issue of the Artificial Intelligence Journal, 2012
• [Etzioni et al., 2011] Oren Etzioni, Anthony Fader, Janara Christensen, Stephen Soderland, and Mausam. “Open Information Extraction: the Second Generation”. Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011).
• [Hady et al., 2011] Hady W. Lauw, Ralf Schenkel, Fabian Suchanek, Martin Theobald, and Gerhard Weikum, “Semantic Knowledge Bases from Web Sources” at IJCAI 2011, Barcelona, July 2011
• [Fader et al., 2011] Anthony Fader, Stephen Soderland, and Oren Etzioni. “Identifying Relations for Open Information Extraction”. Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (EMNLP 2011)
• Settles, B.: Closing the loop: Fast, interactive semi-supervised annotation with queries on features and instances. In: Proc. of the EMNLP’11, Edinburgh, ACL (2011) 1467–1478
• Carlson, A., Betteridge, J., Kisiel, B., Settles, B., Hruschka Jr., E.R., Mitchell, T.M.: Toward an architecture for never-ending language learning. In: Proceedings of the Twenty-Fourth Conference on Artificial Intelligence (AAAI 2010).
• Pedro, S.D.S., Hruschka Jr., E.R.: Collective intelligence as a source for machine learning self-supervision. In: Proc. of the 4th International Workshop on Web Intelligence and Communities. WIC12, NY, USA, ACM (2012) 5:1–5:9
![Page 208: Machine Reading the Web](https://reader034.vdocument.in/reader034/viewer/2022042817/55a9657c1a28ab5a108b479a/html5/thumbnails/208.jpg)
References
• [Appel & Hruschka Jr., 2011] Appel, A.P., Hruschka Jr., E.R.: Prophet – a link-predictor to learn new rules on Nell. In: Proceedings of the 2011 IEEE 11th International Conference on Data Mining Workshops. pp. 917–924. ICDMW ’11, IEEE Computer Society, Washington, DC, USA (2011)
• [Mohamed et al., 2011] Mohamed, T.P., Hruschka Jr., E.R., Mitchell, T.M.: Discovering relations between noun categories. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. pp. 1447–1455. EMNLP ’11, Association for Computational Linguistics, Stroudsburg, PA, USA (2011)
• [Pedro & Hruschka Jr., 2012] Saulo D.S. Pedro and Estevam R. Hruschka Jr., Conversing Learning: active learning and active social interaction for human supervision in never-ending learning systems. XIII Ibero-American Conference on Artificial Intelligence, IBERAMIA 2012, 2012.
• Krishnamurthy, J., Mitchell, T.M.: Which noun phrases denote which concepts. In: Proceedings of the Forty Ninth Annual Meeting of the Association for Computational Linguistics (2011)
• Lao, N., Mitchell, T., Cohen, W.W.: Random walk inference and learning in a large scale knowledge base. In: Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing. pp. 529–539. Association for Computational Linguistics, Edinburgh, Scotland, UK. (July 2011), http://www.aclweb.org/anthology/D11-1049
• E. R. Hruschka Jr., M. C. Duarte, and M. C. Nicoletti. Coupling as Strategy for Reducing Concept-Drift in Never-ending Learning Environments. Fundamenta Informaticae, IOS Press, 2012.