Learning Probabilistic Models of Link Structure
DESCRIPTION
Learning Probabilistic Models of Link Structure. Getoor, Friedman, Koller, Taskar. Example Application: WebKB. Classify web page as course, student, professor, project, none using… Words on the web page; links from other web pages (and the class of those pages, recursively).
TRANSCRIPT
![Page 1: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/1.jpg)
Oregon State University – CS 539 PRMs
Learning Probabilistic Models of Link Structure
Getoor, Friedman, Koller, Taskar
![Page 2: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/2.jpg)
Example Application: WebKB
Classify each web page as course, student, professor, project, or none, using:
- Words on the web page
- Links from other web pages (and the class of those pages, recursively)
- Words in the “anchor text” from the other page: <a href="url">anchor text</a>
Web pages obtained from Cornell, Texas, Washington, and Wisconsin
![Page 3: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/3.jpg)
Example Application: CORA
Classify documents according to topic (7 levels) using:
- words in the document
- papers cited by the document
- papers citing the document
![Page 4: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/4.jpg)
Standard PRM
parents(Doc.class) = {MODE(Doc.citers.class),MODE(Doc.cited.class)}
[Figure: Document objects, each with class and words attributes; the citers and cited links feed two MODE aggregates.]
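As a rough illustration of the parent set above, the sketch below computes the two MODE aggregates for one document. The function names, the citation list, and the class labels are hypothetical; the tie-breaking rule (first value seen) is an arbitrary assumption, since the slide does not specify one.

```python
from collections import Counter

def mode(values):
    """Return the most common value, or None if there are no values.
    Ties go to the value seen first (an arbitrary choice for this sketch)."""
    counts = Counter(values)
    return counts.most_common(1)[0][0] if counts else None

def class_parents(doc, cites, doc_class):
    """Return (MODE(doc.citers.class), MODE(doc.cited.class)).

    cites: list of (citing, cited) pairs; doc_class: dict doc -> class label.
    """
    citers = [a for a, b in cites if b == doc]   # documents that cite doc
    cited = [b for a, b in cites if a == doc]    # documents that doc cites
    return (mode(doc_class[d] for d in citers),
            mode(doc_class[d] for d in cited))

# Hypothetical toy data.
cites = [("p1", "p3"), ("p2", "p3"), ("p4", "p3"), ("p3", "p5")]
doc_class = {"p1": "Theory", "p2": "Theory", "p3": "Theory",
             "p4": "Learning", "p5": "Graphics"}
print(class_parents("p3", cites, doc_class))
```

Note that the aggregate hides the individual neighbours: p4's class has no effect on the result as long as Theory remains the most common class among the citers.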
![Page 5: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/5.jpg)
Problem: The Citation Structure is Fixed
The existence (or non-existence) of a link cannot serve as evidence
Individually-linked papers only influence the class through the MODE.
![Page 6: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/6.jpg)
Possible Solution: Link Uncertainty
Model the existence of links as random variables
Create a Link instance for each pair of possibly-linked objects
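A minimal sketch of this idea: one Boolean Exists variable per ordered pair of distinct documents, with the observed citations setting it to true. The assumption that the probability of a link depends only on the classes of the two endpoints, and the table `p_exists` itself, are illustrative and not taken from the slides.

```python
import math
from itertools import permutations

def potential_links(docs, observed):
    """Map every ordered pair of distinct documents to its Exists value."""
    return {(a, b): (a, b) in observed for a, b in permutations(docs, 2)}

def link_log_likelihood(exists, doc_class, p_exists):
    """Log-probability of the whole link structure, treating each Exists
    variable as independent given the classes of its two endpoints."""
    total = 0.0
    for (a, b), e in exists.items():
        p = p_exists[(doc_class[a], doc_class[b])]
        total += math.log(p if e else 1.0 - p)
    return total

# Hypothetical toy data and parameters.
docs = ["d1", "d2", "d3"]
doc_class = {"d1": "Theory", "d2": "Theory", "d3": "Learning"}
observed = {("d1", "d2")}
p_exists = {
    ("Theory", "Theory"): 0.3, ("Theory", "Learning"): 0.05,
    ("Learning", "Theory"): 0.05, ("Learning", "Learning"): 0.3,
}
exists = potential_links(docs, observed)
print(len(exists), link_log_likelihood(exists, doc_class, p_exists))
```

With n documents there are n(n-1) potential links, so absent links now contribute evidence too, at the cost of a much larger set of variables.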
![Page 7: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/7.jpg)
Unrolled Network
[Figure: three Document objects, each with class and words attributes, and three Cites objects, each with an Exists attribute.]
![Page 8: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/8.jpg)
Getoor’s Diagram
Entity classes (Paper)
Relation classes (Cites)
Technically, every instance has an Exists variable, which is true for all Entity instances.
![Page 9: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/9.jpg)
Semantics
P is the basic CPT
P* will be the equivalent unrolled CPT
Require that an object does not exist if any of the objects it points to do not exist
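The existence requirement can be read as a consistency check on an assignment to the Exists variables. The sketch below is one way to express it; the dictionaries `exists` and `points_to` are hypothetical stand-ins for an unrolled network.

```python
def consistent(exists, points_to):
    """True if no existing object points to a non-existing object.

    exists: dict object -> bool; points_to: dict object -> list of objects.
    """
    return all(
        (not exists[obj]) or all(exists[target] for target in targets)
        for obj, targets in points_to.items()
    )

# Hypothetical example: citation c1 points to papers p1 and p2.
points_to = {"c1": ["p1", "p2"]}
print(consistent({"p1": True, "p2": False, "c1": True}, points_to))
print(consistent({"p1": True, "p2": False, "c1": False}, points_to))
```

Under P*, any assignment for which this check fails would receive probability zero.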
![Page 10: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/10.jpg)
WebKB Network
![Page 11: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/11.jpg)
Experimental Results
Cora and WebKB
![Page 12: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/12.jpg)
WebKB with various features
![Page 13: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/13.jpg)
A Second Approach: Reference Uncertainty
Treat reference attributes as random variables
Each reference attribute takes as value an object of the indicated class
Citation
- Citing: reference attribute, value is a Paper
- Cited: reference attribute, value is a Paper
![Page 14: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/14.jpg)
Problems
How many citation objects exist? Consequently, how many reference random variables exist?
How do we represent P(Citation.cites | …)?
- Citation.cites could take on thousands of possible values
- Huge conditional probability table
- Costly inference at run time
![Page 15: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/15.jpg)
Solutions
Problem 1: How many citations?
Fix the number of Citation objects
This gives the “object skeleton”
![Page 16: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/16.jpg)
Problem 2: Too many potential values for a reference attribute
Attach to each reference attribute a set of partition attributes
The reference attribute chooses a partition
A Paper is then chosen uniformly at random from the partition
[Figure: a Citation object with Citing and Cited reference attributes; Paper objects are grouped into Theory, Graphics, and Learning partitions.]
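The two-stage choice can be sketched as follows, assuming the partition attribute is a paper's topic. The paper identifiers, the attribute name `topic`, and the partition probabilities are hypothetical.

```python
import random
from collections import defaultdict

def partition_by(papers, attribute):
    """Group paper ids by the value of a partition attribute."""
    parts = defaultdict(list)
    for paper_id, attrs in papers.items():
        parts[attrs[attribute]].append(paper_id)
    return dict(parts)

def sample_reference(parts, partition_probs, rng):
    """Choose a partition, then a paper uniformly at random within it."""
    labels = list(partition_probs)
    weights = [partition_probs[label] for label in labels]
    chosen = rng.choices(labels, weights=weights, k=1)[0]
    return chosen, rng.choice(parts[chosen])

# Hypothetical toy data.
papers = {
    "t1": {"topic": "Theory"}, "t2": {"topic": "Theory"},
    "g1": {"topic": "Graphics"},
    "l1": {"topic": "Learning"}, "l2": {"topic": "Learning"},
}
parts = partition_by(papers, "topic")
probs = {"Theory": 0.5, "Graphics": 0.1, "Learning": 0.4}
print(sample_reference(parts, probs, random.Random(0)))
```

The distribution now needs one parameter per partition rather than one per paper, which is what keeps the table small when there are thousands of candidate papers.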
![Page 17: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/17.jpg)
Representing Constraints Between Citing and Cited Papers
Parents(Cites.Cited) = {Cites.Citing.Topic}
![Page 18: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/18.jpg)
Details
Each reference attribute has a selector attribute S that chooses the partition.
[Figure: a Citation object whose selector attributes Sciting and Scited choose among the Learning, Theory, and Graphics partitions of Paper objects for the Citing and Cited reference attributes.]
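Combining the selector with the dependency Parents(Cites.Cited) = {Cites.Citing.Topic}, the probability that a particular paper is the one cited can be sketched as the selector's probability of the paper's partition, divided by the size of that partition. The CPD values and paper identifiers below are hypothetical.

```python
def cited_probability(paper, citing_topic, selector_cpd, parts):
    """P(Cited = paper | Citing.Topic = citing_topic).

    selector_cpd[topic][partition] is P(Scited = partition | Citing.Topic = topic);
    within the chosen partition the paper is picked uniformly at random.
    """
    for label, members in parts.items():
        if paper in members:
            return selector_cpd[citing_topic][label] / len(members)
    return 0.0  # paper belongs to no partition

# Hypothetical partitions and selector CPD.
parts = {"Theory": ["t1", "t2"], "Learning": ["l1", "l2", "l3", "l4"]}
selector_cpd = {
    "Theory": {"Theory": 0.8, "Learning": 0.2},
    "Learning": {"Theory": 0.1, "Learning": 0.9},
}
print(cited_probability("t1", "Theory", selector_cpd, parts))
print(cited_probability("l1", "Theory", selector_cpd, parts))
```

For a fixed citing topic, these probabilities sum to one over all papers, so the selector plus the uniform choice defines a proper distribution over the reference attribute's values.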
![Page 19: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/19.jpg)
Class-level Dependency Graph
Five types of edges
- Type I: edges within a single object
- Type II: edges between objects
- Type III: edges from every reference attribute along any reference paths
- Type IV: edges from every partition attribute to the selector attributes that use those partition attributes to choose a partition
- Type V: edge from selector attributes to their corresponding reference attributes
![Page 20: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/20.jpg)
Movie Theater Example
- Type I: Genre → Popularity
- Type II: Shows.Movie.Genre → Shows.Profit; Shows.Theater.Type → SMovie
- Type III: Movie → Profit; Theater → SMovie
- Type IV: Genre → SMovie
- Type V: STheater → Theater; SMovie → Movie
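The edge list for the movie theater example can be written down as a small directed graph, reading each listed pair as source then target. The fully qualified node names (for example `Movie.Genre` for Shows.Movie.Genre) are an interpretation, and the ordering check at the end is an added illustration of one use of such a graph rather than something the slides state.

```python
from graphlib import TopologicalSorter, CycleError

# Edges grouped by type; node names are an assumed qualification of the slide's labels.
EDGES = {
    "I": [("Movie.Genre", "Movie.Popularity")],
    "II": [("Movie.Genre", "Shows.Profit"), ("Theater.Type", "Shows.SMovie")],
    "III": [("Shows.Movie", "Shows.Profit"), ("Shows.Theater", "Shows.SMovie")],
    "IV": [("Movie.Genre", "Shows.SMovie")],
    "V": [("Shows.STheater", "Shows.Theater"), ("Shows.SMovie", "Shows.Movie")],
}

def topological_order(edges_by_type):
    """Return a parents-first ordering of the nodes, or None if there is a cycle."""
    preds = {}
    for edges in edges_by_type.values():
        for src, dst in edges:
            preds.setdefault(dst, set()).add(src)
            preds.setdefault(src, set())
    try:
        return list(TopologicalSorter(preds).static_order())
    except CycleError:
        return None

print(topological_order(EDGES))
```

In any ordering returned for this graph, a selector precedes its reference attribute, and the reference attribute precedes every attribute reached through it.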
![Page 21: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/21.jpg)
Unrolled Graph?
The Unrolled Graph can have a huge number of edges
Are learning and inference really feasible?
![Page 22: Learning Probabilistic Models of Link Structure](https://reader036.vdocument.in/reader036/viewer/2022062802/568145f5550346895db2fdd8/html5/thumbnails/22.jpg)
Homework Exercise
Construct the dependency graph for the citation example
Construct an unrolled network for a reference uncertainty example