Intelligent Database Systems Lab
Presenter : Chuang, Kai-Ting
Authors : Rafael Odon de Alencar, Clodoveu Augusto Davis Jr.,
Marcos André Gonçalves
2010, ACM
Geographical classification of documents using evidence from Wikipedia
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• Geography-related terms are often used in Web search queries.
Objectives
• To respond adequately to such queries, it is important to recognize which places a document is associated with.
Methodology
• This paper presents a technique for classifying documents according to their association with places, based on the occurrence of terms that coincide with Wikipedia entry titles.
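The idea of matching document terms against Wikipedia entry titles can be sketched as follows. This is only an illustration under stated assumptions: the table TITLE_TO_PLACE, the function names, and the simple vote counting are hypothetical, and the slides do not say which features or classifier the paper actually uses.

```python
from collections import Counter

# Hypothetical table (not from the paper): Wikipedia-style entry title,
# lowercased, mapped to the place it is taken as evidence for.
TITLE_TO_PLACE = {
    "eiffel tower": "France",
    "louvre": "France",
    "copacabana": "Brazil",
    "sugarloaf mountain": "Brazil",
}
# Longest title length in words, so we know how many n-gram sizes to try.
MAX_TITLE_WORDS = max(len(title.split()) for title in TITLE_TO_PLACE)

PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def classify_by_titles(text):
    """Return the place with the most matching titles, or None if no match."""
    words = tokenize(text)
    votes = Counter()
    # Slide a window of 1..MAX_TITLE_WORDS words over the document and
    # count every window that coincides with a known entry title.
    for n in range(1, MAX_TITLE_WORDS + 1):
        for i in range(len(words) - n + 1):
            place = TITLE_TO_PLACE.get(" ".join(words[i:i + n]))
            if place is not None:
                votes[place] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


print(classify_by_titles("We visited the Louvre and the Eiffel Tower."))
```

Counting votes is the simplest possible decision rule; a real system would more likely feed the matched titles to a trained classifier as features.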
Experiments
• We defined 100 place names to be removed from the documents.
• 10-fold cross-validation was used.
• Impact on precision:
– Wikipedia model: loss of more than 30%.
– TF-IDF bag-of-words model: loss of about 6%.
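The experimental setup above (removing place names, 10-fold cross-validation, measuring precision) can be sketched roughly as below. All three helper names are hypothetical, the place-name filter only handles single-word names, and nothing here reproduces the paper's data or results.

```python
def remove_place_names(text, place_names):
    """Drop every word that matches a (single-word) place name, ignoring case."""
    blocked = {name.lower() for name in place_names}
    kept = [w for w in text.split() if w.strip(".,;:!?").lower() not in blocked]
    return " ".join(kept)


def k_fold_splits(n_items, k=10):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    indices = list(range(n_items))
    for fold in range(k):
        # Every k-th item, starting at `fold`, is held out for testing.
        test = indices[fold::k]
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def precision(predicted, actual, label):
    """Fraction of items predicted as `label` whose true class is `label`."""
    truths = [a for p, a in zip(predicted, actual) if p == label]
    if not truths:
        return 0.0
    return sum(a == label for a in truths) / len(truths)


# Usage: strip place names, then evaluate a classifier fold by fold.
print(remove_place_names("Flights to Paris, and Rome today", ["Paris", "Rome"]))
for train, test in k_fold_splits(25):
    pass  # train a model on `train`, predict on `test`, record precision
```

Comparing precision with and without the place names removed, for each model, gives the kind of loss figures reported on the slide.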
Conclusions
• Experiments showed that a high level of precision can be achieved with this approach.
Comments
• Advantages
– The approach is helpful.
• Applications
– Geographic information retrieval.