ForeCite update
Emma – 12 November 2010
Outline
- DBLP ingestion
- Google Scholar ingestion
- Bimaple fuzzy search
- Misc
  - Bibutils integration
DBLP ingestion
dblp.xml
- Stores the complete metadata of the publications in DBLP (~1.4M records)
<inproceedings key="conf/icadl/HanseKK10" mdate="2010-06-22">
<author>Markus Hänse</author>
<author>Min-Yen Kan</author>
<author>Achim Karduck</author>
<title>Kairos: Proactive Harvesting of Research Paper Metadata from Scientific Conference Web Sites.</title>
<pages>226-235</pages>
<year>2010</year>
<booktitle>ICADL</booktitle>
<ee>http://dx.doi.org/10.1007/978-3-642-13654-2_28</ee>
<crossref>conf/icadl/2010</crossref>
<url>db/conf/icadl/icadl2010.html#HanseKK10</url>
</inproceedings>
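A record like the one above can be read with a streaming parser, which helps for a file of roughly 1.4M records. The following is a minimal sketch, not ForeCite's actual ingestion code: the function name `iter_records`, the `RECORD_TAGS` subset, and the dictionary layout are assumptions made for illustration. The real dblp.xml also relies on character entities declared in its DTD, which this sketch does not handle.

```python
import io
import xml.etree.ElementTree as ET

# The record from the slide, wrapped in DBLP's <dblp> root element.
SAMPLE = """<dblp>
<inproceedings key="conf/icadl/HanseKK10" mdate="2010-06-22">
<author>Markus Hänse</author>
<author>Min-Yen Kan</author>
<author>Achim Karduck</author>
<title>Kairos: Proactive Harvesting of Research Paper Metadata from Scientific Conference Web Sites.</title>
<pages>226-235</pages>
<year>2010</year>
<booktitle>ICADL</booktitle>
<ee>http://dx.doi.org/10.1007/978-3-642-13654-2_28</ee>
<crossref>conf/icadl/2010</crossref>
<url>db/conf/icadl/icadl2010.html#HanseKK10</url>
</inproceedings>
</dblp>"""

# Assumed subset of record types; DBLP defines more than these two.
RECORD_TAGS = {"inproceedings", "article"}


def iter_records(source):
    """Yield one dict per publication record without loading the whole tree."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in RECORD_TAGS:
            yield {
                "key": elem.get("key"),
                "type": elem.tag,
                "authors": [a.text for a in elem.findall("author")],
                "title": elem.findtext("title"),
                "year": elem.findtext("year"),
                "venue": elem.findtext("booktitle"),
            }
            elem.clear()  # drop the processed subtree to keep memory flat


records = list(iter_records(io.BytesIO(SAMPLE.encode("utf-8"))))
```

Each yielded dictionary holds what an author page or paper page would need: the key, the author list, the title, the year, and the venue.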
DBLP Ingestion
Author Page
Paper Page
DBLP ingestion
dblp_bht.xml (bht stands for “bibliography hypertext”)
- Is an HTML file with additional customized elements: cite, ref
- Provides the information for the navigation pages and TOC pages
<bht key="/db/conf/3dica/3dica1998.bht" title="Three-Dimensional Image Capture and Applications 1998">
  <logo/>
  <h1> <ref href="db/conf/3dica/index.html">Three-Dimensional Image Capture and Applications</ref> 1998: San Jose, CA, USA </h1>
  <hr/>
  <cite key="conf/3dica/1998"/>
  <h2> Multiple Image Methods </h2>
  <ul>
    <li> <cite key="conf/3dica/HebertR98"/> </li>
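To show how the cite elements can drive a TOC page, here is a small sketch that walks a bht entry and pairs each cite key with the section heading it appears under. The function name `toc_entries` is hypothetical, and because the fragment on the slide is cut off, the closing tags in the sample were added only so that it parses; they are an assumption, not a copy of the real file.

```python
import xml.etree.ElementTree as ET

# Slide fragment, closed off by hand so that it is well-formed (assumption).
SAMPLE = """<bht key="/db/conf/3dica/3dica1998.bht" title="Three-Dimensional Image Capture and Applications 1998">
<logo/>
<h1><ref href="db/conf/3dica/index.html">Three-Dimensional Image Capture and Applications</ref> 1998: San Jose, CA, USA</h1>
<hr/>
<cite key="conf/3dica/1998"/>
<h2>Multiple Image Methods</h2>
<ul>
<li><cite key="conf/3dica/HebertR98"/></li>
</ul>
</bht>"""


def toc_entries(bht_xml):
    """Return (section heading, cite key) pairs in document order."""
    root = ET.fromstring(bht_xml)
    section = None  # cites before the first <h2> have no section
    entries = []
    for elem in root.iter():  # iter() visits elements in document order
        if elem.tag == "h2":
            section = "".join(elem.itertext()).strip()
        elif elem.tag == "cite":
            entries.append((section, elem.get("key")))
    return entries


entries = toc_entries(SAMPLE)
```

The keys returned here are the same keys used by the records in dblp.xml, so a TOC page can be assembled by looking each one up.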
DBLP Ingestion
Google Scholar ingestion
- Define the XML format in forecite.dtd
<misc key="Automaticpartitioningoffullmotionvideo-HJZhangAKankanhalliSW-Readingsin
  <title>Automatic partitioning of full-motion video</title>
  <url>http://books.google.com.sg/books?hl=en&lr=&id=3UftvGv2g3QC&oi=fnd&pg=PA321&
  <author>Zhang, HJ</author>
  <author>Kankanhalli, A</author>
  <booktitle>Readings in multimedia ...</booktitle>
  <year>2002</year>
</misc>
Crawlers (such as the Google Scholar crawler or Kairos)
- Generate XML files that conform to forecite.dtd
- Put the XML files into a specified folder X
ForeCite
- Looks in folder X for XML files
- Ingests the metadata in the XML files into the databases
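The hand-off through folder X can be sketched as follows. This is an illustration under stated assumptions rather than ForeCite's implementation: the function `ingest_folder`, the `papers` table with its `rec_key` column, the SQLite backend, and the placeholder key `demo-key` are all hypothetical, and the sketch does not validate files against forecite.dtd.

```python
import sqlite3
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def ingest_folder(folder, conn):
    """Load every *.xml file found in `folder` into the `papers` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS papers "
        "(rec_key TEXT PRIMARY KEY, title TEXT, authors TEXT, year TEXT)"
    )
    count = 0
    for path in sorted(Path(folder).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        # Accept a single record, or a wrapper element holding several.
        records = [root] if root.get("key") else list(root)
        for rec in records:
            authors = "; ".join(a.text or "" for a in rec.findall("author"))
            conn.execute(
                "INSERT OR REPLACE INTO papers VALUES (?, ?, ?, ?)",
                (rec.get("key"), rec.findtext("title"), authors, rec.findtext("year")),
            )
            count += 1
    conn.commit()
    return count


# Demo: a crawler drops one file into folder X, then ForeCite ingests it.
with tempfile.TemporaryDirectory() as folder_x:
    Path(folder_x, "scholar.xml").write_text(
        '<misc key="demo-key">'
        "<title>Automatic partitioning of full-motion video</title>"
        "<author>Zhang, HJ</author><author>Kankanhalli, A</author>"
        "<year>2002</year></misc>",
        encoding="utf-8",
    )
    conn = sqlite3.connect(":memory:")
    ingested = ingest_folder(folder_x, conn)
    rows = conn.execute("SELECT title, authors, year FROM papers").fetchall()
```

Because the crawlers and ForeCite only share a folder and a DTD, a new crawler can be added without touching the ingestion side.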
Bimaple (www.bimaple.com)
- Specializes in improving the user experience of Web search
- Founded by Prof. Chen Li (UC Irvine)
Bimaple integrated in ForeCite
Author
Paper
Bibutils integration
- Bibutils is a set of programs that interconverts between various bibliography formats using a common XML intermediate (MODS)
Bibutils integration
- Convert the papers’ metadata into MODS format
- Call the corresponding command to convert MODS to a specific bibliography format: ADS, BibTeX, EndNote, ISI, RIS, Word
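The two steps above might look like the sketch below. The command names are the standard Bibutils MODS converters; everything else (the helper names `to_mods` and `convert`, and the deliberately tiny MODS document) is an assumption for illustration, and `convert` only works where Bibutils is installed and on the PATH.

```python
import os
import subprocess
import tempfile
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Bibutils commands that read MODS XML, keyed by target format.
CONVERTERS = {
    "ADS": "xml2ads",
    "BibTeX": "xml2bib",
    "EndNote": "xml2end",
    "ISI": "xml2isi",
    "RIS": "xml2ris",
    "Word": "xml2wordbib",
}


def q(tag):
    """Qualify a tag name with the MODS namespace."""
    return f"{{{MODS_NS}}}{tag}"


def to_mods(title, authors, year):
    """Step 1: build a minimal MODS document for one paper (far from complete)."""
    ET.register_namespace("", MODS_NS)
    coll = ET.Element(q("modsCollection"))
    mods = ET.SubElement(coll, q("mods"))
    title_info = ET.SubElement(mods, q("titleInfo"))
    ET.SubElement(title_info, q("title")).text = title
    for author in authors:
        name = ET.SubElement(mods, q("name"), type="personal")
        ET.SubElement(name, q("namePart")).text = author
    origin = ET.SubElement(mods, q("originInfo"))
    ET.SubElement(origin, q("dateIssued")).text = year
    return ET.tostring(coll, encoding="unicode")


def convert(mods_xml, target):
    """Step 2: run the matching Bibutils command on a temporary MODS file."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".xml", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write(mods_xml)
    try:
        done = subprocess.run(
            [CONVERTERS[target], tmp.name],
            capture_output=True, text=True, check=True,
        )
        return done.stdout
    finally:
        os.unlink(tmp.name)


mods_xml = to_mods(
    "Kairos: Proactive Harvesting of Research Paper Metadata "
    "from Scientific Conference Web Sites.",
    ["Markus Hänse", "Min-Yen Kan", "Achim Karduck"],
    "2010",
)
# With Bibutils installed: bibtex_entry = convert(mods_xml, "BibTeX")
```

Routing every export through MODS means that supporting another output format only needs one more entry in the command table.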
Bibutils integration