Project Gutenberg as an Information Retrieval System

Project Gutenberg as an Information Retrieval System Kai Li IST616 Final Assignment 2012.11

Upload: kai-li

Post on 28-Nov-2014

Category:

Technology


TRANSCRIPT

  • 1. Project Gutenberg as an Information Retrieval System Kai Li IST616 Final Assignment 2012.11
  • 2. Introduction to Project Gutenberg The first digital library project in the world, initiated by the late Michael Hart in 1971. Project Gutenberg currently offers more than 41,000 public domain eBooks (in more than 50 languages) as well as other resources (like scientific data). Website: http://www.gutenberg.org/
  • 3. Intended Audience and Functionalities Intended audience: eBook readers and general users. Functionalities: portal of the project, eBook repository and discovery system.
  • 4. Mobile Site The website offers two interfaces, depending on the device one uses. Only the traditional non-mobile interface is examined in this presentation, due to the limited scope of the assignment.
  • 5. Indexing System
  • 6. Issues of Indexing/Tag System There is a search box as well as a tag called Search Catalog; the search box is too small to be noticed; the Search Catalog tag leads users to a page that has no search box, only some browsing options; a number of tags are repeated on the left-hand bar and at the top of the page, for example the tag Book Categories.
  • 7. Means To Find a Book Searching; Browsing; By categories
  • 8. Searching
  • 9. Issues of Searching The result display differs from most interfaces one sees on the Internet, which may cause some difficulty for new users; because there is no navigation mechanism and no way to refine the result by facets, it is extremely inconvenient to locate a resource when the result set is large.
  • 10. Precision and Recall The website uses a string-matching retrieval method, which matches the string entered by the user against the full text of all the resources. Multiple words are combined with an OR relationship. Because the index covers the full text, recall is higher than in traditional library catalogs; however, since it is still a string-matching method, precision is still not very good.
  • 11. Browsing
  • 12. Issues of Browsing This page offers three search tools, which should have been offered on the search page rather than here. Only one criterion can be used to limit the resources at a time, and once a criterion is chosen there is no way to narrow the result further.
  • 13. Categories/Classification The website has two tiers of classification. Subcategories (23): confusingly, these are also called bookshelves. Bookshelves (133): these can be seen as a lower level than subcategories; however, not all bookshelves are linked to a subcategory.
  • 14. Overall Evaluation Advantages: mobile functionalities (mobile site, QR codes). Disadvantages: poorly organized and designed; fails to display the full richness of the metadata available on the website (LoC classification and subject headings); the interface lacks communication with users.
  • 15. Thanks!
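
Illustration for slide 10 (Precision and Recall): a minimal Python sketch of OR-style matching over full text, and of how precision and recall would be computed for one query. It is not Project Gutenberg's actual implementation. The function names, the word-level tokenization, the tiny sample collection, and the relevance judgment are all assumptions made up for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def or_search(query, documents):
    """Return ids of documents whose full text contains at least one query word."""
    terms = tokenize(query)
    return {doc_id for doc_id, text in documents.items() if terms & tokenize(text)}


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical full-text collection (made-up sentences, not real catalog data).
documents = {
    "moby": "Call me Ishmael. The white whale haunted the sea.",
    "treasure": "The pirates sailed the sea in search of buried treasure.",
    "pride": "A single man in possession of a good fortune wants a wife.",
    "alice": "Alice followed the white rabbit down the hole.",
}

# Assumed relevance judgment: the user only wants the book about the whale.
relevant = {"moby"}

retrieved = or_search("white whale", documents)
precision, recall = precision_recall(retrieved, relevant)
print(sorted(retrieved), precision, recall)
```

In this toy collection the OR query also returns the text that merely contains "white", so every relevant item is found (high recall) while half of the result is irrelevant (lower precision), which is the trade-off described on slide 10.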