Project Gutenberg as an Information Retrieval System Kai Li IST616 Final Assignment 2012.11
Transcript
1. Project Gutenberg as an Information Retrieval System Kai Li
IST616 Final Assignment 2012.11
2. Introduction to Project Gutenberg The first digital library
project in the world, initiated by the late Michael Hart in 1971.
Project Gutenberg currently offers more than 41,000 public domain
eBooks (in more than 50 languages) as well as other resources (like
scientific data). Website: http://www.gutenberg.org/
3. Intended Audience and Functionalities Intended audience:
eBook readers and general users. Functionalities: portal of the
project, eBook repository and discovery system.
4. Mobile Site The website offers two interfaces, served according
to the device one uses. Only the traditional non-mobile interface
will be examined in this presentation due to the limited scope of
the assignment.
5. Indexing System
6. Issues of Indexing/Tag System There is a search box as well as
a tag called Search Catalog; The search box is too small to be
noticed easily; The tag Search Catalog actually leads users to a
page that has no search box, only some browsing options; A number
of tags are repeated on the left-hand bar and at the top of the
page; For example, the tag Book Categories.
7. Means To Find a Book Searching; Browsing; By categories
8. Searching
9. Issues of Searching The result display differs from most of
the interfaces one sees on the Internet, which may cause some
difficulties for new users; Because there is no navigation
mechanism and no way to refine the results by facets, it is
extremely inconvenient to locate a resource when the result set is
large.
10. Precision and Recall The retrieval method used by this
website is string matching, which matches the string entered by
the user against the full text of all the resources. An OR
relationship is used for multiple words. Because the scope of the
index is the full text, the recall is higher than that of
traditional library catalogs; however, since it is still a
string-matching method, the precision is still not very good.
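The trade-off described above can be illustrated with a small sketch. This is not the website's actual implementation; the function names (or_match, precision_recall), the four sample titles standing in for full texts, and the choice of which one counts as relevant are all assumptions made for illustration.

```python
def or_match(query, documents):
    """Return ids of documents containing ANY query word as a substring (OR matching)."""
    terms = query.lower().split()
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(term in text.lower() for term in terms)
    }


def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved; recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical mini-collection; short strings stand in for full texts.
documents = {
    "d1": "Moby Dick; or, The Whale",
    "d2": "The White Company",
    "d3": "White Fang",
    "d4": "Pride and Prejudice",
}

# Assumption: the user typing "white whale" only wants d1.
relevant = {"d1"}
retrieved = or_match("white whale", documents)
precision, recall = precision_recall(retrieved, relevant)
print(sorted(retrieved), round(precision, 2), recall)
```

In this toy collection the wanted item is found, so recall is complete, but two unrelated titles match on "white" alone, so only one of the three results is relevant. This is the pattern the slide describes: broad matching helps recall and hurts precision.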
11. Browsing
12. Issues of Browsing Three search tools are offered on this
page; they should have been offered on the search page instead.
Only one criterion can be used at a time to limit the resources.
After one chooses a criterion, there is no way to limit the
result further.
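As a rough sketch of what limiting by more than one criterion at once could look like, the following filters a small in-memory catalog by any combination of fields. The Book record, its field names, and the refine function are hypothetical and are not taken from the website.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    # Hypothetical metadata fields chosen for illustration.
    title: str
    author: str
    language: str
    category: str


def refine(books, **criteria):
    """Keep only books whose fields equal every given criterion (AND of facets)."""
    return [
        book
        for book in books
        if all(getattr(book, field) == value for field, value in criteria.items())
    ]


catalog = [
    Book("Moby Dick", "Herman Melville", "en", "Fiction"),
    Book("On the Origin of Species", "Charles Darwin", "en", "Science"),
    Book("Les Misérables", "Victor Hugo", "fr", "Fiction"),
]

english = refine(catalog, language="en")
english_fiction = refine(catalog, language="en", category="Fiction")
print([b.title for b in english_fiction])
```

Because each call accepts any number of criteria, a result narrowed by language can be narrowed again by category, which is the step the browsing page does not allow.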
13. Categories/Classification The classification on this website
has two tiers: Subcategories: 23. These subcategories are also
called bookshelves, which is confusing. Bookshelves: 133. These
can be seen as a lower level than subcategories. However, not all
bookshelves are linked to a given subcategory.
14. Overall Evaluation Advantages: Mobile functionalities:
mobile site, QR codes. Disadvantages: Poorly organized and
designed; Fails to display the full richness of the metadata on
the website (LoC classification and subject headings); The
interface lacks communication with its users.