dc.contributor.advisor Leake, David B. en
dc.contributor.advisor Gasser, Michael en
dc.contributor.author Scherle, Ryan en
dc.date.accessioned 2010-06-01T21:59:27Z
dc.date.available 2010-10-19T17:51:35Z
dc.date.issued 2010-06-01T21:59:27Z
dc.date.submitted 2006 en
dc.identifier.uri http://hdl.handle.net/2022/7473
dc.description Thesis (PhD) - Indiana University, Computer Sciences, 2006 en
dc.description.abstract The Internet contains billions of documents and thousands of systems for searching over these documents. Searching for a useful document can be as difficult as the proverbial search for a needle in a haystack. Each search engine provides access to a different collection of documents. Collections may be large or small, focused or comprehensive. Focused collections may be centered on any possible topic, and comprehensive collections typically have particular topical areas with higher concentrations of documents. Some of these collections overlap, but many documents are available from only a single collection. To find the most needles, one must first select the best haystacks. This dissertation develops a framework for automatic selection of search engines. In this framework, the collection underlying each search engine is examined to determine how properties such as central topic, size, and degree of focus affect retrieval performance. When measured with appropriate techniques, these properties may be used to predict performance. A new distributed retrieval algorithm that takes advantage of this knowledge is presented and compared to existing retrieval algorithms. en
dc.language.iso EN en
dc.publisher [Bloomington, Ind.] : Indiana University en
dc.subject distributed information retrieval en
dc.subject collection selection en
dc.subject search engine en
dc.subject metasearch en
dc.subject database selection en
dc.subject information retrieval en
dc.subject.classification Computer Science (0984) en
dc.subject.classification Information Science (0723) en
dc.title Looking for a Haystack: Selecting Data Sources in a Distributed Retrieval System en
dc.type Doctoral Dissertation en

