Information retrieval is the foundation for modern search engines. This text offers an introduction to the core topics underlying modern search technologies, including algorithms, data structures, indexing, retrieval, and evaluation. The emphasis is on implementation and experimentation; each chapter includes exercises and suggestions for student projects. ... The modular structure of the book allows instructors to use it in a variety of graduate-level courses, including courses taught from a database systems perspective, traditional information retrieval courses with a focus on IR theory, and courses covering the basics of Web retrieval.
Reviews
"This book is a must-read for all search academics and practitioners!" from the foreword by Amit Singhal
"This book is a fine addition to the growing literature on information retrieval (IR)." Donald H. Kraft Computing Reviews
Table of Contents (incl. sample chapters)
Foundations; Basic Techniques; Tokens and Terms; Static Inverted Indices; Query Processing; Index Compression; Retrieval and Ranking; Experimental Comparison; Evaluation; Applications and Extensions; Computer Performance.
On the same shelf:
- Probabilistic Question Answering on the Web. Dragomir R. Radev, Weiguo Fan, Hong Qi, Harris Wu, Amardeep Grewal. Journal of the American Society for Information Science and Technology (impact factor: 2.01). 12/2012.
ABSTRACT: Web-based search engines such as Google and NorthernLight return documents that are relevant to a user query, not answers to user questions. We have developed an architecture that augments existing search engines so that they support natural language question answering. The process entails five steps: query modulation, document retrieval, passage extraction, phrase extraction, and answer ranking. In this article, we describe some probabilistic approaches to the last three of these stages. We show how our techniques apply to a number of existing search engines, and we also present results contrasting three different methods for question answering. Our algorithm, probabilistic phrase reranking (PPR), uses proximity and question type features and achieves a total reciprocal document rank of 0.20 on the TREC8 corpus. Our techniques have been implemented as a Web-accessible system, called NSIR. Peer reviewed.
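The abstract reports its result as a total reciprocal document rank. As a rough illustration only, here is a minimal Python sketch of how such a score could be computed for one question, assuming the metric sums the reciprocal ranks of every returned item judged correct. That definition is an assumption here, since the abstract does not spell it out, and the function name and the sample judgments are hypothetical.

```python
def total_reciprocal_rank(is_correct):
    """Sum 1/rank over every ranked item judged correct.

    `is_correct` is a list of booleans in rank order (rank 1 first).
    Assumed definition; the abstract itself does not define the metric.
    """
    return sum(1.0 / rank
               for rank, ok in enumerate(is_correct, start=1)
               if ok)


# Hypothetical judgments for one question: correct items at ranks 2 and 4.
judgments = [False, True, False, True]
score = total_reciprocal_rank(judgments)  # 1/2 + 1/4
print(score)
```

Unlike a plain reciprocal rank, which only credits the first correct hit, a summed score of this kind also rewards additional correct items further down the list.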