A Generative Theory of Relevance (eBook)

Victor Lavrenko (Author)

eBook Download: PDF
2008 | 2009
XX, 197 pages
Springer Berlin (publisher)
978-3-540-89364-6 (ISBN)

A Generative Theory of Relevance - Victor Lavrenko
96.29 incl. VAT
  • Download available immediately

A modern information retrieval system must have the capability to find, organize and present very different manifestations of information - such as text, pictures, videos or database records - any of which may be of relevance to the user. However, the concept of relevance, while seemingly intuitive, is actually hard to define, and it's even harder to model in a formal way.

Lavrenko does not attempt to bring forth a new definition of relevance, nor does he argue why any particular definition might be theoretically superior or more complete. Instead, he takes a widely accepted, albeit somewhat conservative, definition, makes several assumptions, and from them develops a new probabilistic model that explicitly captures that notion of relevance. With this book, he makes two major contributions to the field of information retrieval. The first is a new way to look at topical relevance that complements the two dominant models (the classical probabilistic model and the language modeling approach) and explicitly combines documents, queries, and relevance in a single formalism. The second is a new method for modeling exchangeable sequences of discrete random variables that makes no structural assumptions about the data and can also handle rare events.

Thus his book is of major interest to researchers and graduate students in information retrieval who specialize in relevance modeling, ranking algorithms, and language modeling.



Victor Lavrenko is a lecturer at the School of Informatics at the University of Edinburgh, Scotland, UK. He received his Ph.D. in Computer Science from the University of Massachusetts Amherst in 2004. His dissertation focused on a generative framework for modeling relevance in Information Retrieval. In 2005 he joined the Center for Intelligent Information Retrieval at UMass as a post-doctoral research associate, working on statistical models for searching large semi-structured databases. From 2006 Victor worked as a language technology consultant for the Credit Suisse Group. Since 2000, he has served as a reviewer for SIGIR, CIKM, NAACL/HLT, IJCAI and NIPS conferences.

Victor's current research interests include formal models for searching text in multiple languages, annotating and retrieving images, and detecting and tracking novel events in the news.

Introduction
Contributions
A new model of relevance
A new generative model
Minor contributions
Overview
Relevance
The many faces of relevance
A simple definition of relevance
User-oriented views of relevance
Logical views of relevance
The binary nature of relevance
Dependent and independent relevance
Attempts to Construct a Unified Definition of Relevance
Relevance in this book
Existing Models of Relevance
The Probability Ranking Principle
The Classical Probabilistic Model
The Language Modeling Framework
Contrasting the Classical Model and Language Models
A Generative View of Relevance
An Informal Introduction to the Model
Representation of documents and requests
Advantages of a common representation
Information retrieval under the generative hypothesis
Formal Specification of the Model
Representation of Documents and Queries
Document and query generators
Relevant documents
Relevance in the information space
Relevant queries
Summary of representations
Probability Measures
Distribution over the representation space
Distribution over documents and queries
Significance of our derivations
Summary of probability measures
Relevance Models
Frequentist interpretation: a sampling game
Bayesian interpretation: uncertainty about relevance
Multi-modal domains
Summary of relevance models
Ranked Retrieval
Probability ranking principle
Retrieval as hypothesis testing
Probability ratio or KL-divergence?
Summary of ranking methods
Discussion of the Model
Generative Density Allocation
Problem Statement
Objective
Existing Generative Models
The Unigram model
The Mixture model
The Dirichlet model
Probabilistic Latent Semantic Indexing (pLSI)
Latent Dirichlet Allocation
A brief summary
Motivation for a new model
A Common Framework for Generative Models
Unigram
Dirichlet
Mixture
pLSI
LDA
A note on graphical models
Kernel-based Allocation of Generative Density
Delta kernel
Dirichlet kernel
Advantages of kernel-based allocation
Predictive Effectiveness of Kernel-based Allocation
Summary
Retrieval Scenarios
Ad-hoc Retrieval
Representation
Examples of Relevance Models
Experiments
Relevance Feedback
Representation
Experiments
Cross-Language Retrieval
Representation
Example of a cross-lingual relevance model
Experiments
Significance of the cross-language scenario
Handwriting Retrieval
Definition
Representation
Experiments
Image Retrieval
Representation
Experiments
Video Retrieval
Representation
Experiments
Structured search with missing data
Representation of documents and queries
Probability distribution over documents and queries
Structured Relevance Model
Retrieving Relevant Records
Experiments
Topic Detection and Tracking
Definition
Representation
Link detection algorithm
Experiments
Conclusion
Limitations of our Work
Closed-universe approach
Exchangeable data
Computational complexity
Directions for Future Research
Relevance-based indexing
Hyper-linked and relational data
Order-dependent data
Dirichlet kernels
References
Index

Publication date (per publisher): 14.11.2008
Series: The Information Retrieval Series
Additional info: XX, 197 p. 31 illus.
Place of publication: Berlin
Language: English
Subject area: Mathematics / Computer Science > Computer Science > Programming Languages / Tools
Keywords: Algorithm analysis and problem complexity • algorithms • Database • Information Retrieval
ISBN-10: 3-540-89364-4 / 3540893644
ISBN-13: 978-3-540-89364-6 / 9783540893646
PDF (watermark)
Size: 2.2 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device, but is only partly suitable for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. However, it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Unfortunately, we cannot fulfill eBook orders from other countries.