Advances in Information Retrieval (eBook)
779 pages
Springer-Verlag
978-3-540-71496-5 (ISBN)
This book constitutes the refereed proceedings of the 29th annual European Conference on Information Retrieval Research, ECIR 2007, held in Rome, Italy in April 2007.
The 42 revised full papers and 19 revised short papers presented together with 3 keynote talks and 21 poster papers were carefully reviewed and selected from 220 article submissions and 72 poster paper submissions. The papers are organized in topical sections on theory and design, efficiency, peer-to-peer networks, result merging, queries, relevance feedback, evaluation, classification and clustering, filtering, topic identification, expert finding, XML IR, Web IR, and multimedia IR.
Written for: Researchers and professionals
Keywords: IR, Web query mining, Web search, XML retrieval, classification, clustering, collaborative Web searches, collaborative filtering, cross-language retrieval, distributed IR, document retrieval, image retrieval, information extraction, information retrieval, multimedia retrieval, question answering, results merging, semantic orientation, similarity search, text mining
Preface 6
Organization 8
Table of Contents 14
The Next Generation Web Search and the Demise of the Classic IR Model 21
The Last Half-Century: A Perspective on Experimentation in Information Retrieval 22
Learning in Hyperlinked Environments 23
A Parameterised Search System 24
Similarity Measures for Short Segments of Text 36
Multinomial Randomness Models for Retrieval with Document Fields 48
On Score Distributions and Relevance 60
Modeling Term Associations for Ad-Hoc Retrieval Performance Within Language Modeling Framework 72
Static Pruning of Terms in Inverted Files 84
Efficient Indexing of Versioned Document Sequences 96
Light Syntactically-Based Index Pruning for Information Retrieval 108
Sorting Out the Document Identifier Assignment Problem 121
Efficient Construction of FM-index Using Overlapping Block Processing for Large Scale Texts 133
Performance Comparison of Clustered and Replicated Information Retrieval Systems 144
A Study of a Weighting Scheme for Information Retrieval in Hierarchical Peer-to-Peer Networks 156
A Decision-Theoretic Model for Decentralised Query Routing in Hierarchical Peer-to-Peer Networks 168
Conclusion and Outlook 178
Central-Rank-Based Collection Selection in Uncooperative Distributed Information Retrieval 180
Results Merging Algorithm Using Multiple Regression Models 193
Segmentation of Search Engine Results for Effective Data-Fusion 205
Query Hardness Estimation Using Jensen-Shannon Divergence Among Multiple Scoring Functions 218
Query Reformulation and Refinement Using NLP-Based Sentence Clustering 230
Automatic Morphological Query Expansion Using Analogy-Based Machine Learning 242
Advanced Structural Representations for Question Classification and Answer Re-ranking 254
Incorporating Diversity and Density in Active Learning for Relevance Feedback 266
Relevance Feedback Using Weight Propagation Compared with Information-Theoretic Query Expansion 278
A Retrieval Evaluation Methodology for Incomplete Relevance Assessments 291
Evaluating Query-Independent Object Features for Relevancy Prediction 303
The Utility of Information Extraction in the Classification of Books 315
Combined Syntactic and Semantic Kernels for Text Classification 327
Fast Large-Scale Spectral Clustering by Sequential Shrinkage Optimization 339
A Probabilistic Model for Clustering Text Documents with Multiple Fields 351
Personalized Communities in a Distributed Recommender System 363
Information Recovery and Discovery in Collaborative Web Search 376
Collaborative Filtering Based on Transitive Correlations Between Items 388
Entropy-Based Authorship Search in Large Document Collections 401
Use of Topicality and Information Measures to Improve Document Representation for Story Link Detection 413
Ad Hoc Retrieval of Documents with Topical Opinion 425
Probabilistic Models for Expert Finding 438
Using Relevance Feedback in Expert Search 451
Using Topic Shifts for Focussed Access to XML Repositories 464
Feature- and Query-Based Table of Contents Generation for XML Documents 476
Setting Per-field Normalisation Hyper-parameters for the Named-Page Finding Search Task 488
Combining Evidence for Relevance Criteria: A Framework and Experiments in Web Retrieval 501
Classifier Fusion for SVM-Based Multimedia Semantic Indexing 514
Search of Spoken Documents Retrieves Well Recognized Transcripts 525
Natural Language Processing for Usage Based Indexing of Web Resources 537
Harnessing Trust in Social Search 545
How to Compare Bilingual to Monolingual Cross-Language Information Retrieval 553
Multilingual Text Classification Using Ontologies 561
Using Visual-Textual Mutual Information and Entropy for Inter-modal Document Indexing 569
A Study of Global Inference Algorithms in Multi-document Summarization 577
Document Representation Using Global Association Distance Model 585
Sentence Level Sentiment Analysis in the Presence of Conjuncts Using Linguistic Analysis 593
PageRank: When Order Changes 601
Model Tree Learning for Query Term Weighting in Question Answering 609
Examining Repetition in User Search Behavior 617
Popularity Weighted Ranking for Academic Digital Libraries 625
Naming Functions for the Vector Space Model 633
Effective Use of Semantic Structure in XML Retrieval 641
Searching Documents Based on Relevance and Type 649
Investigation of the Effectiveness of Cross-Media Indexing 657
Improve Ranking by Using Image Information 665
N-Step PageRank for Web Search 673
Authorship Attribution Via Combination of Evidence 681
Cross-Document Entity Tracking 690
Enterprise People and Skill Discovery Using Tolerant Retrieval and Visualization 694
Experimental Results of the Signal Processing Approach to Distributional Clustering of Terms on Reuters-21578 Collection 698
Overall Comparison at the Standard Levels of Recall of Multiple Retrieval Methods with the Friedman Test 702
Building a Desktop Search Test-Bed 706
Hierarchical Browsing of Video Key Frames 711
Active Learning with History-Based Query Selection for Text Categorisation 715
Fighting Link Spam with a Two-Stage Ranking Strategy 719
Improving Naive Bayes Text Classifier Using Smoothing Methods 723
Term Selection and Query Operations for Video Retrieval 728
An Effective Threshold-Based Neighbor Selection in Collaborative Filtering 732
Combining Multiple Sources of Evidence in XML Multimedia Documents: An Inference Network Incorporating Element Language Models 736
Language Model Based Query Classification 740
Integration of Text and Audio Features for Genre Classification in Music Information 744
Retrieval Method for Video Content in Different Format Based on Spatiotemporal Features 748
Combination of Document Priors in Web Information Retrieval 752
Enhancing Expert Search Through Query Modeling 757
A Hierarchical Consensus Architecture for Robust Document Clustering 761
Summarisation and Novelty: An Experimental Investigation 765
A Layered Approach to Context-Dependent User Modelling 769
A Bayesian Approach for Learning Document Type Relevance 773
Author Index 777
The Next Generation Web Search and the Demise of the Classic IR Model (p. 19)
Abstract. The classic IR model assumes a human engaged in activity that generates an "information need". This need is verbalized and then expressed as a query to a search engine over a defined corpus. In the past decade, Web search engines have evolved from a first generation based on classic IR algorithms scaled to web size and thus supporting only informational queries, to a second generation supporting navigational queries using web-specific information (primarily link analysis), to a third generation enabling transactional and other "semantic" queries based on a variety of technologies aimed to directly satisfy the unexpressed "user intent", thus moving further and further away from the classic model.
What is coming next? In this talk, we identify two trends, both representing "short-circuits" of the model. The first is the trend towards context-driven Information Supply (IS), that is, the goal of Web IR will widen to include the supply of relevant information from multiple sources without requiring the user to make an explicit query. The concept of information supply long predates information retrieval; what is new in the web framework is the ability to supply relevant information specific to a given activity and a given user, while the activity is being performed.
Thus the entire verbalization and query-formation phases are eliminated. The second trend is "social search", driven by the fact that the Web has evolved to being simultaneously a huge repository of knowledge and a vast social environment. As such, it is often more effective to ask the members of a given web milieu than to construct elaborate queries. This short-circuits only the query formulation, but allows information-finding activities, such as opinion elicitation and discovery of social norms, that are not expressible at all as queries against a fixed corpus.
The Last Half-Century: A Perspective on Experimentation in Information Retrieval
Abstract. The experimental evaluation of information retrieval systems has a venerable history. Long before the current notion of a search engine, in fact before search by computer was even feasible, people in the library and information science community were beginning to tackle the evaluation issue. Sometimes it feels as though evaluation methodology has become fixed (stable or frozen, according to your viewpoint). However, this is far from the case. Interest in methodological questions is as great now as it ever was, and new ideas are continuing to develop. This talk will be a personal take on the field.
Learning in Hyperlinked Environments
Abstract. A remarkable number of important problems in different domains (e.g. web mining, pattern recognition, biology, ...) are naturally modeled by functions defined on graphical domains, rather than on traditional vector spaces. Following the recent developments in statistical relational learning, in this talk, I introduce Diffusion Learning Machines (DLM) whose computation is very much related to Web ranking schemes based on link analysis. Using arguments from function approximation theory, I argue that, as a matter of fact, DLM can compute any conceivable ranking function on the Web.
Publication date (per publisher) | 1.1.2007 |
---|---|
Language | English |
Subject area | Computer Science ► Databases ► Data Warehouse / Data Mining |
Social Sciences ► Communication / Media ► Book Trade / Library Science | |
ISBN-10 | 3-540-71496-0 / 3540714960 |
ISBN-13 | 978-3-540-71496-5 / 9783540714965 |
Size: 15.1 MB
DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to specialist books with columns, tables, and figures. A PDF can be displayed on almost any device, but is only of limited suitability for small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You will need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You will need a PDF viewer, e.g. the free Adobe Digital Editions app.
Additional feature: Read online
In addition to downloading it, you can also read this eBook online in your web browser.
Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.