Automatic Indexing and Abstracting of Document Texts - Marie-Francine Moens

Automatic Indexing and Abstracting of Document Texts (eBook)

eBook Download: PDF
2005 | 1st edition
284 pages
Springer US (publisher)
978-0-306-47017-2 (ISBN)
127.95 incl. VAT
Automatic Indexing and Abstracting of Document Texts summarizes the latest techniques of automatic indexing and abstracting and the results of their application. It also places the techniques in the context of the study of text, of manual indexing and abstracting, and of the use of indexing descriptions and abstracts in systems that select documents or information from large collections. Important sections of the book consider the development of new techniques for indexing and abstracting: using text grammars; learning the themes of texts, including identifying representative sentences or paragraphs by means of suitable clustering algorithms; and learning classification patterns of texts. The book also attempts to illuminate new avenues for future research. It is an excellent reference for researchers and professionals working in content management and information retrieval.

CONTENTS 7
PREFACE 11
ACKNOWLEDGEMENTS 15
PART I THE INDEXING AND ABSTRACTING ENVIRONMENT 18
Chapter 1 THE NEED FOR INDEXING AND ABSTRACTING TEXTS 19
1. INTRODUCTION 19
2. ELECTRONIC DOCUMENTS 20
3. COMMUNICATION THROUGH NATURAL LANGUAGE TEXT 21
4. UNDERSTANDING OF NATURAL LANGUAGE TEXT: THE COGNITIVE PROCESS 23
5. UNDERSTANDING OF NATURAL LANGUAGE TEXT: THE AUTOMATED PROCESS 24
6. IMPORTANT CONCEPTS IN INFORMATION RETRIEVAL AND SELECTION 26
7. GENERAL SOLUTIONS TO THE INFORMATION RETRIEVAL PROBLEM 33
8. THE NEED FOR BETTER AUTOMATIC INDEXING AND ABSTRACTING TECHNIQUES 38
Chapter 2 THE ATTRIBUTES OF TEXT 43
1. INTRODUCTION 43
2. THE STUDY OF TEXT 43
3. AN OVERVIEW OF SOME COMMON TEXT TYPES 45
4. TEXT DESCRIBED AT A MICRO LEVEL 46
5. TEXT DESCRIBED AT A MACRO LEVEL 54
6. CONCLUSIONS 63
Chapter 3 TEXT REPRESENTATIONS AND THEIR USE 65
1. INTRODUCTION 65
2. DEFINITIONS 65
3. REPRESENTATIONS THAT CHARACTERIZE THE CONTENT OF TEXT
3.1 Set of Natural Language Index Terms 66
4. INTELLECTUAL INDEXING AND ABSTRACTING
4.1 General 71
5. USE OF THE TEXT REPRESENTATIONS 76
6. A NOTE ABOUT THE STORAGE OF TEXT REPRESENTATIONS 85
7. CHARACTERISTICS OF GOOD TEXT REPRESENTATIONS 86
8. CONCLUSIONS 89
PART II METHODS OF AUTOMATIC INDEXING AND ABSTRACTING 91
Chapter 4 AUTOMATIC INDEXING: THE SELECTION OF NATURAL LANGUAGE INDEX TERMS 93
1. INTRODUCTION 93
2. A NOTE ABOUT EVALUATION 94
3. LEXICAL ANALYSIS 94
4. USE OF A STOPLIST 96
5. STEMMING 97
6. THE SELECTION OF PHRASES 100
7. INDEX TERM WEIGHTING 105
8. ALTERNATIVE PROCEDURES FOR SELECTING INDEX TERMS 114
9. SELECTION OF NATURAL LANGUAGE INDEX TERMS: ACCOMPLISHMENTS AND PROBLEMS 117
10. CONCLUSIONS 118
Chapter 5 AUTOMATIC INDEXING: THE ASSIGNMENT OF CONTROLLED LANGUAGE INDEX TERMS 119
1. INTRODUCTION 119
2. A NOTE ABOUT EVALUATION 120
3. THESAURUS TERMS 122
4. SUBJECT AND CLASSIFICATION CODES 127
5. LEARNING APPROACHES TO TEXT CATEGORIZATION 131
6. ASSIGNMENT OF CONTROLLED LANGUAGE INDEX TERMS: ACCOMPLISHMENTS AND PROBLEMS 147
7. CONCLUSIONS 148
Chapter 6 AUTOMATIC ABSTRACTING: THE CREATION OF TEXT SUMMARIES 149
1. INTRODUCTION 149
2. A NOTE ABOUT EVALUATION 150
3. THE TEXT ANALYSIS STEP 152
4. THE TRANSFORMATION STEP
4.1 Selection and Generalization of the Content 164
5. GENERATION OF THE ABSTRACT 166
6. TEXT ABSTRACTING: ACCOMPLISHMENTS AND PROBLEMS 168
7. CONCLUSIONS 170
PART III APPLICATIONS 172
Chapter 7 TEXT STRUCTURING AND CATEGORIZATION WHEN SUMMARIZING LEGAL CASES 173
1. INTRODUCTION 173
2. TEXT CORPUS AND OUTPUT OF THE SYSTEM 174
3. METHODS: THE USE OF A TEXT GRAMMAR 177
4. RESULTS AND DISCUSSION 181
5. CONTRIBUTIONS OF THE RESEARCH 184
6. CONCLUSIONS 188
Chapter 8 CLUSTERING OF PARAGRAPHS WHEN SUMMARIZING LEGAL CASES 189
1. INTRODUCTION 189
2. TEXT CORPUS AND OUTPUT OF THE SYSTEM 190
3. METHODS: THE CLUSTERING TECHNIQUES 191
4. RESULTS AND DISCUSSION 197
5. CONTRIBUTIONS OF THE RESEARCH 204
6. CONCLUSIONS 206
Chapter 9 THE CREATION OF HIGHLIGHT ABSTRACTS OF MAGAZINE ARTICLES 207
1. INTRODUCTION 207
2. TEXT CORPUS AND OUTPUT OF THE SYSTEM 208
3. METHODS: THE USE OF A TEXT GRAMMAR 210
4. RESULTS AND DISCUSSION 217
5. CONTRIBUTIONS OF THE RESEARCH 220
6. CONCLUSIONS 221
Chapter 10 THE ASSIGNMENT OF SUBJECT DESCRIPTORS TO MAGAZINE ARTICLES 223
1. INTRODUCTION 223
2. TEXT CORPUS AND OUTPUT OF THE SYSTEM 224
3. METHODS: SUPERVISED LEARNING OF CLASSIFICATION PATTERNS 226
4. RESULTS AND DISCUSSION 233
5. CONTRIBUTIONS OF THE RESEARCH 240
6. CONCLUSIONS 241
SUMMARY AND FUTURE PROSPECTS 243
1. SUMMARY 243
2. FUTURE PROSPECTS 251
REFERENCES 253
SUBJECT INDEX 277

Publication date (per publisher) 27.12.2005
Language English
Subject areas Mathematics / Computer Science, Computer Science, Databases
Social Sciences, Communication / Media, Book Trade / Library Science
Technology
ISBN-10 0-306-47017-9 / 0306470179
ISBN-13 978-0-306-47017-2 / 9780306470172
PDF (watermark)
Size: 3.6 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to specialist books with columns, tables, and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
