Designing and Evaluating Language Corpora - Jesse Egbert, Douglas Biber, Bethany Gray

A Practical Framework for Corpus Representativeness
Book | Hardcover
250 pages
2022
Cambridge University Press (publisher)
978-1-107-15138-3 (ISBN)
118.45 incl. VAT
The use of language corpora, or large samples of natural texts, has become ubiquitous in linguistic research. Yet, there are no conceptual or methodological frameworks for corpus representativeness. This book is the first to provide the field of linguistics with a comprehensive framework for corpus design, evaluation, and representativeness.
Corpora are ubiquitous in linguistic research, yet to date, there has been no consensus on how to conceptualize corpus representativeness and collect corpus samples. This pioneering book bridges this gap by introducing a conceptual and methodological framework for corpus design and representativeness. Written by experts in the field, it shows how corpora can be designed and built in a way that is both optimally suited to specific research agendas, and adequately representative of the types of language use in question. It considers questions such as 'what types of texts should be included in the corpus?', and 'how many texts are required?' – highlighting that the degree of representativeness rests on the dual pillars of domain considerations and distribution considerations. The authors introduce, explain, and illustrate all aspects of this corpus representativeness framework in a step-by-step fashion, using examples and activities to help readers develop practical skills in corpus design and evaluation.

Jesse Egbert is Associate Professor of Applied Linguistics at Northern Arizona University. He is a co-founding General Editor of Register Studies, and his recent books focus on online register variation (2018), methodological triangulation (2016, 2020), and corpus linguistics methods (2020). Douglas Biber is Regents' Professor of Applied Linguistics at Northern Arizona University. Previous books include Register, Genre, and Style (2009/2019), Grammar of Spoken and Written English (2021), and studies of register variation (1988, 1995, 2018). Bethany Gray is Associate Professor of Applied Linguistics and Technology at Iowa State University. Her publications include monographs on academic research articles (2015) and historical change in writing (2016). She is a co-founding General Editor of Register Studies.

1. Introduction; 2. Approaches to representativeness in previous corpus linguistic research; 3. Corpus representativeness: a conceptual and methodological framework; 4. Domain considerations; 5. Distribution considerations; 6. The influence of domain and distribution considerations on corpus representativeness – bringing it all together; 7. Corpus design and representativeness in practice; Glossary; Appendix A. Example articles documenting existing corpora; Appendix B. Survey of corpus design and compilation practices.

Publication date:
Place of publication: Cambridge
Language: English
Dimensions: 158 x 235 mm
Weight: 570 g
Subject area: Humanities > Linguistics / Literary Studies > Linguistics
Computer Science > Databases > Data Warehouse / Data Mining
ISBN-10: 1-107-15138-4 / 1107151384
ISBN-13: 978-1-107-15138-3 / 9781107151383
Condition: New