Python for Natural Language Processing - Pierre M. Nugues

Python for Natural Language Processing

Programming with NumPy, scikit-learn, Keras, and PyTorch
Book | Hardcover
XXV, 520 pages
2024 | 3rd edition 2024
Springer International Publishing (publisher)
978-3-031-57548-8 (ISBN)
64,19 incl. VAT

Since the last edition of this book (2014), progress has been astonishing in all areas of Natural Language Processing, with recent achievements in Text Generation spurring media interest well beyond traditional academic circles. Text Processing has meanwhile become a mainstream industrial tool that is used, to various extents, by countless companies. A revision of this book was therefore deemed necessary to catch up with the recent breakthroughs, and the author discusses the models and architectures that have been instrumental in this progress.

As in the first two editions, the intention is to expose the reader to the theories used in Natural Language Processing, and to programming examples that are essential for a deep understanding of the concepts. Although present in the previous two editions, Machine Learning is now even more pervasive, having replaced many of the earlier techniques for processing text. Many new techniques build on the availability of text.

Using Python notebooks, the reader will be able to load small corpora, format text, apply the models by executing pieces of code, gradually discover the theoretical parts by modifying the code or the parameters, and move between theories and concrete problems through constant interaction between the user and the machine. The data sizes and hardware requirements are kept to a reasonable minimum so that a user can see the results of most experiments instantly, or at least quickly, on most machines.

The book does not assume a deep knowledge of Python. Ch. 2 gives an introduction to the language aimed at Text Processing, which will enable the reader to touch on all the programming concepts, including NumPy arrays and PyTorch tensors as fundamental structures to represent and process numerical data in Python, or Keras for training Neural Networks to classify texts. Covering topics like Word Segmentation and Part-of-Speech and Sequence Annotation, the textbook also gives an in-depth overview of Transformers (for instance, BERT), Self-Attention and Sequence-to-Sequence Architectures.

Pierre Nugues is a professor in the Dept. of Computer Science of Lund University. His research is focused on natural language processing for advanced user interfaces and spoken dialogue. This includes the design and the implementation of conversational agents within a multimodal framework and text visualization. He led the team that designed a navigation agent - Ulysse - that enables a user to navigate in a virtual reality environment using language, and the team that designed the CarSim system that generates animated 3D scenes from written texts. He has taught natural-language processing and computational linguistics at the following institutions: ISMRA, Caen, France; University of Nottingham, UK; Staffordshire University, UK; FH Konstanz, Germany; Lund University, Sweden and Ghent University, Belgium.

Preface to the third edition.- Preface to the second edition.- Preface to the first edition.- 1. An Overview of Language Processing.- 2. A Tour of Python.- 3. Corpus Processing Tools.- 4. Encoding and Annotation Scheme.- 5. Python for Numerical Computations.- 6. Topics in Information Theory and Machine Learning.- 7. Linear and Logistic Regression.- 8. Neural Networks.- 9. Counting and Indexing Words.- 10. Dense Vector Representations.- 11. Word Sequences.- 12. Words, Parts of Speech, and Morphology.- 13. Subword Segmentation.- 14. Part-of-Speech and Sequence Annotation.- 15. Self-Attention and Transformers.- 16. Pretraining an Encoder: The BERT Language Model.- 17. Sequence-to-Sequence Architectures: Encoder-Decoders and Decoders.- Index.- References.

Publication date:
Series: Cognitive Technologies
Additional info: XXV, 520 p. 89 illus., 53 illus. in color.
Place of publication: Cham
Language: English
Dimensions: 155 x 235 mm
Subject area: Mathematics / Computer Science > Computer Science
Keywords: Annotation schemes • BERT • Information Theory • Keras • machine learning • Machine Translation • Named Entity Recognition • Natural Language Processing • Neural networks • NumPy • Part-of-speech Tagging • Python • PyTorch • scikit-learn • Text Segmentation • Tokenization • Transformer • word2vec • Word Embeddings
ISBN-10: 3-031-57548-2 / 3031575482
ISBN-13: 978-3-031-57548-8 / 9783031575488
Condition: New