From Extractive to Abstractive Summarization: A Journey - Parth Mehta, Prasenjit Majumder

From Extractive to Abstractive Summarization: A Journey (eBook)

eBook Download: PDF
2019 | 1st ed. 2019
XI, 116 pages
Springer Singapore (publisher)
978-981-13-8934-4 (ISBN)
96.29 incl. VAT
  • Download available immediately
This book describes recent advances in text summarization, identifies remaining gaps and challenges, and proposes ways to overcome them. It begins with one of the most frequently discussed topics in text summarization, 'sentence extraction', examines the effectiveness of current techniques in domain-specific text summarization, and proposes several improvements.
The book then turns to the application of summarization in the legal and scientific domains, introducing two new corpora that consist of more than 100,000 court judgments and more than 20,000 scientific articles, with the corresponding manually written summaries. The availability of these large-scale corpora opens up the possibility of using the now popular data-driven approaches based on deep learning. The book then highlights the effectiveness of neural sentence extraction approaches, which perform just as well as rule-based approaches, but without the need for any manual annotation. As a next step, multiple techniques for creating ensembles of sentence extractors, which deliver better and more robust summaries, are proposed. In closing, the book presents a neural network-based model for sentence compression. Overall, the book takes readers on a journey that begins with simple sentence extraction and ends in abstractive summarization, while also covering key topics like ensemble techniques and domain-specific summarization, which have not previously been explored in detail.


Dr. Parth Mehta completed his M.Tech. in Machine Intelligence and his Ph.D. in Text Summarization at Dhirubhai Ambani Institute of ICT (DA-IICT), Gandhinagar, India. At DA-IICT, he was part of the Information Retrieval and Natural Language Processing Lab. He was also involved in the national project 'Cross Lingual Information Access', funded by the Govt. of India, which focused on building a cross-lingual search engine for nine Indian languages.
Dr. Mehta has served as a reviewer for the journals Information Processing and Management and Forum for Information Retrieval Evaluation. Apart from several journal and conference papers, he has also co-edited a book on text processing published by Springer.
Prof. Prasenjit Majumder is an Associate Professor at Dhirubhai Ambani Institute of ICT (DA-IICT), Gandhinagar, and a Visiting Professor at the Indian Institute of Information Technology, Vadodara (IIIT-V). Prof. Majumder completed his Ph.D. at Jadavpur University in 2008 and worked as a postdoctoral fellow at University College Dublin before joining DA-IICT, where he currently heads the Information Retrieval and Language Processing Lab. His research interests lie at the intersection of Information Retrieval, Cognitive Science and Human-Computer Interaction. He has headed several projects sponsored by the Govt. of India.
He is one of the pioneers of the Forum for Information Retrieval Evaluation (FIRE), which assesses research on Information Retrieval and related areas for South Asian languages. Since being founded in 2008, FIRE has grown to become a respected conference, drawing participants from across the globe. Prof. Majumder has authored several journal and conference papers, and co-edited two special issues of Transactions in Information Systems (ACM). He has co-edited two books: 'Multi Lingual Information Access in South Asian Languages' and 'Text Processing,' both published by Springer.

Publication date (per publisher) 13 August 2019
Additional info XI, 116 p. 470 illus., 9 illus. in color.
Language English
Subject areas Mathematics / Computer Science > Computer Science > Databases
Mathematics / Computer Science > Computer Science > Networks
Mathematics / Computer Science > Computer Science > Theory / Studies
Computer Science > Further Topics > Hardware
Keywords Automatic Text Summarization • Ensemble Techniques • Legal document summarization • Neural summarization • Scientific document summarization
ISBN-10 981-13-8934-9 / 9811389349
ISBN-13 978-981-13-8934-4 / 9789811389344
PDF (watermark)
Size: 2.3 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for specialist books with columns, tables and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
