Statistical Methods for Imbalanced Data in Ecological and Biological Studies - Osamu Komori, Shinto Eguchi

Statistical Methods for Imbalanced Data in Ecological and Biological Studies (eBook)

eBook Download: PDF
2019 | 1st ed. 2019
VIII, 59 Seiten
Springer Tokyo (Verlag)
978-4-431-55570-4 (ISBN)
Systemvoraussetzungen
53,49 inkl. MwSt
  • Download sofort lieferbar
  • Zahlungsarten anzeigen
This book presents a fresh, new approach in that it provides a comprehensive recent review of challenging problems caused by imbalanced data in prediction and classification, and also in that it introduces several of the latest statistical methods of dealing with these problems. The book discusses the property of the imbalance of data from two points of view. The first is quantitative imbalance, meaning that the sample size in one population highly outnumbers that in another population. It includes presence-only data as an extreme case, where the presence of a species is confirmed, whereas the information on its absence is uncertain, which is especially common in ecology in predicting habitat distribution. The second is qualitative imbalance, meaning that the data distribution of one population can be well specified whereas that of the other one shows a highly heterogeneous property. A typical case is the existence of outliers commonly observed in gene expression data, and another is heterogeneous characteristics often observed in a case group in case-control studies. The extension of the logistic regression model, maxent, and AdaBoost for imbalanced data is discussed, providing a new framework for improvement of prediction, classification, and performance of variable selection. Weights functions introduced in the methods play an important role in alleviating the imbalance of data. This book also furnishes a new perspective on these problem and shows some applications of the recently developed statistical methods to real data sets.


Osamu Komori, The Institute of Statistical Mathematics, 
Shinto Eguchi, The Institute of Statistical Mathematics

This book presents a fresh, new approach in that it provides a comprehensive recent review of challenging problems caused by imbalanced data in prediction and classification, and also in that it introduces several of the latest statistical methods of dealing with these problems. The book discusses the property of the imbalance of data from two points of view. The first is quantitative imbalance, meaning that the sample size in one population highly outnumbers that in another population. It includes presence-only data as an extreme case, where the presence of a species is confirmed, whereas the information on its absence is uncertain, which is especially common in ecology in predicting habitat distribution. The second is qualitative imbalance, meaning that the data distribution of one population can be well specified whereas that of the other one shows a highly heterogeneous property. A typical case is the existence of outliers commonly observed in gene expression data, and another is heterogeneous characteristics often observed in a case group in case-control studies. The extension of the logistic regression model, maxent, and AdaBoost for imbalanced data is discussed, providing a new framework for improvement of prediction, classification, and performance of variable selection. Weights functions introduced in the methods play an important role in alleviating the imbalance of data. This book also furnishes a new perspective on these problem and shows some applications of the recently developed statistical methods to real data sets.
Erscheint lt. Verlag 2.7.2019
Reihe/Serie JSS Research Series in Statistics
JSS Research Series in Statistics
SpringerBriefs in Statistics
SpringerBriefs in Statistics
Zusatzinfo VIII, 59 p. 22 illus., 7 illus. in color.
Sprache englisch
Themenwelt Mathematik / Informatik Mathematik Statistik
Mathematik / Informatik Mathematik Wahrscheinlichkeit / Kombinatorik
Medizin / Pharmazie Allgemeines / Lexika
Naturwissenschaften Biologie
Sozialwissenschaften Soziologie Empirische Sozialforschung
Schlagworte Divergence and Entropy • Generalized Linear Model • Imbalanced Data • machine learning methods • Maxent
ISBN-10 4-431-55570-6 / 4431555706
ISBN-13 978-4-431-55570-4 / 9784431555704
Haben Sie eine Frage zum Produkt?
PDFPDF (Wasserzeichen)
Größe: 1,5 MB

DRM: Digitales Wasserzeichen
Dieses eBook enthält ein digitales Wasser­zeichen und ist damit für Sie persona­lisiert. Bei einer missbräuch­lichen Weiter­gabe des eBooks an Dritte ist eine Rück­ver­folgung an die Quelle möglich.

Dateiformat: PDF (Portable Document Format)
Mit einem festen Seiten­layout eignet sich die PDF besonders für Fach­bücher mit Spalten, Tabellen und Abbild­ungen. Eine PDF kann auf fast allen Geräten ange­zeigt werden, ist aber für kleine Displays (Smart­phone, eReader) nur einge­schränkt geeignet.

Systemvoraussetzungen:
PC/Mac: Mit einem PC oder Mac können Sie dieses eBook lesen. Sie benötigen dafür einen PDF-Viewer - z.B. den Adobe Reader oder Adobe Digital Editions.
eReader: Dieses eBook kann mit (fast) allen eBook-Readern gelesen werden. Mit dem amazon-Kindle ist es aber nicht kompatibel.
Smartphone/Tablet: Egal ob Apple oder Android, dieses eBook können Sie lesen. Sie benötigen dafür einen PDF-Viewer - z.B. die kostenlose Adobe Digital Editions-App.

Buying eBooks from abroad
For tax law reasons we can sell eBooks just within Germany and Switzerland. Regrettably we cannot fulfill eBook-orders from other countries.

Mehr entdecken
aus dem Bereich