Data Wrangling with R (eBook)

eBook Download: PDF
2016 | 1st ed. 2016
XII, 238 pages
Springer International Publishing (publisher)
978-3-319-45599-0 (ISBN)

Data Wrangling with R - Bradley C. Boehmke, Ph.D.
85.59 incl. VAT
  • Download available immediately

This guide for practicing statisticians, data scientists, and R users and programmers teaches the essentials of data preprocessing, using the R programming language to quickly turn noisy data into usable pieces of information. Data wrangling, also commonly called data munging, transformation, manipulation, or janitor work, can be a painstakingly laborious process. Roughly 80% of the time spent on data analysis goes to cleaning and preparing data. Because this step is a prerequisite to the rest of the data analysis workflow (visualization, analysis, reporting), it is essential to become fluent and efficient in data wrangling techniques.

This book will guide the user through the data wrangling process via a step-by-step tutorial approach and provide a solid foundation for working with data in R. The author's goal is to teach the user how to easily wrangle data in order to spend more time on understanding the content of the data. By the end of the book, the user will have learned: 

  • How to work with different types of data such as numerics, characters, regular expressions, factors, and dates
  • The difference between different data structures and how to create, add additional components to, and subset each data structure
  • How to acquire and parse data from locations previously inaccessible
  • How to develop functions and use loop control structures to reduce code redundancy
  • How to use pipe operators to simplify code and make it more readable
  • How to reshape the layout of data and manipulate, summarize, and join data sets
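As a rough illustration of the kinds of tasks listed above, the following is a minimal sketch in base R. It is not taken from the book: the `sales` data frame, its values, and the helper `mean_no_na` are hypothetical and invented for this example. The book's pipe chapter covers `%>%`; the sketch instead assumes R 4.1 or later and uses the native `|>` pipe so that no packages are needed.

```r
# Hypothetical toy data, invented for illustration only.
sales <- data.frame(
  region = c("north", "south", "north", "south"),
  date   = c("2016-01-15", "2016-01-20", "2016-02-03", "2016-02-11"),
  amount = c(120.5, NA, 98.0, 143.25),
  stringsAsFactors = FALSE
)

# Data types: convert character columns to a factor and to Date objects.
sales$region <- factor(sales$region)
sales$date   <- as.Date(sales$date)

# A small function (hypothetical helper) so the NA handling is written once.
mean_no_na <- function(x) mean(x, na.rm = TRUE)

# Pipe: chain steps left to right; here, drop rows with a missing amount.
# (Native pipe |> requires R >= 4.1.)
clean <- sales |> subset(!is.na(amount))

# Summarize: mean amount per region, computed on the cleaned rows.
by_region <- tapply(clean$amount, clean$region, mean_no_na)

# Loop-like function: report the class of every column.
col_classes <- sapply(sales, class)
```

Inspecting `by_region` and `col_classes` at the console shows the grouped means and the converted column types. With dplyr installed, the same grouping step could presumably be written with `group_by()` and `summarise()`.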



Brad Boehmke, Ph.D., is an Operations Research Analyst at Headquarters Air Force Materiel Command, Studies and Analyses Division.  He is also Assistant Professor in the Operational Sciences Department at the Air Force Institute of Technology.  Dr. Boehmke's research interests are in the areas of cost analysis, economic modeling, decision analysis, and developing applied modeling applications through the R statistical language.

1. Preface
2. Introduction
  a. The Role of Data Wrangling
    i. Introduction to R
      1. Open Source
      2. Flexibility
      3. Community
    ii. R Basics
      1. Assignment & Evaluation
      2. Vectorization
      3. Getting help
      4. Workspace
      5. Working with packages
      6. Style guide
3. Working with Different Types of Data in R
  a. Dealing with Numbers
    i. Integer vs. Double
    ii. Generating sequence of non-random numbers
    iii. Generating sequence of random numbers
    iv. Setting the seed for reproducible random numbers
    v. Comparing numeric values
    vi. Rounding numbers
  b. Dealing with Character Strings
    i. Character string basics
    ii. String manipulation with base R
    iii. String manipulation with stringr
    iv. Set operations for character strings
  c. Dealing with Regular Expressions
    i. Regex Syntax
    ii. Regex Functions
    iii. Additional resources
  d. Dealing with Factors
    i. Creating, converting & inspecting factors
    ii. Ordering levels
    iii. Revalue levels
    iv. Dropping levels
  e. Dealing with Dates
    i. Getting current date & time
    ii. Converting strings to dates
    iii. Extract & manipulate parts of dates
    iv. Creating date sequences
    v. Calculations with dates
    vi. Dealing with time zones & daylight savings
    vii. Additional resources
4. Managing Data Structures in R
  a. Data Structure Basics
    i. Identifying the Structure
    ii. Attributes
  b. Managing Vectors
    i. Creating
    ii. Adding on to
    iii. Adding attributes
    iv. Subsetting
  c. Managing Lists
    i. Creating
    ii. Adding on to
    iii. Adding attributes
    iv. Subsetting
  d. Managing Matrices
    i. Creating
    ii. Adding on to
    iii. Adding attributes
    iv. Subsetting
  e. Managing Data Frames
    i. Creating
    ii. Adding on to
    iii. Adding attributes
    iv. Subsetting
  f. Dealing with Missing Values
    i. Testing for missing values
    ii. Recoding missing values
    iii. Excluding missing values
5. Importing, Scraping, and Exporting Data with R
  a. Importing Data
    i. Reading data from text files
    ii. Reading data from Excel files
    iii. Load data from saved R object file
    iv. Additional resources
  b. Scraping Data
    i. Importing tabular and Excel files stored online
    ii. Scraping HTML text
    iii. Scraping HTML table data
    iv. Working with APIs
    v. Additional Resources
  c. Exporting Data
    i. Writing data to text files
    ii. Writing data to Excel files
    iii. Saving data as an R object file
    iv. Additional resources
6. Creating Efficient & Readable Code in R
  a. Functions
    i. Function Components
    ii. Arguments
    iii. Scoping Rules
    iv. Lazy Evaluation
    v. Returning Multiple Outputs from a Function
    vi. Dealing with Invalid Parameters
    vii. Saving and Sourcing Functions
    viii. Additional Resources
  b. Loop Control Statements
    i. Basic control statements (i.e. if, for, while, etc.)
    ii. Apply family
    iii. Other useful "loop-like" functions
    iv. Additional Resources
  c. Simplify Your Code with %>%
    i. Pipe (%>%) Operator
    ii. Additional Functions
    iii. Additional Pipe Operators
    iv. Additional Resources
7. Shaping & Transforming Your Data with R
  a. Reshaping Your Data with tidyr
    i. Making wide data long
    ii. Making long data wide
    iii. Splitting a single column into multiple columns
    iv. Combining multiple columns into a single column
    v. Additional tidyr functions
    vi. Sequencing your tidyr operations
    vii. Additional resources
  b. Transforming Your Data with dplyr
    i. Selecting variables of interest
    ii. Filtering rows
    iii. Grouping data by categorical variables
    iv. Performing summary statistics on variables
    v. Arranging variables by value
    vi. Joining datasets
    vii. Creating new variables
    viii. Additional resources

Publication date (per publisher): 17 November 2016
Series: Use R!
Additional info: XII, 238 p. 24 illus., 10 illus. in color.
Place of publication: Cham
Language: English
Subject areas: Mathematics / Computer Science > Computer Science > Databases
Mathematics / Computer Science > Computer Science > Graphics / Design
Mathematics / Computer Science > Mathematics > Statistics
Mathematics / Computer Science > Mathematics > Probability / Combinatorics
Technology
Business / Economics
Keywords: Coding • curl/rvest • Data Analysis • data frames • Data Matrix • data structures • data wrangling • dplyr • Exporting • fuzzy string • importing • lubridate • PCRE • plyr • programming • R • Scraping • stringr • tidyr • xml2
ISBN-10: 3-319-45599-0 / 3319455990
ISBN-13: 978-3-319-45599-0 / 9783319455990
PDF (watermarked)
Size: 7.3 MB

DRM: Digital watermark
This eBook contains a digital watermark and is therefore personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device, but is of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
