Python and HDF5 - Andrew Collette

Python and HDF5

Unlocking Scientific Data

Andrew Collette (Author)

Book | Softcover
142 pages
2013
O'Reilly Media (publisher)
978-1-4493-6783-1 (ISBN)
26,90 incl. VAT
Gain hands-on experience with HDF5 for storing scientific data in Python. This practical guide quickly gets you up to speed on the details, best practices, and pitfalls of using HDF5 to archive and share numerical datasets ranging in size from gigabytes to terabytes.

Through real-world examples and practical exercises, you’ll explore topics such as scientific datasets, hierarchically organized groups, user-defined metadata, and interoperable files. Examples are applicable for users of both Python 2 and Python 3. If you’re familiar with the basics of Python data analysis, this is an ideal introduction to HDF5.
  • Get set up with HDF5 tools and create your first HDF5 file
  • Work with datasets by learning the HDF5 Dataset object
  • Understand advanced features like dataset chunking and compression
  • Learn how to work with HDF5’s hierarchical structure, using groups
  • Create self-describing files by adding metadata with HDF5 attributes
  • Take advantage of HDF5’s type system to create interoperable files
  • Express relationships among data with references, named types, and dimension scales
  • Discover how Python mechanisms for writing parallel code interact with HDF5

Andrew Collette holds a Ph.D. in physics from UCLA, and works as a laboratory research scientist at the University of Colorado. He has worked with the Python-NumPy-HDF5 stack at two multimillion-dollar research facilities; the first being the Large Plasma Device at UCLA (entirely standardized on HDF5), and the second being the hypervelocity dust accelerator at the Colorado Center for Lunar Dust and Atmospheric Studies, University of Colorado at Boulder. Additionally, Dr. Collette is a leading developer of the HDF5 for Python (h5py) project.

Chapter 1 Introduction
Python and HDF5
What Exactly Is HDF5?
Chapter 2 Getting Started
HDF5 Basics
Setting Up
The HDF5 Tools
Your First HDF5 File
Chapter 3 Working with Datasets
Dataset Basics
Reading and Writing Data
Resizing Datasets
Chapter 4 How Chunking and Compression Can Help You
Contiguous Storage
Chunked Storage
Setting the Chunk Shape
Performance Example: Resizable Datasets
Filters and Compression
Other Filters
Third-Party Filters
Chapter 5 Groups, Links, and Iteration: The "H" in HDF5
The Root Group and Subgroups
Group Basics
Working with Links
Iteration and Containership
Multilevel Iteration with the Visitor Pattern
Copying Objects
Object Comparison and Hashing
Chapter 6 Storing Metadata with Attributes
Attribute Basics
Real-World Example: Accelerator Particle Database
Chapter 7 More About Types
The HDF5 Type System
Integers and Floats
Fixed-Length Strings
Variable-Length Strings
Compound Types
Complex Numbers
Enumerated Types
Booleans
The array Type
Opaque Types
Dates and Times
Chapter 8 Organizing Data with References, Types, and Dimension Scales
Object References
Region References
Named Types
Dimension Scales
Chapter 9 Concurrency: Parallel HDF5, Threading, and Multiprocessing
Python Parallel Basics
Threading
Multiprocessing
MPI and Parallel HDF5
Chapter 10 Next Steps
Asking for Help
Contributing
Index
Colophon

Publication date (per publisher): 10 December 2013
Additional info: illustrations
Place of publication: Sebastopol
Language: English
Dimensions: 178 x 233 mm
Weight: 259 g
Binding: paperback
Subject areas: Computer Science > Databases > Data Warehouse / Data Mining
Computer Science > Programming Languages / Tools > Python
Mathematics / Computer Science > Computer Science > Web / Internet
ISBN-10: 1-4493-6783-6 / 1449367836
ISBN-13: 978-1-4493-6783-1 / 9781449367831
Condition: new