Python and HDF5. Unlocking Scientific Data - Helion
ISBN: 978-14-919-4500-1
Pages: 152, Format: ebook
Publication date: 2013-10-21
Bookstore: Helion
Price: 92.65 zł (previously: 107.73 zł)
You save: 14% (-15.08 zł)
Gain hands-on experience with HDF5 for storing scientific data in Python. This practical guide quickly gets you up to speed on the details, best practices, and pitfalls of using HDF5 to archive and share numerical datasets ranging in size from gigabytes to terabytes.
Through real-world examples and practical exercises, you’ll explore topics such as scientific datasets, hierarchically organized groups, user-defined metadata, and interoperable files. Examples are applicable for users of both Python 2 and Python 3. If you’re familiar with the basics of Python data analysis, this is an ideal introduction to HDF5.
- Get set up with HDF5 tools and create your first HDF5 file
- Work with datasets by learning the HDF5 Dataset object
- Understand advanced features like dataset chunking and compression
- Learn how to work with HDF5’s hierarchical structure, using groups
- Create self-describing files by adding metadata with HDF5 attributes
- Take advantage of HDF5’s type system to create interoperable files
- Express relationships among data with references, named types, and dimension scales
- Discover how Python mechanisms for writing parallel code interact with HDF5
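As a rough illustration of several items in the list above (this is not an excerpt from the book), the sketch below assumes the h5py and NumPy packages are installed. The group name, dataset name, and attribute are hypothetical, chosen only for the example.

```python
import numpy as np
import h5py

# Assumption: h5py and NumPy are available. The "core" driver with
# backing_store=False keeps the file in memory, so nothing is written to disk.
with h5py.File("sketch.h5", "w", driver="core", backing_store=False) as f:
    # Groups give the file its hierarchical structure.
    grp = f.create_group("experiment")

    # A chunked, GZIP-compressed dataset created from a NumPy array.
    data = np.arange(1000, dtype="float64").reshape(100, 10)
    dset = grp.create_dataset(
        "temperature", data=data, chunks=(10, 10), compression="gzip"
    )

    # Attributes attach metadata, making the file self-describing.
    dset.attrs["units"] = "K"

    # Slicing reads only the requested part of the dataset.
    first_row = dset[0, :]
    units = dset.attrs["units"]
    names = list(f["experiment"].keys())
    chunks = dset.chunks
    compression = dset.compression
```

Replacing the in-memory driver arguments with a plain `h5py.File("sketch.h5", "w")` would write an ordinary file to disk instead.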
Table of contents
- Python and HDF5
- Preface
- Conventions Used in This Book
- Using Code Examples
- Safari Books Online
- How to Contact Us
- Acknowledgments
- 1. Introduction
- Python and HDF5
- Organizing Data and Metadata
- Coping with Large Data Volumes
- What Exactly Is HDF5?
- HDF5: The File
- HDF5: The Library
- HDF5: The Ecosystem
- 2. Getting Started
- HDF5 Basics
- Setting Up
- Python 2 or Python 3?
- Code Examples
- NumPy
- HDF5 and h5py
- IPython
- Timing and Optimization
- The HDF5 Tools
- HDFView
- ViTables
- Command Line Tools
- Your First HDF5 File
- Use as a Context Manager
- File Drivers
- core driver
- family driver
- mpio driver
- The User Block
- 3. Working with Datasets
- Dataset Basics
- Type and Shape
- Reading and Writing
- Creating Empty Datasets
- Saving Space with Explicit Storage Types
- Automatic Type Conversion and Direct Reads
- Reading with astype
- Reshaping an Existing Array
- Fill Values
- Reading and Writing Data
- Using Slicing Effectively
- Start-Stop-Step Indexing
- Multidimensional and Scalar Slicing
- Boolean Indexing
- Coordinate Lists
- Automatic Broadcasting
- Reading Directly into an Existing Array
- A Note on Data Types
- Resizing Datasets
- Creating Resizable Datasets
- Data Shuffling with resize
- When and How to Use resize
- 4. How Chunking and Compression Can Help You
- Contiguous Storage
- Chunked Storage
- Setting the Chunk Shape
- Auto-Chunking
- Manually Picking a Shape
- Performance Example: Resizable Datasets
- Filters and Compression
- The Filter Pipeline
- Compression Filters
- GZIP/DEFLATE Compression
- SZIP Compression
- LZF Compression
- Performance
- Other Filters
- SHUFFLE Filter
- FLETCHER32 Filter
- Third-Party Filters
- 5. Groups, Links, and Iteration: The H in HDF5
- The Root Group and Subgroups
- Group Basics
- Dictionary-Style Access
- Special Properties
- Working with Links
- Hard Links
- Free Space and Repacking
- Soft Links
- External Links
- A Note on Object Names
- Using get to Determine Object Types
- Using require to Simplify Your Application
- Iteration and Containership
- How Groups Are Actually Stored
- Dictionary-Style Iteration
- Containership Testing
- Multilevel Iteration with the Visitor Pattern
- Visit by Name
- Multiple Links and visit
- Visiting Items
- Canceling Iteration: A Simple Search Mechanism
- Copying Objects
- Single-File Copying
- Object Comparison and Hashing
- 6. Storing Metadata with Attributes
- Attribute Basics
- Type Guessing
- Strings and File Compatibility
- Python Objects
- Explicit Typing
- Real-World Example: Accelerator Particle Database
- Application Format on Top of HDF5
- Analyzing the Data
- 7. More About Types
- The HDF5 Type System
- Integers and Floats
- Fixed-Length Strings
- Variable-Length Strings
- The vlen String Data Type
- Working with vlen String Datasets
- Byte Versus Unicode Strings
- Using Unicode Strings
- Don’t Store Binary Data in Strings!
- Future-Proofing Your Python 2 Application
- Compound Types
- Complex Numbers
- Enumerated Types
- Booleans
- The array Type
- Opaque Types
- Dates and Times
- 8. Organizing Data with References, Types, and Dimension Scales
- Object References
- Creating and Resolving References
- References as Unbreakable Links
- References as Data
- Region References
- Creating Region References and Reading
- Fancy Indexing
- Finding Datasets with Region References
- Named Types
- The Datatype Object
- Linking to Named Types
- Managing Named Types
- Dimension Scales
- Creating Dimension Scales
- Attaching Scales to a Dataset
- 9. Concurrency: Parallel HDF5, Threading, and Multiprocessing
- Python Parallel Basics
- Threading
- Multiprocessing
- MPI and Parallel HDF5
- A Very Quick Introduction to MPI
- MPI-Based HDF5 Program
- Collective Versus Independent Operations
- Atomicity Gotchas
- 10. Next Steps
- Asking for Help
- Contributing
- Index
- About the Author
- Colophon
- Copyright