jonm3D / h5xray

Helping Python developers understand the structure and 'cloud-friendliness' of an HDF5 file.

H5XRay

Helping Python developers understand the structure and 'cloud-friendliness' of an HDF5 file. To keep up with updates, star or watch this repository.

A weekend project inspired by the h5cloud project at the 2023 ICESat-2 Hackweek.

Don't want to deal with code? Try out the web app!

Installation

To get started with H5XRay, install it from GitHub with pip:

pip install git+https://github.com/jonm3d/h5xray.git

Updating to the Latest Version

pip install --upgrade git+https://github.com/jonm3d/h5xray.git

Usage

H5XRay provides visualizations and reports of an HDF5 file's structure. See the examples directory for more detailed usage, including reading files locally or from S3.

Example Visualizations

Default Plot

Options Plot

Example Report Contents

Report for data/atl03_4.h5:

  • Elapsed time (s): 1.373
  • Total datasets: 1020
  • Total requests: 1276.0
  • Request byte size: 2097152 bytes
  • Assumed cost per 1000 GET requests: $0.0004
  • Total cost for file: $0.0005104
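The cost figure in the summary above follows from the request count and the assumed price per 1000 GET requests. Here is a minimal sketch of that arithmetic. The function name `estimate_get_cost` is hypothetical and is not part of H5XRay's API; the numbers are taken from the report above.

```python
def estimate_get_cost(total_requests: int, cost_per_1000: float = 0.0004) -> float:
    """Estimate the GET-request cost (USD) of reading a file once.

    Hypothetical helper for illustration only; not H5XRay's own code.
    """
    return total_requests / 1000 * cost_per_1000


# Values from the example report: 1276 requests at $0.0004 per 1000 requests.
cost = estimate_get_cost(1276)
print(f"Total cost for file: ${cost:.7f}")  # Total cost for file: $0.0005104
```

The default of $0.0004 per 1000 requests is only the assumption stated in the report; substitute your provider's current pricing.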

Top 5 datasets with most requests:

  1. /gt3r/heights/lat_ph - 14.0 requests | Chunking: (10000,) | Number of Chunks: [616.0]
  2. /gt3r/heights/lon_ph - 14.0 requests | Chunking: (10000,) | Number of Chunks: [616.0]
  3. /gt1r/heights/lat_ph - 13.0 requests | Chunking: (10000,) | Number of Chunks: [583.0]
  4. /gt1r/heights/lon_ph - 13.0 requests | Chunking: (10000,) | Number of Chunks: [583.0]
  5. /gt1r/heights/h_ph - 11.0 requests | Chunking: (10000,) | Number of Chunks: [583.0]
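As a rough way to reason about per-dataset request counts, one can assume a dataset's stored bytes are read in fixed-size range requests, so the count is the stored size divided by the request size, rounded up. The sketch below illustrates only that assumption. H5XRay may count requests differently, and both the function name `estimate_requests` and the byte sizes used are hypothetical.

```python
REQUEST_BYTES = 2 * 1024 * 1024  # 2097152 bytes, the request size shown in the report


def estimate_requests(stored_bytes: int, request_bytes: int = REQUEST_BYTES) -> int:
    """Ceiling division: number of fixed-size requests needed to cover stored_bytes.

    Hypothetical helper; assumes the data is contiguous and read in full.
    """
    if stored_bytes <= 0:
        return 0
    return (stored_bytes + request_bytes - 1) // request_bytes


# Hypothetical dataset sizes, not measured from atl03_4.h5.
exact_fit = estimate_requests(28 * 1024 * 1024)      # exactly 14 full requests
one_over = estimate_requests(28 * 1024 * 1024 + 1)   # one extra byte needs a 15th request
print(exact_fit, one_over)  # 14 15
```

Under this assumption, a dataset split into many small chunks scattered through the file would need more requests than the estimate suggests, which is one reason chunk layout matters for cloud access.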

System Info:

  • OS: posix
  • Platform: Linux
  • Platform Version: #1 SMP Tue Feb 14 21:50:23 UTC 2023
  • Python Version: 3.10.12
  • Machine: x86_64
  • Processor: x86_64
  • Current Working Directory: /home/jovyan/h5xray
  • Host Name: jupyter-jonm3d
  • Number of CPUs: 4

Interactive Tree Plot

The package also includes an interactive tree plot for inspecting HDF5 files in a Jupyter notebook.

Made with ❤️ and ☕️ by:

Jonathan Markel
PhD Student
3D Geospatial Laboratory
The University of Texas at Austin
jonathanmarkel@gmail.com

Twitter | GitHub | Website | GoogleScholar | LinkedIn

This work was supported by NASA FINESST Award 80NSSC23K0272.


License:BSD 3-Clause "New" or "Revised" License

