gapost / hdf5oct

A library for reading and writing hdf5 files with GNU Octave

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

hdf5oct - a HDF5 wrapper for GNU Octave

This is a GNU Octave package for data serialization to/from HDF5 files. It provides an interface compatible to MATLAB's "High-Level Functions for HDF5 files".

The following functions are implemented:

- h5create
- h5write
- h5writeatt
- h5read
- h5readatt
- h5info
- h5disp 

The functions can be used to export/import multidimensional array data of class

'double','single','uint64','int64', ... 'uint8', 'int8', 'string'

Getting started

The following short OCTAVE code snippets show how to use the package:

>> pkg load hdf5oct % load the package
>> data = uint32(reshape(1:10,2,5)) % create data in workspace
data =

   1   3   5   7   9
   2   4   6   8  10

% create a HDF5 file with a dataset named 'D1' of matching size and datatype
>> h5create('test.h5','/D1',size(data),'datatype','uint32')
% store the data to the dataset
>> h5write('test.h5','/D1',data)
% read back the values
>> h5read('test.h5','/D1')
ans =

   1   3   5   7   9
   2   4   6   8  10

% read the 3rd & 6th columns as a [2x2] matrix 
>> h5read('test.h5','/D1',[1 3],[2 2],[1 2])
ans =

   5   9
   6  10

Strings are always UTF8 encoded. A single string is written to a scalar dataset (size=1). Multiple strings must be passed as a cell array:

>> oneliner = "This is a single string";
>> h5create('test.h5','/D2',1,'datatype','string') % scalar string dataset
>> h5write('test.h5','/D2',oneliner)
>> h5read('test.h5','/D2')
ans = This is a single string
>> str = {"one", "δύο", "три", "neljä"};
>> h5create('test.h5','/D3',size(str),'datatype','string')
>> h5write('test.h5','/D3',str)
>> h5read('test.h5','/D3')
ans =
{
  [1,1] = one
  [1,2] = δύο
  [1,3] = три
  [1,4] = neljä
}

The structure of the HDF5 file can be viewed with h5disp. h5info can also be used for more detail.

>> h5disp('test.h5')
HDF5 test.h5
Group '/'
  Dataset '/D1'
    Extent: Simple
    Size: 2x5
    MaxSize: 2x5
    Datatype
      Class: 'int'
      OctaveClass: 'uint32'
      Size: 4
      Sign: unsigned
  Dataset '/D2'
    Extent: Scalar
    Datatype
      Class: 'string'
      OctaveClass: 'string'
      Size: H5T_VARIABLE
      charSet: utf8
      Pading: nullterm
  Dataset '/D3'
    Extent: Simple
    Size: 1x4
    MaxSize: 1x4
    Datatype
      Class: 'string'
      OctaveClass: 'string'
      Size: H5T_VARIABLE
      charSet: utf8
      Pading: nullterm

Array storage layout convention

In HDF5, arrays are stored in C-style, row-major order. On the other hand, OCTAVE and MATLAB employ fortran-style, column-major storage order.

To avoid array transposition operations during data serialization, MATLAB employs the following convention, which is also followed by hdf5oct:

Storage Type MATLAB/OCTAVE array size HDF5 DataSpace dimensions
Matrix $[N \times M]$ $[M \times N]$
Multidimensional Array $[N_1 \times N_2 \times ... \times N_m]$ $[N_m \times N_{m-1} \times ... \times N_1]$
Row Vector $[1 \times N]$ $[N \times 1]$ or $[N]$
Column Vector $[N \times 1]$ $[1 \times N]$

In this manner, the data is copied "as-is" from memory to disk, minimizing overhead and memory allocations.

A MATLAB user or an OCTAVE user employing hdf5oct to export/import data to/from HDF5 files will not notice any difference. However, when a MATLAB- or hdf5oct-generated file is opened by another application, or vice-versa, the arrays will appear transposed.

Installation

To install the latest snapshot run the following in OCTAVE

    pkg install "https://github.com/gapost/hdf5oct/archive/master.zip"

After successful installation, test the package with

    pkg test hdf5oct

This performs a number of basic tests on all functions in the package.

TODO

  • support compression flags for h5create

  • h5read: implement MATLAB compatible mapping to OCTAVE of the remaining HDF5 datatypes: Bitfield, Opaque, Reference, Enum, Compound, Array

  • write more comprehensive tests instead of a few random choices. Also test for error conditions.

License

© 2012, Tom Mullins
© 2015, Tom Mullins, Anton Starikov, Thorsten Liebig, Stefan Großhauser
© 2008-2013, Andrew Collette
© 2024, George Apostolopoulos

HighFive is used internally as a C++ interface to the HDF5 library.

Released under the LGPLv3+ and Boost Software License v1.0.

About

A library for reading and writing hdf5 files with GNU Octave

License:GNU Lesser General Public License v3.0


Languages

Language:C++ 57.9%Language:MATLAB 29.9%Language:M 5.4%Language:Objective-C 5.1%Language:Makefile 1.8%