Nibabel creates invalid GIfTI files

neurolabusc opened this issue a year ago

The GIfTI format specification is available from here. Section 2.3.4.2 (DataType) notes: "This required attribute describes the numeric type of the data contained in a Data Array and are limited to the types displayed in the table."

Value	Description
NIFTI_TYPE_UINT8	Unsigned, 8-bit bytes.
NIFTI_TYPE_INT32	Signed, 32-bit integers.
NIFTI_TYPE_FLOAT32	32-bit single precision floating point.

However, the following code creates an invalid GIfTI file with DataType="NIFTI_TYPE_FLOAT64", which popular tools cannot read:

# %% make_surface.py
import nibabel as nib

from nilearn import datasets, surface

fsaverage = datasets.fetch_surf_fsaverage("fsaverage7")

motor_images = datasets.fetch_neurovault_motor_task()
stat_img = motor_images.images[0]
surface_map = surface.vol_to_surf(stat_img, fsaverage.pial_left)

surface_map_path = "./surface_map.gii"

img = nib.gifti.gifti.GiftiImage()
img.add_gifti_data_array(
    nib.gifti.gifti.GiftiDataArray(
        surface_map,
        intent="NIFTI_INTENT_ZSCORE",
    )
)
nib.save(img, surface_map_path)

For example, Connectome Workbench is unable to view this image.

This is similar to previous issues where nibabel used excessive integer precision for GIfTI and NIfTI images. However, it is easier to establish the optimal data type for discrete integers than for floating-point data. In scientific computing, it is common to use float64 for internal calculations (15-17 significant decimal digits) to avoid accumulating rounding errors, but for storage float32 is typically more than sufficient (about 7 significant decimal digits). If there is a strong rationale for increasing the precision for GIfTI, the specification should be updated and implementation support provided to the smaller teams that create popular tools that support this format.
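
To put rough numbers on the storage argument, here is a minimal NumPy sketch that measures how much a float64 value changes when it is stored as float32 and read back. The helper name `float32_roundtrip_error` and the sample values are illustrative assumptions, not part of nibabel or the issue.

```python
import numpy as np


def float32_roundtrip_error(values):
    """Relative error from storing float64 values as float32 (hypothetical helper)."""
    values = np.atleast_1d(np.asarray(values, dtype=np.float64))
    restored = values.astype(np.float32).astype(np.float64)
    err = np.zeros_like(values)
    nonzero = values != 0
    # Relative error is undefined at zero, so zeros are reported as 0.0.
    err[nonzero] = np.abs(restored[nonzero] - values[nonzero]) / np.abs(values[nonzero])
    return err


# Illustrative z-score-like values; -2.5 is exactly representable in float32.
samples = np.array([3.141592653589793, -2.5, 0.0, 1e-3])
errors = float32_roundtrip_error(samples)

# Decimal precision NumPy reports for each type.
print(np.finfo(np.float32).precision, np.finfo(np.float64).precision)
print(errors.max())
```

For values in the normal float32 range, the relative error stays below the float32 machine epsilon (`np.finfo(np.float32).eps`), which is far smaller than the noise in typical statistical maps.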

Chris Markiewicz · Answer 1 · Fri Feb 17 2023 00:35:16 GMT+0800 (China Standard Time)

Ah, good catch. I think we could do a pretty simple patch:

diff --git a/nibabel/gifti/gifti.py b/nibabel/gifti/gifti.py
index 326e60fa..265d61ed 100644
--- a/nibabel/gifti/gifti.py
+++ b/nibabel/gifti/gifti.py
@@ -460,7 +460,19 @@ class GiftiDataArray(xml.XmlSerializable):
         self.data = None if data is None else np.asarray(data)
         self.intent = intent_codes.code[intent]
         if datatype is None:
-            datatype = 'none' if self.data is None else self.data.dtype
+            if self.data is None:
+                datatype = 'none'
+            elif self.data.dtype == np.dtype('uint8'):
+                datatype = 'NIFTI_TYPE_UINT8'
+            elif np.issubdtype(self.data.dtype, np.integer):
+                datatype = 'NIFTI_TYPE_INT32'
+            elif np.issubdtype(self.data.dtype, np.floating):
+                datatype = 'NIFTI_TYPE_FLOAT32'
+            else:
+                raise ValueError(
+                    f"Cannot determine target data type for array with type {self.data.dtype}. "
+                    "Pass an explicit 'datatype' parameter to GiftiDataArray()."
+                )
         self.datatype = data_type_codes.code[datatype]
         self.encoding = gifti_encoding_codes.code[encoding]
         self.endian = gifti_endian_codes.code[endian]
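
The mapping in the patch can be tried outside nibabel. The following is a standalone sketch of the same branching, assuming only NumPy; the function name `gifti_datatype_for` is hypothetical and does not exist in nibabel.

```python
import numpy as np


def gifti_datatype_for(dtype):
    """Pick a GIfTI-permitted type name for a NumPy dtype (hypothetical helper).

    Mirrors the branches of the proposed patch: uint8 stays uint8, any other
    integer maps to INT32, any float maps to FLOAT32, everything else raises.
    """
    dtype = np.dtype(dtype)
    if dtype == np.dtype("uint8"):
        return "NIFTI_TYPE_UINT8"
    if np.issubdtype(dtype, np.integer):
        return "NIFTI_TYPE_INT32"
    if np.issubdtype(dtype, np.floating):
        return "NIFTI_TYPE_FLOAT32"
    raise ValueError(
        f"Cannot determine target data type for array with type {dtype}. "
        "Pass an explicit datatype."
    )


for name in ("uint8", "int16", "int64", "float32", "float64"):
    print(name, "->", gifti_datatype_for(name))
```

This only selects the declared type. It does not check integer range, so an int64 array holding values outside the int32 range could not be stored losslessly under this mapping.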

I also don't see a very good reason to allow people to override the standard and save float64 data through a standard API. One thing we should consider is how to load files that were previously written with "forbidden" types; it would be ideal to make converting from a non-conformant to a conformant GIfTI as easy as:

nb.save(nb.load('float64.gii'), 'float32.gii')

So preserving the ability to read these files would be good.

Chris Rorden · Answer 2 · Fri Feb 17 2023 01:09:51 GMT+0800 (China Standard Time)

@effigies sounds good, thanks for your quick action. @alexisthual deserves credit for spotting this.

Matthew Brett · Answer 3 · Fri Feb 17 2023 01:23:55 GMT+0800 (China Standard Time)

FWIW - just giving an error sounds very reasonable to me. Can we be sure that there isn't anyone out there who is saving and loading these at float64 precision, and would expect to see the same outputs as previously? The patch will silently switch from float64 to float32.
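
One middle ground between raising an error and switching silently is to downcast but tell the user when values actually change. Below is a minimal sketch of that idea using only NumPy and the standard `warnings` module; `downcast_to_float32` is a hypothetical helper and is not what nibabel does.

```python
import warnings

import numpy as np


def downcast_to_float32(data):
    """Cast to float32, warning if any value changes (hypothetical helper)."""
    data = np.asarray(data)
    cast = data.astype(np.float32)
    # Compare against the original after casting back; NaNs count as equal.
    if not np.array_equal(cast.astype(data.dtype), data, equal_nan=True):
        warnings.warn(
            f"Casting {data.dtype} to float32 changed some values.",
            stacklevel=2,
        )
    return cast


exact = downcast_to_float32(np.array([0.5, 2.0, -1.25]))  # exactly representable, no warning
lossy = downcast_to_float32(np.array([0.1]))  # 0.1 differs between float64 and float32, warns
```

Most real-valued float64 data will trigger the warning, so in practice this mainly makes the precision change visible rather than rare.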

Chris Markiewicz · Answer 4 · Sat Feb 18 2023 00:06:07 GMT+0800 (China Standard Time)

@neurolabusc @alexisthual your thoughts on #1199 would be appreciated.