Nibabel creates invalid GIfTI files
neurolabusc opened this issue · comments
The GIfTI format specification is available from here. Section 2.3.4.2 DataType
notes This required attribute describes the numeric type of the data contained in a Data Array and are limited to the types displayed in the table.
Value | Description |
---|---|
NIFTI_TYPE_UINT8 | Unsigned, 8-bit bytes. |
NIFTI_TYPE_INT32 | Signed, 32-bit integers. |
NIFTI_TYPE_FLOAT32 | 32-bit single precision floating point. |
However, the following code creates an invalid GIfTI file which popular tools can't read, using DataType="NIFTI_TYPE_FLOAT64"
# %% make_surface.py
import pathlib
import nibabel as nib
from nilearn import datasets, surface
fsaverage = datasets.fetch_surf_fsaverage("fsaverage7")
motor_images = datasets.fetch_neurovault_motor_task()
stat_img = motor_images.images[0]
surface_map = surface.vol_to_surf(stat_img, fsaverage.pial_left)
surface_map_path = "./surface_map.gii"
img = nib.gifti.gifti.GiftiImage()
img.add_gifti_data_array(
nib.gifti.gifti.GiftiDataArray(
surface_map,
intent="NIFTI_INTENT_ZSCORE",
)
)
nib.save(img, surface_map_path)
For example, Connectome Workbench is unable to view this image:
This is similar to previous issues where nibabel used excessive integer precision for GIfTI and NIfTI images. However, it is easier to establish the optimal data type for discrete integers rather than floating point data. For science, it is common to use float64 for internal calculations (15-17 significant decimal places) to avoid rounding errors, but for storage float32 is typically more than sufficient (about 7 decimal places). If there is a strong rationale for increasing the precision for GIfTI, the specification should be updated and implementation support provided to the smaller teams that create popular tools that support this format.
Ah, good catch. I think we could do a pretty simple patch:
diff --git a/nibabel/gifti/gifti.py b/nibabel/gifti/gifti.py
index 326e60fa..265d61ed 100644
--- a/nibabel/gifti/gifti.py
+++ b/nibabel/gifti/gifti.py
@@ -460,7 +460,19 @@ class GiftiDataArray(xml.XmlSerializable):
self.data = None if data is None else np.asarray(data)
self.intent = intent_codes.code[intent]
if datatype is None:
- datatype = 'none' if self.data is None else self.data.dtype
+ if self.data is None:
+ datatype = 'none'
+ elif self.data.dtype == np.dtype('uint8'):
+ datatype = 'NIFTI_TYPE_UINT8'
+ elif np.issubdtype(self.data.dtype, np.integer):
+ datatype = 'NIFTI_TYPE_INT32'
+ elif np.issubdtype(self.data.dtype, np.floating):
+ datatype = 'NIFTI_TYPE_FLOAT32'
+ else:
+ raise ValueError(
+ f"Cannot determine target data type for array with type {self.data.dtype}. "
+ "Pass an explicit 'datatype' parameter to GiftiDataArray()."
+ )
self.datatype = data_type_codes.code[datatype]
self.encoding = gifti_encoding_codes.code[encoding]
self.endian = gifti_endian_codes.code[endian]
I also don't see a very good reason to allow people to override the standard and save float64 data through a standard API. One thing we should consider is how to load files that were previously written with "forbidden" types; it would be ideal to make converting from a non-conformant to a conformant GIFTI as easy as:
nb.save(nb.load('float64.gii'), 'float32.gii')
So preserving the ability to read these files would be good.
@effigies sounds good, thanks for your quick action. @alexisthual deserves credit for spotting this.
FWIW - just giving an error sounds very reasonable to me. Can we be sure that there isn't anyone out there who is saving and loading these at float64 precision, and would expect to see the same outputs as previously? The patch will silently switch from float64 to float32.
@neurolabusc @alexisthual your thoughts on #1199 would be appreciated.