adamreeve / npTDMS

NumPy based Python module for reading TDMS files produced by LabView

Home Page: http://nptdms.readthedocs.io

npTDMS drops datatypes on properties, which makes rewritten files (sometimes) different from originals

drillabit opened this issue · comments

When using software that checks or relies on the datatype entries for properties, copying TDMS files with npTDMS (and patching them, which is what I do) is not possible: npTDMS drops the type information stored with the properties while reading, so the dictionaries it creates no longer contain this information.
Instead, when writing, the type is deduced from the value in the dictionary. So a number 10, stored as a UInt64 in the input file, will come out as an Int32 in the output file.
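
To illustrate the round-trip loss described above, here is a minimal sketch in plain Python. It does not use npTDMS itself: `deduce_type_name` is a hypothetical stand-in for a writer that infers a type from a bare value, and its exact rules are an assumption (the real inference in npTDMS may differ in detail).

```python
# Hypothetical stand-in for a writer's type inference; this is not npTDMS code.
def deduce_type_name(value):
    # bool must be checked before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        # Assumed rule: use the smallest signed type that fits the value
        return "Int32" if -2**31 <= value < 2**31 else "Int64"
    if isinstance(value, float):
        return "DoubleFloat"
    if isinstance(value, str):
        return "String"
    raise TypeError(f"unsupported property value: {value!r}")


# The input file stores the property as (type, value) ...
original = ("UInt64", 10)
# ... but reading keeps only the value ...
value_only = original[1]
# ... so the writer has to guess the type again.
rewritten = (deduce_type_name(value_only), value_only)
```

With these assumed rules, `rewritten` carries a different type name than `original`, even though the value is unchanged.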

For my purposes I fixed that in tdms_segment.py by changing line 620 in read_property() from
return prop_name, value
to
return prop_name, prop_data_type(value)
This works for me, as the writer already uses types if they are provided. I did not encounter any immediate problems within npTDMS, but, of course, anyone working with the properties in the dictionaries would have to access the raw values through the type classes instead of directly.
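
The effect of that change on code that consumes the properties can be sketched as follows. The classes below are hypothetical stand-ins written for illustration, not the npTDMS type classes; the assumption is only that a typed value wraps the raw value and exposes it through a `value` attribute.

```python
class TypedValue:
    """Hypothetical stand-in for a TDMS type class wrapping a raw value."""

    def __init__(self, value):
        self.value = value


class UInt64(TypedValue):
    """Hypothetical marker class for an unsigned 64-bit property."""


def read_property_untyped(name, raw):
    # Current behavior described in the issue: only the raw value is kept
    return name, raw


def read_property_typed(name, raw, data_type):
    # Patched behavior: the value is wrapped in its type class
    return name, data_type(raw)


_, plain = read_property_untyped("count", 10)
_, typed = read_property_typed("count", 10, UInt64)

plain_plus_one = plain + 1        # works on the raw value
typed_plus_one = typed.value + 1  # typed values need an explicit .value
```

Existing code that does arithmetic or comparisons directly on property values would stop working with the wrapped form, which is why the change is not backwards compatible.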

Hi @drillabit, thanks for opening this issue.

Yeah, I don't think we'd want to make that exact change. As you say, it would affect anyone wanting to access the property values and would be a fairly big breaking change.

But it should be possible to add the type information to the properties dictionary in a non-breaking way, so that if you pass the properties dictionary (or its containing channel/group) to TdmsWriter.write_segment it can make use of the type information. E.g., I'm imagining that instead of properties being a plain dictionary, we could use a subclass of dict that adds a tdms_types property that is a dictionary mapping property names to their types.
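
A rough sketch of the dict-subclass idea, under stated assumptions: the class name `TypedPropertiesDict` and the helper `type_for_writing` are hypothetical and only illustrate the shape of the proposal; only the `tdms_types` attribute name comes from the comment above. Type names are plain strings here to keep the example self-contained.

```python
class TypedPropertiesDict(dict):
    """Hypothetical dict subclass that also remembers property types."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Maps property name -> type name as read from the file
        self.tdms_types = {}


def type_for_writing(properties, name):
    """Return the recorded type for a property, or None if there is none.

    A writer could fall back to deducing the type when this returns None,
    so plain dictionaries keep working as before.
    """
    types = getattr(properties, "tdms_types", {})
    return types.get(name)


props = TypedPropertiesDict(count=10)
props.tdms_types["count"] = "UInt64"
```

Because the subclass is still a `dict`, `props["count"]` returns the raw value `10` as before, and only a writer that knows about `tdms_types` looks at the extra information.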

Thanks for the quick response @adamreeve! I agree that there should be some way to pass the types from the reader to the writer, so that reading and writing does not alter the file. It does not have to be via typed values. That was just the easiest way to fix it, but not the most backwards-compatible one. Nevertheless, I think it is quite elegant, so I would also consider adding an option to the read method to switch to typed values if required.

Yes, that could be nice as it would make the typed properties more easily accessible to users, although it means users would need to opt in to that behavior if they want the property types to round trip correctly when writing them out to another file. Another alternative would be for TDMS objects to have a separate typed_properties property that is a dictionary of the typed properties, and which the writer could use if present. Would that also work for you? It could just be a bit fiddly making sure the typed and untyped properties are kept in sync when they're modified.
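
One possible way to sidestep the synchronization problem is to store only the types separately and rebuild the typed view on access. The sketch below is an assumption about how this could look, not what npTDMS implements: `ChannelSketch` and the `(type_name, value)` tuple representation are hypothetical, and only the `typed_properties` name comes from the comment above.

```python
class ChannelSketch:
    """Hypothetical TDMS object with untyped and typed property views."""

    def __init__(self, typed_properties):
        # typed_properties maps name -> (type_name, value), as read from a file
        self._types = {name: t for name, (t, _) in typed_properties.items()}
        # Plain dictionary of raw values, as users see it today
        self.properties = {name: v for name, (_, v) in typed_properties.items()}

    @property
    def typed_properties(self):
        # Rebuilt from `properties` on every access, so an edited value can
        # never be stale here. Properties without a recorded type are left
        # out, which lets a writer fall back to deducing their type.
        return {
            name: (self._types[name], value)
            for name, value in self.properties.items()
            if name in self._types
        }


channel = ChannelSketch({"count": ("UInt64", 10)})
channel.properties["count"] = 11        # edit through the untyped view
channel.properties["comment"] = "new"   # added later, no recorded type
```

After these edits, `channel.typed_properties` reflects the new value of `count` with its original type and omits `comment`. The trade-off is that the typed view is read-only in this sketch; changing a property's type would need a separate method.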

The typed_properties extra dictionary would work for me! Let's go for it!