adamreeve / npTDMS

NumPy-based Python module for reading TDMS files produced by LabVIEW

Home Page: http://nptdms.readthedocs.io

How to reduce the size of the tdms_index file

qingqingweiran opened this issue · comments

I receive streaming data over TCP and write it to a TDMS file in append mode. However, the tdms_index file becomes huge, which makes opening the file very slow. If I then read the TDMS file and re-write it, the tdms_index returns to a normal size.
Is there a way to resolve this problem?

Example:

from nptdms import TdmsWriter, RootObject, GroupObject, ChannelObject
import numpy as np

root_object = RootObject(properties={
    "prop1": "foo",
    "prop2": 3,
})
group_object = GroupObject("group_1", properties={
    "prop1": 1.2345,
    "prop2": False,
})
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
channel_object = ChannelObject("group_1", "channel_1", data, properties={})

with TdmsWriter("my_file.tdms") as tdms_writer:
    # Write first segment
    tdms_writer.write_segment([
        root_object,
        group_object,
        channel_object])

with TdmsWriter("my_file.tdms", "a") as tdms_writer:

    for i in range(100000):
        more_data = np.array([6.0, 7.0, 8.0, 9.0, 10.0])
        channel_object = ChannelObject("group_1", "channel_1", more_data, properties={})
        tdms_writer.write_segment([channel_object])

(screenshot: the resulting tdms_index file is very large)

from nptdms import TdmsFile, TdmsWriter, RootObject

# Read the original file, then re-write all of its data in one go
original_file = TdmsFile(r"my_file.tdms")
original_groups = original_file.groups()
original_channels = [chan for group in original_groups for chan in group.channels()]

with TdmsWriter(r"my_file_copy.tdms") as copied_file:
    root_object = RootObject(original_file.properties)
    # Writing everything as a single segment keeps the index small
    copied_file.write_segment([root_object] + original_groups + original_channels)

(screenshot: after re-writing, the tdms_index file is back to a normal size)
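
As a side note, newer npTDMS releases may include a TdmsWriter.defragment class method that performs this read-and-rewrite in a single call; treat this as an assumption and check the API reference for your installed version:

from nptdms import TdmsWriter

# Assumes a recent npTDMS version that ships TdmsWriter.defragment
TdmsWriter.defragment("my_file.tdms", "my_file_copy.tdms")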

Hi, this is related to #244. Writing of TDMS files is quite limited at the moment and doesn't try to optimize the space used by the index metadata: every call to write_segment writes a full metadata block for the objects it contains, so appending many small segments produces a very large index. At the moment, I think the only way around this is to build up larger chunks of data before writing segments.
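
For example, here is a minimal sketch of that buffering approach, reusing the group and channel names from the example above (FLUSH_THRESHOLD is a hypothetical tuning parameter, not part of npTDMS):

import numpy as np
from nptdms import TdmsWriter, ChannelObject

FLUSH_THRESHOLD = 100_000  # samples to accumulate before writing a segment

with TdmsWriter("my_file.tdms", "a") as tdms_writer:
    buffer = []
    buffered = 0
    for i in range(100000):
        more_data = np.array([6.0, 7.0, 8.0, 9.0, 10.0])  # stand-in for one TCP read
        buffer.append(more_data)
        buffered += len(more_data)
        if buffered >= FLUSH_THRESHOLD:
            # One segment for many reads means one metadata block in the index
            channel = ChannelObject("group_1", "channel_1",
                                    np.concatenate(buffer), properties={})
            tdms_writer.write_segment([channel])
            buffer, buffered = [], 0
    if buffer:
        # Flush whatever remains at the end of the stream
        channel = ChannelObject("group_1", "channel_1",
                                np.concatenate(buffer), properties={})
        tdms_writer.write_segment([channel])

With these numbers this writes one segment per 100,000 samples instead of one per 5, so the index ends up with 5 metadata blocks rather than 100,000.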

I'm going to close this as it's covered by #244 and any further relevant discussion can go on that issue.