adamreeve / npTDMS

NumPy-based Python module for reading TDMS files produced by LabVIEW

Home Page: http://nptdms.readthedocs.io

How to reduce the size of the tdms_index file

qingqingweiran opened this issue · comments

I receive streaming data over TCP and write it to a TDMS file in append mode. However, the tdms_index file becomes huge, which makes opening the file very slow. If I then read the TDMS file and re-write it, the tdms_index returns to a normal size.
Is there a way to resolve this problem?

Example:

from nptdms import TdmsWriter, RootObject, GroupObject, ChannelObject
import numpy as np

root_object = RootObject(properties={
    "prop1": "foo",
    "prop2": 3,
})
group_object = GroupObject("group_1", properties={
    "prop1": 1.2345,
    "prop2": False,
})
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
channel_object = ChannelObject("group_1", "channel_1", data, properties={})

with TdmsWriter("my_file.tdms") as tdms_writer:
    # Write first segment
    tdms_writer.write_segment([
        root_object,
        group_object,
        channel_object])

with TdmsWriter("my_file.tdms", "a") as tdms_writer:

    for i in range(100000):
        more_data = np.array([6.0, 7.0, 8.0, 9.0, 10.0])
        channel_object = ChannelObject("group_1", "channel_1", more_data, properties={})
        tdms_writer.write_segment([channel_object])

(screenshot: the resulting tdms_index file is very large)

from nptdms import TdmsFile, TdmsWriter, RootObject

# Read the original file, then re-write all of its data in one go
original_file = TdmsFile(r"my_file.tdms")
original_groups = original_file.groups()
original_channels = [chan for group in original_groups for chan in group.channels()]

with TdmsWriter(r"my_file_copy.tdms") as copied_file:
    root_object = RootObject(original_file.properties)
    # Writing everything as a single segment keeps the index small
    copied_file.write_segment([root_object] + original_groups + original_channels)

(screenshot: after re-writing, the tdms_index file is back to a normal size)
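
As a side note, newer npTDMS releases may include a TdmsWriter.defragment class method that performs this read-and-rewrite in a single call; treat this as an assumption and check the API reference for your installed version:

from nptdms import TdmsWriter

# Assumes a recent npTDMS version that ships TdmsWriter.defragment
TdmsWriter.defragment("my_file.tdms", "my_file_copy.tdms")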

Hi, this is related to #244. Writing of TDMS files is quite limited at the moment and doesn't try to optimize the space used by the index metadata: every call to write_segment writes a full metadata block for the objects it contains, so appending many small segments produces a very large index. At the moment, I think the only way around this is to build up larger chunks of data before writing segments.
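
For example, here is a minimal sketch of that buffering approach, reusing the group and channel names from the example above (FLUSH_THRESHOLD is a hypothetical tuning parameter, not part of npTDMS):

import numpy as np
from nptdms import TdmsWriter, ChannelObject

FLUSH_THRESHOLD = 100_000  # samples to accumulate before writing a segment

with TdmsWriter("my_file.tdms", "a") as tdms_writer:
    buffer = []
    buffered = 0
    for i in range(100000):
        more_data = np.array([6.0, 7.0, 8.0, 9.0, 10.0])  # stand-in for one TCP read
        buffer.append(more_data)
        buffered += len(more_data)
        if buffered >= FLUSH_THRESHOLD:
            # One segment for many reads means one metadata block in the index
            channel = ChannelObject("group_1", "channel_1",
                                    np.concatenate(buffer), properties={})
            tdms_writer.write_segment([channel])
            buffer, buffered = [], 0
    if buffer:
        # Flush whatever remains at the end of the stream
        channel = ChannelObject("group_1", "channel_1",
                                np.concatenate(buffer), properties={})
        tdms_writer.write_segment([channel])

With these numbers this writes one segment per 100,000 samples instead of one per 5, so the index ends up with 5 metadata blocks rather than 100,000.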

I'm going to close this as it's covered by #244 and any further relevant discussion can go on that issue.