adamreeve / npTDMS

NumPy-based Python module for reading TDMS files produced by LabVIEW

Home Page: http://nptdms.readthedocs.io

enable generator pattern

m-birke opened this issue · comments

Hi Adam,

I'd like to read groups one by one without loading everything into memory at the same time.

"with open" provides a selective load, but only if the group key (path) is known up front. I don't have this information, so I have two solutions in mind:

  1. implement a generator pattern for groups()
  2. only load the path names without the data (similar to read_metadata)

I haven't read the source code yet. Any comments or thoughts from your side?
KR

Hi

I don't think I understand the problem. It's not correct that "with open" requires you to know the group and channel names up front. Calling groups() returns a list but doesn't require loading any additional data from the TDMS file; it only uses the metadata that has already been loaded. Data is only loaded when you actually access channel data. E.g. if I run channel = tdms_file.groups()[0].channels()[0], no additional data has been loaded at that point. The only extra memory used is the overhead of the groups list and channels list, since the group and channel objects contained in those lists already exist once the metadata has been read. Data is only loaded if I then call channel[:], for example.
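
For illustration, here is a rough sketch of reading groups one at a time on top of groups() and channels(). The helper name iter_groups_lazily and the path argument are made up for this example and are not part of npTDMS; it assumes the file was opened with TdmsFile.open so that slicing a channel is what triggers the read.

```python
def iter_groups_lazily(tdms_file):
    """Yield (group_name, {channel_name: data}) one group at a time.

    Hypothetical helper, not part of npTDMS. It only relies on
    groups(), channels(), .name and channel slicing.
    """
    for group in tdms_file.groups():  # uses already-loaded metadata only
        # Slicing a channel is the step that actually reads data
        data = {channel.name: channel[:] for channel in group.channels()}
        yield group.name, data


def main(path):
    # Imported here so the helper above has no hard dependency on npTDMS
    from nptdms import TdmsFile

    with TdmsFile.open(path) as tdms_file:
        for name, data in iter_groups_lazily(tdms_file):
            # Data from the previous group can be garbage collected
            # once nothing refers to it any more
            print(name, {ch: len(values) for ch, values in data.items()})
```

Since this is a generator, only one group's channel data needs to be held in memory at a time, provided the caller doesn't keep references to earlier results.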

Hi,

ah yeah, that is perfectly fine. Sorry, I misunderstood the documentation on the PyPI page then. Maybe the PyPI page is outdated? I think "with open", TdmsFile.read(path) and TdmsFile(path) all implement this lazy init behavior, right? At least my experiments did not show any difference in memory allocation.

Only "with TdmsFile.open" loads data lazily. The other two approaches should load data up front, but depending on the file structure, the memory used by the actual data might be quite small compared to the metadata and indexes, so you might not notice much difference.
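
If you want to check this on your own file, something like the following sketch could be used to compare peak allocations. It uses the standard library's tracemalloc, which only sees allocations made through Python's allocators, so treat the numbers as a rough guide. The names peak_memory_bytes and compare_load_modes and the path argument are made up for this example.

```python
import tracemalloc


def peak_memory_bytes(func):
    """Run func() and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def compare_load_modes(path):
    # Imported here so peak_memory_bytes can be used without npTDMS
    from nptdms import TdmsFile

    def eager():
        # Reads metadata and all channel data up front
        TdmsFile.read(path)

    def lazy():
        # Reads metadata only; channel data is never accessed here
        with TdmsFile.open(path) as tdms_file:
            tdms_file.groups()

    return {
        "read": peak_memory_bytes(eager),
        "open": peak_memory_bytes(lazy),
    }
```

For a file with many small channels, the two numbers may come out close, because the metadata and indexes would dominate in both cases.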