Waterfall.grab_data() misinforms when data has not been loaded

texadactyl opened this issue 3 years ago

Describe the bug
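When Waterfall.grab_data() is called on a "heavy" file (data size exceeds 1 GB), it tries to process the request even though only the header has been loaded and no data is in memory. The final error, "Too much data requested", is misleading: the real cause is that the data was never loaded. The result looks something like this: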

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/blimpy-2.0.6-py3.8.egg/blimpy/waterfall.py", line 310, in grab_data
    plot_data = np.squeeze(self.data[t_start:t_stop, ..., i0:i1 + 1])
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/elkins/BASIS/seti_testing/blimpy_testing/grabby.py", line 17, in <module>
    wf.grab_data()
  File "/usr/local/lib/python3.8/dist-packages/blimpy-2.0.6-py3.8.egg/blimpy/waterfall.py", line 315, in grab_data
    raise Exception("Waterfall.grab_data: Too much data requested")
Exception: Waterfall.grab_data: Too much data requested

Background: blimpy imposes the 1 GB limit to avoid running out of RAM, whether or not that risk is realistic on a given machine. When the file is first processed, the data array size is computed:

If the data size is less than 1 GB, the data is loaded and you will not see this traceback from grab_data().
If the data size exceeds 1 GB, only the file header (~384 bytes) is loaded and the data shape is set to (1,), i.e. a 1-dimensional array holding a single element.
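
The first exception in the traceback can be reproduced with NumPy alone. This is a minimal sketch, not blimpy's code: the placeholder array, the helper name slice_like_grab_data, and the (time, polarisation, channel) layout of the loaded array are assumptions made for illustration. Only the indexing expression is copied from the traceback.

```python
import numpy as np

# Stand-in for the placeholder left in place when only the header is read.
# Assumption: a 1-D array with a single element, as described above.
placeholder = np.zeros(1, dtype=np.float32)


def slice_like_grab_data(data, t_start, t_stop, i0, i1):
    """Apply the indexing expression shown in the traceback (hypothetical helper)."""
    return np.squeeze(data[t_start:t_stop, ..., i0:i1 + 1])


# Two slices address two axes, but the placeholder has only one.
try:
    slice_like_grab_data(placeholder, 0, 16, 0, 1023)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# An array with the assumed (time, polarisation, channel) layout slices cleanly.
loaded = np.zeros((16, 1, 1024), dtype=np.float32)
plot_data = slice_like_grab_data(loaded, 0, 16, 0, 1023)
```

So the IndexError comes from the missing data, not from the size of the request, which is why the "Too much data requested" message points in the wrong direction.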

To avoid a failure in grab_data(), the caller must instantiate the Waterfall object so that the data is actually loaded. Use the max_load parameter to raise the limit if necessary.
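
A caller-side guard along these lines might look like the sketch below. The helpers data_is_loaded and open_with_data are hypothetical and not part of blimpy. The sketch assumes that max_load is expressed in GB and that a loaded data array always has more than one dimension.

```python
import numpy as np


def data_is_loaded(data):
    """Hypothetical check: a loaded array has more than one dimension."""
    return isinstance(data, np.ndarray) and data.ndim > 1


def open_with_data(path, max_load_gb):
    """Open a file so that the data, not just the header, is in memory.

    Assumes blimpy is installed and that max_load is given in GB.
    """
    # Imported lazily so data_is_loaded() can be used without blimpy.
    from blimpy import Waterfall

    wf = Waterfall(path, max_load=max_load_gb)
    if not data_is_loaded(wf.data):
        raise RuntimeError(
            f"{path}: only the header was loaded; increase max_load"
        )
    return wf
```

With a guard like this, a call such as open_with_data("observation.h5", max_load_gb=4) (file name made up) would fail early with a message that names the actual problem, before grab_data() is ever reached.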