Non-standard MS columns have an auto-generated schema which is not chunked according to the logic used for standard dimensions
landmanbester opened this issue · comments
- dask-ms version: 0.2.11
- Python version: 3.8
- Operating System: Ubuntu 20.04
Description
I am trying to convert an MS to zarr, chunked by row and channel, and it's falling over with `ValueError: Codec does not support buffers of > 2147483647 bytes`
despite the chunks only containing 25000 rows and 128 channels (around 25 MB by my count). Somewhat weirdly, the error seems mostly harmless, because it does produce a dataset that I can subsequently read. I have not checked whether all the subtables are what they should be, though.
What I Did
Here is the full output from convert
$ dask-ms convert ms1_primary.ms -g "FIELD_ID,DATA_DESC_ID,SCAN_NUMBER" -o ms1_primary.zarr --chunks="{row:25000,chan:128}" --format zarr --force
2022-08-04 11:05:59,954 - dask-ms - WARNING - Ignoring 'FLAG_CATEGORY': Unable to infer shape of column 'FLAG_CATEGORY' due to:
'Table DataManager error: Invalid operation: TSM: no array in row 0 of column FLAG_CATEGORY in /home/bester/projects/ESO137/msdir/ms1_primary.ms/table.f18'
(the same warning repeats for rows 98332, 196664, 294996, 393328, 491660, 589992 and 688324)
2022-08-04 11:06:02,008 - dask-ms - INFO - Input: 'measurementset' file:///home/bester/projects/ESO137/msdir/ms1_primary.ms
2022-08-04 11:06:02,008 - dask-ms - INFO - Output: 'zarr' file:///home/bester/projects/ESO137/msdir/ms1_primary.zarr
2022-08-04 11:06:09,797 - dask-ms - WARNING - Ignoring SOURCE
2022-08-04 11:06:09,802 - dask-ms - WARNING - Ignoring 'DIRECTION': Unable to infer shape of column 'DIRECTION' due to:
'TableProxy::getCell: no such row'
2022-08-04 11:06:09,803 - dask-ms - WARNING - Ignoring 'TARGET': Unable to infer shape of column 'TARGET' due to:
'TableProxy::getCell: no such row'
> /home/bester/software/dask-ms/daskms/apps/convert.py(354)execute()
-> dask.compute(writes)
(Pdb) c
Traceback (most recent call last):
File "/home/bester/.venv/dms/bin/dask-ms", line 33, in <module>
sys.exit(load_entry_point('dask-ms', 'console_scripts', 'dask-ms')())
File "/home/bester/software/dask-ms/daskms/apps/entrypoint.py", line 9, in main
return EntryPoint(sys.argv[1:]).execute()
File "/home/bester/software/dask-ms/daskms/apps/entrypoint.py", line 32, in execute
cmd.execute()
File "/home/bester/software/dask-ms/daskms/apps/convert.py", line 354, in execute
dask.compute(writes)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/base.py", line 598, in compute
results = schedule(dsk, keys, **kwargs)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/threaded.py", line 89, in get
results = get_async(
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/local.py", line 511, in get_async
raise_exception(exc, tb)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/local.py", line 319, in reraise
raise exc
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/local.py", line 224, in execute_task
result = _execute_task(task, data)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/optimization.py", line 990, in __call__
return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/core.py", line 149, in get
result = _execute_task(task, cache)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/dask/core.py", line 119, in _execute_task
return func(*(_execute_task(a, cache) for a in args))
File "/home/bester/software/dask-ms/daskms/experimental/zarr/__init__.py", line 187, in zarr_setter
zarray[selection] = data
File "/home/bester/.venv/dms/lib/python3.8/site-packages/zarr/core.py", line 1353, in __setitem__
self.set_basic_selection(pure_selection, value, fields=fields)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/zarr/core.py", line 1448, in set_basic_selection
return self._set_basic_selection_nd(selection, value, fields=fields)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/zarr/core.py", line 1748, in _set_basic_selection_nd
self._set_selection(indexer, value, fields=fields)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/zarr/core.py", line 1800, in _set_selection
self._chunk_setitem(chunk_coords, chunk_selection, chunk_value, fields=fields)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/zarr/core.py", line 2062, in _chunk_setitem
self._chunk_setitem_nosync(chunk_coords, chunk_selection, value,
File "/home/bester/.venv/dms/lib/python3.8/site-packages/zarr/core.py", line 2073, in _chunk_setitem_nosync
self.chunk_store[ckey] = self._encode_chunk(cdata)
File "/home/bester/.venv/dms/lib/python3.8/site-packages/zarr/core.py", line 2194, in _encode_chunk
cdata = self._compressor.encode(chunk)
File "numcodecs/blosc.pyx", line 557, in numcodecs.blosc.Blosc.encode
File "/home/bester/.venv/dms/lib/python3.8/site-packages/numcodecs/compat.py", line 155, in ensure_contiguous_ndarray
ensure_contiguous_ndarray_like(
File "/home/bester/.venv/dms/lib/python3.8/site-packages/numcodecs/compat.py", line 121, in ensure_contiguous_ndarray_like
raise ValueError(msg)
ValueError: Codec does not support buffers of > 2147483647 bytes
But I can still read the main table
Python 3.8.13 (default, Apr 19 2022, 00:53:22)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.4.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from daskms import xds_from_storage_ms
In [2]: xds = xds_from_storage_ms('ms1_primary.zarr/')
In [3]: d = xds[0].DATA.values
In [4]: d.shape
Out[4]: (98332, 4096, 4)
so I am not sure what is happening here.
Thanks for reporting. Can you run the command again within pdb as follows:
$ python -m pdb $(which dask-ms) convert ms1_primary.ms -g "FIELD_ID,DATA_DESC_ID,SCAN_NUMBER" -o ms1_primary.zarr --chunks="{row:25000,chan:128}" --format zarr --force
and report on the dimensions of `zarray`, `selection` and `data` in the following part of the stack trace?
File "/home/bester/software/dask-ms/daskms/experimental/zarr/__init__.py", line 187, in zarr_setter
zarray[selection] = data
> But I can still read the main table

This is probably because it gets created upfront. I'll bet it's filled with zeros. What does `d.chunks` report?
Is it possible that it is actually a subtable that is causing the problem? I don't recall how those are chunked (or, indeed, if they are left unchunked).
The chunks are as expected, and DATA seems populated:
In [1]: from daskms import xds_from_storage_ms
In [2]: xds = xds_from_storage_ms('ms1_primary.zarr/')
In [3]: xds[0].DATA.chunks
Out[3]:
((25000, 25000, 25000, 23332),
 (128, 128, ..., 128),  # 32 chunks of 128, elided for brevity
 (4,))
In [4]: d = xds[0].DATA.values
In [5]: d[0]
Out[5]:
array([[ 1.4974735e+03+0.0000000e+00j, -1.8867255e+03+7.0398737e+02j,
-1.8867255e+03-7.0398737e+02j, 2.5700393e+03+0.0000000e+00j],
[ 4.8791138e+01+0.0000000e+00j, 1.0086360e-02+5.2753524e-03j,
1.0086360e-02-5.2753524e-03j, 4.0523239e+01+0.0000000e+00j],
[ 4.8889027e+01+0.0000000e+00j, 8.3513446e-02+5.6876391e-03j,
8.3513446e-02-5.6876391e-03j, 4.0596863e+01+0.0000000e+00j],
...,
[ 3.5337872e+01+0.0000000e+00j, 3.0996327e-04+5.0446814e-01j,
3.0996327e-04-5.0446814e-01j, 1.4307446e+01+0.0000000e+00j],
[ 3.5533756e+01+0.0000000e+00j, 2.2583008e-02+5.0612271e-01j,
2.2583008e-02-5.0612271e-01j, 1.4375396e+01+0.0000000e+00j],
[ 3.6027515e+01+0.0000000e+00j, -8.5114129e-03+5.1566869e-01j,
-8.5114129e-03-5.1566869e-01j, 1.4549672e+01+0.0000000e+00j]],
dtype=complex64)
I also suspect it may be one of the subtables, because it happens right at the end of a run. I am still waiting for it to fall over again so I can report the information you asked for, @sjperkins (unfortunately oates is acting up again and things are taking forever).
> Is it possible that it is actually a subtable that is causing the problem? I don't recall how those are chunked (or, indeed, if they are left unchunked).
That'd be a really big subtable if a column has ~2GiB of data.
Also, just spitballing some figures (complex64 == 8 bytes):
98332 x 4096 x 4 x 8 ~= 12GiB
25000 x 128 x 4 x 8 ~= 97MiB (which should be fine)
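The figures above can be sanity-checked directly; a minimal sketch, with the 2147483647-byte (2**31 - 1) limit taken from the ValueError in the traceback:

```python
# Check the spitballed sizes against the blosc codec's maximum buffer size.
CODEC_LIMIT = 2**31 - 1  # 2147483647 bytes, from the ValueError above
ITEMSIZE = 8             # complex64

full_array = 98332 * 4096 * 4 * ITEMSIZE  # entire DATA array for one group
one_chunk = 25000 * 128 * 4 * ITEMSIZE    # one requested (row, chan) chunk

print(f"full array: {full_array / 2**30:.1f} GiB")          # ~12.0 GiB
print(f"one chunk:  {one_chunk / 2**20:.1f} MiB")           # ~97.7 MiB
print("chunk fits codec limit:", one_chunk <= CODEC_LIMIT)  # True
```

So a correctly chunked DATA column should never come near the limit.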
> I also suspect it may be one of the subtables because it happens right at the end of a run.
Hmmmm that is interesting...
> I don't recall how those are chunked (or, indeed, if they are left unchunked).
They are left unchunked. One way of finding out if there are large subtables would be to do something like a:
$ du -hs ms1_primary.ms/
It does not seem that way
$ du -h ms1_primary.ms/
32K ms1_primary.ms/SOURCE
32K ms1_primary.ms/ANTENNA
20K ms1_primary.ms/FLAG_CMD
20K ms1_primary.ms/PROCESSOR
44K ms1_primary.ms/FEED
160K ms1_primary.ms/SPECTRAL_WINDOW
20K ms1_primary.ms/DATA_DESCRIPTION
28K ms1_primary.ms/OBSERVATION
24K ms1_primary.ms/POLARIZATION
20K ms1_primary.ms/STATE
96K ms1_primary.ms/POINTING
28K ms1_primary.ms/FIELD
28K ms1_primary.ms/HISTORY
387G ms1_primary.ms/
> They are left unchunked.

Actually, this isn't quite true. A default chunking of 10,000 rows is applied.
> It does not seem that way
OK, it must be one of the DATA columns then. If you still have your initial attempt at writing the zarr dataset lying around, can you do a:
from pprint import pprint
from daskms import xds_from_ms

datasets = xds_from_ms(...)
pprint(list(dict(ds.chunks) for ds in datasets))
Ah, I have some non-standard columns in there
In [1]: from daskms import xds_from_storage_ms
In [2]: xds = xds_from_storage_ms('ms1_primary.zarr/')
In [3]: from pprint import pprint
In [4]: pprint(list(dict(ds.chunks) for ds in xds))
[{'RESIDUAL-1': (4096,),
  'RESIDUAL-2': (4,),
  'chan': (128, 128, ..., 128),  # 32 chunks of 128, elided for brevity
  'corr': (4,),
  'row': (25000, 25000, 25000, 23332),
  'uvw': (3,)},
 ...]  # the remaining seven datasets have identical chunking
In [5]: r = xds[0].RESIDUAL.values
In [6]: r[0]
Out[6]:
array([[0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
...,
[0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j],
[0.+0.j, 0.+0.j, 0.+0.j, 0.+0.j]], dtype=complex64)
Looks like the RESIDUAL column is written with frequency chunks of 4096. This works at the outset because it can be compressed?
Ah, could this be a schema thing? A column not in the default schema will not know about the `chan` axis.

Edit: Posted before I saw the above. I think this is definitely the root cause.
Ah yes: 25000 x 4096 x 4 x 8 ~= 3GiB, which is over the 2147483647-byte codec limit.
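A quick check (a sketch; all figures from this thread) confirms that a single chunk of the mis-chunked RESIDUAL column overflows the buffer size the blosc codec will encode:

```python
# One RESIDUAL chunk: 25000 rows x 4096 channels x 4 correlations of complex64
# (8 bytes per value), since the auto-generated schema did not chunk the
# channel axis.
chunk_bytes = 25000 * 4096 * 4 * 8
print(chunk_bytes)               # 3276800000 (~3.05 GiB)
print(chunk_bytes > 2147483647)  # True: exceeds the codec's max buffer
```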
> Ah, could this be a schema thing? A column not in the default schema will not know about the `chan` axis.
@JSKenyon and I just discussed this in a meeting. The problem here is that there are non-standard columns in the MS.
As it stands, `xds_from_*` takes a schema argument allowing one to configure this properly. A couple of solutions might be possible here:
1. A `--schema` argument for `dask-ms convert`.
2. Writing a `__daskms__attributes__` column keyword into the MS column, containing the dimension schema.
My 2 cents:
1. requires the user to know which non-standard columns exist in the MS upfront. You could maybe bail out with a nice informative error message that tells the user which non-standard columns to specify a `--schema` for, but that is a bit clunky.
2. will only solve the problem if the column was actually written by dask-ms, so you would still run into the issue. Although you could probably resort to 1) if non-standard columns are detected.

A possible alternative (albeit not a very clean one) would be to check whether unknown dimensions match existing dimensions in the MS and then chunk them the same (e.g. in the above case, RESIDUAL-1 matches chan along axis 1 and could be chunked the same). I suspect this will work 99% of the time. Maybe print a warning if this is the case, throw an error if any non-standard columns don't match any existing dimensions, and resort to 1).
> 1. requires the user to know which non-standard columns exist in the MS upfront. You could maybe bail out with a nice informative error message that tells the user which non-standard columns to specify a --schema for but that is a bit clunky.
This may be the best option as it can be detected quickly. That said, this could get very unwieldy if an MS has lots of non-standard columns.
> 2. will only solve the problem if the column was actually written by dask-ms so you would still run into the issue. Although you could probably resort to 1) if non-standard columns are detected.

It is true that this wouldn't fix the problem if the column was written by software not using dask-ms. However, it might be a decent 90% solution, with the remaining 10% solved by option 1.
> A possible alternative (albeit not a very clean one) would be to check if unknown dimensions match existing dimensions in the MS and then chunk them the same (eg. in the above case RESIDUAL-1 matches 'chan' along axis 1 and could be chunked the same). I suspect this will work 99% of the time. Maybe print a warning if this is the case, throw an error if any non-standard columns don't match any existing dimensions and resort to 1).
This is possible, but it can be slightly brittle. For the MAIN table this is plausible, as we could do as you suggest, with a dim priority in the event that there are dims of the same size. My suggested priority would be `[chan, corr, uvw]` for the MAIN table, i.e. a `(nrow, 3)` column will prioritise `(row, chan)` over `(row, uvw)` if the known channel dimension is 3. Option 1 would then only be needed in cases where the user is doing something unusual, i.e. adding a new column with `uvw` as a dim.
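That size-matching-with-priority idea could be sketched roughly as follows (a hypothetical helper, not part of the dask-ms API; the dim sizes are this MS's values and the fallback name mimics the auto-generated per-column dims seen above):

```python
# Hypothetical sketch of the proposed heuristic: given the trailing shape of an
# unknown MAIN-table column, guess dimension names by matching sizes against
# known dims, trying them in a fixed priority order.
KNOWN_DIMS = {"chan": 4096, "corr": 4, "uvw": 3}  # example values for this MS
PRIORITY = ["chan", "corr", "uvw"]                # chan beats uvw on a size tie

def guess_dims(shape):
    """Map each non-row axis size to a known dim name, or a generic fallback."""
    names = ["row"]
    for axis, size in enumerate(shape, start=1):
        for dim in PRIORITY:
            if KNOWN_DIMS[dim] == size:
                names.append(dim)
                break
        else:
            # No match: fall back to an auto-generated per-column dim name
            names.append(f"COLUMN-{axis}")
    return tuple(names)

print(guess_dims((4096, 4)))  # ('row', 'chan', 'corr') - e.g. RESIDUAL
print(guess_dims((3,)))       # ('row', 'uvw') here, since chan is 4096, not 3
```

With such a mapping in hand, the converter could then reuse the existing chan/corr chunking for the matched axes, warning the user that the dims were inferred.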
Unfortunately, none of the above succeed in completely hiding this from the user, although option 2 will come close for our software, e.g. QuartiCal and pfb-clean. I think that adding non-standard columns is relatively unusual in the legacy stack (outside of CubiCal etc.).
Finally, all of the above is only true for the main table. Subtables are probably even tougher to deal with, as they each have different dims. On top of that, we need to remember that `xds_to_table` can technically write arbitrary new tables, though this is less of a problem if option 2 is in place.
How about applying chunking heuristics to DATA-like columns only?
Can you think of any other column "schemas" that QuartiCal/pfb-clean/CubiCal use?
> How about applying chunking heuristics to DATA-like columns only?
That will suffice for my purposes.
> Can you think of any other column "schemas" that Quartical/pfb-clean/Cubical use?
Only the WEIGHT column, but I believe that will be deprecated eventually and I am strongly opposed to using it anyway.
@Athanaseus just got hit by this again. Converting an MS with non-standard columns leaves the resulting dataset in a state that is hard to deal with, since the non-standard columns will not have the expected chan and corr dimensions.
> Can you think of any other column "schemas" that Quartical/pfb-clean/Cubical use?
This MS also has a BITFLAG column
I should block off some time to look at this tomorrow. One possible workaround is to use the `--exclude` flag, if the column is unnecessary.
Thanks @sjperkins. This is the column he wants to image. Actually, he used the `--exclude` flag to drop the dozen or so other non-standard columns already (I guess an inevitable side effect of experimentation). I can wrangle the dataset into shape manually for now.
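Such manual wrangling might look roughly like this; a sketch assuming xarray datasets like those returned by `xds_from_storage_ms`, with an illustrative shape and the auto-generated `RESIDUAL-1`/`RESIDUAL-2` dim names seen earlier in this thread:

```python
import numpy as np
import xarray as xr

# Stand-in for one dataset read back from the converted zarr store: the
# non-standard RESIDUAL column came through with auto-generated dim names.
ds = xr.Dataset(
    {"RESIDUAL": (("row", "RESIDUAL-1", "RESIDUAL-2"),
                  np.zeros((10, 16, 4), dtype=np.complex64))}
)

# Rename the auto-generated dims to the expected MS dims.
fixed = ds.rename({"RESIDUAL-1": "chan", "RESIDUAL-2": "corr"})
print(fixed.RESIDUAL.dims)  # ('row', 'chan', 'corr')
```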
> Thanks @sjperkins. This is the column he wants to image. Actually he used the `--exclude` flag to drop the dozen or so other non-standard columns already (I guess an inevitable side effect of experimentation). I can wrangle the dataset into shape manually for now
Are these columns shaped like DATA/FLAG?