pydicom / pydicom

Read, modify and write DICOM files with python code

Home Page:https://pydicom.github.io/pydicom/dev

Private tag sequence reads as 'UN' array

srodney opened this issue

Thank you for your work on this excellent package!

After saving a DICOM file that contains a private tag sequence, reading it back as part of a Dataset yields that sequence as VR='UN' with an array of bytes, rather than as a Sequence of Datasets with DataElements.

This appears identical to the issue described in #1336

Expected behavior
A private sequence written to a .dcm file should be read in as a Sequence.

Steps To Reproduce
See the attached code. I used the example code from the 2.4.2 documentation for creating a minimum dataset from scratch.

pydicom_private_sequence_reads_as_UN_array.ipynb.zip

Your environment

module version
platform macOS-14.0-x86_64-i386-64bit
Python 3.10.6 (v3.10.6:9c7b4bd164, Aug 1 2022, 17:13:48) [Clang 13.0.0 (clang-1300.0.29.30)]
pydicom 2.4.2
gdcm module not found
jpeg_ls module not found
numpy 1.23.2
PIL 10.0.0
pylibjpeg module not found
openjpeg module not found
libjpeg module not found

Here's an excerpt from the Jupyter notebook attached above, showing the essential steps to reproduce with any valid DICOM dataset:

import pydicom

# Given any valid DICOM Dataset ds:

# Add a private sequence containing a single item
test_seq_item = pydicom.dataset.Dataset()
test_seq_item.BlockType = "APERTURE"
test_seq_item.BlockName = "Block1"
test_seq = pydicom.sequence.Sequence([test_seq_item])

ds.add_new(0x37770010, 'LO', 'TEST_CREATOR')  # Private Creator
ds.add_new(0x37771000, 'SQ', test_seq)

# Write it out to disk in the same location as before
ds.save_as(filename_little_endian, write_like_original=False)

# Read it in again. The private sequence now shows up as 'UN'
ds_after_read = pydicom.dcmread(filename_little_endian)
print(ds_after_read)

Ah, wait: I just discovered that this problem disappears if I write the dataset as BIG_ENDIAN and EXPLICIT_VR.

So I'm perhaps stumbling into something that was addressed in #1067, or in #1305 and #1323.

I think this issue is very likely invalid, and I just need to read those issues and their resolutions more carefully.

Just as an aside: private tags should be set using private blocks.

Also, don't use the big endian transfer syntax for writing; it was retired long ago. And of course, if you write the data as implicit VR little endian, the VR information gets lost, so the behavior is somewhat expected. It can be fixed by writing as explicit VR little endian instead.
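
To illustrate why the VR is lost, here is a byte-level sketch using only the standard library. It hand-encodes one short-VR data element both ways, following the element layout in DICOM PS3.5. The helper names and the element (3777,1001) are hypothetical, and this is not how pydicom itself is implemented:

```python
import struct


def encode_implicit(group: int, element: int, value: bytes) -> bytes:
    """Implicit VR LE: tag (4 bytes), length (4 bytes), value. No VR field."""
    return struct.pack("<HHI", group, element, len(value)) + value


def encode_explicit_short(group: int, element: int, vr: str, value: bytes) -> bytes:
    """Explicit VR LE for short VRs such as LO: tag, VR (2 bytes), length (2 bytes), value."""
    return struct.pack("<HH2sH", group, element, vr.encode("ascii"), len(value)) + value


value = b"Block1"  # even length, so no padding byte is needed

implicit = encode_implicit(0x3777, 0x1001, value)
explicit = encode_explicit_short(0x3777, 0x1001, "LO", value)

print(implicit.hex(" "))
print(explicit.hex(" "))

# The explicit encoding carries the VR right after the tag
print(explicit[4:6])
# The implicit encoding has only a 4-byte length there
print(implicit[4:8])
```

A reader of the implicit stream has to look the VR up in a dictionary keyed by tag, which is why an unregistered private tag falls back to 'UN'.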
As you mentioned, #1323 should have fixed this for sequences of unknown VR with undefined length. There is no way to know the VR of a private tag that is read as UN with a defined length if it is not registered in the private dictionary. You could register it yourself in your case if you need to write the data as implicit VR.

Thanks very much. At least in preliminary testing, it looks like writing as explicit VR and using private blocks will solve the issue. Closing this, with thanks!