Consider Optimization to hold onto only essential elements in Parser

Question

suyashkumar opened this issue 2 years ago · comments

When using the element-by-element API in the Parser, we probably don't need to hold the whole dataset in p.dataset (as I alluded to in #224 and elsewhere). We really only need to hold onto the subset of elements required to parse later elements (like PixelData). The rest we return to the user, who can decide what to do with them (for example, write them out somewhere and then throw them away).

This may reduce memory usage in some use cases for an end user, but because non-PixelData elements aren't usually very large, the impact will be small to medium.
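
A rough sketch of the idea, using simplified stand-in types rather than the package's real ones. Tag, Element, streamingParser, and the neededForPixelData set are all assumptions made for illustration, and the tag list is an illustrative subset, not the full set a real decoder needs:

```go
package main

import "fmt"

// Tag and Element are simplified stand-ins, not the dicom package's types.
type Tag struct{ Group, Element uint16 }

type Element struct {
	Tag   Tag
	Value []byte
}

// neededForPixelData lists tags whose values are consulted when decoding a
// later PixelData element (illustrative subset only).
var neededForPixelData = map[Tag]bool{
	{0x0028, 0x0002}: true, // SamplesPerPixel
	{0x0028, 0x0010}: true, // Rows
	{0x0028, 0x0011}: true, // Columns
	{0x0028, 0x0100}: true, // BitsAllocated
}

// streamingParser hands each element to the caller and retains only the
// ones listed in neededForPixelData.
type streamingParser struct {
	input    []Element // stands in for the underlying byte reader
	pos      int
	retained map[Tag]Element
}

func newStreamingParser(input []Element) *streamingParser {
	return &streamingParser{input: input, retained: map[Tag]Element{}}
}

// Next returns the next element, or ok == false once the input is exhausted.
func (p *streamingParser) Next() (e Element, ok bool) {
	if p.pos >= len(p.input) {
		return Element{}, false
	}
	e = p.input[p.pos]
	p.pos++
	if neededForPixelData[e.Tag] {
		p.retained[e.Tag] = e // keep only what later elements depend on
	}
	return e, true
}

func main() {
	p := newStreamingParser([]Element{
		{Tag: Tag{0x0010, 0x0010}, Value: []byte("Doe^Jane")}, // PatientName
		{Tag: Tag{0x0028, 0x0010}, Value: []byte{0x00, 0x02}}, // Rows
		{Tag: Tag{0x0028, 0x0011}, Value: []byte{0x00, 0x02}}, // Columns
	})
	seen := 0
	for {
		if _, ok := p.Next(); !ok {
			break
		}
		seen++ // the caller decides what to do with each element
	}
	fmt.Println(seen, len(p.retained))
}
```

Here the caller sees all three elements, while the parser itself holds onto only Rows and Columns.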

Suyash Kumar · Answer 1 · Sat Dec 18 2021 03:31:12 GMT+0800 (China Standard Time)

The basic implementation is pretty simple (I put something quick up here that I need to clean up and add better benchmarks for later: https://github.com/suyashkumar/dicom/compare/s/parser-optimize?expand=1). However, in the default case where ParseFile or Parse is called (where we need to build and return the entire dataset anyway), we may end up using slightly more memory, because we end up storing an additional copy of the metadata elements and the elements that are needed to read future elements. This set should be pretty small, though. One option is to put this behavior behind an option in the Parser so that it is only activated when a user is using the Parser in element-by-element mode, but it may not be worth the complexity.
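
For reference, gating the behavior behind an option could look roughly like the sketch below. Every name in it (ParserOption, RetainOnlyEssential, sketchParser) is hypothetical and not taken from the package or the linked branch; elements are reduced to name/value strings to keep it short:

```go
package main

import "fmt"

// All names here are hypothetical; they are not the dicom package's API.
type parserConfig struct {
	retainOnlyEssential bool
}

type ParserOption func(*parserConfig)

// RetainOnlyEssential is a made-up option enabling the reduced-memory mode.
func RetainOnlyEssential() ParserOption {
	return func(c *parserConfig) { c.retainOnlyEssential = true }
}

type sketchParser struct {
	cfg       parserConfig
	dataset   map[string]string // every element seen (default mode)
	essential map[string]string // only elements needed later (opt-in mode)
}

func newSketchParser(opts ...ParserOption) *sketchParser {
	p := &sketchParser{
		dataset:   map[string]string{},
		essential: map[string]string{},
	}
	for _, opt := range opts {
		opt(&p.cfg)
	}
	return p
}

// record stores a parsed element in exactly one place, so neither mode
// pays for a second copy.
func (p *sketchParser) record(name, value string, neededLater bool) {
	if !p.cfg.retainOnlyEssential {
		p.dataset[name] = value
		return
	}
	if neededLater {
		p.essential[name] = value
	}
}

// lookup finds a previously parsed element in whichever store is active.
func (p *sketchParser) lookup(name string) (string, bool) {
	if p.cfg.retainOnlyEssential {
		v, ok := p.essential[name]
		return v, ok
	}
	v, ok := p.dataset[name]
	return v, ok
}

func main() {
	full := newSketchParser()
	lean := newSketchParser(RetainOnlyEssential())
	for _, p := range []*sketchParser{full, lean} {
		p.record("PatientName", "Doe^Jane", false)
		p.record("Rows", "512", true)
	}
	fmt.Println(len(full.dataset), len(full.essential))
	fmt.Println(len(lean.dataset), len(lean.essential))
}
```

The trade-off is the one noted above: the default path keeps its current memory profile, at the cost of a second code path to maintain.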