dcm

High level DICOM network and file operations. Provides a CLI and an async Python API.

Full documentation available here.

This is a Python package and CLI application for performing high-level DICOM file and network operations, including queries, 'rsync'-like data transfers, data routing (including dynamic routing), data sorting, dumping/diffing data sets, and more.

The goal for the CLI is to provide a human friendly interface for working with DICOM data and networking. We hide many of the pitfalls and idiosyncrasies of the DICOM networking standard behind high level abstractions and sensible defaults. Command line users can tweak these defaults and access more advanced features through the powerful configuration system.

The goal for the Python API is similar, though aimed at a different audience. Again we want to provide a high-level, user-friendly API that hides many of the lower-level details when appropriate, while remaining much more expansive and flexible than the CLI. The API leverages asyncio for concurrency and uses type hints extensively (which are checked by mypy in the test suite).

This project would not be possible without the hard work of the pydicom and pynetdicom contributors.

Requires Python 3.7+.

CLI Quickstart

After installing the Python package you will have the dcm command available. Running dcm --help will give an overview of the CLI options.

Before interacting with most DICOM servers (e.g. a PACS) you will need to have your computer registered by the administrator of that system with a specific AETitle and port.

You can use the conf command to edit the TOML configuration file and add your AETitle and port in the [local_nodes.default] section. You can also add any PACS or other servers at this point in the [remote_nodes] section.
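For example, the local node section might look like this (the exact keys under [local_nodes.default] are an assumption modeled on the [remote_nodes] entries shown later; check the generated file for the canonical layout):

[local_nodes.default]
# Key names assumed by analogy with the remote node entries shown later
ae_title = "LOCALAE"
port = 11112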

You can use the echo command to test connectivity with a DICOM server:

$ dcm echo mypacs.org:MYPACS:11112
Success

Any command that requires DICOM network node information can take it directly as hostname:ae_title:port, as we have done above, or you can reference nodes in your config file by name. For example, if our config file contains this section:

[remote_nodes.mypacs]
hostname = "mypacs.org"
ae_title = "MYPACS"
port = 11112

We could rewrite the last command as simply:

$ dcm echo mypacs
Success

We can query the remote system using the query command:

$ dcm query mypacs StudyDate=20201031

2 patients | 3 studies | 56 series | 17174 instances
├── ID: Vamp001 | Name: Dracula^Count | 2 studies | 35 series | 8974 instances
│   ├── Date: 20201031 | Time: 164315.866000 | 21 series | 8200 instances
│   └── Date: 20201031 | Time: 203205.137000 | 14 series | 774 instances
└── ID: Vamp002 | Name: Chocula^Count | 1 studies | 21 series | 8200 instances
    └── Date: 20201031 | Time: 102407.269000 | 21 series | 8200 instances

Once we find the data we want to download we can use the sync command to transfer it. We can even just pipe the output from the query command into the sync command:

$ dcm query mypacs StudyDate=20201031 PatientID=Vamp001 | dcm sync path/to/save/data

This is equivalent to running:

$ dcm sync -s mypacs -q StudyDate=20201031 -q PatientID=Vamp001 path/to/save/data

The data will be saved into the provided directory, sorted into a directory hierarchy by patient / study / series.
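The exact directory names are derived from the data elements, so the following is only a hypothetical illustration of the shape of the output:

path/to/save/data/
└── Vamp001/
    └── 20201031/
        └── series_001/
            └── ...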

You can actually provide multiple sources and multiple destinations to the sync command, and both the sources and destinations can be either directories or network nodes. By default, whenever possible, any data that already exists on the destination is skipped (unless --force-all is used). For local directories we have no way of reliably knowing what already exists, so we can't do this (adding an SQLite database option to manage this is future work).
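For example, assuming the -s flag can be repeated and destinations can be listed positionally (otherpacs and backuppacs here are hypothetical nodes defined in your config file), a single invocation could pull from two sources into a local directory and another node:

$ # otherpacs and backuppacs are hypothetical nodes from your config
$ dcm sync -s mypacs -s otherpacs path/to/save/data backuppacs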

Python Quickstart

The net.LocalEntity class provides a high-level async API for most common DICOM networking tasks.

>>> import asyncio
>>> from dcm.net import DcmNode, LocalEntity
>>> local = LocalEntity(DcmNode('0.0.0.0', 'LOCALAE', 11112))
>>> mypacs = DcmNode('mypacs.org', 'MYPACS', 11112)
>>> asyncio.run(local.echo(mypacs))
True

We can use it to perform queries, either all at once with the query method or incrementally through an async generator using the queries method; both produce query.QueryResult objects.

>>> from pydicom import Dataset
>>> query = Dataset()
>>> query.StudyDate = '20201031'
>>> qr = asyncio.run(local.query(mypacs, query=query))
>>> print(qr.to_tree())

2 patients | 3 studies | 56 series | 17174 instances
├── ID: Vamp001 | Name: Dracula^Count | 2 studies | 35 series | 8974 instances
│   ├── Date: 20201031 | Time: 164315.866000 | 21 series | 8200 instances
│   └── Date: 20201031 | Time: 203205.137000 | 14 series | 774 instances
└── ID: Vamp002 | Name: Chocula^Count | 1 studies | 21 series | 8200 instances
    └── Date: 20201031 | Time: 102407.269000 | 21 series | 8200 instances
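The queries method lets us consume results as they arrive instead; a minimal sketch, assuming it accepts the same arguments as query (print_batches is just a hypothetical helper):

>>> async def print_batches(local, remote, query):
...     # Assumes queries takes the same arguments as query (not verified)
...     async for sub_qr in local.queries(remote, query=query):
...         print(sub_qr.to_tree())

>>> asyncio.run(print_batches(local, mypacs, query))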

The easiest way to save this data locally is to use the download method, which will save all the DICOM files into a single directory, using the SOPInstanceUID to name the files.

>>> asyncio.run(local.download(mypacs, qr, 'path/to/save/data'))

For more control you can use the retrieve method, which is an async generator that produces the incoming data as pydicom.Dataset objects.

>>> async def print_incoming(local, remote, qr):
...     async for ds in local.retrieve(remote, qr):
...         print(ds.SOPInstanceUID)

>>> asyncio.run(print_incoming(local, mypacs, qr))

Higher level data transfer functionality is exposed through the route and sync modules. You can create route.StaticRoute and route.DynamicRoute objects to add on-the-fly data filtering and dynamic selection of destinations.

A route.Router object can then be created with one or more routes, and used either for "pre-routing" via the Router.pre_route method or for on-the-fly routing via the Router.route context manager.
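A rough sketch of how these pieces might fit together (the constructor signatures below are assumptions for illustration, not the documented API; consult the route module documentation for the real ones):

>>> from dcm.route import StaticRoute, DynamicRoute, Router
>>> # Always send data to mypacs (assumed constructor arguments)
>>> static = StaticRoute([mypacs])
>>> # Choose destinations per data set (assumed constructor arguments)
>>> def choose_dests(ds):
...     return [mypacs] if ds.PatientID.startswith('Vamp') else []
>>> dynamic = DynamicRoute(choose_dests)
>>> router = Router([static, dynamic])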

The sync.sync function provides the highest-level abstraction for data transfers, and is the closest to the CLI sync command in functionality. This function is mostly a convenience wrapper around the sync.SyncManager class, which allows for more control over how transfers are generated and what to do with them.
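A hypothetical invocation might look like the following (the argument names and order are assumptions, not the actual signature; see the sync module documentation):

>>> from dcm.sync import sync
>>> # Hypothetical: transfer from mypacs through our static route (argument layout assumed)
>>> asyncio.run(sync([mypacs], [static]))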

Contributing Quickstart

If your system Python is too old, or you want to be able to run the tests locally against multiple Python versions, it is recommended that you use pyenv to manage installed Python versions.

We use the newer pyproject.toml format instead of a setup.py (plus the assortment of other files it replaces). Using poetry to manage dependencies and virtual environments is highly recommended.

All code should be formatted with the black code formatter; this will be done automatically before each commit by pre-commit if you run poetry run pre-commit install once from inside your local git repo.

All code should be typed and pass the mypy type checker unless there is a good reason not to.

Running Tests Locally

The dependencies needed for testing and development are all listed as poetry "development dependencies". Doing poetry run pytest --slow is the easiest way to run all the tests against the current environment. Leaving off the --slow argument will skip some particularly slow tests.
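In practice that looks something like:

$ poetry install
$ poetry run pytest --slow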

If you have multiple python versions setup with pyenv you can do poetry run tox to run the tests against all versions.

While the mypy checker is run by default by pytest, one advantage of mypy is that it can run many orders of magnitude faster than the test suite while still catching many errors. You can do poetry run mypy to run just the mypy checker.

Many tests will be skipped if dcmtk is not installed as we use it to provide a test server. Most tests will still run against a different test server provided by the pynetdicom qrscp app.

Continuous Integration

We use GitHub Actions to automatically run the test suite on all pushes and pull requests.

License

This project is released under the MIT License.