Eduardo Pinho

Posted on Jul 23, 2023

A new milestone: what's new in DICOM-rs 0.6.0

#rust #dicom

I recently released DICOM-rs v0.6.0. DICOM-rs is an implementation of the DICOM standard for the next generation of medical imaging systems. With a total of 71 pull requests, this version brings so much that I felt the need to write a blog post to expand on its major changes and talk about some cool new features.

So what's new?

Attribute selectors

It was not a secret that accessing specific elements from a DICOM object could get tricky, especially when working with nested data sets. In fact, accessing nested data set elements is not part of an obscure use case. This happens to be a common necessity, from working with various kinds of multi-frame instances, to structured reports.

For instance, this code snippet, brought up in a discussion, tries to retrieve the Referenced SOP Instance UID from a Referenced Image Sequence, as part of the Shared Functional Groups Sequence. This means navigating two levels of nested sequences.

let obj = open_file("../dicoms/my_dicom_path.dcm")?;

let referenced_sop_instance_uid = obj
    .element(tags::SHARED_FUNCTIONAL_GROUPS_SEQUENCE)?
    .items()
    .unwrap()
    .first()
    .unwrap()
    .element(tags::REFERENCED_IMAGE_SEQUENCE)?
    .items()
    .unwrap()
    .first()
    .unwrap()
    .element(tags::REFERENCED_SOP_INSTANCE_UID)?
    .to_str()?;

Handling every possible error case when accessing deeply nested attributes becomes unwieldy and prone to mistakes, as evidenced by this example. Here, an error is correctly raised when any of the three requested data elements does not exist, or when the Referenced SOP Instance UID value cannot be converted to a string (e.g. if it turns out to be a sequence itself). However, this code panics if either of the two data set sequences contains no items, or if the data happens to be malformed and the sequence element does not really hold a sequence. These situations might seem unlikely in a controlled scenario, but they leave room for attackers to feed meticulously crafted inputs which take down the whole thread or process!

The main difficulty comes from the fact that items() and first() return an Option, which can only be propagated with the ? operator in a function that also returns an Option. Since these calls are chained with methods returning a Result, the options need to be transformed into a Result first.
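The conversion itself can be sketched with nothing but the standard library. The snippet below is a toy model rather than DICOM-rs code: the first_item_name function, its string errors, and the slice standing in for a sequence are all made up for illustration. It shows ok_or turning each Option into a Result so that ? can propagate it.

```rust
/// Toy stand-in for reading the first item of a sequence element.
/// `None` plays the role of "this element does not hold a sequence".
/// Hypothetical function, not part of DICOM-rs.
fn first_item_name(sequence: Option<&[&str]>) -> Result<String, String> {
    let name = sequence
        // Option -> Result: the element is not a sequence at all
        .ok_or("not a sequence")?
        .first()
        // Option -> Result: the sequence has no items
        .ok_or("sequence is empty")?;
    Ok(name.to_string())
}
```

Calling first_item_name(None) or passing an empty slice yields an Err value instead of a panic, which is the property the unwrap-based chain above lacks.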

With Snafu, this can be done by adding context methods to the chain, either with a new custom error type or with the dynamic Whatever error:

let referenced_sop_instance_uid = obj
    .element(tags::SHARED_FUNCTIONAL_GROUPS_SEQUENCE)
    .whatever_context("Missing Shared Functional Groups Sequence")?
    .items()
    .whatever_context("Shared Functional Groups is not a sequence")?
    .first()
    .whatever_context("No items in Shared Functional Groups Sequence")?
    .element(tags::REFERENCED_IMAGE_SEQUENCE)
    .whatever_context("Missing Referenced Images Sequence")?
    .items()
    .whatever_context("Referenced Images Sequence is not a sequence")?
    .first()
    .whatever_context("No items in Referenced Images Sequence")?
    .element(tags::REFERENCED_SOP_INSTANCE_UID)
    .whatever_context("Missing Referenced SOP Instance UID")?
    .to_str()
    .whatever_context("Referenced SOP Instance UID cannot be converted to a string")?;

Needless to say, even the more correct version of the snippet continues to be very verbose.

The attribute selector API has come to change this, while also bringing new ways to edit and build DICOM objects. A new key component in the dicom-core crate is the AttributeSelector, which uniquely describes a specific attribute in a DICOM object, even if it is nested, by composing a list of sequence items and attribute tags. Attribute selectors can be constructed from tuples of tags, or tuples of interleaved tags and item indices. The following two selectors are equivalent:

use dicom_core::ops::AttributeSelector;

let selector: AttributeSelector = (
    tags::SHARED_FUNCTIONAL_GROUPS_SEQUENCE,
    0,
    tags::REFERENCED_IMAGE_SEQUENCE,
    0,
    tags::REFERENCED_SOP_INSTANCE_UID
).into();

// or omit the item indices to assume the first item
let selector: AttributeSelector = (
    tags::SHARED_FUNCTIONAL_GROUPS_SEQUENCE,
    tags::REFERENCED_IMAGE_SEQUENCE,
    tags::REFERENCED_SOP_INSTANCE_UID,
).into();

And the example above can now be written like this:

let referenced_sop_instance_uid = obj
    .value_at((
        tags::SHARED_FUNCTIONAL_GROUPS_SEQUENCE,
        tags::REFERENCED_IMAGE_SEQUENCE,
        tags::REFERENCED_SOP_INSTANCE_UID,
    ))
    .whatever_context("Could not retrieve Referenced SOP Instance UID")?
    .to_str()
    .whatever_context("Referenced SOP Instance UID cannot be converted to a string")?;
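To make the idea behind such a lookup more concrete, here is a minimal sketch of walking a path of sequence names and item indices over a toy nested structure. None of this is the dicom-core implementation: Node, DataSet, and this stand-alone value_at function are hypothetical names, and attributes are keyed by plain strings instead of tags. The point is only that every way the walk can fail is reported as an error value.

```rust
use std::collections::BTreeMap;

/// Toy nested data set (hypothetical, not a DICOM-rs type):
/// an attribute holds either text or a sequence of nested data sets.
enum Node {
    Text(String),
    Sequence(Vec<DataSet>),
}

type DataSet = BTreeMap<&'static str, Node>;

/// Follow `(sequence name, item index)` pairs down to the `leaf` attribute.
/// Every failure mode is reported as an `Err`; nothing here can panic.
fn value_at<'a>(
    root: &'a DataSet,
    path: &[(&str, usize)],
    leaf: &str,
) -> Result<&'a str, String> {
    let mut current = root;
    for &(name, index) in path {
        current = match current.get(name) {
            Some(Node::Sequence(items)) => items
                .get(index)
                .ok_or_else(|| format!("{} has no item {}", name, index))?,
            Some(Node::Text(_)) => return Err(format!("{} is not a sequence", name)),
            None => return Err(format!("missing {}", name)),
        };
    }
    match current.get(leaf) {
        Some(Node::Text(text)) => Ok(text.as_str()),
        Some(Node::Sequence(_)) => Err(format!("{} is a sequence, not a value", leaf)),
        None => Err(format!("missing {}", leaf)),
    }
}
```

A missing attribute, an empty sequence, and a non-sequence in the middle of the path all come back as distinct error messages, so the caller needs a single ? instead of one per step.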

Plus, there is a text syntax for turning strings into attribute selectors, so that tools may benefit from user-input attribute selectors. dicom-findscu already uses it for building the C-FIND query object based on user input.

# retrieve the modality worklist information
# for scheduled procedures where the patient has arrived
dicom-findscu INFO@pacs.example.com:1045 --mwl \
    -q ScheduledProcedureStepSequence \
    -q ScheduledProcedureStepSequence.ScheduledProcedureStepStatus=ARRIVED
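As a rough illustration of what parsing such query terms involves, the sketch below splits a term into a dotted selector path and an optional value. It is a simplified toy and not the parser shipped with DICOM-rs: the Step type, parse_selector, and parse_query are hypothetical names, only keyword paths with an optional [n] item index are handled, and whether this matches the real grammar in every detail is an assumption.

```rust
/// One step of a toy selector: an attribute keyword plus the item to enter.
#[derive(Debug, Clone, PartialEq)]
struct Step {
    keyword: String,
    item: usize,
}

/// Parse a dotted path such as `A.B[1].C` into steps.
/// A step without `[n]` is taken to mean the first item (index 0).
fn parse_selector(text: &str) -> Result<Vec<Step>, String> {
    text.split('.')
        .map(|part| -> Result<Step, String> {
            let (keyword, item) = match part.split_once('[') {
                Some((keyword, rest)) => {
                    let digits = rest
                        .strip_suffix(']')
                        .ok_or_else(|| format!("missing `]` in `{}`", part))?;
                    let item = digits
                        .parse::<usize>()
                        .map_err(|e| format!("bad index in `{}`: {}", part, e))?;
                    (keyword, item)
                }
                None => (part, 0),
            };
            if keyword.is_empty() {
                return Err(format!("empty keyword in `{}`", text));
            }
            Ok(Step { keyword: keyword.to_string(), item })
        })
        .collect()
}

/// Split a query term such as `A.B=VALUE` into its selector and optional value.
fn parse_query(term: &str) -> Result<(Vec<Step>, Option<String>), String> {
    match term.split_once('=') {
        Some((selector, value)) => Ok((parse_selector(selector)?, Some(value.to_string()))),
        None => Ok((parse_selector(term)?, None)),
    }
}
```

Under these assumptions, the second -q term in the command above would parse into two steps, both pointing at item 0, together with the value ARRIVED.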

Improved object and value ergonomics

Constructing objects from scratch also had a few unnecessary pain points. Here is another example from discussions:

let start_date_sequence: DataElement<InMemDicomObject, InMemFragment> = DataElement::new(
    tags::SCHEDULED_PROCEDURE_STEP_START_DATE,
    VR::LO,
    PrimitiveValue::from("19951015"),
);
let scheduled_procedure_step_sequence = DataElement::new(
    tags::SCHEDULED_PROCEDURE_STEP_SEQUENCE,
    VR::SQ,
    Value::new_sequence(smallvec![start_date_sequence], Length::UNDEFINED),
);
obj.push(scheduled_procedure_step_sequence);

This attempt failed with a compile-time error due to the inflexible type parameters of Value::new_sequence. In general, building up a DICOM value was a bit troublesome when dealing with data set sequences or pixel data fragment sequences. With the new version, the base DICOM value type DicomValue was further decomposed so that data set sequences now have their own type, DataSetSequence, which not only has nicer conversions from sequences of items, but can also be passed directly to DataElement::new in place of a DicomValue. In addition, the string types &str and String were granted a privileged conversion path to DicomValue.

let scheduled_procedure_step_sequence = DataElement::new(
    tags::SCHEDULED_PROCEDURE_STEP_SEQUENCE,
    VR::SQ,
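    DataSetSequence::from(vec![
        // each item of a data set sequence is itself a DICOM object
        InMemDicomObject::from_element_iter([
            DataElement::new(
                tags::SCHEDULED_PROCEDURE_STEP_START_DATE,
                // the standard VR of this attribute is DA (date)
                VR::DA,
                "19951015",
            ),
        ]),
    ]),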
);
obj.put(scheduled_procedure_step_sequence);

There are a few other breaking changes which were made either to fix existing problems, reduce building costs for unused features, or increase convenience of use. A few examples follow.

The type parameter P in DataElement<I, P> and DicomValue<I, P> now defaults to the current definition for an in-memory pixel data fragment, which is an alias to Vec<u8>.
It is no longer possible to compare a Tag for equality against a [u16; 2] or a (u16, u16). This was a rather obscure capability which caused issues when using assert_eq! in places where inference to Tag was expected, and developers are expected to convert those to Tag first anyway.
Some type name changes were made in dicom-dictionary-std to accommodate other kinds of dictionaries.
Error types coming from operations in dicom-ul and dicom-object have been redesigned to be smaller in size and more focused on the operation at hand. Any code which mentioned the specific error type by name may need to be updated accordingly.
A bug was fixed in AbortRQServiceProviderReason, which had two different errors combined into the same variant by mistake.
dicom-pixeldata now puts conversion of DICOM objects into dynamic image values behind the "image" feature, and into multi-dimensional arrays behind the "ndarray" feature. The "rayon" feature can also be excluded for use in an environment which does not support Rayon right off the bat.

Operations API

Still speaking of ergonomics, creating new objects is even easier with the attribute operations API! dicom-core now comes with the ApplyOp trait, a common API for manipulating attributes in several ways. Thanks to attribute actions such as Set, we can build the same example above with fewer lines and fewer chances of making mistakes.

// create an empty sequence
obj.apply(AttributeOp::new(
    tags::SCHEDULED_PROCEDURE_STEP_SEQUENCE,
    AttributeAction::Set(PrimitiveValue::Empty)
))?;
// create item and add procedure step start date
obj.apply(AttributeOp::new(
    (tags::SCHEDULED_PROCEDURE_STEP_SEQUENCE, tags::SCHEDULED_PROCEDURE_STEP_START_DATE),
    AttributeAction::SetStr("19951015".into())
))?;

UID constants and SOP class dictionary

The existence of constant declarations is a noteworthy improvement to developer experience. IDEs prepared for Rust development will index them and provide listings of constants to the developer automatically as they look for that specific attribute or abstract syntax.

DICOM-rs v0.4 brought constants for DICOM tags. With DICOM-rs v0.6, we now have constants for various normative DICOM unique identifiers (UIDs), spanning from SOP classes and transfer syntaxes, to even the so-called Well-known SOP Instances. No more copying around non-descriptive string literals when wishing to grab Patient Root Query/Retrieve Information Model - MOVE.

use dicom_dictionary_std::uids;

// old
let sop_class_uid = "1.2.840.10008.5.1.4.1.2.1.2";
// new
let sop_class_uid = uids::PATIENT_ROOT_QUERY_RETRIEVE_INFORMATION_MODEL_MOVE;

On top of this, dicom-dictionary-std also provides a run-time dictionary of SOP classes, so that applications can translate UIDs to their description. This is already in place in dicom-dump, so that the Media Storage SOP Class UID is presented with a human readable description next to the UID.
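Conceptually, such a run-time dictionary is a lookup from UID to description. The sketch below hand-writes a four-entry excerpt to show the idea. SOP_CLASSES and describe_uid are hypothetical names, not the dicom-dictionary-std API, and the real dictionary covers far more entries.

```rust
/// A tiny hand-written excerpt of a SOP class dictionary.
/// Hypothetical table for illustration only.
const SOP_CLASSES: &[(&str, &str)] = &[
    ("1.2.840.10008.1.1", "Verification SOP Class"),
    ("1.2.840.10008.5.1.4.1.1.2", "CT Image Storage"),
    ("1.2.840.10008.5.1.4.1.1.4", "MR Image Storage"),
    (
        "1.2.840.10008.5.1.4.1.2.1.2",
        "Patient Root Query/Retrieve Information Model - MOVE",
    ),
];

/// Look up the description of a UID, ignoring the trailing NUL padding
/// that UI values may carry to reach an even length.
fn describe_uid(uid: &str) -> Option<&'static str> {
    let uid = uid.trim_end_matches('\0');
    SOP_CLASSES
        .iter()
        .find(|(known, _)| *known == uid)
        .map(|(_, description)| *description)
}
```

A tool in the style of dicom-dump could then print the description next to the raw UID whenever describe_uid returns Some, and fall back to the bare UID otherwise.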

DICOM JSON support

As the world is unquestionably connected through the World Wide Web, many medical systems already rely on DICOMweb to communicate. Much of the effort in the project has gone into working with standard DICOM data representations and the good ol' upper layer network protocols, but DICOM-rs 0.6 now brings one piece of the bridge to web-oriented development.

dicom-json implements DICOM JSON with the ability to serialize DICOM data into JSON and JSON data back to DICOM data. This allows for an easy translation from types found in dicom-object and dicom-core to the textual representation that dominates most of the web nowadays.

let json = r#"{
    "00080021": { "vr": "DA", "Value":["20230610"] },
    "00200013": { "vr": "IS", "Value":["5"] }
}"#;
let obj: InMemDicomObject = dicom_json::from_str(&json)?;

Internally, serialization/deserialization is backed by Serde, a mature, well-established serialization framework. An added benefit of its use is that converting DICOM data to a JavaScript value via wasm-bindgen becomes available for free.
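One detail of the DICOM JSON model visible in the example above is that each attribute is keyed by its tag written as eight uppercase hexadecimal digits, group first. The sketch below shows that mapping in both directions using only the standard library. The json_key and parse_json_key helpers are hypothetical names for illustration and say nothing about how dicom-json handles keys internally.

```rust
/// Format a (group, element) pair as a DICOM JSON attribute key:
/// eight uppercase hexadecimal digits, group first.
fn json_key(group: u16, element: u16) -> String {
    format!("{:04X}{:04X}", group, element)
}

/// Parse such a key back into its (group, element) pair.
/// Returns `None` unless the key is exactly eight hexadecimal digits.
fn parse_json_key(key: &str) -> Option<(u16, u16)> {
    if key.len() != 8 || !key.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let group = u16::from_str_radix(&key[..4], 16).ok()?;
    let element = u16::from_str_radix(&key[4..], 16).ok()?;
    Some((group, element))
}
```

With these helpers, the two keys in the JSON document above correspond to the tags (0008,0021) and (0020,0013).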

Pixel data adapter API redesign

One of the open challenges in DICOM-rs is transcoding DICOM data sets between transfer syntaxes. While this is mostly trivial for transfer syntaxes of non-encapsulated uncompressed pixel data, existing tools in the project will choke if they need to decode or encode the pixel data in some way, such as decoding JPEG compressed imaging data into its native pixel data form.

While this will continue to be the case in v0.6.0, this version brings a more complete API for building pixel data readers and writers (also known as pixel data adapters), which are bound to transfer syntax implementations for the ability to convert pixel data in both directions. The old PixelRWAdapter trait was removed and two new traits took its place, with a new set of methods which can even work with individual frames. This change will mostly affect transfer syntax implementations and components relying on middle-level APIs.
Support for decoding RLE Lossless and for decoding/encoding some JPEG-based transfer syntaxes is already available, with more to come.
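To give a feel for the kind of work a pixel data reader does, here is a sketch of decoding a single RLE segment with the PackBits-style scheme described in the DICOM standard (PS3.5, Annex G). This is not the DICOM-rs adapter: decode_rle_segment is a hypothetical stand-alone function, and it leaves out the RLE header, the recombination of segments into pixels, and the trimming of padding that a complete decoder needs.

```rust
/// Decode one RLE segment. Each control byte `n` (read as a signed value) means:
/// 0..=127 copies the next `n + 1` bytes, -127..=-1 repeats the next byte
/// `-n + 1` times, and -128 does nothing.
fn decode_rle_segment(input: &[u8]) -> Result<Vec<u8>, String> {
    let mut output = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let control = input[pos] as i8;
        pos += 1;
        if control >= 0 {
            // literal run: copy the next `control + 1` bytes as they are
            let count = control as usize + 1;
            let run = input
                .get(pos..pos + count)
                .ok_or("truncated literal run")?;
            output.extend_from_slice(run);
            pos += count;
        } else if control != -128 {
            // replicate run: repeat the next byte `-control + 1` times
            let count = (-(control as i16)) as usize + 1;
            let byte = *input.get(pos).ok_or("truncated replicate run")?;
            output.extend(std::iter::repeat(byte).take(count));
            pos += 1;
        }
        // a control byte of -128 is a no-op and consumes nothing else
    }
    Ok(output)
}
```

Truncated input is reported as an error value instead of a panic, in line with the concern about malformed data raised earlier in this post.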

With this new API published, the foundation is laid for the next steps towards DICOM data transcoding, hopefully with the efficiency and performance that some users have come to expect.

There is a lot more that I could cover here, but at the risk of making this post more exhausting than it already is, this is where I stop. 😅 Have a look at the full changelog for the complete list of changes if you are interested.

New website

I would also like to take this opportunity to announce the new website for DICOM-rs, although it has been online for a few months. It is a seemingly simple one-page site published with GitHub Pages using Zola, and contains a quick overview of the project, as well as a few examples of what it can do.
A small surprise can be found in the middle: a live example of a DICOM dump application, which reads the user's DICOM file in the browser and shows some of its attributes.
All source code of the website, including the live example, is available here.

What's next?

Without replicating what's in the roadmap, I can point out some of the things that I have in mind for the next milestones.

As already mentioned above, an API for transcoding DICOM objects will be an early feature to focus on, possibly in the next minor release. This will then be integrated into the storescu tool to enable an automatic object conversion if requested by the service class provider.
The lack of high level constructs for writing DICOM network services is one of the most frequent concerns raised so far. The UL association API helps to accept and establish associations with other DICOM nodes, but there is still little hand-holding for sending and receiving commands and data in conformance with the intended DICOM Message Service Elements (DIMSE), and components to assist in compliance with the DICOM upper layer protocol state machine. There is also not a clear direction on how to make network services asynchronous, although this is in the project's plans. Making network operations non-blocking is important for a better usage of the computational resources available.
I also believe that there is an opportunity to revamp DICOM object reading and construction by declaring modules from Information Object Definitions (IODs) as specially annotated Rust types. It will be a great undertaking, but it will bring immense gains to the library in multiple ways when it's done.
Lazy DICOM object loading has been in my head since the beginning of the project, but it has been constantly set aside because it is difficult to get right while maintaining usability. This still ought to be tackled sooner or later. To help drive this forward, I am likely to take it in a different direction, by first implementing a separate API for reading and writing DICOM data in independent chunks.

Discussion is open through the respective GitHub discussion thread.
