Kai Zhang
Member of the Technical Staff
Nand Dalal
Member of the Technical Staff

The Modern 3-Step Approach to Organizing Large Medical Datasets


A key part of developing machine learning models for medical imaging is first selecting the relevant scans to train the model. And the number of scans available for developing models continues to grow. At Nines we've developed a more scalable approach to identifying the image type of a scan using modern deep learning algorithms.

Come up to standards

The volume of digital medical images created and transferred continues to swell. According to a study published in JAMA¹, CT imaging rates among older adults in US health care systems were 428 per 1,000 person-years in 2016 versus 204 per 1,000 person-years in 2000. One problem is that when it comes to organizing those images, the use of the DICOM standard is not so standard.

We in the academic and medical research community are grateful that some of this swell of large, appropriately anonymized image data is available as we build new service models. Many medical machine learning algorithms in academic literature were developed and measured using tailored, anonymized public datasets such as the Lung Image Database Consortium image collection (LIDC-IDRI) or ChestX-ray8 from the NIH.

At Nines we're building computer vision models that can analyze radiology images, and we use data responsibly approved and anonymized from select hospital providers. We curate diverse datasets in terms of institutions, scanner types, etc. to test the generalizability of our algorithms. But the benefits of this dataset diversity come with a downside: the data usually hasn't been curated as thoroughly as a public dataset or with machine learning in mind. For example, the process of anonymization can introduce inconsistent associations with DICOM files.

Developing a model requires selecting relevant scans, yet this seemingly straightforward step is nontrivial. Why? A label such as image type often is not associated with the DICOM files. It must be inferred from other DICOM fields, and in practice those fields often are not standardized across hospitals, manufacturers, and years.

While developing NinesAI™, our CT head emergent triaging device, we needed to select scans of one image type: axial non-contrast head CT. But if we were to develop a device to analyze lung nodules, we'd need to select chest CT scans with convolution kernels for lungs, a different image type. Ultimately, we need to associate an image type label with every image in our dataset of hundreds of thousands of scans. Yet the lack of standardization mentioned earlier makes that labeling inefficient and complex downstream.

We use a 3-part organization method that differs from historical approaches. Let's take a look.

Modern organization in 3 steps

Let’s look at a historical approach to identifying the image type of a scan and contrast it with Nines’ method. It’s useful to break the procedure down into three main steps: Data Understanding, Filter Construction, and Filter Application.

1. Data Understanding

This is the process of identifying which scans are of the desired image type and which are not, given a subset of a large dataset.

  • ~~▸ Historical approach: Inspect the DICOM fields used to infer the image type.
  • ~~~Here are some of the DICOM fields which can be used to select axial non-contrast head CT scans:
  • ~~~~ProtocolName
  • ~~~~ConvolutionKernel
  • ~~~~BodyPartExamined
  • ~~~~Modality
  • ~~~~ImageType
  • ~~~The problem is a lack of standardization. For example, just for the "ProtocolName" field there were 47 possible values, including:
  • ~~~~"Routine_Head"
  • ~~~~"1.1 Routine Head"
  • ~~~~"1.1 ROUTINE HEAD"
  • ~~~~"1_HeadRoutineSeq"
  • ~~~~"01_HeadRoutineSeq"
  • ~~~The second field above, ConvolutionKernel, can be a code name understood only by the scanner manufacturer. We saw 30+ possible values, including:
  • ~~~~"H27"
  • ~~~~"FC20"
  • ~~~~"UA"
  • ~~~~"STANDARD"
  • ~~~We can see how the volume of different DICOM fields with non-standard values quickly makes the number of valid combinations unmanageable. Let’s look at how Nines’ approach avoids this problem.
  • ~~▸ Nines’ approach: Label a few thousand scans with the image type of interest.
  • ~~~Provide a binary label for whether or not a scan is an axial non-contrast head CT. We do this based solely on the image data, excluding the various other DICOM fields considered in the historical approach.
  • ~~~Benefits of Nines' approach
  • ~~~⁃ Labeling the scans based on image data is much simpler and less error-prone because radiologists can reliably look at a scan and say whether it is the image type of interest.
  • ~~~⁃ We can also validate this approach by labeling a small subset of the scans multiple times and computing the inter-reader agreement. This tells us whether the radiologists agreed on the definition of the image type.
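
The inter-reader agreement check above can be sketched in a few lines of Python. This is a minimal illustration, not Nines' implementation: it assumes Cohen's kappa as the agreement measure for two readers (the post does not say which metric was used), and the function name and label values are hypothetical.

```python
from collections import Counter


def cohens_kappa(reader_a, reader_b):
    """Cohen's kappa for two readers' labels over the same scans."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(reader_a)
    # Observed agreement: fraction of scans where both readers gave the same label.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: derived from each reader's own label frequencies.
    freq_a = Counter(reader_a)
    freq_b = Counter(reader_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one identical label for every scan
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: 1 = axial non-contrast head CT, 0 = anything else.
reader_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
reader_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(reader_a, reader_b)
print(f"kappa = {kappa:.2f}")
```

A kappa near 1 suggests the readers share a definition of the image type; a low value is a signal to revisit the labeling instructions before labeling thousands of scans.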

2. Filter Construction

This is the process of using the data from the previous step (DICOM field values or image labels) to build an algorithm that can identify the image type in scans.

  • ~~▸ Historical approach: Write a rule-based filter based on logical combinations of the DICOM fields used to infer the image type.
  • ~~~This rule-based filter is code that needs to be meticulously written to handle the numerous edge cases that arise when handling multiple combinations of non-standard DICOM field values. For example, DICOM fields often have a large range of possible values, so one has to use advanced forms of regex-based string matching.
  • ~~▸ Nines' approach: Train a binary classification CNN (convolutional neural network) model to classify whether a given image is or is not of the image type.
  • ~~~Benefits of Nines' approach
  • ~~~⁃ The CNN model uses the images directly to identify the image type rather than relying on combinations of potentially inconsistent DICOM fields, which makes it more generalizable.
  • ~~~⁃ The code is much simpler. There are many libraries for training CNN models, reducing filtering code complexity.
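
To make the historical rule-based filter concrete, here is a minimal Python sketch. It is an assumption-laden illustration rather than the filter Nines wrote: the pattern covers only the five ProtocolName values quoted in step 1, headers are modeled as plain dictionaries, and the function name and the unseen value "HEAD ROUTINE W/O" are hypothetical.

```python
import re

# Hypothetical pattern covering only the ProtocolName variants quoted above.
ROUTINE_HEAD = re.compile(
    r"^(?:\d+(?:\.\d+)?[ _])?"                          # optional prefix such as "1.1 " or "01_"
    r"(?:routine[ _]?head|head[ _]?routine(?:seq)?)$",  # either word order
    re.IGNORECASE,
)


def is_candidate_head_ct(header):
    """Rule-based check over a dict mapping DICOM field names to string values."""
    if header.get("Modality") != "CT":
        return False
    protocol = header.get("ProtocolName", "").strip()
    return ROUTINE_HEAD.match(protocol) is not None


known_values = [
    "Routine_Head",
    "1.1 Routine Head",
    "1.1 ROUTINE HEAD",
    "1_HeadRoutineSeq",
    "01_HeadRoutineSeq",
]
for value in known_values:
    print(value, is_candidate_head_ct({"Modality": "CT", "ProtocolName": value}))

# A hypothetical value from a new hospital slips through until the pattern is extended.
print(is_candidate_head_ct({"Modality": "CT", "ProtocolName": "HEAD ROUTINE W/O"}))
```

Even this toy version needs a nontrivial pattern for one field, and a real filter would have to combine several such fields.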

3. Filter Application

This is the process of applying the constructed filter to the full dataset or to new, previously unseen datasets.
  • ~~▸ Historical approach: Apply the rule-based filter to the data and visualize the filtered scans to see if it worked as intended. If not, repeat steps 1 and 2.
  • ~~~When relying on DICOM fields we found that almost every time we applied the filter to a new dataset, we had to refine and reapply it. This meant that we had to revisit the rule-based filter repeatedly during model development and add features to handle new edge cases we found in the data.
  • ~~~We can see how using rule-based filters to identify the image type can add lots of overhead and lead to a complicated codebase that’s hard to maintain. Let’s look at how the Nines approach avoids these problems.
  • ~~▸ Nines' approach: Evaluate the binary classification CNN model on a representative set of images and tune the threshold for high precision.
  • ~~~Benefits of Nines' approach
  • ~~~⁃ We found this method to be more generalizable to new datasets since it relies on the images rather than the DICOM fields.
  • ~~~⁃ As the CNN model is more generalizable, it can be used to scale up to much larger datasets.
  • ~~~⁃ With this approach the model can be improved by labeling more data and retraining the model. This avoids having to add complex code to handle edge cases.
  • ~~~⁃ By measuring the performance of the model against a validation set with manual labels, we understand the accuracy of the filtering step. In contrast, the actual accuracy at scale is often unknown in rule-based systems.
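
Tuning the threshold for high precision can be sketched as follows. This is a minimal Python illustration under stated assumptions: the scores stand in for a classifier's output probabilities on a manually labeled validation set, the target precision of 0.95 is arbitrary, and all names and numbers are hypothetical.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when scans scoring >= threshold are kept."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(predicted, labels))
    fp = sum(p and y == 0 for p, y in zip(predicted, labels))
    fn = sum((not p) and y == 1 for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def pick_threshold(scores, labels, target_precision):
    """Lowest threshold whose precision meets the target, or None if none does.

    Recall never increases as the threshold rises, so the lowest qualifying
    threshold keeps the most true positives.
    """
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall_at(scores, labels, threshold)
        if precision >= target_precision:
            return threshold, precision, recall
    return None


# Hypothetical validation scores and radiologist labels (1 = desired image type).
scores = [0.97, 0.91, 0.85, 0.70, 0.62, 0.40, 0.30, 0.12]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
result = pick_threshold(scores, labels, target_precision=0.95)
print(result)
```

Favoring precision means some valid scans are discarded, which is an acceptable trade when the goal is a clean training set drawn from a very large pool.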

It is important to note that while the binary classification CNN model is more reliable than the rule-based filters, it still has a non-zero error rate. As such, we should apply this automated method only to training datasets that need to be scaled. Validation and testing datasets should be annotated manually by radiologists to ensure correctness.

While this method of using a binary classification CNN model was not used in the development of NinesAI, we plan to include this approach in current and future ML model development at Nines.

Key takeaways

A key part of the medical imaging model development process is first selecting the relevant scans to use from a large dataset. This can be nontrivial because an image type label often is not directly associated with the DICOM files and must be inferred.

We've seen how Nines’ approach of using the images directly via a CNN model is a simpler and more scalable way to infer image type than the historical approach of a rule-based filter relying on the DICOM fields.

We developed this approach by working closely with Nines radiologists, especially during the Data Understanding step. At Nines we’re investigating new ways to assist radiology workflows, using modern computer vision image type classifiers similar to those discussed here.

