MONAI Model Zoo

MONAI Model Zoo hosts a collection of medical imaging models in the MONAI Bundle format.

The MONAI Bundle format defines portable descriptions of deep learning models. A bundle includes the critical information necessary during a model's development life cycle and allows users and programs to understand the purpose and usage of the model.
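The bundles listed below can be fetched from this page or programmatically. A minimal sketch of the programmatic route with the monai.bundle Python API follows; the bundle name and target directory are illustrative, not taken from this page.

from monai.bundle import download

# fetches the bundle archive (configs/, models/, docs/) into ./bundles
download(name="spleen_ct_segmentation", bundle_dir="./bundles")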

All Models

Brats mri segmentation

MONAI team

A pre-trained model for volumetric (3D) segmentation of brain tumor subregions from multimodal MRIs based on BraTS 2018 data

Model Details
Brats mri segmentation Download

Model Metadata:

Overview: A pre-trained model for volumetric (3D) segmentation of brain tumor subregions from multimodal MRIs based on BraTS 2018 data

Author(s): MONAI team

References:

  • Myronenko, Andriy. '3D MRI brain tumor segmentation using autoencoder regularization.' International MICCAI Brainlesion Workshop. Springer, Cham, 2018. https://arxiv.org/abs/1810.11654

Downloads: 399

File Size: 33.5MB

Model README:

Model Overview

A pre-trained model for volumetric (3D) segmentation of brain tumor subregions from multimodal MRIs based on BraTS 2018 data. The whole pipeline is modified from clara_pt_brain_mri_segmentation .

The model is trained to segment 3 nested subregions of primary brain tumors (gliomas): the "enhancing tumor" (ET), the "tumor core" (TC), and the "whole tumor" (WT), based on 4 aligned input MRI scans (T1c, T1, T2, FLAIR).

  • The ET is described by areas that show hyperintensity in T1c when compared to T1, but also when compared to "healthy" white matter in T1c.
  • The TC describes the bulk of the tumor, which is what is typically resected. The TC entails the ET, as well as the necrotic (fluid-filled) and the non-enhancing (solid) parts of the tumor.
  • The WT describes the complete extent of the disease, as it entails the TC and the peritumoral edema (ED), which is typically depicted by hyper-intense signal in FLAIR.

Model workflow

Data

The training data is from the Multimodal Brain Tumor Segmentation Challenge (BraTS) 2018 .

  • Target: 3 tumor subregions
  • Task: Segmentation
  • Modality: MRI
  • Size: 285 3D volumes (4 channels each)

The provided labelled data was partitioned, based on our own split, into training (200 studies), validation (42 studies) and testing (43 studies) datasets.

Preprocessing

The data list/split can be created with the script scripts/prepare_datalist.py .

python scripts/prepare_datalist.py --path your-brats18-dataset-path

Training configuration

This model uses an approach similar to the one described in "3D MRI brain tumor segmentation using autoencoder regularization", which was a winning method in BraTS 2018 [1]. The training was performed with the following:

  • GPU: At least 16GB of GPU memory.
  • Actual Model Input: 224 x 224 x 144
  • AMP: True
  • Optimizer: Adam
  • Learning Rate: 1e-4
  • Loss: DiceLoss

Input

4 aligned MRI channels at 1 x 1 x 1 mm:

  • T1c
  • T1
  • T2
  • FLAIR

Output

3 channels:

  • Label 0: TC tumor subregion
  • Label 1: WT tumor subregion
  • Label 2: ET tumor subregion

Performance

Dice score was used for evaluating the performance of the model. This model achieved the following Dice scores on the validation data:

  • Tumor core (TC): 0.8559
  • Whole tumor (WT): 0.9026
  • Enhancing tumor (ET): 0.7905
  • Average: 0.8518
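For reference, a per-region Dice of this kind can be computed with MONAI's DiceMetric; the sketch below is illustrative only, and the tensor shapes and random masks are placeholders rather than the bundle's actual evaluation pipeline.

import torch
from monai.metrics import DiceMetric

dice = DiceMetric(include_background=True, reduction="mean_batch")
pred = (torch.rand(1, 3, 64, 64, 64) > 0.5).float()   # binary masks for TC, WT, ET
label = (torch.rand(1, 3, 64, 64, 64) > 0.5).float()
dice(y_pred=pred, y=label)
per_region = dice.aggregate()                          # one score per channel (TC, WT, ET)
print(per_region, per_region.mean())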

Training Loss and Dice

A graph showing the training loss and the mean dice over 300 epochs

Validation Dice

A graph showing the validation mean dice over 300 epochs

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.
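As a sketch of the Pythonic route mentioned above (assuming the working directory is the bundle root; the bundle_root override simply mirrors what a CLI override would do at runtime):

from monai.bundle import ConfigParser

parser = ConfigParser()
parser.read_config("configs/inference.json")
parser.read_meta("configs/metadata.json")
parser["bundle_root"] = "."                          # override a config item at runtime
network = parser.get_parsed_content("network_def")   # instantiate the network object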

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Override the train config to execute multi-GPU training:

torchrun --standalone --nnodes=1 --nproc_per_node=8 -m monai.bundle run training --meta_file configs/metadata.json --config_file "['configs/train.json','configs/multi_gpu_train.json']" --logging_file configs/logging.conf

Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove --standalone , modify --nnodes , or make other necessary changes according to the machine used. For more details, please refer to PyTorch's official tutorial.

Override the train config to execute evaluation with the trained model:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/train.json','configs/evaluate.json']" --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

References

[1] Myronenko, Andriy. "3D MRI brain tumor segmentation using autoencoder regularization." International MICCAI Brainlesion Workshop. Springer, Cham, 2018. https://arxiv.org/abs/1810.11654.

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Breast density classification

Center for Augmented Intelligence in Imaging, Mayo Clinic Florida

A pre-trained model for classifying breast images (mammograms)

Model Details
Breast density classification Download

Model Metadata:

Overview: A pre-trained model for classifying breast images (mammograms)

Author(s): Center for Augmented Intelligence in Imaging, Mayo Clinic Florida

References:

  • Gupta, Vikash, et al. A multi-reconstruction study of breast density estimation using Deep Learning. arXiv preprint arXiv:2202.08238 (2022).

Downloads: 32

File Size: 94.5MB

Model README:

Description

A pre-trained model for breast-density classification.

Model Overview

This model is trained using transfer learning on InceptionV3. The model weights were fine-tuned using Mayo Clinic data. The details of the training and data are outlined in https://arxiv.org/abs/2202.08238.

The bundle does not support torchscript.

Input and Output Formats

The input image should have the size [3, 299, 299]. The output is an array with probabilities for each of the four classes.

Sample Data

The sample_data folder contains a few example input images for each image category. These images are stored in JPEG format for sharing purposes.

Input and Output Formats

The input image should have the size [299, 299, 3]. DICOM images are single-channel; the channel can be repeated 3 times. The output is an array with probabilities for each of the four classes.
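As a hedged illustration of that input layout (the file path and intensity handling are placeholders, and the channel ordering follows the PyTorch channel-first convention used by InceptionV3; adjust if the bundle's own transforms expect channel-last):

import torch
import torch.nn.functional as F
from monai.transforms import LoadImage

img = LoadImage(image_only=True)("/path/to/mammogram.jpg")      # assumed single-channel image, H x W
img = torch.as_tensor(img, dtype=torch.float32)[None, None]      # 1 x 1 x H x W
img = F.interpolate(img, size=(299, 299), mode="bilinear", align_corners=False)
img = img.repeat(1, 3, 1, 1)                                      # repeat the channel: 1 x 3 x 299 x 299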

Commands Example

Create a JSON file with the names of all the input files by executing the following command:

python scripts/create_dataset.py -base_dir <path to the bundle root dir>/sample_data -output_file configs/sample_image_data.json

Change the filename for the data field to the absolute path of sample_image_data.json .

Add the scripts folder to your Python path as follows:

export PYTHONPATH=$PYTHONPATH:<path to the bundle root dir>/scripts

Execute Inference

Inference can be executed as follows:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

Execute training

Training support is a work in progress and will be shared in an upcoming version.

Contributors

This model is made available by the Center for Augmented Intelligence in Imaging, Mayo Clinic Florida. For questions, email Vikash Gupta (gupta.vikash@mayo.edu).

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Endoscopic inbody classification

NVIDIA DLMED team

A pre-trained binary classification model for endoscopic inbody classification task

Model Details
Endoscopic inbody classification Download

Model Metadata:

Overview: A pre-trained binary classification model for endoscopic inbody classification task

Author(s): NVIDIA DLMED team

References:

  • J. Hu, L. Shen and G. Sun, Squeeze-and-Excitation Networks, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132-7141. https://arxiv.org/pdf/1709.01507.pdf

Downloads: 415

File Size: 184.7MB

Model README:

Model Overview

A pre-trained model for the endoscopic in-body classification task, trained using the SEResNet50 architecture; details can be found in [1]. All datasets are from private samples of Activ Surgical. Samples in the training and validation datasets come from the same 4 videos, while test samples come from 2 different videos.

The PyTorch model and TorchScript model are shared on Google Drive. Modify the bundle_root parameter specified in configs/train.json and configs/inference.json to reflect where the models are downloaded. The expected directory for the downloaded models is models/ under bundle_root .


Data

The datasets used in this work were provided by Activ Surgical .

We've provided a link to 20 samples (10 in-body and 10 out-of-body) to show what this dataset looks like.

Preprocessing

After downloading this dataset, the Python script scripts/data_process.py can be used to generate the label JSON files by running the command below, with datapath set to the path of the unzipped downloaded data. The generated label JSON files will be stored in the label folder under the bundle path.

python scripts/data_process.py --datapath /path/to/data/root

By default, the label path parameters in train.json and inference.json of this bundle point to the generated label folder under the bundle path. If you move the generated label files to another place, please modify the train_json , val_json and test_json parameters specified in configs/train.json and configs/inference.json to point to where these label files are.

The input label JSON should be a list of dicts, each containing image and label keys. An example format is shown below.

[
    {
        "image":"/path/to/image/image_name0.jpg",
        "label": 0
    },
    {
        "image":"/path/to/image/image_name1.jpg",
        "label": 0
    },
    {
        "image":"/path/to/image/image_name2.jpg",
        "label": 1
    },
    ....
    {
        "image":"/path/to/image/image_namek.jpg",
        "label": 0
    },
]

Training configuration

The training was performed with the following:

  • GPU: At least 12GB of GPU memory
  • Actual Model Input: 256 x 256 x 3
  • Optimizer: Adam
  • Learning Rate: 1e-3

Input

A three channel video frame

Output

Two channels:

  • Label 0: in body
  • Label 1: out body

Performance

Accuracy was used for evaluating the performance of the model. This model achieves an accuracy score of 0.98.

Training Loss

A graph showing the training loss over 25 epochs.

Validation Accuracy

A graph showing the validation accuracy over 25 epochs.

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute training:

python -m monai.bundle run training \
    --meta_file configs/metadata.json \
    --config_file configs/train.json \
    --logging_file configs/logging.conf

Override the train config to execute multi-GPU training:

torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training \
    --meta_file configs/metadata.json \
    --config_file "['configs/train.json','configs/multi_gpu_train.json']" \
    --logging_file configs/logging.conf

Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove --standalone , modify --nnodes , or make other necessary changes according to the machine used. For more details, please refer to PyTorch's official tutorial.

Override the train config to execute evaluation with the trained model:

python -m monai.bundle run evaluating \
    --meta_file configs/metadata.json \
    --config_file "['configs/train.json','configs/evaluate.json']" \
    --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating \
    --meta_file configs/metadata.json \
    --config_file configs/inference.json \
    --logging_file configs/logging.conf

The classification result of every image in test.json will be printed to the screen.

Export checkpoint to TorchScript file:

python -m monai.bundle ckpt_export network_def \
    --filepath models/model.ts \
    --ckpt_file models/model.pt \
    --meta_file configs/metadata.json \
    --config_file configs/inference.json
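Once exported, the TorchScript model can be used without MONAI code. A minimal sketch, assuming the 256 x 256 x 3 model input above in channel-first layout and skipping the bundle's own preprocessing:

import torch

net = torch.jit.load("models/model.ts")
net.eval()
frame = torch.rand(1, 3, 256, 256)       # one normalized video frame (placeholder values)
with torch.no_grad():
    logits = net(frame)                   # two outputs: in body / out body
print(logits.softmax(dim=1))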

References

[1] J. Hu, L. Shen and G. Sun, Squeeze-and-Excitation Networks, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 7132-7141. https://arxiv.org/pdf/1709.01507.pdf

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Endoscopic tool segmentation

NVIDIA DLMED team

A pre-trained binary segmentation model for endoscopic tool segmentation

Model Details
Endoscopic tool segmentation Download

Model Metadata:

Overview: A pre-trained binary segmentation model for endoscopic tool segmentation

Author(s): NVIDIA DLMED team

References:

  • Tan, M. and Le, Q. V. Efficientnet: Rethinking model scaling for convolutional neural networks. ICML, 2019a. https://arxiv.org/pdf/1905.11946.pdf
  • O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer, 2015. https://arxiv.org/pdf/1505.04597.pdf

Downloads: 389

File Size: 81.7MB

Model README:

Model Overview

A pre-trained model for the endoscopic tool segmentation task. It is trained using a flexible UNet structure with an EfficientNet-B2 [1] backbone and a UNet architecture [2] as the decoder. The datasets are private samples from Activ Surgical.

The PyTorch model and TorchScript model are shared on Google Drive; details can be found in the large_files.yml file. Modify the "bundle_root" parameter specified in configs/train.json and configs/inference.json to reflect where the models are downloaded. The expected directory for the downloaded models is "models/" under "bundle_root".


Data

Datasets used in this work were provided by Activ Surgical .

Since datasets are private, existing public datasets like EndoVis 2017 can be used to train a similar model.

Preprocessing

When using EndoVis or any other dataset, it should be divided into "train", "valid" and "test" folders. Samples in each folder should preferably be images converted to jpg format; otherwise, the "images", "labels", "val_images" and "val_labels" parameters in "configs/train.json" and the "datalist" in "configs/inference.json" should be modified to fit the given dataset. After that, the "dataset_dir" parameter in "configs/train.json" and "configs/inference.json" should be changed to the root folder containing the "train", "valid" and "test" folders.

Please note that the data loading in this bundle is adaptive. If images and labels are not in the same format, this may lead to a mismatch problem. For example, if images are in jpg format and labels are in npy format, PIL and NumPy readers will be used to load the images and labels, respectively. Since these two readers parse the array shape differently, the loaded labels will be the transpose of the correct ones, causing a mismatch.
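One way to avoid such a mismatch, if you customize the configs, is to pin a single reader for both keys. A hedged sketch (the keys and the choice of PILReader are illustrative):

from monai.transforms import LoadImaged

loader = LoadImaged(keys=["image", "label"], reader="PILReader")
sample = loader({"image": "/path/to/frame.jpg", "label": "/path/to/mask.jpg"})
print(sample["image"].shape, sample["label"].shape)   # shapes should now be consistent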

Training configuration

The training was performed with the following:

  • GPU: At least 12GB of GPU memory
  • Actual Model Input: 736 x 480 x 3
  • Optimizer: Adam
  • Learning Rate: 1e-4

Input

A three channel video frame

Output

Two channels:

  • Label 1: tools
  • Label 0: everything else

Performance

IoU was used for evaluating the performance of the model. This model achieves a mean IoU score of 0.87.

Training Loss

A graph showing the training loss over 100 epochs.

Validation IoU

A graph showing the validation mean IoU over 100 epochs.

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Override the train config to execute evaluation with the trained model:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/train.json','configs/evaluate.json']" --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

Export checkpoint to TorchScript file:

python -m monai.bundle ckpt_export network_def --filepath models/model.ts --ckpt_file models/model.pt --meta_file configs/metadata.json --config_file configs/inference.json

References

[1] Tan, M. and Le, Q. V. Efficientnet: Rethinking model scaling for convolutional neural networks. ICML, 2019a. https://arxiv.org/pdf/1905.11946.pdf

[2] O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer, 2015. https://arxiv.org/pdf/1505.04597.pdf

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Lung nodule ct detection

MONAI team

A pre-trained model for volumetric (3D) detection of the lung lesion from CT image on LUNA16 dataset

Model Details
Lung nodule ct detection Download

Model Metadata:

Overview: A pre-trained model for volumetric (3D) detection of the lung lesion from CT image on LUNA16 dataset

Author(s): MONAI team

References:

  • Lin, Tsung-Yi, et al. 'Focal loss for dense object detection.' ICCV 2017. https://arxiv.org/abs/1708.02002

Downloads: 390

File Size: 148.1MB

Model README:

Model Overview

A pre-trained model for volumetric (3D) detection of the lung nodule from CT image.

This model is trained on LUNA16 dataset (https://luna16.grand-challenge.org/Home/), using the RetinaNet (Lin, Tsung-Yi, et al. "Focal loss for dense object detection." ICCV 2017. https://arxiv.org/abs/1708.02002).

model workflow

Data

The dataset we experiment with in this example is LUNA16 (https://luna16.grand-challenge.org/Home/), which is based on the LIDC-IDRI database [3,4,5].

LUNA16 is a public dataset of CT lung nodule detection. Using raw CT scans, the goal is to identify locations of possible nodules, and to assign a probability for being a nodule to each location.

Disclaimer: We are not the host of the data. Please make sure to read the requirements and usage policies of the data and give credit to the authors of the dataset! We acknowledge the National Cancer Institute and the Foundation for the National Institutes of Health, and their critical role in the creation of the free publicly available LIDC/IDRI Database used in this study.

10-fold data splitting

We follow the official 10-fold data splitting from the LUNA16 challenge and generate data split JSON files using the script from nnDetection .

Please download the resulting JSON files from https://github.com/Project-MONAI/MONAI-extra-test-data/releases/download/0.8.1/LUNA16_datasplit-20220615T233840Z-001.zip.

In these files, the values of "box" are the ground truth boxes in world coordinates.

Data resampling

The raw CT images in LUNA16 have a variety of voxel sizes. The first step is to resample them to the same voxel size. In this model, we resampled them to 0.703125 x 0.703125 x 1.25 mm.

Please follow the instructions in Section 3.1 of https://github.com/Project-MONAI/tutorials/tree/main/detection to do the resampling.
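The linked tutorial is the authoritative recipe; as a hedged sketch, the same target voxel size can be expressed with MONAI transforms roughly as follows (the file path and interpolation mode are illustrative):

from monai.transforms import Compose, LoadImaged, EnsureChannelFirstd, Orientationd, Spacingd

resample = Compose([
    LoadImaged(keys="image"),
    EnsureChannelFirstd(keys="image"),
    Orientationd(keys="image", axcodes="RAS"),
    Spacingd(keys="image", pixdim=(0.703125, 0.703125, 1.25), mode="bilinear"),
])
data = resample({"image": "/path/to/luna16_ct.nii.gz"})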

Data download

The mhd/raw original data can be downloaded from LUNA16 . The DICOM original data can be downloaded from LIDC-IDRI database [3,4,5]. You will need to resample the original data to start training.

Alternatively, we provide resampled nifti images and a copy of original mhd/raw images from LUNA16 for users to download.

Training configuration

The training was performed with the following:

  • GPU: at least 16GB GPU memory
  • Actual Model Input: 192 x 192 x 80
  • AMP: True
  • Optimizer: Adam
  • Learning Rate: 1e-2
  • Loss: BCE loss and L1 loss

Input

1 channel - List of 3D CT patches

Output

In Training Mode: A dictionary of classification and box regression loss.

In Evaluation Mode: A list of dictionaries of predicted box, classification label, and classification score.

Performance

The COCO metric is used for evaluating the performance of the model. The pre-trained model was trained and validated on data fold 0. It achieves mAP=0.853, mAR=0.994, AP(IoU=0.1)=0.862, and AR(IoU=0.1)=1.0.

Training Loss

A graph showing the detection train loss

Validation Accuracy

The validation accuracy in this curve is the mean of mAP, mAR, AP(IoU=0.1), and AR(IoU=0.1) from the COCO metric.

A graph showing the detection val accuracy

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Override the train config to execute evaluation with the trained model:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/train.json','configs/evaluate.json']" --logging_file configs/logging.conf

Execute inference on resampled LUNA16 images by setting "whether_raw_luna16": false in inference.json :

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

With the same command, we can execute inference on original LUNA16 images by setting "whether_raw_luna16": true in inference.json . Remember to also set "data_list_file_path": "$@bundle_root + '/LUNA16_datasplit/mhd_original/dataset_fold0.json'" and change "data_file_base_dir" .

Note that in inference.json, the "LoadImaged" transform in "preprocessing" and the "AffineBoxToWorldCoordinated" transform in "postprocessing" both have "affine_lps_to_ras": true . This depends on the input images: LUNA16 needs "affine_lps_to_ras": true , but it is possible that your inference dataset should set "affine_lps_to_ras": false .

References

[1] Lin, Tsung-Yi, et al. "Focal loss for dense object detection." ICCV 2017. https://arxiv.org/abs/1708.02002

[2] Baumgartner and Jaeger et al. "nnDetection: A self-configuring method for medical object detection." MICCAI 2021. https://arxiv.org/pdf/2106.00817.pdf

[3] Armato III, S. G., McLennan, G., Bidaut, L., McNitt-Gray, M. F., Meyer, C. R., Reeves, A. P., Zhao, B., Aberle, D. R., Henschke, C. I., Hoffman, E. A., Kazerooni, E. A., MacMahon, H., Van Beek, E. J. R., Yankelevitz, D., Biancardi, A. M., Bland, P. H., Brown, M. S., Engelmann, R. M., Laderach, G. E., Max, D., Pais, R. C. , Qing, D. P. Y. , Roberts, R. Y., Smith, A. R., Starkey, A., Batra, P., Caligiuri, P., Farooqi, A., Gladish, G. W., Jude, C. M., Munden, R. F., Petkovska, I., Quint, L. E., Schwartz, L. H., Sundaram, B., Dodd, L. E., Fenimore, C., Gur, D., Petrick, N., Freymann, J., Kirby, J., Hughes, B., Casteele, A. V., Gupte, S., Sallam, M., Heath, M. D., Kuhn, M. H., Dharaiya, E., Burns, R., Fryd, D. S., Salganicoff, M., Anand, V., Shreter, U., Vastagh, S., Croft, B. Y., Clarke, L. P. (2015). Data From LIDC-IDRI [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2015.LO9QL9SX

[4] Armato SG 3rd, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, Zhao B, Aberle DR, Henschke CI, Hoffman EA, Kazerooni EA, MacMahon H, Van Beeke EJ, Yankelevitz D, Biancardi AM, Bland PH, Brown MS, Engelmann RM, Laderach GE, Max D, Pais RC, Qing DP, Roberts RY, Smith AR, Starkey A, Batrah P, Caligiuri P, Farooqi A, Gladish GW, Jude CM, Munden RF, Petkovska I, Quint LE, Schwartz LH, Sundaram B, Dodd LE, Fenimore C, Gur D, Petrick N, Freymann J, Kirby J, Hughes B, Casteele AV, Gupte S, Sallamm M, Heath MD, Kuhn MH, Dharaiya E, Burns R, Fryd DS, Salganicoff M, Anand V, Shreter U, Vastagh S, Croft BY. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans. Medical Physics, 38: 915--931, 2011. DOI: https://doi.org/10.1118/1.3528204

[5] Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, 26(6), 1045–1057. https://doi.org/10.1007/s10278-013-9622-7

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Mednist gan

MONAI Team

This example of a GAN generator produces hand xray images like those in the MedNIST dataset

Model Details
Mednist gan Download

Model Metadata:

Overview: This example of a GAN generator produces hand xray images like those in the MedNIST dataset

Author(s): MONAI Team

Downloads: 147

File Size: 1.1MB

Model README:

MedNIST GAN Hand Model

This model is a generator for creating images like the Hand category in the MedNIST dataset. It was trained as a GAN and accepts random values as inputs to produce an image output. The train.json file describes the training process along with the definition of the discriminator network used, and is based on the MONAI GAN tutorials .

This is a demonstration network meant only to show the training process for this sort of network with MONAI; its outputs are not particularly good and are of the same tiny size as the images in MedNIST. The training process was very short, so a network with a longer training time would produce better results.

Downloading the Dataset

Download the dataset from here and extract the contents to a convenient location.

The MedNIST dataset was gathered from several sets from TCIA , the RSNA Bone Age Challenge , and the NIH Chest X-ray dataset .

The dataset is kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic) under the Creative Commons CC BY-SA 4.0 license .

If you use the MedNIST dataset, please acknowledge the source.

Training

Assuming the current directory is the bundle directory, and the dataset was extracted to the directory ./MedNIST , the following command will train the network for 50 epochs:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf --bundle_root .

Note also that the output from training will be placed in the models directory but will not overwrite the model.pt file that may already be there. After checking that the results are correct, you will have to manually rename the most recent checkpoint file to model.pt to use the inference script mentioned below. This saved checkpoint contains a dictionary with the generator weights stored under the model key; the discriminator is omitted.

Another feature in the training file is the addition of a sigmoid activation to the network by modifying its structure at runtime. This is done with a line in the training section calling add_module on a layer of the network. This works best for training, although the definition of the model then no longer strictly matches what is in the generator section.
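As a standalone illustration of the add_module mechanism (the bundle does this from a config expression at training time, not from Python code like this, and the layers below are placeholders):

import torch.nn as nn

generator = nn.Sequential(
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),
    nn.ReLU(),
    nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
)
generator.add_module("activation", nn.Sigmoid())   # appended at runtime, after the last layer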

The generator and discriminator networks were both trained with the Adam optimizer with a learning rate of 0.0002 and betas values [0.5, 0.999] . These have been empirically found to be good values for the optimizer and this GAN problem.

Inference

The included inference.json generates a set number of png samples from the network and saves these to the directory ./outputs . The output directory can be changed by setting the output_dir value, and the number of samples changed by setting the num_samples value. The following command line assumes it is invoked in the bundle directory:

python -m monai.bundle run inferring --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf --bundle_root .

Note that this script uses postprocessing to apply the sigmoid activation to the model's outputs and to save the results to image files.

Export

The generator can be exported to a TorchScript bundle with the following:

python -m monai.bundle ckpt_export network_def --filepath mednist_gan.ts --ckpt_file models/model.pt --meta_file configs/metadata.json --config_file configs/inference.json

The model can be loaded without MONAI code after this operation. For example, an image can be generated from a set of random values with:

import torch
net = torch.jit.load("mednist_gan.ts")
latent = torch.rand(1, 64)
img = net(latent)  # (1,1,64,64)

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Mednist reg

MONAI team

This is an example of a ResNet and spatial transformer for hand xray image registration

Model Details
Mednist reg Download

Model Metadata:

Overview: This is an example of a ResNet and spatial transformer for hand xray image registration

Author(s): MONAI team

Downloads: 1

File Size: 40.3MB

Model README:

MedNIST Hand Image Registration

Based on the 2D registration tutorial.

Downloading the Dataset

Download the dataset from here and extract the contents to a convenient location.

The MedNIST dataset was gathered from several sets from TCIA , the RSNA Bone Age Challenge , and the NIH Chest X-ray dataset .

The dataset is kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic) under the Creative Commons CC BY-SA 4.0 license .

If you use the MedNIST dataset, please acknowledge the source.

Training

Training with same-subject image inputs

python -m monai.bundle run training --config_file configs/train.yaml --dataset_dir "/workspace/data/MedNIST/Hand"

Training with cross-subject image inputs

python -m monai.bundle run training \
  --config_file configs/train.yaml \
  --dataset_dir "/workspace/data/MedNIST/Hand" \
  --cross_subjects True

Training from an existing checkpoint file, for example, models/model_key_metric=-0.0734.pt :

python -m monai.bundle run training --config_file configs/train.yaml [...omitting other args] --ckpt "models/model_key_metric=-0.0734.pt"

Inference

The following figure shows intra-subject ( --cross_subjects False ) model inference results (fixed, moving, and predicted images from left to right).


The following command runs an inference workflow with the checkpoint "models/model_key_metric=-0.0890.pt" on device "cuda:1" :

python -m monai.bundle run eval \
  --config_file configs/inference.yaml \
  --ckpt "models/model_key_metric=-0.0890.pt" \
  --logging_file configs/logging.conf \
  --device "cuda:1"

Fine-tuning for cross-subject alignments

The following command starts a fine-tuning workflow based on the checkpoint "models/model_key_metric=-0.0065.pt" for 5 epochs using the global mutual information loss.

python -m monai.bundle run training \
  --config_file configs/train.yaml \
  --cross_subjects True \
  --ckpt "models/model_key_metric=-0.0065.pt" \
  --lr 0.000001 \
  --trainer#loss_function "@mutual_info_loss" \
  --max_epochs 5

The following figure shows inter-subject ( --cross_subjects True ) model inference results (fixed, moving, and predicted images from left to right).


Visualize the first pair of images for debugging (requires matplotlib )

python -m monai.bundle run display --config_file configs/train.yaml
python -m monai.bundle run display --config_file configs/train.yaml --cross_subjects True

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Pancreas ct dints segmentation

MONAI team

Searched architectures for volumetric (3D) segmentation of the pancreas from CT image

Model Details
Pancreas ct dints segmentation Download

Model Metadata:

Overview: Searched architectures for volumetric (3D) segmentation of the pancreas from CT image

Author(s): MONAI team

References:

  • He, Y., Yang, D., Roth, H., Zhao, C. and Xu, D., 2021. Dints: Differentiable neural network topology search for 3d medical image segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5841-5850).

Downloads: 202

File Size: 943.3MB

Model README:

Model Overview

A model for volumetric (3D) segmentation of the pancreas and pancreatic tumor from CT images. This model is trained using the neural network found by the neural architecture search algorithm DiNTS [1].


Data

The training dataset is the Pancreas Task from the Medical Segmentation Decathlon. Users can find more details on the datasets at http://medicaldecathlon.com/.

  • Target: Pancreas and tumour
  • Modality: Portal venous phase CT
  • Size: 420 3D volumes (282 Training + 139 Testing)
  • Source: Memorial Sloan Kettering Cancer Center
  • Challenge: Label imbalance with large (background), medium (pancreas) and small (tumour) structures.

Preprocessing

The data list/split can be created with the script scripts/prepare_datalist.py .

python scripts/prepare_datalist.py --path /path-to-Task07_Pancreas/ --output configs/dataset_0.json

Training configuration

The training was performed with at least 16GB-memory GPUs.

Actual Model Input: 96 x 96 x 96

Neural Architecture Search Configuration

The neural architecture search was performed with the following:

  • AMP: True
  • Optimizer: SGD
  • Initial Learning Rate: 0.025
  • Loss: DiceCELoss

Optimal Architecture Training Configuration

The training was performed with the following:

  • AMP: True
  • Optimizer: SGD
  • (Initial) Learning Rate: 0.025
  • Loss: DiceCELoss
  • Note: If out-of-memory or program crash occurs while caching the data set, please change the cache_rate in CacheDataset to a lower value in the range (0, 1).

The segmentation of the pancreas region is formulated as a voxel-wise 3-class classification: each voxel is predicted as pancreas body, tumour, or background. The model is optimized with a gradient descent method, minimizing the soft Dice loss and cross-entropy loss between the predicted mask and the ground truth segmentation.
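A hedged sketch of that loss setup (argument values and tensor shapes are assumptions; the bundle's train config remains the reference):

import torch
from monai.losses import DiceCELoss

loss_fn = DiceCELoss(to_onehot_y=True, softmax=True)   # soft Dice + cross entropy
logits = torch.rand(2, 3, 96, 96, 96)                   # 3 classes: background, pancreas, tumour
target = torch.randint(0, 3, (2, 1, 96, 96, 96))        # class indices with a channel dimension
loss = loss_fn(logits, target)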

Input

One channel - CT image

Output

Three channels:

  • Label 2: pancreatic tumor
  • Label 1: pancreas
  • Label 0: everything else

Performance

Dice score is used for evaluating the performance of the model. This model achieves a mean dice score of 0.62.

Training Loss

Training loss over 3200 epochs (the bright curve is smoothed, and the dark one is the actual curve)

Validation Dice

Validation mean dice score over 3200 epochs (the bright curve is smoothed, and the dark one is the actual curve)

Searched Architecture Visualization

Users can install Graphviz for visualization of searched architectures (needed in custom/decode_plot.py). The edges between nodes indicate global structure, and numbers next to edges represent different operations in the cell search space. An example of a searched architecture is shown below:

Example of Searched Architecture

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute model searching:

python -m scripts.search run --config_file configs/search.yaml

Execute multi-GPU model searching (recommended):

torchrun --nnodes=1 --nproc_per_node=8 -m scripts.search run --config_file configs/search.yaml

Execute training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.yaml --logging_file configs/logging.conf

Override the train config to execute multi-GPU training:

torchrun --nnodes=1 --nproc_per_node=2 -m monai.bundle run training --meta_file configs/metadata.json --config_file "['configs/train.yaml','configs/multi_gpu_train.yaml']" --logging_file configs/logging.conf

Override the train config to execute evaluation with the trained model:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/train.yaml','configs/evaluate.yaml']" --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.yaml --logging_file configs/logging.conf

Export checkpoint for TorchScript

python -m monai.bundle ckpt_export network_def --filepath models/model.ts --ckpt_file models/model.pt --meta_file configs/metadata.json --config_file configs/inference.yaml

References

[1] He, Y., Yang, D., Roth, H., Zhao, C. and Xu, D., 2021. Dints: Differentiable neural network topology search for 3d medical image segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 5841-5850).

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Pathology tumor detection

MONAI team

A pre-trained model for metastasis detection on Camelyon 16 dataset.

Model Details
Pathology tumor detection Download

Model Metadata:

Overview: A pre-trained model for metastasis detection on Camelyon 16 dataset.

Author(s): MONAI team

References:

Downloads: 148

File Size: 43.7MB

Model README:

Model Overview

A pre-trained model for automated detection of metastases in whole-slide histopathology images.

The model is trained based on ResNet18 [1], with the last fully connected layer replaced by a 1x1 convolution layer.

A diagram showing the flow from model input, through the model architecture, to the model output

Data

All the data used to train, validate, and test this model is from Camelyon-16 Challenge . You can download all the images for "CAMELYON16" data set from various sources listed here .

Location information for training/validation patches (the location on the whole slide image where patches are extracted) is adopted from NCRF/coords .

Annotation information is adopted from NCRF/jsons .

  • Target: Tumor
  • Task: Detection
  • Modality: Histopathology
  • Size: 270 WSIs for training/validation, 48 WSIs for testing

Preprocessing

This bundle expects the training/validation data (whole slide images) to reside in {data_root}/training/images . By default, data_root points to /workspace/data/medical/pathology/ . You can modify data_root in the bundle config files to point to a different directory.

To reduce the computational burden during inference, patches are extracted only where there is tissue, ignoring the background according to a tissue mask. Please also create a directory for the prediction output. By default, output_dir is set to the eval folder under the bundle root.

Please refer to the "Annotation" section of the Camelyon challenge to prepare ground truth images, which are needed for FROC computation. By default, this data set is expected to be at /workspace/data/medical/pathology/ground_truths , but this can be modified in evaluate_froc.sh .

Training configuration

The training was performed with the following:

  • Config file: train.config
  • GPU: at least 16 GB of GPU memory.
  • Actual Model Input: 224 x 224 x 3
  • AMP: True
  • Optimizer: Novograd
  • Learning Rate: 1e-3
  • Loss: BCEWithLogitsLoss
  • Whole slide image reader: cuCIM (if running on Windows or Mac, please install OpenSlide on your system and change wsi_reader to "OpenSlide")

Input

The input to the training pipeline is a JSON file (dataset.json) that includes the path to each WSI, along with the location and label information for each training patch.
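As a purely illustrative sketch of the kind of record such a file carries for each training patch (the key names and values here are hypothetical, not the bundle's actual schema):

record = {
    "image": "/path/to/wsi/tumor_001.tif",   # hypothetical path to one whole-slide image
    "location": [38500, 21200],               # hypothetical patch location on that WSI
    "label": 1,                                # hypothetical label: 1 for tumor, 0 for normal
}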

Output

A probability value indicating whether the input patch is tumor or normal.

Inference on a WSI

Inference is performed on the WSIs in a sliding-window manner with a specified stride. A foreground mask is needed to specify the region on which inference will be performed, given that the background region, which contains no tissue at all, can occupy a significant portion of a WSI. The output of the inference pipeline is a probability map whose size is 1/stride of the original WSI size.
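A hedged sketch of that sliding-window idea in plain PyTorch/NumPy (the read_patch helper, the stride, and the patch size are all illustrative; the bundle's inference config is the reference implementation):

import numpy as np
import torch

def probability_map(model, read_patch, tissue_mask, stride=128, patch_size=224):
    # tissue_mask: boolean grid with one cell per stride step over the WSI
    h, w = tissue_mask.shape
    prob_map = np.zeros((h, w), dtype=np.float32)
    model.eval()
    with torch.no_grad():
        for i in range(h):
            for j in range(w):
                if not tissue_mask[i, j]:
                    continue                                            # skip background (no tissue)
                patch = read_patch(i * stride, j * stride, patch_size)  # 3 x 224 x 224 tensor
                prob_map[i, j] = torch.sigmoid(model(patch[None])).item()
    return prob_map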

Performance

The FROC score is used for evaluating the performance of the model. After inference is done, evaluate_froc.sh needs to be run to evaluate the FROC score based on the predicted probability map (output of inference) and the ground truth tumor masks. This model achieves 0.91 accuracy on validation patches, and an FROC of 0.72 on the 48 Camelyon test images that have ground truth annotations available.

A Graph showing Train Acc, Train Loss, and Validation Acc

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute training

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Override the train config to execute multi-GPU training

torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training --meta_file configs/metadata.json --config_file "['configs/train.json','configs/multi_gpu_train.json']" --logging_file configs/logging.conf

Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove --standalone , modify --nnodes , or make other necessary changes according to the machine used. For more details, please refer to PyTorch's official tutorial.

Execute inference

CUDA_LAUNCH_BLOCKING=1 python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

Evaluate FROC metric

cd scripts && source evaluate_froc.sh

Export checkpoint to TorchScript file

TorchScript conversion is currently not supported.

References

[1] He, Kaiming, et al, "Deep Residual Learning for Image Recognition." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770-778. 2016. https://arxiv.org/pdf/1512.03385.pdf

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Prostate mri anatomy

Keno Bressem

A pre-trained model for volumetric (3D) segmentation of the prostate from MRI images

Model Details
Prostate mri anatomy Download

Model Metadata:

Overview: A pre-trained model for volumetric (3D) segmentation of the prostate from MRI images

Author(s): Keno Bressem

References:

  • Adams, L. C., Makowski, M. R., Engel, G., Rattunde, M., Busch, F., Asbach, P., ... & Bressem, K. K. (2022). Prostate158-An expert-annotated 3T MRI dataset and algorithm for prostate cancer detection. Computers in Biology and Medicine, 148, 105817.

Downloads: 139

File Size: 268.9MB

Model README:

Prostate MRI zonal segmentation

Authors

Lisa C. Adams, Keno K. Bressem

Tags

Segmentation, MR, Prostate

Model Description

This model was trained with the UNet architecture [1] and is used for 3D volumetric segmentation of the anatomical prostate zones on T2w MRI images. The segmentation of the anatomical regions is formulated as a voxel-wise classification. Each voxel is classified as either central gland (1), peripheral zone (2), or background (0). The model is optimized using a gradient descent method that minimizes the focal soft-dice loss between the predicted mask and the actual segmentation.

Data

The model was trained on the prostate158 training data, which is available at https://doi.org/10.5281/zenodo.6481141. Only T2w images were used for this task.

Preprocessing

MRI images in the prostate158 dataset were preprocessed, including center cropping and resampling. When applying the model to new data, this preprocessing should be repeated.

Center cropping

T2w images were acquired with a voxel spacing of 0.47 x 0.47 x 3 mm and an axial FOV size of 180 x 180 mm. However, the prostate rarely exceeds an axial diameter of 100 mm, and for zonal segmentation, the tissue surrounding the prostate is not of interest and only increases the image size and thus the computational cost. Center-cropping can reduce the image size without sacrificing information.

The script center_crop.py allows you to reproduce the center-cropping as performed in the prostate158 paper.

python scripts/center_crop.py --file_name path/to/t2_image --out_name cropped_t2
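The script above is the reference implementation; as a hedged alternative sketch with MONAI transforms, a roughly 100 mm axial crop at 0.47 mm spacing corresponds to about 213 voxels (all values illustrative):

from monai.transforms import Compose, LoadImaged, EnsureChannelFirstd, CenterSpatialCropd

crop = Compose([
    LoadImaged(keys="image"),
    EnsureChannelFirstd(keys="image"),
    CenterSpatialCropd(keys="image", roi_size=(213, 213, -1)),   # -1 keeps the full slice dimension
])
cropped = crop({"image": "path/to/t2_image.nii.gz"})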

Resampling

DWI and ADC sequences in prostate158 were resampled to the orientation and voxel spacing of the T2w sequence. As the zonal segmentation uses T2w images, no additional resampling is necessary. However, the training script will perform additional resampling automatically.

Performance

The model achieves the following performance on the prostate158 test dataset:

| Metric | Rater 1 Transitional Zone | Rater 1 Peripheral Zone | Rater 2 Transitional Zone | Rater 2 Peripheral Zone |
|---|---|---|---|---|
| Dice Coefficient | 0.877 | 0.754 | 0.875 | 0.730 |
| Hausdorff Distance | 18.3 | 22.8 | 17.5 | 33.2 |
| Surface Distance | 2.19 | 1.95 | 2.59 | 1.88 |

For more details, please see the original publication or official GitHub repository

System Configuration

The model was trained for 100 epochs on a workstation with a single Nvidia RTX 3080 GPU. This takes approximately 8 hours.

Limitations (Optional)

This training and inference pipeline was developed for research purposes only. This research uses only software that has not been cleared or approved by the FDA or any regulatory agency. The model is for research/developmental purposes only and cannot be used directly for clinical procedures.

Citation Info (Optional)

@article{ADAMS2022105817,
title = {Prostate158 - An expert-annotated 3T MRI dataset and algorithm for prostate cancer detection},
journal = {Computers in Biology and Medicine},
volume = {148},
pages = {105817},
year = {2022},
issn = {0010-4825},
doi = {https://doi.org/10.1016/j.compbiomed.2022.105817},
url = {https://www.sciencedirect.com/science/article/pii/S0010482522005789},
author = {Lisa C. Adams and Marcus R. Makowski and Günther Engel and Maximilian Rattunde and Felix Busch and Patrick Asbach and Stefan M. Niehues and Shankeeth Vinayahalingam and Bram {van Ginneken} and Geert Litjens and Keno K. Bressem},
keywords = {Prostate cancer, Deep learning, Machine learning, Artificial intelligence, Magnetic resonance imaging, Biparametric prostate MRI}
}

References

[1] Sakinis, Tomas, et al. "Interactive segmentation of medical images through fully convolutional neural networks." arXiv preprint arXiv:1903.08205 (2019).

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Renalstructures unest segmentation

Vanderbilt University + MONAI team

A transformer-based model for renal segmentation from CT image

Model Details
Renalstructures unest segmentation Download

Model Metadata:

Overview: A transformer-based model for renal segmentation from CT image

Author(s): Vanderbilt University + MONAI team

References:

  • Tang, Yucheng, et al. 'Self-supervised pre-training of swin transformers for 3d medical image analysis.' arXiv preprint arXiv:2111.14791 (2021). https://arxiv.org/abs/2111.14791.

Downloads: 230

File Size: 309.0MB

Model README:

Description

A pre-trained model for training and inference of volumetric (3D) kidney substructure segmentation from contrast-enhanced CT images (arterial/portal venous phase). A training pipeline is provided to support model fine-tuning with the bundle and MONAI Label active learning.

A tutorial and model release for kidney cortex, medulla and collecting system segmentation.

Authors: Yinchi Zhou (yinchi.zhou@vanderbilt.edu) | Xin Yu (xin.yu@vanderbilt.edu) | Yucheng Tang (yuchengt@nvidia.com) |

Model Overview

A pre-trained UNEST base model [1] for volumetric (3D) renal structures segmentation using dynamic contrast enhanced arterial or venous phase CT images.

Data

The training data is from the ImageVU RenalSeg dataset from Vanderbilt University and Vanderbilt University Medical Center. (The training data is not publicly available yet.)

  • Target: Renal Cortex | Medulla | Pelvis Collecting System
  • Task: Segmentation
  • Modality: CT (Arterial | Venous phase)
  • Size: 96 3D volumes

The data and segmentation demonstration are as follows:


Method and Network

The UNEST model is a 3D hierarchical transformer-based segmentation network.

Details of the architecture:

Training configuration

The training was performed with at least one 16GB-memory GPU.

Actual Model Input: 96 x 96 x 96

Input and output formats

Input: 1 channel CT image

Output: 4 channels - 0: Background, 1: Renal Cortex, 2: Medulla, 3: Pelvicalyceal System

Performance

A graph showing the validation mean Dice for 5000 epochs.


This model achieves the following Dice score on the validation data (our own split from the training dataset):

Mean Validation Dice = 0.8523

Note that mean dice is computed in the original spacing of the input data.

Commands Example

Download the trained checkpoint model to ./model/model.pt .

Add scripts component: To run the workflow with customized components, PYTHONPATH should be revised to include the path to the customized component:

export PYTHONPATH=$PYTHONPATH:"'<path to the bundle root dir>/scripts'"

Execute Training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

More examples output


Disclaimer

This is an example, not to be used for diagnostic purposes.

References

[1] Yu, Xin, Yinchi Zhou, Yucheng Tang et al. "Characterizing Renal Structures with 3D Block Aggregate Transformers." arXiv preprint arXiv:2203.02430 (2022). https://arxiv.org/pdf/2203.02430.pdf

[2] Zizhao Zhang et al. "Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding." AAAI Conference on Artificial Intelligence (AAAI) 2022

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Spleen ct segmentation

MONAI team

A pre-trained model for volumetric (3D) segmentation of the spleen from CT image

Model Details
Spleen ct segmentation Download

Model Metadata:

Overview: A pre-trained model for volumetric (3D) segmentation of the spleen from CT image

Author(s): MONAI team

References:

  • Xia, Yingda, et al. '3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training.' arXiv preprint arXiv:1811.12506 (2018). https://arxiv.org/abs/1811.12506.
  • Kerfoot E., Clough J., Oksuz I., Lee J., King A.P., Schnabel J.A. (2019) Left-Ventricle Quantification Using Residual U-Net. In: Pop M. et al. (eds) Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. STACOM 2018. Lecture Notes in Computer Science, vol 11395. Springer, Cham. https://doi.org/10.1007/978-3-030-12029-0_40

Downloads: 1368

File Size: 33.9MB

Model README:

Model Overview

A pre-trained model for volumetric (3D) segmentation of the spleen from CT images.

This model is trained with the runner-up pipeline [1] of the "Medical Segmentation Decathlon Challenge 2018", using the UNet architecture [2] with 32 training images and 9 validation images.

model workflow

Data

The training dataset is the Spleen Task from the Medical Segmentation Decathlon. Users can find more details on the datasets at http://medicaldecathlon.com/.

  • Target: Spleen
  • Modality: CT
  • Size: 61 3D volumes (41 Training + 20 Testing)
  • Source: Memorial Sloan Kettering Cancer Center
  • Challenge: Large-ranging foreground size

Training configuration

The segmentation of the spleen region is formulated as a voxel-wise binary classification: each voxel is predicted as either foreground (spleen) or background. The model is optimized with a gradient descent method that minimizes a combined Dice + cross-entropy loss between the predicted mask and the ground-truth segmentation.
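As a minimal sketch of this loss formulation (not the bundle's actual training code), the combined Dice + cross-entropy objective can be built with MONAI's DiceCELoss; the tensor shapes below are illustrative placeholders:

import torch
from monai.losses import DiceCELoss

# Dice + cross-entropy loss over a two-channel (background/spleen) prediction.
loss_fn = DiceCELoss(to_onehot_y=True, softmax=True)

logits = torch.rand(2, 2, 96, 96, 96)                      # (batch, channels, H, W, D) network output
labels = torch.randint(0, 2, (2, 1, 96, 96, 96)).float()   # voxel-wise binary ground truth
loss = loss_fn(logits, labels)
print(loss.item())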

The training was performed with the following:

  • GPU: at least 12GB of GPU memory
  • Actual Model Input: 96 x 96 x 96
  • AMP: True
  • Optimizer: Adam
  • Learning Rate: 1e-4
  • Loss: DiceCELoss

Input

One channel - CT image

Output

Two channels - Label 1: spleen - Label 0: everything else

Performance

Dice score is used for evaluating the performance of the model. This model achieves a mean dice score of 0.96.

Training Loss

A graph showing the training loss over 1260 epochs (10080 iterations).

Validation Dice

A graph showing the validation mean Dice over 1260 epochs.

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Override the train config to execute multi-GPU training:

torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training --meta_file configs/metadata.json --config_file "['configs/train.json','configs/multi_gpu_train.json']" --logging_file configs/logging.conf

Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove --standalone, modify --nnodes, or make other changes according to the machine used. For more details, please refer to PyTorch's official tutorial.

Override the train config to execute evaluation with the trained model:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/train.json','configs/evaluate.json']" --logging_file configs/logging.conf

Override the train config and evaluate config to execute multi-GPU evaluation:

torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/train.json','configs/evaluate.json','configs/multi_gpu_evaluate.json']" --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

References

[1] Xia, Yingda, et al. "3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training." arXiv preprint arXiv:1811.12506 (2018). https://arxiv.org/abs/1811.12506.

[2] Kerfoot E., Clough J., Oksuz I., Lee J., King A.P., Schnabel J.A. (2019) Left-Ventricle Quantification Using Residual U-Net. In: Pop M. et al. (eds) Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. STACOM 2018. Lecture Notes in Computer Science, vol 11395. Springer, Cham. https://doi.org/10.1007/978-3-030-12029-0_40

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Spleen deepedit annotation

MONAI team

This is a pre-trained model for 3D segmentation of the spleen organ from CT images using DeepEdit.

Model Details
Spleen deepedit annotation Download

Model Metadata:

Overview: This is a pre-trained model for 3D segmentation of the spleen organ from CT images using DeepEdit.

Author(s): MONAI team

References:

  • Sakinis, Tomas, et al. 'Interactive segmentation of medical images through fully convolutional neural networks.' arXiv preprint arXiv:1903.08205 (2019)

Downloads: 1013

File Size: 219.1MB

Model README:

Model Overview

A pre-trained model for 3D segmentation of the spleen organ from CT images using DeepEdit.

DeepEdit is an algorithm that combines the power of two models in one single architecture. It allows the user to perform inference as a standard segmentation method (i.e., UNet) and interactively segment part of an image using clicks [2]. DeepEdit aims to facilitate the user experience and, at the same time, develop new active learning techniques.

The model was trained on 32 images and validated on 9 images.

Data

The training dataset is the Spleen Task from the Medical Segmentation Decathlon. Users can find more details on the datasets at http://medicaldecathlon.com/.

  • Target: Spleen
  • Modality: CT
  • Size: 61 3D volumes (41 Training + 20 Testing)
  • Source: Memorial Sloan Kettering Cancer Center
  • Challenge: Large-ranging foreground size

Training configuration

The training was performed with the following:

  • GPU: at least 12GB of GPU memory
  • Actual Model Input: 128 x 128 x 128
  • AMP: True
  • Optimizer: Adam
  • Learning Rate: 1e-4
  • Loss: DiceCELoss

Input

Three channels - CT image - Spleen Segment - Background Segment

Output

Two channels - Label 1: spleen - Label 0: everything else
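As a sketch of the three-channel input layout only (not the bundle's actual pre-processing pipeline): the CT image is stacked with two guidance channels. When no clicks are provided, these can simply be zero tensors; in interactive mode they would encode the user's foreground (spleen) and background clicks.

import torch

# Illustrative shapes; the real transforms live in the bundle configs.
image = torch.rand(1, 1, 128, 128, 128)        # pre-processed CT volume
spleen_clicks = torch.zeros_like(image)        # foreground guidance channel (zeros = no clicks)
background_clicks = torch.zeros_like(image)    # background guidance channel

model_input = torch.cat([image, spleen_clicks, background_clicks], dim=1)
print(model_input.shape)                       # (1, 3, 128, 128, 128); the network then returns two channels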

Performance

Dice score is used for evaluating the performance of the model. This model achieves a dice score of greater than 0.90, depending on the number of simulated clicks.

Training Dice

A graph showing the train dice over 90 epochs.

Training Loss

A graph showing the training loss over 90 epochs.

Validation Dice

A graph showing the validation dice over 90 epochs.

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Override the train config to execute multi-GPU training:

torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training --meta_file configs/metadata.json --config_file "['configs/train.json','configs/multi_gpu_train.json']" --logging_file configs/logging.conf

Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove --standalone, modify --nnodes, or make other changes according to the machine used. For more details, please refer to PyTorch's official tutorial.

Override the train config to execute evaluation with the trained model:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/train.json','configs/evaluate.json']" --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

References

[1] Diaz-Pinto, Andres, et al. DeepEdit: Deep Editable Learning for Interactive Segmentation of 3D Medical Images. MICCAI Workshop on Data Augmentation, Labelling, and Imperfections. MICCAI 2022.

[2] Diaz-Pinto, Andres, et al. "MONAI Label: A framework for AI-assisted Interactive Labeling of 3D Medical Images." arXiv preprint arXiv:2203.12362 (2022).

[3] Sakinis, Tomas, et al. "Interactive segmentation of medical images through fully convolutional neural networks." arXiv preprint arXiv:1903.08205 (2019).

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Swin unetr btcv segmentation

MONAI team

A pre-trained model for volumetric (3D) multi-organ segmentation from CT image

Model Details
Swin unetr btcv segmentation Download

Model Metadata:

Overview: A pre-trained model for volumetric (3D) multi-organ segmentation from CT image

Author(s): MONAI team

References:

  • Hatamizadeh, Ali, et al. 'Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images.' arXiv preprint arXiv:2201.01266 (2022). https://arxiv.org/abs/2201.01266.
  • Tang, Yucheng, et al. 'Self-supervised pre-training of swin transformers for 3d medical image analysis.' arXiv preprint arXiv:2111.14791 (2021). https://arxiv.org/abs/2111.14791.

Downloads: 1212

File Size: 220.2MB

Model README:

Model Overview

A pre-trained Swin UNETR [1,2] for volumetric (3D) multi-organ segmentation using CT images from Beyond the Cranial Vault (BTCV) Segmentation Challenge dataset [3].

model workflow

Data

The training data is from the BTCV dataset (Register through Synapse and download the Abdomen/RawData.zip ).

  • Target: Multi-organs
  • Task: Segmentation
  • Modality: CT
  • Size: 30 3D volumes (24 Training + 6 Testing)

Preprocessing

The dataset directory structure needs to be reorganized using the following commands:

unzip RawData.zip
mv RawData/Training/img/ RawData/imagesTr
mv RawData/Training/label/ RawData/labelsTr
mv RawData/Testing/img/ RawData/imagesTs

Training configuration

The training was performed with the following:

  • GPU: At least 32GB of GPU memory
  • Actual Model Input: 96 x 96 x 96
  • AMP: True
  • Optimizer: Adam
  • Learning Rate: 2e-4

Input

1 channel - CT image

Output

14 channels: - 0: Background - 1: Spleen - 2: Right Kidney - 3: Left Kidney - 4: Gallbladder - 5: Esophagus - 6: Liver - 7: Stomach - 8: Aorta - 9: IVC - 10: Portal and Splenic Veins - 11: Pancreas - 12: Right adrenal gland - 13: Left adrenal gland
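For orientation only, a Swin UNETR with the input/output channels listed above can be instantiated directly from MONAI as sketched below; feature_size=48 is an assumption, so check configs/train.json for the bundle's exact settings, and note that recent MONAI releases no longer require the img_size argument.

import torch
from monai.networks.nets import SwinUNETR

# Sketch: 1 input channel (CT), 14 output channels (BTCV labels), 96^3 input patches.
model = SwinUNETR(
    img_size=(96, 96, 96),   # may be dropped on newer MONAI versions
    in_channels=1,
    out_channels=14,
    feature_size=48,         # assumption; see the bundle config for the real value
)
x = torch.rand(1, 1, 96, 96, 96)
with torch.no_grad():
    y = model(x)
print(y.shape)               # (1, 14, 96, 96, 96)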

Performance

Dice score was used for evaluating the performance of the model. This model achieves a mean dice score of 0.8269

Training Loss

The figure shows the training loss curve for 10K iterations.

Validation Dice

A graph showing the validation mean Dice for 5000 epochs.

MONAI Bundle Commands

In addition to the Pythonic APIs, a few command line interfaces (CLI) are provided to interact with the bundle. The CLI supports flexible use cases, such as overriding configs at runtime and predefining arguments in a file.

For more detailed usage instructions, visit the MONAI Bundle Configuration Page.

Execute training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Override the train config to execute multi-GPU training:

torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run training --meta_file configs/metadata.json --config_file "['configs/train.json','configs/multi_gpu_train.json']" --logging_file configs/logging.conf

Please note that the distributed training-related options depend on the actual running environment; thus, users may need to remove --standalone, modify --nnodes, or make other changes according to the machine used. For more details, please refer to PyTorch's official tutorial.

Override the train config to execute evaluation with the trained model:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/train.json','configs/evaluate.json']" --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

Export checkpoint to TorchScript file:

TorchScript conversion is currently not supported.

References

[1] Hatamizadeh, Ali, et al. "Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images." arXiv preprint arXiv:2201.01266 (2022). https://arxiv.org/abs/2201.01266.

[2] Tang, Yucheng, et al. "Self-supervised pre-training of swin transformers for 3d medical image analysis." arXiv preprint arXiv:2111.14791 (2021). https://arxiv.org/abs/2111.14791.

[3] Landman B, et al. "MICCAI multi-atlas labeling beyond the cranial vault–workshop and challenge." In Proc. of the MICCAI Multi-Atlas Labeling Beyond Cranial Vault—Workshop Challenge 2015 Oct (Vol. 5, p. 12).

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Valve landmarks

Eric Kerfoot

This network is used to find where the valves attach to the heart, to help construct 3D FEM models for computation. The output is an array of 10 2D coordinates.

Model Details
Valve landmarks Download

Model Metadata:

Overview: This network is used to find where the valves attach to the heart, to help construct 3D FEM models for computation. The output is an array of 10 2D coordinates.

Author(s): Eric Kerfoot

References:

  • Kerfoot, E, King, CE, Ismail, T, Nordsletten, D & Miller, R 2021, Estimation of Cardiac Valve Annuli Motion with Deep Learning. https://doi.org/10.1007/978-3-030-68107-4_15

Downloads: 115

File Size: 14.1MB

Model README:

2D Cardiac Valve Landmark Regressor

This network identifies 10 different landmarks in 2D+t MR images of the heart (2 chamber, 3 chamber, and 4 chamber) representing the insertion locations of valve leaflets into the myocardial wall. These coordinates are used in part of the construction of 3D FEM cardiac models suitable for physics simulation of heart functions.

Input images are individual 2D slices from the time series, and the output from the network is a (2, 10) set of 2D points in HW image coordinate space. The 10 coordinates correspond to the attachment point for these valves:

  1. Mitral anterior in 2CH
  2. Mitral posterior in 2CH
  3. Mitral septal in 3CH
  4. Mitral free wall in 3CH
  5. Mitral septal in 4CH
  6. Mitral free wall in 4CH
  7. Aortic septal
  8. Aortic free wall
  9. Tricuspid septal
  10. Tricuspid free wall

Landmarks which do not appear in a particular image are predicted to be (0, 0) or close to this location. The mitral valve is expected to appear in all three views. Landmarks are not provided for the pulmonary valve.

Example plot of landmarks on a single frame; see view_results.ipynb for visualising the network output:

Landmark Example Image

Training

The training script train.json is provided to train the network using a dataset of image pairs containing the MR image and a landmark image. This is done to reuse image-based transforms which do not currently operate on geometry. A number of other transforms are provided in valve_landmarks.py to implement Fourier-space dropout, image shifting which preserves landmarks, and smooth-field deformation applied to images and landmarks.

The dataset used for training unfortunately cannot be made public; however, the training script can be used with any NPZ file containing the training image stack in key trainImgs and the landmark image stack in trainLMImgs, plus testImgs and testLMImgs containing validation data (a sketch of assembling such a file follows the list below). The landmark images are defined as 0 for every non-landmark pixel, with landmark pixels containing the following values for each landmark type:

  • 10: Mitral anterior in 2CH
  • 15: Mitral posterior in 2CH
  • 20: Mitral septal in 3CH
  • 25: Mitral free wall in 3CH
  • 30: Mitral septal in 4CH
  • 35: Mitral free wall in 4CH
  • 100: Aortic septal
  • 150: Aortic free wall
  • 200: Tricuspid septal
  • 250: Tricuspid free wall
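As a sketch only (the array contents here are random placeholders; real data would contain MR frames and landmark images using the label values above), an NPZ file with the expected keys can be assembled like this:

import numpy as np

# Illustrative stacks of 2D frames; shapes and counts are placeholders.
train_imgs = np.random.rand(100, 256, 256).astype(np.float32)   # training MR images
train_lms = np.zeros((100, 256, 256), dtype=np.float32)         # landmark images, 0 = non-landmark pixel
train_lms[:, 120, 100] = 10                                      # e.g. mark "Mitral anterior in 2CH" at one pixel
test_imgs = np.random.rand(20, 256, 256).astype(np.float32)     # validation MR images
test_lms = np.zeros((20, 256, 256), dtype=np.float32)

np.savez(
    "./valvelandmarks.npz",        # the training command's default dataset filename
    trainImgs=train_imgs,
    trainLMImgs=train_lms,
    testImgs=test_imgs,
    testLMImgs=test_lms,
)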

The following command will train with the default NPZ filename ./valvelandmarks.npz , assuming the current directory is the bundle directory:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file "['configs/train.json', 'configs/common.json']" \
    --bundle_root . --dataset_file ./valvelandmarks.npz --output_dir /path/to/outputs

Inference

The included inference.json script will run inference on a directory containing Nifti files whose images have shape (256, 256, 1, N) for N timesteps. For each image, the output in the output_dir directory will be an .npy file containing a result array of shape (N, 2, 10) storing the 10 coordinates for each of the N timesteps. Invoking this script can be done as follows, assuming the current directory is the bundle directory:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file "['configs/inference.json', 'configs/common.json']" \
    --bundle_root . --dataset_dir /path/to/data --output_dir /path/to/outputs

The provided test Nifti file can be placed in a directory which is then used as the dataset_dir value. This image was derived from the AMRG Cardiac Atlas dataset (AMRG Cardiac Atlas, Auckland MRI Research Group, Auckland, New Zealand). The results from this inference can be visualised by changing path values in view_results.ipynb .
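The saved results can then be inspected with a few lines of NumPy. This sketch (the path and the distance threshold are placeholders) loads one result array and skips landmarks predicted at or near (0, 0), i.e. those not present in the view:

import numpy as np

result = np.load("/path/to/outputs/example_result.npy")   # placeholder filename, shape (N, 2, 10)
frame0 = result[0]                                         # (2, 10): 10 (H, W) coordinates for the first timestep

present = np.linalg.norm(frame0, axis=0) > 5.0             # treat points near (0, 0) as absent landmarks
for idx in np.nonzero(present)[0]:
    print(f"landmark {idx + 1}: H={frame0[0, idx]:.1f}, W={frame0[1, idx]:.1f}")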

Reference

The work for this model and its application is described in:

Kerfoot, E, King, CE, Ismail, T, Nordsletten, D & Miller, R 2021, Estimation of Cardiac Valve Annuli Motion with Deep Learning. in E Puyol Anton, M Pop, M Sermesant, V Campello, A Lalande, K Lekadir, A Suinesiaputra, O Camara & A Young (eds), Statistical Atlases and Computational Models of the Heart. MandMs and EMIDEC Challenges - 11th International Workshop, STACOM 2020, Held in Conjunction with MICCAI 2020, Revised Selected Papers. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 12592 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 146-155, 11th International Workshop on Statistical Atlases and Computational Models of the Heart, STACOM 2020 held in Conjunction with MICCAI 2020, Lima, Peru, 4/10/2020. https://doi.org/10.1007/978-3-030-68107-4_15

License

This model is released under the MIT License. The license file is included with the model.

Ventricular short axis 3label

Eric Kerfoot

This network segments full cycle short axis images of the ventricles, labelling LV pool separate from myocardium and RV pool

Model Details
Ventricular short axis 3label Download

Model Metadata:

Overview: This network segments full cycle short axis images of the ventricles, labelling LV pool separate from myocardium and RV pool

Author(s): Eric Kerfoot

Downloads: 136

File Size: 11.8MB

Model README:

3 Label Ventricular Segmentation

This network segments the cardiac ventricles in 2D short-axis MR images. The left ventricular pool is class 1, left ventricular myocardium class 2, and right ventricular pool class 3. Full-cycle segmentation with this network is possible, although much of the training data is composed of segmented end-diastole images. The input to the network is a single 2D image, so segmenting whole time-dependent volumes consists of multiple inference operations.

The network and training scheme are essentially identical to that described in:

Kerfoot E., Clough J., Oksuz I., Lee J., King A.P., Schnabel J.A. (2019) Left-Ventricle Quantification Using Residual U-Net. In: Pop M. et al. (eds) Statistical Atlases and Computational Models of the Heart. Atrial Segmentation and LV Quantification Challenges. STACOM 2018. Lecture Notes in Computer Science, vol 11395. Springer, Cham. https://doi.org/10.1007/978-3-030-12029-0_40

Data

The dataset used to train this network unfortunately cannot be made public as it contains unreleased image data from King's College London. Existing public datasets such as the Sunnybrook Cardiac Dataset and ACDC Challenge set can be used to train a similar network.

The train.json configuration assumes all data is stored in a single npz file with keys "images" and "segs" containing respectively the raw image data and their accompanying segmentations. The given network was trained with stored volumes of shape (9095, 256, 256); data of differing spatial dimensions must therefore be cropped to (256, 256) or zero-padded to that size. For the training data this was done as a preprocessing step, but the pixel values are otherwise unchanged from their original forms.
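As a sketch of this preprocessing step (array contents and counts below are placeholders), images of a different in-plane size can be zero-padded or centre-cropped to (256, 256) before being stored in the expected NPZ layout:

import numpy as np

def pad_or_crop_to_256(stack):
    """Zero-pad or centre-crop a (N, H, W) stack to (N, 256, 256)."""
    n, h, w = stack.shape
    out = np.zeros((n, 256, 256), dtype=stack.dtype)
    hs, ws = min(h, 256), min(w, 256)
    h0, w0 = (h - hs) // 2, (w - ws) // 2        # crop offsets in the source
    oh, ow = (256 - hs) // 2, (256 - ws) // 2    # paste offsets in the target
    out[:, oh:oh + hs, ow:ow + ws] = stack[:, h0:h0 + hs, w0:w0 + ws]
    return out

# Placeholder arrays standing in for real images and segmentations.
images = pad_or_crop_to_256(np.random.rand(10, 200, 220).astype(np.float32))
segs = pad_or_crop_to_256(np.zeros((10, 200, 220), dtype=np.uint8))
np.savez("allimages3label.npz", images=images, segs=segs)     # the train.json default dataset file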

Training

The network is trained with this data in conjunction with a series of augmentations for regularisation and robustness. Many of the original images are smaller than the expected size of (256, 256) and so were zero-padded; the network can thus be expected to be robust against large amounts of empty space in the inputs. Rotation and zooming are also applied to force the network to learn different sizes and orientations of the heart in the field of view.

Free-form deformation is applied to vary the shape of the heart and its surrounding tissues, mimicking to a degree the deformation observed through the cardiac cycle. This of course does not replicate the heart moving through-plane during the cycle or other observed changes, but it provides enough variation that full-cycle segmentation is generally acceptable.

Smooth fields are used to vary contrast and intensity in localised regions to simulate some of the variation in image quality caused by acquisition artefacts. Gaussian noise is also added to simulate poor-quality acquisition. Together these force the network to deal with a wider variation of image quality and partially account for differences between scanner vendors.

Training is invoked with the following command line:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf --bundle_root .

The dataset file is assumed to be allimages3label.npz but can be changed by setting the dataset_file value to your own file.

Inference

An example notebook visualise.ipynb demonstrates using the network directly with input images. Inference of 3D volumes only can be accomplished with the inference.json configuration:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf --dataset_dir dataset --output_dir ./output/ --bundle_root .

License

This model is released under the MIT License. The license file is included with the model.

Wholebrainseg large unest segmentation

Vanderbilt University + MONAI team

A 3D transformer-based model for whole brain segmentation from T1W MRI image

Model Details
Wholebrainseg large unest segmentation Download

Model Metadata:

Overview: A 3D transformer-based model for whole brain segmentation from T1W MRI image

Author(s): Vanderbilt University + MONAI team

References:

  • Yu, Xin, et al. 'Characterizing Renal Structures with 3D Block Aggregate Transformers.' arXiv preprint arXiv:2203.02430 (2022). https://arxiv.org/pdf/2203.02430.pdf

Downloads: 202

File Size: 310.6MB

Model README:

Description

Detailed whole brain segmentation is an essential quantitative technique in medical image analysis, providing a non-invasive way of measuring brain regions from clinically acquired structural magnetic resonance imaging (MRI). We provide a pre-trained model for training and inference of whole brain segmentation with 133 structures. A training pipeline is provided to support active learning in MONAI Label and training with the bundle.

A tutorial and model release for whole brain segmentation using the 3D transformer-based segmentation model UNEST.

Authors: Xin Yu (xin.yu@vanderbilt.edu) | Yinchi Zhou (yinchi.zhou@vanderbilt.edu) | Yucheng Tang (yuchengt@nvidia.com)

-------------------------------------------------------------------------------------


Fig.1 - The demonstration of T1w MRI images registered in MNI space and the whole brain segmentation labels with 133 classes

Model Overview

A pre-trained UNEST base model [1] for volumetric (3D) whole brain segmentation from T1w MR images. To leverage information across embedded sequences, "shifted window" transformers have been proposed for dense prediction and multi-scale feature modeling. However, these attempts to complicate the self-attention range often yield high computational complexity and data inefficiency. Inspired by the aggregation function in the nested ViT, we propose a new design of a 3D U-shaped medical segmentation model with Nested Transformers (UNesT), built hierarchically with a 3D block aggregation function, which learns locality behaviors for small structures or small datasets. This design retains the original global self-attention mechanism and achieves information communication across patches by stacking transformer encoders hierarchically.


Fig.2 - The network architecture of UNEST Base model

Data

The training data is from Vanderbilt University and Vanderbilt University Medical Center, together with the publicly released OASIS and CANDI datasets. Training and testing data are MRI T1-weighted (T1w) 3D volumes from 3 different sites. There are a total of 133 classes in the whole brain segmentation task. Among 50 T1w MRI scans from the Open Access Series of Imaging Studies (OASIS) dataset (Marcus et al., 2007), 45 scans are used for training and the other 5 for validation. The testing cohort contains the Colin27 T1w scan (Aubert-Broche et al., 2006) and 13 T1w MRI scans from the Child and Adolescent NeuroDevelopment Initiative (CANDI) (Kennedy et al., 2012). All data are registered to MNI space using the MNI305 template (Evans et al., 1993) and preprocessed following the method in Huo et al. (2019). Input images are randomly cropped to the size of 96 × 96 × 96.

Important

The brain MRI images used for training were affinely registered from the target image to the MNI305 template using NiftyReg. The data should be in the MNI305 space before inference.

If your images are already in MNI space, skip the registration step.

You can use any registration tool to register images to MNI space, e.g. ANTs or other tools for registering a T1 MRI image to the MNI305 space. A sample ANTs registration is shown below.

pip install antspyx

# Sample ANTs affine registration to MNI305 space
import ants

fixed_image = ants.image_read('<fixed_image_path>')    # the MNI305 template image
moving_image = ants.image_read('<moving_image_path>')  # the subject T1w image

# Compute the affine transform from the subject image to the template
transform = ants.registration(fixed_image, moving_image, 'Affine')

# Resample the subject image into the template space and save it
registered = ants.apply_transforms(fixed_image, moving_image, transform['fwdtransforms'][0])
ants.image_write(registered, '<output_image_path>')

Training configuration

The training and inference was performed with at least one 24GB-memory GPU.

Actual Model Input: 96 x 96 x 96

Input and output formats

Input: 1 channel T1w MRI image in MNI305 Space.

Commands example

Download trained checkpoint model to ./model/model.pt:

Add scripts component: To run the workflow with customized components, PYTHONPATH should be revised to include the path to the customized component:

export PYTHONPATH="$PYTHONPATH:<path to the bundle root dir>/scripts"

Execute Training:

python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf

Execute inference:

python -m monai.bundle run evaluating --meta_file configs/metadata.json --config_file configs/inference.json --logging_file configs/logging.conf

More example outputs


Fig.3 - The output prediction comparison with variant and ground truth

Training/Validation Benchmarking

A graph showing the training accuracy over 600 epochs of fine-tuning.


With 10 fine-tuned labels, the training process converges quickly.

Complete ROI of the whole brain segmentation

133 brain structures are segmented.

0: background 1 : 3rd-Ventricle 2 : 4th-Ventricle 3 : Right-Accumbens-Area
4 : Left-Accumbens-Area 5 : Right-Amygdala 6 : Left-Amygdala 7 : Brain-Stem
8 : Right-Caudate 9 : Left-Caudate 10 : Right-Cerebellum-Exterior 11 : Left-Cerebellum-Exterior
12 : Right-Cerebellum-White-Matter 13 : Left-Cerebellum-White-Matter 14 : Right-Cerebral-White-Matter 15 : Left-Cerebral-White-Matter
16 : Right-Hippocampus 17 : Left-Hippocampus 18 : Right-Inf-Lat-Vent 19 : Left-Inf-Lat-Vent
20 : Right-Lateral-Ventricle 21 : Left-Lateral-Ventricle 22 : Right-Pallidum 23 : Left-Pallidum
24 : Right-Putamen 25 : Left-Putamen 26 : Right-Thalamus-Proper 27 : Left-Thalamus-Proper
28 : Right-Ventral-DC 29 : Left-Ventral-DC 30 : Cerebellar-Vermal-Lobules-I-V 31 : Cerebellar-Vermal-Lobules-VI-VII
32 : Cerebellar-Vermal-Lobules-VIII-X 33 : Left-Basal-Forebrain 34 : Right-Basal-Forebrain 35 : Right-ACgG--anterior-cingulate-gyrus
36 : Left-ACgG--anterior-cingulate-gyrus 37 : Right-AIns--anterior-insula 38 : Left-AIns--anterior-insula 39 : Right-AOrG--anterior-orbital-gyrus
40 : Left-AOrG--anterior-orbital-gyrus 41 : Right-AnG---angular-gyrus 42 : Left-AnG---angular-gyrus 43 : Right-Calc--calcarine-cortex
44 : Left-Calc--calcarine-cortex 45 : Right-CO----central-operculum 46 : Left-CO----central-operculum 47 : Right-Cun---cuneus
48 : Left-Cun---cuneus 49 : Right-Ent---entorhinal-area 50 : Left-Ent---entorhinal-area 51 : Right-FO----frontal-operculum
52 : Left-FO----frontal-operculum 53 : Right-FRP---frontal-pole 54 : Left-FRP---frontal-pole 55 : Right-FuG---fusiform-gyrus
56 : Left-FuG---fusiform-gyrus 57 : Right-GRe---gyrus-rectus 58 : Left-GRe---gyrus-rectus 59 : Right-IOG---inferior-occipital-gyrus
60 : Left-IOG---inferior-occipital-gyrus 61 : Right-ITG---inferior-temporal-gyrus 62 : Left-ITG---inferior-temporal-gyrus 63 : Right-LiG---lingual-gyrus
64 : Left-LiG---lingual-gyrus 65 : Right-LOrG--lateral-orbital-gyrus 66 : Left-LOrG--lateral-orbital-gyrus 67 : Right-MCgG--middle-cingulate-gyrus
68 : Left-MCgG--middle-cingulate-gyrus 69 : Right-MFC---medial-frontal-cortex 70 : Left-MFC---medial-frontal-cortex 71 : Right-MFG---middle-frontal-gyrus
72 : Left-MFG---middle-frontal-gyrus 73 : Right-MOG---middle-occipital-gyrus 74 : Left-MOG---middle-occipital-gyrus 75 : Right-MOrG--medial-orbital-gyrus
76 : Left-MOrG--medial-orbital-gyrus 77 : Right-MPoG--postcentral-gyrus 78 : Left-MPoG--postcentral-gyrus 79 : Right-MPrG--precentral-gyrus
80 : Left-MPrG--precentral-gyrus 81 : Right-MSFG--superior-frontal-gyrus 82 : Left-MSFG--superior-frontal-gyrus 83 : Right-MTG---middle-temporal-gyrus
84 : Left-MTG---middle-temporal-gyrus 85 : Right-OCP---occipital-pole 86 : Left-OCP---occipital-pole 87 : Right-OFuG--occipital-fusiform-gyrus
88 : Left-OFuG--occipital-fusiform-gyrus 89 : Right-OpIFG-opercular-part-of-the-IFG 90 : Left-OpIFG-opercular-part-of-the-IFG 91 : Right-OrIFG-orbital-part-of-the-IFG
92 : Left-OrIFG-orbital-part-of-the-IFG 93 : Right-PCgG--posterior-cingulate-gyrus 94 : Left-PCgG--posterior-cingulate-gyrus 95 : Right-PCu---precuneus
96 : Left-PCu---precuneus 97 : Right-PHG---parahippocampal-gyrus 98 : Left-PHG---parahippocampal-gyrus 99 : Right-PIns--posterior-insula
100 : Left-PIns--posterior-insula 101 : Right-PO----parietal-operculum 102 : Left-PO----parietal-operculum 103 : Right-PoG---postcentral-gyrus
104 : Left-PoG---postcentral-gyrus 105 : Right-POrG--posterior-orbital-gyrus 106 : Left-POrG--posterior-orbital-gyrus 107 : Right-PP----planum-polare
108 : Left-PP----planum-polare 109 : Right-PrG---precentral-gyrus 110 : Left-PrG---precentral-gyrus 111 : Right-PT----planum-temporale
112 : Left-PT----planum-temporale 113 : Right-SCA---subcallosal-area 114 : Left-SCA---subcallosal-area 115 : Right-SFG---superior-frontal-gyrus
116 : Left-SFG---superior-frontal-gyrus 117 : Right-SMC---supplementary-motor-cortex 118 : Left-SMC---supplementary-motor-cortex 119 : Right-SMG---supramarginal-gyrus
120 : Left-SMG---supramarginal-gyrus 121 : Right-SOG---superior-occipital-gyrus 122 : Left-SOG---superior-occipital-gyrus 123 : Right-SPL---superior-parietal-lobule
124 : Left-SPL---superior-parietal-lobule 125 : Right-STG---superior-temporal-gyrus 126 : Left-STG---superior-temporal-gyrus 127 : Right-TMP---temporal-pole
128 : Left-TMP---temporal-pole 129 : Right-TrIFG-triangular-part-of-the-IFG 130 : Left-TrIFG-triangular-part-of-the-IFG 131 : Right-TTG---transverse-temporal-gyrus
132 : Left-TTG---transverse-temporal-gyrus

Bundle Integration in MONAI Label

The inference and training pipeline can be easily used with the MONAI Label server and 3D Slicer for fast labeling of T1w MRI images in MNI space.


Disclaimer

This is an example, not to be used for diagnostic purposes.

References

[1] Yu, Xin, Yinchi Zhou, Yucheng Tang et al. Characterizing Renal Structures with 3D Block Aggregate Transformers. arXiv preprint arXiv:2203.02430 (2022). https://arxiv.org/pdf/2203.02430.pdf

[2] Zizhao Zhang et al. Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding. AAAI Conference on Artificial Intelligence (AAAI) 2022

[3] Huo, Yuankai, et al. 3D whole brain segmentation using spatially localized atlas network tiles. NeuroImage 194 (2019): 105-119.

License

Copyright (c) MONAI Consortium

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.