A Comparative Study on Generalized Automated Medical Image Segmentation for Dataset Building

Table of Contents

  1. Introduction
  2. Approach
  3. Results and Analysis
  4. Conclusion
  5. Publications
  6. Links

Introduction

Medical image annotation plays a crucial role in building datasets for clinical applications, such as diagnosis, treatment planning, and research. However, the manual annotation process is labor-intensive, time-consuming, costly, and prone to human error, requiring trained experts to handle complex anatomical structures with low contrast, overlapping or blurry boundaries, and irregular regions of interest (ROI). These challenges necessitate the development of automated solutions to enhance the efficiency and scalability of medical image annotation.

Traditional deep learning models, while effective, often require retraining or fine-tuning for novel annotation tasks, which is not feasible for clinical annotators due to the time and expertise required. To address these limitations, few-shot learning approaches have gained popularity for their ability to generalize to unseen annotation tasks involving new anatomies and image modalities using only a limited number of image-label pairs, without the need for additional retraining.

In this work, we comprehensively study the generalizability of state-of-the-art few-shot learning segmentation models for medical images, evaluating their performance across diverse medical datasets and identifying their limitations. Our research aims to identify effective models for generalized automated medical image segmentation, focusing on region annotations (segmentation). By using a small set of support images as references for unseen tasks, these models can automate the annotation process, significantly improving the efficiency and accuracy of medical image annotation, thus supporting various clinical applications more effectively.

Approach

Automated Annotation

Research Objectives

The study targets three objectives. First, models should perform consistently across diverse data and complex domain shifts. Second, new, unseen tasks should require no retraining or fine-tuning, which saves time and resources. Third, clinical researchers should not need any expertise in model training to use them.

Results and Analysis

image

In the one-shot setting, SegGPT and PerSAM perform best, while UniverSeg and Painter lag behind. In the few-shot setting, UniverSeg performs best, followed by SegGPT. This suggests that model performance can improve with more support samples.

image

image

Increasing the support set size improves the average Dice score of the predictions across different medical image modalities.
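For reference, the Dice score between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|), ranging from 0 (no overlap) to 1 (perfect overlap). The following is a minimal sketch of that metric for binary masks, not the evaluation code used in this study. The function name `dice_score` and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return float(2.0 * intersection / denom)


# Toy example: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
gt = np.array([[1, 0, 0],
               [0, 1, 1]])
example_dice = dice_score(pred, gt)  # 2 * 2 / (3 + 3)
```

Averaging this value over all query images of a dataset gives a per-dataset score comparable in spirit to the averages discussed here.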

image

These qualitative results for the few-shot baselines UniverSeg and SegGPT across diverse medical image modalities show that predictions improve as the support set grows, with the predicted masks matching the ground truth masks more closely.
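The effect of support set size can be measured with a simple sweep: for each size k, sample k image-label pairs as support, predict every query image, and average the Dice score. The sketch below illustrates only this protocol. Everything in it is hypothetical: `toy_predict` is a stand-in intensity threshold, not UniverSeg or SegGPT, the data are synthetic, and `evaluate_support_sizes` and `make_pair` are illustrative names rather than part of this study's code.

```python
import random

import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as a match
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


def toy_predict(support, query_image):
    """Hypothetical stand-in for a few-shot model: threshold the query at
    the midpoint of the support set's foreground/background intensities."""
    fg = np.mean([img[mask].mean() for img, mask in support])
    bg = np.mean([img[~mask].mean() for img, mask in support])
    return query_image > (fg + bg) / 2.0


def evaluate_support_sizes(predict_fn, pool, queries, sizes, seed=0):
    """Return {support size: mean Dice over the query set}."""
    rng = random.Random(seed)
    results = {}
    for k in sizes:
        support = rng.sample(pool, k)  # k image-label pairs, no retraining
        scores = [dice_score(predict_fn(support, img), mask)
                  for img, mask in queries]
        results[k] = float(np.mean(scores))
    return results


def make_pair(rng):
    """Synthetic image: a bright square on a noisy dark background."""
    mask = np.zeros((16, 16), dtype=bool)
    mask[4:12, 4:12] = True
    image = rng.normal(0.2, 0.1, size=(16, 16))
    image[mask] += 0.6
    return image, mask


data_rng = np.random.default_rng(0)
pool = [make_pair(data_rng) for _ in range(8)]
queries = [make_pair(data_rng) for _ in range(4)]
scores = evaluate_support_sizes(toy_predict, pool, queries, sizes=[1, 2, 4])
```

With a real model, `predict_fn` would wrap the model's inference call on the support pairs and the query image. The sweep itself stays unchanged, since no weights are updated between tasks.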

Conclusion

This work investigates approaches that let experts adapt to new, unseen segmentation tasks without retraining or fine-tuning models on large datasets. The choice of a segmentation model for a specific medical image application depends on the characteristics of the dataset and the required performance level. Our analysis highlights that foundation models like SAM, MedSAM, and Grounding DINO, which are trained on, or built upon models pre-trained on, natural image domains, often fall short in medical image segmentation due to domain-specific complexities that these models do not inherently address.

From the comparative results, it is evident that few-shot learning approaches offer a viable solution for medical image segmentation tasks, as they can predict labels without extensive retraining and fine-tuning, even when data is limited.

Overall, this analysis underscores the efficacy of few-shot learning models in medical image segmentation, particularly for applications requiring rapid adaptation to new tasks without the overhead of retraining. These findings support the broader conclusion that few-shot learning approaches are well-suited for generalizing across unseen image modalities in a single pass during inference, offering a practical and efficient solution for medical image annotation challenges. This adaptability makes few-shot learning models a valuable tool in clinical settings where new and varied segmentation tasks frequently arise.

Publications

  1. Final Presentation Slides

Team

Supervisors