Out-of-Domain Generalization in Medical Imaging via Vision-Language Models

Team

Supervisors

Table of Contents

  1. Abstract
  2. Related Works
  3. Methodology
  4. Experiment Setup and Implementation
  5. Results and Analysis
  6. Conclusion
  7. Publications
  8. Links

Abstract

This research addresses domain generalization challenges in medical imaging using BiomedCLIP, a vision-language model, as the baseline, optimizing it via advanced prompting strategies. We propose an automatic prompting method that improves interpretability and generalization through iterative feedback to large language models (LLMs), adapting prompts specifically for disease classification from histopathology images.


Related Works

Vision-language models (VLMs) such as CLIP and BiomedCLIP have shown great promise in biomedical tasks, with models like BiomedCoOp and XCoOp introducing domain-specific prompt learning. However, many of these approaches lack interpretability and rely on a single static LLM output. Our method builds on them by integrating iterative feedback for prompt refinement, improving both robustness and transparency in clinical tasks.


Methodology

We use BiomedCLIP as our base model and apply a series of prompt optimization techniques to enhance out-of-domain generalization. The methodology includes:

Preprocessing
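One natural choice is the image transform that ships with BiomedCLIP itself. The sketch below shows how the model, tokenizer, and paired preprocessing pipeline can be loaded through open_clip from the public checkpoint on the Hugging Face Hub; the input file name is a hypothetical placeholder, and the exact preprocessing used in this project may differ.

```python
import torch
from PIL import Image
from open_clip import create_model_from_pretrained, get_tokenizer

# Public BiomedCLIP checkpoint on the Hugging Face Hub.
CKPT = "hf-hub:microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224"

# create_model_from_pretrained returns the model together with the image
# preprocessing transform it was trained with (resize, crop, normalize).
model, preprocess = create_model_from_pretrained(CKPT)
tokenizer = get_tokenizer(CKPT)
model.eval()

# Example: turn a single histopathology patch into a model-ready tensor.
# "slide_patch.png" is a hypothetical file name.
image = preprocess(Image.open("slide_patch.png")).unsqueeze(0)  # (1, 3, 224, 224)
```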

Prompt Optimization

An LLM-driven prompt generation framework starts from an initial set of prompts produced by Gemini. Using classification performance scores as feedback, the prompts are then iteratively refined to improve classification. A minimal sketch of this loop is shown below.
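The following sketch illustrates one way such a feedback loop can be written. It assumes a hypothetical `query_llm` callable that wraps the Gemini API and returns one candidate prompt per class; the scoring function and feedback wording are illustrative, not the exact protocol used here.

```python
import torch

def score_prompts(model, tokenizer, prompts, images, labels):
    """Zero-shot accuracy of a candidate prompt set (one prompt per class)."""
    texts = tokenizer(prompts)
    with torch.no_grad():
        image_feats, text_feats, logit_scale = model(images, texts)
        preds = (logit_scale * image_feats @ text_feats.t()).argmax(dim=-1)
    return (preds == labels).float().mean().item()

def refine_prompts(query_llm, model, tokenizer, images, labels, n_rounds=5):
    """Iteratively refine class prompts with LLM feedback.

    query_llm is a hypothetical callable wrapping the Gemini API: it takes
    an instruction string and returns a list with one prompt per class.
    """
    best = query_llm("Propose one diagnostic text prompt per class for "
                     "histopathology image classification.")
    best_score = score_prompts(model, tokenizer, best, images, labels)
    for _ in range(n_rounds):
        feedback = (f"The prompts {best} achieved accuracy {best_score:.3f}. "
                    "Rewrite them to be more discriminative for this task.")
        candidates = query_llm(feedback)
        score = score_prompts(model, tokenizer, candidates, images, labels)
        if score > best_score:  # keep only prompt sets that improve the score
            best, best_score = candidates, score
    return best, best_score
```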

CLIP Fine-Tuning Techniques

Three strategies are explored and compared. One illustrative option is sketched below.
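As one common example (shown for illustration; not necessarily one of the three strategies compared in this work), a linear probe freezes the BiomedCLIP backbone and trains only a small classification head on its image embeddings:

```python
import torch
import torch.nn as nn

def build_linear_probe(model, embed_dim=512, num_classes=2):
    """Freeze the BiomedCLIP backbone; train only a linear head.

    embed_dim=512 matches BiomedCLIP's projection dimension; num_classes
    is task-dependent (binary here as a placeholder).
    """
    for p in model.parameters():
        p.requires_grad = False  # keep the VLM backbone frozen
    return nn.Linear(embed_dim, num_classes)

def train_step(model, head, optimizer, images, labels):
    with torch.no_grad():
        feats = model.encode_image(images)               # frozen features
        feats = feats / feats.norm(dim=-1, keepdim=True)  # unit-normalize
    loss = nn.functional.cross_entropy(head(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```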

Validation Strategy

Model performance is validated both in-domain and out-of-domain using accuracy, F1-score, AUC, and dedicated out-of-distribution (OOD) metrics.
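A minimal sketch of computing these metrics with scikit-learn, assuming multiclass probability outputs, follows; the function name and output format are illustrative.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

def evaluate(y_true, y_prob):
    """y_prob: predicted class probabilities, shape (n_samples, n_classes).

    For binary tasks, pass y_prob[:, 1] to roc_auc_score instead.
    """
    y_pred = np.argmax(y_prob, axis=1)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "macro_f1": f1_score(y_true, y_pred, average="macro"),
        "auc": roc_auc_score(y_true, y_prob, multi_class="ovr"),
    }
```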


Results and Analysis

Conclusion

The integration of iterative, interpretable prompt generation using LLMs significantly improves the domain generalization capabilities of vision-language models in medical imaging. The approach offers a path forward for deploying robust and explainable AI tools in clinical settings.


Publications