Abstract
Standard knee osteoarthritis (OA) grading often fails to capture the mismatch between structural damage and patient-reported pain. Our research introduces a pipeline that uses BiomedCLIP to extract 512-dimensional embeddings from 4,502 X-rays. By fusing these embeddings with clinical data (including pain, BMI, and age) and clustering the result with HDBSCAN, we identified distinct phenotypes. Our findings show that phenotypes differ in how quickly they progress: over a 10-year period, the Lateral JSN (joint space narrowing) phenotype progresses twice as fast as the Medial phenotype.
Project Walkthrough
Research Methodology
YOLO Preprocessing
Automated cropping of bilateral X-rays into individual knee regions to isolate the joint space.
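The cropping step can be sketched as follows, assuming the detector has already returned one bounding box per knee. The box format `(x1, y1, x2, y2)`, the helper names, and the `image_left`/`image_right` labels are illustrative assumptions, not the project's actual code. The labels refer to position in the image only, since the anatomical side depends on the radiograph's orientation.

```python
from typing import Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x1, y1, x2, y2), x2/y2 exclusive


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Clip a box so it lies inside the image bounds."""
    x1, y1, x2, y2 = box
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    return x1, y1, x2, y2


def crop_knees(
    image: Sequence[Sequence[int]], boxes: Sequence[Box]
) -> Dict[str, List[List[int]]]:
    """Split a bilateral radiograph into two knee crops.

    `image` is a 2-D grid of pixel rows; `boxes` holds exactly two
    detections (hypothetical detector output).
    """
    if len(boxes) != 2:
        raise ValueError("expected exactly two knee detections")
    height, width = len(image), len(image[0])
    # Order detections by horizontal centre so the labels are deterministic.
    ordered = sorted(boxes, key=lambda b: (b[0] + b[2]) / 2)
    crops = {}
    for name, box in zip(("image_left", "image_right"), ordered):
        x1, y1, x2, y2 = clamp_box(box, width, height)
        crops[name] = [list(row[x1:x2]) for row in image[y1:y2]]
    return crops
```

Sorting by horizontal centre keeps the output stable regardless of the order in which the detector reports the two knees.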
BiomedCLIP Encoder
Fine-tuned on Kellgren-Lawrence (KL) grade regression to steer visual attention toward osteophytes and JSN.
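To make the regression objective concrete, here is a deliberately simplified sketch: a linear head trained on fixed embeddings with a mean-squared-error loss. This is not the project's training code. Actual fine-tuning would also update the encoder weights, and the loss function, learning rate, and function names below are assumptions chosen for illustration.

```python
import numpy as np


def train_regression_head(embeddings, kl_grades, lr=0.1, epochs=500):
    """Fit kl ≈ X @ w + b by gradient descent on mean squared error.

    The learning rate suits roughly unit-scale features (an assumption);
    other feature scales may need a smaller value.
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(kl_grades, dtype=float)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        residual = X @ w + b - y
        # Gradients of mean((X @ w + b - y) ** 2) with respect to w and b.
        w -= lr * (2.0 / n) * (X.T @ residual)
        b -= lr * 2.0 * residual.mean()
    return w, b


def predict_kl(embeddings, w, b):
    """Predict a continuous KL grade, clipped to the valid 0-4 range."""
    raw = np.asarray(embeddings, dtype=float) @ w + b
    return np.clip(raw, 0.0, 4.0)
```

Treating the grade as a continuous target keeps the ordinal structure of the KL scale: predicting grade 3 for a true grade 4 is penalised less than predicting grade 0.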
Late Fusion
Fusing 512-D visual vectors with 5 clinical features for HDBSCAN clustering.
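One plausible way to build the fused vectors is sketched below. The scaling choices (L2-normalising the visual embedding, z-scoring each clinical feature) and the `clinical_weight` parameter are assumptions rather than the pipeline's documented behaviour. They are included because 512 visual dimensions can otherwise swamp 5 clinical ones in a distance-based clusterer.

```python
import numpy as np


def late_fuse(visual, clinical, clinical_weight=1.0):
    """Concatenate per-knee visual embeddings with clinical features.

    visual:   (n, 512) array of image embeddings
    clinical: (n, 5) array of clinical features
    Returns an (n, 517) array ready for clustering.
    """
    visual = np.asarray(visual, dtype=float)
    clinical = np.asarray(clinical, dtype=float)
    if visual.shape[0] != clinical.shape[0]:
        raise ValueError("visual and clinical rows must align one-to-one")
    # Unit-length embeddings so no single image dominates by magnitude.
    norms = np.linalg.norm(visual, axis=1, keepdims=True)
    visual_unit = visual / np.clip(norms, 1e-12, None)
    # Z-score each clinical column so units (kg/m^2, years, ...) are comparable.
    std = clinical.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns at zero after centring
    clinical_z = (clinical - clinical.mean(axis=0)) / std
    return np.hstack([visual_unit, clinical_weight * clinical_z])
```

The fused matrix can then be passed to a density-based clusterer, for example `sklearn.cluster.HDBSCAN(min_cluster_size=...).fit_predict(fused)` in scikit-learn 1.3 or later; the appropriate `min_cluster_size` is not stated in this write-up.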
Figure: Full architecture of the multimodal phenotyping pipeline.
Key Results
Model Accuracy: 86.4%
The fine-tuned BiomedCLIP achieved a mean absolute error (MAE) of 0.865 KL grade units, outperforming zero-shot baselines.
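For readers who want to reproduce these kinds of metrics, the sketch below shows how MAE in KL grade units is computed from continuous predictions. It also shows one possible accuracy definition, exact match after rounding to the nearest grade. This page does not state how the accuracy figure was defined, so that second function is an assumption.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between true and predicted KL grades."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def to_kl_grade(prediction):
    """Round half up to the nearest grade and clip to the 0-4 KL scale."""
    return max(0, min(4, int(math.floor(prediction + 0.5))))


def exact_grade_accuracy(y_true, y_pred):
    """Fraction of cases whose rounded prediction equals the true grade.

    Assumed definition, used here for illustration only.
    """
    hits = sum(1 for t, p in zip(y_true, y_pred) if to_kl_grade(p) == t)
    return hits / len(y_true)
```

An MAE below 1.0 means predictions land, on average, within one KL grade of the reference label.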
Progression Analysis
- The Lateral JSN phenotype progresses twice as fast as the Medial phenotype.
- Unique "Pain-Dominant" clusters were identified within KL Grade 2.
- Explainable-AI (XAI) validation confirmed that the model focuses on the joint space and condyles.
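A progression comparison like the first point above can be sketched as a per-phenotype mean annual change in KL grade. The record layout and the choice of KL change per year as the progression measure are assumptions, since this page does not specify how progression speed was quantified.

```python
from collections import defaultdict


def mean_annual_progression(records):
    """Average KL grade change per year for each phenotype.

    records: iterable of (phenotype, baseline_kl, followup_kl, years),
    a hypothetical layout used for illustration.
    """
    totals = defaultdict(lambda: [0.0, 0])  # phenotype -> [rate sum, count]
    for phenotype, baseline, followup, years in records:
        if years <= 0:
            raise ValueError("follow-up interval must be positive")
        totals[phenotype][0] += (followup - baseline) / years
        totals[phenotype][1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


# Toy values chosen to illustrate a 2x ratio; these are NOT study data.
toy_records = [
    ("lateral_jsn", 1, 3, 10),
    ("lateral_jsn", 2, 4, 10),
    ("medial_jsn", 1, 2, 10),
    ("medial_jsn", 2, 3, 10),
]
rates = mean_annual_progression(toy_records)
ratio = rates["lateral_jsn"] / rates["medial_jsn"]
```

Dividing by the follow-up interval keeps knees with different observation windows comparable.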