2025 Poster Presentations
P464: CROSSMODA COMPUTATIONAL COMPETITION: EVOLUTION OF CROSS-MODALITY DOMAIN ADAPTATION TECHNIQUES FOR VESTIBULAR SCHWANNOMA AND COCHLEA SEGMENTATION FROM 2021 TO 2023
Navodini Wijethilake1; Reuben Dorent2; Marina Ivory1; Aaron Kujawa1; Jonathan Shapey3; Tom Vercauteren1; 1King's College London, London, United Kingdom; 2Harvard Medical School, USA; 3King's College Hospital, London, United Kingdom
Machine learning has significantly advanced medical image analysis, but varied data acquisition methods in real-world clinical settings often lead to a "domain shift" between training and deployment data, which reduces model performance. The international Cross-Modality Domain Adaptation (CrossMoDA) computational competition was launched to tackle this issue in vestibular schwannoma (VS) imaging. Accurate tumor and cochlea identification is essential for radiosurgery planning and monitoring, with clinicians typically using contrast-enhanced T1 (ceT1) imaging. However, safety concerns and the high costs associated with contrast agents have prompted interest in T2 imaging, which is safer and more cost-effective.
The CrossMoDA challenge, initiated in 2021, focuses on unsupervised cross-modality segmentation from ceT1 to T2 as an extreme example of domain shift and aims to automate VS and cochlea segmentation on T2 scans. Over time, the challenge's objectives have evolved to enhance clinical relevance. The inaugural 2021 edition used single-institutional data for a two-class segmentation task (tumor and cochlea). The 2022 edition expanded to multi-institutional data, and in 2023 the challenge was further extended to include multi-institutional, heterogeneous data from routine surveillance and added a sub-segmentation task to differentiate between intra- and extra-meatal tumor components.
The 2021 challenge utilized the L-SC-GK (single-center Gamma Knife) dataset. The T-SC-GK dataset was introduced for the 2022 edition, and the U-MC-RC (multi-center routine-clinical) dataset was added for the 2023 edition. Figure 1 illustrates the diversity of these datasets.
Typical unsupervised domain adaptation unfolds in three key stages (Figure 2): 1) unpaired image translation techniques are applied to convert ceT1 images into synthetic T2 images; 2) a segmentation model is trained using the synthetic T2 images together with the corresponding ceT1 labels; and 3) self-training is used to minimize the domain gap between synthetic and real T2 images. A simplified sketch of this pipeline is given below.
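As a rough illustration of the three stages, the following PyTorch sketch uses tiny stand-in networks and toy tensors. The TinyTranslator, TinySegmenter, and the random data are hypothetical placeholders, not the models used by challenge participants (who typically rely on CycleGAN-style translation and nnU-Net segmentation), and the adversarial training of the translator is omitted for brevity.

```python
# Minimal, illustrative sketch of the three-stage unsupervised domain adaptation pipeline.
import torch
import torch.nn as nn

class TinyTranslator(nn.Module):           # stands in for an unpaired ceT1 -> T2 generator
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 3, padding=1))
    def forward(self, x):
        return self.net(x)

class TinySegmenter(nn.Module):            # stands in for an nnU-Net style segmenter
    def __init__(self, n_classes=3):       # background, VS, cochlea
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, n_classes, 1))
    def forward(self, x):
        return self.net(x)

translator, segmenter = TinyTranslator(), TinySegmenter()
opt = torch.optim.Adam(segmenter.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

# Toy data standing in for annotated ceT1 scans and unlabeled real T2 scans.
cet1, cet1_labels = torch.randn(4, 1, 64, 64), torch.randint(0, 3, (4, 64, 64))
real_t2 = torch.randn(4, 1, 64, 64)

# Stage 1: unpaired image translation ceT1 -> synthetic T2 (adversarial training omitted).
with torch.no_grad():
    synthetic_t2 = translator(cet1)

# Stage 2: supervised segmentation training on synthetic T2 with the original ceT1 labels.
for _ in range(5):
    opt.zero_grad()
    loss = ce(segmenter(synthetic_t2), cet1_labels)
    loss.backward()
    opt.step()

# Stage 3: self-training on real T2 using the model's own pseudo-labels to close the
# remaining gap between synthetic and real T2 images.
with torch.no_grad():
    pseudo_labels = segmenter(real_t2).argmax(dim=1)
for _ in range(5):
    opt.zero_grad()
    loss = ce(segmenter(real_t2), pseudo_labels)
    loss.backward()
    opt.step()
```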
Table 1. Performance of the winning approaches in each edition (Median±SD).
Edition | Cochlea Dice (%) | Cochlea ASSD (mm) | VS Dice (%) | VS ASSD (mm)
CrossMoDA 2021 | 85.71±3.86 | 0.13±1.62 | 87.02±12.74 | 0.39±0.44
CrossMoDA 2022 | 87.64±3.42 | 0.15±1.04 | 86.07±6.38 | 0.38±0.47
CrossMoDA 2023 | 84.07±3.24 | 0.21±0.07 | 87.09±6.26 | 0.40±0.22
The Dice and ASSD metrics for the VS and cochlea structures were used to assess the performance of the winning approaches (Figure 3, Table 1). The results indicate that the number of outliers decreased as the dataset expanded with each edition, which is notable given that the diversity of the datasets concurrently increased. Specifically, the winning approach of the 2023 edition reduced the number of outliers on the 2021 and 2022 testing data, demonstrating how increased data heterogeneity can enhance segmentation performance on homogeneous data. However, the Cochlea Dice score decreased in the 2023 edition, probably due to the introduction of the subdivided tumor annotation, which challenged the ability to maintain high performance across all segmentation classes.
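For reference, the snippet below sketches how the two evaluation metrics can be computed on binary segmentation masks with NumPy and SciPy. The toy spherical masks and the 1 mm isotropic voxel spacing are illustrative assumptions; the challenge ranking relies on its official evaluation scripts rather than this ad-hoc implementation.

```python
# Illustrative computation of Dice and average symmetric surface distance (ASSD).
import numpy as np
from scipy import ndimage

def dice(pred, gt):
    """Dice similarity coefficient between two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum())

def surface(mask):
    """Boundary voxels of a mask: the mask minus its binary erosion."""
    return np.logical_xor(mask, ndimage.binary_erosion(mask))

def assd(pred, gt, spacing=(1.0, 1.0, 1.0)):
    """Average symmetric surface distance (in mm, given the voxel spacing)."""
    pred_surf, gt_surf = surface(pred), surface(gt)
    # Distance from every voxel to the nearest surface voxel of the other mask.
    dt_gt = ndimage.distance_transform_edt(~gt_surf, sampling=spacing)
    dt_pred = ndimage.distance_transform_edt(~pred_surf, sampling=spacing)
    d_pred_to_gt = dt_gt[pred_surf]
    d_gt_to_pred = dt_pred[gt_surf]
    return (d_pred_to_gt.sum() + d_gt_to_pred.sum()) / (len(d_pred_to_gt) + len(d_gt_to_pred))

# Toy example: two slightly offset spheres as predicted and reference segmentations.
zz, yy, xx = np.ogrid[:64, :64, :64]
gt = (zz - 32) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 10 ** 2
pred = (zz - 34) ** 2 + (yy - 32) ** 2 + (xx - 32) ** 2 < 10 ** 2
print(f"Dice: {dice(pred, gt):.3f}, ASSD: {assd(pred, gt):.2f} mm")
```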