2026 Proffered Presentations
S096: A META-ANALYSIS OF ACCURACY AND CLINICAL IMPLICATIONS OF ARTIFICIAL INTELLIGENCE MODELLING IN PREDICTING REMISSION RATES AFTER ENDONASAL TRANSSPHENOIDAL SURGERY FOR CUSHING DISEASE
Mehdi Khaleghi, MD; Garrett Dyess; Asa Record; Andrew Romeo, MD; Jai D Thakur, MD; University of South Alabama
Purpose: Advancements in endonasal transsphenoidal surgery have improved outcomes for patients with Cushing’s disease (CD), yet persistent hypercortisolemia remains challenging. Remission criteria vary, and recurrence rates are high. Artificial intelligence (AI) models, including machine learning (ML) and artificial neural networks (ANN), may enhance prognostication by analyzing complex data patterns. This study systematically reviews the role of AI in predicting remission in CD. Our primary purpose is to investigate whether AI models can be integrated into the daily practice of neurosurgeons and neuroendocrinologists to identify patients with CD who are at risk for unfavorable outcomes.
Methods: The PubMed, Scopus, and Ovid/Medline databases were searched to identify articles on AI-based predictive modeling for CD remission. The pooled sensitivity, specificity, positive and negative likelihood ratios (PLR and NLR), area under the curve (AUC), and diagnostic odds ratio (DOR) were calculated using a random-effects model.
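For readers less familiar with these accuracy measures, the following is a minimal sketch of how each one is derived from a single study's 2x2 table, using the standard textbook definitions. The function name and the counts are hypothetical and purely illustrative; they are not data from the included studies, and the sketch does not reproduce the random-effects pooling itself.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from one 2x2 table.

    tp/fn: patients with the outcome who were / were not flagged by the model
    fp/tn: patients without the outcome who were / were not flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    nlr = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                           # equals (tp * tn) / (fp * fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr,
        "nlr": nlr,
        "dor": dor,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(tp=80, fn=20, fp=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A DOR of 1 indicates no discriminative ability, and larger values indicate better separation between patients with and without the outcome.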
Results: After screening according to the PRISMA guidelines, six studies met the eligibility criteria for quantitative synthesis. The overall pooled sensitivity and specificity were 87.9% (CI: 76.8%-94.1%) and 59.5% (CI: 56%-62.9%), respectively. In subgroup analysis, the pooled sensitivity was 96.3% (CI: 30.9%-99.9%) for short-term versus 86.3% (CI: 73.5%-93.4%) for long-term remission, and 89.6% (CI: 73.4%-96.4%) in the ML group versus 85.8% (CI: 56%-96.6%) in the ANN group. The pooled specificity was 84.2% (CI: 18.8%-99.2%) for short-term versus 61% (CI: 45.8%-74.3%) for long-term remission, and 59.3% (CI: 55.7%-62.7%) in the ML group versus 81.2% (CI: 16.8%-98.9%) in the ANN group. The pooled DOR was 13.86 (CI: 5.550-34.610), with subgroup DORs of 183.298 (CI: 0.129-261315.026) for short-term and 10.073 (CI: 5.264-19.272) for long-term remission. The DOR was 15.626 (CI: 2.994-81.552) in the ML group versus 29.186 (CI: 6.599-129.087) in the ANN group, and meta-regression showed a significantly higher pooled DOR for ANN than for ML (p=0.046). The pooled NLR and PLR were 0.195 (CI: 0.112-0.34) and 2.0 (CI: 1.83-2.19), respectively. Sensitivity and specificity did not differ significantly between short- and long-term remission or between ML and ANN. A weighted analysis showed a pooled AUC of 0.73 (CI: 0.69-0.78), reflecting a relatively high diagnostic accuracy.
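As a rough reading aid, the sketch below applies the textbook identities PLR = sensitivity / (1 - specificity), NLR = (1 - sensitivity) / specificity, and DOR = PLR / NLR to the overall pooled sensitivity and specificity reported above. The helper name is hypothetical. The implied values are only approximations: the pooled PLR, NLR, and DOR were presumably each estimated separately under the random-effects model, so they need not satisfy these identities exactly.

```python
def implied_ratios(sensitivity, specificity):
    """Likelihood ratios and DOR implied by one sensitivity/specificity pair."""
    plr = sensitivity / (1.0 - specificity)
    nlr = (1.0 - sensitivity) / specificity
    dor = plr / nlr
    return plr, nlr, dor


# Overall pooled point estimates taken from the Results paragraph.
plr, nlr, dor = implied_ratios(sensitivity=0.879, specificity=0.595)

print(f"implied PLR: {plr:.2f}  (reported pooled PLR: 2.0)")
print(f"implied NLR: {nlr:.3f} (reported pooled NLR: 0.195)")
print(f"implied DOR: {dor:.2f} (reported pooled DOR: 13.86)")
```

The implied and reported figures are of similar magnitude, which is what one would expect when each measure is pooled on its own scale rather than derived from the others.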
Conclusions: AI modelling using ML and ANN has the potential to identify CD patients who are less likely to achieve remission after endonasal transsphenoidal surgery and who may therefore benefit from closer observation, earlier intervention, or adjuvant treatments. By recognizing intrinsic patterns in variable-outcome relationships, ML and ANN have shown relatively high accuracy and efficiency, making them viable alternative tools for surgical planning and patient counseling. Before AI-driven models can be integrated into real-world practice and standard care for CD, further refinement and validation are necessary to enhance physician-patient rapport and outcomes.




