Machine learning-based prediction of clinical outcomes in Luminal B breast cancer using somatic variant signatures
DOI:
https://doi.org/10.51473/rcmos.v1i1.2026.2175
Keywords:
machine learning; breast cancer; somatic variants; prognosis; genomics
Abstract
Machine learning techniques allow complex genomic data to be integrated into models that predict clinical outcomes in oncology. In Luminal B breast cancer, high tumor heterogeneity makes prognostic stratification and therapeutic decision-making difficult. This study aimed to develop predictive models based on somatic variants to assess tumor aggressiveness. A machine learning pipeline was implemented with three supervised algorithms: XGBoost, Support Vector Machine (SVM), and Artificial Neural Networks. Features were selected by predictive importance, prioritizing biologically relevant genes. Model performance was evaluated using the area under the ROC curve (AUC), sensitivity, specificity, and F1-score. XGBoost achieved the highest AUC (0.88), followed by Neural Networks (AUC = 0.87) and SVM (AUC = 0.85). Interpretability analysis showed that PIK3CA, TP53, and ERBB2 were among the main contributors to model predictions. These findings highlight the potential of machine learning to identify genomic patterns associated with tumor aggressiveness and to support precision medicine strategies.
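To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how AUC, sensitivity, specificity, and F1-score can be computed from predicted scores. It is not the authors' pipeline: the function names `roc_auc` and `threshold_metrics`, the 0.5 decision threshold, the label coding (1 = aggressive), and the toy labels and scores are all assumptions made for illustration and are not data from the study.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Ties count as half a win (Mann-Whitney U formulation).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, and F1 at a fixed decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Hypothetical toy data (NOT from the study): 1 = aggressive, 0 = non-aggressive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, scores)
metrics = threshold_metrics(y_true, scores, threshold=0.5)
print(f"AUC = {auc:.3f}")
print(metrics)
```

AUC is threshold-independent, whereas sensitivity, specificity, and F1 depend on the chosen cut-off, so models with similar AUC values can still differ on the other three metrics.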
License
Copyright (c) 2026 Amanda Razera, Maiara Luiza Biava Miri, Eduardo de Almeida Ravarena, Gabryela Paulista Mateucci, Camila Padilha Duda (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.

