Machine learning-based prediction of clinical outcomes in Luminal B breast cancer using somatic variant signatures

Authors

  • Amanda Razera, Universidade Estadual do Centro-Oeste
  • Maiara Luiza Biava Miri, Centro Universitário Campo Real
  • Eduardo de Almeida Ravarena, Centro Universitário Campo Real
  • Gabryela Paulista Mateucci, Centro Universitário Campo Real
  • Camila Padilha Duda, Centro Universitário Campo Real

DOI:

https://doi.org/10.51473/rcmos.v1i1.2026.2175

Keywords:

Machine learning. Breast cancer. Somatic variants. Prognosis. Genomics.

Abstract

Machine learning techniques allow complex genomic data to be integrated into predictions of clinical outcomes in oncology. In Luminal B breast cancer, high tumor heterogeneity is a major obstacle to prognostic stratification and therapeutic decision-making. This study aimed to develop predictive models of tumor aggressiveness based on somatic variants. A machine learning pipeline was implemented with three supervised algorithms: XGBoost, Support Vector Machine (SVM), and Artificial Neural Networks. Features were selected by predictive importance, prioritizing biologically relevant genes. Model performance was evaluated using the area under the ROC curve (AUC), sensitivity, specificity, and F1-score. XGBoost achieved the highest AUC (0.88), followed by Neural Networks (AUC = 0.87) and SVM (AUC = 0.85). Interpretability analysis identified PIK3CA, TP53, and ERBB2 among the main contributors to model predictions. These findings highlight the potential of machine learning approaches to identify genomic patterns associated with tumor aggressiveness and to support precision medicine strategies.
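As a minimal sketch of how the evaluation metrics named in the abstract are computed, the Python example below derives AUC, sensitivity, specificity, and F1-score from a set of predicted scores. The labels, scores, the 0.5 decision threshold, and the function names `roc_auc` and `threshold_metrics` are illustrative assumptions; they are not the study's data, results, or code.

```python
# Illustrative sketch only: the labels and scores are made up and are
# NOT data or results from the study. Assumes both classes are present
# and that at least one sample is predicted positive.

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample is scored
    higher than a random negative sample (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity, and F1 after binarizing the scores."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Hypothetical labels: 1 = aggressive tumor, 0 = non-aggressive tumor.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predicted probabilities from any classifier.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, scores)
metrics = threshold_metrics(y_true, scores)
print(f"AUC = {auc:.3f}")
print(metrics)
```

AUC summarizes ranking quality across all thresholds, whereas sensitivity, specificity, and F1 depend on the chosen cutoff, which is why reports of this kind usually give them together.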

Published

2026-03-19

How to Cite

RAZERA, Amanda; MIRI, Maiara Luiza Biava; RAVARENA, Eduardo de Almeida; MATEUCCI, Gabryela Paulista; DUDA, Camila Padilha. Machine learning-based prediction of clinical outcomes in Luminal B breast cancer using somatic variant signatures. Multidisciplinary Scientific Journal The Knowledge, Brazil, v. 1, n. 1, 2026. DOI: 10.51473/rcmos.v1i1.2026.2175. Available at: https://submissoesrevistarcmos.com.br/rcmos/article/view/2175. Accessed: 21 Mar. 2026.