Ethical Training of Neural Networks: Datasets, Bias, and Privacy in Deepfake Detection

Authors

  • Matheus de Oliveira Pereira Paula, Université Côte d’Azur

DOI:

https://doi.org/10.51473/rcmos.v1i2.2024.1871

Keywords:

deepfakes; algorithmic ethics; dataset bias; digital privacy; informational sovereignty

Abstract

The automated detection of deepfakes through large-scale neural networks has become a civilizational necessity for protecting informational integrity in the digital environment. However, the most critical point in this architecture does not lie in the model itself, but in the ethical formation of the datasets used during training, which are frequently affected by structural biases related to race, gender, and geopolitics, as well as by potential violations of privacy and non-consensual biometric data extraction. This article develops an epistemologically rigorous analysis of the ethical dilemmas embedded in dataset construction for deepfake detection models, examining how flawed curation can reinforce algorithmic oppression, reproduce historical inequalities, and ultimately endanger democratic legitimacy. It concludes by proposing an ethical framework and engineering guidelines for responsible and auditable dataset governance, anchored in informational sovereignty and computational justice.

Author Biography

  • Matheus de Oliveira Pereira Paula, Université Côte d’Azur

    Bachelor's degree in Information Systems, Instituto Federal de Educação, Ciência e Tecnologia Fluminense. Master's degree: MSc in Data Science and Artificial Intelligence, Université Côte d’Azur.

Published

2024-12-20

How to Cite

PAULA, Matheus de Oliveira Pereira. Ethical Training of Neural Networks: Datasets, Bias, and Privacy in Deepfake Detection. Multidisciplinary Scientific Journal The Knowledge, Brazil, v. 1, n. 2, 2024. DOI: 10.51473/rcmos.v1i2.2024.1871. Available at: https://submissoesrevistarcmos.com.br/rcmos/article/view/1871. Accessed: 1 Jan. 2026.