Big Data Tools Review – Case Study: Spark Vs Flink
DOI:
https://doi.org/10.51473/rcmos.v1i2.2025.1517
Keywords:
Big Data, Data Analysis, Open Source, Spark, Flink
Abstract
This article compares the performance of two Big Data tools, Spark and Flink, taking into account five attributes that make Big Data systems highly complex and demanding to process. Because of these demands, the tools used to work with such data tend to be significantly more robust than conventional ones, and often more expensive. Open-source systems provide access to their source code, which makes it easier for collaborators to understand the systems and algorithms and to adapt them to their project needs. The methodology is descriptive research that relates the variables, with an exploratory and explanatory approach of a qualitative nature. A case study compares the Spark and Flink platforms on factors such as scalability, data storage, complexity, and implementation options.
License
Copyright (c) 2025 David Francisco Cudijinguissa (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.