Evaluating the Performance of NoSQL Databases for Big Data in Cloud Computing Environments
This study evaluates the performance of NoSQL databases in distributed cloud computing environments, addressing the lack of comprehensive benchmarking in this domain. Specifically, it investigates MongoDB and Riak KV, two widely used NoSQL systems, across several cloud platforms, including Google Cloud, DigitalOcean, and OpenStack. Using the Yahoo Cloud Serving Benchmark (YCSB), we designed and implemented a benchmarking model that measures key performance indicators, including latency, throughput, and scalability, under varying workloads and data sizes. The analysis showed that MongoDB deployed on Google Cloud consistently outperformed the other configurations, with higher throughput and lower latency in read and write operations. In contrast, Riak KV generally exhibited higher latency, especially in scan-intensive workloads. To support practical decision-making, we developed a decision tree model from the empirical findings to guide the selection of cloud computing platforms and databases. The proposed benchmarking framework is modular and extensible, so additional NoSQL technologies, cloud providers, deployment options, and performance metrics can be incorporated in future work. This research presents a novel, systematic methodology for evaluating NoSQL database performance in cloud environments, providing actionable insights for selecting high-performing, scalable solutions in big data applications and supporting broader research and real-world use in distributed systems and cloud computing.
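As a rough illustration of the latency and throughput indicators named in the abstract, the following minimal sketch summarizes a single benchmark run. It is not the study's implementation: the function name `summarize_run`, the input format (a list of per-operation latencies in microseconds plus the run's wall-clock duration), and the choice of a nearest-rank 95th percentile are all assumptions made for this example.

```python
from statistics import mean


def summarize_run(latencies_us, wall_clock_s):
    """Summarize one benchmark run (hypothetical helper, not from the study).

    latencies_us: per-operation latencies in microseconds (assumed unit).
    wall_clock_s: total elapsed time of the run in seconds.
    """
    if not latencies_us or wall_clock_s <= 0:
        raise ValueError("need at least one operation and a positive duration")

    ordered = sorted(latencies_us)
    # Nearest-rank 95th percentile: rank = ceil(0.95 * n), done in integer math.
    rank = max(1, -(-95 * len(ordered) // 100))

    return {
        "operations": len(ordered),
        # Throughput is completed operations divided by elapsed time.
        "throughput_ops_per_s": len(ordered) / wall_clock_s,
        "mean_latency_us": mean(ordered),
        "p95_latency_us": ordered[rank - 1],
    }


if __name__ == "__main__":
    # Made-up sample values, used only to show the call shape.
    print(summarize_run([100, 200, 300, 400], wall_clock_s=2.0))
```

Computing the same summary for each database, platform, and workload combination would give comparable rows for a side-by-side comparison of configurations.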
This work (including HTML and PDF files) is licensed under a Creative Commons Attribution 4.0 International License.





















