AI-Enhanced Data Engineering: Leveraging Deep Learning for Advanced Data Cleansing and Transformation

Authors

  • Sunil Kumar Mudusu, Lead AI Data Engineer, Church Mutual Insurance Company, S.I, USA

DOI:

https://doi.org/10.15662/jm1gp434

Keywords:

Cleansing, Deep Learning, Transformation, Data Engineering

Abstract

In this paper, we study the integration of deep learning techniques into data engineering for advanced data cleansing and transformation. The study explores how these AI-based methods automate the identification and correction of data inconsistencies, missing values, and outliers, producing more accurate data for further analysis and business intelligence purposes.

Published

2025-02-11

How to Cite

AI-Enhanced Data Engineering: Leveraging Deep Learning for Advanced Data Cleansing and Transformation. (2025). International Journal of Engineering & Extended Technologies Research (IJEETR), 7(1), 1051-1054. https://doi.org/10.15662/jm1gp434