AI-Powered Data Engineering Frameworks for Smart Manufacturing Quality Control
DOI:
https://doi.org/10.15662/IJEETR.2024.0606020
Keywords:
AI-Powered Quality Control, Smart Manufacturing Systems, Industrial AI Architectures, Manufacturing Data Engineering Frameworks, Defect Detection Algorithms, Root-Cause Analysis Modeling, Process Reliability Optimization, Machine Learning in Production, Industrial IoT (IIoT) Data Pipelines, End-to-End Quality Control Architecture, Data Governance in Manufacturing, Quality Control Data Engineering-as-a-Service, Real-Time Manufacturing Analytics, Supervised and Unsupervised Learning Integration, Time Series Forecasting for Maintenance, Data Quality Requirements in Smart Factories, Automated Inspection Systems, Big Data in Industrial Environments, Self-Driving Factory Paradigm, Intelligent Production Process Optimization
Abstract
This paper presents an evidence-based, formal analysis of AI methods, data pipelines, and governance for improving defect detection and process reliability in smart manufacturing quality control. Its contributions cover data engineering prerequisites (data sources, quality requirements, acquisition approaches, ingestion methods, latency considerations, and integration), key decision-supporting AI techniques, a comprehensive system architecture for end-to-end quality control, and high-level data governance requirements.
Applying artificial intelligence (AI) to automated quality control in manufacturing enables self-driving factories with reduced defect rates. AI methods are applied to defect detection, correlation, and root-cause forecasting, bridging the gaps between Machine Learning, Big Data, and IoT. Data quality proves decisive for these operations and raises specialized data engineering requirements across the entire analytical pipeline, including Quality Control Data Engineering-as-a-Service. Framing the analysis within the broader context of smart factory data engineering yields a comprehensive set of quality control data quality requirements, and combinations of supervised, unsupervised, and time series methods are explored to tackle both defect detection and repair procedure prediction.
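As a rough illustration of why data quality gates sit in front of the detection models, the following minimal Python sketch checks a batch of sensor readings for completeness and timeliness and only then applies a simple unsupervised detector (a trailing-window z-score). It is a sketch under stated assumptions and not the architecture the paper describes: the `Reading` type, the function names, and every threshold are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import List, Optional


@dataclass
class Reading:
    """Hypothetical sensor record; field names are illustrative assumptions."""
    sensor_id: str
    timestamp: float          # seconds since an arbitrary epoch
    value: Optional[float]    # None models a missing measurement


def passes_quality_gate(readings: List[Reading],
                        max_missing_ratio: float = 0.1,
                        max_gap_s: float = 5.0) -> bool:
    """Completeness and timeliness checks (assumed thresholds)."""
    if not readings:
        return False
    missing = sum(1 for r in readings if r.value is None)
    if missing / len(readings) > max_missing_ratio:
        return False
    # Timestamps must be strictly increasing and no gap may exceed max_gap_s.
    gaps = [b.timestamp - a.timestamp for a, b in zip(readings, readings[1:])]
    return all(0 < g <= max_gap_s for g in gaps)


def zscore_anomalies(values: List[float],
                     window: int = 5,
                     threshold: float = 3.0) -> List[int]:
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


def detect_defect_candidates(readings: List[Reading]) -> Optional[List[Reading]]:
    """Score a batch only if it passes the quality gate; None means 'not scored'."""
    if not passes_quality_gate(readings):
        return None
    clean = [r for r in readings if r.value is not None]
    indices = zscore_anomalies([r.value for r in clean])
    return [clean[i] for i in indices]


if __name__ == "__main__":
    values = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 15.0, 10.0]
    batch = [Reading("spindle-1", float(t), v) for t, v in enumerate(values)]
    for r in detect_defect_candidates(batch) or []:
        print(f"candidate defect at t={r.timestamp}: value={r.value}")
```

Returning `None` for a rejected batch, rather than an empty list, keeps "no anomalies found" distinct from "data too poor to score", so a downstream consumer does not mistake a broken feed for a healthy process.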