Design and FPGA Implementation of a Low-Power CNN Accelerator for Edge AI Applications
DOI:
https://doi.org/10.15662/IJEETR.2026.0802079
Keywords:
Bit Separable Multiplier, Convolutional Neural Network, Energy Efficient Hardware, Field Programmable Gate Array, Radix-4 Booth Encoding.
Abstract
Convolutional Neural Networks (CNNs) have become the cornerstone of modern edge AI applications, enabling real-time inference for vision, speech, and IoT systems. However, their deployment on resource-constrained devices remains limited by high computational cost and energy consumption, primarily due to the intensive multiply-accumulate (MAC) operations in convolutional layers. This work presents the design and FPGA implementation of a bit-separable Radix-4 Booth multiplier tailored for power-efficient CNN accelerators. Unlike conventional Booth multipliers, the proposed design partitions operands into separable bit-groups to enhance parallelism while minimizing switching activity. Radix-4 encoding halves the number of partial products, while bit-separability enables selective activation of sub-multipliers, reducing dynamic power without sacrificing throughput. The architecture also applies approximate computing: it exploits the error tolerance of CNNs to allow operand gating and truncated accumulation in the less significant bit segments. To further improve energy efficiency, the multiplier is embedded in a tiled systolic MAC array with weight-stationary dataflow, which reduces off-chip memory accesses. FPGA synthesis results on the Xilinx Zynq-7000 xc7z020 device show significant reductions in dynamic power consumption and LUT utilization compared to standard Booth and array multipliers, while maintaining competitive inference accuracy across benchmark CNN models. The proposed multiplier achieves up to 80% power savings and a 30% improvement in resource efficiency with negligible accuracy loss (<1%) when used for quantized CNNs on edge workloads. These results indicate that bit-separable Radix-4 Booth multipliers can serve as the core arithmetic engine for low-power CNN accelerators, enabling scalable, energy-efficient, and high-performance edge AI deployments.
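The halving of partial products by Radix-4 Booth encoding can be illustrated with a short software model. The sketch below is not the authors' hardware design: the function names `booth_radix4_digits` and `booth_radix4_multiply`, the 8-bit default width, and the skipping of zero digits (used here only as a loose stand-in for gating an idle sub-multiplier) are all assumptions made for illustration. It covers standard Radix-4 Booth recoding only and does not model the bit-separable partitioning or the truncated accumulation.

```python
def booth_radix4_digits(multiplier: int, bits: int = 8) -> list[int]:
    """Recode a signed, `bits`-wide multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}; there are bits/2 digits,
    i.e. half as many partial products as a bit-by-bit multiplier.
    """
    if bits % 2 != 0:
        raise ValueError("bits must be even for radix-4 recoding")
    if not -(1 << (bits - 1)) <= multiplier < (1 << (bits - 1)):
        raise ValueError("multiplier does not fit in the given width")

    y = multiplier & ((1 << bits) - 1)  # two's complement bit pattern
    digits = []
    prev = 0  # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # Standard radix-4 Booth rule: d = -2*y[2k+1] + y[2k] + y[2k-1]
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_radix4_multiply(multiplicand: int, multiplier: int, bits: int = 8) -> int:
    """Multiply by summing one shifted partial product per Booth digit."""
    product = 0
    for k, digit in enumerate(booth_radix4_digits(multiplier, bits)):
        if digit == 0:
            continue  # assumption: a zero digit leaves its partial product idle
        product += (digit * multiplicand) << (2 * k)  # weight 4**k
    return product


if __name__ == "__main__":
    print(booth_radix4_digits(7))          # [-1, 2, 0, 0]
    print(booth_radix4_multiply(-13, 7))   # -91
```

For an 8-bit multiplier the recoding yields four digits, so at most four partial products are accumulated instead of eight; in this model, digits equal to zero contribute no addition at all.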