Paper Title

Design of a Hardware-Software Co-Optimization Framework for Efficient Execution of Convolutional Layers in Deep Neural Networks

Keywords

  • convolutional neural networks
  • hardware acceleration
  • co-optimization
  • embedded AI
  • FPGA
  • layer profiling
  • deep learning
  • edge computing

Article Type

Research Article

Journal

IACSE - International Journal of Artificial Intelligence and Machine Learning

Issue

Volume: 3 | Issue: 1 | Page No: 1-6

Published On

February, 2022

Abstract

The rapid expansion of deep learning applications has driven significant interest in optimizing the execution of convolutional neural networks (CNNs), particularly on edge and embedded devices. The convolutional layer, the computational backbone of CNNs, is highly resource-intensive and requires efficient implementation strategies. This paper proposes a hardware-software co-optimization framework that jointly tunes computational graph mappings and hardware accelerator configurations to maximize throughput and minimize energy consumption. The design leverages parameter-aware scheduling and layer-specific profiling to bridge the performance-efficiency gap observed in traditional accelerator deployments. Empirical results demonstrate up to a 2.4× improvement in latency and a 1.9× reduction in energy usage over baseline FPGA-based implementations.
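The joint tuning described in the abstract can be illustrated with a minimal sketch. It is not the paper's framework. It assumes a toy analytical cost model, and every name and constant in it (`estimate`, `co_optimize`, the 200 MHz clock, the per-MAC and per-tile energy figures, the candidate tile sizes and processing-element counts) is hypothetical. For one convolutional layer, it searches software tile sizes and hardware processing-element (PE) counts together and keeps the pair with the lowest energy-delay product.

```python
import math
from itertools import product


def estimate(layer, tile, pes, clock_hz=200e6):
    """Toy latency/energy model for one conv layer.

    All constants below are illustrative assumptions, not measurements.
    layer = (height, width, in_channels, out_channels, kernel_size)
    """
    h, w, c_in, c_out, k = layer
    # Multiply-accumulate count for a stride-1, same-padded convolution.
    macs = h * w * c_in * c_out * k * k
    # Software choice: how many spatial tiles the output is split into.
    tiles = math.ceil(h / tile) * math.ceil(w / tile)
    # Hardware choice: MACs are spread over `pes` units; each tile adds
    # an assumed fixed 1000-cycle data-loading overhead.
    cycles = math.ceil(macs / pes) + tiles * 1000
    latency_s = cycles / clock_hz
    # Assumed 1 pJ per MAC and 5 nJ per tile load.
    dynamic_j = macs * 1e-12 + tiles * 5e-9
    # Assumed 1 mW of static power per PE while the layer runs.
    static_j = pes * 1e-3 * latency_s
    return latency_s, dynamic_j + static_j


def co_optimize(layer, tile_sizes=(4, 8, 16, 32), pe_counts=(64, 128, 256, 512)):
    """Exhaustively search (tile, pes) pairs; minimize energy-delay product."""
    best = None
    for tile, pes in product(tile_sizes, pe_counts):
        latency_s, energy_j = estimate(layer, tile, pes)
        score = latency_s * energy_j
        if best is None or score < best[0]:
            best = (score, tile, pes, latency_s, energy_j)
    return best


if __name__ == "__main__":
    # Hypothetical 56x56 layer, 64 -> 64 channels, 3x3 kernel.
    score, tile, pes, latency_s, energy_j = co_optimize((56, 56, 64, 64, 3))
    print(f"tile={tile} pes={pes} latency={latency_s:.6f}s energy={energy_j:.6f}J")
```

Exhaustive search is used here only because the candidate space is tiny. A practical framework would replace the toy model with profiled per-layer measurements and would likely use a smarter search strategy.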
