Abstract
Deploying deep neural networks (DNNs) on low-power microcontroller units (MCUs) poses significant challenges due to constraints in memory, computational power, and energy consumption. This study evaluates quantization and pruning techniques to optimize DNN models for such environments. We conduct empirical benchmarking using representative networks on embedded platforms and compare performance trade-offs across accuracy, inference time, memory footprint, and power consumption. Our findings confirm that aggressive quantization and structured pruning significantly reduce resource usage with minimal accuracy degradation, demonstrating their suitability for edge intelligence applications.