A Comprehensive Review of Processing-in-Memory Architectures for Deep Neural Networks
Published: 2024-07-16
Issue: 7
Volume: 13
Page: 174
ISSN: 2073-431X
Container-title: Computers
Language: en
Short-container-title: Computers
Author:
Kaur Rupinder (1), Asad Arghavan (1), Mohammadi Farah (1)
Affiliation:
1. Electrical, Computer and Biomedical Engineering Department, Toronto Metropolitan University, 350 Victoria St, Toronto, ON M5B 2K3, Canada
Abstract
This comprehensive review explores the advancements in processing-in-memory (PIM) techniques and chiplet-based architectures for deep neural networks (DNNs). It addresses the challenges of monolithic chip architectures and highlights the benefits of chiplet-based designs in terms of scalability and flexibility. This review emphasizes dataflow-awareness, communication optimization, and thermal considerations in PIM-enabled manycore architectures. It discusses tailored dataflow requirements for different machine learning workloads and presents a heterogeneous PIM system for energy-efficient neural network training. Additionally, it explores thermally efficient dataflow-aware monolithic 3D (M3D) NoC architectures for accelerating CNN inferencing. Overall, this review provides valuable insights into the development and evaluation of chiplet and PIM architectures, emphasizing improved performance, energy efficiency, and inference accuracy in deep learning applications.