1. H. Zhang, Z. Zheng, S. Xu, W. Dai, Q. Ho, X. Liang, Z. Hu, J. Wei, P. Xie, E.P. Xing, Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters, in: 2017 USENIX Annual Technical Conference, USENIX ATC 17, 2017, pp. 181–193.
2. Ying et al., Image classification at supercomputer scale, 2018.
3. S. Wang, D. Li, Y. Cheng, J. Geng, Y. Wang, S. Wang, S. Xia, J. Wu, BML: A High-Performance, Low-Cost Gradient Synchronization Algorithm for DML Training, in: Proceedings of the 32nd International Conference on Neural Information Processing Systems, 2018, pp. 4243–4253.
4. Zhou et al., Falcon: Addressing stragglers in heterogeneous parameter server via multiple parallelism, IEEE Trans. Comput., 2021.