PIT: Optimization of Dynamic Sparse Deep Learning Models via Permutation Invariant Transformation

Published: 2023-10-23
Container-title: Proceedings of the 29th Symposium on Operating Systems Principles

Author:

Zheng Ningxin¹, Jiang Huiqiang¹, Zhang Quanlu², Han Zhenhua¹, Ma Lingxiao², Yang Yuqing¹, Yang Fan², Zhang Chengruidong¹, Qiu Lili¹, Yang Mao², Zhou Lidong²

Affiliation:

1. Microsoft Research, Shanghai, China

2. Microsoft Research, Beijing, China

Publisher

ACM

References: 72 articles.

1. Longformer. https://github.com/allenai/longformer, 2020.

2. Accelerating inference with sparsity using the NVIDIA Ampere architecture and NVIDIA TensorRT. https://developer.nvidia.com/blog/accelerating-inference-with-sparsity-using-ampere-and-tensorrt/, 2021.

3. The API reference guide for cuSPARSE, the CUDA sparse matrix library. https://docs.nvidia.com/cuda/cusparse/index.html, 2021.

4. CUDA C++ programming guide. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#wmma, 2021.

5. cuSPARSELt: A high-performance CUDA library for sparse matrix-matrix multiplication. https://docs.nvidia.com/cuda/cusparselt/index.html, 2021.