Affiliation:
1. Department of Computer Science, University of Illinois at Urbana–Champaign, Urbana, Illinois, USA
Abstract
Algebraic multigrid (AMG) is often viewed as a scalable [Formula: see text] solver for sparse linear systems. Yet AMG lacks parallel scalability because of increasingly large communication costs, both in the initial construction of the multigrid hierarchy and in the iterative solve phase. This work introduces a parallel implementation of AMG that reduces the cost of communication, yielding improved parallel scalability. It is common in Message Passing Interface (MPI) programs, particularly in the MPI-everywhere approach, to arrange inter-process communication so that messages are sent the same way regardless of where the sending and receiving processes are located. Performance tests show notable differences between the cost of intra-node and inter-node communication, motivating a restructuring of communication. In the restructured approach, the communication schedule takes advantage of the less costly intra-node communication, reducing both the number and the size of inter-node messages. Node-centric communication extends to the range of components in both the setup and solve phases of AMG, yielding improved weak and strong scaling of the entire method.
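To make the claim about fewer and smaller inter-node messages concrete, the following is a minimal counting sketch, not the paper's implementation. It assumes ranks are placed on nodes in contiguous blocks of `ppn` processes, and it models each message as a set of data indices owned by the sender. The function names, the `messages` layout, and the example pattern are all hypothetical. The model counts only inter-node traffic and ignores the additional intra-node redistribution steps that a node-centric schedule would need.

```python
from collections import defaultdict


def node_of(rank, ppn):
    """Node index of a rank, assuming contiguous blocks of ppn ranks per node."""
    return rank // ppn


def standard_cost(messages, ppn):
    """Inter-node (message count, values sent) when every process pair
    communicates directly.

    messages maps (src_rank, dst_rank) -> set of data indices owned by src_rank.
    """
    count = 0
    volume = 0
    for (src, dst), indices in messages.items():
        if node_of(src, ppn) != node_of(dst, ppn):
            count += 1
            volume += len(indices)
    return count, volume


def node_aware_cost(messages, ppn):
    """Inter-node (message count, values sent) when all traffic between a pair
    of nodes is bundled into one message and duplicate values are sent once."""
    bundles = defaultdict(set)
    for (src, dst), indices in messages.items():
        src_node, dst_node = node_of(src, ppn), node_of(dst, ppn)
        if src_node != dst_node:
            # A value is identified by (owner rank, index); a value needed by
            # several ranks on the destination node crosses the network once.
            bundles[(src_node, dst_node)].update((src, i) for i in indices)
    count = len(bundles)
    volume = sum(len(values) for values in bundles.values())
    return count, volume


# Hypothetical pattern: 4 ranks, 2 per node (node 0 = {0, 1}, node 1 = {2, 3}).
messages = {
    (0, 2): {0, 1},
    (0, 3): {1, 2},
    (1, 2): {5},
    (1, 3): {5},
    (0, 1): {0},  # intra-node, never counted as inter-node traffic
}

print(standard_cost(messages, ppn=2))    # (4, 6)
print(node_aware_cost(messages, ppn=2))  # (1, 4)
```

In this toy pattern, bundling replaces four inter-node messages with one, and removing duplicated values reduces the number of values crossing the network from six to four. Actual savings depend on the sparsity pattern and the process layout.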
Funder
National Nuclear Security Administration
Subject
Hardware and Architecture, Theoretical Computer Science, Software
Cited by
15 articles.
1. Reducing Operator Complexity of Galerkin Coarse-grid Operators with Machine Learning;SIAM Journal on Scientific Computing;2024-09-06
2. BoostN: Optimizing Imbalanced Neighborhood Communication on Homogeneous Many-Core System;Proceedings of the 53rd International Conference on Parallel Processing;2024-08-12
3. Exploiting mesh structure to improve multigrid performance for saddle-point problems;The International Journal of High Performance Computing Applications;2024-06-18
4. CommBench: Micro-Benchmarking Hierarchical Networks with Multi-GPU, Multi-NIC Nodes;Proceedings of the 38th ACM International Conference on Supercomputing;2024-05-30
5. Optimizing Irregular Communication with Neighborhood Collectives and Locality-Aware Parallelism;Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis;2023-11-12