Reducing communication in algebraic multigrid with multi-step node aware communication

Published: 2020-06-11 Issue: 5 Volume: 34 Page: 547-561
ISSN: 1094-3420
Container-title: The International Journal of High Performance Computing Applications
Language: en
Short-container-title: The International Journal of High Performance Computing Applications

Author:

Bienz Amanda, Gropp William D, Olson Luke N¹

Affiliation:

1. Department of Computer Science, University of Illinois at Urbana–Champaign, Urbana, Illinois, USA

Abstract

Algebraic multigrid (AMG) is often viewed as a scalable O(n) solver for sparse linear systems. Yet, AMG lacks parallel scalability due to increasingly large costs associated with communication, both in the initial construction of a multigrid hierarchy and in the iterative solve phase. This work introduces a parallel implementation of AMG that reduces the cost of communication, yielding improved parallel scalability. It is common in Message Passing Interface (MPI) programs, particularly in the MPI-everywhere approach, to arrange inter-process communication so that messages are sent without regard to the location of the send and receive processes. Performance tests show notable differences in the cost of intra- and internode communication, motivating a restructuring of communication. In this case, the communication schedule takes advantage of the less costly intra-node communication, reducing both the number and the size of internode messages. Node-centric communication extends to a range of components in both the setup and solve phases of AMG, yielding an improvement in the weak and strong scaling of the entire method.
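The claim that routing through intra-node communication reduces both the number and the size of internode messages can be illustrated with a small counting model. The sketch below is not the authors' implementation and does not use MPI. It assumes that ranks are assigned to nodes in contiguous blocks of `ppn` processes, that each message is a hypothetical `(source rank, destination rank, set of global vector indices)` tuple, and that all traffic between a pair of nodes can be merged into one internode message with duplicate entries sent once. The function names `standard_cost` and `node_aware_cost` are illustrative, and the extra intra-node gather and redistribution steps are not costed.

```python
from collections import defaultdict


def node_of(rank, ppn):
    """Map a rank to its node, assuming contiguous blocks of ppn ranks per node."""
    return rank // ppn


def standard_cost(messages, ppn):
    """Count internode messages and entries when every rank sends directly."""
    count = 0
    volume = 0
    for src, dst, indices in messages:
        if node_of(src, ppn) != node_of(dst, ppn):
            count += 1
            volume += len(indices)
    return count, volume


def node_aware_cost(messages, ppn):
    """Count internode messages and entries when traffic is merged per node pair.

    Entries needed by several ranks on the destination node cross the
    network once and would be redistributed on-node (not modeled here).
    """
    merged = defaultdict(set)
    for src, dst, indices in messages:
        src_node, dst_node = node_of(src, ppn), node_of(dst, ppn)
        if src_node != dst_node:
            merged[(src_node, dst_node)].update(indices)
    count = len(merged)
    volume = sum(len(entries) for entries in merged.values())
    return count, volume


# Hypothetical pattern: ranks 0-3 live on node 0, ranks 4-7 on node 1.
ppn = 4
messages = [
    (0, 4, {10, 11}),
    (0, 5, {10, 12}),  # entry 10 is also needed by rank 4 on the same node
    (1, 4, {20}),
    (1, 6, {20, 21}),  # entry 20 is also needed by rank 4 on the same node
    (2, 3, {30}),      # intra-node, never crosses the network
]

print("standard  :", standard_cost(messages, ppn))
print("node-aware:", node_aware_cost(messages, ppn))
```

In this toy pattern the direct approach sends four internode messages carrying seven entries, while the merged schedule sends one internode message carrying five. The saving is paid for with additional intra-node traffic, which the abstract reports as the less costly kind.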

Funder

National Nuclear Security Administration

Publisher

SAGE Publications

Subject

Hardware and Architecture, Theoretical Computer Science, Software

Link

http://journals.sagepub.com/doi/pdf/10.1177/1094342020925535

References: 31 articles.

1. Ultrascalable Implicit Finite Element Analyses in Solid Mechanics with over a Half a Billion Degrees of Freedom

2. Asynchronous Task-Based Parallelization of Algebraic Multigrid

3. Challenges of Scaling Algebraic Multigrid Across Modern Multicore Architectures

4. On the Performance of an Algebraic Multigrid Solver on Multicore Clusters

5. Application-specific topology-aware mapping for three dimensional topologies

Cited by 15 articles.

1. Reducing Operator Complexity of Galerkin Coarse-grid Operators with Machine Learning; SIAM Journal on Scientific Computing; 2024-09-06

2. BoostN: Optimizing Imbalanced Neighborhood Communication on Homogeneous Many-Core System; Proceedings of the 53rd International Conference on Parallel Processing; 2024-08-12

3. Exploiting mesh structure to improve multigrid performance for saddle-point problems; The International Journal of High Performance Computing Applications; 2024-06-18

4. CommBench: Micro-Benchmarking Hierarchical Networks with Multi-GPU, Multi-NIC Nodes; Proceedings of the 38th ACM International Conference on Supercomputing; 2024-05-30

5. Optimizing Irregular Communication with Neighborhood Collectives and Locality-Aware Parallelism; Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis; 2023-11-12