Affiliation:
1. Argonne National Laboratory, IL, USA
2. Inria, Rennes - Bretagne Atlantique Research Centre, France
3. University of Illinois at Urbana-Champaign, IL, USA
4. University of Wisconsin - Madison, WI, USA
Abstract
With exascale computing on the horizon, reducing performance variability in data management tasks (storage, visualization, analysis, etc.) is becoming a key challenge in sustaining high performance. This variability significantly impacts the overall application performance at scale and its predictability over time.
In this article, we present Damaris, a system that leverages dedicated cores in multicore nodes to offload data management tasks, including I/O, data compression, scheduling of data movements, in situ analysis, and visualization. We evaluate Damaris with the CM1 atmospheric simulation and the Nek5000 computational fluid dynamic simulation on four platforms, including NICS’s Kraken and NCSA’s Blue Waters. Our results show that (1) Damaris fully hides the I/O variability as well as all I/O-related costs, thus making simulation performance predictable; (2) it increases the sustained write throughput by a factor of up to 15 compared with standard I/O approaches; (3) it allows almost perfect scalability of the simulation up to over 9,000 cores, as opposed to state-of-the-art approaches that fail to scale; and (4) it enables a seamless connection to the VisIt visualization software to perform in situ analysis and visualization in a way that impacts neither the performance of the simulation nor its variability.
In addition, we extended our implementation of Damaris to also support the use of dedicated nodes and conducted a thorough comparison of the two approaches—dedicated cores and dedicated nodes—for I/O tasks with the aforementioned applications.
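For illustration, the minimal sketch below shows the dedicated-core pattern the abstract describes: compute cores hand their data to Damaris and continue simulating, while the dedicated cores on each node handle I/O, compression, or in situ analysis. It assumes the public Damaris C client API (damaris_initialize, damaris_start, damaris_client_comm_get, damaris_write, damaris_end_iteration, damaris_stop, damaris_finalize); the configuration file name, variable name, and array size are hypothetical, and the API shown may differ from the version evaluated in the article.

/* Hypothetical sketch of a simulation using Damaris dedicated cores. */
#include <mpi.h>
#include <stdlib.h>
#include <Damaris.h>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    /* All processes load the same configuration; Damaris splits them into
     * computation cores and dedicated data-management cores per node. */
    damaris_initialize("config.xml", MPI_COMM_WORLD);

    int is_client = 0;
    damaris_start(&is_client);          /* dedicated cores stay inside this call */

    if (is_client) {
        MPI_Comm comm;
        damaris_client_comm_get(&comm); /* communicator of compute cores only */

        double* field = malloc(64 * 64 * 64 * sizeof(double));
        for (int step = 0; step < 100; step++) {
            /* ... compute one time step using 'comm' instead of MPI_COMM_WORLD ... */

            /* Asynchronous handoff: the data is exposed to the dedicated cores,
             * which perform I/O or in situ processing without blocking the
             * simulation. "atmosphere/field" is an illustrative variable name
             * that would be declared in config.xml. */
            damaris_write("atmosphere/field", field);
            damaris_end_iteration();
        }
        free(field);
        damaris_stop();
    }

    damaris_finalize();
    MPI_Finalize();
    return 0;
}

In the dedicated-node variant compared in the article, the same client calls apply, but the data-management processes run on separate nodes rather than on cores reserved within each compute node.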
Publisher
Association for Computing Machinery (ACM)
Subject
Computational Theory and Mathematics, Computer Science Applications, Hardware and Architecture, Modelling and Simulation, Software
Cited by
27 articles.
1. Extreme-scale workflows: A perspective from the JLESC international community;Future Generation Computer Systems;2024-12
2. From complex data to clear insights: visualizing molecular dynamics trajectories;Frontiers in Bioinformatics;2024-04-11
3. CAPIO: a Middleware for Transparent I/O Streaming in Data-Intensive Workflows;2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC);2023-12-18
4. Detecting interference between applications and improving the scheduling using malleable application clones;The International Journal of High Performance Computing Applications;2023-12-13
5. Dask-Extended External Tasks for HPC/ML In transit Workflows;Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis;2023-11-12