The I/O Trace Initiative: Building a Collaborative I/O Archive to Advance HPC

Abstract

HPC application developers and administrators need to understand the complex interplay between compute clusters and storage systems to make effective optimization decisions. Ad hoc investigations of this interplay based on isolated case studies can lead to conclusions that are incorrect or difficult to generalize. The I/O Trace Initiative aims to improve the scientific community’s understanding of I/O operations by building a searchable collaborative archive of I/O traces from a wide range of applications and machines, with a focus on high-performance computing and scalable AI/ML. This initiative advances the accessibility of I/O trace data by enabling users to locate and compare traces based on user-specified criteria. It also provides a visual analytics platform for in-depth analysis, paving the way for the development of advanced performance optimization techniques. By acting as a hub for trace data, the initiative fosters collaborative research by encouraging data sharing and collective learning.

Publication
Proceedings of the SC'23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (PDSW)
Reza Salkhordeh
Postdoctoral researcher

My research interests include operating systems, solid-state drives, and data storage systems.