•   When: Friday, January 31, 2020 from 02:00 PM to 03:00 PM
  •   Speakers: Bing Xie
  •   Location: Engineering Building 4201

Supercomputer I/O systems are built around scientific codes. These codes issue periodic write bursts to the file systems for various purposes and with various I/O patterns. From the application’s viewpoint, if the I/O system does not absorb data fast enough, the memory used to buffer output is exhausted, forcing the computation to stall before it can output more data. Output stalls leave precious CPU resources underutilized, extending application runtimes and compromising system throughput. In this talk, I will discuss my study of the write performance of production supercomputers, ranging from quantitative analysis of I/O behavior to predictive performance modeling with machine learning techniques. In particular, I will talk about the challenges of benchmarking, profiling, and modeling the write performance of supercomputer I/O systems under production load, and discuss the techniques and methods I proposed to analyze the target systems based on their design, deployment, and configuration. I will also present my work on data management across heterogeneous filesystems and on resource management for workflows on elastic virtual infrastructure, emphasizing the challenges, the opportunities, and my approach.
 
Biography:
Bing Xie is an HPC Systems Engineer at Oak Ridge National Laboratory. Bing received her Ph.D. in 2017 from the Computer Science Department at Duke University, where she was advised by Jeff Chase. Her research develops performance analysis and prediction methods for supercomputer I/O systems. More broadly, her research interests span distributed systems, storage systems, high-performance computing, and cloud computing. Her papers have been published at venues including HPDC, SC, and ACM TOS. Among her work, the petascale filesystem study was nominated for both Best Paper and Best Student Paper at SC’12.
