•   When: Wednesday, March 9, 2022, from 11:00 AM to 12:00 PM
•   Speaker: Keren Zhou
•   Location: Zoom only

Abstract:

GPUs have emerged as a key component for accelerating applications in various domains, including deep learning, data analytics, and scientific simulations. While GPUs provide greater compute power and higher memory bandwidth than CPUs, writing efficient GPU code that achieves the maximum possible performance is challenging because of their sophisticated programming models and architectural features. GPU performance tools are designed to pinpoint performance bottlenecks in GPU-accelerated applications and provide performance insights for users. However, existing performance tools are insufficient to identify hotspots and offer insights for complex applications.

In this talk, I will describe some novel GPU performance tools that I have developed to measure, analyze, and optimize GPU-accelerated applications. Our GPU profiler employs sophisticated concurrent data structures and asynchronous processing for low-overhead measurement of performance metrics at runtime. Our analysis tool attributes metrics to GPU program contexts, including call chains, loops, and inlined code, to help users understand GPU performance. Finally, our advisor tool interprets performance bottlenecks and offers optimization suggestions, helping users understand inefficiencies in hotspots. Guided by the performance reports generated by our tools, we identified and optimized performance hotspots in many HPC and machine learning applications.

Bio:

Keren Zhou is a Ph.D. candidate at Rice University, advised by Prof. John Mellor-Crummey. Keren received an M.S. degree in computer science from the Institute of Computing Technology, Chinese Academy of Sciences, in 2017. His research interests include program analysis, parallel algorithms, and performance tools for HPC and machine learning applications. He has contributed to many open-source software projects, including HPCToolkit, Dyninst, and PyTorch, and is collaborating with NVIDIA on improving their CUPTI and Sanitizer APIs. He has published papers in peer-reviewed conferences and journals, including TPDS, PARCO, ICS, SC, PPoPP, CGO, and ASPLOS. He is a recipient of the 2020 ACM-IEEE CS George Michael Memorial HPC Fellowship.
