Facilitating Electronic Structure Computations on GPU-based Exascale Platforms
Exascale Computing Project via YouTube
Overview
Syllabus
HPC Best Practices Webinar Series
Algorithms and performance portability for electronic structure
Speeding up electronic structure calculations to ena
Running MD on exascale platforms
Main numerical kernels for electronic structure calculations
Developing alternative solvers based on polynomial matrices
Implementation divided into two libraries
Using OpenMP for GPU offloading
General implementation strategy
Computer Science challenges
BML: supported (shared memory) matrix formats
BML: Supporting multiple data types in a C code
BML: Fortran interface is important for targeted application codes
BML: Unit test/Continuous integration
Offloading to GPU
Offloading strategy
GPU offloading with OpenMP
Challenges in interfacing with optimized vendor libraries
Using a synthetic Hamiltonian matrix for Performance Benchmarking
rocSPARSE performance on Crusher @ OLCF
Chebyshev expansions for modest matrix sizes (metals)
Chebyshev expansion of Density Matrix
Exploiting GPU concurrency in calculating Chebyshev terms
Distributing computation
Balancing computational cost and accuracy with matrix thresholding
A non-intrusive implementation
What about wavefunction-based solver? (Planewaves...)
Numerical Discretization of DFT problem
Parallel scaling/performance on Summit
Lesson learned: Efficiently using GPUs requires a lot of work!
Acknowledgments
Taught by
Exascale Computing Project