next up previous
Next: Parallel Implementation Up: NMR Chemical Shift Calculations Previous: Chemical Shift Calculations

Computational Kernels

Fortunately, there are only two performance critical operations that the linear system solver and the iterative eigen solver require:

The matrix-vector product is the more difficult part of the two kernels. We can cut the computational cost and avoid the storage of H by exploiting its special structure. As an operator, it can be written as:

 

Since we keep the vectors in Fourier space, the term is a diagonal matrix, and easy to apply. The second term (``nonlocal potential'') is a sum of rank one matrices, which is also simply applied in Fourier space as a matrix-matrix multiplication. The local potential term in (10) is diagonal in a realspace basis. Therefore, it is applied to a wave function by first inverse Fourier transforming (operation count ), then multiplying with a diagonal operator in realspace (operation count ), and finally Fourier transforming back to the plane wave basis. Since , and m and N are proportional to the number of atoms , the second term applied to a set of wave functions requires operations. The local term only scales like , but has a large prefactor in front, such that it dominates the computational cost for systems with less than 50 atoms.

Optimizing the kernel separately from the algorithm can lead to poor performance. We realized that the Fourier transform involved in has a large startup cost for bringing the real space work arrays and the local potential operator into cache. We gained a factor of 4 improvement by writing a blocked version of the conjugate gradient linear solver that solves all m equations in (7) simultaneously, and therefore applies to the full set of wave functions. This results in better cache utilization and avoids the low memory bandwidth of RISC-based architectures. Such an algorithmic change was not necessary for the conjugate gradient eigen solver, since all operations there happen naturally in a blocked fashion.



next up previous
Next: Parallel Implementation Up: NMR Chemical Shift Calculations Previous: Chemical Shift Calculations



Bernd Pfrommer
Mon May 26 12:08:17 PDT 1997