A modified TreePM code
We study the performance of a modification to the tree code, suggested by Barnes, when it is applied within the TreePM code. The optimization identifies groups of particles and uses a single tree walk to compute the force on all the particles in a group. This modification has been part of our implementation of the TreePM code for some time, and others have used it in codes built on tree structures. We present the first detailed study of its performance characteristics. We show that the modification, if tuned properly, can speed up the TreePM code by a significant amount. We also combine the modification with individual time steps and indicate how the two schemes can be combined in an optimal fashion. We find that the combination is at least a factor of two faster than the modified TreePM code without individual time steps. The overall gain is often larger, because the grouping scheme optimizes the use of cache in large simulations.
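The grouping idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions made only for illustration: positions are two-dimensional, the leaves of a simple quadtree serve as the groups, the force is plain softened Newtonian gravity rather than the short-range TreePM force, and all names (`Node`, `shared_list`, `group_walk_forces`, `n_group`, `theta`, `eps`) are hypothetical. The tree is walked once per group, with an opening criterion made conservative by the group radius, and the resulting interaction list is reused for every particle in that group.

```python
import numpy as np

class Node:
    """Quadtree cell storing total mass, centre of mass and member indices."""
    def __init__(self, center, size, idx, pos, mass, n_group):
        self.size, self.idx = size, idx
        self.mass = mass[idx].sum()
        self.com = (pos[idx] * mass[idx, None]).sum(axis=0) / self.mass
        self.children = []
        if len(idx) > n_group:  # split until a cell holds at most n_group particles
            right = pos[idx, 0] >= center[0]
            upper = pos[idx, 1] >= center[1]
            for r in (False, True):
                for u in (False, True):
                    sub = idx[(right == r) & (upper == u)]
                    if len(sub):
                        off = np.array([0.25 if r else -0.25,
                                        0.25 if u else -0.25]) * size
                        self.children.append(
                            Node(center + off, size / 2, sub, pos, mass, n_group))

def leaves(node):
    """Yield the leaf cells; in this sketch each leaf is one group."""
    if not node.children:
        yield node
    else:
        for child in node.children:
            yield from leaves(child)

def shared_list(node, gcen, grad, theta, src_pos, src_mass, pos, mass):
    """One tree walk for a whole group: collect sources valid for all members."""
    d = np.linalg.norm(node.com - gcen)
    # Every group member is at least (d - grad) away from the cell's centre of mass.
    if d > grad and node.size < theta * (d - grad):
        src_pos.append(node.com)
        src_mass.append(node.mass)
    elif node.children:
        for child in node.children:
            shared_list(child, gcen, grad, theta, src_pos, src_mass, pos, mass)
    else:  # leaf that cannot be accepted: add its particles individually
        src_pos.extend(pos[node.idx])
        src_mass.extend(mass[node.idx])

def group_walk_forces(pos, mass, theta=0.5, n_group=8, eps=1e-3):
    """Softened accelerations (G = 1) using one tree walk per group."""
    lo, hi = pos.min(axis=0), pos.max(axis=0)
    root = Node(0.5 * (lo + hi), (hi - lo).max(), np.arange(len(pos)),
                pos, mass, n_group)
    acc = np.zeros_like(pos)
    for g in leaves(root):
        gcen = g.com
        grad = np.linalg.norm(pos[g.idx] - gcen, axis=1).max()
        sp, sm = [], []
        shared_list(root, gcen, grad, theta, sp, sm, pos, mass)
        sp, sm = np.array(sp), np.array(sm)
        # Reuse the same source list for every particle in the group.
        d = sp[None, :, :] - pos[g.idx][:, None, :]
        r2 = (d ** 2).sum(axis=2) + eps ** 2
        acc[g.idx] = (sm[None, :, None] * d / r2[:, :, None] ** 1.5).sum(axis=1)
    return acc

def direct_forces(pos, mass, eps=1e-3):
    """Reference O(N^2) direct summation with the same softening."""
    d = pos[None, :, :] - pos[:, None, :]
    r2 = (d ** 2).sum(axis=2) + eps ** 2
    return (mass[None, :, None] * d / r2[:, :, None] ** 1.5).sum(axis=1)
```

In this sketch `n_group` plays the role of the tuning parameter: larger groups mean fewer tree walks but longer interaction lists, since more cells must be opened to satisfy the criterion for every member. Calling `group_walk_forces(pos, mass, theta=0.0)` opens every cell and should reproduce `direct_forces(pos, mass)` up to rounding.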
Print ISSN: 1674-4527
Online ISSN: 2397-6209