PKDGRAV Performance Testing *

Brandon Allgood
Department of Physics
University of California, Santa Cruz, CA 95064
allgood@maxwell.ucsc.edu

Joachim Stadel
Department of Astronomy
University of Washington, Seattle, WA 98195
stadel@astro.washington.edu

Thomas Quinn
Department of Astronomy
University of Washington, Seattle, WA 98195
trq@astro.washington.edu


Here we present the performance of PKDGRAV, a parallel N-body code, using MPI on clusters of Intel-based PCs. We explore results taken from two separate Beowulf-class clusters, at the University of Washington and at the Albuquerque High Performance Computing Center, using different hardware configurations and MPI implementations (MPICH and LAM/MPI). This is an ongoing project; the next phase is the implementation of VIA and GAMMA message-passing layers on the local Astrolab cluster (UW).

First we tested the performance of PKDGRAV on Astrolab (the Beowulf cluster at the University of Washington) using MPICH on 2, 4, 8, 12, and 16 nodes. We used a highly evolved particle distribution containing 1.3 million particles in a 50 h^-1 Mpc cube. We then repeated this procedure using the LAM implementation.

Next, we performed three types of runs on the Roadrunner cluster at the Albuquerque High Performance Computing Center. Roadrunner is a 64-node (two processors per node) Beowulf cluster that uses MPICH over either Ethernet or Myrinet message-passing layers. More about both the Astrolab and Roadrunner clusters can be found in the Cluster Specs section. Two sets of runs used Ethernet: the first used one processor per node, the second used both processors on each node. The last run tested the performance of Myrinet, using only one processor per node. All of these runs used the highly evolved distribution. Finally, we ran a set of tests over Myrinet using a smooth distribution of particles, in order to compare its scaling to that of the highly evolved distribution.

All of this has been done to find bottlenecks in the code's performance, if they exist. This information will be used to improve both PKDGRAV and the Astrolab cluster.
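As a rough illustration of how such scaling measurements are typically taken, the sketch below times a fixed number of simulation steps under MPI and reports the seconds per step of the slowest rank; the function advance_one_step() and the step count are hypothetical placeholders, not part of PKDGRAV's actual interface.

/* Minimal MPI timing harness sketch (C).  advance_one_step() stands in
 * for one full force calculation plus particle update; it is a
 * hypothetical placeholder, not a PKDGRAV routine. */
#include <mpi.h>
#include <stdio.h>

static void advance_one_step(void) { /* ... gravity step would go here ... */ }

int main(int argc, char *argv[])
{
    int rank, nprocs;
    const int nsteps = 10;              /* number of timed steps (assumed) */
    double t0, elapsed, t_max;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* Synchronize so every rank measures the same interval. */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();

    for (int step = 0; step < nsteps; step++)
        advance_one_step();

    MPI_Barrier(MPI_COMM_WORLD);
    elapsed = MPI_Wtime() - t0;

    /* The slowest rank sets the wall-clock time for the whole run. */
    MPI_Reduce(&elapsed, &t_max, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("%d processes: %.2f s/step\n", nprocs, t_max / nsteps);

    MPI_Finalize();
    return 0;
}

Repeating such a measurement on 2, 4, 8, 12, and 16 nodes and taking the ratio of step times gives the speedup and parallel-efficiency curves examined in the Performance section.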


The rest of the web page is organized as follows. The PKDGRAV section gives a brief survey of the major aspects of the hierarchical tree algorithm employed in PKDGRAV. In the Cluster Specs section we present a table of specifications for each cluster. In the Performance section we present the test results, examine the scaling of each run, and compare their performance. In the last two sections we present a bibliography and outside links to points of interest.



* This work was partially supported by National Computational Science Alliance under grant number 1999033 and utilized the UNM-Alliance Roadrunner Supercluster at the Albuquerque High Performance Computing Center.