MPI, MPT, message passing, ccNUMA, Altix, memory access, shared memory, high bandwidth, low latency
The SGI® Message Passing Toolkit (MPT) software implements algorithms that provide high-performance message passing on SGI Altix® systems based on the SGI NUMAlink™ interconnect technology. Using Linux® OS infrastructure and SGI XPMEM cross-host memory-mapping software, SGI MPI delivers very high MPI performance on single-host shared-memory (SMP) Altix systems as well as on multihost superclusters. This paper outlines the Altix hardware features, OS features, and library software algorithms that were developed to provide its low latency and high bandwidth. We present high-performance features such as direct-copy send/receive, collectives, and the ultralow-latency SHMEM™ data transfer library. We also report MPI benchmark results, including an MPI ping-pong latency that ranges from 1.2 to 2.3 microseconds on a 512-CPU Altix system with 1.5 GHz Intel® Itanium® 2 processors.