Title: A three-dimensional approach to parallel matrix multiplication
Abstract: A three-dimensional (3D) matrix multiplication algorithm for massively parallel processing systems is presented. The P processors are configured as a “virtual” processing cube with dimensions p <inf xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</inf> , p <inf xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</inf> , and p <inf xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</inf> proportional to the matrices' dimensions—M, N, and K. Each processor performs a single local matrix multiplication of size M/p <inf xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</inf> × N/p <inf xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</inf> × K/p <inf xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</inf> . Before the local computation can be carried out, each subcube must receive a single submatrix of A and B. After the single matrix multiplication has completed, K/p <inf xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</inf> submatrices of this product must be sent to their respective destination processors and then summed together with the resulting matrix C. The 3D parallel matrix multiplication approach has a factor of P <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1/6</sup> less communication than the 2D parallel algorithms. This algorithm has been implemented on IBM POWERparallel™ SP2™ systems (up to 216 nodes) and has yielded close to the peak performance of the machine. The algorithm has been combined with Winograd's variant of Strassen's algorithm to achieve performance which exceeds the theoretical peak of the system. (We assume the MFLOPS rate of matrix multiplication to be 2MNK.)
Publication Year: 1995
Publication Date: 1995-09-01
Language: en
Type: article
Indexed In: ['crossref']
Access and Citation
Cited By Count: 157
AI Researcher Chatbot
Get quick answers to your questions about the article from our AI researcher chatbot