Anatomy of High-performance Matrix Multiplication

Kazushige Goto and Robert A. van de Geijn. ACM Trans. Math. Softw., 34 (3): 12:1--12:25 (May 2008)
DOI: 10.1145/1356052.1356053

Abstract

We present the basic principles that underlie the high-performance implementation of the matrix-matrix multiplication that is part of the widely used GotoBLAS library. Design decisions are justified by successively refining a model of architectures with multilevel memories. A simple but effective algorithm for executing this operation results. Implementations on a broad selection of architectures are shown to achieve near-peak performance.
