Author of the publication

Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis.

PMBS@SC, volume 8966 of Lecture Notes in Computer Science, pages 129-148. Springer, (2014)


Other publications of authors with the same name

Sparse Matrix Code Dependence Analysis Simplification at Compile Time. CoRR, (2018)

Exploiting Superword-Level Locality in Multimedia Extension Architectures. J. Instruction-Level Parallelism, (2003)

Characterizing the Memory Behavior of Compiler-Parallelized Applications. IEEE Trans. Parallel Distributed Syst., 7 (12): 1224-1237 (1996)

Data-driven Mixed Precision Sparse Matrix Vector Multiplication for GPUs. ACM Trans. Archit. Code Optim., 16 (4): 51:1-51:24 (2020)

ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales. CoRR, (2023)

Autotuning PolyBench Benchmarks with LLVM Clang/Polly Loop Optimization Pragmas Using Bayesian Optimization (extended version). CoRR, (2021)

Domain-Specific Optimization of Signal Recognition Targeting FPGAs. ACM Trans. Reconfigurable Technol. Syst., 4 (2): 17:1-17:26 (2011)

Towards making autotuning mainstream. Int. J. High Perform. Comput. Appl., 27 (4): 379-393 (2013)

Adaptive parallelism in compiler-parallelized code. Concurr. Pract. Exp., 10 (14): 1235-1250 (1998)

Hierarchical parallelization and optimization of high-order stencil computations on multicore clusters. J. Supercomput., 62 (2): 946-966 (2012)