Cache optimization and performance modeling of batched, small, and rectangular matrix multiplication on Intel, AMD, and Fujitsu processors

From MaRDI portal
Publication:6601380

DOI: 10.1145/3595178
MaRDI QID: Q6601380

Rio Yokota, George Bosilca, Sameer Deshmukh

Publication date: 10 September 2024

Published in: ACM Transactions on Mathematical Software













