Improving blocked matrix-matrix multiplication routine by utilizing AVX-512 instructions on Intel Knights Landing and Xeon Scalable processors

Abstract: In high-performance computing, the general matrix-matrix multiplication (xGEMM) routine is the core of the Level 3 BLAS kernel for effective matrix-matrix multiplication operations. The performance of parallel xGEMM (PxGEMM) is significantly affected by two main factors: the flop rate that...

Authors:

Park, Yoosang [author]

Kim, Raehyun

Nguyen, Thi My Tuyen

Choi, Jaeyoung

Format:

E-article

Language:

English

Published:

2021

Keywords:

Parallel matrix-matrix multiplication

Parallel BLAS

ScaLAPACK

Intel Xeon Phi

Intel Skylake-SP

AVX-512

Note:

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021

Parent work:

Contained in: Cluster computing - Dordrecht [et al.] : Springer Science + Business Media B.V, 1998, 26(2021), 5, dated 12 Apr., pages 2539-2549

Parent work:

volume:26 ; year:2021 ; number:5 ; day:12 ; month:04 ; pages:2539-2549

Links:

Full text

DOI / URN:

10.1007/s10586-021-03274-8

Catalog ID:

SPR052882918
