The position-based compression techniques for DNN model

Abstract: In deep neural network (DNN) accelerators, it is expensive to transfer model parameters from main memory to the processing elements. Data movement accounts for a large share of the inference latency and energy consumption. In this paper, we present three position-based techniques to co...

Author(s):

Tang, Minghua [author]

Russo, Enrico

Palesi, Maurizio

Format:

Article

Language:

English

Published:

2023

Keywords:

Deep neural networks

Deep neural network accelerator

Weights compression

Note:

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023; corrected publication 2023. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Parent work:

Contained in: The journal of supercomputing - Springer US, 1987, 79(2023), 15, 08 May, pages 17445-17474

volume:79 ; year:2023 ; number:15 ; day:08 ; month:05 ; pages:17445-17474

Links:

Full text

DOI / URN:

10.1007/s11227-023-05339-4

Catalog ID:

OLC2145310800
