DBT: multimodal emotion recognition based on dual-branch transformer

Abstract: Labeled datasets for speech emotion recognition are scarce, because emotion is subjective and labeling experts need a great deal of time to identify emotion categories. The wav2vec2.0 model, meanwhile, is a general model for obtaining speech representations through self-supervised...
Author(s):

Yi, Yufan [author]

Tian, Yan

He, Cong

Fan, Yajing

Hu, Xinli

Xu, Yiping

Format:

E-article

Language:

English

Published:

2022

Keywords:

wav2vec2.0

Model fine-tuning

Adaptive interlayer fusion

Weighted label smoothing

Weighted DS strategy

Note:

© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Parent work:

Contained in: The journal of supercomputing - Dordrecht [et al.] : Springer Science + Business Media B.V, 1987, 79(2022), 8, 21 Dec., pages 8611-8633

Parent work:

volume:79 ; year:2022 ; number:8 ; day:21 ; month:12 ; pages:8611-8633

DOI / URN:

10.1007/s11227-022-05001-5

Catalog ID:

SPR049955497
