Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework

This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through...
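The abstract outlines a pipeline in which high-level features from several spectrograms are merged by a trained feature combiner and then classified by a back-end decoder such as a mixture of experts. The following is only a minimal NumPy sketch of that idea, not the authors' implementation: the softmax-weighted sum used as the combiner, the linear experts, all dimensions, the random weights, and the names `combine_features` and `moe_decode` are assumptions made for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def combine_features(features, weights):
    """Assumed combiner: softmax-weighted sum of per-spectrogram feature vectors."""
    w = softmax(np.asarray(weights, dtype=float))   # (n_spec,)
    stacked = np.stack(features, axis=0)            # (n_spec, batch, dim)
    return np.tensordot(w, stacked, axes=1)         # (batch, dim)

def moe_decode(x, gate_w, expert_w):
    """Mixture of experts: gate-weighted average of per-expert class posteriors."""
    gate = softmax(x @ gate_w)                                 # (batch, n_experts)
    experts = softmax(np.einsum("bd,edc->bec", x, expert_w))   # (batch, n_experts, n_classes)
    return np.einsum("be,bec->bc", gate, experts)              # (batch, n_classes)

# Hypothetical sizes; random arrays stand in for encoder outputs and trained weights.
rng = np.random.default_rng(0)
batch, dim, n_spec, n_experts, n_classes = 4, 16, 3, 5, 10
features = [rng.normal(size=(batch, dim)) for _ in range(n_spec)]

combined = combine_features(features, weights=[0.0, 0.0, 0.0])  # equal weights
posteriors = moe_decode(
    combined,
    gate_w=rng.normal(size=(dim, n_experts)),
    expert_w=rng.normal(size=(n_experts, dim, n_classes)),
)
predicted_scene = posteriors.argmax(axis=1)  # one class index per recording
```

Because the gate weights sum to one and each expert emits a normalized posterior, every row of `posteriors` is itself a probability distribution over scene classes.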

Author:	Pham, Lam [author]; Phan, Huy; Nguyen, Truc; Palaniappan, Ramaswamy; Mertins, Alfred; McLoughlin, Ian

Format:	E-article
Language:	English

Published:	2021

Keywords:	Acoustic scene classification; Encoder-decoder network; High-level features; Multi-spectrogram; Low-level features

Parent work:	Contained in: Modelling SARS-CoV-2 transmission in a UK university setting - Hill, Edward M. ELSEVIER, 2021, a review journal, Orlando, Fla
Parent work:	volume:110 ; year:2021 ; pages:0

Links:	Full text

DOI / URN:	10.1016/j.dsp.2020.102943

Catalog ID:	ELV052806324

Internal format


LEADER	01000caa a22002652 4500
001	ELV052806324
003	DE-627
005	20230626033810.0
007	cr uuu---uuuuu
008	210910s2021 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1016/j.dsp.2020.102943 \|2 doi
028	5	2	\|a /cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001272.pica
035			\|a (DE-627)ELV052806324
035			\|a (ELSEVIER)S1051-2004(20)30288-8
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
082	0	4	\|a 610 \|q VZ
084			\|a 44.75 \|2 bkl
100	1		\|a Pham, Lam \|e verfasserin \|4 aut
245	1	0	\|a Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework
264		1	\|c 2021transfer abstract
336			\|a nicht spezifiziert \|b zzz \|2 rdacontent
337			\|a nicht spezifiziert \|b z \|2 rdamedia
338			\|a nicht spezifiziert \|b zu \|2 rdacarrier
520			\|a This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks.
520			\|a This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks.
650		7	\|a Acoustic scene classification \|2 Elsevier
650		7	\|a Encoder-decoder network \|2 Elsevier
650		7	\|a High-level features \|2 Elsevier
650		7	\|a Multi-spectrogram \|2 Elsevier
650		7	\|a Low-level features \|2 Elsevier
700	1		\|a Phan, Huy \|4 oth
700	1		\|a Nguyen, Truc \|4 oth
700	1		\|a Palaniappan, Ramaswamy \|4 oth
700	1		\|a Mertins, Alfred \|4 oth
700	1		\|a McLoughlin, Ian \|4 oth
773	0	8	\|i Enthalten in \|n Academic Press \|a Hill, Edward M. ELSEVIER \|t Modelling SARS-CoV-2 transmission in a UK university setting \|d 2021 \|d a review journal \|g Orlando, Fla \|w (DE-627)ELV006540295
773	1	8	\|g volume:110 \|g year:2021 \|g pages:0
856	4	0	\|u https://doi.org/10.1016/j.dsp.2020.102943 \|3 Volltext
912			\|a GBV_USEFLAG_U
912			\|a GBV_ELV
912			\|a SYSFLAG_U
912			\|a SSG-OLC-PHA
936	b	k	\|a 44.75 \|j Infektionskrankheiten \|j parasitäre Krankheiten \|x Medizin \|q VZ
951			\|a AR
952			\|d 110 \|j 2021 \|h 0

Index fields

author_variant	l p lp
matchkey_str	phamlamphanhuynguyentrucpalaniappanramas:2021----:outcutccncasfctouigmlipcrgae
hierarchy_sort_str	2021transfer abstract
bklnumber	44.75
publishDate	2021
allfields	10.1016/j.dsp.2020.102943 doi /cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001272.pica (DE-627)ELV052806324 (ELSEVIER)S1051-2004(20)30288-8 DE-627 ger DE-627 rakwb eng 610 VZ 44.75 bkl Pham, Lam verfasserin aut Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework 2021transfer abstract nicht spezifiziert zzz rdacontent nicht spezifiziert z rdamedia nicht spezifiziert zu rdacarrier This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks. This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. 
We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks. Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features Elsevier Phan, Huy oth Nguyen, Truc oth Palaniappan, Ramaswamy oth Mertins, Alfred oth McLoughlin, Ian oth Enthalten in Academic Press Hill, Edward M. ELSEVIER Modelling SARS-CoV-2 transmission in a UK university setting 2021 a review journal Orlando, Fla (DE-627)ELV006540295 volume:110 year:2021 pages:0 https://doi.org/10.1016/j.dsp.2020.102943 Volltext GBV_USEFLAG_U GBV_ELV SYSFLAG_U SSG-OLC-PHA 44.75 Infektionskrankheiten parasitäre Krankheiten Medizin VZ AR 110 2021 0
language	English
source	Enthalten in Modelling SARS-CoV-2 transmission in a UK university setting Orlando, Fla volume:110 year:2021 pages:0
sourceStr	Enthalten in Modelling SARS-CoV-2 transmission in a UK university setting Orlando, Fla volume:110 year:2021 pages:0
format_phy_str_mv	Article
bklname	Infektionskrankheiten parasitäre Krankheiten
institution	findex.gbv.de
topic_facet	Acoustic scene classification Encoder-decoder network High-level features Multi-spectrogram Low-level features
dewey-raw	610
isfreeaccess_bool	false
container_title	Modelling SARS-CoV-2 transmission in a UK university setting
authorswithroles_txt_mv	Pham, Lam @@aut@@ Phan, Huy @@oth@@ Nguyen, Truc @@oth@@ Palaniappan, Ramaswamy @@oth@@ Mertins, Alfred @@oth@@ McLoughlin, Ian @@oth@@
publishDateDaySort_date	2021-01-01T00:00:00Z
hierarchy_top_id	ELV006540295
dewey-sort	3610
id	ELV052806324
language_de	englisch
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">ELV052806324</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230626033810.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">210910s2021 xx \|\|\|\|\|o 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1016/j.dsp.2020.102943</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">/cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001272.pica</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)ELV052806324</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ELSEVIER)S1051-2004(20)30288-8</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">610</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">44.75</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Pham, Lam</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2021transfer abstract</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield 
code="b">zzz</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">z</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">zu</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. 
The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks.</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. 
The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks.</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Acoustic scene classification</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Encoder-decoder network</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">High-level features</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Multi-spectrogram</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Low-level features</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Phan, Huy</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Nguyen, Truc</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Palaniappan, Ramaswamy</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Mertins, Alfred</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">McLoughlin, Ian</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="n">Academic Press</subfield><subfield code="a">Hill, Edward M. 
ELSEVIER</subfield><subfield code="t">Modelling SARS-CoV-2 transmission in a UK university setting</subfield><subfield code="d">2021</subfield><subfield code="d">a review journal</subfield><subfield code="g">Orlando, Fla</subfield><subfield code="w">(DE-627)ELV006540295</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:110</subfield><subfield code="g">year:2021</subfield><subfield code="g">pages:0</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://doi.org/10.1016/j.dsp.2020.102943</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ELV</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-PHA</subfield></datafield><datafield tag="936" ind1="b" ind2="k"><subfield code="a">44.75</subfield><subfield code="j">Infektionskrankheiten</subfield><subfield code="j">parasitäre Krankheiten</subfield><subfield code="x">Medizin</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">110</subfield><subfield code="j">2021</subfield><subfield code="h">0</subfield></datafield></record></collection>
author	Pham, Lam
spellingShingle	Pham, Lam ddc 610 bkl 44.75 Elsevier Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework
authorStr	Pham, Lam
ppnlink_with_tag_str_mv	@@773@@(DE-627)ELV006540295
format	electronic Article
dewey-ones	610 - Medicine & health
delete_txt_mv	keep
author_role	aut
collection	elsevier
remote_str	true
illustrated	Not Illustrated
topic_title	610 VZ 44.75 bkl Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features Elsevier
topic	ddc 610 bkl 44.75 Elsevier Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features
topic_unstemmed	ddc 610 bkl 44.75 Elsevier Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features
topic_browse	ddc 610 bkl 44.75 Elsevier Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features
format_facet	Elektronische Aufsätze Aufsätze Elektronische Ressource
format_main_str_mv	Text Zeitschrift/Artikel
carriertype_str_mv	zu
author2_variant	h p hp t n tn r p rp a m am i m im
hierarchy_parent_title	Modelling SARS-CoV-2 transmission in a UK university setting
hierarchy_parent_id	ELV006540295
dewey-tens	610 - Medicine & health
hierarchy_top_title	Modelling SARS-CoV-2 transmission in a UK university setting
isfreeaccess_txt	false
familylinks_str_mv	(DE-627)ELV006540295
title	Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework
ctrlnum	(DE-627)ELV052806324 (ELSEVIER)S1051-2004(20)30288-8
title_full	Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework
author_sort	Pham, Lam
journal	Modelling SARS-CoV-2 transmission in a UK university setting
journalStr	Modelling SARS-CoV-2 transmission in a UK university setting
lang_code	eng
isOA_bool	false
dewey-hundreds	600 - Technology
recordtype	marc
publishDateSort	2021
contenttype_str_mv	zzz
container_start_page	0
author_browse	Pham, Lam
container_volume	110
class	610 VZ 44.75 bkl
format_se	Elektronische Aufsätze
author-letter	Pham, Lam
doi_str_mv	10.1016/j.dsp.2020.102943
dewey-full	610
title_sort	robust acoustic scene classification using a multi-spectrogram encoder-decoder framework
title_auth	Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework
abstract	This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks.
collection_details	GBV_USEFLAG_U GBV_ELV SYSFLAG_U SSG-OLC-PHA
title_short	Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework
url	https://doi.org/10.1016/j.dsp.2020.102943
remote_bool	true
author2	Phan, Huy Nguyen, Truc Palaniappan, Ramaswamy Mertins, Alfred McLoughlin, Ian
author2Str	Phan, Huy Nguyen, Truc Palaniappan, Ramaswamy Mertins, Alfred McLoughlin, Ian
ppnlink	ELV006540295
mediatype_str_mv	z
isOA_txt	false
hochschulschrift_bool	false
author2_role	oth oth oth oth oth
doi_str	10.1016/j.dsp.2020.102943
up_date	2024-07-06T17:12:28.432Z
_version_	1803850558666178560