Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework
This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through...
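The data flow summarised in the abstract (several spectrograms, one encoder per spectrogram, a feature combiner, then a back-end decoder) can be illustrated with a minimal pure-Python sketch. Everything in it is an assumption made for illustration: the spectrogram names, the dimensions, the random weights, the single dense ReLU layer standing in for the CNN-DNN encoder, the weighted sum standing in for the trained combiner, and the linear softmax layer standing in for the decoders named in the record. It is not the authors' implementation.

```python
import math
import random

random.seed(0)  # fixed seed so the toy example is repeatable

def dense(weights, vector):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

def encode(spectrogram, weights):
    """Stand-in encoder: average each frequency bin over time, then dense + ReLU."""
    pooled = [sum(row) / len(row) for row in spectrogram]
    return [max(v, 0.0) for v in dense(weights, pooled)]

def combine(features, mix):
    """Stand-in feature combiner: normalised weighted sum of the feature vectors."""
    total = sum(mix)
    size = len(features[0])
    return [sum(m / total * f[i] for m, f in zip(mix, features)) for i in range(size)]

def decode(feature, class_weights):
    """Stand-in decoder: linear layer followed by a softmax over scene classes."""
    logits = dense(class_weights, feature)
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    norm = sum(exps)
    return [e / norm for e in exps]

def random_matrix(rows, cols):
    """Hypothetical untrained weights drawn from a standard normal distribution."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes and spectrogram names, chosen only to keep the example small.
n_bins, n_frames, n_features, n_classes = 8, 20, 4, 3
names = ["spec_a", "spec_b", "spec_c"]

# Each spectrogram is a list of frequency-bin rows, each holding one value per frame.
spectrograms = {
    name: [[random.random() for _ in range(n_frames)] for _ in range(n_bins)]
    for name in names
}
encoders = {name: random_matrix(n_features, n_bins) for name in names}

features = [encode(spectrograms[name], encoders[name]) for name in names]
combined = combine(features, mix=[1.0, 1.0, 1.0])
probs = decode(combined, random_matrix(n_classes, n_features))
predicted_scene = max(range(n_classes), key=probs.__getitem__)
```

In this sketch `probs` holds one probability per hypothetical scene class and `predicted_scene` is the index of the largest one. Swapping `decode` for another classifier leaves the encoder and combiner stages untouched, which is the separation between front-end and back-end that the abstract describes.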
Detailed description
Author: |
Pham, Lam [author] |
---|
Format: |
E-article |
---|---|
Language: |
English |
Published: |
2021 |
---|
Keywords: |
---|
Parent work: |
Contained in: Modelling SARS-CoV-2 transmission in a UK university setting - Hill, Edward M. ELSEVIER, 2021, a review journal, Orlando, Fla |
---|---|
Parent work: |
volume:110 ; year:2021 ; pages:0 |
Links: |
---|
DOI / URN: |
10.1016/j.dsp.2020.102943 |
---|
Catalog ID: |
ELV052806324 |
---|
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | ELV052806324 | ||
003 | DE-627 | ||
005 | 20230626033810.0 | ||
007 | cr uuu---uuuuu | ||
008 | 210910s2021 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1016/j.dsp.2020.102943 |2 doi | |
028 | 5 | 2 | |a /cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001272.pica |
035 | |a (DE-627)ELV052806324 | ||
035 | |a (ELSEVIER)S1051-2004(20)30288-8 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 610 |q VZ |
084 | |a 44.75 |2 bkl | ||
100 | 1 | |a Pham, Lam |e verfasserin |4 aut | |
245 | 1 | 0 | |a Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework |
264 | 1 | |c 2021transfer abstract | |
336 | |a nicht spezifiziert |b zzz |2 rdacontent | ||
337 | |a nicht spezifiziert |b z |2 rdamedia | ||
338 | |a nicht spezifiziert |b zu |2 rdacarrier | ||
520 | |a This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks. | ||
650 | 7 | |a Acoustic scene classification |2 Elsevier | |
650 | 7 | |a Encoder-decoder network |2 Elsevier | |
650 | 7 | |a High-level features |2 Elsevier | |
650 | 7 | |a Multi-spectrogram |2 Elsevier | |
650 | 7 | |a Low-level features |2 Elsevier | |
700 | 1 | |a Phan, Huy |4 oth | |
700 | 1 | |a Nguyen, Truc |4 oth | |
700 | 1 | |a Palaniappan, Ramaswamy |4 oth | |
700 | 1 | |a Mertins, Alfred |4 oth | |
700 | 1 | |a McLoughlin, Ian |4 oth | |
773 | 0 | 8 | |i Enthalten in |n Academic Press |a Hill, Edward M. ELSEVIER |t Modelling SARS-CoV-2 transmission in a UK university setting |d 2021 |d a review journal |g Orlando, Fla |w (DE-627)ELV006540295 |
773 | 1 | 8 | |g volume:110 |g year:2021 |g pages:0 |
856 | 4 | 0 | |u https://doi.org/10.1016/j.dsp.2020.102943 |3 Volltext |
912 | |a GBV_USEFLAG_U | ||
912 | |a GBV_ELV | ||
912 | |a SYSFLAG_U | ||
912 | |a SSG-OLC-PHA | ||
936 | b | k | |a 44.75 |j Infektionskrankheiten |j parasitäre Krankheiten |x Medizin |q VZ |
951 | |a AR | ||
952 | |d 110 |j 2021 |h 0 |
author_variant |
l p lp |
---|---|
matchkey_str |
phamlamphanhuynguyentrucpalaniappanramas:2021----:outcutccncasfctouigmlipcrgae |
hierarchy_sort_str |
2021transfer abstract |
bklnumber |
44.75 |
publishDate |
2021 |
language |
English |
source |
Enthalten in Modelling SARS-CoV-2 transmission in a UK university setting Orlando, Fla volume:110 year:2021 pages:0 |
sourceStr |
Enthalten in Modelling SARS-CoV-2 transmission in a UK university setting Orlando, Fla volume:110 year:2021 pages:0 |
format_phy_str_mv |
Article |
bklname |
Infektionskrankheiten parasitäre Krankheiten |
institution |
findex.gbv.de |
topic_facet |
Acoustic scene classification Encoder-decoder network High-level features Multi-spectrogram Low-level features |
dewey-raw |
610 |
isfreeaccess_bool |
false |
container_title |
Modelling SARS-CoV-2 transmission in a UK university setting |
authorswithroles_txt_mv |
Pham, Lam @@aut@@ Phan, Huy @@oth@@ Nguyen, Truc @@oth@@ Palaniappan, Ramaswamy @@oth@@ Mertins, Alfred @@oth@@ McLoughlin, Ian @@oth@@ |
publishDateDaySort_date |
2021-01-01T00:00:00Z |
hierarchy_top_id |
ELV006540295 |
dewey-sort |
3610 |
id |
ELV052806324 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">ELV052806324</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230626033810.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">210910s2021 xx |||||o 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1016/j.dsp.2020.102943</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">/cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001272.pica</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)ELV052806324</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ELSEVIER)S1051-2004(20)30288-8</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">610</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">44.75</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Pham, Lam</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2021transfer abstract</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">zzz</subfield><subfield 
code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">z</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">zu</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. 
The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks.</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. 
The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks.</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Acoustic scene classification</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Encoder-decoder network</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">High-level features</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Multi-spectrogram</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Low-level features</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Phan, Huy</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Nguyen, Truc</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Palaniappan, Ramaswamy</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Mertins, Alfred</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">McLoughlin, Ian</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="n">Academic Press</subfield><subfield code="a">Hill, Edward M. 
ELSEVIER</subfield><subfield code="t">Modelling SARS-CoV-2 transmission in a UK university setting</subfield><subfield code="d">2021</subfield><subfield code="d">a review journal</subfield><subfield code="g">Orlando, Fla</subfield><subfield code="w">(DE-627)ELV006540295</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:110</subfield><subfield code="g">year:2021</subfield><subfield code="g">pages:0</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://doi.org/10.1016/j.dsp.2020.102943</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ELV</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-PHA</subfield></datafield><datafield tag="936" ind1="b" ind2="k"><subfield code="a">44.75</subfield><subfield code="j">Infektionskrankheiten</subfield><subfield code="j">parasitäre Krankheiten</subfield><subfield code="x">Medizin</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">110</subfield><subfield code="j">2021</subfield><subfield code="h">0</subfield></datafield></record></collection>
|
author |
Pham, Lam |
spellingShingle |
Pham, Lam ddc 610 bkl 44.75 Elsevier Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework |
authorStr |
Pham, Lam |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)ELV006540295 |
format |
electronic Article |
dewey-ones |
610 - Medicine & health |
delete_txt_mv |
keep |
author_role |
aut |
collection |
elsevier |
remote_str |
true |
illustrated |
Not Illustrated |
topic_title |
610 VZ 44.75 bkl Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features Elsevier |
topic |
ddc 610 bkl 44.75 Elsevier Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features |
topic_unstemmed |
ddc 610 bkl 44.75 Elsevier Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features |
topic_browse |
ddc 610 bkl 44.75 Elsevier Acoustic scene classification Elsevier Encoder-decoder network Elsevier High-level features Elsevier Multi-spectrogram Elsevier Low-level features |
format_facet |
Electronic articles Articles Electronic resource |
format_main_str_mv |
Text Journal/Article |
carriertype_str_mv |
zu |
author2_variant |
h p hp t n tn r p rp a m am i m im |
hierarchy_parent_title |
Modelling SARS-CoV-2 transmission in a UK university setting |
hierarchy_parent_id |
ELV006540295 |
dewey-tens |
610 - Medicine & health |
hierarchy_top_title |
Modelling SARS-CoV-2 transmission in a UK university setting |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)ELV006540295 |
title |
Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework |
ctrlnum |
(DE-627)ELV052806324 (ELSEVIER)S1051-2004(20)30288-8 |
title_full |
Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework |
author_sort |
Pham, Lam |
journal |
Modelling SARS-CoV-2 transmission in a UK university setting |
journalStr |
Modelling SARS-CoV-2 transmission in a UK university setting |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
600 - Technology |
recordtype |
marc |
publishDateSort |
2021 |
contenttype_str_mv |
zzz |
container_start_page |
0 |
author_browse |
Pham, Lam |
container_volume |
110 |
class |
610 VZ 44.75 bkl |
format_se |
Electronic articles |
author-letter |
Pham, Lam |
doi_str_mv |
10.1016/j.dsp.2020.102943 |
dewey-full |
610 |
title_sort |
robust acoustic scene classification using a multi-spectrogram encoder-decoder framework |
title_auth |
Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework |
abstract |
This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A & 1B and 2019 Tasks 1A & 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks. |
collection_details |
GBV_USEFLAG_U GBV_ELV SYSFLAG_U SSG-OLC-PHA |
title_short |
Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework |
url |
https://doi.org/10.1016/j.dsp.2020.102943 |
remote_bool |
true |
author2 |
Phan, Huy Nguyen, Truc Palaniappan, Ramaswamy Mertins, Alfred McLoughlin, Ian |
author2Str |
Phan, Huy Nguyen, Truc Palaniappan, Ramaswamy Mertins, Alfred McLoughlin, Ian |
ppnlink |
ELV006540295 |
mediatype_str_mv |
z |
isOA_txt |
false |
hochschulschrift_bool |
false |
author2_role |
oth oth oth oth oth |
doi_str |
10.1016/j.dsp.2020.102943 |
up_date |
2024-07-06T17:12:28.432Z |
_version_ |
1803850558666178560 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">ELV052806324</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230626033810.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">210910s2021 xx |||||o 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1016/j.dsp.2020.102943</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">/cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001272.pica</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)ELV052806324</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ELSEVIER)S1051-2004(20)30288-8</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">610</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">44.75</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Pham, Lam</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Robust acoustic scene classification using a multi-spectrogram encoder-decoder framework</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2021transfer abstract</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">zzz</subfield><subfield 
code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">z</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">zu</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">This article proposes an encoder-decoder network model for Acoustic Scene Classification (ASC), the task of identifying the scene of an audio recording from its acoustic signature. We make use of multiple low-level spectrogram features at the front-end, transformed into higher level features through a well-trained CNN-DNN front-end encoder. The high-level features and their combination (via a trained feature combiner) are then fed into different decoder models comprising random forest regression, DNNs and a mixture of experts, for back-end classification. We conduct extensive experiments to evaluate the performance of this framework on various ASC datasets, including LITIS Rouen and IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 Task 1, 2017 Task 1, 2018 Tasks 1A &amp; 1B and 2019 Tasks 1A &amp; 1B. The experimental results highlight two main contributions; the first is an effective method for high-level feature extraction from multi-spectrogram input via the novel CNN-DNN architecture encoder network, and the second is the proposed decoder which enables the framework to achieve competitive results on various datasets. 
The fact that a single framework is highly competitive for several different challenges is an indicator of its robustness for performing general ASC tasks.</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Acoustic scene classification</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Encoder-decoder network</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">High-level features</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Multi-spectrogram</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Low-level features</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Phan, Huy</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Nguyen, Truc</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Palaniappan, Ramaswamy</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Mertins, Alfred</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">McLoughlin, Ian</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="n">Academic Press</subfield><subfield code="a">Hill, Edward M. 
ELSEVIER</subfield><subfield code="t">Modelling SARS-CoV-2 transmission in a UK university setting</subfield><subfield code="d">2021</subfield><subfield code="d">a review journal</subfield><subfield code="g">Orlando, Fla</subfield><subfield code="w">(DE-627)ELV006540295</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:110</subfield><subfield code="g">year:2021</subfield><subfield code="g">pages:0</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://doi.org/10.1016/j.dsp.2020.102943</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ELV</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-PHA</subfield></datafield><datafield tag="936" ind1="b" ind2="k"><subfield code="a">44.75</subfield><subfield code="j">Infektionskrankheiten</subfield><subfield code="j">parasitäre Krankheiten</subfield><subfield code="x">Medizin</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">110</subfield><subfield code="j">2021</subfield><subfield code="h">0</subfield></datafield></record></collection>
|
score |
7.4014626 |