Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism
Abstract: Although existing Fine-Grained Visual Classification (FGVC) research has made progress, several deficiencies remain to be addressed. Specifically, 1. Most methods use feature maps directly after extracting them from the original images, without further...
Detailed description

Author: Chen, Haiyuan [author]
Format: Article
Language: English
Published: 2022
Subject headings: Attention mechanism; Feature filtering; Fine-grained visual classification; Self-supervised learning
Note: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022
Parent work: Contained in: Applied intelligence - Springer US, 1991, 52(2022), 13, 17 March, pages 15673-15689
Parent work: volume:52 ; year:2022 ; number:13 ; day:17 ; month:03 ; pages:15673-15689
Links:
DOI / URN: 10.1007/s10489-022-03232-w
Catalog ID: OLC2079660861
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2079660861 | ||
003 | DE-627 | ||
005 | 20230506072430.0 | ||
007 | tu | ||
008 | 221221s2022 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s10489-022-03232-w |2 doi | |
035 | |a (DE-627)OLC2079660861 | ||
035 | |a (DE-He213)s10489-022-03232-w-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 004 |q VZ |
100 | 1 | |a Chen, Haiyuan |e verfasserin |4 aut | |
245 | 1 | 0 | |a Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 | ||
520 | |a Abstract: Although existing Fine-Grained Visual Classification (FGVC) research has made progress, several deficiencies remain to be addressed. Specifically, 1. Most methods use feature maps directly after extracting them from the original images, without further processing, which may allow irrelevant features to degrade network performance; 2. In many methods, the use of feature maps is relatively simple, and the relationships between feature maps that are helpful for accurate classification are ignored; 3. Due to the high similarity between subcategories, as well as the randomness and instability of training, the network's predictions may sometimes not be accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first is the Self-supervised Attention Map Filter, which extracts the initial attention maps of subcategories and filters out the most distinguishable and representative local attention maps. The second is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with them. The third is the Reiterative Prediction module, which re-uses the network's first prediction to improve accuracy and stability. Experimental results show that SA-MFN outperforms state-of-the-art methods on multiple fine-grained classification datasets; in particular, on the Stanford Cars dataset the proposed network achieves an accuracy of 94.7%. | ||
650 | 4 | |a Attention mechanism | |
650 | 4 | |a Feature filtering | |
650 | 4 | |a Fine-grained visual classification | |
650 | 4 | |a Self-supervised learning | |
700 | 1 | |a Cheng, Lianglun |4 aut | |
700 | 1 | |a Huang, Guoheng |0 (orcid)0000-0002-3640-3229 |4 aut | |
700 | 1 | |a Zhang, Ganghan |4 aut | |
700 | 1 | |a Lan, Jiaying |4 aut | |
700 | 1 | |a Yu, Zhiwen |4 aut | |
700 | 1 | |a Pun, Chi-Man |4 aut | |
700 | 1 | |a Ling, Wing-Kuen |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Applied intelligence |d Springer US, 1991 |g 52(2022), 13 vom: 17. März, Seite 15673-15689 |w (DE-627)130990515 |w (DE-600)1080229-0 |w (DE-576)029154286 |x 0924-669X |7 nnns |
773 | 1 | 8 | |g volume:52 |g year:2022 |g number:13 |g day:17 |g month:03 |g pages:15673-15689 |
856 | 4 | 1 | |u https://doi.org/10.1007/s10489-022-03232-w |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-MAT | ||
951 | |a AR | ||
952 | |d 52 |j 2022 |e 13 |b 17 |c 03 |h 15673-15689 |
author_variant |
h c hc l c lc g h gh g z gz j l jl z y zy c m p cmp w k l wkl |
---|---|
matchkey_str |
article:0924669X:2022----::ierievsacasfctowtmliclfauebsdnefuevsd |
hierarchy_sort_str |
2022 |
publishDate |
2022 |
allfields |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
spelling |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
allfields_unstemmed |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
allfieldsGer |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
allfieldsSound |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
language |
English |
source |
Enthalten in Applied intelligence 52(2022), 13 vom: 17. März, Seite 15673-15689 volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 |
sourceStr |
Enthalten in Applied intelligence 52(2022), 13 vom: 17. März, Seite 15673-15689 volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning |
dewey-raw |
004 |
isfreeaccess_bool |
false |
container_title |
Applied intelligence |
authorswithroles_txt_mv |
Chen, Haiyuan @@aut@@ Cheng, Lianglun @@aut@@ Huang, Guoheng @@aut@@ Zhang, Ganghan @@aut@@ Lan, Jiaying @@aut@@ Yu, Zhiwen @@aut@@ Pun, Chi-Man @@aut@@ Ling, Wing-Kuen @@aut@@ |
publishDateDaySort_date |
2022-03-17T00:00:00Z |
hierarchy_top_id |
130990515 |
dewey-sort |
14 |
id |
OLC2079660861 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2079660861</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230506072430.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">221221s2022 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10489-022-03232-w</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2079660861</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10489-022-03232-w-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Chen, Haiyuan</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2022</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield 
code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Attention mechanism</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Feature filtering</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Fine-grained visual classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Self-supervised learning</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cheng, Lianglun</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Huang, Guoheng</subfield><subfield code="0">(orcid)0000-0002-3640-3229</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhang, Ganghan</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Lan, Jiaying</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yu, Zhiwen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Pun, Chi-Man</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ling, Wing-Kuen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Applied intelligence</subfield><subfield code="d">Springer US, 1991</subfield><subfield code="g">52(2022), 13 vom: 17. 
März, Seite 15673-15689</subfield><subfield code="w">(DE-627)130990515</subfield><subfield code="w">(DE-600)1080229-0</subfield><subfield code="w">(DE-576)029154286</subfield><subfield code="x">0924-669X</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:52</subfield><subfield code="g">year:2022</subfield><subfield code="g">number:13</subfield><subfield code="g">day:17</subfield><subfield code="g">month:03</subfield><subfield code="g">pages:15673-15689</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10489-022-03232-w</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">52</subfield><subfield code="j">2022</subfield><subfield code="e">13</subfield><subfield code="b">17</subfield><subfield code="c">03</subfield><subfield code="h">15673-15689</subfield></datafield></record></collection>
|
author |
Chen, Haiyuan |
spellingShingle |
Chen, Haiyuan ddc 004 misc Attention mechanism misc Feature filtering misc Fine-grained visual classification misc Self-supervised learning Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
authorStr |
Chen, Haiyuan |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)130990515 |
format |
Article |
dewey-ones |
004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut aut aut aut aut aut aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
0924-669X |
topic_title |
004 VZ Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning |
topic |
ddc 004 misc Attention mechanism misc Feature filtering misc Fine-grained visual classification misc Self-supervised learning |
topic_unstemmed |
ddc 004 misc Attention mechanism misc Feature filtering misc Fine-grained visual classification misc Self-supervised learning |
topic_browse |
ddc 004 misc Attention mechanism misc Feature filtering misc Fine-grained visual classification misc Self-supervised learning |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Applied intelligence |
hierarchy_parent_id |
130990515 |
dewey-tens |
000 - Computer science, knowledge & systems |
hierarchy_top_title |
Applied intelligence |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 |
title |
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
ctrlnum |
(DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p |
title_full |
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
author_sort |
Chen, Haiyuan |
journal |
Applied intelligence |
journalStr |
Applied intelligence |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2022 |
contenttype_str_mv |
txt |
container_start_page |
15673 |
author_browse |
Chen, Haiyuan Cheng, Lianglun Huang, Guoheng Zhang, Ganghan Lan, Jiaying Yu, Zhiwen Pun, Chi-Man Ling, Wing-Kuen |
container_volume |
52 |
class |
004 VZ |
format_se |
Aufsätze |
author-letter |
Chen, Haiyuan |
doi_str_mv |
10.1007/s10489-022-03232-w |
normlink |
(ORCID)0000-0002-3640-3229 |
normlink_prefix_str_mv |
(orcid)0000-0002-3640-3229 |
dewey-full |
004 |
title_sort |
fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
title_auth |
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
abstract |
Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 |
abstractGer |
Abstract Although existing Fine-Grained Visual Classification (FGVC) research has made some progress, several deficiencies remain to be addressed. Specifically: 1. Most methods use feature maps directly after extracting them from the original images, without further processing, which may allow irrelevant features to negatively affect network performance; 2. In many methods, the use of feature maps is relatively simple, and the relationships between feature maps that are helpful for accurate classification are ignored; 3. Due to the high similarity between subcategories, as well as the randomness and instability of training, the network's prediction results may sometimes not be accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first is the Self-supervised Attention Map Filter, which extracts the initial attention maps of subcategories and filters out the most distinguishable and representative local attention maps. The second is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with them. The third is the Reiterative Prediction module, which re-utilizes the network's first prediction result to improve accuracy and stability. Experimental results show that our SA-MFN outperforms state-of-the-art methods on multiple fine-grained classification datasets; in particular, on the Stanford Cars dataset, the proposed network achieves an accuracy of 94.7%. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022
abstract_unstemmed |
Abstract Although existing Fine-Grained Visual Classification (FGVC) research has made some progress, several deficiencies remain to be addressed. Specifically: 1. Most methods use feature maps directly after extracting them from the original images, without further processing, which may allow irrelevant features to negatively affect network performance; 2. In many methods, the use of feature maps is relatively simple, and the relationships between feature maps that are helpful for accurate classification are ignored; 3. Due to the high similarity between subcategories, as well as the randomness and instability of training, the network's prediction results may sometimes not be accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first is the Self-supervised Attention Map Filter, which extracts the initial attention maps of subcategories and filters out the most distinguishable and representative local attention maps. The second is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with them. The third is the Reiterative Prediction module, which re-utilizes the network's first prediction result to improve accuracy and stability. Experimental results show that our SA-MFN outperforms state-of-the-art methods on multiple fine-grained classification datasets; in particular, on the Stanford Cars dataset, the proposed network achieves an accuracy of 94.7%. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT |
container_issue |
13 |
title_short |
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
url |
https://doi.org/10.1007/s10489-022-03232-w |
remote_bool |
false |
author2 |
Cheng, Lianglun Huang, Guoheng Zhang, Ganghan Lan, Jiaying Yu, Zhiwen Pun, Chi-Man Ling, Wing-Kuen |
author2Str |
Cheng, Lianglun Huang, Guoheng Zhang, Ganghan Lan, Jiaying Yu, Zhiwen Pun, Chi-Man Ling, Wing-Kuen |
ppnlink |
130990515 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s10489-022-03232-w |
up_date |
2024-07-04T01:42:05.825Z |
_version_ |
1803610830496858112 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2079660861</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230506072430.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">221221s2022 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10489-022-03232-w</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2079660861</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10489-022-03232-w-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Chen, Haiyuan</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2022</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield 
code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Attention mechanism</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Feature filtering</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Fine-grained visual classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Self-supervised learning</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cheng, Lianglun</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Huang, Guoheng</subfield><subfield code="0">(orcid)0000-0002-3640-3229</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhang, Ganghan</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Lan, Jiaying</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yu, Zhiwen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Pun, Chi-Man</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ling, Wing-Kuen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Applied intelligence</subfield><subfield code="d">Springer US, 1991</subfield><subfield code="g">52(2022), 13 vom: 17. 
März, Seite 15673-15689</subfield><subfield code="w">(DE-627)130990515</subfield><subfield code="w">(DE-600)1080229-0</subfield><subfield code="w">(DE-576)029154286</subfield><subfield code="x">0924-669X</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:52</subfield><subfield code="g">year:2022</subfield><subfield code="g">number:13</subfield><subfield code="g">day:17</subfield><subfield code="g">month:03</subfield><subfield code="g">pages:15673-15689</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10489-022-03232-w</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">52</subfield><subfield code="j">2022</subfield><subfield code="e">13</subfield><subfield code="b">17</subfield><subfield code="c">03</subfield><subfield code="h">15673-15689</subfield></datafield></record></collection>
|
score |
7.4013433 |