Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism
Abstract: Although existing Fine-Grained Visual Classification (FGVC) research has made progress, several deficiencies remain to be addressed. Specifically, 1. Most methods use feature maps directly after extracting them from the original images, without further...
Detailed description

Author: Chen, Haiyuan [author]
Format: Article
Language: English
Published: 2022
Subject headings: Attention mechanism; Feature filtering; Fine-grained visual classification; Self-supervised learning
Note: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022
Parent work: Contained in: Applied intelligence - Springer US, 1991, 52(2022), 13, 17 March, pages 15673-15689
Parent work: volume:52 ; year:2022 ; number:13 ; day:17 ; month:03 ; pages:15673-15689
Links:
DOI / URN: 10.1007/s10489-022-03232-w
Catalog ID: OLC2079660861
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2079660861 | ||
003 | DE-627 | ||
005 | 20230506072430.0 | ||
007 | tu | ||
008 | 221221s2022 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s10489-022-03232-w |2 doi | |
035 | |a (DE-627)OLC2079660861 | ||
035 | |a (DE-He213)s10489-022-03232-w-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 004 |q VZ |
100 | 1 | |a Chen, Haiyuan |e verfasserin |4 aut | |
245 | 1 | 0 | |a Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 | ||
520 | |a Abstract: Although existing Fine-Grained Visual Classification (FGVC) research has made progress, several deficiencies remain to be addressed. Specifically, 1. Most methods use feature maps directly after extracting them from the original images, without further processing, which may allow irrelevant features to degrade network performance; 2. In many methods, the use of feature maps is relatively simple, and the relationships between feature maps that are helpful for accurate classification are ignored; 3. Due to the high similarity between subcategories, as well as the randomness and instability of training, the network's predictions may sometimes not be accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first is the Self-supervised Attention Map Filter, which extracts the initial attention maps of subcategories and filters out the most distinguishable and representative local attention maps. The second is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with them. The third is the Reiterative Prediction module, which re-uses the network's first prediction to improve accuracy and stability. Experimental results show that SA-MFN outperforms state-of-the-art methods on multiple fine-grained classification datasets; in particular, on the Stanford Cars dataset the proposed network achieves an accuracy of 94.7%. | ||
650 | 4 | |a Attention mechanism | |
650 | 4 | |a Feature filtering | |
650 | 4 | |a Fine-grained visual classification | |
650 | 4 | |a Self-supervised learning | |
700 | 1 | |a Cheng, Lianglun |4 aut | |
700 | 1 | |a Huang, Guoheng |0 (orcid)0000-0002-3640-3229 |4 aut | |
700 | 1 | |a Zhang, Ganghan |4 aut | |
700 | 1 | |a Lan, Jiaying |4 aut | |
700 | 1 | |a Yu, Zhiwen |4 aut | |
700 | 1 | |a Pun, Chi-Man |4 aut | |
700 | 1 | |a Ling, Wing-Kuen |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Applied intelligence |d Springer US, 1991 |g 52(2022), 13 vom: 17. März, Seite 15673-15689 |w (DE-627)130990515 |w (DE-600)1080229-0 |w (DE-576)029154286 |x 0924-669X |7 nnns |
773 | 1 | 8 | |g volume:52 |g year:2022 |g number:13 |g day:17 |g month:03 |g pages:15673-15689 |
856 | 4 | 1 | |u https://doi.org/10.1007/s10489-022-03232-w |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-MAT | ||
951 | |a AR | ||
952 | |d 52 |j 2022 |e 13 |b 17 |c 03 |h 15673-15689 |
author_variant |
h c hc l c lc g h gh g z gz j l jl z y zy c m p cmp w k l wkl |
---|---|
matchkey_str |
article:0924669X:2022----::ierievsacasfctowtmliclfauebsdnefuevsd |
hierarchy_sort_str |
2022 |
publishDate |
2022 |
allfields |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
spelling |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
allfields_unstemmed |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
allfieldsGer |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
allfieldsSound |
10.1007/s10489-022-03232-w doi (DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p DE-627 ger DE-627 rakwb eng 004 VZ Chen, Haiyuan verfasserin aut Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism 2022 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning Cheng, Lianglun aut Huang, Guoheng (orcid)0000-0002-3640-3229 aut Zhang, Ganghan aut Lan, Jiaying aut Yu, Zhiwen aut Pun, Chi-Man aut Ling, Wing-Kuen aut Enthalten in Applied intelligence Springer US, 1991 52(2022), 13 vom: 17. März, Seite 15673-15689 (DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 0924-669X nnns volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 https://doi.org/10.1007/s10489-022-03232-w lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT AR 52 2022 13 17 03 15673-15689 |
language |
English |
source |
Enthalten in Applied intelligence 52(2022), 13 vom: 17. März, Seite 15673-15689 volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 |
sourceStr |
Enthalten in Applied intelligence 52(2022), 13 vom: 17. März, Seite 15673-15689 volume:52 year:2022 number:13 day:17 month:03 pages:15673-15689 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning |
dewey-raw |
004 |
isfreeaccess_bool |
false |
container_title |
Applied intelligence |
authorswithroles_txt_mv |
Chen, Haiyuan @@aut@@ Cheng, Lianglun @@aut@@ Huang, Guoheng @@aut@@ Zhang, Ganghan @@aut@@ Lan, Jiaying @@aut@@ Yu, Zhiwen @@aut@@ Pun, Chi-Man @@aut@@ Ling, Wing-Kuen @@aut@@ |
publishDateDaySort_date |
2022-03-17T00:00:00Z |
hierarchy_top_id |
130990515 |
dewey-sort |
14 |
id |
OLC2079660861 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2079660861</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230506072430.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">221221s2022 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10489-022-03232-w</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2079660861</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10489-022-03232-w-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Chen, Haiyuan</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2022</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield 
code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Attention mechanism</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Feature filtering</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Fine-grained visual classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Self-supervised learning</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cheng, Lianglun</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Huang, Guoheng</subfield><subfield code="0">(orcid)0000-0002-3640-3229</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhang, Ganghan</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Lan, Jiaying</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yu, Zhiwen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Pun, Chi-Man</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ling, Wing-Kuen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Applied intelligence</subfield><subfield code="d">Springer US, 1991</subfield><subfield code="g">52(2022), 13 vom: 17. 
März, Seite 15673-15689</subfield><subfield code="w">(DE-627)130990515</subfield><subfield code="w">(DE-600)1080229-0</subfield><subfield code="w">(DE-576)029154286</subfield><subfield code="x">0924-669X</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:52</subfield><subfield code="g">year:2022</subfield><subfield code="g">number:13</subfield><subfield code="g">day:17</subfield><subfield code="g">month:03</subfield><subfield code="g">pages:15673-15689</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10489-022-03232-w</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">52</subfield><subfield code="j">2022</subfield><subfield code="e">13</subfield><subfield code="b">17</subfield><subfield code="c">03</subfield><subfield code="h">15673-15689</subfield></datafield></record></collection>
|
author |
Chen, Haiyuan |
spellingShingle |
Chen, Haiyuan ddc 004 misc Attention mechanism misc Feature filtering misc Fine-grained visual classification misc Self-supervised learning Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
authorStr |
Chen, Haiyuan |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)130990515 |
format |
Article |
dewey-ones |
004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut aut aut aut aut aut aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
0924-669X |
topic_title |
004 VZ Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism Attention mechanism Feature filtering Fine-grained visual classification Self-supervised learning |
topic |
ddc 004 misc Attention mechanism misc Feature filtering misc Fine-grained visual classification misc Self-supervised learning |
topic_unstemmed |
ddc 004 misc Attention mechanism misc Feature filtering misc Fine-grained visual classification misc Self-supervised learning |
topic_browse |
ddc 004 misc Attention mechanism misc Feature filtering misc Fine-grained visual classification misc Self-supervised learning |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Applied intelligence |
hierarchy_parent_id |
130990515 |
dewey-tens |
000 - Computer science, knowledge & systems |
hierarchy_top_title |
Applied intelligence |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)130990515 (DE-600)1080229-0 (DE-576)029154286 |
title |
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
ctrlnum |
(DE-627)OLC2079660861 (DE-He213)s10489-022-03232-w-p |
title_full |
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
author_sort |
Chen, Haiyuan |
journal |
Applied intelligence |
journalStr |
Applied intelligence |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2022 |
contenttype_str_mv |
txt |
container_start_page |
15673 |
author_browse |
Chen, Haiyuan Cheng, Lianglun Huang, Guoheng Zhang, Ganghan Lan, Jiaying Yu, Zhiwen Pun, Chi-Man Ling, Wing-Kuen |
container_volume |
52 |
class |
004 VZ |
format_se |
Aufsätze |
author-letter |
Chen, Haiyuan |
doi_str_mv |
10.1007/s10489-022-03232-w |
normlink |
(ORCID)0000-0002-3640-3229 |
normlink_prefix_str_mv |
(orcid)0000-0002-3640-3229 |
dewey-full |
004 |
title_sort |
fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
title_auth |
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
abstract |
Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 |
abstractGer |
Abstract Although existing Fine-Grained Visual Classification (FGVC) research has made some progress, several deficiencies remain to be addressed. Specifically: 1. Most methods use feature maps directly after extracting them from the original images, without further processing, which may allow irrelevant features to negatively affect network performance; 2. In many methods, the use of feature maps is relatively simple, and the relationships between feature maps that are helpful for accurate classification are ignored; 3. Due to the high similarity between subcategories, as well as the randomness and instability of training, the network's prediction results may sometimes not be accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first is the Self-supervised Attention Map Filter, which extracts the initial attention maps of subcategories and filters out the most distinguishable and representative local attention maps. The second is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with them. The third is the Reiterative Prediction module, which re-utilizes the network's first prediction result to improve accuracy and stability. Experimental results show that our SA-MFN outperforms state-of-the-art methods on multiple fine-grained classification datasets; in particular, on the Stanford Cars dataset, the proposed network achieves an accuracy of 94.7%. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022
abstract_unstemmed |
Abstract Although existing Fine-Grained Visual Classification (FGVC) research has made some progress, several deficiencies remain to be addressed. Specifically: 1. Most methods use feature maps directly after extracting them from the original images, without further processing, which may allow irrelevant features to negatively affect network performance; 2. In many methods, the use of feature maps is relatively simple, and the relationships between feature maps that are helpful for accurate classification are ignored; 3. Due to the high similarity between subcategories, as well as the randomness and instability of training, the network's prediction results may sometimes not be accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first is the Self-supervised Attention Map Filter, which extracts the initial attention maps of subcategories and filters out the most distinguishable and representative local attention maps. The second is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with them. The third is the Reiterative Prediction module, which re-utilizes the network's first prediction result to improve accuracy and stability. Experimental results show that our SA-MFN outperforms state-of-the-art methods on multiple fine-grained classification datasets; in particular, on the Stanford Cars dataset, the proposed network achieves an accuracy of 94.7%. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT |
container_issue |
13 |
title_short |
Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism |
url |
https://doi.org/10.1007/s10489-022-03232-w |
remote_bool |
false |
author2 |
Cheng, Lianglun Huang, Guoheng Zhang, Ganghan Lan, Jiaying Yu, Zhiwen Pun, Chi-Man Ling, Wing-Kuen |
author2Str |
Cheng, Lianglun Huang, Guoheng Zhang, Ganghan Lan, Jiaying Yu, Zhiwen Pun, Chi-Man Ling, Wing-Kuen |
ppnlink |
130990515 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s10489-022-03232-w |
up_date |
2024-07-04T01:42:05.825Z |
_version_ |
1803610830496858112 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2079660861</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230506072430.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">221221s2022 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10489-022-03232-w</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2079660861</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10489-022-03232-w-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Chen, Haiyuan</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Fine-grained visual classification with multi-scale features based on self-supervised attention filtering mechanism</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2022</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield 
code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Although the existing Fine-Grained Visual Classification (FGVC) researches has made some progress, there are still some deficiencies need to be refined. Specifically, 1. The feature maps are used directly by most methods after they are extracted from the original images, which lacks further processing of feature maps and may lead irrelevant features to negatively affect network performance; 2. In many methods, the utilize of feature maps is relatively simple, and the relationship between feature maps that helpful for accurate classification is ignored. 3. Due to the high similarity between subcategories as well as the randomness and instability of training, the network prediction results may sometimes not accurate enough. To this end, we propose an efficient Self-supervised Attention Filtering and Multi-scale Features Network (SA-MFN) to improve the accuracy of FGVC, which consists of three modules. The first one is the Self-supervised Attention Map Filter, which is proposed to extract the initial attention maps of subcategories and filter out the most distinguishable and representative local attention maps. The second module is the Multi-scale Attention Map Generator, which extracts a global spatial feature map from the filtered attention maps and then concatenates it with the filtered attention maps. The third module is the Reiterative Prediction, in which the first prediction result of the network is re-utilized by this module to improve the accuracy and stability. 
Experimental results show that our SA-MFN outperforms the state-of-the-art methods on multiple fine-grained classification datasets, especially on the dataset of Stanford Cars, the proposed network achieves the accuracy of 94.7%.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Attention mechanism</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Feature filtering</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Fine-grained visual classification</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Self-supervised learning</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cheng, Lianglun</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Huang, Guoheng</subfield><subfield code="0">(orcid)0000-0002-3640-3229</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Zhang, Ganghan</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Lan, Jiaying</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yu, Zhiwen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Pun, Chi-Man</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ling, Wing-Kuen</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Applied intelligence</subfield><subfield code="d">Springer US, 1991</subfield><subfield code="g">52(2022), 13 vom: 17. 
März, Seite 15673-15689</subfield><subfield code="w">(DE-627)130990515</subfield><subfield code="w">(DE-600)1080229-0</subfield><subfield code="w">(DE-576)029154286</subfield><subfield code="x">0924-669X</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:52</subfield><subfield code="g">year:2022</subfield><subfield code="g">number:13</subfield><subfield code="g">day:17</subfield><subfield code="g">month:03</subfield><subfield code="g">pages:15673-15689</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10489-022-03232-w</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">52</subfield><subfield code="j">2022</subfield><subfield code="e">13</subfield><subfield code="b">17</subfield><subfield code="c">03</subfield><subfield code="h">15673-15689</subfield></datafield></record></collection>
|
score |
7.4013433 |