Coarse2Fine: a two-stage training method for fine-grained visual classification
Abstract: Small inter-class and large intra-class variations are the key challenges in fine-grained visual classification. Objects from different classes share visually similar structures, and objects in the same class can have different poses and viewpoints. Therefore, the proper extraction of discriminative local features (e.g., a bird's beak or a car's headlight) is crucial. Most of the recent successes on this problem are based upon attention models that can localize and attend to local discriminative object parts. In this work, we propose a training method for visual attention networks, Coarse2Fine, which creates a differentiable path from the attended feature maps to the input space. Coarse2Fine learns an inverse mapping function from the attended feature maps to the informative regions in the raw image, which guides the attention maps to better attend to the fine-grained features. In addition, we propose an initialization method for the attention weights. Our experiments show that Coarse2Fine reduces the classification error by up to 5.1% on common fine-grained datasets.
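The record contains no code, but to illustrate the kind of architecture the abstract describes (an attention map computed over backbone feature maps, plus a learned inverse mapping from the attended maps back to image space that gives the attention a differentiable path to the input), here is a minimal PyTorch-style sketch. All names, layer sizes, the ResNet-50 backbone, and the loss weighting are assumptions made for illustration; they are not taken from the authors' implementation.

# Minimal sketch (assumed names and architecture), loosely following the idea in the
# abstract: a spatial attention map over backbone features, and a learned "inverse
# mapping" from the attended feature maps back to image space, so a reconstruction-
# style term can guide the attention toward informative regions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class AttentionHead(nn.Module):
    """1x1 convolution producing a single-channel spatial attention map."""
    def __init__(self, channels):
        super().__init__()
        # The paper also proposes an initialization scheme for the attention
        # weights; the default initialization is used in this sketch.
        self.conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):
        return torch.sigmoid(self.conv(feats))  # (B, 1, h, w), values in [0, 1]

class InverseMapper(nn.Module):
    """Decoder from attended feature maps back to image space (assumed design)."""
    def __init__(self, channels):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
        )

    def forward(self, attended_feats, image_size):
        x = self.decode(attended_feats)
        return F.interpolate(x, size=image_size, mode="bilinear", align_corners=False)

class Coarse2FineSketch(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        backbone = torchvision.models.resnet50()  # backbone choice is an assumption
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # (B, 2048, h, w)
        self.attention = AttentionHead(2048)
        self.inverse = InverseMapper(2048)
        self.classifier = nn.Linear(2048, num_classes)

    def forward(self, images):
        feats = self.features(images)
        attn = self.attention(feats)
        attended = feats * attn                            # attended feature maps
        recon = self.inverse(attended, images.shape[-2:])  # differentiable path back to input space
        logits = self.classifier(attended.mean(dim=(2, 3)))  # global average pooling
        return logits, recon, attn

# Illustrative training step: classification loss plus a reconstruction term that
# encourages the inverse mapping (and hence the attention) to cover informative regions.
def training_step(model, images, labels, recon_weight=0.1):
    logits, recon, _ = model(images)
    return F.cross_entropy(logits, labels) + recon_weight * F.mse_loss(recon, images)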
Detailed description

Author:        Eshratifar, Amir Erfan [author]
Format:        Article
Language:      English
Published:     2021
Keywords:      Fine-grained visual classification; Weakly supervised object localization; Visual attention networks
Note:          © The Author(s), under exclusive licence to Springer-Verlag GmbH, DE part of Springer Nature 2021
Published in:  Machine vision and applications (Springer Berlin Heidelberg, 1988), vol. 32 (2021), no. 2, 25 Feb. 2021
Links:         Full text (license required): https://doi.org/10.1007/s00138-021-01180-y
DOI / URN:     10.1007/s00138-021-01180-y
Catalog ID:    OLC2123987204
LEADER  01000naa a22002652 4500
001     OLC2123987204
003     DE-627
005     20230505085136.0
007     tu
008     230505s2021 xx ||||| 00| ||eng c
024 7_  |a 10.1007/s00138-021-01180-y |2 doi
035 __  |a (DE-627)OLC2123987204
035 __  |a (DE-He213)s00138-021-01180-y-p
040 __  |a DE-627 |b ger |c DE-627 |e rakwb
041 __  |a eng
082 04  |a 004 |q VZ
084 __  |a 11 |2 ssgn
100 1_  |a Eshratifar, Amir Erfan |e verfasserin |0 (orcid)0000-0002-1339-7671 |4 aut
245 10  |a Coarse2Fine: a two-stage training method for fine-grained visual classification
264 _1  |c 2021
336 __  |a Text |b txt |2 rdacontent
337 __  |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338 __  |a Band |b nc |2 rdacarrier
500 __  |a © The Author(s), under exclusive licence to Springer-Verlag GmbH, DE part of Springer Nature 2021
520 __  |a Abstract: Small inter-class and large intra-class variations are the key challenges in fine-grained visual classification. Objects from different classes share visually similar structures, and objects in the same class can have different poses and viewpoints. Therefore, the proper extraction of discriminative local features (e.g., a bird's beak or a car's headlight) is crucial. Most of the recent successes on this problem are based upon attention models that can localize and attend to local discriminative object parts. In this work, we propose a training method for visual attention networks, Coarse2Fine, which creates a differentiable path from the attended feature maps to the input space. Coarse2Fine learns an inverse mapping function from the attended feature maps to the informative regions in the raw image, which guides the attention maps to better attend to the fine-grained features. In addition, we propose an initialization method for the attention weights. Our experiments show that Coarse2Fine reduces the classification error by up to 5.1% on common fine-grained datasets.
650 _4  |a Fine-grained visual classification
650 _4  |a Weakly supervised object localization
650 _4  |a Visual attention networks
700 1_  |a Eigen, David |4 aut
700 1_  |a Gormish, Michael |4 aut
700 1_  |a Pedram, Massoud |4 aut
773 08  |i Enthalten in |t Machine vision and applications |d Springer Berlin Heidelberg, 1988 |g 32(2021), 2 vom: 25. Feb. |w (DE-627)129248843 |w (DE-600)59385-0 |w (DE-576)017944139 |x 0932-8092 |7 nnns
773 18  |g volume:32 |g year:2021 |g number:2 |g day:25 |g month:02
856 41  |u https://doi.org/10.1007/s00138-021-01180-y |z lizenzpflichtig |3 Volltext
912 __  |a GBV_USEFLAG_A
912 __  |a SYSFLAG_A
912 __  |a GBV_OLC
912 __  |a SSG-OLC-MAT
912 __  |a GBV_ILN_2018
912 __  |a GBV_ILN_4277
951 __  |a AR
952 __  |d 32 |j 2021 |e 2 |b 25 |c 02