Coarse2Fine: a two-stage training method for fine-grained visual classification
Abstract: Small inter-class and large intra-class variations are the key challenges in fine-grained visual classification. Objects from different classes share visually similar structures, and objects in the same class can have different poses and viewpoints. Therefore, the proper extraction of discriminative local features (e.g., a bird's beak or a car's headlight) is crucial. Most of the recent successes on this problem are based upon attention models that can localize and attend to local discriminative object parts. In this work, we propose a training method for visual attention networks, Coarse2Fine, which creates a differentiable path from the attended feature maps to the input space. Coarse2Fine learns an inverse mapping function from the attended feature maps to the informative regions in the raw image, which guides the attention maps to better attend to the fine-grained features. In addition, we propose an initialization method for the attention weights. Our experiments show that Coarse2Fine reduces the classification error by up to 5.1% on common fine-grained datasets.
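The record contains no code, but to illustrate the kind of architecture the abstract describes (an attention map computed over backbone feature maps, plus a learned inverse mapping from the attended maps back to image space that gives the attention a differentiable path to the input), here is a minimal PyTorch-style sketch. All names, layer sizes, the ResNet-50 backbone, and the loss weighting are assumptions made for illustration; they are not taken from the authors' implementation.

# Minimal sketch (assumed names and architecture), loosely following the idea in the
# abstract: a spatial attention map over backbone features, and a learned "inverse
# mapping" from the attended feature maps back to image space, so a reconstruction-
# style term can guide the attention toward informative regions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision

class AttentionHead(nn.Module):
    """1x1 convolution producing a single-channel spatial attention map."""
    def __init__(self, channels):
        super().__init__()
        # The paper also proposes an initialization scheme for the attention
        # weights; the default initialization is used in this sketch.
        self.conv = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):
        return torch.sigmoid(self.conv(feats))  # (B, 1, h, w), values in [0, 1]

class InverseMapper(nn.Module):
    """Decoder from attended feature maps back to image space (assumed design)."""
    def __init__(self, channels):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Conv2d(channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
        )

    def forward(self, attended_feats, image_size):
        x = self.decode(attended_feats)
        return F.interpolate(x, size=image_size, mode="bilinear", align_corners=False)

class Coarse2FineSketch(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        backbone = torchvision.models.resnet50()  # backbone choice is an assumption
        self.features = nn.Sequential(*list(backbone.children())[:-2])  # (B, 2048, h, w)
        self.attention = AttentionHead(2048)
        self.inverse = InverseMapper(2048)
        self.classifier = nn.Linear(2048, num_classes)

    def forward(self, images):
        feats = self.features(images)
        attn = self.attention(feats)
        attended = feats * attn                            # attended feature maps
        recon = self.inverse(attended, images.shape[-2:])  # differentiable path back to input space
        logits = self.classifier(attended.mean(dim=(2, 3)))  # global average pooling
        return logits, recon, attn

# Illustrative training step: classification loss plus a reconstruction term that
# encourages the inverse mapping (and hence the attention) to cover informative regions.
def training_step(model, images, labels, recon_weight=0.1):
    logits, recon, _ = model(images)
    return F.cross_entropy(logits, labels) + recon_weight * F.mse_loss(recon, images)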
Detailed description

Author:        Eshratifar, Amir Erfan [author]
Format:        Article
Language:      English
Published:     2021
Keywords:      Fine-grained visual classification; Weakly supervised object localization; Visual attention networks
Note:          © The Author(s), under exclusive licence to Springer-Verlag GmbH, DE part of Springer Nature 2021
Published in:  Machine vision and applications (Springer Berlin Heidelberg, 1988), vol. 32 (2021), no. 2, 25 Feb. 2021
Links:         Full text (license required): https://doi.org/10.1007/s00138-021-01180-y
DOI / URN:     10.1007/s00138-021-01180-y
Catalog ID:    OLC2123987204
LEADER  01000naa a22002652 4500
001     OLC2123987204
003     DE-627
005     20230505085136.0
007     tu
008     230505s2021 xx ||||| 00| ||eng c
024 7_  |a 10.1007/s00138-021-01180-y |2 doi
035 __  |a (DE-627)OLC2123987204
035 __  |a (DE-He213)s00138-021-01180-y-p
040 __  |a DE-627 |b ger |c DE-627 |e rakwb
041 __  |a eng
082 04  |a 004 |q VZ
084 __  |a 11 |2 ssgn
100 1_  |a Eshratifar, Amir Erfan |e verfasserin |0 (orcid)0000-0002-1339-7671 |4 aut
245 10  |a Coarse2Fine: a two-stage training method for fine-grained visual classification
264 _1  |c 2021
336 __  |a Text |b txt |2 rdacontent
337 __  |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338 __  |a Band |b nc |2 rdacarrier
500 __  |a © The Author(s), under exclusive licence to Springer-Verlag GmbH, DE part of Springer Nature 2021
520 __  |a Abstract: Small inter-class and large intra-class variations are the key challenges in fine-grained visual classification. Objects from different classes share visually similar structures, and objects in the same class can have different poses and viewpoints. Therefore, the proper extraction of discriminative local features (e.g., a bird's beak or a car's headlight) is crucial. Most of the recent successes on this problem are based upon attention models that can localize and attend to local discriminative object parts. In this work, we propose a training method for visual attention networks, Coarse2Fine, which creates a differentiable path from the attended feature maps to the input space. Coarse2Fine learns an inverse mapping function from the attended feature maps to the informative regions in the raw image, which guides the attention maps to better attend to the fine-grained features. In addition, we propose an initialization method for the attention weights. Our experiments show that Coarse2Fine reduces the classification error by up to 5.1% on common fine-grained datasets.
650 _4  |a Fine-grained visual classification
650 _4  |a Weakly supervised object localization
650 _4  |a Visual attention networks
700 1_  |a Eigen, David |4 aut
700 1_  |a Gormish, Michael |4 aut
700 1_  |a Pedram, Massoud |4 aut
773 08  |i Enthalten in |t Machine vision and applications |d Springer Berlin Heidelberg, 1988 |g 32(2021), 2 vom: 25. Feb. |w (DE-627)129248843 |w (DE-600)59385-0 |w (DE-576)017944139 |x 0932-8092 |7 nnns
773 18  |g volume:32 |g year:2021 |g number:2 |g day:25 |g month:02
856 41  |u https://doi.org/10.1007/s00138-021-01180-y |z lizenzpflichtig |3 Volltext
912 __  |a GBV_USEFLAG_A
912 __  |a SYSFLAG_A
912 __  |a GBV_OLC
912 __  |a SSG-OLC-MAT
912 __  |a GBV_ILN_2018
912 __  |a GBV_ILN_4277
951 __  |a AR
952 __  |d 32 |j 2021 |e 2 |b 25 |c 02