Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods.
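The abstract describes aggregating per-view CNN features into one object descriptor via self-attention, then concatenating original, view-attentive, and instance-attentive features. The following is a minimal illustrative sketch of that idea only, not the paper's implementation: the function name `attentive_aggregate`, the max-pooling choice for the "original" branch, and the specific attention/weighting formulas are all assumptions made for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_aggregate(view_feats):
    """Aggregate per-view features into one 3D-object descriptor (sketch).

    view_feats: (V, D) array, one D-dim CNN feature per rendered view.
    Returns a (3*D,) concatenation loosely mirroring the paper's three
    branches: original, view-attentive, instance-attentive.
    """
    V, D = view_feats.shape
    # "Original" branch: plain view pooling (max over views), as in MVCNN.
    original = view_feats.max(axis=0)
    # Self-attention across views: each view attends to every view.
    scores = view_feats @ view_feats.T / np.sqrt(D)   # (V, V)
    attended = softmax(scores, axis=-1) @ view_feats  # (V, D)
    view_attentive = attended.max(axis=0)
    # Instance-level branch: one scalar weight per view, then a weighted sum.
    weights = softmax((view_feats * original).sum(axis=1))  # (V,)
    instance_attentive = weights @ view_feats
    return np.concatenate([original, view_attentive, instance_attentive])
```

For example, 12 views with 8-dim features yield a 24-dim descriptor; the real model would operate on learned projections of deep CNN features rather than raw arrays.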
Detailed description
Author: Lin, Dongyun [author]
Format: Electronic article
Language: English
Published: 2022 (transfer abstract)
Subject headings: Cosine distance triplet-center loss
Parent work: Contained in: Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea - Wang, Jiliang ELSEVIER, 2018, Amsterdam [u.a.]
Parent work: volume:247 ; year:2022 ; day:8 ; month:07 ; pages:0
Links:
DOI / URN: 10.1016/j.knosys.2022.108754
Catalog ID: ELV057662800
LEADER 01000caa a22002652 4500
001 ELV057662800
003 DE-627
005 20230626045522.0
007 cr uuu---uuuuu
008 220808s2022 xx |||||o 00| ||eng c
024 7_ |a 10.1016/j.knosys.2022.108754 |2 doi
028 52 |a /cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001768.pica
035 __ |a (DE-627)ELV057662800
035 __ |a (ELSEVIER)S0950-7051(22)00354-9
040 __ |a DE-627 |b ger |c DE-627 |e rakwb
041 __ |a eng
082 04 |a 550 |q VZ
084 __ |a 38.00 |2 bkl
100 1_ |a Lin, Dongyun |e verfasserin |4 aut
245 10 |a Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features
264 _1 |c 2022transfer abstract
336 __ |a nicht spezifiziert |b zzz |2 rdacontent
337 __ |a nicht spezifiziert |b z |2 rdamedia
338 __ |a nicht spezifiziert |b zu |2 rdacarrier
520 __ |a In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods.
650 _7 |a Cosine distance triplet-center loss |2 Elsevier
650 _7 |a View attention module |2 Elsevier
650 _7 |a View-based 3D object retrieval |2 Elsevier
650 _7 |a ArcFace loss |2 Elsevier
650 _7 |a Instance attention module |2 Elsevier
700 1_ |a Li, Yiqun |4 oth
700 1_ |a Cheng, Yi |4 oth
700 1_ |a Prasad, Shitala |4 oth
700 1_ |a Nwe, Tin Lay |4 oth
700 1_ |a Dong, Sheng |4 oth
700 1_ |a Guo, Aiyuan |4 oth
773 08 |i Enthalten in |n Elsevier Science |a Wang, Jiliang ELSEVIER |t Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |d 2018 |g Amsterdam [u.a.] |w (DE-627)ELV001104926
773 18 |g volume:247 |g year:2022 |g day:8 |g month:07 |g pages:0
856 40 |u https://doi.org/10.1016/j.knosys.2022.108754 |3 Volltext
912 __ |a GBV_USEFLAG_U
912 __ |a GBV_ELV
912 __ |a SYSFLAG_U
912 __ |a SSG-OPC-GGO
936 bk |a 38.00 |j Geowissenschaften: Allgemeines |q VZ
951 __ |a AR
952 __ |d 247 |j 2022 |b 8 |c 0708 |h 0
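The abstract trains the model with the ArcFace loss plus a cosine-distance-based triplet-center loss, so that the training objective matches the cosine-distance ranking used at retrieval time. Below is a minimal sketch of that second component only, under stated assumptions: the function name `cosine_triplet_center_loss`, the margin value, and the "nearest wrong center" negative-selection rule are illustrative choices, not the authors' exact formulation.

```python
import numpy as np

def l2norm(x, axis=-1):
    # Normalize rows to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def cosine_triplet_center_loss(feats, labels, centers, margin=0.1):
    """Cosine-distance triplet-center loss (sketch).

    Pulls each feature toward its own class center and pushes it away from
    the nearest other center, with distance d(a, b) = 1 - cos(a, b).
    feats: (N, D), labels: (N,) int class ids, centers: (C, D).
    """
    f = l2norm(feats)
    c = l2norm(centers)
    dist = 1.0 - f @ c.T             # (N, C) cosine distances to all centers
    n = np.arange(len(labels))
    pos = dist[n, labels]            # distance to own class center
    neg = dist.copy()
    neg[n, labels] = np.inf          # mask out the positive center
    nearest_neg = neg.min(axis=1)    # distance to the closest wrong center
    # Hinge: positive distance + margin should not exceed negative distance.
    return np.maximum(pos + margin - nearest_neg, 0.0).mean()
```

Because both the loss and the retrieval ranking operate on cosine distance, features that satisfy the training margin are directly well-separated under the test-time metric; that consistency is the point the abstract makes.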
illustrated |
Not Illustrated |
topic_title |
550 VZ 38.00 bkl Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features Cosine distance triplet-center loss Elsevier View attention module Elsevier View-based 3D object retrieval Elsevier ArcFace loss Elsevier Instance attention module Elsevier |
topic |
ddc 550 bkl 38.00 Elsevier Cosine distance triplet-center loss Elsevier View attention module Elsevier View-based 3D object retrieval Elsevier ArcFace loss Elsevier Instance attention module |
topic_unstemmed |
ddc 550 bkl 38.00 Elsevier Cosine distance triplet-center loss Elsevier View attention module Elsevier View-based 3D object retrieval Elsevier ArcFace loss Elsevier Instance attention module |
topic_browse |
ddc 550 bkl 38.00 Elsevier Cosine distance triplet-center loss Elsevier View attention module Elsevier View-based 3D object retrieval Elsevier ArcFace loss Elsevier Instance attention module |
format_facet |
Elektronische Aufsätze Aufsätze Elektronische Ressource |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
zu |
author2_variant |
y l yl y c yc s p sp t l n tl tln s d sd a g ag |
hierarchy_parent_title |
Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |
hierarchy_parent_id |
ELV001104926 |
dewey-tens |
550 - Earth sciences & geology |
hierarchy_top_title |
Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)ELV001104926 |
title |
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features |
ctrlnum |
(DE-627)ELV057662800 (ELSEVIER)S0950-7051(22)00354-9 |
title_full |
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features |
author_sort |
Lin, Dongyun |
journal |
Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |
journalStr |
Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
500 - Science |
recordtype |
marc |
publishDateSort |
2022 |
contenttype_str_mv |
zzz |
container_start_page |
0 |
author_browse |
Lin, Dongyun |
container_volume |
247 |
class |
550 VZ 38.00 bkl |
format_se |
Elektronische Aufsätze |
author-letter |
Lin, Dongyun |
doi_str_mv |
10.1016/j.knosys.2022.108754 |
dewey-full |
550 |
title_sort |
multi-view 3d object retrieval leveraging the aggregation of view and instance attentive features |
title_auth |
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features |
abstract |
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. |
abstractGer |
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. |
abstract_unstemmed |
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. |
collection_details |
GBV_USEFLAG_U GBV_ELV SYSFLAG_U SSG-OPC-GGO |
title_short |
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features |
url |
https://doi.org/10.1016/j.knosys.2022.108754 |
remote_bool |
true |
author2 |
Li, Yiqun Cheng, Yi Prasad, Shitala Nwe, Tin Lay Dong, Sheng Guo, Aiyuan |
author2Str |
Li, Yiqun Cheng, Yi Prasad, Shitala Nwe, Tin Lay Dong, Sheng Guo, Aiyuan |
ppnlink |
ELV001104926 |
mediatype_str_mv |
z |
isOA_txt |
false |
hochschulschrift_bool |
false |
author2_role |
oth oth oth oth oth oth |
doi_str |
10.1016/j.knosys.2022.108754 |
up_date |
2024-07-06T16:47:58.917Z |
_version_ |
1803849017765920768 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">ELV057662800</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230626045522.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">220808s2022 xx |||||o 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1016/j.knosys.2022.108754</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">/cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001768.pica</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)ELV057662800</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ELSEVIER)S0950-7051(22)00354-9</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">550</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">38.00</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Lin, Dongyun</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2022transfer abstract</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield 
code="b">zzz</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">z</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">zu</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. 
Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods.</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. 
Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods.</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Cosine distance triplet-center loss</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">View attention module</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">View-based 3D object retrieval</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">ArcFace loss</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Instance attention module</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Li, Yiqun</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cheng, Yi</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Prasad, Shitala</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Nwe, Tin Lay</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Dong, Sheng</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Guo, Aiyuan</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="n">Elsevier Science</subfield><subfield code="a">Wang, Jiliang ELSEVIER</subfield><subfield code="t">Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China 
Sea</subfield><subfield code="d">2018</subfield><subfield code="g">Amsterdam [u.a.]</subfield><subfield code="w">(DE-627)ELV001104926</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:247</subfield><subfield code="g">year:2022</subfield><subfield code="g">day:8</subfield><subfield code="g">month:07</subfield><subfield code="g">pages:0</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://doi.org/10.1016/j.knosys.2022.108754</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ELV</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-GGO</subfield></datafield><datafield tag="936" ind1="b" ind2="k"><subfield code="a">38.00</subfield><subfield code="j">Geowissenschaften: Allgemeines</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">247</subfield><subfield code="j">2022</subfield><subfield code="b">8</subfield><subfield code="c">0708</subfield><subfield code="h">0</subfield></datafield></record></collection>
|