Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods.
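The abstract describes aggregating per-view CNN features into one object descriptor via self-attention, then concatenating original, view-attentive, and instance-attentive features. The following is a minimal illustrative sketch of that idea only, not the paper's implementation: the function name `attentive_aggregate`, the max-pooling choice for the "original" branch, and the specific attention/weighting formulas are all assumptions made for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_aggregate(view_feats):
    """Aggregate per-view features into one 3D-object descriptor (sketch).

    view_feats: (V, D) array, one D-dim CNN feature per rendered view.
    Returns a (3*D,) concatenation loosely mirroring the paper's three
    branches: original, view-attentive, instance-attentive.
    """
    V, D = view_feats.shape
    # "Original" branch: plain view pooling (max over views), as in MVCNN.
    original = view_feats.max(axis=0)
    # Self-attention across views: each view attends to every view.
    scores = view_feats @ view_feats.T / np.sqrt(D)   # (V, V)
    attended = softmax(scores, axis=-1) @ view_feats  # (V, D)
    view_attentive = attended.max(axis=0)
    # Instance-level branch: one scalar weight per view, then a weighted sum.
    weights = softmax((view_feats * original).sum(axis=1))  # (V,)
    instance_attentive = weights @ view_feats
    return np.concatenate([original, view_attentive, instance_attentive])
```

For example, 12 views with 8-dim features yield a 24-dim descriptor; the real model would operate on learned projections of deep CNN features rather than raw arrays.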
Detailed description
Author: Lin, Dongyun [author]
Format: Electronic article
Language: English
Published: 2022 (transfer abstract)
Subject headings: Cosine distance triplet-center loss
Parent work: Contained in: Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea - Wang, Jiliang ELSEVIER, 2018, Amsterdam [u.a.]
Parent work: volume:247 ; year:2022 ; day:8 ; month:07 ; pages:0
Links:
DOI / URN: 10.1016/j.knosys.2022.108754
Catalog ID: ELV057662800
LEADER 01000caa a22002652 4500
001 ELV057662800
003 DE-627
005 20230626045522.0
007 cr uuu---uuuuu
008 220808s2022 xx |||||o 00| ||eng c
024 7_ |a 10.1016/j.knosys.2022.108754 |2 doi
028 52 |a /cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001768.pica
035 __ |a (DE-627)ELV057662800
035 __ |a (ELSEVIER)S0950-7051(22)00354-9
040 __ |a DE-627 |b ger |c DE-627 |e rakwb
041 __ |a eng
082 04 |a 550 |q VZ
084 __ |a 38.00 |2 bkl
100 1_ |a Lin, Dongyun |e verfasserin |4 aut
245 10 |a Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features
264 _1 |c 2022transfer abstract
336 __ |a nicht spezifiziert |b zzz |2 rdacontent
337 __ |a nicht spezifiziert |b z |2 rdamedia
338 __ |a nicht spezifiziert |b zu |2 rdacarrier
520 __ |a In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods.
650 _7 |a Cosine distance triplet-center loss |2 Elsevier
650 _7 |a View attention module |2 Elsevier
650 _7 |a View-based 3D object retrieval |2 Elsevier
650 _7 |a ArcFace loss |2 Elsevier
650 _7 |a Instance attention module |2 Elsevier
700 1_ |a Li, Yiqun |4 oth
700 1_ |a Cheng, Yi |4 oth
700 1_ |a Prasad, Shitala |4 oth
700 1_ |a Nwe, Tin Lay |4 oth
700 1_ |a Dong, Sheng |4 oth
700 1_ |a Guo, Aiyuan |4 oth
773 08 |i Enthalten in |n Elsevier Science |a Wang, Jiliang ELSEVIER |t Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |d 2018 |g Amsterdam [u.a.] |w (DE-627)ELV001104926
773 18 |g volume:247 |g year:2022 |g day:8 |g month:07 |g pages:0
856 40 |u https://doi.org/10.1016/j.knosys.2022.108754 |3 Volltext
912 __ |a GBV_USEFLAG_U
912 __ |a GBV_ELV
912 __ |a SYSFLAG_U
912 __ |a SSG-OPC-GGO
936 bk |a 38.00 |j Geowissenschaften: Allgemeines |q VZ
951 __ |a AR
952 __ |d 247 |j 2022 |b 8 |c 0708 |h 0
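The abstract trains the model with the ArcFace loss plus a cosine-distance-based triplet-center loss, so that the training objective matches the cosine-distance ranking used at retrieval time. Below is a minimal sketch of that second component only, under stated assumptions: the function name `cosine_triplet_center_loss`, the margin value, and the "nearest wrong center" negative-selection rule are illustrative choices, not the authors' exact formulation.

```python
import numpy as np

def l2norm(x, axis=-1):
    # Normalize rows to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def cosine_triplet_center_loss(feats, labels, centers, margin=0.1):
    """Cosine-distance triplet-center loss (sketch).

    Pulls each feature toward its own class center and pushes it away from
    the nearest other center, with distance d(a, b) = 1 - cos(a, b).
    feats: (N, D), labels: (N,) int class ids, centers: (C, D).
    """
    f = l2norm(feats)
    c = l2norm(centers)
    dist = 1.0 - f @ c.T             # (N, C) cosine distances to all centers
    n = np.arange(len(labels))
    pos = dist[n, labels]            # distance to own class center
    neg = dist.copy()
    neg[n, labels] = np.inf          # mask out the positive center
    nearest_neg = neg.min(axis=1)    # distance to the closest wrong center
    # Hinge: positive distance + margin should not exceed negative distance.
    return np.maximum(pos + margin - nearest_neg, 0.0).mean()
```

Because both the loss and the retrieval ranking operate on cosine distance, features that satisfy the training margin are directly well-separated under the test-time metric; that consistency is the point the abstract makes.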
illustrated |
Not Illustrated |
topic_title |
550 VZ 38.00 bkl Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features Cosine distance triplet-center loss Elsevier View attention module Elsevier View-based 3D object retrieval Elsevier ArcFace loss Elsevier Instance attention module Elsevier |
topic |
ddc 550 bkl 38.00 Elsevier Cosine distance triplet-center loss Elsevier View attention module Elsevier View-based 3D object retrieval Elsevier ArcFace loss Elsevier Instance attention module |
topic_unstemmed |
ddc 550 bkl 38.00 Elsevier Cosine distance triplet-center loss Elsevier View attention module Elsevier View-based 3D object retrieval Elsevier ArcFace loss Elsevier Instance attention module |
topic_browse |
ddc 550 bkl 38.00 Elsevier Cosine distance triplet-center loss Elsevier View attention module Elsevier View-based 3D object retrieval Elsevier ArcFace loss Elsevier Instance attention module |
format_facet |
Elektronische Aufsätze Aufsätze Elektronische Ressource |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
zu |
author2_variant |
y l yl y c yc s p sp t l n tl tln s d sd a g ag |
hierarchy_parent_title |
Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |
hierarchy_parent_id |
ELV001104926 |
dewey-tens |
550 - Earth sciences & geology |
hierarchy_top_title |
Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)ELV001104926 |
title |
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features |
ctrlnum |
(DE-627)ELV057662800 (ELSEVIER)S0950-7051(22)00354-9 |
title_full |
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features |
author_sort |
Lin, Dongyun |
journal |
Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |
journalStr |
Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China Sea |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
500 - Science |
recordtype |
marc |
publishDateSort |
2022 |
contenttype_str_mv |
zzz |
container_start_page |
0 |
author_browse |
Lin, Dongyun |
container_volume |
247 |
class |
550 VZ 38.00 bkl |
format_se |
Elektronische Aufsätze |
author-letter |
Lin, Dongyun |
doi_str_mv |
10.1016/j.knosys.2022.108754 |
dewey-full |
550 |
title_sort |
multi-view 3d object retrieval leveraging the aggregation of view and instance attentive features |
title_auth |
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features |
abstract |
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. |
abstractGer |
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. |
abstract_unstemmed |
In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods. |
collection_details |
GBV_USEFLAG_U GBV_ELV SYSFLAG_U SSG-OPC-GGO |
title_short |
Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features |
url |
https://doi.org/10.1016/j.knosys.2022.108754 |
remote_bool |
true |
author2 |
Li, Yiqun Cheng, Yi Prasad, Shitala Nwe, Tin Lay Dong, Sheng Guo, Aiyuan |
author2Str |
Li, Yiqun Cheng, Yi Prasad, Shitala Nwe, Tin Lay Dong, Sheng Guo, Aiyuan |
ppnlink |
ELV001104926 |
mediatype_str_mv |
z |
isOA_txt |
false |
hochschulschrift_bool |
false |
author2_role |
oth oth oth oth oth oth |
doi_str |
10.1016/j.knosys.2022.108754 |
up_date |
2024-07-06T16:47:58.917Z |
_version_ |
1803849017765920768 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">ELV057662800</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230626045522.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">220808s2022 xx |||||o 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1016/j.knosys.2022.108754</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">/cbs_pica/cbs_olc/import_discovery/elsevier/einzuspielen/GBV00000000001768.pica</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)ELV057662800</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(ELSEVIER)S0950-7051(22)00354-9</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">550</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">38.00</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Lin, Dongyun</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Multi-view 3D object retrieval leveraging the aggregation of view and instance attentive features</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2022transfer abstract</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield 
code="b">zzz</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">z</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">nicht spezifiziert</subfield><subfield code="b">zu</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. 
Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods.</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">In multi-view 3D object retrieval tasks, it is pivotal to aggregate visual features extracted from multiple view images to generate a discriminative representation for a 3D object. The existing multi-view convolutional neural network employs view pooling for feature aggregation, which ignores the local view-relevant discriminative information within each view image and the global correlative information across all view images. To leverage both types of information, we propose two self-attention modules, namely, View Attention Module and Instance Attention Module, to learn view and instance attentive features, respectively. The final representation of a 3D object is the aggregation of three features: original, view-attentive, and instance-attentive. Furthermore, we propose employing the ArcFace loss together with the cosine-distance-based triplet-center loss as the metric learning guidance to train our model. As the cosine distance is used to rank the retrieval results, our angular metric learning losses achieve a consistent objective between the training and testing processes, thereby facilitating discriminative feature learning. 
Extensive experiments and ablation studies are conducted on four publicly available datasets on 3D object retrieval to show the superiority of the proposed method over multiple state-of-the-art methods.</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Cosine distance triplet-center loss</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">View attention module</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">View-based 3D object retrieval</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">ArcFace loss</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="650" ind1=" " ind2="7"><subfield code="a">Instance attention module</subfield><subfield code="2">Elsevier</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Li, Yiqun</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Cheng, Yi</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Prasad, Shitala</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Nwe, Tin Lay</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Dong, Sheng</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Guo, Aiyuan</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="n">Elsevier Science</subfield><subfield code="a">Wang, Jiliang ELSEVIER</subfield><subfield code="t">Subsurface fluid flow at an active cold seep area in the Qiongdongnan Basin, northern South China 
Sea</subfield><subfield code="d">2018</subfield><subfield code="g">Amsterdam [u.a.]</subfield><subfield code="w">(DE-627)ELV001104926</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:247</subfield><subfield code="g">year:2022</subfield><subfield code="g">day:8</subfield><subfield code="g">month:07</subfield><subfield code="g">pages:0</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://doi.org/10.1016/j.knosys.2022.108754</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ELV</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_U</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-GGO</subfield></datafield><datafield tag="936" ind1="b" ind2="k"><subfield code="a">38.00</subfield><subfield code="j">Geowissenschaften: Allgemeines</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">247</subfield><subfield code="j">2022</subfield><subfield code="b">8</subfield><subfield code="c">0708</subfield><subfield code="h">0</subfield></datafield></record></collection>
|