Large-Margin Multi-Modal Deep Learning for RGB-D Object Recognition
Most existing feature learning-based methods for RGB-D object recognition either combine RGB and depth data in an undifferentiated manner from the outset, or learn features from color and depth separately, and thus do not adequately exploit the different characteristics of the two modalities or the shared relationship between them. In this paper, we propose a general CNN-based multi-modal learning framework for RGB-D object recognition. We first construct deep CNN layers for color and depth separately, and then connect them with a carefully designed multi-modal layer. This layer is designed not only to discover the most discriminative features for each modality, but also to harness the complementary relationship between the two modalities. The results of the multi-modal layer are back-propagated to update the parameters of the CNN layers, and the multi-modal feature learning and back-propagation are performed iteratively until convergence. Experimental results on two widely used RGB-D object datasets show that our method for general multi-modal learning achieves performance comparable to state-of-the-art methods specifically designed for RGB-D data.
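The abstract describes a two-stream architecture: one CNN per modality, joined by a shared multi-modal layer whose gradients flow back into both streams. The PyTorch sketch below illustrates that structure under assumed settings; the branch depths, layer sizes, the fusion design, and the use of nn.MultiMarginLoss as the large-margin objective are illustrative choices for this example, not the authors' published configuration.

```python
# Minimal sketch of the two-stream idea: separate CNNs for RGB and depth,
# fused by a shared multi-modal layer, trained with a multi-class hinge
# (large-margin) loss. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class StreamCNN(nn.Module):
    """One modality-specific convolutional branch (RGB or depth)."""
    def __init__(self, in_channels: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )

    def forward(self, x):
        return torch.flatten(self.features(x), start_dim=1)

class MultiModalNet(nn.Module):
    """Two branches whose features meet in a shared multi-modal layer."""
    def __init__(self, num_classes: int, feat_dim: int = 64 * 16 * 16):
        super().__init__()
        self.rgb_branch = StreamCNN(in_channels=3)    # color stream
        self.depth_branch = StreamCNN(in_channels=1)  # depth stream
        # The fusion layer sees both modalities at once, so its gradients
        # propagate back into both CNNs -- the iterative joint update the
        # abstract describes.
        self.fusion = nn.Linear(2 * feat_dim, 256)
        self.classifier = nn.Linear(256, num_classes)

    def forward(self, rgb, depth):
        joint = torch.cat([self.rgb_branch(rgb), self.depth_branch(depth)], dim=1)
        return self.classifier(torch.relu(self.fusion(joint)))

model = MultiModalNet(num_classes=51)        # e.g. 51 object categories
criterion = nn.MultiMarginLoss()             # multi-class hinge = large margin
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)

rgb = torch.randn(4, 3, 64, 64)              # dummy batch: 4 RGB crops
depth = torch.randn(4, 1, 64, 64)            # matching depth maps
labels = torch.randint(0, 51, (4,))

loss = criterion(model(rgb, depth), labels)  # forward through both streams
loss.backward()                              # back-propagate through fusion into both CNNs
optimizer.step()
```

The hinge loss is what makes the objective "large-margin": it penalizes any wrong class whose score comes within a margin of the true class, rather than just maximizing the correct-class likelihood.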
Full Description

Author: Wang, Anran [author]
Other authors: Lu, Jiwen; Cai, Jianfei; Cham, Tat-Jen; Wang, Gang
Format: Article
Language: English
Published: 2015
Subjects: Labeling; large-margin feature learning; Neural networks; Image color analysis; Feature extraction; Deep learning; Machine learning; Correlation; multi-modality; RGB-D object recognition; Object recognition
Classification: RVK ST 325; BK 54.87
Contained in: IEEE transactions on multimedia - New York, NY : Institute of Electrical and Electronics Engineers, 1999, 17(2015), 11, pages 1887-1898
Contained in: volume:17 ; year:2015 ; number:11 ; pages:1887-1898
DOI: 10.1109/TMM.2015.2476655
Catalog ID: OLC1960756982
LEADER 01000caa a2200265 4500
001    OLC1960756982
003    DE-627
005    20220216154753.0
007    tu
008    160206s2015 xx ||||| 00| ||eng c
024 7_ |a 10.1109/TMM.2015.2476655 |2 doi
028 52 |a PQ20160617
035 __ |a (DE-627)OLC1960756982
035 __ |a (DE-599)GBVOLC1960756982
035 __ |a (PRQ)c1583-9274c5a0185e9e73bb41f66c9ca707a56920b2057ad41335211a7d1aa777a5490
035 __ |a (KEY)0381447520150000017001101887largemarginmultimodaldeeplearningforrgbdobjectreco
040 __ |a DE-627 |b ger |c DE-627 |e rakwb
041 __ |a eng
082 04 |a 004 |q DNB
084 __ |a ST 325: |q AVZ |2 rvk
084 __ |a 54.87 |2 bkl
100 1_ |a Wang, Anran |e verfasserin |4 aut
245 10 |a Large-Margin Multi-Modal Deep Learning for RGB-D Object Recognition
264 _1 |c 2015
336 __ |a Text |b txt |2 rdacontent
337 __ |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338 __ |a Band |b nc |2 rdacarrier
520 __ |a Most existing feature learning-based methods for RGB-D object recognition either combine RGB and depth data in an undifferentiated manner from the outset, or learn features from color and depth separately, and thus do not adequately exploit the different characteristics of the two modalities or the shared relationship between them. In this paper, we propose a general CNN-based multi-modal learning framework for RGB-D object recognition. We first construct deep CNN layers for color and depth separately, and then connect them with a carefully designed multi-modal layer. This layer is designed not only to discover the most discriminative features for each modality, but also to harness the complementary relationship between the two modalities. The results of the multi-modal layer are back-propagated to update the parameters of the CNN layers, and the multi-modal feature learning and back-propagation are performed iteratively until convergence. Experimental results on two widely used RGB-D object datasets show that our method for general multi-modal learning achieves performance comparable to state-of-the-art methods specifically designed for RGB-D data.
650 _4 |a Labeling
650 _4 |a large-margin feature learning
650 _4 |a Neural networks
650 _4 |a Image color analysis
650 _4 |a Feature extraction
650 _4 |a Deep learning
650 _4 |a Machine learning
650 _4 |a Correlation
650 _4 |a multi-modality
650 _4 |a RGB-D object recognition
650 _4 |a Object recognition
700 1_ |a Lu, Jiwen |4 oth
700 1_ |a Cai, Jianfei |4 oth
700 1_ |a Cham, Tat-Jen |4 oth
700 1_ |a Wang, Gang |4 oth
773 08 |i Enthalten in |t IEEE transactions on multimedia |d New York, NY : Institute of Electrical and Electronics Engineers, 1999 |g 17(2015), 11, Seite 1887-1898 |w (DE-627)266019404 |w (DE-600)1467073-2 |w (DE-576)074960644 |x 1520-9210 |7 nnns
773 18 |g volume:17 |g year:2015 |g number:11 |g pages:1887-1898
856 41 |u http://dx.doi.org/10.1109/TMM.2015.2476655 |3 Volltext
856 42 |u http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=7258382
856 42 |u http://search.proquest.com/docview/1729395920
912 __ |a GBV_USEFLAG_A
912 __ |a SYSFLAG_A
912 __ |a GBV_OLC
912 __ |a SSG-OLC-MAT
912 __ |a GBV_ILN_70
912 __ |a GBV_ILN_4318
936 rv |a ST 325:
936 bk |a 54.87 |q AVZ
951 __ |a AR
952 __ |d 17 |j 2015 |e 11 |h 1887-1898
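Discovery systems typically serialize the record above as MARCXML (namespace http://www.loc.gov/MARC21/slim). As a minimal, stdlib-only sketch of reading such a serialization, the following extracts the title, DOI, and author; the abbreviated XML snippet and the subfields helper are constructs of this example, not part of any catalog API.

```python
# Pull selected datafield/subfield values out of a MARCXML record using
# only the Python standard library. The snippet keeps just three of the
# record's datafields for illustration.
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/MARC21/slim"}

xml_snippet = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="024" ind1="7" ind2=" ">
    <subfield code="a">10.1109/TMM.2015.2476655</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Wang, Anran</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Large-Margin Multi-Modal Deep Learning for RGB-D Object Recognition</subfield>
  </datafield>
</record>"""

record = ET.fromstring(xml_snippet)

def subfields(tag: str, code: str):
    """All values of subfield `code` in datafields with the given tag."""
    return [
        sf.text
        for df in record.findall(f"m:datafield[@tag='{tag}']", NS)
        for sf in df.findall(f"m:subfield[@code='{code}']", NS)
    ]

print("Title: ", subfields("245", "a")[0])
print("DOI:   ", subfields("024", "a")[0])
print("Author:", subfields("100", "a")[0])
```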