Class-dependent and cross-modal memory network considering sentimental features for video-based captioning
The video-based commonsense captioning task aims to add multiple commonsense descriptions to video captions to understand video content better. This paper aims to consider the importance of cross-modal mapping. We propose a combined framework called Class-dependent and Cross-modal Memory Network considering SENtimental features (CCMN-SEN) for Video-based Captioning to enhance commonsense caption generation. Firstly, we develop class-dependent memory for recording the alignment between video features and text. It only allows cross-modal interactions and generation on cross-modal matrices that share the same labels. Then, to understand the sentiments conveyed in the videos and generate accurate captions, we add sentiment features to facilitate commonsense caption generation. Experiment results demonstrate that our proposed CCMN-SEN significantly outperforms the state-of-the-art methods. These results have practical significance for understanding video content better.
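The abstract's core mechanism is a label gate: cross-modal interaction is permitted only between video and text entries that carry the same class label. As a rough illustration of that idea only (a minimal sketch, not the authors' published implementation; all names, shapes, and the simplification to a single attention step without the memory or sentiment branch are assumptions), here is a small PyTorch example:

```python
# Minimal sketch of label-gated cross-modal attention, assuming per-feature
# class labels for both modalities. Purely illustrative; CCMN-SEN's actual
# class-dependent memory and sentiment features are not reproduced here.
import torch

def class_dependent_attention(video_feats, text_feats, video_labels, text_labels):
    """video_feats: (Nv, d); text_feats: (Nt, d); *_labels: (Nv,), (Nt,) int class ids."""
    d = video_feats.shape[-1]
    scores = video_feats @ text_feats.T / d**0.5                 # (Nv, Nt) similarities
    same_class = video_labels[:, None] == text_labels[None, :]   # (Nv, Nt) gate mask
    scores = scores.masked_fill(~same_class, float("-inf"))      # block cross-class pairs
    attn = torch.softmax(scores, dim=-1)
    attn = torch.nan_to_num(attn)  # rows with no same-class partner attend to nothing
    return attn @ text_feats       # text-aligned video representation

# Toy usage: two video segments, three caption tokens; labels gate the alignment.
v = torch.randn(2, 8); t = torch.randn(3, 8)
out = class_dependent_attention(v, t, torch.tensor([0, 1]), torch.tensor([0, 0, 1]))
print(out.shape)  # torch.Size([2, 8])
```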
Detailed description

| Field | Value |
| --- | --- |
| Author(s) | Haitao Xiong [author]; Yuchen Zhou [author]; Jiaming Liu [author]; Yuanyuan Cai [author] |
| Format | E-article |
| Language | English |
| Published | 2023 |
| Subjects | cross-modal mapping; cross-modal memory network; commonsense caption; cross-modal matrices; sentimental features; class-dependent memory; Psychology |
| Parent work | In: Frontiers in Psychology - Frontiers Media S.A., 2010, 14(2023) |
| Parent work | volume:14 ; year:2023 |
| Links | https://doi.org/10.3389/fpsyg.2023.1124369 ; https://doaj.org/article/3c917efce26146ea8ddcdc2f5c71569c ; https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1124369/full (all open access) |
| DOI / URN | 10.3389/fpsyg.2023.1124369 |
| Catalog ID | DOAJ080415024 |
LEADER 01000naa a22002652 4500
001    DOAJ080415024
003    DE-627
005    20230310191027.0
007    cr uuu---uuuuu
008    230310s2023 xx |||||o 00| ||eng c
024 7# |a 10.3389/fpsyg.2023.1124369 |2 doi
035 ## |a (DE-627)DOAJ080415024
035 ## |a (DE-599)DOAJ3c917efce26146ea8ddcdc2f5c71569c
040 ## |a DE-627 |b ger |c DE-627 |e rakwb
041 ## |a eng
050 #0 |a BF1-990
100 0# |a Haitao Xiong |e verfasserin |4 aut
245 10 |a Class-dependent and cross-modal memory network considering sentimental features for video-based captioning
264 #1 |c 2023
336 ## |a Text |b txt |2 rdacontent
337 ## |a Computermedien |b c |2 rdamedia
338 ## |a Online-Ressource |b cr |2 rdacarrier
520 ## |a The video-based commonsense captioning task aims to add multiple commonsense descriptions to video captions to understand video content better. This paper aims to consider the importance of cross-modal mapping. We propose a combined framework called Class-dependent and Cross-modal Memory Network considering SENtimental features (CCMN-SEN) for Video-based Captioning to enhance commonsense caption generation. Firstly, we develop class-dependent memory for recording the alignment between video features and text. It only allows cross-modal interactions and generation on cross-modal matrices that share the same labels. Then, to understand the sentiments conveyed in the videos and generate accurate captions, we add sentiment features to facilitate commonsense caption generation. Experiment results demonstrate that our proposed CCMN-SEN significantly outperforms the state-of-the-art methods. These results have practical significance for understanding video content better.
650 #4 |a cross-modal mapping
650 #4 |a cross-modal memory network
650 #4 |a commonsense caption
650 #4 |a cross-modal matrices
650 #4 |a sentimental features
650 #4 |a class-dependent memory
653 #0 |a Psychology
700 0# |a Yuchen Zhou |e verfasserin |4 aut
700 0# |a Jiaming Liu |e verfasserin |4 aut
700 0# |a Yuanyuan Cai |e verfasserin |4 aut
773 08 |i In |t Frontiers in Psychology |d Frontiers Media S.A., 2010 |g 14(2023) |w (DE-627)631495711 |w (DE-600)2563826-9 |x 16641078 |7 nnns
773 18 |g volume:14 |g year:2023
856 40 |u https://doi.org/10.3389/fpsyg.2023.1124369 |z kostenfrei
856 40 |u https://doaj.org/article/3c917efce26146ea8ddcdc2f5c71569c |z kostenfrei
856 40 |u https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1124369/full |z kostenfrei
856 42 |u https://doaj.org/toc/1664-1078 |y Journal toc |z kostenfrei
912 ## |a GBV_USEFLAG_A
912 ## |a SYSFLAG_A
912 ## |a GBV_DOAJ
912 ## |a GBV_ILN_11
912 ## |a GBV_ILN_20
912 ## |a GBV_ILN_22
912 ## |a GBV_ILN_23
912 ## |a GBV_ILN_24
912 ## |a GBV_ILN_31
912 ## |a GBV_ILN_32
912 ## |a GBV_ILN_39
912 ## |a GBV_ILN_40
912 ## |a GBV_ILN_60
912 ## |a GBV_ILN_62
912 ## |a GBV_ILN_63
912 ## |a GBV_ILN_65
912 ## |a GBV_ILN_69
912 ## |a GBV_ILN_70
912 ## |a GBV_ILN_73
912 ## |a GBV_ILN_74
912 ## |a GBV_ILN_90
912 ## |a GBV_ILN_95
912 ## |a GBV_ILN_100
912 ## |a GBV_ILN_101
912 ## |a GBV_ILN_105
912 ## |a GBV_ILN_110
912 ## |a GBV_ILN_138
912 ## |a GBV_ILN_151
912 ## |a GBV_ILN_152
912 ## |a GBV_ILN_161
912 ## |a GBV_ILN_187
912 ## |a GBV_ILN_206
912 ## |a GBV_ILN_213
912 ## |a GBV_ILN_230
912 ## |a GBV_ILN_250
912 ## |a GBV_ILN_281
912 ## |a GBV_ILN_285
912 ## |a GBV_ILN_293
912 ## |a GBV_ILN_602
912 ## |a GBV_ILN_647
912 ## |a GBV_ILN_702
912 ## |a GBV_ILN_2003
912 ## |a GBV_ILN_2009
912 ## |a GBV_ILN_2014
912 ## |a GBV_ILN_2086
912 ## |a GBV_ILN_4012
912 ## |a GBV_ILN_4037
912 ## |a GBV_ILN_4112
912 ## |a GBV_ILN_4125
912 ## |a GBV_ILN_4126
912 ## |a GBV_ILN_4249
912 ## |a GBV_ILN_4305
912 ## |a GBV_ILN_4306
912 ## |a GBV_ILN_4307
912 ## |a GBV_ILN_4313
912 ## |a GBV_ILN_4322
912 ## |a GBV_ILN_4323
912 ## |a GBV_ILN_4324
912 ## |a GBV_ILN_4325
912 ## |a GBV_ILN_4326
912 ## |a GBV_ILN_4335
912 ## |a GBV_ILN_4338
912 ## |a GBV_ILN_4367
912 ## |a GBV_ILN_4700
951 ## |a AR
952 ## |d 14 |j 2023
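The MARC21 record above is what a client retrieves from the catalog. As a minimal sketch of reading such a record programmatically with the pymarc library (assuming the record has been saved locally as MARCXML; the file name is hypothetical):

```python
# Minimal sketch: parse a MARCXML export of this record with pymarc
# (pip install pymarc). "DOAJ080415024.xml" is a hypothetical local file.
from pymarc import parse_xml_to_array

record = parse_xml_to_array("DOAJ080415024.xml")[0]

print(record["245"]["a"])     # title (field 245 $a)
print(record["024"]["a"])     # DOI (field 024 $a)
for field in record.get_fields("856"):
    print(field["u"])         # access URLs (field 856 $u)
```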
author_variant |
h x hx y z yz j l jl y c yc
matchkey_str |
article:16641078:2023----::lsdpnetncosoammrntokosdrnsnietletr |
hierarchy_sort_str |
2023 |
callnumber-subject-code |
BF |
publishDate |
2023 |
language |
English |
source |
In Frontiers in Psychology 14(2023) volume:14 year:2023 |
sourceStr |
In Frontiers in Psychology 14(2023) volume:14 year:2023 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
cross-modal mapping cross-modal memory network commonsense caption cross-modal matrices sentimental features class-dependent memory Psychology |
isfreeaccess_bool |
true |
container_title |
Frontiers in Psychology |
authorswithroles_txt_mv |
Haitao Xiong @@aut@@ Yuchen Zhou @@aut@@ Jiaming Liu @@aut@@ Yuanyuan Cai @@aut@@ |
publishDateDaySort_date |
2023-01-01T00:00:00Z |
hierarchy_top_id |
631495711 |
id |
DOAJ080415024 |
language_de |
englisch |
callnumber-first |
B - Philosophy, Psychology, Religion |
author |
Haitao Xiong |
spellingShingle |
Haitao Xiong misc BF1-990 misc cross-modal mapping misc cross-modal memory network misc commonsense caption misc cross-modal matrices misc sentimental features misc class-dependent memory misc Psychology Class-dependent and cross-modal memory network considering sentimental features for video-based captioning |
authorStr |
Haitao Xiong |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)631495711 |
format |
electronic Article |
delete_txt_mv |
keep |
author_role |
aut aut aut aut
collection |
DOAJ |
remote_str |
true |
callnumber-label |
BF1-990 |
illustrated |
Not Illustrated |
issn |
16641078 |
topic_title |
BF1-990 Class-dependent and cross-modal memory network considering sentimental features for video-based captioning cross-modal mapping cross-modal memory network commonsense caption cross-modal matrices sentimental features class-dependent memory |
topic |
misc BF1-990 misc cross-modal mapping misc cross-modal memory network misc commonsense caption misc cross-modal matrices misc sentimental features misc class-dependent memory misc Psychology |
format_facet |
Elektronische Aufsätze Aufsätze Elektronische Ressource |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
cr |
hierarchy_parent_title |
Frontiers in Psychology |
hierarchy_parent_id |
631495711 |
hierarchy_top_title |
Frontiers in Psychology |
isfreeaccess_txt |
true |
familylinks_str_mv |
(DE-627)631495711 (DE-600)2563826-9 |
title |
Class-dependent and cross-modal memory network considering sentimental features for video-based captioning |
ctrlnum |
(DE-627)DOAJ080415024 (DE-599)DOAJ3c917efce26146ea8ddcdc2f5c71569c |
title_full |
Class-dependent and cross-modal memory network considering sentimental features for video-based captioning |
author_sort |
Haitao Xiong |
journal |
Frontiers in Psychology |
journalStr |
Frontiers in Psychology |
callnumber-first-code |
B |
lang_code |
eng |
isOA_bool |
true |
recordtype |
marc |
publishDateSort |
2023 |
contenttype_str_mv |
txt |
author_browse |
Haitao Xiong Yuchen Zhou Jiaming Liu Yuanyuan Cai |
container_volume |
14 |
class |
BF1-990 |
format_se |
Elektronische Aufsätze |
author-letter |
Haitao Xiong |
doi_str_mv |
10.3389/fpsyg.2023.1124369 |
author2-role |
verfasserin |
title_sort |
class-dependent and cross-modal memory network considering sentimental features for video-based captioning |
callnumber |
BF1-990 |
title_auth |
Class-dependent and cross-modal memory network considering sentimental features for video-based captioning |
abstract |
The video-based commonsense captioning task aims to add multiple commonsense descriptions to video captions to understand video content better. This paper aims to consider the importance of cross-modal mapping. We propose a combined framework called Class-dependent and Cross-modal Memory Network considering SENtimental features (CCMN-SEN) for Video-based Captioning to enhance commonsense caption generation. Firstly, we develop class-dependent memory for recording the alignment between video features and text. It only allows cross-modal interactions and generation on cross-modal matrices that share the same labels. Then, to understand the sentiments conveyed in the videos and generate accurate captions, we add sentiment features to facilitate commonsense caption generation. Experiment results demonstrate that our proposed CCMN-SEN significantly outperforms the state-of-the-art methods. These results have practical significance for understanding video content better. |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_DOAJ GBV_ILN_11 GBV_ILN_20 GBV_ILN_22 GBV_ILN_23 GBV_ILN_24 GBV_ILN_31 GBV_ILN_32 GBV_ILN_39 GBV_ILN_40 GBV_ILN_60 GBV_ILN_62 GBV_ILN_63 GBV_ILN_65 GBV_ILN_69 GBV_ILN_70 GBV_ILN_73 GBV_ILN_74 GBV_ILN_90 GBV_ILN_95 GBV_ILN_100 GBV_ILN_101 GBV_ILN_105 GBV_ILN_110 GBV_ILN_138 GBV_ILN_151 GBV_ILN_152 GBV_ILN_161 GBV_ILN_187 GBV_ILN_206 GBV_ILN_213 GBV_ILN_230 GBV_ILN_250 GBV_ILN_281 GBV_ILN_285 GBV_ILN_293 GBV_ILN_602 GBV_ILN_647 GBV_ILN_702 GBV_ILN_2003 GBV_ILN_2009 GBV_ILN_2014 GBV_ILN_2086 GBV_ILN_4012 GBV_ILN_4037 GBV_ILN_4112 GBV_ILN_4125 GBV_ILN_4126 GBV_ILN_4249 GBV_ILN_4305 GBV_ILN_4306 GBV_ILN_4307 GBV_ILN_4313 GBV_ILN_4322 GBV_ILN_4323 GBV_ILN_4324 GBV_ILN_4325 GBV_ILN_4326 GBV_ILN_4335 GBV_ILN_4338 GBV_ILN_4367 GBV_ILN_4700 |
title_short |
Class-dependent and cross-modal memory network considering sentimental features for video-based captioning |
url |
https://doi.org/10.3389/fpsyg.2023.1124369 https://doaj.org/article/3c917efce26146ea8ddcdc2f5c71569c https://www.frontiersin.org/articles/10.3389/fpsyg.2023.1124369/full https://doaj.org/toc/1664-1078 |
remote_bool |
true |
author2 |
Yuchen Zhou Jiaming Liu Yuanyuan Cai |
author2Str |
Yuchen Zhou Jiaming Liu Yuanyuan Cai |
ppnlink |
631495711 |
callnumber-subject |
BF - Psychology |
mediatype_str_mv |
c |
isOA_txt |
true |
hochschulschrift_bool |
false |
doi_str |
10.3389/fpsyg.2023.1124369 |
callnumber-a |
BF1-990 |
up_date |
2024-07-03T14:30:48.363Z |
_version_ |
1803568596508475392 |