Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform
Abstract: Compared with other video semantic cues, such as gestures and motions, video text generally provides highly useful and fairly precise semantic information, whose analysis can greatly facilitate video and scene understanding. Video text can be observed to exhibit stronger edges than its surroundings. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-directional expansion that preserves the edges and silhouettes of text characters well. This paper therefore proposes a new NSCT-based approach to video text detection. First, the 8 directional coefficients of the NSCT are combined to build a directional edge map (DEM), which keeps the horizontal, vertical, and diagonal edge features while suppressing edges in other directions. The directional pixels of the DEM are then integrated into a single binary edge image (BE). Based on the BE, text frame classification determines whether a video frame contains text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate video text from non-text regions. Experimental evaluations on the authors' collected TV video data set demonstrate that the method significantly outperforms three other video text detection algorithms in both detection speed and accuracy, especially under challenges such as text of varying sizes, languages, colors, fonts, and line lengths.
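The pipeline described in the abstract (directional edge maps fused into a binary edge image, then edge-density frame classification) can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: four simple directional gradients substitute for the 8 NSCT directional subbands, and all function names and thresholds are assumptions made for the sketch.

```python
import numpy as np

def directional_edge_maps(gray):
    """Approximate directional edge responses (stand-in for NSCT subbands).

    Returns absolute gradients along the horizontal, vertical, and the two
    diagonal directions, mirroring the DEM's retained edge orientations.
    """
    g = gray.astype(float)
    h  = np.abs(np.roll(g, -1, axis=1) - g)                        # horizontal
    v  = np.abs(np.roll(g, -1, axis=0) - g)                        # vertical
    d1 = np.abs(np.roll(np.roll(g, -1, axis=0), -1, axis=1) - g)   # diagonal
    d2 = np.abs(np.roll(np.roll(g, -1, axis=0),  1, axis=1) - g)   # anti-diagonal
    return [h, v, d1, d2]

def binary_edge_image(gray, thresh=30.0):
    """Fuse the directional maps into one binary edge image (the 'BE')."""
    fused = np.maximum.reduce(directional_edge_maps(gray))
    return (fused > thresh).astype(np.uint8)

def is_text_frame(gray, min_density=0.02):
    """Classify a frame as text-bearing when its BE edge density is high,
    reflecting the observation that superimposed text shows strong edges."""
    return binary_edge_image(gray).mean() > min_density
```

A real implementation would replace `directional_edge_maps` with an actual NSCT decomposition and add the paper's consecutive-frame verification step on top of the per-frame classification.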
Detailed description

Author: Huang, Xiaodong [author]
Format: Article
Language: English
Published: 2017
Subjects: Video text; Nonsubsampled Contourlet Transform; Text detection; Text frame classification
Note: © Springer Science+Business Media New York 2017
Contained in: Multimedia tools and applications - Springer US, 1995, 77(2017), no. 6, 25 March, pages 7033-7049
Contained in: volume:77 ; year:2017 ; number:6 ; day:25 ; month:03 ; pages:7033-7049
DOI / URN: 10.1007/s11042-017-4619-8
Catalog ID: OLC2035045304
MARC record

LEADER 01000caa a22002652 4500
001    OLC2035045304
003    DE-627
005    20230503193442.0
007    tu
008    200819s2017 xx ||||| 00| ||eng c
024 7_ |a 10.1007/s11042-017-4619-8 |2 doi
035 __ |a (DE-627)OLC2035045304
035 __ |a (DE-He213)s11042-017-4619-8-p
040 __ |a DE-627 |b ger |c DE-627 |e rakwb
041 __ |a eng
082 04 |a 070 |a 004 |q VZ
100 1_ |a Huang, Xiaodong |e verfasserin |0 (orcid)0000-0002-7953-750X |4 aut
245 10 |a Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform
264 _1 |c 2017
336 __ |a Text |b txt |2 rdacontent
337 __ |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338 __ |a Band |b nc |2 rdacarrier
500 __ |a © Springer Science+Business Media New York 2017
520 __ |a Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines.
650 _4 |a Video text
650 _4 |a Nonsubsampled Contourlet Transform
650 _4 |a Text detection
650 _4 |a Text frame classification
773 08 |i Enthalten in |t Multimedia tools and applications |d Springer US, 1995 |g 77(2017), 6 vom: 25. März, Seite 7033-7049 |w (DE-627)189064145 |w (DE-600)1287642-2 |w (DE-576)052842126 |x 1380-7501 |7 nnns
773 18 |g volume:77 |g year:2017 |g number:6 |g day:25 |g month:03 |g pages:7033-7049
856 41 |u https://doi.org/10.1007/s11042-017-4619-8 |z lizenzpflichtig |3 Volltext
912 __ |a GBV_USEFLAG_A
912 __ |a SYSFLAG_A
912 __ |a GBV_OLC
912 __ |a SSG-OLC-MAT
912 __ |a SSG-OLC-BUB
912 __ |a SSG-OLC-MKW
912 __ |a GBV_ILN_70
951 __ |a AR
952 __ |d 77 |j 2017 |e 6 |b 25 |c 03 |h 7033-7049
author_variant |
x h xh |
---|---|
matchkey_str |
article:13807501:2017----::uoaivdoueipsdeteetobsdnosbap |
hierarchy_sort_str |
2017 |
publishDate |
2017 |
allfields |
10.1007/s11042-017-4619-8 doi (DE-627)OLC2035045304 (DE-He213)s11042-017-4619-8-p DE-627 ger DE-627 rakwb eng 070 004 VZ Huang, Xiaodong verfasserin (orcid)0000-0002-7953-750X aut Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform 2017 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Springer Science+Business Media New York 2017 Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines. 
Video text Nonsubsampled Contourlet Transform Text detection Text frame classification Enthalten in Multimedia tools and applications Springer US, 1995 77(2017), 6 vom: 25. März, Seite 7033-7049 (DE-627)189064145 (DE-600)1287642-2 (DE-576)052842126 1380-7501 nnns volume:77 year:2017 number:6 day:25 month:03 pages:7033-7049 https://doi.org/10.1007/s11042-017-4619-8 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OLC-BUB SSG-OLC-MKW GBV_ILN_70 AR 77 2017 6 25 03 7033-7049 |
spelling |
10.1007/s11042-017-4619-8 doi (DE-627)OLC2035045304 (DE-He213)s11042-017-4619-8-p DE-627 ger DE-627 rakwb eng 070 004 VZ Huang, Xiaodong verfasserin (orcid)0000-0002-7953-750X aut Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform 2017 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Springer Science+Business Media New York 2017 Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines. 
Video text Nonsubsampled Contourlet Transform Text detection Text frame classification Enthalten in Multimedia tools and applications Springer US, 1995 77(2017), 6 vom: 25. März, Seite 7033-7049 (DE-627)189064145 (DE-600)1287642-2 (DE-576)052842126 1380-7501 nnns volume:77 year:2017 number:6 day:25 month:03 pages:7033-7049 https://doi.org/10.1007/s11042-017-4619-8 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OLC-BUB SSG-OLC-MKW GBV_ILN_70 AR 77 2017 6 25 03 7033-7049 |
allfields_unstemmed |
10.1007/s11042-017-4619-8 doi (DE-627)OLC2035045304 (DE-He213)s11042-017-4619-8-p DE-627 ger DE-627 rakwb eng 070 004 VZ Huang, Xiaodong verfasserin (orcid)0000-0002-7953-750X aut Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform 2017 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Springer Science+Business Media New York 2017 Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines. 
Video text Nonsubsampled Contourlet Transform Text detection Text frame classification Enthalten in Multimedia tools and applications Springer US, 1995 77(2017), 6 vom: 25. März, Seite 7033-7049 (DE-627)189064145 (DE-600)1287642-2 (DE-576)052842126 1380-7501 nnns volume:77 year:2017 number:6 day:25 month:03 pages:7033-7049 https://doi.org/10.1007/s11042-017-4619-8 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OLC-BUB SSG-OLC-MKW GBV_ILN_70 AR 77 2017 6 25 03 7033-7049 |
allfieldsGer |
10.1007/s11042-017-4619-8 doi (DE-627)OLC2035045304 (DE-He213)s11042-017-4619-8-p DE-627 ger DE-627 rakwb eng 070 004 VZ Huang, Xiaodong verfasserin (orcid)0000-0002-7953-750X aut Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform 2017 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Springer Science+Business Media New York 2017 Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines. 
Video text Nonsubsampled Contourlet Transform Text detection Text frame classification Enthalten in Multimedia tools and applications Springer US, 1995 77(2017), 6 vom: 25. März, Seite 7033-7049 (DE-627)189064145 (DE-600)1287642-2 (DE-576)052842126 1380-7501 nnns volume:77 year:2017 number:6 day:25 month:03 pages:7033-7049 https://doi.org/10.1007/s11042-017-4619-8 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OLC-BUB SSG-OLC-MKW GBV_ILN_70 AR 77 2017 6 25 03 7033-7049 |
allfieldsSound |
10.1007/s11042-017-4619-8 doi (DE-627)OLC2035045304 (DE-He213)s11042-017-4619-8-p DE-627 ger DE-627 rakwb eng 070 004 VZ Huang, Xiaodong verfasserin (orcid)0000-0002-7953-750X aut Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform 2017 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Springer Science+Business Media New York 2017 Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines. 
Video text Nonsubsampled Contourlet Transform Text detection Text frame classification Enthalten in Multimedia tools and applications Springer US, 1995 77(2017), 6 vom: 25. März, Seite 7033-7049 (DE-627)189064145 (DE-600)1287642-2 (DE-576)052842126 1380-7501 nnns volume:77 year:2017 number:6 day:25 month:03 pages:7033-7049 https://doi.org/10.1007/s11042-017-4619-8 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OLC-BUB SSG-OLC-MKW GBV_ILN_70 AR 77 2017 6 25 03 7033-7049 |
language |
English |
source |
Enthalten in Multimedia tools and applications 77(2017), 6 vom: 25. März, Seite 7033-7049 volume:77 year:2017 number:6 day:25 month:03 pages:7033-7049 |
sourceStr |
Enthalten in Multimedia tools and applications 77(2017), 6 vom: 25. März, Seite 7033-7049 volume:77 year:2017 number:6 day:25 month:03 pages:7033-7049 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Video text Nonsubsampled Contourlet Transform Text detection Text frame classification |
dewey-raw |
070 |
isfreeaccess_bool |
false |
container_title |
Multimedia tools and applications |
authorswithroles_txt_mv |
Huang, Xiaodong @@aut@@ |
publishDateDaySort_date |
2017-03-25T00:00:00Z |
hierarchy_top_id |
189064145 |
dewey-sort |
270 |
id |
OLC2035045304 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2035045304</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230503193442.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2017 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11042-017-4619-8</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2035045304</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11042-017-4619-8-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">070</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Huang, Xiaodong</subfield><subfield code="e">verfasserin</subfield><subfield code="0">(orcid)0000-0002-7953-750X</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2017</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield 
code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Springer Science+Business Media New York 2017</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. 
Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Video text</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Nonsubsampled Contourlet Transform</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text detection</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text frame classification</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Multimedia tools and applications</subfield><subfield code="d">Springer US, 1995</subfield><subfield code="g">77(2017), 6 vom: 25. März, Seite 7033-7049</subfield><subfield code="w">(DE-627)189064145</subfield><subfield code="w">(DE-600)1287642-2</subfield><subfield code="w">(DE-576)052842126</subfield><subfield code="x">1380-7501</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:77</subfield><subfield code="g">year:2017</subfield><subfield code="g">number:6</subfield><subfield code="g">day:25</subfield><subfield code="g">month:03</subfield><subfield code="g">pages:7033-7049</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11042-017-4619-8</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MKW</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">77</subfield><subfield code="j">2017</subfield><subfield code="e">6</subfield><subfield code="b">25</subfield><subfield code="c">03</subfield><subfield code="h">7033-7049</subfield></datafield></record></collection>
|
author |
Huang, Xiaodong |
spellingShingle |
Huang, Xiaodong ddc 070 misc Video text misc Nonsubsampled Contourlet Transform misc Text detection misc Text frame classification Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform |
authorStr |
Huang, Xiaodong |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)189064145 |
format |
Article |
dewey-ones |
070 - News media, journalism & publishing 004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
1380-7501 |
topic_title |
070 004 VZ Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform Video text Nonsubsampled Contourlet Transform Text detection Text frame classification |
topic |
ddc 070 misc Video text misc Nonsubsampled Contourlet Transform misc Text detection misc Text frame classification |
topic_unstemmed |
ddc 070 misc Video text misc Nonsubsampled Contourlet Transform misc Text detection misc Text frame classification |
topic_browse |
ddc 070 misc Video text misc Nonsubsampled Contourlet Transform misc Text detection misc Text frame classification |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Multimedia tools and applications |
hierarchy_parent_id |
189064145 |
dewey-tens |
070 - News media, journalism & publishing 000 - Computer science, knowledge & systems |
hierarchy_top_title |
Multimedia tools and applications |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)189064145 (DE-600)1287642-2 (DE-576)052842126 |
title |
Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform |
ctrlnum |
(DE-627)OLC2035045304 (DE-He213)s11042-017-4619-8-p |
title_full |
Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform |
author_sort |
Huang, Xiaodong |
journal |
Multimedia tools and applications |
journalStr |
Multimedia tools and applications |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2017 |
contenttype_str_mv |
txt |
container_start_page |
7033 |
author_browse |
Huang, Xiaodong |
container_volume |
77 |
class |
070 004 VZ |
format_se |
Aufsätze |
author-letter |
Huang, Xiaodong |
doi_str_mv |
10.1007/s11042-017-4619-8 |
normlink |
(ORCID)0000-0002-7953-750X |
normlink_prefix_str_mv |
(orcid)0000-0002-7953-750X |
dewey-full |
070 004 |
title_sort |
automatic video superimposed text detection based on nonsubsampled contourlet transform |
title_auth |
Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform |
abstract |
Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines. © Springer Science+Business Media New York 2017 |
abstractGer |
Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines. © Springer Science+Business Media New York 2017 |
abstract_unstemmed |
Abstract Compared with other video semantic clues, such as gestures, motions etc., video text generally provides highly useful and fairly precise semantic information, the analysis of which can to a great extent facilitate video and scene understanding. It can be observed that the video texts show stronger edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-direction expansion, which can preserve the edge/silhouette of the text characters well. Therefore, in this paper, a new approach has been proposed to detect video text based on NSCT. First of all, the 8 directional coefficients of NSCT are combined to build the directional edge map (DEM), which can keep the horizontal, vertical and diagonal edge features and suppress other directional edge features. Then various directional pixels of DEM are integrated into a whole binary image (BE). Based on the BE, text frame classification is carried out to determine whether the video frames contain the text lines. Finally, text detection based on the BE is performed on consecutive frames to discriminate the video text from non-text regions. Experimental evaluations based on our collected TV videos data set demonstrate that our method significantly outperforms the other 3 video text detection algorithms in both detection speed and accuracy, especially when there are challenges such as video text with various sizes, languages, colors, fonts, short or long text lines. © Springer Science+Business Media New York 2017 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OLC-BUB SSG-OLC-MKW GBV_ILN_70 |
container_issue |
6 |
title_short |
Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform |
url |
https://doi.org/10.1007/s11042-017-4619-8 |
remote_bool |
false |
ppnlink |
189064145 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s11042-017-4619-8 |
up_date |
2024-07-03T23:34:50.175Z |
_version_ |
1803602823927037952 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>01000caa a22002652 4500</leader>
    <controlfield tag="001">OLC2035045304</controlfield>
    <controlfield tag="003">DE-627</controlfield>
    <controlfield tag="005">20230503193442.0</controlfield>
    <controlfield tag="007">tu</controlfield>
    <controlfield tag="008">200819s2017 xx ||||| 00| ||eng c</controlfield>
    <datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11042-017-4619-8</subfield><subfield code="2">doi</subfield></datafield>
    <datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2035045304</subfield></datafield>
    <datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11042-017-4619-8-p</subfield></datafield>
    <datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield>
    <datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield>
    <datafield tag="082" ind1="0" ind2="4"><subfield code="a">070</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield>
    <datafield tag="100" ind1="1" ind2=" "><subfield code="a">Huang, Xiaodong</subfield><subfield code="e">verfasserin</subfield><subfield code="0">(orcid)0000-0002-7953-750X</subfield><subfield code="4">aut</subfield></datafield>
    <datafield tag="245" ind1="1" ind2="0"><subfield code="a">Automatic video superimposed text detection based on Nonsubsampled Contourlet Transform</subfield></datafield>
    <datafield tag="264" ind1=" " ind2="1"><subfield code="c">2017</subfield></datafield>
    <datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield>
    <datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield>
    <datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield>
    <datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Springer Science+Business Media New York 2017</subfield></datafield>
    <datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Compared with other video semantic cues, such as gestures and motions, video text generally provides highly useful and fairly precise semantic information, the analysis of which can greatly facilitate video and scene understanding. It can be observed that video text regions exhibit strong edges. The Nonsubsampled Contourlet Transform (NSCT) is a fully shift-invariant, multi-scale, and multi-directional expansion that preserves the edges/silhouettes of text characters well. Therefore, this paper proposes a new NSCT-based approach to video text detection. First, the eight directional coefficients of the NSCT are combined to build a directional edge map (DEM), which preserves horizontal, vertical, and diagonal edge features while suppressing edges in other directions. The directional pixels of the DEM are then integrated into a single binary edge image (BE). Based on the BE, text frame classification is carried out to determine whether a video frame contains text lines. Finally, BE-based text detection is performed on consecutive frames to discriminate video text from non-text regions. Experimental evaluations on our collected TV video data set demonstrate that our method significantly outperforms three other video text detection algorithms in both detection speed and accuracy, especially under challenges such as text of various sizes, languages, colors, and fonts, and both short and long text lines.</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Video text</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Nonsubsampled Contourlet Transform</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text detection</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text frame classification</subfield></datafield>
    <datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Multimedia tools and applications</subfield><subfield code="d">Springer US, 1995</subfield><subfield code="g">77(2017), 6 vom: 25. März, Seite 7033-7049</subfield><subfield code="w">(DE-627)189064145</subfield><subfield code="w">(DE-600)1287642-2</subfield><subfield code="w">(DE-576)052842126</subfield><subfield code="x">1380-7501</subfield><subfield code="7">nnns</subfield></datafield>
    <datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:77</subfield><subfield code="g">year:2017</subfield><subfield code="g">number:6</subfield><subfield code="g">day:25</subfield><subfield code="g">month:03</subfield><subfield code="g">pages:7033-7049</subfield></datafield>
    <datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11042-017-4619-8</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MKW</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield>
    <datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield>
    <datafield tag="952" ind1=" " ind2=" "><subfield code="d">77</subfield><subfield code="j">2017</subfield><subfield code="e">6</subfield><subfield code="b">25</subfield><subfield code="c">03</subfield><subfield code="h">7033-7049</subfield></datafield>
  </record>
</collection>
|
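The pipeline described in the abstract (directional subbands → directional edge map (DEM) → binary edge image (BE)) can be sketched as follows. This is a minimal illustration, not the paper's method: simple 8-direction difference filters stand in for the NSCT directional subband coefficients, and the mean-plus-standard-deviation threshold is an assumption for demonstration only.

```python
import numpy as np

def directional_maps(img):
    """Crude stand-in for the 8 NSCT directional subbands: absolute
    gray-level differences toward the 8 neighbor directions."""
    # Offsets: E, NE, N, NW, W, SW, S, SE
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    img = img.astype(float)
    maps = []
    for dy, dx in offsets:
        shifted = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
        maps.append(np.abs(img - shifted))
    return maps

def directional_edge_map(maps):
    """Combine the directional responses into a single DEM by taking
    the per-pixel maximum over all directions."""
    return np.maximum.reduce(maps)

def binary_edge_image(dem, thresh=None):
    """Integrate the DEM into one binary edge image (BE).
    The adaptive threshold below is an assumption, not from the paper."""
    if thresh is None:
        thresh = dem.mean() + dem.std()
    return (dem > thresh).astype(np.uint8)

# Synthetic frame: a bright block plays the role of a superimposed caption.
frame = np.zeros((20, 20))
frame[8:12, 5:15] = 255
be = binary_edge_image(directional_edge_map(directional_maps(frame)))
# BE marks the block's border pixels; its interior and the background stay 0.
```

A real implementation would replace `directional_maps` with an actual NSCT decomposition and, per the abstract, keep only the horizontal, vertical, and diagonal subbands before building the DEM.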