Vocal-tract length estimation
Abstract Two methods for estimating the vocal-tract length equivalent to the homogeneous acoustic tube length are investigated. One method is based on calculating the tract length from the difference between the frequencies of the adjacent local spectral maxima, which exceed 4 kHz. In the other meth...
Detailed description
Author: |
Sorokin, V. N. [author] |
---|
Format: |
Article |
---|---|
Language: |
English |
Published: |
2013 |
---|
Keywords: |
---|
Note: |
© Pleiades Publishing, Inc. 2013 |
---|
Parent work: |
Contained in: Journal of communications technology and electronics - Springer US, 1993, 58(2013), no. 12 (Dec.), pp. 1292-1301 |
---|---|
Parent work: |
volume:58 ; year:2013 ; number:12 ; month:12 ; pages:1292-1301 |
Links: |
---|
DOI / URN: |
10.1134/S1064226913120164 |
---|
Catalog ID: |
OLC2059681936 |
---|
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2059681936 | ||
003 | DE-627 | ||
005 | 20230302133952.0 | ||
007 | tu | ||
008 | 200819s2013 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1134/S1064226913120164 |2 doi | |
035 | |a (DE-627)OLC2059681936 | ||
035 | |a (DE-He213)S1064226913120164-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 620 |q VZ |
100 | 1 | |a Sorokin, V. N. |e verfasserin |4 aut | |
245 | 1 | 0 | |a Vocal-tract length estimation |
264 | 1 | |c 2013 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © Pleiades Publishing, Inc. 2013 | ||
520 | |a Abstract Two methods for estimating the vocal-tract length equivalent to the homogeneous acoustic tube length are investigated. One method is based on calculating the tract length from the difference between the frequencies of the adjacent local spectral maxima, which exceed 4 kHz. In the other method, the vocal-tract length is calculated according to the average frequency of the second formant determined by the frequencies of first three formants. In addition, various variants of analysis are discussed irrespective of the context and with allowance for known vowels. The probability that the speaker gender is correctly recognized via two methods is about 13%, and its value is almost independent of the knowledge of the context. The probabilities that male and female voices are correctly recognized according to the difference of higher formants are, respectively, 31 and 25.5% regardless of the context and 37 and 31% with allowance for it. The probabilities of correct recognition of male and female voices reach to 27 and 21.5%, respectively, if context-independent recognition is performed from the average frequency of the second formant and 43 and 35.5% after context-dependent recognition with the known vowel type. | ||
650 | 4 | |a vocal-tract length | |
650 | 4 | |a gender recognition | |
650 | 4 | |a and speaker recognition | |
700 | 1 | |a Geras’kin, I. V. |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Journal of communications technology and electronics |d Springer US, 1993 |g 58(2013), 12 vom: Dez., Seite 1292-1301 |w (DE-627)171168402 |w (DE-600)1160383-5 |w (DE-576)038494272 |x 1064-2269 |7 nnns |
773 | 1 | 8 | |g volume:58 |g year:2013 |g number:12 |g month:12 |g pages:1292-1301 |
856 | 4 | 1 | |u https://doi.org/10.1134/S1064226913120164 |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-TEC | ||
912 | |a GBV_ILN_70 | ||
951 | |a AR | ||
952 | |d 58 |j 2013 |e 12 |c 12 |h 1292-1301 |
author_variant |
v n s vn vns i v g iv ivg |
---|---|
matchkey_str |
article:10642269:2013----::oatateghs |
hierarchy_sort_str |
2013 |
publishDate |
2013 |
allfields |
10.1134/S1064226913120164 doi (DE-627)OLC2059681936 (DE-He213)S1064226913120164-p DE-627 ger DE-627 rakwb eng 620 VZ Sorokin, V. N. verfasserin aut Vocal-tract length estimation 2013 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Pleiades Publishing, Inc. 2013 Abstract Two methods for estimating the vocal-tract length equivalent to the homogeneous acoustic tube length are investigated. One method is based on calculating the tract length from the difference between the frequencies of the adjacent local spectral maxima, which exceed 4 kHz. In the other method, the vocal-tract length is calculated according to the average frequency of the second formant determined by the frequencies of first three formants. In addition, various variants of analysis are discussed irrespective of the context and with allowance for known vowels. The probability that the speaker gender is correctly recognized via two methods is about 13%, and its value is almost independent of the knowledge of the context. The probabilities that male and female voices are correctly recognized according to the difference of higher formants are, respectively, 31 and 25.5% regardless of the context and 37 and 31% with allowance for it. The probabilities of correct recognition of male and female voices reach to 27 and 21.5%, respectively, if context-independent recognition is performed from the average frequency of the second formant and 43 and 35.5% after context-dependent recognition with the known vowel type. vocal-tract length gender recognition and speaker recognition Geras’kin, I. V. 
aut Enthalten in Journal of communications technology and electronics Springer US, 1993 58(2013), 12 vom: Dez., Seite 1292-1301 (DE-627)171168402 (DE-600)1160383-5 (DE-576)038494272 1064-2269 nnns volume:58 year:2013 number:12 month:12 pages:1292-1301 https://doi.org/10.1134/S1064226913120164 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC GBV_ILN_70 AR 58 2013 12 12 1292-1301 |
language |
English |
source |
Enthalten in Journal of communications technology and electronics 58(2013), 12 vom: Dez., Seite 1292-1301 volume:58 year:2013 number:12 month:12 pages:1292-1301 |
sourceStr |
Enthalten in Journal of communications technology and electronics 58(2013), 12 vom: Dez., Seite 1292-1301 volume:58 year:2013 number:12 month:12 pages:1292-1301 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
vocal-tract length gender recognition and speaker recognition |
dewey-raw |
620 |
isfreeaccess_bool |
false |
container_title |
Journal of communications technology and electronics |
authorswithroles_txt_mv |
Sorokin, V. N. @@aut@@ Geras’kin, I. V. @@aut@@ |
publishDateDaySort_date |
2013-12-01T00:00:00Z |
hierarchy_top_id |
171168402 |
dewey-sort |
3620 |
id |
OLC2059681936 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2059681936</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230302133952.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2013 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1134/S1064226913120164</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2059681936</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)S1064226913120164-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">620</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Sorokin, V. 
N.</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Vocal-tract length estimation</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2013</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Pleiades Publishing, Inc. 2013</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Two methods for estimating the vocal-tract length equivalent to the homogeneous acoustic tube length are investigated. One method is based on calculating the tract length from the difference between the frequencies of the adjacent local spectral maxima, which exceed 4 kHz. In the other method, the vocal-tract length is calculated according to the average frequency of the second formant determined by the frequencies of first three formants. In addition, various variants of analysis are discussed irrespective of the context and with allowance for known vowels. The probability that the speaker gender is correctly recognized via two methods is about 13%, and its value is almost independent of the knowledge of the context. The probabilities that male and female voices are correctly recognized according to the difference of higher formants are, respectively, 31 and 25.5% regardless of the context and 37 and 31% with allowance for it. 
The probabilities of correct recognition of male and female voices reach to 27 and 21.5%, respectively, if context-independent recognition is performed from the average frequency of the second formant and 43 and 35.5% after context-dependent recognition with the known vowel type.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">vocal-tract length</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">gender recognition</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">and speaker recognition</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Geras’kin, I. V.</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Journal of communications technology and electronics</subfield><subfield code="d">Springer US, 1993</subfield><subfield code="g">58(2013), 12 vom: Dez., Seite 1292-1301</subfield><subfield code="w">(DE-627)171168402</subfield><subfield code="w">(DE-600)1160383-5</subfield><subfield code="w">(DE-576)038494272</subfield><subfield code="x">1064-2269</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:58</subfield><subfield code="g">year:2013</subfield><subfield code="g">number:12</subfield><subfield code="g">month:12</subfield><subfield code="g">pages:1292-1301</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1134/S1064226913120164</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " 
ind2=" "><subfield code="a">SSG-OLC-TEC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">58</subfield><subfield code="j">2013</subfield><subfield code="e">12</subfield><subfield code="c">12</subfield><subfield code="h">1292-1301</subfield></datafield></record></collection>
|
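The MARCXML stored in the fullrecord field can be read with any namespace-aware XML parser. The following is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree` and a shortened excerpt of the record above; the helper name `subfields` and the `SAMPLE` excerpt are illustrative assumptions, not part of the catalog.

```python
import xml.etree.ElementTree as ET

# MARC21 slim namespace, as declared on the <collection> element of the record.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Shortened excerpt of the record (assumption: only a few fields are kept here).
SAMPLE = (
    '<collection xmlns="http://www.loc.gov/MARC21/slim"><record>'
    '<controlfield tag="001">OLC2059681936</controlfield>'
    '<datafield tag="024" ind1="7" ind2=" ">'
    '<subfield code="a">10.1134/S1064226913120164</subfield>'
    '<subfield code="2">doi</subfield></datafield>'
    '<datafield tag="245" ind1="1" ind2="0">'
    '<subfield code="a">Vocal-tract length estimation</subfield></datafield>'
    '<datafield tag="650" ind1=" " ind2="4">'
    '<subfield code="a">vocal-tract length</subfield></datafield>'
    '<datafield tag="650" ind1=" " ind2="4">'
    '<subfield code="a">gender recognition</subfield></datafield>'
    '</record></collection>'
)


def subfields(root, tag, code):
    """Return the text of every subfield `code` inside datafields with `tag`."""
    path = f".//marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in root.findall(path, NS)]


root = ET.fromstring(SAMPLE)
doi = subfields(root, "024", "a")       # identifier (DOI)
title = subfields(root, "245", "a")     # title statement
keywords = subfields(root, "650", "a")  # subject terms
```

Field 024 carries the DOI, 245 the title, and each 650 field one keyword, which matches the tabular MARC view earlier in the record.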
author |
Sorokin, V. N. |
spellingShingle |
Sorokin, V. N. ddc 620 misc vocal-tract length misc gender recognition misc and speaker recognition Vocal-tract length estimation |
authorStr |
Sorokin, V. N. |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)171168402 |
format |
Article |
dewey-ones |
620 - Engineering & allied operations |
delete_txt_mv |
keep |
author_role |
aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
1064-2269 |
topic_title |
620 VZ Vocal-tract length estimation vocal-tract length gender recognition and speaker recognition |
topic |
ddc 620 misc vocal-tract length misc gender recognition misc and speaker recognition |
topic_unstemmed |
ddc 620 misc vocal-tract length misc gender recognition misc and speaker recognition |
topic_browse |
ddc 620 misc vocal-tract length misc gender recognition misc and speaker recognition |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Journal of communications technology and electronics |
hierarchy_parent_id |
171168402 |
dewey-tens |
620 - Engineering |
hierarchy_top_title |
Journal of communications technology and electronics |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)171168402 (DE-600)1160383-5 (DE-576)038494272 |
title |
Vocal-tract length estimation |
ctrlnum |
(DE-627)OLC2059681936 (DE-He213)S1064226913120164-p |
title_full |
Vocal-tract length estimation |
author_sort |
Sorokin, V. N. |
journal |
Journal of communications technology and electronics |
journalStr |
Journal of communications technology and electronics |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
600 - Technology |
recordtype |
marc |
publishDateSort |
2013 |
contenttype_str_mv |
txt |
container_start_page |
1292 |
author_browse |
Sorokin, V. N. Geras’kin, I. V. |
container_volume |
58 |
class |
620 VZ |
format_se |
Aufsätze |
author-letter |
Sorokin, V. N. |
doi_str_mv |
10.1134/S1064226913120164 |
dewey-full |
620 |
title_sort |
vocal-tract length estimation |
title_auth |
Vocal-tract length estimation |
abstract |
Two methods for estimating the vocal-tract length, taken as the length of an equivalent homogeneous acoustic tube, are investigated. One method calculates the tract length from the difference between the frequencies of adjacent local spectral maxima that lie above 4 kHz. In the other method, the vocal-tract length is calculated from the average frequency of the second formant, determined from the frequencies of the first three formants. In addition, several variants of the analysis are discussed, both irrespective of the context and with allowance for known vowels. The probability that the speaker gender is correctly recognized via two methods is about 13%, and its value is almost independent of the knowledge of the context. The probabilities that male and female voices are correctly recognized from the difference of higher formants are, respectively, 31 and 25.5% regardless of the context and 37 and 31% with allowance for it. The probabilities of correct recognition of male and female voices reach 27 and 21.5%, respectively, if context-independent recognition is performed from the average frequency of the second formant, and 43 and 35.5% after context-dependent recognition with the known vowel type. © Pleiades Publishing, Inc. 2013 |
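To illustrate the tube model that the abstract refers to: for a homogeneous tube closed at one end and open at the other, the resonances are F_n = (2n - 1)c / (4L), so adjacent resonances are spaced c / (2L) apart and the second resonance is 3c / (4L). The sketch below applies these textbook relations and is not the authors' implementation. The speed of sound (35000 cm/s), the function names, and the use of a simple mean spacing are assumptions; the abstract does not say how the average second-formant frequency is formed from the first three formants, so that step is left to the caller.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # assumption: approximate value for warm, humid air


def tract_length_from_spacing(peaks_hz, c=SPEED_OF_SOUND_CM_S, min_hz=4000.0):
    """Tube length in cm from the mean spacing of adjacent peaks above min_hz.

    Uses delta_F = c / (2 L) for a uniform tube closed at one end.
    """
    high = sorted(f for f in peaks_hz if f > min_hz)
    if len(high) < 2:
        raise ValueError("need at least two spectral maxima above min_hz")
    spacings = [b - a for a, b in zip(high, high[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return c / (2.0 * mean_spacing)


def tract_length_from_f2(f2_hz, c=SPEED_OF_SOUND_CM_S):
    """Tube length in cm from a second-formant frequency, using F2 = 3 c / (4 L)."""
    if f2_hz <= 0:
        raise ValueError("f2_hz must be positive")
    return 3.0 * c / (4.0 * f2_hz)


# Resonances of an ideal 17.5 cm tube: 500, 1500, 2500, ... Hz.
ideal_peaks = [500.0, 1500.0, 2500.0, 3500.0, 4500.0, 5500.0, 6500.0]
length_a = tract_length_from_spacing(ideal_peaks)
length_b = tract_length_from_f2(1500.0)
```

For the ideal tube both estimates agree; on real speech the two would differ because the vocal tract is not homogeneous, which is presumably why the record compares the methods.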
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC GBV_ILN_70 |
container_issue |
12 |
title_short |
Vocal-tract length estimation |
url |
https://doi.org/10.1134/S1064226913120164 |
remote_bool |
false |
author2 |
Geras’kin, I. V. |
author2Str |
Geras’kin, I. V. |
ppnlink |
171168402 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1134/S1064226913120164 |
up_date |
2024-07-03T23:02:02.322Z |
_version_ |
1803600760483610624 |
score |
7.3995275 |