DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks
Abstract: Protein domains are the structural and functional units of proteins. The ability to parse protein chains into different domains is important for protein classification and for understanding protein structure, function, and evolution. Here we use machine learning algorithms, in the form of r...
Detailed description
Author: | Cheng, Jianlin [author] |
---|---|
Format: | Article |
Language: | English |
Published: | 2006 |
Keywords: | protein structure prediction; domain; recursive neural networks |
Note: | © Springer Science + Business Media, LLC 2006 |
Parent work: | Contained in: Data mining and knowledge discovery - Springer US, 1997, 13(2006), 1, 11 May, pages 1-10 |
Parent work: | volume:13 ; year:2006 ; number:1 ; day:11 ; month:05 ; pages:1-10 |
Links: | https://doi.org/10.1007/s10618-005-0023-5 (full text, license required) |
DOI / URN: | 10.1007/s10618-005-0023-5 |
Catalog ID: | OLC2027057471 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2027057471 | ||
003 | DE-627 | ||
005 | 20230503034657.0 | ||
007 | tu | ||
008 | 200819s2006 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s10618-005-0023-5 |2 doi | |
035 | |a (DE-627)OLC2027057471 | ||
035 | |a (DE-He213)s10618-005-0023-5-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 400 |a 070 |a 004 |q VZ |
084 | |a 24,1 |2 ssgn | ||
084 | |a LING |q DE-30 |2 fid | ||
100 | 1 | |a Cheng, Jianlin |e verfasserin |4 aut | |
245 | 1 | 0 | |a DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks |
264 | 1 | |c 2006 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © Springer Science + Business Media, LLC 2006 | ||
520 | |a Abstract Protein domains are the structural and functional units of proteins. The ability to parse protein chains into different domains is important for protein classification and for understanding protein structure, function, and evolution. Here we use machine learning algorithms, in the form of recursive neural networks, to develop a protein domain predictor called DOMpro. DOMpro predicts protein domains using a combination of evolutionary information in the form of profiles, predicted secondary structure, and predicted relative solvent accessibility. DOMpro is trained and tested on a curated dataset derived from the CATH database. DOMpro correctly predicts the number of domains for 69% of the combined dataset of single and multi-domain chains. DOMpro achieves a sensitivity of 76% and specificity of 85% with respect to the single-domain proteins and sensitivity of 59% and specificity of 38% with respect to the two-domain proteins. DOMpro also achieved a sensitivity and specificity of 71% and 71% respectively in the Critical Assessment of Fully Automated Structure Prediction 4 (CAFASP-4) (Fisher et al., 1999; Saini and Fischer, 2005) and was ranked among the top ab initio domain predictors. The DOMpro server, software, and dataset are available at http://www.igb.uci.edu/servers/psss.html. | ||
650 | 4 | |a protein structure prediction | |
650 | 4 | |a domain | |
650 | 4 | |a recursive neural networks | |
700 | 1 | |a Sweredoski, Michael J. |4 aut | |
700 | 1 | |a Baldi, Pierre |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Data mining and knowledge discovery |d Springer US, 1997 |g 13(2006), 1 vom: 11. Mai, Seite 1-10 |w (DE-627)230491774 |w (DE-600)1386325-3 |w (DE-576)067290434 |x 1384-5810 |7 nnns |
773 | 1 | 8 | |g volume:13 |g year:2006 |g number:1 |g day:11 |g month:05 |g pages:1-10 |
856 | 4 | 1 | |u https://doi.org/10.1007/s10618-005-0023-5 |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a FID-LING | ||
912 | |a SSG-OLC-BUB | ||
912 | |a SSG-OLC-MAT | ||
912 | |a SSG-OPC-BBI | ||
912 | |a SSG-OPC-ANG | ||
912 | |a GBV_ILN_40 | ||
912 | |a GBV_ILN_62 | ||
912 | |a GBV_ILN_70 | ||
912 | |a GBV_ILN_100 | ||
912 | |a GBV_ILN_2005 | ||
912 | |a GBV_ILN_4305 | ||
951 | |a AR | ||
952 | |d 13 |j 2006 |e 1 |b 11 |c 05 |h 1-10 |
author_variant |
j c jc m j s mj mjs p b pb |
---|---|
matchkey_str |
article:13845810:2006----::oportidmipeitouigrflseodrsrcueeaieovnacsii |
hierarchy_sort_str |
2006 |
publishDate |
2006 |
allfields |
10.1007/s10618-005-0023-5 doi (DE-627)OLC2027057471 (DE-He213)s10618-005-0023-5-p DE-627 ger DE-627 rakwb eng 400 070 004 VZ 24,1 ssgn LING DE-30 fid Cheng, Jianlin verfasserin aut DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks 2006 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © Springer Science + Business Media, LLC 2006 Abstract Protein domains are the structural and functional units of proteins. The ability to parse protein chains into different domains is important for protein classification and for understanding protein structure, function, and evolution. Here we use machine learning algorithms, in the form of recursive neural networks, to develop a protein domain predictor called DOMpro. DOMpro predicts protein domains using a combination of evolutionary information in the form of profiles, predicted secondary structure, and predicted relative solvent accessibility. DOMpro is trained and tested on a curated dataset derived from the CATH database. DOMpro correctly predicts the number of domains for 69% of the combined dataset of single and multi-domain chains. DOMpro achieves a sensitivity of 76% and specificity of 85% with respect to the single-domain proteins and sensitivity of 59% and specificity of 38% with respect to the two-domain proteins. DOMpro also achieved a sensitivity and specificity of 71% and 71% respectively in the Critical Assessment of Fully Automated Structure Prediction 4 (CAFASP-4) (Fisher et al., 1999; Saini and Fischer, 2005) and was ranked among the top ab initio domain predictors. The DOMpro server, software, and dataset are available at http://www.igb.uci.edu/servers/psss.html. protein structure prediction domain recursive neural networks Sweredoski, Michael J. aut Baldi, Pierre aut Enthalten in Data mining and knowledge discovery Springer US, 1997 13(2006), 1 vom: 11. 
Mai, Seite 1-10 (DE-627)230491774 (DE-600)1386325-3 (DE-576)067290434 1384-5810 nnns volume:13 year:2006 number:1 day:11 month:05 pages:1-10 https://doi.org/10.1007/s10618-005-0023-5 lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC FID-LING SSG-OLC-BUB SSG-OLC-MAT SSG-OPC-BBI SSG-OPC-ANG GBV_ILN_40 GBV_ILN_62 GBV_ILN_70 GBV_ILN_100 GBV_ILN_2005 GBV_ILN_4305 AR 13 2006 1 11 05 1-10 |
language |
English |
source |
Enthalten in Data mining and knowledge discovery 13(2006), 1 vom: 11. Mai, Seite 1-10 volume:13 year:2006 number:1 day:11 month:05 pages:1-10 |
sourceStr |
Enthalten in Data mining and knowledge discovery 13(2006), 1 vom: 11. Mai, Seite 1-10 volume:13 year:2006 number:1 day:11 month:05 pages:1-10 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
protein structure prediction domain recursive neural networks |
dewey-raw |
400 |
isfreeaccess_bool |
false |
container_title |
Data mining and knowledge discovery |
authorswithroles_txt_mv |
Cheng, Jianlin @@aut@@ Sweredoski, Michael J. @@aut@@ Baldi, Pierre @@aut@@ |
publishDateDaySort_date |
2006-05-11T00:00:00Z |
hierarchy_top_id |
230491774 |
dewey-sort |
3400 |
id |
OLC2027057471 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2027057471</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230503034657.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2006 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10618-005-0023-5</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2027057471</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10618-005-0023-5-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">400</subfield><subfield code="a">070</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">LING</subfield><subfield code="q">DE-30</subfield><subfield code="2">fid</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Cheng, Jianlin</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2006</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield 
code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Springer Science + Business Media, LLC 2006</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Protein domains are the structural and functional units of proteins. The ability to parse protein chains into different domains is important for protein classification and for understanding protein structure, function, and evolution. Here we use machine learning algorithms, in the form of recursive neural networks, to develop a protein domain predictor called DOMpro. DOMpro predicts protein domains using a combination of evolutionary information in the form of profiles, predicted secondary structure, and predicted relative solvent accessibility. DOMpro is trained and tested on a curated dataset derived from the CATH database. DOMpro correctly predicts the number of domains for 69% of the combined dataset of single and multi-domain chains. DOMpro achieves a sensitivity of 76% and specificity of 85% with respect to the single-domain proteins and sensitivity of 59% and specificity of 38% with respect to the two-domain proteins. DOMpro also achieved a sensitivity and specificity of 71% and 71% respectively in the Critical Assessment of Fully Automated Structure Prediction 4 (CAFASP-4) (Fisher et al., 1999; Saini and Fischer, 2005) and was ranked among the top ab initio domain predictors. 
The DOMpro server, software, and dataset are available at http://www.igb.uci.edu/servers/psss.html.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">protein structure prediction</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">domain</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">recursive neural networks</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Sweredoski, Michael J.</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Baldi, Pierre</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Data mining and knowledge discovery</subfield><subfield code="d">Springer US, 1997</subfield><subfield code="g">13(2006), 1 vom: 11. Mai, Seite 1-10</subfield><subfield code="w">(DE-627)230491774</subfield><subfield code="w">(DE-600)1386325-3</subfield><subfield code="w">(DE-576)067290434</subfield><subfield code="x">1384-5810</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:13</subfield><subfield code="g">year:2006</subfield><subfield code="g">number:1</subfield><subfield code="g">day:11</subfield><subfield code="g">month:05</subfield><subfield code="g">pages:1-10</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10618-005-0023-5</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">FID-LING</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-ANG</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_40</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_62</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_100</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2005</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4305</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">13</subfield><subfield code="j">2006</subfield><subfield code="e">1</subfield><subfield code="b">11</subfield><subfield code="c">05</subfield><subfield code="h">1-10</subfield></datafield></record></collection>
|
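The `fullrecord` field above is a MARC 21 record serialized as MARCXML. As a minimal sketch of how its core fields could be read programmatically, the following uses only Python's standard library. The helper names (`subfield_values`, `find_doi`, `summarize`) and the shortened `SAMPLE` excerpt are illustrative assumptions, not part of the catalog; the tags and subfield codes (001, 024 $a/$2, 100 $a, 245 $a, 700 $a) are taken from the record itself.

```python
import xml.etree.ElementTree as ET

# Namespace used by the record's <collection> element.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Shortened excerpt of the record above (XML declaration omitted for brevity).
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim"><record>
<controlfield tag="001">OLC2027057471</controlfield>
<datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10618-005-0023-5</subfield><subfield code="2">doi</subfield></datafield>
<datafield tag="100" ind1="1" ind2=" "><subfield code="a">Cheng, Jianlin</subfield><subfield code="4">aut</subfield></datafield>
<datafield tag="245" ind1="1" ind2="0"><subfield code="a">DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks</subfield></datafield>
<datafield tag="700" ind1="1" ind2=" "><subfield code="a">Sweredoski, Michael J.</subfield><subfield code="4">aut</subfield></datafield>
<datafield tag="700" ind1="1" ind2=" "><subfield code="a">Baldi, Pierre</subfield><subfield code="4">aut</subfield></datafield>
</record></collection>"""


def subfield_values(record, tag, code):
    """Return the text of every subfield `code` in datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, MARC_NS)]


def find_doi(record):
    """Return $a of the first 024 field whose $2 (identifier source) is 'doi'."""
    for field in record.findall("marc:datafield[@tag='024']", MARC_NS):
        source = field.find("marc:subfield[@code='2']", MARC_NS)
        if source is not None and source.text == "doi":
            value = field.find("marc:subfield[@code='a']", MARC_NS)
            return value.text if value is not None else None
    return None


def summarize(xml_text):
    """Extract record ID, title, authors, and DOI from a MARCXML collection."""
    root = ET.fromstring(xml_text)
    record = root.find("marc:record", MARC_NS)
    control = record.find("marc:controlfield[@tag='001']", MARC_NS)
    return {
        "id": control.text,
        "title": subfield_values(record, "245", "a")[0],
        # Field 100 holds the main author; each 700 field holds a further author.
        "authors": subfield_values(record, "100", "a")
        + subfield_values(record, "700", "a"),
        "doi": find_doi(record),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Checking subfield `$2` before taking the value keeps the lookup correct for records whose 024 fields carry identifiers other than a DOI.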
author |
Cheng, Jianlin |
spellingShingle |
Cheng, Jianlin ddc 400 ssgn 24,1 fid LING misc protein structure prediction misc domain misc recursive neural networks DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks |
authorStr |
Cheng, Jianlin |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)230491774 |
format |
Article |
dewey-ones |
400 - Language 070 - News media, journalism & publishing 004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
1384-5810 |
topic_title |
400 070 004 VZ 24,1 ssgn LING DE-30 fid DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks protein structure prediction domain recursive neural networks |
topic |
ddc 400 ssgn 24,1 fid LING misc protein structure prediction misc domain misc recursive neural networks |
topic_unstemmed |
ddc 400 ssgn 24,1 fid LING misc protein structure prediction misc domain misc recursive neural networks |
topic_browse |
ddc 400 ssgn 24,1 fid LING misc protein structure prediction misc domain misc recursive neural networks |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Data mining and knowledge discovery |
hierarchy_parent_id |
230491774 |
dewey-tens |
400 - Language 070 - News media, journalism & publishing 000 - Computer science, knowledge & systems |
hierarchy_top_title |
Data mining and knowledge discovery |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)230491774 (DE-600)1386325-3 (DE-576)067290434 |
title |
DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks |
ctrlnum |
(DE-627)OLC2027057471 (DE-He213)s10618-005-0023-5-p |
title_full |
DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks |
author_sort |
Cheng, Jianlin |
journal |
Data mining and knowledge discovery |
journalStr |
Data mining and knowledge discovery |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
400 - Language 000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2006 |
contenttype_str_mv |
txt |
container_start_page |
1 |
author_browse |
Cheng, Jianlin Sweredoski, Michael J. Baldi, Pierre |
container_volume |
13 |
class |
400 070 004 VZ 24,1 ssgn LING DE-30 fid |
format_se |
Aufsätze |
author-letter |
Cheng, Jianlin |
doi_str_mv |
10.1007/s10618-005-0023-5 |
dewey-full |
400 070 004 |
title_sort |
dompro: protein domain prediction using profiles, secondary structure, relative solvent accessibility, and recursive neural networks |
title_auth |
DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks |
abstract |
Abstract Protein domains are the structural and functional units of proteins. The ability to parse protein chains into different domains is important for protein classification and for understanding protein structure, function, and evolution. Here we use machine learning algorithms, in the form of recursive neural networks, to develop a protein domain predictor called DOMpro. DOMpro predicts protein domains using a combination of evolutionary information in the form of profiles, predicted secondary structure, and predicted relative solvent accessibility. DOMpro is trained and tested on a curated dataset derived from the CATH database. DOMpro correctly predicts the number of domains for 69% of the combined dataset of single and multi-domain chains. DOMpro achieves a sensitivity of 76% and specificity of 85% with respect to the single-domain proteins and sensitivity of 59% and specificity of 38% with respect to the two-domain proteins. DOMpro also achieved a sensitivity and specificity of 71% and 71% respectively in the Critical Assessment of Fully Automated Structure Prediction 4 (CAFASP-4) (Fisher et al., 1999; Saini and Fischer, 2005) and was ranked among the top ab initio domain predictors. The DOMpro server, software, and dataset are available at http://www.igb.uci.edu/servers/psss.html. © Springer Science + Business Media, LLC 2006 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC FID-LING SSG-OLC-BUB SSG-OLC-MAT SSG-OPC-BBI SSG-OPC-ANG GBV_ILN_40 GBV_ILN_62 GBV_ILN_70 GBV_ILN_100 GBV_ILN_2005 GBV_ILN_4305 |
container_issue |
1 |
title_short |
DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks |
url |
https://doi.org/10.1007/s10618-005-0023-5 |
remote_bool |
false |
author2 |
Sweredoski, Michael J. Baldi, Pierre |
author2Str |
Sweredoski, Michael J. Baldi, Pierre |
ppnlink |
230491774 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s10618-005-0023-5 |
up_date |
2024-07-03T13:36:33.914Z |
_version_ |
1803565183968215040 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2027057471</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230503034657.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2006 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10618-005-0023-5</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2027057471</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10618-005-0023-5-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">400</subfield><subfield code="a">070</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">LING</subfield><subfield code="q">DE-30</subfield><subfield code="2">fid</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Cheng, Jianlin</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">DOMpro: Protein Domain Prediction Using Profiles, Secondary Structure, Relative Solvent Accessibility, and Recursive Neural Networks</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2006</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield 
code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Springer Science + Business Media, LLC 2006</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Protein domains are the structural and functional units of proteins. The ability to parse protein chains into different domains is important for protein classification and for understanding protein structure, function, and evolution. Here we use machine learning algorithms, in the form of recursive neural networks, to develop a protein domain predictor called DOMpro. DOMpro predicts protein domains using a combination of evolutionary information in the form of profiles, predicted secondary structure, and predicted relative solvent accessibility. DOMpro is trained and tested on a curated dataset derived from the CATH database. DOMpro correctly predicts the number of domains for 69% of the combined dataset of single and multi-domain chains. DOMpro achieves a sensitivity of 76% and specificity of 85% with respect to the single-domain proteins and sensitivity of 59% and specificity of 38% with respect to the two-domain proteins. DOMpro also achieved a sensitivity and specificity of 71% and 71% respectively in the Critical Assessment of Fully Automated Structure Prediction 4 (CAFASP-4) (Fisher et al., 1999; Saini and Fischer, 2005) and was ranked among the top ab initio domain predictors. 
The DOMpro server, software, and dataset are available at http://www.igb.uci.edu/servers/psss.html.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">protein structure prediction</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">domain</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">recursive neural networks</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Sweredoski, Michael J.</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Baldi, Pierre</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Data mining and knowledge discovery</subfield><subfield code="d">Springer US, 1997</subfield><subfield code="g">13(2006), 1 vom: 11. Mai, Seite 1-10</subfield><subfield code="w">(DE-627)230491774</subfield><subfield code="w">(DE-600)1386325-3</subfield><subfield code="w">(DE-576)067290434</subfield><subfield code="x">1384-5810</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:13</subfield><subfield code="g">year:2006</subfield><subfield code="g">number:1</subfield><subfield code="g">day:11</subfield><subfield code="g">month:05</subfield><subfield code="g">pages:1-10</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10618-005-0023-5</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">FID-LING</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-ANG</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_40</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_62</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_100</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2005</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4305</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">13</subfield><subfield code="j">2006</subfield><subfield code="e">1</subfield><subfield code="b">11</subfield><subfield code="c">05</subfield><subfield code="h">1-10</subfield></datafield></record></collection>
|