Mathematical model for empirically optimizing large scale production of soluble protein domains
Abstract. Background: Efficient dissection of large proteins into their structural domains is critical for high throughput proteome analysis. So far, no study has focused on mathematically modeling a protein dissection protocol in terms of a production sy...
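The full abstract (MARC field 520 of this record) reports the pilot-experiment counts behind the two probabilities: 175 of 215 fragments for domain existence and 111 of 175 for the boundary. As a rough illustration only, the sketch below recomputes those ratios and then estimates how the chance of obtaining at least one soluble fragment might grow with the number of fragments tested per domain. The function name `p_at_least_one_soluble` and the formula `p_domain * (1 - (1 - p_boundary)**n)` are assumptions made for this illustration (independent fragments, one shared domain prediction); they are not taken from the paper and may differ from the authors' model.

```python
# Counts quoted in the abstract (MARC field 520 of this record).
FRAGMENTS_EXPRESSED = 215   # fragments expressed correctly
FRAGMENTS_DOMAIN_OK = 175   # fragments counted toward a correctly predicted domain
FRAGMENTS_SOLUBLE = 111     # soluble fragments

# Ratios reported in the abstract as 81% and 63%.
p_domain = FRAGMENTS_DOMAIN_OK / FRAGMENTS_EXPRESSED
p_boundary = FRAGMENTS_SOLUBLE / FRAGMENTS_DOMAIN_OK


def p_at_least_one_soluble(n_fragments: int) -> float:
    """Chance that at least one of n fragments for a predicted domain is soluble.

    ASSUMPTION (illustrative, not the paper's formula): the domain prediction
    is right with probability p_domain, and, given that, each fragment
    independently has a correct boundary with probability p_boundary.
    """
    if n_fragments < 0:
        raise ValueError("n_fragments must be non-negative")
    return p_domain * (1.0 - (1.0 - p_boundary) ** n_fragments)


if __name__ == "__main__":
    print(f"P(domain predicted correctly)   = {p_domain:.2f}")
    print(f"P(boundary correct | domain ok) = {p_boundary:.2f}")
    for n in range(1, 8):  # the abstract's range of one to seven fragments
        print(f"n = {n}: {p_at_least_one_soluble(n):.3f}")
```

Under these assumptions the curve flattens quickly as `n` grows, which is at least consistent with the abstract's remark that the optimum number of fragments per domain is small; the actual optimum in the paper also depends on cost terms that this sketch does not model.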
Detailed description
Authors: | Miyazaki Satoshi [author]; Yabuki Takashi [author]; Tanaka Takanori [author]; Kurotani Atsushi [author]; Chikayama Eisuke [author]; Yokoyama Shigeyuki [author]; Kuroda Yutaka [author] |
---|---|
Format: | E-article |
Language: | English |
Published: | 2010 |
Parent work: | In: BMC Bioinformatics - BMC, 2003, 11(2010), 1, p 113 |
Parent work: | volume:11 ; year:2010 ; number:1, p 113 |
DOI / URN: | 10.1186/1471-2105-11-113 |
Catalog ID: | DOAJ038765179 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | DOAJ038765179 | ||
003 | DE-627 | ||
005 | 20230503021935.0 | ||
007 | cr uuu---uuuuu | ||
008 | 230227s2010 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1186/1471-2105-11-113 |2 doi | |
035 | |a (DE-627)DOAJ038765179 | ||
035 | |a (DE-599)DOAJ944744b7b9fe4706b34c644b29aca57a | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
050 | 0 | |a R858-859.7 | |
050 | 0 | |a QH301-705.5 | |
100 | 0 | |a Miyazaki Satoshi |e verfasserin |4 aut | |
245 | 1 | 0 | |a Mathematical model for empirically optimizing large scale production of soluble protein domains |
264 | 1 | |c 2010 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
520 | |a Abstract. Background: Efficient dissection of large proteins into their structural domains is critical for high throughput proteome analysis. So far, no study has focused on mathematically modeling a protein dissection protocol in terms of a production system. Here, we report a mathematical model for empirically optimizing the cost of large-scale domain production in proteomics research. Results: The model computes the expected number of successfully produced soluble domains, using a conditional probability between domain and boundary identification. Typical values for the model's parameters were estimated using the experimental results for identifying soluble domains from the 2,032 Kazusa HUGE protein sequences. Among the 215 fragments corresponding to the 24 domains that were expressed correctly, 111, corresponding to 18 domains, were soluble. Our model indicates that, under the conditions used in our pilot experiment, the probability of correctly predicting the existence of a domain was 81% (175/215) and that of predicting its boundary was 63% (111/175). Under these conditions, the most cost/effort-effective production of soluble domains was to prepare one to seven fragments per predicted domain. Conclusions: Our mathematical modeling of protein dissection protocols indicates that the optimum number of fragments tested per domain is actually much smaller than expected a priori. The application range of our model is not limited to protein dissection, and it can be utilized for designing various large-scale mutational analyses or screening libraries. | ||
653 | 0 | |a Computer applications to medicine. Medical informatics | |
653 | 0 | |a Biology (General) | |
700 | 0 | |a Yabuki Takashi |e verfasserin |4 aut | |
700 | 0 | |a Tanaka Takanori |e verfasserin |4 aut | |
700 | 0 | |a Kurotani Atsushi |e verfasserin |4 aut | |
700 | 0 | |a Chikayama Eisuke |e verfasserin |4 aut | |
700 | 0 | |a Yokoyama Shigeyuki |e verfasserin |4 aut | |
700 | 0 | |a Kuroda Yutaka |e verfasserin |4 aut | |
773 | 0 | 8 | |i In |t BMC Bioinformatics |d BMC, 2003 |g 11(2010), 1, p 113 |w (DE-627)326644814 |w (DE-600)2041484-5 |x 14712105 |7 nnns |
773 | 1 | 8 | |g volume:11 |g year:2010 |g number:1, p 113 |
856 | 4 | 0 | |u https://doi.org/10.1186/1471-2105-11-113 |z kostenfrei |
856 | 4 | 0 | |u https://doaj.org/article/944744b7b9fe4706b34c644b29aca57a |z kostenfrei |
856 | 4 | 0 | |u http://www.biomedcentral.com/1471-2105/11/113 |z kostenfrei |
856 | 4 | 2 | |u https://doaj.org/toc/1471-2105 |y Journal toc |z kostenfrei |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_DOAJ | ||
912 | |a SSG-OLC-PHA | ||
912 | |a GBV_ILN_11 | ||
912 | |a GBV_ILN_20 | ||
912 | |a GBV_ILN_22 | ||
912 | |a GBV_ILN_23 | ||
912 | |a GBV_ILN_24 | ||
912 | |a GBV_ILN_31 | ||
912 | |a GBV_ILN_39 | ||
912 | |a GBV_ILN_40 | ||
912 | |a GBV_ILN_60 | ||
912 | |a GBV_ILN_62 | ||
912 | |a GBV_ILN_63 | ||
912 | |a GBV_ILN_65 | ||
912 | |a GBV_ILN_69 | ||
912 | |a GBV_ILN_70 | ||
912 | |a GBV_ILN_73 | ||
912 | |a GBV_ILN_74 | ||
912 | |a GBV_ILN_95 | ||
912 | |a GBV_ILN_105 | ||
912 | |a GBV_ILN_110 | ||
912 | |a GBV_ILN_151 | ||
912 | |a GBV_ILN_161 | ||
912 | |a GBV_ILN_170 | ||
912 | |a GBV_ILN_206 | ||
912 | |a GBV_ILN_213 | ||
912 | |a GBV_ILN_230 | ||
912 | |a GBV_ILN_285 | ||
912 | |a GBV_ILN_293 | ||
912 | |a GBV_ILN_370 | ||
912 | |a GBV_ILN_602 | ||
912 | |a GBV_ILN_702 | ||
912 | |a GBV_ILN_2001 | ||
912 | |a GBV_ILN_2003 | ||
912 | |a GBV_ILN_2005 | ||
912 | |a GBV_ILN_2006 | ||
912 | |a GBV_ILN_2008 | ||
912 | |a GBV_ILN_2009 | ||
912 | |a GBV_ILN_2010 | ||
912 | |a GBV_ILN_2011 | ||
912 | |a GBV_ILN_2014 | ||
912 | |a GBV_ILN_2015 | ||
912 | |a GBV_ILN_2020 | ||
912 | |a GBV_ILN_2021 | ||
912 | |a GBV_ILN_2025 | ||
912 | |a GBV_ILN_2031 | ||
912 | |a GBV_ILN_2038 | ||
912 | |a GBV_ILN_2044 | ||
912 | |a GBV_ILN_2048 | ||
912 | |a GBV_ILN_2050 | ||
912 | |a GBV_ILN_2055 | ||
912 | |a GBV_ILN_2056 | ||
912 | |a GBV_ILN_2057 | ||
912 | |a GBV_ILN_2061 | ||
912 | |a GBV_ILN_2111 | ||
912 | |a GBV_ILN_2113 | ||
912 | |a GBV_ILN_2190 | ||
912 | |a GBV_ILN_4012 | ||
912 | |a GBV_ILN_4037 | ||
912 | |a GBV_ILN_4112 | ||
912 | |a GBV_ILN_4125 | ||
912 | |a GBV_ILN_4126 | ||
912 | |a GBV_ILN_4249 | ||
912 | |a GBV_ILN_4305 | ||
912 | |a GBV_ILN_4306 | ||
912 | |a GBV_ILN_4307 | ||
912 | |a GBV_ILN_4313 | ||
912 | |a GBV_ILN_4322 | ||
912 | |a GBV_ILN_4323 | ||
912 | |a GBV_ILN_4324 | ||
912 | |a GBV_ILN_4325 | ||
912 | |a GBV_ILN_4326 | ||
912 | |a GBV_ILN_4335 | ||
912 | |a GBV_ILN_4338 | ||
912 | |a GBV_ILN_4367 | ||
912 | |a GBV_ILN_4700 | ||
951 | |a AR | ||
952 | |d 11 |j 2010 |e 1, p 113 |
author_variant |
m s ms y t yt t t tt k a ka c e ce y s ys k y ky |
---|---|
matchkey_str |
article:14712105:2010----::ahmtcloefrmiialotmznlreclpoutoo |
hierarchy_sort_str |
2010 |
callnumber-subject-code |
R |
publishDate |
2010 |
language |
English |
source |
In BMC Bioinformatics 11(2010), 1, p 113 volume:11 year:2010 number:1, p 113 |
sourceStr |
In BMC Bioinformatics 11(2010), 1, p 113 volume:11 year:2010 number:1, p 113 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Computer applications to medicine. Medical informatics Biology (General) |
isfreeaccess_bool |
true |
container_title |
BMC Bioinformatics |
authorswithroles_txt_mv |
Miyazaki Satoshi @@aut@@ Yabuki Takashi @@aut@@ Tanaka Takanori @@aut@@ Kurotani Atsushi @@aut@@ Chikayama Eisuke @@aut@@ Yokoyama Shigeyuki @@aut@@ Kuroda Yutaka @@aut@@ |
publishDateDaySort_date |
2010-01-01T00:00:00Z |
hierarchy_top_id |
326644814 |
id |
DOAJ038765179 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">DOAJ038765179</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230503021935.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">230227s2010 xx |||||o 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1186/1471-2105-11-113</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)DOAJ038765179</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)DOAJ944744b7b9fe4706b34c644b29aca57a</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">R858-859.7</subfield></datafield><datafield tag="050" ind1=" " ind2="0"><subfield code="a">QH301-705.5</subfield></datafield><datafield tag="100" ind1="0" ind2=" "><subfield code="a">Miyazaki Satoshi</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Mathematical model for empirically optimizing large scale production of soluble protein domains</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2010</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield 
code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract. Background: Efficient dissection of large proteins into their structural domains is critical for high throughput proteome analysis. So far, no study has focused on mathematically modeling a protein dissection protocol in terms of a production system. Here, we report a mathematical model for empirically optimizing the cost of large-scale domain production in proteomics research. Results: The model computes the expected number of successfully produced soluble domains, using a conditional probability between domain and boundary identification. Typical values for the model's parameters were estimated using the experimental results for identifying soluble domains from the 2,032 Kazusa HUGE protein sequences. Among the 215 fragments corresponding to the 24 domains that were expressed correctly, 111, corresponding to 18 domains, were soluble. Our model indicates that, under the conditions used in our pilot experiment, the probability of correctly predicting the existence of a domain was 81% (175/215) and that of predicting its boundary was 63% (111/175). Under these conditions, the most cost/effort-effective production of soluble domains was to prepare one to seven fragments per predicted domain. Conclusions: Our mathematical modeling of protein dissection protocols indicates that the optimum number of fragments tested per domain is actually much smaller than expected a priori. 
The application range of our model is not limited to protein dissection, and it can be utilized for designing various large-scale mutational analyses or screening libraries.</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Computer applications to medicine. Medical informatics</subfield></datafield><datafield tag="653" ind1=" " ind2="0"><subfield code="a">Biology (General)</subfield></datafield><datafield tag="700" ind1="0" ind2=" "><subfield code="a">Yabuki Takashi</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="0" ind2=" "><subfield code="a">Tanaka Takanori</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="0" ind2=" "><subfield code="a">Kurotani Atsushi</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="0" ind2=" "><subfield code="a">Chikayama Eisuke</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="0" ind2=" "><subfield code="a">Yokoyama Shigeyuki</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="0" ind2=" "><subfield code="a">Kuroda Yutaka</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">In</subfield><subfield code="t">BMC Bioinformatics</subfield><subfield code="d">BMC, 2003</subfield><subfield code="g">11(2010), 1, p 113</subfield><subfield code="w">(DE-627)326644814</subfield><subfield code="w">(DE-600)2041484-5</subfield><subfield code="x">14712105</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:11</subfield><subfield code="g">year:2010</subfield><subfield code="g">number:1, p 
113</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://doi.org/10.1186/1471-2105-11-113</subfield><subfield code="z">kostenfrei</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://doaj.org/article/944744b7b9fe4706b34c644b29aca57a</subfield><subfield code="z">kostenfrei</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">http://www.biomedcentral.com/1471-2105/11/113</subfield><subfield code="z">kostenfrei</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">https://doaj.org/toc/1471-2105</subfield><subfield code="y">Journal toc</subfield><subfield code="z">kostenfrei</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_DOAJ</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-PHA</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_11</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_20</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_22</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_23</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_24</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_31</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_39</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_40</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_60</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">GBV_ILN_62</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_63</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_65</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_69</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_73</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_74</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_95</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_105</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_110</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_151</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_161</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_170</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_206</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_213</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_230</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_285</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_293</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_370</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_602</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_702</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2001</subfield></datafield><datafield tag="912" ind1=" " 
ind2=" "><subfield code="a">GBV_ILN_2003</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2005</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2006</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2008</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2009</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2010</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2011</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2014</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2015</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2020</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2021</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2025</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2031</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2038</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2044</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2048</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2050</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2055</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2056</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2057</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2061</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">GBV_ILN_2111</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2113</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2190</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4037</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4112</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4125</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4126</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4249</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4305</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4306</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4307</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4313</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4322</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4323</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4324</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4325</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4326</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4335</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4338</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4367</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">GBV_ILN_4700</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">11</subfield><subfield code="j">2010</subfield><subfield code="e">1, p 113</subfield></datafield></record></collection>
|
callnumber-first |
R - Medicine |
author |
Miyazaki Satoshi |
spellingShingle |
Miyazaki Satoshi misc R858-859.7 misc QH301-705.5 misc Computer applications to medicine. Medical informatics misc Biology (General) Mathematical model for empirically optimizing large scale production of soluble protein domains |
authorStr |
Miyazaki Satoshi |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)326644814 |
format |
electronic Article |
delete_txt_mv |
keep |
author_role |
aut aut aut aut aut aut aut |
collection |
DOAJ |
remote_str |
true |
callnumber-label |
R858-859 |
illustrated |
Not Illustrated |
issn |
14712105 |
topic_title |
R858-859.7 QH301-705.5 Mathematical model for empirically optimizing large scale production of soluble protein domains |
topic |
misc R858-859.7 misc QH301-705.5 misc Computer applications to medicine. Medical informatics misc Biology (General) |
topic_unstemmed |
misc R858-859.7 misc QH301-705.5 misc Computer applications to medicine. Medical informatics misc Biology (General) |
topic_browse |
misc R858-859.7 misc QH301-705.5 misc Computer applications to medicine. Medical informatics misc Biology (General) |
format_facet |
Electronic articles Articles Electronic resource |
format_main_str_mv |
Text Journal/Article |
carriertype_str_mv |
cr |
hierarchy_parent_title |
BMC Bioinformatics |
hierarchy_parent_id |
326644814 |
hierarchy_top_title |
BMC Bioinformatics |
isfreeaccess_txt |
true |
familylinks_str_mv |
(DE-627)326644814 (DE-600)2041484-5 |
title |
Mathematical model for empirically optimizing large scale production of soluble protein domains |
ctrlnum |
(DE-627)DOAJ038765179 (DE-599)DOAJ944744b7b9fe4706b34c644b29aca57a |
title_full |
Mathematical model for empirically optimizing large scale production of soluble protein domains |
author_sort |
Miyazaki Satoshi |
journal |
BMC Bioinformatics |
journalStr |
BMC Bioinformatics |
callnumber-first-code |
R |
lang_code |
eng |
isOA_bool |
true |
recordtype |
marc |
publishDateSort |
2010 |
contenttype_str_mv |
txt |
author_browse |
Miyazaki Satoshi Yabuki Takashi Tanaka Takanori Kurotani Atsushi Chikayama Eisuke Yokoyama Shigeyuki Kuroda Yutaka |
container_volume |
11 |
class |
R858-859.7 QH301-705.5 |
format_se |
Electronic articles |
author-letter |
Miyazaki Satoshi |
doi_str_mv |
10.1186/1471-2105-11-113 |
author2-role |
verfasserin |
title_sort |
mathematical model for empirically optimizing large scale production of soluble protein domains |
callnumber |
R858-859.7 |
title_auth |
Mathematical model for empirically optimizing large scale production of soluble protein domains |
abstract |
Background: Efficient dissection of large proteins into their structural domains is critical for high-throughput proteome analysis. So far, no study has focused on mathematically modeling a protein dissection protocol in terms of a production system. Here, we report a mathematical model for empirically optimizing the cost of large-scale domain production in proteomics research. Results: The model computes the expected number of successfully produced soluble domains, using a conditional probability between domain and boundary identification. Typical values for the model's parameters were estimated using the experimental results for identifying soluble domains from the 2,032 Kazusa HUGE protein sequences. Among the 215 fragments corresponding to the 24 domains that were expressed correctly, 111, corresponding to 18 domains, were soluble. Our model indicates that, under the conditions used in our pilot experiment, the probability of correctly predicting the existence of a domain was 81% (175/215) and that of predicting its boundary was 63% (111/175). Under these conditions, the most cost/effort-effective production of soluble domains was to prepare one to seven fragments per predicted domain. Conclusions: Our mathematical modeling of protein dissection protocols indicates that the optimum number of fragments tested per domain is actually much smaller than expected a priori. The application range of our model is not limited to protein dissection, and it can be utilized for designing various large-scale mutational analyses or screening libraries. |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_DOAJ SSG-OLC-PHA GBV_ILN_11 GBV_ILN_20 GBV_ILN_22 GBV_ILN_23 GBV_ILN_24 GBV_ILN_31 GBV_ILN_39 GBV_ILN_40 GBV_ILN_60 GBV_ILN_62 GBV_ILN_63 GBV_ILN_65 GBV_ILN_69 GBV_ILN_70 GBV_ILN_73 GBV_ILN_74 GBV_ILN_95 GBV_ILN_105 GBV_ILN_110 GBV_ILN_151 GBV_ILN_161 GBV_ILN_170 GBV_ILN_206 GBV_ILN_213 GBV_ILN_230 GBV_ILN_285 GBV_ILN_293 GBV_ILN_370 GBV_ILN_602 GBV_ILN_702 GBV_ILN_2001 GBV_ILN_2003 GBV_ILN_2005 GBV_ILN_2006 GBV_ILN_2008 GBV_ILN_2009 GBV_ILN_2010 GBV_ILN_2011 GBV_ILN_2014 GBV_ILN_2015 GBV_ILN_2020 GBV_ILN_2021 GBV_ILN_2025 GBV_ILN_2031 GBV_ILN_2038 GBV_ILN_2044 GBV_ILN_2048 GBV_ILN_2050 GBV_ILN_2055 GBV_ILN_2056 GBV_ILN_2057 GBV_ILN_2061 GBV_ILN_2111 GBV_ILN_2113 GBV_ILN_2190 GBV_ILN_4012 GBV_ILN_4037 GBV_ILN_4112 GBV_ILN_4125 GBV_ILN_4126 GBV_ILN_4249 GBV_ILN_4305 GBV_ILN_4306 GBV_ILN_4307 GBV_ILN_4313 GBV_ILN_4322 GBV_ILN_4323 GBV_ILN_4324 GBV_ILN_4325 GBV_ILN_4326 GBV_ILN_4335 GBV_ILN_4338 GBV_ILN_4367 GBV_ILN_4700 |
container_issue |
1, p 113 |
title_short |
Mathematical model for empirically optimizing large scale production of soluble protein domains |
url |
https://doi.org/10.1186/1471-2105-11-113 https://doaj.org/article/944744b7b9fe4706b34c644b29aca57a http://www.biomedcentral.com/1471-2105/11/113 https://doaj.org/toc/1471-2105 |
remote_bool |
true |
author2 |
Yabuki Takashi Tanaka Takanori Kurotani Atsushi Chikayama Eisuke Yokoyama Shigeyuki Kuroda Yutaka |
author2Str |
Yabuki Takashi Tanaka Takanori Kurotani Atsushi Chikayama Eisuke Yokoyama Shigeyuki Kuroda Yutaka |
ppnlink |
326644814 |
callnumber-subject |
R - General Medicine |
mediatype_str_mv |
c |
isOA_txt |
true |
hochschulschrift_bool |
false |
doi_str |
10.1186/1471-2105-11-113 |
callnumber-a |
R858-859.7 |
up_date |
2024-07-03T19:45:31.730Z |
_version_ |
1803588397151813632 |
score |
7.4021635 |