Deciphering the impact of genetic variation on human polyadenylation using APARENT2
Background 3′-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the com...
Detailed description
Author: |
Linder, Johannes [author] |
---|
Format: |
E-article |
---|---|
Language: |
English |
Published: |
2022 |
---|
Keywords: |
---|
Note: |
© The Author(s) 2022 |
---|
Parent work: |
Contained in: Genome biology - London : BioMed Central, 2000, 23(2022), 1, dated 5 Nov. |
---|---|
Parent work: |
volume:23 ; year:2022 ; number:1 ; day:05 ; month:11 |
Links: |
---|
DOI / URN: |
10.1186/s13059-022-02799-4 |
---|
Catalog ID: |
SPR05110864X |
---|
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | SPR05110864X | ||
003 | DE-627 | ||
005 | 20230509115256.0 | ||
007 | cr uuu---uuuuu | ||
008 | 230508s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1186/s13059-022-02799-4 |2 doi | |
035 | |a (DE-627)SPR05110864X | ||
035 | |a (SPR)s13059-022-02799-4-e | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Linder, Johannes |e verfasserin |0 (orcid)0000-0003-2134-7292 |4 aut | |
245 | 1 | 0 | |a Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a © The Author(s) 2022 | ||
520 | |a Background 3′-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the complexity of this code, variant interpretation remains challenging. Results We introduce a residual neural network model, APARENT2, that can infer 3′-cleavage and polyadenylation from DNA sequence more accurately than any previous model. This model generalizes to the case of alternative polyadenylation (APA) for a variable number of polyadenylation signals. We demonstrate APARENT2’s performance on several variant datasets, including functional reporter data and human 3′ aQTLs from GTEx. We apply neural network interpretation methods to gain insights into disrupted or protective higher-order features of polyadenylation. We fine-tune APARENT2 on human tissue-resolved transcriptomic data to elucidate tissue-specific variant effects. By combining APARENT2 with models of mRNA stability, we extend aQTL effect size predictions to the entire 3′ untranslated region. Finally, we perform in silico saturation mutagenesis of all human polyadenylation signals and compare the predicted effects of >43 million variants against gnomAD. While loss-of-function variants were generally selected against, we also find specific clinical conditions linked to gain-of-function mutations. For example, we detect an association between gain-of-function mutations in the 3′-end and autism spectrum disorder. To experimentally validate APARENT2’s predictions, we assayed clinically relevant variants in multiple cell lines, including microglia-derived cells. 
Conclusions A sequence-to-function model based on deep residual learning enables accurate functional interpretation of genetic variants in polyadenylation signals and, when coupled with large human variation databases, elucidates the link between functional 3′-end mutations and human health. | ||
650 | 4 | |a RNA |7 (dpeaa)DE-He213 | |
650 | 4 | |a Polyadenylation |7 (dpeaa)DE-He213 | |
650 | 4 | |a Deep learning |7 (dpeaa)DE-He213 | |
650 | 4 | |a Neural networks |7 (dpeaa)DE-He213 | |
650 | 4 | |a Untranslated region |7 (dpeaa)DE-He213 | |
650 | 4 | |a Variant interpretation |7 (dpeaa)DE-He213 | |
650 | 4 | |a Genomics |7 (dpeaa)DE-He213 | |
650 | 4 | |a Explainable AI |7 (dpeaa)DE-He213 | |
700 | 1 | |a Koplik, Samantha E. |0 (orcid)0000-0003-3614-4885 |4 aut | |
700 | 1 | |a Kundaje, Anshul |0 (orcid)0000-0003-3084-2287 |4 aut | |
700 | 1 | |a Seelig, Georg |0 (orcid)0000-0002-3163-8782 |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Genome biology |d London : BioMed Central, 2000 |g 23(2022), 1 vom: 05. Nov. |w (DE-627)326173617 |w (DE-600)2040529-7 |x 1474-760X |7 nnns |
773 | 1 | 8 | |g volume:23 |g year:2022 |g number:1 |g day:05 |g month:11 |
856 | 4 | 0 | |u https://dx.doi.org/10.1186/s13059-022-02799-4 |z kostenfrei |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_SPRINGER | ||
912 | |a GBV_ILN_11 | ||
912 | |a GBV_ILN_20 | ||
912 | |a GBV_ILN_22 | ||
912 | |a GBV_ILN_23 | ||
912 | |a GBV_ILN_24 | ||
912 | |a GBV_ILN_31 | ||
912 | |a GBV_ILN_39 | ||
912 | |a GBV_ILN_40 | ||
912 | |a GBV_ILN_62 | ||
912 | |a GBV_ILN_63 | ||
912 | |a GBV_ILN_65 | ||
912 | |a GBV_ILN_69 | ||
912 | |a GBV_ILN_70 | ||
912 | |a GBV_ILN_73 | ||
912 | |a GBV_ILN_74 | ||
912 | |a GBV_ILN_95 | ||
912 | |a GBV_ILN_105 | ||
912 | |a GBV_ILN_110 | ||
912 | |a GBV_ILN_151 | ||
912 | |a GBV_ILN_161 | ||
912 | |a GBV_ILN_170 | ||
912 | |a GBV_ILN_213 | ||
912 | |a GBV_ILN_230 | ||
912 | |a GBV_ILN_285 | ||
912 | |a GBV_ILN_293 | ||
912 | |a GBV_ILN_602 | ||
912 | |a GBV_ILN_2003 | ||
912 | |a GBV_ILN_2014 | ||
912 | |a GBV_ILN_4012 | ||
912 | |a GBV_ILN_4037 | ||
912 | |a GBV_ILN_4112 | ||
912 | |a GBV_ILN_4125 | ||
912 | |a GBV_ILN_4126 | ||
912 | |a GBV_ILN_4249 | ||
912 | |a GBV_ILN_4305 | ||
912 | |a GBV_ILN_4306 | ||
912 | |a GBV_ILN_4307 | ||
912 | |a GBV_ILN_4313 | ||
912 | |a GBV_ILN_4322 | ||
912 | |a GBV_ILN_4323 | ||
912 | |a GBV_ILN_4324 | ||
912 | |a GBV_ILN_4325 | ||
912 | |a GBV_ILN_4338 | ||
912 | |a GBV_ILN_4367 | ||
912 | |a GBV_ILN_4700 | ||
951 | |a AR | ||
952 | |d 23 |j 2022 |e 1 |b 05 |c 11 |
author_variant |
j l jl s e k se sek a k ak g s gs |
---|---|
matchkey_str |
article:1474760X:2022----::eihrntematfeeivrainnuaplaey |
hierarchy_sort_str |
2022 |
publishDate |
2022 |
language |
English |
source |
Contained in Genome biology 23(2022), 1, dated 5 Nov. volume:23 year:2022 number:1 day:05 month:11 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
RNA Polyadenylation Deep learning Neural networks Untranslated region Variant interpretation Genomics Explainable AI |
isfreeaccess_bool |
true |
container_title |
Genome biology |
authorswithroles_txt_mv |
Linder, Johannes @@aut@@ Koplik, Samantha E. @@aut@@ Kundaje, Anshul @@aut@@ Seelig, Georg @@aut@@ |
publishDateDaySort_date |
2022-11-05T00:00:00Z |
hierarchy_top_id |
326173617 |
id |
SPR05110864X |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">SPR05110864X</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230509115256.0</controlfield><controlfield tag="007">cr uuu---uuuuu</controlfield><controlfield tag="008">230508s2022 xx |||||o 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1186/s13059-022-02799-4</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)SPR05110864X</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(SPR)s13059-022-02799-4-e</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Linder, Johannes</subfield><subfield code="e">verfasserin</subfield><subfield code="0">(orcid)0000-0003-2134-7292</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Deciphering the impact of genetic variation on human polyadenylation using APARENT2</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2022</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">Computermedien</subfield><subfield code="b">c</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Online-Ressource</subfield><subfield code="b">cr</subfield><subfield 
code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s) 2022</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Background 3′-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the complexity of this code, variant interpretation remains challenging. Results We introduce a residual neural network model, APARENT2, that can infer 3′-cleavage and polyadenylation from DNA sequence more accurately than any previous model. This model generalizes to the case of alternative polyadenylation (APA) for a variable number of polyadenylation signals. We demonstrate APARENT2’s performance on several variant datasets, including functional reporter data and human 3′ aQTLs from GTEx. We apply neural network interpretation methods to gain insights into disrupted or protective higher-order features of polyadenylation. We fine-tune APARENT2 on human tissue-resolved transcriptomic data to elucidate tissue-specific variant effects. By combining APARENT2 with models of mRNA stability, we extend aQTL effect size predictions to the entire 3′ untranslated region. Finally, we perform in silico saturation mutagenesis of all human polyadenylation signals and compare the predicted effects of &gt;43 million variants against gnomAD. While loss-of-function variants were generally selected against, we also find specific clinical conditions linked to gain-of-function mutations. For example, we detect an association between gain-of-function mutations in the 3′-end and autism spectrum disorder. To experimentally validate APARENT2’s predictions, we assayed clinically relevant variants in multiple cell lines, including microglia-derived cells. 
Conclusions A sequence-to-function model based on deep residual learning enables accurate functional interpretation of genetic variants in polyadenylation signals and, when coupled with large human variation databases, elucidates the link between functional 3′-end mutations and human health.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">RNA</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Polyadenylation</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Deep learning</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Neural networks</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Untranslated region</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Variant interpretation</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Genomics</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Explainable AI</subfield><subfield code="7">(dpeaa)DE-He213</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Koplik, Samantha E.</subfield><subfield code="0">(orcid)0000-0003-3614-4885</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Kundaje, Anshul</subfield><subfield code="0">(orcid)0000-0003-3084-2287</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Seelig, Georg</subfield><subfield code="0">(orcid)0000-0002-3163-8782</subfield><subfield code="4">aut</subfield></datafield><datafield 
tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Genome biology</subfield><subfield code="d">London : BioMed Central, 2000</subfield><subfield code="g">23(2022), 1 vom: 05. Nov.</subfield><subfield code="w">(DE-627)326173617</subfield><subfield code="w">(DE-600)2040529-7</subfield><subfield code="x">1474-760X</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:23</subfield><subfield code="g">year:2022</subfield><subfield code="g">number:1</subfield><subfield code="g">day:05</subfield><subfield code="g">month:11</subfield></datafield><datafield tag="856" ind1="4" ind2="0"><subfield code="u">https://dx.doi.org/10.1186/s13059-022-02799-4</subfield><subfield code="z">kostenfrei</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_SPRINGER</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_11</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_20</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_22</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_23</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_24</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_31</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_39</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_40</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_62</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">GBV_ILN_63</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_65</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_69</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_73</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_74</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_95</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_105</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_110</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_151</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_161</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_170</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_213</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_230</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_285</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_293</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_602</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2003</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2014</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4037</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4112</subfield></datafield><datafield tag="912" 
ind1=" " ind2=" "><subfield code="a">GBV_ILN_4125</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4126</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4249</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4305</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4306</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4307</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4313</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4322</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4323</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4324</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4325</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4338</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4367</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4700</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">23</subfield><subfield code="j">2022</subfield><subfield code="e">1</subfield><subfield code="b">05</subfield><subfield code="c">11</subfield></datafield></record></collection>
|
author |
Linder, Johannes |
spellingShingle |
Linder, Johannes misc RNA misc Polyadenylation misc Deep learning misc Neural networks misc Untranslated region misc Variant interpretation misc Genomics misc Explainable AI Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
authorStr |
Linder, Johannes |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)326173617 |
format |
electronic Article |
delete_txt_mv |
keep |
author_role |
aut aut aut aut |
collection |
springer |
remote_str |
true |
illustrated |
Not Illustrated |
issn |
1474-760X |
topic_title |
Deciphering the impact of genetic variation on human polyadenylation using APARENT2 RNA (dpeaa)DE-He213 Polyadenylation (dpeaa)DE-He213 Deep learning (dpeaa)DE-He213 Neural networks (dpeaa)DE-He213 Untranslated region (dpeaa)DE-He213 Variant interpretation (dpeaa)DE-He213 Genomics (dpeaa)DE-He213 Explainable AI (dpeaa)DE-He213 |
topic |
misc RNA misc Polyadenylation misc Deep learning misc Neural networks misc Untranslated region misc Variant interpretation misc Genomics misc Explainable AI |
format_facet |
Electronic articles Articles Electronic resource |
format_main_str_mv |
Text Journal/Article |
carriertype_str_mv |
cr |
hierarchy_parent_title |
Genome biology |
hierarchy_parent_id |
326173617 |
hierarchy_top_title |
Genome biology |
isfreeaccess_txt |
true |
familylinks_str_mv |
(DE-627)326173617 (DE-600)2040529-7 |
title |
Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
ctrlnum |
(DE-627)SPR05110864X (SPR)s13059-022-02799-4-e |
title_full |
Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
author_sort |
Linder, Johannes |
journal |
Genome biology |
journalStr |
Genome biology |
lang_code |
eng |
isOA_bool |
true |
recordtype |
marc |
publishDateSort |
2022 |
contenttype_str_mv |
txt |
author_browse |
Linder, Johannes Koplik, Samantha E. Kundaje, Anshul Seelig, Georg |
container_volume |
23 |
format_se |
Electronic articles |
author-letter |
Linder, Johannes |
doi_str_mv |
10.1186/s13059-022-02799-4 |
normlink |
(ORCID)0000-0003-2134-7292 (ORCID)0000-0003-3614-4885 (ORCID)0000-0003-3084-2287 (ORCID)0000-0002-3163-8782 |
normlink_prefix_str_mv |
(orcid)0000-0003-2134-7292 (orcid)0000-0003-3614-4885 (orcid)0000-0003-3084-2287 (orcid)0000-0002-3163-8782 |
title_sort |
deciphering the impact of genetic variation on human polyadenylation using aparent2 |
title_auth |
Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
abstract |
Background 3′-end processing by cleavage and polyadenylation is an important and finely tuned regulatory process during mRNA maturation. Numerous genetic variants are known to cause or contribute to human disorders by disrupting the cis-regulatory code of polyadenylation signals. Yet, due to the complexity of this code, variant interpretation remains challenging. Results We introduce a residual neural network model, APARENT2, that can infer 3′-cleavage and polyadenylation from DNA sequence more accurately than any previous model. This model generalizes to the case of alternative polyadenylation (APA) for a variable number of polyadenylation signals. We demonstrate APARENT2’s performance on several variant datasets, including functional reporter data and human 3′ aQTLs from GTEx. We apply neural network interpretation methods to gain insights into disrupted or protective higher-order features of polyadenylation. We fine-tune APARENT2 on human tissue-resolved transcriptomic data to elucidate tissue-specific variant effects. By combining APARENT2 with models of mRNA stability, we extend aQTL effect size predictions to the entire 3′ untranslated region. Finally, we perform in silico saturation mutagenesis of all human polyadenylation signals and compare the predicted effects of >43 million variants against gnomAD. While loss-of-function variants were generally selected against, we also find specific clinical conditions linked to gain-of-function mutations. For example, we detect an association between gain-of-function mutations in the 3′-end and autism spectrum disorder. To experimentally validate APARENT2’s predictions, we assayed clinically relevant variants in multiple cell lines, including microglia-derived cells.
Conclusions A sequence-to-function model based on deep residual learning enables accurate functional interpretation of genetic variants in polyadenylation signals and, when coupled with large human variation databases, elucidates the link between functional 3′-end mutations and human health. © The Author(s) 2022 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_SPRINGER GBV_ILN_11 GBV_ILN_20 GBV_ILN_22 GBV_ILN_23 GBV_ILN_24 GBV_ILN_31 GBV_ILN_39 GBV_ILN_40 GBV_ILN_62 GBV_ILN_63 GBV_ILN_65 GBV_ILN_69 GBV_ILN_70 GBV_ILN_73 GBV_ILN_74 GBV_ILN_95 GBV_ILN_105 GBV_ILN_110 GBV_ILN_151 GBV_ILN_161 GBV_ILN_170 GBV_ILN_213 GBV_ILN_230 GBV_ILN_285 GBV_ILN_293 GBV_ILN_602 GBV_ILN_2003 GBV_ILN_2014 GBV_ILN_4012 GBV_ILN_4037 GBV_ILN_4112 GBV_ILN_4125 GBV_ILN_4126 GBV_ILN_4249 GBV_ILN_4305 GBV_ILN_4306 GBV_ILN_4307 GBV_ILN_4313 GBV_ILN_4322 GBV_ILN_4323 GBV_ILN_4324 GBV_ILN_4325 GBV_ILN_4338 GBV_ILN_4367 GBV_ILN_4700 |
container_issue |
1 |
title_short |
Deciphering the impact of genetic variation on human polyadenylation using APARENT2 |
url |
https://dx.doi.org/10.1186/s13059-022-02799-4 |
remote_bool |
true |
author2 |
Koplik, Samantha E. Kundaje, Anshul Seelig, Georg |
author2Str |
Koplik, Samantha E. Kundaje, Anshul Seelig, Georg |
ppnlink |
326173617 |
mediatype_str_mv |
c |
isOA_txt |
true |
hochschulschrift_bool |
false |
doi_str |
10.1186/s13059-022-02799-4 |
up_date |
2024-07-03T19:49:56.139Z |
_version_ |
1803588674406842371 |
score |
7.400157 |