Missing value imputation strategies for metabolomics data

The origin of missing values can be caused by different reasons and depending on these origins missing values should be considered differently and dealt with in different ways. In this research, four methods of imputation have been compared with respect to revealing their effects on the normality an...
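
The ideas named in the abstract (a missingness threshold, nearest-neighbour imputation, Bonferroni correction) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of `None` for a missing value, the distance measure, and the action attached to each missingness band are assumptions made for illustration; only the 40% and 70% bounds come from the abstract.

```python
import math


def missing_fraction(values):
    """Fraction of entries in one feature (metabolite) that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def classify_feature(values, low=0.40, high=0.70):
    """Label a feature by its missingness, using the 40%/70% bounds from the abstract.

    The label given to each band is an assumption of this sketch.
    """
    frac = missing_fraction(values)
    if frac < low:
        return "impute"         # few gaps: treated here as missing by chance
    if frac <= high:
        return "gray area"      # could be either; needs a case-by-case decision
    return "truly missing"      # mostly absent: plausibly a real zero


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    rows: list of samples, each a list of feature values (None = missing).
    Distance is the root-mean-square difference over features observed in both rows.
    """
    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    imputed = [list(r) for r in rows]  # copy, so the input is left unchanged
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows with feature j observed and a finite distance.
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                d = distance(row, other)
                if d < math.inf:
                    candidates.append((d, other[j]))
            candidates.sort()
            donors = candidates[:k]
            if donors:
                imputed[i][j] = sum(v for _, v in donors) / len(donors)
    return imputed


def bonferroni(p_values, alpha=0.05):
    """Return True for each p-value that stays significant after Bonferroni correction."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


if __name__ == "__main__":
    samples = [
        [1.0, 2.0, None],
        [1.1, 2.1, 3.0],
        [0.9, 1.9, 5.0],
        [10.0, 20.0, 50.0],
    ]
    print(knn_impute(samples, k=2)[0])
    print(classify_feature([None, None, 1.0, 2.0]))
    print(bonferroni([0.001, 0.02, 0.04]))
```

In practice a library imputer would normally replace the hand-written neighbour search; the plain-Python version is used here only to keep the sketch self-contained.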

Author:	Armitage, Emily Grace [author]; Godzien, Joanna; Alonso‐Herranz, Vanesa; López‐Gonzálvez, Ángeles; Barbas, Coral

Format:	Article
Language:	English

Published:	2015

Rights information:	Usage rights: © 2015 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim.

Keywords:	CE‐MS; Data; False‐discovery rate; k‐nearest neighbour; Imputation; Missing values; Metabolomics

Parent work:	Contained in: Electrophoresis - Weinheim : Wiley-VCH, 1980, 36(2015), 24, pages 3050-3060
Parent work:	volume:36 ; year:2015 ; number:24 ; pages:3050-3060

DOI / URN:	10.1002/elps.201500352

Catalog ID:	OLC1958967114

Internal format


LEADER	01000caa a2200265 4500
001	OLC1958967114
003	DE-627
005	20230519020931.0
007	tu
008	160206s2015 xx ||||| 00| ||eng c
024	7		|a 10.1002/elps.201500352 |2 doi
028	5	2	|a PQ20160617
035			|a (DE-627)OLC1958967114
035			|a (DE-599)GBVOLC1958967114
035			|a (PRQ)p927-b5b5a994d8613c28550d7e445dce004084555a05a4eb5508265a49738119ef863
035			|a (KEY)0204026320150000036002403050missingvalueimputationstrategiesformetabolomicsdat
040			|a DE-627 |b ger |c DE-627 |e rakwb
041			|a eng
082	0	4	|a 540 |a 570 |q DNB
082	0	4	|a 570 |q AVZ
084			|a BIODIV |2 fid
084			|a 35.29 |2 bkl
100	1		|a Armitage, Emily Grace |e verfasserin |4 aut
245	1	0	|a Missing value imputation strategies for metabolomics data
264		1	|c 2015
336			|a Text |b txt |2 rdacontent
337			|a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338			|a Band |b nc |2 rdacarrier
520			|a The origin of missing values can be caused by different reasons and depending on these origins missing values should be considered differently and dealt with in different ways. In this research, four methods of imputation have been compared with respect to revealing their effects on the normality and variance of data, on statistical significance and on the approximation of a suitable threshold to accept missing data as truly missing. Additionally, the effects of different strategies for controlling familywise error rate or false discovery and how they work with the different strategies for missing value imputation have been evaluated. Missing values were found to affect normality and variance of data and k‐means nearest neighbour imputation was the best method tested for restoring this. Bonferroni correction was the best method for maximizing true positives and minimizing false positives and it was observed that as low as 40% missing data could be truly missing. The range between 40 and 70% missing values was defined as a “gray area” and therefore a strategy has been proposed that provides a balance between the optimal imputation strategy that was k‐means nearest neighbor and the best approximation of positioning real zeros.
540			|a Nutzungsrecht: © 2015 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim
540			|a © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
650		4	|a CE‐MS
650		4	|a Data
650		4	|a False‐discovery rate
650		4	|a k‐nearest neighbour
650		4	|a Imputation
650		4	|a Missing values
650		4	|a Metabolomics
700	1		|a Godzien, Joanna |4 oth
700	1		|a Alonso‐Herranz, Vanesa |4 oth
700	1		|a López‐Gonzálvez, Ángeles |4 oth
700	1		|a Barbas, Coral |4 oth
773	0	8	|i Enthalten in |t Electrophoresis |d Weinheim : Wiley-VCH, 1980 |g 36(2015), 24, Seite 3050-3060 |w (DE-627)130409952 |w (DE-600)619001-7 |w (DE-576)015913732 |x 0173-0835 |7 nnns
773	1	8	|g volume:36 |g year:2015 |g number:24 |g pages:3050-3060
856	4	1	|u http://dx.doi.org/10.1002/elps.201500352 |3 Volltext
856	4	2	|u http://onlinelibrary.wiley.com/doi/10.1002/elps.201500352/abstract
856	4	2	|u http://www.ncbi.nlm.nih.gov/pubmed/26376450
912			|a GBV_USEFLAG_A
912			|a SYSFLAG_A
912			|a GBV_OLC
912			|a FID-BIODIV
912			|a SSG-OLC-TEC
912			|a SSG-OLC-CHE
912			|a SSG-OLC-PHA
912			|a SSG-OLC-DE-84
912			|a GBV_ILN_70
912			|a GBV_ILN_267
912			|a GBV_ILN_2018
912			|a GBV_ILN_2219
912			|a GBV_ILN_4012
936	b	k	|a 35.29 |q AVZ
951			|a AR
952			|d 36 |j 2015 |e 24 |h 3050-3060

Index fields

author_variant	e g a eg ega
matchkey_str	article:01730835:2015----::isnvlemuaintaeisom
hierarchy_sort_str	2015
bklnumber	35.29
publishDate	2015
allfields	10.1002/elps.201500352 doi PQ20160617 (DE-627)OLC1958967114 (DE-599)GBVOLC1958967114 (PRQ)p927-b5b5a994d8613c28550d7e445dce004084555a05a4eb5508265a49738119ef863 (KEY)0204026320150000036002403050missingvalueimputationstrategiesformetabolomicsdat DE-627 ger DE-627 rakwb eng 540 570 DNB 570 AVZ BIODIV fid 35.29 bkl Armitage, Emily Grace verfasserin aut Missing value imputation strategies for metabolomics data 2015 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier The origin of missing values can be caused by different reasons and depending on these origins missing values should be considered differently and dealt with in different ways. In this research, four methods of imputation have been compared with respect to revealing their effects on the normality and variance of data, on statistical significance and on the approximation of a suitable threshold to accept missing data as truly missing. Additionally, the effects of different strategies for controlling familywise error rate or false discovery and how they work with the different strategies for missing value imputation have been evaluated. Missing values were found to affect normality and variance of data and k‐means nearest neighbour imputation was the best method tested for restoring this. Bonferroni correction was the best method for maximizing true positives and minimizing false positives and it was observed that as low as 40% missing data could be truly missing. The range between 40 and 70% missing values was defined as a “gray area” and therefore a strategy has been proposed that provides a balance between the optimal imputation strategy that was k‐means nearest neighbor and the best approximation of positioning real zeros. Nutzungsrecht: © 2015 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. 
CE‐MS Data False‐discovery rate k‐nearest neighbour Imputation Missing values Metabolomics Godzien, Joanna oth Alonso‐Herranz, Vanesa oth López‐Gonzálvez, Ángeles oth Barbas, Coral oth Enthalten in Electrophoresis Weinheim : Wiley-VCH, 1980 36(2015), 24, Seite 3050-3060 (DE-627)130409952 (DE-600)619001-7 (DE-576)015913732 0173-0835 nnns volume:36 year:2015 number:24 pages:3050-3060 http://dx.doi.org/10.1002/elps.201500352 Volltext http://onlinelibrary.wiley.com/doi/10.1002/elps.201500352/abstract http://www.ncbi.nlm.nih.gov/pubmed/26376450 GBV_USEFLAG_A SYSFLAG_A GBV_OLC FID-BIODIV SSG-OLC-TEC SSG-OLC-CHE SSG-OLC-PHA SSG-OLC-DE-84 GBV_ILN_70 GBV_ILN_267 GBV_ILN_2018 GBV_ILN_2219 GBV_ILN_4012 35.29 AVZ AR 36 2015 24 3050-3060
language	English
source	Enthalten in Electrophoresis 36(2015), 24, Seite 3050-3060 volume:36 year:2015 number:24 pages:3050-3060
sourceStr	Enthalten in Electrophoresis 36(2015), 24, Seite 3050-3060 volume:36 year:2015 number:24 pages:3050-3060
format_phy_str_mv	Article
institution	findex.gbv.de
topic_facet	CE‐MS Data False‐discovery rate k‐nearest neighbour Imputation Missing values Metabolomics
dewey-raw	540
isfreeaccess_bool	false
container_title	Electrophoresis
authorswithroles_txt_mv	Armitage, Emily Grace @@aut@@ Godzien, Joanna @@oth@@ Alonso‐Herranz, Vanesa @@oth@@ López‐Gonzálvez, Ángeles @@oth@@ Barbas, Coral @@oth@@
publishDateDaySort_date	2015-01-01T00:00:00Z
hierarchy_top_id	130409952
dewey-sort	3540
id	OLC1958967114
language_de	englisch
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a2200265 4500</leader><controlfield tag="001">OLC1958967114</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230519020931.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">160206s2015 xx \|\|\|\|\| 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1002/elps.201500352</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">PQ20160617</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC1958967114</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBVOLC1958967114</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(PRQ)p927-b5b5a994d8613c28550d7e445dce004084555a05a4eb5508265a49738119ef863</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(KEY)0204026320150000036002403050missingvalueimputationstrategiesformetabolomicsdat</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">540</subfield><subfield code="a">570</subfield><subfield code="q">DNB</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">570</subfield><subfield code="q">AVZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">BIODIV</subfield><subfield code="2">fid</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">35.29</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" 
"><subfield code="a">Armitage, Emily Grace</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Missing value imputation strategies for metabolomics data</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2015</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">The origin of missing values can be caused by different reasons and depending on these origins missing values should be considered differently and dealt with in different ways. In this research, four methods of imputation have been compared with respect to revealing their effects on the normality and variance of data, on statistical significance and on the approximation of a suitable threshold to accept missing data as truly missing. Additionally, the effects of different strategies for controlling familywise error rate or false discovery and how they work with the different strategies for missing value imputation have been evaluated. Missing values were found to affect normality and variance of data and k‐means nearest neighbour imputation was the best method tested for restoring this. Bonferroni correction was the best method for maximizing true positives and minimizing false positives and it was observed that as low as 40% missing data could be truly missing. 
The range between 40 and 70% missing values was defined as a “gray area” and therefore a strategy has been proposed that provides a balance between the optimal imputation strategy that was k‐means nearest neighbor and the best approximation of positioning real zeros.</subfield></datafield><datafield tag="540" ind1=" " ind2=" "><subfield code="a">Nutzungsrecht: © 2015 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim</subfield></datafield><datafield tag="540" ind1=" " ind2=" "><subfield code="a">© 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">CE‐MS</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Data</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">False‐discovery rate</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">k‐nearest neighbour</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Imputation</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Missing values</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Metabolomics</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Godzien, Joanna</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Alonso‐Herranz, Vanesa</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">López‐Gonzálvez, Ángeles</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Barbas, Coral</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Electrophoresis</subfield><subfield code="d">Weinheim : Wiley-VCH, 1980</subfield><subfield code="g">36(2015), 24, Seite 3050-3060</subfield><subfield 
code="w">(DE-627)130409952</subfield><subfield code="w">(DE-600)619001-7</subfield><subfield code="w">(DE-576)015913732</subfield><subfield code="x">0173-0835</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:36</subfield><subfield code="g">year:2015</subfield><subfield code="g">number:24</subfield><subfield code="g">pages:3050-3060</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://dx.doi.org/10.1002/elps.201500352</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://onlinelibrary.wiley.com/doi/10.1002/elps.201500352/abstract</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://www.ncbi.nlm.nih.gov/pubmed/26376450</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">FID-BIODIV</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-TEC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-CHE</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-PHA</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-DE-84</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_267</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2018</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2219</subfield></datafield><datafield tag="912" ind1=" " ind2=" 
"><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="936" ind1="b" ind2="k"><subfield code="a">35.29</subfield><subfield code="q">AVZ</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">36</subfield><subfield code="j">2015</subfield><subfield code="e">24</subfield><subfield code="h">3050-3060</subfield></datafield></record></collection>
author	Armitage, Emily Grace
spellingShingle	Armitage, Emily Grace ddc 540 ddc 570 fid BIODIV bkl 35.29 misc CE‐MS misc Data misc False‐discovery rate misc k‐nearest neighbour misc Imputation misc Missing values misc Metabolomics Missing value imputation strategies for metabolomics data
authorStr	Armitage, Emily Grace
ppnlink_with_tag_str_mv	@@773@@(DE-627)130409952
format	Article
dewey-ones	540 - Chemistry & allied sciences 570 - Life sciences; biology
delete_txt_mv	keep
author_role	aut
collection	OLC
remote_str	false
illustrated	Not Illustrated
issn	0173-0835
topic_title	540 570 DNB 570 AVZ BIODIV fid 35.29 bkl Missing value imputation strategies for metabolomics data CE‐MS Data False‐discovery rate k‐nearest neighbour Imputation Missing values Metabolomics
topic	ddc 540 ddc 570 fid BIODIV bkl 35.29 misc CE‐MS misc Data misc False‐discovery rate misc k‐nearest neighbour misc Imputation misc Missing values misc Metabolomics
format_facet	Aufsätze Gedruckte Aufsätze
format_main_str_mv	Text Zeitschrift/Artikel
carriertype_str_mv	nc
author2_variant	j g jg v a va á l ál c b cb
hierarchy_parent_title	Electrophoresis
hierarchy_parent_id	130409952
dewey-tens	540 - Chemistry 570 - Life sciences; biology
hierarchy_top_title	Electrophoresis
isfreeaccess_txt	false
familylinks_str_mv	(DE-627)130409952 (DE-600)619001-7 (DE-576)015913732
title	Missing value imputation strategies for metabolomics data
ctrlnum	(DE-627)OLC1958967114 (DE-599)GBVOLC1958967114 (PRQ)p927-b5b5a994d8613c28550d7e445dce004084555a05a4eb5508265a49738119ef863 (KEY)0204026320150000036002403050missingvalueimputationstrategiesformetabolomicsdat
title_full	Missing value imputation strategies for metabolomics data
author_sort	Armitage, Emily Grace
journal	Electrophoresis
journalStr	Electrophoresis
lang_code	eng
isOA_bool	false
dewey-hundreds	500 - Science
recordtype	marc
publishDateSort	2015
contenttype_str_mv	txt
container_start_page	3050
author_browse	Armitage, Emily Grace
container_volume	36
class	540 570 DNB 570 AVZ BIODIV fid 35.29 bkl
format_se	Aufsätze
author-letter	Armitage, Emily Grace
doi_str_mv	10.1002/elps.201500352
dewey-full	540 570
title_sort	missing value imputation strategies for metabolomics data
title_auth	Missing value imputation strategies for metabolomics data
abstract	The origin of missing values can be caused by different reasons and depending on these origins missing values should be considered differently and dealt with in different ways. In this research, four methods of imputation have been compared with respect to revealing their effects on the normality and variance of data, on statistical significance and on the approximation of a suitable threshold to accept missing data as truly missing. Additionally, the effects of different strategies for controlling familywise error rate or false discovery and how they work with the different strategies for missing value imputation have been evaluated. Missing values were found to affect normality and variance of data and k‐means nearest neighbour imputation was the best method tested for restoring this. Bonferroni correction was the best method for maximizing true positives and minimizing false positives and it was observed that as low as 40% missing data could be truly missing. The range between 40 and 70% missing values was defined as a “gray area” and therefore a strategy has been proposed that provides a balance between the optimal imputation strategy that was k‐means nearest neighbor and the best approximation of positioning real zeros.
collection_details	GBV_USEFLAG_A SYSFLAG_A GBV_OLC FID-BIODIV SSG-OLC-TEC SSG-OLC-CHE SSG-OLC-PHA SSG-OLC-DE-84 GBV_ILN_70 GBV_ILN_267 GBV_ILN_2018 GBV_ILN_2219 GBV_ILN_4012
container_issue	24
title_short	Missing value imputation strategies for metabolomics data
url	http://dx.doi.org/10.1002/elps.201500352 http://onlinelibrary.wiley.com/doi/10.1002/elps.201500352/abstract http://www.ncbi.nlm.nih.gov/pubmed/26376450
remote_bool	false
author2	Godzien, Joanna; Alonso‐Herranz, Vanesa; López‐Gonzálvez, Ángeles; Barbas, Coral
ppnlink	130409952
mediatype_str_mv	n
isOA_txt	false
hochschulschrift_bool	false
author2_role	oth oth oth oth
doi_str	10.1002/elps.201500352
up_date	2024-07-03T15:15:53.794Z
_version_	1803571433356394496