Comparison of Bayesian predictive methods for model selection
Abstract: The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches. We focus on the variable subset selection for regression and classification and perform several numerical experiments using both simulated and real world data. The results show that the optimization of a utility estimate such as the cross-validation (CV) score is liable to finding overfitted models due to relatively high variance in the utility estimates when the data is scarce. This can also lead to substantial selection induced bias and optimism in the performance evaluation for the selected model. From a predictive viewpoint, best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on CV-score. Overall, the projection method appears to outperform also the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that the model selection can greatly benefit from using cross-validation outside the searching process both for guiding the model size selection and assessing the predictive performance of the finally selected model.
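The selection-induced optimism the abstract warns about can be reproduced in a toy simulation (an illustrative sketch only, not the paper's experiments; all names and sizes below are made up): with scarce data, picking the candidate with the best CV score out of many pure-noise predictors yields a score that looks better than the honest baseline, simply because the minimum over many noisy estimates is biased downward.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 50
X = rng.normal(size=(n, p))   # pure-noise candidate predictors
y = rng.normal(size=n)        # response independent of every predictor

def cv_mse(x, y, k=5):
    """k-fold cross-validated mean squared error of a univariate linear fit."""
    idx = np.arange(len(y))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        slope, intercept = np.polyfit(x[train], y[train], 1)
        errs.append(np.mean((y[fold] - (slope * x[fold] + intercept)) ** 2))
    return float(np.mean(errs))

scores = np.array([cv_mse(X[:, j], y) for j in range(p)])
baseline = float(np.var(y))   # roughly the error of just predicting the mean

# The winning score is the minimum of p noisy estimates, so it tends to
# look optimistic even though no predictor carries any signal.
print(f"baseline MSE ~ {baseline:.3f}, best-of-{p} CV MSE = {scores.min():.3f}")
```

This is exactly why the paper recommends keeping cross-validation outside the search: the CV score used to pick the winner can no longer be trusted as an estimate of that winner's predictive performance.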
Detailed description

Author: Piironen, Juho [author]
Format: Article
Language: English
Published: 2016
Subject headings: Bayesian model selection; Cross-validation; Reference model; Projection; Selection bias
Note: © The Author(s) 2016
Contained in: Statistics and computing - Springer US, 1991, 27(2016), 3, 07 Apr., pages 711-735
Contained in: volume:27 ; year:2016 ; number:3 ; day:07 ; month:04 ; pages:711-735
DOI / URN: 10.1007/s11222-016-9649-y
Catalog ID: OLC2033748898
LEADER 01000caa a22002652 4500
001    OLC2033748898
003    DE-627
005    20230504051513.0
007    tu
008    200819s2016 xx ||||| 00| ||eng c
024 7  |a 10.1007/s11222-016-9649-y |2 doi
035    |a (DE-627)OLC2033748898
035    |a (DE-He213)s11222-016-9649-y-p
040    |a DE-627 |b ger |c DE-627 |e rakwb
041    |a eng
082 04 |a 004 |a 620 |q VZ
100 1  |a Piironen, Juho |e verfasserin |4 aut
245 10 |a Comparison of Bayesian predictive methods for model selection
264  1 |c 2016
336    |a Text |b txt |2 rdacontent
337    |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338    |a Band |b nc |2 rdacarrier
500    |a © The Author(s) 2016
520    |a Abstract The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches. We focus on the variable subset selection for regression and classification and perform several numerical experiments using both simulated and real world data. The results show that the optimization of a utility estimate such as the cross-validation (CV) score is liable to finding overfitted models due to relatively high variance in the utility estimates when the data is scarce. This can also lead to substantial selection induced bias and optimism in the performance evaluation for the selected model. From a predictive viewpoint, best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on CV-score. Overall, the projection method appears to outperform also the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that the model selection can greatly benefit from using cross-validation outside the searching process both for guiding the model size selection and assessing the predictive performance of the finally selected model.
650  4 |a Bayesian model selection
650  4 |a Cross-validation
650  4 |a Reference model
650  4 |a Projection
650  4 |a Selection bias
700 1  |a Vehtari, Aki |4 aut
773 08 |i Enthalten in |t Statistics and computing |d Springer US, 1991 |g 27(2016), 3 vom: 07. Apr., Seite 711-735 |w (DE-627)131007963 |w (DE-600)1087487-2 |w (DE-576)052732894 |x 0960-3174 |7 nnns
773 18 |g volume:27 |g year:2016 |g number:3 |g day:07 |g month:04 |g pages:711-735
856 41 |u https://doi.org/10.1007/s11222-016-9649-y |z lizenzpflichtig |3 Volltext
912    |a GBV_USEFLAG_A
912    |a SYSFLAG_A
912    |a GBV_OLC
912    |a SSG-OLC-TEC
912    |a SSG-OLC-MAT
912    |a GBV_ILN_70
912    |a GBV_ILN_4126
951    |a AR
952    |d 27 |j 2016 |e 3 |b 07 |c 04 |h 711-735
author_variant |
j p jp a v av |
matchkey_str |
article:09603174:2016----::oprsnfaeinrdcieehdf |
hierarchy_sort_str |
2016 |
publishDate |
2016 |
allfields |
10.1007/s11222-016-9649-y doi (DE-627)OLC2033748898 (DE-He213)s11222-016-9649-y-p DE-627 ger DE-627 rakwb eng 004 620 VZ Piironen, Juho verfasserin aut Comparison of Bayesian predictive methods for model selection 2016 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s) 2016 Abstract The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches. We focus on the variable subset selection for regression and classification and perform several numerical experiments using both simulated and real world data. The results show that the optimization of a utility estimate such as the cross-validation (CV) score is liable to finding overfitted models due to relatively high variance in the utility estimates when the data is scarce. This can also lead to substantial selection induced bias and optimism in the performance evaluation for the selected model. From a predictive viewpoint, best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on CV-score. Overall, the projection method appears to outperform also the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that the model selection can greatly benefit from using cross-validation outside the searching process both for guiding the model size selection and assessing the predictive performance of the finally selected model. 
Bayesian model selection Cross-validation Reference model Projection Selection bias Vehtari, Aki aut Enthalten in Statistics and computing Springer US, 1991 27(2016), 3 vom: 07. Apr., Seite 711-735 (DE-627)131007963 (DE-600)1087487-2 (DE-576)052732894 0960-3174 nnns volume:27 year:2016 number:3 day:07 month:04 pages:711-735 https://doi.org/10.1007/s11222-016-9649-y lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_4126 AR 27 2016 3 07 04 711-735 |
language |
English |
source |
Enthalten in Statistics and computing 27(2016), 3 vom: 07. Apr., Seite 711-735 volume:27 year:2016 number:3 day:07 month:04 pages:711-735 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Bayesian model selection Cross-validation Reference model Projection Selection bias |
dewey-raw |
004 |
isfreeaccess_bool |
false |
container_title |
Statistics and computing |
authorswithroles_txt_mv |
Piironen, Juho @@aut@@ Vehtari, Aki @@aut@@ |
publishDateDaySort_date |
2016-04-07T00:00:00Z |
hierarchy_top_id |
131007963 |
dewey-sort |
14 |
id |
OLC2033748898 |
language_de |
englisch |
author |
Piironen, Juho |
spellingShingle |
Piironen, Juho ddc 004 misc Bayesian model selection misc Cross-validation misc Reference model misc Projection misc Selection bias Comparison of Bayesian predictive methods for model selection |
authorStr |
Piironen, Juho |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)131007963 |
format |
Article |
dewey-ones |
004 - Data processing & computer science 620 - Engineering & allied operations |
delete_txt_mv |
keep |
author_role |
aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
0960-3174 |
topic_title |
004 620 VZ Comparison of Bayesian predictive methods for model selection Bayesian model selection Cross-validation Reference model Projection Selection bias |
topic |
ddc 004 misc Bayesian model selection misc Cross-validation misc Reference model misc Projection misc Selection bias |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Statistics and computing |
hierarchy_parent_id |
131007963 |
dewey-tens |
000 - Computer science, knowledge & systems 620 - Engineering |
hierarchy_top_title |
Statistics and computing |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)131007963 (DE-600)1087487-2 (DE-576)052732894 |
title |
Comparison of Bayesian predictive methods for model selection |
ctrlnum |
(DE-627)OLC2033748898 (DE-He213)s11222-016-9649-y-p |
author_sort |
Piironen, Juho |
journal |
Statistics and computing |
journalStr |
Statistics and computing |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works 600 - Technology |
recordtype |
marc |
publishDateSort |
2016 |
contenttype_str_mv |
txt |
container_start_page |
711 |
author_browse |
Piironen, Juho Vehtari, Aki |
container_volume |
27 |
class |
004 620 VZ |
format_se |
Aufsätze |
author-letter |
Piironen, Juho |
doi_str_mv |
10.1007/s11222-016-9649-y |
dewey-full |
004 620 |
title_sort |
comparison of bayesian predictive methods for model selection |
abstract |
Abstract The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches. We focus on the variable subset selection for regression and classification and perform several numerical experiments using both simulated and real world data. The results show that the optimization of a utility estimate such as the cross-validation (CV) score is liable to finding overfitted models due to relatively high variance in the utility estimates when the data is scarce. This can also lead to substantial selection induced bias and optimism in the performance evaluation for the selected model. From a predictive viewpoint, best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on CV-score. Overall, the projection method appears to outperform also the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that the model selection can greatly benefit from using cross-validation outside the searching process both for guiding the model size selection and assessing the predictive performance of the finally selected model. © The Author(s) 2016 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_4126 |
container_issue |
3 |
title_short |
Comparison of Bayesian predictive methods for model selection |
url |
https://doi.org/10.1007/s11222-016-9649-y |
remote_bool |
false |
author2 |
Vehtari, Aki |
ppnlink |
131007963 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
up_date |
2024-07-03T18:17:29.634Z |
_version_ |
1803582858473766912 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>01000caa a22002652 4500</leader>
    <controlfield tag="001">OLC2033748898</controlfield>
    <controlfield tag="003">DE-627</controlfield>
    <controlfield tag="005">20230504051513.0</controlfield>
    <controlfield tag="007">tu</controlfield>
    <controlfield tag="008">200819s2016 xx ||||| 00| ||eng c</controlfield>
    <datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s11222-016-9649-y</subfield><subfield code="2">doi</subfield></datafield>
    <datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2033748898</subfield></datafield>
    <datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s11222-016-9649-y-p</subfield></datafield>
    <datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield>
    <datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield>
    <datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="a">620</subfield><subfield code="q">VZ</subfield></datafield>
    <datafield tag="100" ind1="1" ind2=" "><subfield code="a">Piironen, Juho</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield>
    <datafield tag="245" ind1="1" ind2="0"><subfield code="a">Comparison of Bayesian predictive methods for model selection</subfield></datafield>
    <datafield tag="264" ind1=" " ind2="1"><subfield code="c">2016</subfield></datafield>
    <datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield>
    <datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield>
    <datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield>
    <datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s) 2016</subfield></datafield>
    <datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract The goal of this paper is to compare several widely used Bayesian model selection methods in practical model selection problems, highlight their differences and give recommendations about the preferred approaches. We focus on variable subset selection for regression and classification and perform several numerical experiments using both simulated and real-world data. The results show that optimization of a utility estimate such as the cross-validation (CV) score is liable to find overfitted models, due to the relatively high variance in the utility estimates when the data are scarce. This can also lead to substantial selection-induced bias and optimism in the performance evaluation for the selected model. From a predictive viewpoint, the best results are obtained by accounting for model uncertainty by forming the full encompassing model, such as the Bayesian model averaging solution over the candidate models. If the encompassing model is too complex, it can be robustly simplified by the projection method, in which the information of the full model is projected onto the submodels. This approach is substantially less prone to overfitting than selection based on the CV score. Overall, the projection method appears to also outperform the maximum a posteriori model and the selection of the most probable variables. The study also demonstrates that model selection can greatly benefit from using cross-validation outside the searching process, both for guiding the model size selection and for assessing the predictive performance of the finally selected model.</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Bayesian model selection</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Cross-validation</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Reference model</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Projection</subfield></datafield>
    <datafield tag="650" ind1=" " ind2="4"><subfield code="a">Selection bias</subfield></datafield>
    <datafield tag="700" ind1="1" ind2=" "><subfield code="a">Vehtari, Aki</subfield><subfield code="4">aut</subfield></datafield>
    <datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Statistics and computing</subfield><subfield code="d">Springer US, 1991</subfield><subfield code="g">27(2016), 3 vom: 07. Apr., Seite 711-735</subfield><subfield code="w">(DE-627)131007963</subfield><subfield code="w">(DE-600)1087487-2</subfield><subfield code="w">(DE-576)052732894</subfield><subfield code="x">0960-3174</subfield><subfield code="7">nnns</subfield></datafield>
    <datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:27</subfield><subfield code="g">year:2016</subfield><subfield code="g">number:3</subfield><subfield code="g">day:07</subfield><subfield code="g">month:04</subfield><subfield code="g">pages:711-735</subfield></datafield>
    <datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s11222-016-9649-y</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-TEC</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield>
    <datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4126</subfield></datafield>
    <datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield>
    <datafield tag="952" ind1=" " ind2=" "><subfield code="d">27</subfield><subfield code="j">2016</subfield><subfield code="e">3</subfield><subfield code="b">07</subfield><subfield code="c">04</subfield><subfield code="h">711-735</subfield></datafield>
  </record>
</collection>
|
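The abstract's central claim — when many candidate models are scored on the same small data set, the best utility estimate (such as a CV score) overstates true predictive performance — can be illustrated with a toy sketch. This is our own illustration, not code or data from the paper: fifty pure-noise binary "predictors" are scored against a noise target, and the winning in-sample score far exceeds what the selected predictor achieves on fresh data.

```python
# Toy illustration of selection-induced optimism: with pure-noise data,
# every predictor has true accuracy 0.5, yet the best of many in-sample
# scores looks much better than that.
import random

random.seed(0)
n, p = 30, 50  # small sample, many candidate binary "predictors"

y = [random.randint(0, 1) for _ in range(n)]
X = [[random.randint(0, 1) for _ in range(n)] for _ in range(p)]

def agreement(xs, ys):
    """Fraction of positions where prediction matches target."""
    return sum(a == b for a, b in zip(xs, ys)) / len(ys)

# "Utility" of predictor j: how often x_j matches y on the selection data.
scores = [agreement(X[j], y) for j in range(p)]
best_j = max(range(p), key=lambda j: scores[j])
best_score = scores[best_j]

# Fresh data from the same pure-noise process: true accuracy is ~0.5,
# so the selected score was optimistic.
m = 2000
y_new = [random.randint(0, 1) for _ in range(m)]
x_new = [random.randint(0, 1) for _ in range(m)]
test_score = agreement(x_new, y_new)

print(f"best in-sample score:  {best_score:.2f}")
print(f"out-of-sample score:   {test_score:.2f}")
```

This mirrors the paper's point that a utility estimate used both to search over candidates and to report performance is biased upward, which is why the abstract recommends keeping cross-validation outside the search process.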