A Universal Approximation Theorem for Mixture-of-Experts Models
The mixture-of-experts (MoE) model is a popular neural network architecture for nonlinear regression and classification. The class of MoE mean functions is known to be uniformly convergent to any unknown target function, assuming that the target function is from a Sobolev space that is sufficiently differentiable and that the domain of estimation is a compact unit hypercube. We provide an alternative result, which shows that the class of MoE mean functions is dense in the class of all continuous functions over arbitrary compact domains of estimation. Our result can be viewed as a universal approximation theorem for MoE models. The theorem we present allows MoE users to be confident in applying such models for estimation when data arise from nonlinear and nondifferentiable generative processes.
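To make the object of the theorem concrete, the sketch below fits a small softmax-gated MoE mean function of the standard form m(x) = Σ_k g_k(x) μ_k(x), with linear experts, to the continuous but nondifferentiable target |x|. This is an illustrative example only, not the construction from the article; the expert count K = 4, the helper names (moe_mean, loss), and the least-squares fit via SciPy are assumptions made purely for the demonstration.

# Illustrative sketch (not the paper's construction): a softmax-gated
# mixture-of-experts mean function with K linear experts, fit by least
# squares to the nondifferentiable target f(x) = |x| on [-1, 1].
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = np.abs(x)                      # continuous but nondifferentiable target
K = 4                              # number of experts (arbitrary choice)

def moe_mean(params, x):
    """MoE mean: sum over experts of softmax gate_k(x) * linear expert_k(x)."""
    p = params.reshape(K, 4)       # per expert: gate slope, gate bias, expert slope, expert bias
    a, b, c, d = p[:, 0], p[:, 1], p[:, 2], p[:, 3]
    logits = np.outer(x, a) + b    # (n, K) gating logits
    logits -= logits.max(axis=1, keepdims=True)   # stabilize softmax
    gates = np.exp(logits)
    gates /= gates.sum(axis=1, keepdims=True)     # softmax gating weights
    experts = np.outer(x, c) + d   # (n, K) linear expert means
    return (gates * experts).sum(axis=1)

def loss(params):
    return np.mean((moe_mean(params, x) - y) ** 2)

res = minimize(loss, rng.normal(size=4 * K), method="L-BFGS-B")
print("max abs error:", np.max(np.abs(moe_mean(res.x, x) - y)))

With only a handful of linear experts the fitted mean function already tracks the kink at x = 0 closely, which is the practical point of the density result: no smoothness of the target is required, only continuity on a compact domain.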
Detailed description

Author: Hien D Nguyen [author]
Format: Article
Language: English
Published: 2016
Subjects: Neurosciences; Theorems; Estimating techniques; Approximations; Mathematical models; Neural networks
Contained in: Neural computation - Cambridge, Mass. : MIT Press, 1989, 28(2016), 12, pages 2585-2593
Contained in: volume:28 ; year:2016 ; number:12 ; pages:2585-2593
Links: http://dx.doi.org/10.1162/NECO_a_00892 (full text); http://search.proquest.com/docview/1844210931
DOI / URN: 10.1162/NECO_a_00892
Catalog ID: OLC1989106781
LEADER 01000caa a2200265 4500
001 OLC1989106781
003 DE-627
005 20210716170724.0
007 tu
008 170207s2016 xx ||||| 00| ||eng c
024 7   |a 10.1162/NECO_a_00892 |2 doi
028 5 2 |a PQ20170301
035     |a (DE-627)OLC1989106781
035     |a (DE-599)GBVOLC1989106781
035     |a (PRQ)c1301-c61db31d77d2ccc99c22bac5a333e04122cd90cf34453e57d56a2ae7ea81f45e0
035     |a (KEY)0175809820160000028001202585universalapproximationtheoremformixtureofexpertsmo
040     |a DE-627 |b ger |c DE-627 |e rakwb
041     |a eng
082 0 4 |a 004 |q DE-600
100 0   |a Hien D Nguyen |e verfasserin |4 aut
245 1 2 |a A Universal Approximation Theorem for Mixture-of-Experts Models
264   1 |c 2016
336     |a Text |b txt |2 rdacontent
337     |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338     |a Band |b nc |2 rdacarrier
520     |a The mixture-of-experts (MoE) model is a popular neural network architecture for nonlinear regression and classification. The class of MoE mean functions is known to be uniformly convergent to any unknown target function, assuming that the target function is from a Sobolev space that is sufficiently differentiable and that the domain of estimation is a compact unit hypercube. We provide an alternative result, which shows that the class of MoE mean functions is dense in the class of all continuous functions over arbitrary compact domains of estimation. Our result can be viewed as a universal approximation theorem for MoE models. The theorem we present allows MoE users to be confident in applying such models for estimation when data arise from nonlinear and nondifferentiable generative processes.
650   4 |a Neurosciences
650   4 |a Theorems
650   4 |a Estimating techniques
650   4 |a Approximations
650   4 |a Mathematical models
650   4 |a Neural networks
700 0   |a Luke R Lloyd-Jones |4 oth
700 0   |a Geoffrey J McLachlan |4 oth
773 0 8 |i Enthalten in |t Neural computation |d Cambridge, Mass. : MIT Press, 1989 |g 28(2016), 12, Seite 2585-2593 |w (DE-627)16566682X |w (DE-600)1025692-1 |w (DE-576)023099836 |x 0899-7667 |7 nnns
773 1 8 |g volume:28 |g year:2016 |g number:12 |g pages:2585-2593
856 4 1 |u http://dx.doi.org/10.1162/NECO_a_00892 |3 Volltext
856 4 2 |u http://search.proquest.com/docview/1844210931
912     |a GBV_USEFLAG_A
912     |a SYSFLAG_A
912     |a GBV_OLC
912     |a SSG-OLC-PHY
912     |a SSG-OLC-MAT
912     |a GBV_ILN_59
912     |a GBV_ILN_2192
951     |a AR
952     |d 28 |j 2016 |e 12 |h 2585-2593