On the Minimax Risk of Dictionary Learning
We consider the problem of learning a dictionary matrix from a number of observed signals, which are assumed to be generated via a linear model with a common underlying dictionary. In particular, we derive lower bounds on the minimum achievable worst case mean squared error (MSE), regardless of the computational complexity of the dictionary learning (DL) schemes. By casting DL as a classical (or frequentist) estimation problem, the lower bounds on the worst case MSE are derived following an established information-theoretic approach to minimax estimation. The main contribution of this paper is the adaptation of these information-theoretic tools to the DL problem in order to derive lower bounds on the worst case MSE of any DL algorithm. We derive three different lower bounds applying to different generative models for the observed signals. The first bound requires only the existence of a covariance matrix of the (unknown) underlying coefficient vector. By specializing this bound to the case of sparse coefficient distributions, and assuming the true dictionary satisfies the restricted isometry property, we obtain a second lower bound on the worst case MSE of DL methods in terms of the signal-to-noise ratio (SNR). The third bound applies to a more restrictive subclass of coefficient distributions by requiring the non-zero coefficients to be Gaussian. Although its applicability is the most limited, this bound is the tightest of the three in the low SNR regime. A particular use of our lower bounds is the derivation of necessary conditions on the required number of observations (sample size) such that DL is feasible, i.e., such that accurate DL schemes might exist. By comparing these necessary conditions with sufficient conditions on the sample size for a particular DL technique to succeed, we are able to characterize the regimes where those algorithms are optimal in terms of required sample size.
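As background, the estimation problem the abstract describes can be stated compactly as follows (the notation here is ours, sketched from the abstract, not quoted from the paper). Each of the $N$ observed signals is a noisy linear combination of the columns of a common dictionary:

\[
  \mathbf{y}_{k} = \mathbf{D}\,\mathbf{x}_{k} + \mathbf{n}_{k},
  \qquad k = 1,\dots,N,
\]

where $\mathbf{D} \in \mathbb{R}^{m \times p}$ is the unknown dictionary, $\mathbf{x}_{k}$ is the (typically sparse) coefficient vector, and $\mathbf{n}_{k}$ is additive noise. The minimax risk is then the worst case MSE of the best possible DL scheme $\widehat{\mathbf{D}}(\mathbf{y}_{1},\dots,\mathbf{y}_{N})$,

\[
  \varepsilon^{\ast} \;=\; \inf_{\widehat{\mathbf{D}}}\;
  \sup_{\mathbf{D} \in \mathcal{D}}\;
  \mathbb{E}\,\bigl\|\widehat{\mathbf{D}} - \mathbf{D}\bigr\|_{\mathrm{F}}^{2},
\]

and the paper's lower bounds on $\varepsilon^{\ast}$ hold for every estimator, regardless of its computational cost.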
Detailed description

Author:           A Jung [author]
Format:           Article
Language:         English
Published:        2016
Subject headings: Normal distribution; Dictionaries; Matrix; Learning; Sample size; Signal to noise ratio
Classification:   SA 5570 (RVK)
Contained in:     IEEE transactions on information theory - Piscataway, NJ : IEEE, 1963, 62(2016), 3, pages 1501-1
Host item:        volume:62 ; year:2016 ; number:3 ; pages:1501-1
Link:             http://dx.doi.org/10.1109/TIT.2016.2517006 (full text)
DOI / URN:        10.1109/TIT.2016.2517006
Catalog ID:       OLC1972621599
LEADER    01000caa a2200265 4500
001       OLC1972621599
003       DE-627
005       20220221163259.0
007       tu
008       160427s2016 xx ||||| 00| ||eng c
024 7_    |a 10.1109/TIT.2016.2517006 |2 doi
028 52    |a PQ20160430
035 __    |a (DE-627)OLC1972621599
035 __    |a (DE-599)GBVOLC1972621599
035 __    |a (PRQ)c1067-de8c2131eeb76fca609373f00383d9eae73fe6d76d78b289e05139cf5344e0420
035 __    |a (KEY)0023448620160000062000301501ontheminimaxriskofdictionarylearning
040 __    |a DE-627 |b ger |c DE-627 |e rakwb
041 __    |a eng
082 04    |a 070 |a 620 |q DNB
084 __    |a SA 5570 |q AVZ |2 rvk
100 0_    |a A Jung |e verfasserin |4 aut
245 10    |a On the Minimax Risk of Dictionary Learning
264 _1    |c 2016
336 __    |a Text |b txt |2 rdacontent
337 __    |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338 __    |a Band |b nc |2 rdacarrier
520 __    |a We consider the problem of learning a dictionary matrix from a number of observed signals, which are assumed to be generated via a linear model with a common underlying dictionary. In particular, we derive lower bounds on the minimum achievable worst case mean squared error (MSE), regardless of the computational complexity of the dictionary learning (DL) schemes. By casting DL as a classical (or frequentist) estimation problem, the lower bounds on the worst case MSE are derived following an established information-theoretic approach to minimax estimation. The main contribution of this paper is the adaptation of these information-theoretic tools to the DL problem in order to derive lower bounds on the worst case MSE of any DL algorithm. We derive three different lower bounds applying to different generative models for the observed signals. The first bound requires only the existence of a covariance matrix of the (unknown) underlying coefficient vector. By specializing this bound to the case of sparse coefficient distributions, and assuming the true dictionary satisfies the restricted isometry property, we obtain a second lower bound on the worst case MSE of DL methods in terms of the signal-to-noise ratio (SNR). The third bound applies to a more restrictive subclass of coefficient distributions by requiring the non-zero coefficients to be Gaussian. Although its applicability is the most limited, this bound is the tightest of the three in the low SNR regime. A particular use of our lower bounds is the derivation of necessary conditions on the required number of observations (sample size) such that DL is feasible, i.e., such that accurate DL schemes might exist. By comparing these necessary conditions with sufficient conditions on the sample size for a particular DL technique to succeed, we are able to characterize the regimes where those algorithms are optimal in terms of required sample size.
650 _4    |a Normal distribution
650 _4    |a Dictionaries
650 _4    |a Matrix
650 _4    |a Learning
650 _4    |a Sample size
650 _4    |a Signal to noise ratio
700 0_    |a YC Eldar |4 oth
700 0_    |a N Gortz |4 oth
773 08    |i Enthalten in |t IEEE transactions on information theory |d Piscataway, NJ : IEEE, 1963 |g 62(2016), 3, Seite 1501-1 |w (DE-627)12954731X |w (DE-600)218505-2 |w (DE-576)01499819X |x 0018-9448 |7 nnns
773 18    |g volume:62 |g year:2016 |g number:3 |g pages:1501-1
856 41    |u http://dx.doi.org/10.1109/TIT.2016.2517006 |3 Volltext
912 __    |a GBV_USEFLAG_A
912 __    |a SYSFLAG_A
912 __    |a GBV_OLC
912 __    |a SSG-OLC-TEC
912 __    |a SSG-OLC-MAT
912 __    |a SSG-OLC-BUB
912 __    |a SSG-OPC-BBI
912 __    |a GBV_ILN_65
912 __    |a GBV_ILN_70
912 __    |a GBV_ILN_2002
912 __    |a GBV_ILN_2088
936 rv    |a SA 5570
951 __    |a AR
952 __    |d 62 |j 2016 |e 3 |h 1501-1
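For context on the "established information-theoretic approach to minimax estimation" cited in the abstract: the standard route (a generic sketch, not the paper's exact statements) reduces estimation over a $2\delta$-separated packing $\{\mathbf{D}_{1},\dots,\mathbf{D}_{L}\} \subset \mathcal{D}$ to an $L$-ary hypothesis test and applies Fano's inequality:

\[
  \inf_{\widehat{\mathbf{D}}} \sup_{\mathbf{D} \in \mathcal{D}}
  \mathbb{E}\,\bigl\|\widehat{\mathbf{D}} - \mathbf{D}\bigr\|_{\mathrm{F}}^{2}
  \;\ge\;
  \delta^{2}\left(1 - \frac{I(\mathsf{L};\mathbf{Y}) + \log 2}{\log L}\right),
\]

where $\mathsf{L}$ is a uniformly random packing index and $\mathbf{Y} = (\mathbf{y}_{1},\dots,\mathbf{y}_{N})$ collects the observations. Bounding the mutual information $I(\mathsf{L};\mathbf{Y})$, e.g. via pairwise Kullback-Leibler divergences that grow with the sample size $N$ and the SNR, is what turns this inequality into the necessary conditions on the sample size discussed in the abstract.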