DynK-hydra: improved dynamic architecture ensembling for efficient inference
Abstract Accessibility on edge devices and the trade-off between latency and accuracy is an area of interest in deploying deep learning models. This paper explores a Mixture of Experts system, namely, DynK-Hydra, which allows training of an ensemble formed of multiple similar branches on data sets w...
Detailed description
| Field | Value |
|---|---|
| Author: | Ileni, Tudor Alexandru [author] |
| Format: | E-article |
| Language: | English |
| Published: | 2022 |
| Keywords: | Dynamic network; Ensemble; Optimal inference; Conditional computation |
| Note: | © The Author(s) 2022 |
| Parent work: | Contained in: Complex & intelligent systems - Berlin : SpringerOpen, 2015, 9(2022), 2, 16 Nov., pages 2177-2188 |
| Parent work: | volume:9 ; year:2022 ; number:2 ; day:16 ; month:11 ; pages:2177-2188 |
| Links: | https://dx.doi.org/10.1007/s40747-022-00897-1 (full text, free of charge) |
| DOI / URN: | 10.1007/s40747-022-00897-1 |
| Catalog ID: | SPR050101056 |

LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | SPR050101056 | ||
003 | DE-627 | ||
005 | 20230419064805.0 | ||
007 | cr uuu---uuuuu | ||
008 | 230419s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1007/s40747-022-00897-1 |2 doi | |
035 | |a (DE-627)SPR050101056 | ||
035 | |a (SPR)s40747-022-00897-1-e | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Ileni, Tudor Alexandru |e verfasserin |4 aut | |
245 | 1 | 0 | |a DynK-hydra: improved dynamic architecture ensembling for efficient inference |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a © The Author(s) 2022 | ||
520 | |a Abstract Accessibility on edge devices and the trade-off between latency and accuracy is an area of interest in deploying deep learning models. This paper explores a Mixture of Experts system, namely, DynK-Hydra, which allows training of an ensemble formed of multiple similar branches on data sets with a high number of classes, but uses, during the inference, only a subset of necessary branches. We achieve this by training a cohort of specialized branches (deep network of reduced size) and a gater/supervisor, that decides dynamically what branch to use for any specific input. An original contribution is that the number of chosen models is dynamically set, based on how confident the gater is (similar works use a static parameter for this). Another contribution is the way we ensure the branches’ specialization. We divide the data set classes into multiple clusters, and we assign a cluster to each branch while enforcing its specialization on this cluster by a separate loss function. We evaluate DynK-Hydra on CIFAR-100, Food-101, CUB-200, and ImageNet32 data sets and we obtain improvements of up to 4.3% accuracy compared with state-of-the-art ResNet. All this while reducing the number of inference flops by a factor of 2–5.5 times. Compared to a similar work (HydraRes), we obtain marginal accuracy improvements of up to 1.2% on the pairwise inference time architectures. However, we improve the inference times by up to 2.8 times compared to HydraRes. | ||
650 | 4 | |a Dynamic network |7 (dpeaa)DE-He213 | |
650 | 4 | |a Ensemble |7 (dpeaa)DE-He213 | |
650 | 4 | |a Optimal inference |7 (dpeaa)DE-He213 | |
650 | 4 | |a Conditional computation |7 (dpeaa)DE-He213 | |
700 | 1 | |a Darabant, Adrian Sergiu |0 (orcid)0000-0002-7580-5722 |4 aut | |
700 | 1 | |a Borza, Diana Laura |4 aut | |
700 | 1 | |a Marinescu, Alexandru Ion |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Complex & intelligent systems |d Berlin : SpringerOpen, 2015 |g 9(2022), 2 vom: 16. Nov., Seite 2177-2188 |w (DE-627)835589269 |w (DE-600)2834740-7 |x 2198-6053 |7 nnns |
773 | 1 | 8 | |g volume:9 |g year:2022 |g number:2 |g day:16 |g month:11 |g pages:2177-2188 |
856 | 4 | 0 | |u https://dx.doi.org/10.1007/s40747-022-00897-1 |z kostenfrei |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_SPRINGER | ||
912 | |a GBV_ILN_11 | ||
912 | |a GBV_ILN_20 | ||
912 | |a GBV_ILN_22 | ||
912 | |a GBV_ILN_23 | ||
912 | |a GBV_ILN_24 | ||
912 | |a GBV_ILN_31 | ||
912 | |a GBV_ILN_39 | ||
912 | |a GBV_ILN_40 | ||
912 | |a GBV_ILN_60 | ||
912 | |a GBV_ILN_62 | ||
912 | |a GBV_ILN_63 | ||
912 | |a GBV_ILN_65 | ||
912 | |a GBV_ILN_69 | ||
912 | |a GBV_ILN_70 | ||
912 | |a GBV_ILN_73 | ||
912 | |a GBV_ILN_95 | ||
912 | |a GBV_ILN_105 | ||
912 | |a GBV_ILN_110 | ||
912 | |a GBV_ILN_151 | ||
912 | |a GBV_ILN_161 | ||
912 | |a GBV_ILN_170 | ||
912 | |a GBV_ILN_213 | ||
912 | |a GBV_ILN_230 | ||
912 | |a GBV_ILN_285 | ||
912 | |a GBV_ILN_293 | ||
912 | |a GBV_ILN_370 | ||
912 | |a GBV_ILN_602 | ||
912 | |a GBV_ILN_2014 | ||
912 | |a GBV_ILN_4012 | ||
912 | |a GBV_ILN_4037 | ||
912 | |a GBV_ILN_4112 | ||
912 | |a GBV_ILN_4125 | ||
912 | |a GBV_ILN_4126 | ||
912 | |a GBV_ILN_4249 | ||
912 | |a GBV_ILN_4305 | ||
912 | |a GBV_ILN_4306 | ||
912 | |a GBV_ILN_4307 | ||
912 | |a GBV_ILN_4313 | ||
912 | |a GBV_ILN_4322 | ||
912 | |a GBV_ILN_4323 | ||
912 | |a GBV_ILN_4324 | ||
912 | |a GBV_ILN_4325 | ||
912 | |a GBV_ILN_4326 | ||
912 | |a GBV_ILN_4335 | ||
912 | |a GBV_ILN_4338 | ||
912 | |a GBV_ILN_4367 | ||
912 | |a GBV_ILN_4700 | ||
951 | |a AR | ||
952 | |d 9 |j 2022 |e 2 |b 16 |c 11 |h 2177-2188 |
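
The abstract in field 520 states that the gater chooses a variable number of branches depending on how confident it is, rather than a fixed number. The record does not give the selection rule itself, so the following is only a minimal sketch of one plausible rule: branches are added in order of gater probability until their cumulative probability reaches a threshold. The function names `softmax` and `select_branches`, the `confidence` threshold, and the `max_k` cap are all assumptions made for illustration and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw gater scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_branches(gater_scores, confidence=0.8, max_k=None):
    """Return the indices of the branches to run, most probable first.

    Hypothetical rule (an assumption, not the authors' method): keep adding
    branches until their cumulative gater probability reaches `confidence`,
    optionally capped at `max_k` branches.
    """
    probs = softmax(gater_scores)
    # Rank branches by gater probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    limit = len(order) if max_k is None else min(max_k, len(order))

    chosen, mass = [], 0.0
    for i in order[:limit]:
        chosen.append(i)
        mass += probs[i]
        if mass >= confidence:
            break
    return chosen


if __name__ == "__main__":
    # A confident gater needs a single branch.
    print(select_branches([10.0, 0.0, 0.0, 0.0]))
    # An undecided gater falls back to more branches.
    print(select_branches([0.0, 0.0, 0.0, 0.0]))
```

Under this assumed rule, an input on which the gater is confident runs one small branch, while an ambiguous input runs several, which is the kind of behaviour that would trade inference cost against accuracy per input.
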
author_variant |
t a i ta tai a s d as asd d l b dl dlb a i m ai aim |
---|---|
matchkey_str |
article:21986053:2022----::ykyripoednmcrhtcuenebigo |
hierarchy_sort_str |
2022 |
publishDate |
2022 |
allfields |
10.1007/s40747-022-00897-1 doi (DE-627)SPR050101056 (SPR)s40747-022-00897-1-e DE-627 ger DE-627 rakwb eng Ileni, Tudor Alexandru verfasserin aut DynK-hydra: improved dynamic architecture ensembling for efficient inference 2022 Text txt rdacontent Computermedien c rdamedia Online-Ressource cr rdacarrier © The Author(s) 2022 Abstract Accessibility on edge devices and the trade-off between latency and accuracy is an area of interest in deploying deep learning models. This paper explores a Mixture of Experts system, namely, DynK-Hydra, which allows training of an ensemble formed of multiple similar branches on data sets with a high number of classes, but uses, during the inference, only a subset of necessary branches. We achieve this by training a cohort of specialized branches (deep network of reduced size) and a gater/supervisor, that decides dynamically what branch to use for any specific input. An original contribution is that the number of chosen models is dynamically set, based on how confident the gater is (similar works use a static parameter for this). Another contribution is the way we ensure the branches’ specialization. We divide the data set classes into multiple clusters, and we assign a cluster to each branch while enforcing its specialization on this cluster by a separate loss function. We evaluate DynK-Hydra on CIFAR-100, Food-101, CUB-200, and ImageNet32 data sets and we obtain improvements of up to 4.3% accuracy compared with state-of-the-art ResNet. All this while reducing the number of inference flops by a factor of 2–5.5 times. Compared to a similar work (HydraRes), we obtain marginal accuracy improvements of up to 1.2% on the pairwise inference time architectures. However, we improve the inference times by up to 2.8 times compared to HydraRes. 
Dynamic network (dpeaa)DE-He213 Ensemble (dpeaa)DE-He213 Optimal inference (dpeaa)DE-He213 Conditional computation (dpeaa)DE-He213 Darabant, Adrian Sergiu (orcid)0000-0002-7580-5722 aut Borza, Diana Laura aut Marinescu, Alexandru Ion aut Enthalten in Complex & intelligent systems Berlin : SpringerOpen, 2015 9(2022), 2 vom: 16. Nov., Seite 2177-2188 (DE-627)835589269 (DE-600)2834740-7 2198-6053 nnns volume:9 year:2022 number:2 day:16 month:11 pages:2177-2188 https://dx.doi.org/10.1007/s40747-022-00897-1 kostenfrei Volltext GBV_USEFLAG_A SYSFLAG_A GBV_SPRINGER GBV_ILN_11 GBV_ILN_20 GBV_ILN_22 GBV_ILN_23 GBV_ILN_24 GBV_ILN_31 GBV_ILN_39 GBV_ILN_40 GBV_ILN_60 GBV_ILN_62 GBV_ILN_63 GBV_ILN_65 GBV_ILN_69 GBV_ILN_70 GBV_ILN_73 GBV_ILN_95 GBV_ILN_105 GBV_ILN_110 GBV_ILN_151 GBV_ILN_161 GBV_ILN_170 GBV_ILN_213 GBV_ILN_230 GBV_ILN_285 GBV_ILN_293 GBV_ILN_370 GBV_ILN_602 GBV_ILN_2014 GBV_ILN_4012 GBV_ILN_4037 GBV_ILN_4112 GBV_ILN_4125 GBV_ILN_4126 GBV_ILN_4249 GBV_ILN_4305 GBV_ILN_4306 GBV_ILN_4307 GBV_ILN_4313 GBV_ILN_4322 GBV_ILN_4323 GBV_ILN_4324 GBV_ILN_4325 GBV_ILN_4326 GBV_ILN_4335 GBV_ILN_4338 GBV_ILN_4367 GBV_ILN_4700 AR 9 2022 2 16 11 2177-2188 |
language |
English |
source |
Enthalten in Complex & intelligent systems 9(2022), 2 vom: 16. Nov., Seite 2177-2188 volume:9 year:2022 number:2 day:16 month:11 pages:2177-2188 |
sourceStr |
Enthalten in Complex & intelligent systems 9(2022), 2 vom: 16. Nov., Seite 2177-2188 volume:9 year:2022 number:2 day:16 month:11 pages:2177-2188 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Dynamic network Ensemble Optimal inference Conditional computation |
isfreeaccess_bool |
true |
container_title |
Complex & intelligent systems |
authorswithroles_txt_mv |
Ileni, Tudor Alexandru @@aut@@ Darabant, Adrian Sergiu @@aut@@ Borza, Diana Laura @@aut@@ Marinescu, Alexandru Ion @@aut@@ |
publishDateDaySort_date |
2022-11-16T00:00:00Z |
hierarchy_top_id |
835589269 |
id |
SPR050101056 |
language_de |
englisch |
author |
Ileni, Tudor Alexandru |
spellingShingle |
Ileni, Tudor Alexandru misc Dynamic network misc Ensemble misc Optimal inference misc Conditional computation DynK-hydra: improved dynamic architecture ensembling for efficient inference |
authorStr |
Ileni, Tudor Alexandru |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)835589269 |
format |
electronic Article |
delete_txt_mv |
keep |
author_role |
aut aut aut aut |
collection |
springer |
remote_str |
true |
illustrated |
Not Illustrated |
issn |
2198-6053 |
topic_title |
DynK-hydra: improved dynamic architecture ensembling for efficient inference Dynamic network (dpeaa)DE-He213 Ensemble (dpeaa)DE-He213 Optimal inference (dpeaa)DE-He213 Conditional computation (dpeaa)DE-He213 |
topic |
misc Dynamic network misc Ensemble misc Optimal inference misc Conditional computation |
topic_unstemmed |
misc Dynamic network misc Ensemble misc Optimal inference misc Conditional computation |
topic_browse |
misc Dynamic network misc Ensemble misc Optimal inference misc Conditional computation |
format_facet |
Electronic articles Articles Electronic resource |
format_main_str_mv |
Text Journal/Article |
carriertype_str_mv |
cr |
hierarchy_parent_title |
Complex & intelligent systems |
hierarchy_parent_id |
835589269 |
hierarchy_top_title |
Complex & intelligent systems |
isfreeaccess_txt |
true |
familylinks_str_mv |
(DE-627)835589269 (DE-600)2834740-7 |
title |
DynK-hydra: improved dynamic architecture ensembling for efficient inference |
ctrlnum |
(DE-627)SPR050101056 (SPR)s40747-022-00897-1-e |
title_full |
DynK-hydra: improved dynamic architecture ensembling for efficient inference |
author_sort |
Ileni, Tudor Alexandru |
journal |
Complex & intelligent systems |
journalStr |
Complex & intelligent systems |
lang_code |
eng |
isOA_bool |
true |
recordtype |
marc |
publishDateSort |
2022 |
contenttype_str_mv |
txt |
container_start_page |
2177 |
author_browse |
Ileni, Tudor Alexandru Darabant, Adrian Sergiu Borza, Diana Laura Marinescu, Alexandru Ion |
container_volume |
9 |
format_se |
Electronic articles |
author-letter |
Ileni, Tudor Alexandru |
doi_str_mv |
10.1007/s40747-022-00897-1 |
normlink |
(ORCID)0000-0002-7580-5722 |
normlink_prefix_str_mv |
(orcid)0000-0002-7580-5722 |
title_sort |
dynk-hydra: improved dynamic architecture ensembling for efficient inference |
title_auth |
DynK-hydra: improved dynamic architecture ensembling for efficient inference |
abstract |
Abstract: Accessibility on edge devices and the trade-off between latency and accuracy are areas of interest in deploying deep learning models. This paper explores a Mixture of Experts system, DynK-Hydra, which trains an ensemble of multiple similar branches on data sets with a high number of classes but uses only the necessary subset of branches during inference. We achieve this by training a cohort of specialized branches (deep networks of reduced size) and a gater/supervisor that decides dynamically which branches to use for any specific input. An original contribution is that the number of chosen models is set dynamically, based on how confident the gater is (similar works use a static parameter for this). Another contribution is the way we ensure the branches’ specialization: we divide the data set classes into multiple clusters and assign a cluster to each branch, enforcing its specialization on this cluster with a separate loss function. We evaluate DynK-Hydra on the CIFAR-100, Food-101, CUB-200, and ImageNet32 data sets and obtain accuracy improvements of up to 4.3% compared with state-of-the-art ResNet, while reducing the number of inference FLOPs by a factor of 2–5.5. Compared to a similar work (HydraRes), we obtain marginal accuracy improvements of up to 1.2% on the pairwise inference time architectures. However, we improve inference times by up to 2.8 times compared to HydraRes. © The Author(s) 2022 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_SPRINGER GBV_ILN_11 GBV_ILN_20 GBV_ILN_22 GBV_ILN_23 GBV_ILN_24 GBV_ILN_31 GBV_ILN_39 GBV_ILN_40 GBV_ILN_60 GBV_ILN_62 GBV_ILN_63 GBV_ILN_65 GBV_ILN_69 GBV_ILN_70 GBV_ILN_73 GBV_ILN_95 GBV_ILN_105 GBV_ILN_110 GBV_ILN_151 GBV_ILN_161 GBV_ILN_170 GBV_ILN_213 GBV_ILN_230 GBV_ILN_285 GBV_ILN_293 GBV_ILN_370 GBV_ILN_602 GBV_ILN_2014 GBV_ILN_4012 GBV_ILN_4037 GBV_ILN_4112 GBV_ILN_4125 GBV_ILN_4126 GBV_ILN_4249 GBV_ILN_4305 GBV_ILN_4306 GBV_ILN_4307 GBV_ILN_4313 GBV_ILN_4322 GBV_ILN_4323 GBV_ILN_4324 GBV_ILN_4325 GBV_ILN_4326 GBV_ILN_4335 GBV_ILN_4338 GBV_ILN_4367 GBV_ILN_4700 |
container_issue |
2 |
title_short |
DynK-hydra: improved dynamic architecture ensembling for efficient inference |
url |
https://dx.doi.org/10.1007/s40747-022-00897-1 |
remote_bool |
true |
author2 |
Darabant, Adrian Sergiu Borza, Diana Laura Marinescu, Alexandru Ion |
author2Str |
Darabant, Adrian Sergiu Borza, Diana Laura Marinescu, Alexandru Ion |
ppnlink |
835589269 |
mediatype_str_mv |
c |
isOA_txt |
true |
hochschulschrift_bool |
false |
doi_str |
10.1007/s40747-022-00897-1 |
up_date |
2024-07-03T13:22:59.205Z |
_version_ |
1803564329689153536 |
score |
7.397464 |