ShiDianNao
In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. This paper focuses on image applications, arguably the most important category among recognition and mining applications. The state-of-the-art neural networks for these applications are Convolutional Neural Networks (CNNs), which have an important property: weights are shared among many neurons, considerably reducing the network's memory footprint. This property makes it possible to map a CNN entirely within SRAM, eliminating all DRAM accesses for weights. By further hoisting the accelerator next to the image sensor, the remaining DRAM accesses (for inputs and outputs) can be eliminated as well. The paper proposes such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses, combined with careful exploitation of the specific data access patterns within CNNs, yields an accelerator that is 60× more energy efficient than the previous state-of-the-art neural network accelerator. The authors present a full design down to the layout at 65 nm, with a modest footprint of 4.86 mm² consuming only 320 mW, yet still about 30× faster than high-end GPUs.
Detailed description

Author: Du, Zidong (author)
Format: Article
Language: English
Published: 2016
Parent work: Contained in: Computer architecture news - New York, NY : ACM, 1972, 43(2016), 3, pages 92-104
Parent work: volume:43 ; year:2016 ; number:3 ; pages:92-104
Links:
DOI / URN: 10.1145/2872887.2750389
Catalog ID: OLC1973673711
LEADER 01000caa a2200265 4500
001    OLC1973673711
003    DE-627
005    20230714184925.0
007    tu
008    160430s2016 xx ||||| 00| ||eng c
024 7  |a 10.1145/2872887.2750389 |2 doi
028 52 |a PQ20160430
035    |a (DE-627)OLC1973673711
035    |a (DE-599)GBVOLC1973673711
035    |a (PRQ)acm_primary_27503890
035    |a (KEY)0040085820160000043000300092shidiannao
040    |a DE-627 |b ger |c DE-627 |e rakwb
041    |a eng
082 04 |a 004 |q DNB
100 1  |a Du, Zidong |e verfasserin |4 aut
245 10 |a ShiDianNao
264  1 |c 2016
336    |a Text |b txt |2 rdacontent
337    |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338    |a Band |b nc |2 rdacarrier
520    |a In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86 mm² and consuming only 320 mW, but still about 30× faster than high-end GPUs.
700 1  |a Fasthuber, Robert |4 oth
700 1  |a Chen, Tianshi |4 oth
700 1  |a Ienne, Paolo |4 oth
700 1  |a Li, Ling |4 oth
700 1  |a Luo, Tao |4 oth
700 1  |a Feng, Xiaobing |4 oth
700 1  |a Chen, Yunji |4 oth
700 1  |a Temam, Olivier |4 oth
773 08 |i Enthalten in |t Computer architecture news |d New York, NY : ACM, 1972 |g 43(2016), 3, Seite 92-104 |w (DE-627)129397881 |w (DE-600)186012-4 |w (DE-576)014781093 |x 0163-5964 |7 nnns
773 18 |g volume:43 |g year:2016 |g number:3 |g pages:92-104
856 41 |u http://dx.doi.org/10.1145/2872887.2750389 |3 Volltext
856 42 |u http://dl.acm.org/citation.cfm?id=2750389
912    |a GBV_USEFLAG_A
912    |a SYSFLAG_A
912    |a GBV_OLC
912    |a SSG-OLC-MAT
912    |a GBV_ILN_70
912    |a GBV_ILN_134
912    |a GBV_ILN_2021
912    |a GBV_ILN_2190
951    |a AR
952    |d 43 |j 2016 |e 3 |h 92-104
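The abstract's central point, that CNN weight sharing shrinks the weight footprint enough to hold an entire network in on-chip SRAM, can be illustrated with a rough parameter count. The layer sizes below are hypothetical examples chosen for illustration, not figures from the paper:

```python
# Back-of-the-envelope sketch: weights in a shared-kernel convolutional
# layer vs. a hypothetical fully-connected layer over the same maps.
# All sizes are made up; only the scale of the gap matters.

def conv_params(k, c_in, c_out):
    """Weights in a k x k convolution: one kernel set, shared by
    every output position on the feature map."""
    return k * k * c_in * c_out

def dense_params(h, w, c_in, c_out):
    """Weights if every output neuron instead had its own private
    connection to every input pixel (no sharing)."""
    return (h * w * c_in) * (h * w * c_out)

h = w = 32                     # hypothetical 32x32 feature maps
k, c_in, c_out = 3, 16, 32     # hypothetical 3x3 kernel, 16 -> 32 channels

shared = conv_params(k, c_in, c_out)        # 3*3*16*32 = 4,608 weights
unshared = dense_params(h, w, c_in, c_out)  # ~537 million weights

print(f"conv weights:  {shared:,}")
print(f"dense weights: {unshared:,}")
print(f"reduction:     ~{unshared // shared:,}x")
```

Even at this toy scale, sharing cuts the weight count by five orders of magnitude, which is the property that lets the paper's design keep all weights in SRAM and drop DRAM accesses entirely.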
author_variant |
z d zd |
---|---|
matchkey_str |
article:01635964:2016----::hda |
hierarchy_sort_str |
2016 |
publishDate |
2016 |
allfields |
10.1145/2872887.2750389 doi PQ20160430 (DE-627)OLC1973673711 (DE-599)GBVOLC1973673711 (PRQ)acm_primary_27503890 (KEY)0040085820160000043000300092shidiannao DE-627 ger DE-627 rakwb eng 004 DNB Du, Zidong verfasserin aut ShiDianNao 2016 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs. 
Fasthuber, Robert oth Chen, Tianshi oth Ienne, Paolo oth Li, Ling oth Luo, Tao oth Feng, Xiaobing oth Chen, Yunji oth Temam, Olivier oth Enthalten in Computer architecture news New York, NY : ACM, 1972 43(2016), 3, Seite 92-104 (DE-627)129397881 (DE-600)186012-4 (DE-576)014781093 0163-5964 nnns volume:43 year:2016 number:3 pages:92-104 http://dx.doi.org/10.1145/2872887.2750389 Volltext http://dl.acm.org/citation.cfm?id=2750389 GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_134 GBV_ILN_2021 GBV_ILN_2190 AR 43 2016 3 92-104 |
spelling |
10.1145/2872887.2750389 doi PQ20160430 (DE-627)OLC1973673711 (DE-599)GBVOLC1973673711 (PRQ)acm_primary_27503890 (KEY)0040085820160000043000300092shidiannao DE-627 ger DE-627 rakwb eng 004 DNB Du, Zidong verfasserin aut ShiDianNao 2016 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs. 
Fasthuber, Robert oth Chen, Tianshi oth Ienne, Paolo oth Li, Ling oth Luo, Tao oth Feng, Xiaobing oth Chen, Yunji oth Temam, Olivier oth Enthalten in Computer architecture news New York, NY : ACM, 1972 43(2016), 3, Seite 92-104 (DE-627)129397881 (DE-600)186012-4 (DE-576)014781093 0163-5964 nnns volume:43 year:2016 number:3 pages:92-104 http://dx.doi.org/10.1145/2872887.2750389 Volltext http://dl.acm.org/citation.cfm?id=2750389 GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_134 GBV_ILN_2021 GBV_ILN_2190 AR 43 2016 3 92-104 |
allfields_unstemmed |
10.1145/2872887.2750389 doi PQ20160430 (DE-627)OLC1973673711 (DE-599)GBVOLC1973673711 (PRQ)acm_primary_27503890 (KEY)0040085820160000043000300092shidiannao DE-627 ger DE-627 rakwb eng 004 DNB Du, Zidong verfasserin aut ShiDianNao 2016 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs. 
Fasthuber, Robert oth Chen, Tianshi oth Ienne, Paolo oth Li, Ling oth Luo, Tao oth Feng, Xiaobing oth Chen, Yunji oth Temam, Olivier oth Enthalten in Computer architecture news New York, NY : ACM, 1972 43(2016), 3, Seite 92-104 (DE-627)129397881 (DE-600)186012-4 (DE-576)014781093 0163-5964 nnns volume:43 year:2016 number:3 pages:92-104 http://dx.doi.org/10.1145/2872887.2750389 Volltext http://dl.acm.org/citation.cfm?id=2750389 GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_134 GBV_ILN_2021 GBV_ILN_2190 AR 43 2016 3 92-104 |
allfieldsGer |
10.1145/2872887.2750389 doi PQ20160430 (DE-627)OLC1973673711 (DE-599)GBVOLC1973673711 (PRQ)acm_primary_27503890 (KEY)0040085820160000043000300092shidiannao DE-627 ger DE-627 rakwb eng 004 DNB Du, Zidong verfasserin aut ShiDianNao 2016 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs. 
Fasthuber, Robert oth Chen, Tianshi oth Ienne, Paolo oth Li, Ling oth Luo, Tao oth Feng, Xiaobing oth Chen, Yunji oth Temam, Olivier oth Enthalten in Computer architecture news New York, NY : ACM, 1972 43(2016), 3, Seite 92-104 (DE-627)129397881 (DE-600)186012-4 (DE-576)014781093 0163-5964 nnns volume:43 year:2016 number:3 pages:92-104 http://dx.doi.org/10.1145/2872887.2750389 Volltext http://dl.acm.org/citation.cfm?id=2750389 GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_134 GBV_ILN_2021 GBV_ILN_2190 AR 43 2016 3 92-104 |
allfieldsSound |
10.1145/2872887.2750389 doi PQ20160430 (DE-627)OLC1973673711 (DE-599)GBVOLC1973673711 (PRQ)acm_primary_27503890 (KEY)0040085820160000043000300092shidiannao DE-627 ger DE-627 rakwb eng 004 DNB Du, Zidong verfasserin aut ShiDianNao 2016 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs. 
Fasthuber, Robert oth Chen, Tianshi oth Ienne, Paolo oth Li, Ling oth Luo, Tao oth Feng, Xiaobing oth Chen, Yunji oth Temam, Olivier oth Enthalten in Computer architecture news New York, NY : ACM, 1972 43(2016), 3, Seite 92-104 (DE-627)129397881 (DE-600)186012-4 (DE-576)014781093 0163-5964 nnns volume:43 year:2016 number:3 pages:92-104 http://dx.doi.org/10.1145/2872887.2750389 Volltext http://dl.acm.org/citation.cfm?id=2750389 GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_134 GBV_ILN_2021 GBV_ILN_2190 AR 43 2016 3 92-104 |
language |
English |
source |
Enthalten in Computer architecture news 43(2016), 3, Seite 92-104 volume:43 year:2016 number:3 pages:92-104 |
sourceStr |
Enthalten in Computer architecture news 43(2016), 3, Seite 92-104 volume:43 year:2016 number:3 pages:92-104 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
dewey-raw |
004 |
isfreeaccess_bool |
false |
container_title |
Computer architecture news |
authorswithroles_txt_mv |
Du, Zidong @@aut@@ Fasthuber, Robert @@oth@@ Chen, Tianshi @@oth@@ Ienne, Paolo @@oth@@ Li, Ling @@oth@@ Luo, Tao @@oth@@ Feng, Xiaobing @@oth@@ Chen, Yunji @@oth@@ Temam, Olivier @@oth@@ |
publishDateDaySort_date |
2016-01-01T00:00:00Z |
hierarchy_top_id |
129397881 |
dewey-sort |
14 |
id |
OLC1973673711 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a2200265 4500</leader><controlfield tag="001">OLC1973673711</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230714184925.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">160430s2016 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1145/2872887.2750389</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">PQ20160430</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC1973673711</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBVOLC1973673711</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(PRQ)acm_primary_27503890</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(KEY)0040085820160000043000300092shidiannao</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="q">DNB</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Du, Zidong</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">ShiDianNao</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2016</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" 
"><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60&times more energy efficient than the previous state-of-the-art neural network accelerator. 
We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs.</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Fasthuber, Robert</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Chen, Tianshi</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ienne, Paolo</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Li, Ling</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Luo, Tao</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Feng, Xiaobing</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Chen, Yunji</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Temam, Olivier</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Computer architecture news</subfield><subfield code="d">New York, NY : ACM, 1972</subfield><subfield code="g">43(2016), 3, Seite 92-104</subfield><subfield code="w">(DE-627)129397881</subfield><subfield code="w">(DE-600)186012-4</subfield><subfield code="w">(DE-576)014781093</subfield><subfield code="x">0163-5964</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:43</subfield><subfield code="g">year:2016</subfield><subfield code="g">number:3</subfield><subfield code="g">pages:92-104</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://dx.doi.org/10.1145/2872887.2750389</subfield><subfield 
code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://dl.acm.org/citation.cfm?id=2750389</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_134</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2021</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2190</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">43</subfield><subfield code="j">2016</subfield><subfield code="e">3</subfield><subfield code="h">92-104</subfield></datafield></record></collection>
|
author |
Du, Zidong |
spellingShingle |
Du, Zidong ddc 004 ShiDianNao |
authorStr |
Du, Zidong |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)129397881 |
format |
Article |
dewey-ones |
004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
0163-5964 |
topic_title |
004 DNB ShiDianNao |
topic |
ddc 004 |
topic_unstemmed |
ddc 004 |
topic_browse |
ddc 004 |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
author2_variant |
r f rf t c tc p i pi l l ll t l tl x f xf y c yc o t ot |
hierarchy_parent_title |
Computer architecture news |
hierarchy_parent_id |
129397881 |
dewey-tens |
000 - Computer science, knowledge & systems |
hierarchy_top_title |
Computer architecture news |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)129397881 (DE-600)186012-4 (DE-576)014781093 |
title |
ShiDianNao |
ctrlnum |
(DE-627)OLC1973673711 (DE-599)GBVOLC1973673711 (PRQ)acm_primary_27503890 (KEY)0040085820160000043000300092shidiannao |
title_full |
ShiDianNao |
author_sort |
Du, Zidong |
journal |
Computer architecture news |
journalStr |
Computer architecture news |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2016 |
contenttype_str_mv |
txt |
container_start_page |
92 |
author_browse |
Du, Zidong |
container_volume |
43 |
class |
004 DNB |
format_se |
Aufsätze |
author-letter |
Du, Zidong |
doi_str_mv |
10.1145/2872887.2750389 |
dewey-full |
004 |
title_sort |
shidiannao |
title_auth |
ShiDianNao |
abstract |
In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs. |
abstractGer |
In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs. |
abstract_unstemmed |
In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60× more energy efficient than the previous state-of-the-art neural network accelerator. We present a full design down to the layout at 65 nm, with a modest footprint of 4.86mm 2 and consuming only 320mW, but still about 30× faster than high-end GPUs. |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_134 GBV_ILN_2021 GBV_ILN_2190 |
container_issue |
3 |
title_short |
ShiDianNao |
url |
http://dx.doi.org/10.1145/2872887.2750389 http://dl.acm.org/citation.cfm?id=2750389 |
remote_bool |
false |
author2 |
Fasthuber, Robert Chen, Tianshi Ienne, Paolo Li, Ling Luo, Tao Feng, Xiaobing Chen, Yunji Temam, Olivier |
author2Str |
Fasthuber, Robert Chen, Tianshi Ienne, Paolo Li, Ling Luo, Tao Feng, Xiaobing Chen, Yunji Temam, Olivier |
ppnlink |
129397881 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
author2_role |
oth oth oth oth oth oth oth oth |
doi_str |
10.1145/2872887.2750389 |
up_date |
2024-07-04T02:55:17.852Z |
_version_ |
1803615435859427328 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a2200265 4500</leader><controlfield tag="001">OLC1973673711</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230714184925.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">160430s2016 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1145/2872887.2750389</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">PQ20160430</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC1973673711</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBVOLC1973673711</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(PRQ)acm_primary_27503890</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(KEY)0040085820160000043000300092shidiannao</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="q">DNB</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Du, Zidong</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">ShiDianNao</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2016</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" 
"><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications. Still, both the energy efficiency and performance of such accelerators remain limited by memory accesses. In this paper, we focus on image applications, arguably the most important category among recognition and mining applications. The neural networks which are state-of-the-art for these applications are Convolutional Neural Networks (CNN), and they have an important property: weights are shared among many neurons, considerably reducing the neural network memory footprint. This property allows to entirely map a CNN within an SRAM, eliminating all DRAM accesses for weights. By further hoisting this accelerator next to the image sensor, it is possible to eliminate all remaining DRAM accesses, i.e., for inputs and outputs. In this paper, we propose such a CNN accelerator, placed next to a CMOS or CCD sensor. The absence of DRAM accesses combined with a careful exploitation of the specific data access patterns within CNNs allows us to design an accelerator which is 60&times more energy efficient than the previous state-of-the-art neural network accelerator. 
We present a full design down to the layout at 65 nm, with a modest footprint of 4.86 mm² and consuming only 320 mW, but still about 30× faster than high-end GPUs.</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Fasthuber, Robert</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Chen, Tianshi</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Ienne, Paolo</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Li, Ling</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Luo, Tao</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Feng, Xiaobing</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Chen, Yunji</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Temam, Olivier</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Computer architecture news</subfield><subfield code="d">New York, NY : ACM, 1972</subfield><subfield code="g">43(2016), 3, Seite 92-104</subfield><subfield code="w">(DE-627)129397881</subfield><subfield code="w">(DE-600)186012-4</subfield><subfield code="w">(DE-576)014781093</subfield><subfield code="x">0163-5964</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:43</subfield><subfield code="g">year:2016</subfield><subfield code="g">number:3</subfield><subfield code="g">pages:92-104</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://dx.doi.org/10.1145/2872887.2750389</subfield><subfield 
code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://dl.acm.org/citation.cfm?id=2750389</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_134</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2021</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2190</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">43</subfield><subfield code="j">2016</subfield><subfield code="e">3</subfield><subfield code="h">92-104</subfield></datafield></record></collection>
|
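The abstract's central argument is the weight-sharing property of CNNs: one small kernel is reused across every output position, so a convolutional layer's weight footprint is orders of magnitude smaller than that of a fully connected layer over the same feature maps, small enough to fit in on-chip SRAM. As a rough illustration (a minimal sketch; the layer sizes below are made-up examples, not figures from the paper):

```python
# Illustration of CNN weight sharing: a conv layer stores one k x k kernel
# per (input channel, output channel) pair, reused at every spatial position,
# while a fully connected layer over the same feature maps needs a distinct
# weight for every (input neuron, output neuron) pair.

def conv_weight_count(in_channels, out_channels, kernel_size):
    """Weights of a convolutional layer (biases ignored for simplicity)."""
    return in_channels * out_channels * kernel_size * kernel_size

def fc_weight_count(in_channels, out_channels, height, width):
    """Weights of a fully connected layer over same-sized feature maps."""
    n_in = in_channels * height * width
    n_out = out_channels * height * width
    return n_in * n_out

# Hypothetical example layer: 32x32 feature maps, 16 -> 32 channels, 5x5 kernels.
conv_w = conv_weight_count(16, 32, 5)      # 12,800 shared weights
fc_w = fc_weight_count(16, 32, 32, 32)     # ~537 million weights

# At 16-bit weights, the conv layer needs only ~25 KB, easily held in
# on-chip SRAM -- the property the accelerator exploits to eliminate
# DRAM accesses for weights.
conv_bytes = conv_w * 2
```

For this hypothetical layer, the conv representation is roughly 40,000× smaller than the fully connected one, which is why an entire CNN's weights can be mapped into a few hundred kilobytes of SRAM.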