Cambricon
Neural Networks (NN) are a family of models for a broad range of emerging machine learning and pattern recognition applications. NN techniques are conventionally executed on general-purpose processors (such as CPU and GPGPU), which are usually not energy-efficient since they invest excessive hardware resources to flexibly support various workloads. Consequently, application-specific hardware accelerators for neural networks have been proposed recently to improve energy efficiency. However, such accelerators were designed for a small set of NN techniques sharing similar computational patterns, and they adopt complex and informative instructions (control signals) directly corresponding to high-level functional blocks of an NN (such as layers), or even an NN as a whole. Although straightforward and easy to implement for a limited set of similar NN techniques, the lack of agility in the instruction set prevents such accelerator designs from supporting a variety of different NN techniques with sufficient flexibility and efficiency. In this paper, we propose a novel domain-specific Instruction Set Architecture (ISA) for NN accelerators, called Cambricon, which is a load-store architecture that integrates scalar, vector, matrix, logical, data transfer, and control instructions, based on a comprehensive analysis of existing NN techniques. Our evaluation over a total of ten representative yet distinct NN techniques has demonstrated that Cambricon exhibits strong descriptive capacity over a broad range of NN techniques, and provides higher code density than general-purpose ISAs such as x86, MIPS, and GPGPU. Compared to the latest state-of-the-art NN accelerator design DaDianNao [5] (which can only accommodate 3 types of NN techniques), our Cambricon-based accelerator prototype implemented in TSMC 65 nm technology incurs only negligible latency/power/area overheads, with a versatile coverage of 10 different NN benchmarks.
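To make the abstract's instruction classes concrete, here is a minimal sketch of how one fully connected layer decomposes into data-transfer, matrix, vector, and control-free activation instructions under a Cambricon-style load-store ISA. The mnemonics (VLOAD, MLOAD, MMV, VAV, VSIGMOID, VSTORE) and the toy interpreter are illustrative assumptions, not the paper's actual encoding:

```python
# Toy interpreter for a Cambricon-style load-store NN ISA (illustrative only).
import numpy as np

def run(program, mem, regs):
    """Execute (opcode, *operands) tuples against a flat scratchpad `mem`."""
    for op, *args in program:
        if op == "VLOAD":                  # scratchpad -> vector register
            d, addr, n = args
            regs[d] = mem[addr:addr + n].copy()
        elif op == "MLOAD":                # scratchpad -> matrix register
            d, addr, r, c = args
            regs[d] = mem[addr:addr + r * c].reshape(r, c)
        elif op == "MMV":                  # matrix-vector multiply
            d, m, v = args
            regs[d] = regs[m] @ regs[v]
        elif op == "VAV":                  # vector-vector add
            d, a, b = args
            regs[d] = regs[a] + regs[b]
        elif op == "VSIGMOID":             # elementwise activation
            d, s = args
            regs[d] = 1.0 / (1.0 + np.exp(-regs[s]))
        elif op == "VSTORE":               # vector register -> scratchpad
            s, addr = args
            mem[addr:addr + regs[s].size] = regs[s]
    return regs

# One fully connected layer, y = sigmoid(W @ x + b), as six instructions.
mem = np.zeros(64)
mem[0:6] = np.arange(6) * 0.1              # W, a 2x3 weight matrix, at address 0
mem[8:11] = [1.0, 2.0, 3.0]                # input x at address 8
mem[12:14] = [0.1, -0.1]                   # bias b at address 12
prog = [
    ("MLOAD", "M0", 0, 2, 3),
    ("VLOAD", "V0", 8, 3),
    ("VLOAD", "V1", 12, 2),
    ("MMV", "V2", "M0", "V0"),
    ("VAV", "V3", "V2", "V1"),
    ("VSIGMOID", "V4", "V3"),
    ("VSTORE", "V4", 16),
]
print(run(prog, mem, {})["V4"])            # the layer's two outputs
```

The point of the sketch is the granularity: the layer is expressed as a handful of general scalar/vector/matrix operations rather than a single monolithic "layer" instruction, which is what gives such an ISA its flexibility across different NN techniques.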
Detailed description
| Author: | Liu, Shaoli [author] |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | 2016 |
| Contained in: | Computer architecture news - New York, NY : ACM, 1972, 44(2016), 3, pages 393-405 |
| Contained in: | volume:44 ; year:2016 ; number:3 ; pages:393-405 |
| DOI / URN: | 10.1145/3007787.3001179 |
| Catalog ID: | OLC1984484052 |
LEADER 01000caa a2200265 4500
001 OLC1984484052
003 DE-627
005 20230714223953.0
007 tu
008 161202s2016 xx ||||| 00| ||eng c
024 7_ |a 10.1145/3007787.3001179 |2 doi
028 52 |a PQ20161201
035 __ |a (DE-627)OLC1984484052
035 __ |a (DE-599)GBVOLC1984484052
035 __ |a (PRQ)a599-21e2471407481508f05482b2e49f63b5ff7d7230d715e372dd9dff276490b2460
035 __ |a (KEY)0040085820160000044000300393cambricon
040 __ |a DE-627 |b ger |c DE-627 |e rakwb
041 __ |a eng
082 04 |a 004 |q DE-600
100 1_ |a Liu, Shaoli |e verfasserin |4 aut
245 10 |a Cambricon
264 _1 |c 2016
336 __ |a Text |b txt |2 rdacontent
337 __ |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338 __ |a Band |b nc |2 rdacarrier
520 __ |a Neural Networks (NN) are a family of models for a broad range of emerging machine learning and pattern recognition applications. NN techniques are conventionally executed on general-purpose processors (such as CPU and GPGPU), which are usually not energy-efficient since they invest excessive hardware resources to flexibly support various workloads. Consequently, application-specific hardware accelerators for neural networks have been proposed recently to improve energy efficiency. However, such accelerators were designed for a small set of NN techniques sharing similar computational patterns, and they adopt complex and informative instructions (control signals) directly corresponding to high-level functional blocks of an NN (such as layers), or even an NN as a whole. Although straightforward and easy to implement for a limited set of similar NN techniques, the lack of agility in the instruction set prevents such accelerator designs from supporting a variety of different NN techniques with sufficient flexibility and efficiency. In this paper, we propose a novel domain-specific Instruction Set Architecture (ISA) for NN accelerators, called Cambricon, which is a load-store architecture that integrates scalar, vector, matrix, logical, data transfer, and control instructions, based on a comprehensive analysis of existing NN techniques. Our evaluation over a total of ten representative yet distinct NN techniques has demonstrated that Cambricon exhibits strong descriptive capacity over a broad range of NN techniques, and provides higher code density than general-purpose ISAs such as x86, MIPS, and GPGPU. Compared to the latest state-of-the-art NN accelerator design DaDianNao [5] (which can only accommodate 3 types of NN techniques), our Cambricon-based accelerator prototype implemented in TSMC 65 nm technology incurs only negligible latency/power/area overheads, with a versatile coverage of 10 different NN benchmarks.
700 1_ |a Du, Zidong |4 oth
700 1_ |a Tao, Jinhua |4 oth
700 1_ |a Han, Dong |4 oth
700 1_ |a Luo, Tao |4 oth
700 1_ |a Xie, Yuan |4 oth
700 1_ |a Chen, Yunji |4 oth
700 1_ |a Chen, Tianshi |4 oth
773 08 |i Enthalten in |t Computer architecture news |d New York, NY : ACM, 1972 |g 44(2016), 3, Seite 393-405 |w (DE-627)129397881 |w (DE-600)186012-4 |w (DE-576)014781093 |x 0163-5964 |7 nnns
773 18 |g volume:44 |g year:2016 |g number:3 |g pages:393-405
856 41 |u http://dx.doi.org/10.1145/3007787.3001179 |3 Volltext
856 42 |u http://dl.acm.org/citation.cfm?id=3001179
912 __ |a GBV_USEFLAG_A
912 __ |a SYSFLAG_A
912 __ |a GBV_OLC
912 __ |a SSG-OLC-MAT
912 __ |a GBV_ILN_70
912 __ |a GBV_ILN_134
912 __ |a GBV_ILN_2021
912 __ |a GBV_ILN_2190
951 __ |a AR
952 __ |d 44 |j 2016 |e 3 |h 393-405
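The catalog's staff view also carries this record as MARCXML (namespace http://www.loc.gov/MARC21/slim). A minimal sketch of pulling the title (245 |a), DOI (024 |a), and host-item details (773 |g) out of such a file with only the standard library; the file name record.xml and a <collection> wrapper element are assumptions:

```python
# Minimal MARCXML reader for this record, standard library only.
# Assumes the record was saved as record.xml in the MARC21 slim schema.
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/MARC21/slim"}

def subfields(record, tag, code):
    """Yield subfield values for a given datafield tag and subfield code."""
    for df in record.findall(f"m:datafield[@tag='{tag}']", NS):
        for sf in df.findall(f"m:subfield[@code='{code}']", NS):
            yield sf.text

record = ET.parse("record.xml").getroot().find(".//m:record", NS)
print("Title:", next(subfields(record, "245", "a")))   # -> Cambricon
print("DOI:  ", next(subfields(record, "024", "a")))   # -> 10.1145/3007787.3001179
print("In:   ", list(subfields(record, "773", "g")))   # volume/issue/pages strings
```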