Cambricon
Neural Networks (NN) are a family of models for a broad range of emerging machine learning and pattern recognition applications. NN techniques are conventionally executed on general-purpose processors (such as CPU and GPGPU), which are usually not energy-efficient since they invest excessive hardware resources to flexibly support various workloads. Consequently, application-specific hardware accelerators for neural networks have been proposed recently to improve energy efficiency. However, such accelerators were designed for a small set of NN techniques sharing similar computational patterns, and they adopt complex and informative instructions (control signals) directly corresponding to high-level functional blocks of an NN (such as layers), or even an NN as a whole. Although straightforward and easy to implement for a limited set of similar NN techniques, the lack of agility in the instruction set prevents such accelerator designs from supporting a variety of different NN techniques with sufficient flexibility and efficiency. In this paper, we propose a novel domain-specific Instruction Set Architecture (ISA) for NN accelerators, called Cambricon, which is a load-store architecture that integrates scalar, vector, matrix, logical, data transfer, and control instructions, based on a comprehensive analysis of existing NN techniques. Our evaluation over a total of ten representative yet distinct NN techniques has demonstrated that Cambricon exhibits strong descriptive capacity over a broad range of NN techniques, and provides higher code density than general-purpose ISAs such as x86, MIPS, and GPGPU. Compared to the latest state-of-the-art NN accelerator design DaDianNao [5] (which can only accommodate 3 types of NN techniques), our Cambricon-based accelerator prototype implemented in TSMC 65 nm technology incurs only negligible latency/power/area overheads, with a versatile coverage of 10 different NN benchmarks.
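To make the abstract's instruction classes concrete, here is a minimal sketch of how one fully connected layer decomposes into data-transfer, matrix, vector, and control-free activation instructions under a Cambricon-style load-store ISA. The mnemonics (VLOAD, MLOAD, MMV, VAV, VSIGMOID, VSTORE) and the toy interpreter are illustrative assumptions, not the paper's actual encoding:

```python
# Toy interpreter for a Cambricon-style load-store NN ISA (illustrative only).
import numpy as np

def run(program, mem, regs):
    """Execute (opcode, *operands) tuples against a flat scratchpad `mem`."""
    for op, *args in program:
        if op == "VLOAD":                  # scratchpad -> vector register
            d, addr, n = args
            regs[d] = mem[addr:addr + n].copy()
        elif op == "MLOAD":                # scratchpad -> matrix register
            d, addr, r, c = args
            regs[d] = mem[addr:addr + r * c].reshape(r, c)
        elif op == "MMV":                  # matrix-vector multiply
            d, m, v = args
            regs[d] = regs[m] @ regs[v]
        elif op == "VAV":                  # vector-vector add
            d, a, b = args
            regs[d] = regs[a] + regs[b]
        elif op == "VSIGMOID":             # elementwise activation
            d, s = args
            regs[d] = 1.0 / (1.0 + np.exp(-regs[s]))
        elif op == "VSTORE":               # vector register -> scratchpad
            s, addr = args
            mem[addr:addr + regs[s].size] = regs[s]
    return regs

# One fully connected layer, y = sigmoid(W @ x + b), as six instructions.
mem = np.zeros(64)
mem[0:6] = np.arange(6) * 0.1              # W, a 2x3 weight matrix, at address 0
mem[8:11] = [1.0, 2.0, 3.0]                # input x at address 8
mem[12:14] = [0.1, -0.1]                   # bias b at address 12
prog = [
    ("MLOAD", "M0", 0, 2, 3),
    ("VLOAD", "V0", 8, 3),
    ("VLOAD", "V1", 12, 2),
    ("MMV", "V2", "M0", "V0"),
    ("VAV", "V3", "V2", "V1"),
    ("VSIGMOID", "V4", "V3"),
    ("VSTORE", "V4", 16),
]
print(run(prog, mem, {})["V4"])            # the layer's two outputs
```

The point of the sketch is the granularity: the layer is expressed as a handful of general scalar/vector/matrix operations rather than a single monolithic "layer" instruction, which is what gives such an ISA its flexibility across different NN techniques.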
Detailed description
| Author: | Liu, Shaoli [author] |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | 2016 |
| Contained in: | Computer architecture news - New York, NY : ACM, 1972, 44(2016), 3, pages 393-405 |
| Contained in: | volume:44 ; year:2016 ; number:3 ; pages:393-405 |
| DOI / URN: | 10.1145/3007787.3001179 |
| Catalog ID: | OLC1984484052 |
LEADER 01000caa a2200265 4500
001 OLC1984484052
003 DE-627
005 20230714223953.0
007 tu
008 161202s2016 xx ||||| 00| ||eng c
024 7_ |a 10.1145/3007787.3001179 |2 doi
028 52 |a PQ20161201
035 __ |a (DE-627)OLC1984484052
035 __ |a (DE-599)GBVOLC1984484052
035 __ |a (PRQ)a599-21e2471407481508f05482b2e49f63b5ff7d7230d715e372dd9dff276490b2460
035 __ |a (KEY)0040085820160000044000300393cambricon
040 __ |a DE-627 |b ger |c DE-627 |e rakwb
041 __ |a eng
082 04 |a 004 |q DE-600
100 1_ |a Liu, Shaoli |e verfasserin |4 aut
245 10 |a Cambricon
264 _1 |c 2016
336 __ |a Text |b txt |2 rdacontent
337 __ |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338 __ |a Band |b nc |2 rdacarrier
520 __ |a Neural Networks (NN) are a family of models for a broad range of emerging machine learning and pattern recognition applications. NN techniques are conventionally executed on general-purpose processors (such as CPU and GPGPU), which are usually not energy-efficient since they invest excessive hardware resources to flexibly support various workloads. Consequently, application-specific hardware accelerators for neural networks have been proposed recently to improve energy efficiency. However, such accelerators were designed for a small set of NN techniques sharing similar computational patterns, and they adopt complex and informative instructions (control signals) directly corresponding to high-level functional blocks of an NN (such as layers), or even an NN as a whole. Although straightforward and easy to implement for a limited set of similar NN techniques, the lack of agility in the instruction set prevents such accelerator designs from supporting a variety of different NN techniques with sufficient flexibility and efficiency. In this paper, we propose a novel domain-specific Instruction Set Architecture (ISA) for NN accelerators, called Cambricon, which is a load-store architecture that integrates scalar, vector, matrix, logical, data transfer, and control instructions, based on a comprehensive analysis of existing NN techniques. Our evaluation over a total of ten representative yet distinct NN techniques has demonstrated that Cambricon exhibits strong descriptive capacity over a broad range of NN techniques, and provides higher code density than general-purpose ISAs such as x86, MIPS, and GPGPU. Compared to the latest state-of-the-art NN accelerator design DaDianNao [5] (which can only accommodate 3 types of NN techniques), our Cambricon-based accelerator prototype implemented in TSMC 65 nm technology incurs only negligible latency/power/area overheads, with a versatile coverage of 10 different NN benchmarks.
700 1_ |a Du, Zidong |4 oth
700 1_ |a Tao, Jinhua |4 oth
700 1_ |a Han, Dong |4 oth
700 1_ |a Luo, Tao |4 oth
700 1_ |a Xie, Yuan |4 oth
700 1_ |a Chen, Yunji |4 oth
700 1_ |a Chen, Tianshi |4 oth
773 08 |i Enthalten in |t Computer architecture news |d New York, NY : ACM, 1972 |g 44(2016), 3, Seite 393-405 |w (DE-627)129397881 |w (DE-600)186012-4 |w (DE-576)014781093 |x 0163-5964 |7 nnns
773 18 |g volume:44 |g year:2016 |g number:3 |g pages:393-405
856 41 |u http://dx.doi.org/10.1145/3007787.3001179 |3 Volltext
856 42 |u http://dl.acm.org/citation.cfm?id=3001179
912 __ |a GBV_USEFLAG_A
912 __ |a SYSFLAG_A
912 __ |a GBV_OLC
912 __ |a SSG-OLC-MAT
912 __ |a GBV_ILN_70
912 __ |a GBV_ILN_134
912 __ |a GBV_ILN_2021
912 __ |a GBV_ILN_2190
951 __ |a AR
952 __ |d 44 |j 2016 |e 3 |h 393-405
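The catalog's staff view also carries this record as MARCXML (namespace http://www.loc.gov/MARC21/slim). A minimal sketch of pulling the title (245 |a), DOI (024 |a), and host-item details (773 |g) out of such a file with only the standard library; the file name record.xml and a <collection> wrapper element are assumptions:

```python
# Minimal MARCXML reader for this record, standard library only.
# Assumes the record was saved as record.xml in the MARC21 slim schema.
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/MARC21/slim"}

def subfields(record, tag, code):
    """Yield subfield values for a given datafield tag and subfield code."""
    for df in record.findall(f"m:datafield[@tag='{tag}']", NS):
        for sf in df.findall(f"m:subfield[@code='{code}']", NS):
            yield sf.text

record = ET.parse("record.xml").getroot().find(".//m:record", NS)
print("Title:", next(subfields(record, "245", "a")))   # -> Cambricon
print("DOI:  ", next(subfields(record, "024", "a")))   # -> 10.1145/3007787.3001179
print("In:   ", list(subfields(record, "773", "g")))   # volume/issue/pages strings
```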