Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures
Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and c...
Detailed description
Author: | Liu, Dajiang [author] |
Format: | Article |
Language: | English |
Published: | 2015 |
Parent work: | Contained in: IEEE transactions on very large scale integration (VLSI) systems - New York, NY : Institute of Electrical and Electronics Engineers, 1993, 23(2015), 11, pages 2581-2594 |
Parent work: | volume:23 ; year:2015 ; number:11 ; pages:2581-2594 |
DOI / URN: | 10.1109/TVLSI.2014.2371854 |
Catalog ID: | OLC1959572741 |
LEADER | 01000caa a2200265 4500 | ||
---|---|---|---|
001 | OLC1959572741 | ||
003 | DE-627 | ||
005 | 20230714151614.0 | ||
007 | tu | ||
008 | 160206s2015 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1109/TVLSI.2014.2371854 |2 doi | |
028 | 5 | 2 | |a PQ20160617 |
035 | |a (DE-627)OLC1959572741 | ||
035 | |a (DE-599)GBVOLC1959572741 | ||
035 | |a (PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0 | ||
035 | |a (KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 004 |a 620 |q DNB |
100 | 1 | |a Liu, Dajiang |e verfasserin |4 aut | |
245 | 1 | 0 | |a Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures |
264 | 1 | |c 2015 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
520 | |a Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back-end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. The runtime complexity of our approach is also acceptable. | ||
650 | 4 | |a loop nests | |
650 | 4 | |a polyhedral model | |
650 | 4 | |a Context | |
650 | 4 | |a Hardware | |
650 | 4 | |a Power demand | |
650 | 4 | |a Vectors | |
650 | 4 | |a coarse-grained reconfigurable architecture (CGRA) | |
650 | 4 | |a Affine transformation | |
650 | 4 | |a Arrays | |
650 | 4 | |a Optimization | |
700 | 1 | |a Yin, Shouyi |4 oth | |
700 | 1 | |a Peng, Yu |4 oth | |
700 | 1 | |a Liu, Leibo |4 oth | |
700 | 1 | |a Wei, Shaojun |4 oth | |
773 | 0 | 8 | |i Enthalten in |t IEEE transactions on very large scale integration (VLSI) systems |d New York, NY : Institute of Electrical and Electronics Engineers, 1993 |g 23(2015), 11, Seite 2581-2594 |w (DE-627)165670282 |w (DE-600)1151835-2 |w (DE-576)034204024 |x 1063-8210 |7 nnns |
773 | 1 | 8 | |g volume:23 |g year:2015 |g number:11 |g pages:2581-2594 |
856 | 4 | 1 | |u http://dx.doi.org/10.1109/TVLSI.2014.2371854 |3 Volltext |
856 | 4 | 2 | |u http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969 |
856 | 4 | 2 | |u http://search.proquest.com/docview/1729394789 |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-TEC | ||
912 | |a SSG-OLC-MAT | ||
912 | |a GBV_ILN_70 | ||
912 | |a GBV_ILN_2002 | ||
951 | |a AR | ||
952 | |d 23 |j 2015 |e 11 |h 2581-2594 |
author_variant |
d l dl |
---|---|
matchkey_str |
article:10638210:2015----::piiigptampigfetdopocasgandeo |
hierarchy_sort_str |
2015 |
publishDate |
2015 |
allfields |
10.1109/TVLSI.2014.2371854 doi PQ20160617 (DE-627)OLC1959572741 (DE-599)GBVOLC1959572741 (PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0 (KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain DE-627 ger DE-627 rakwb eng 004 620 DNB Liu, Dajiang verfasserin aut Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures 2015 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back-end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. The runtime complexity of our approach is also acceptable. 
loop nests polyhedral model Context Hardware Power demand Vectors coarse-grained reconfigurable architecture (CGRA) Affine transformation Arrays Optimization Yin, Shouyi oth Peng, Yu oth Liu, Leibo oth Wei, Shaojun oth Enthalten in IEEE transactions on very large scale integration (VLSI) systems New York, NY : Institute of Electrical and Electronics Engineers, 1993 23(2015), 11, Seite 2581-2594 (DE-627)165670282 (DE-600)1151835-2 (DE-576)034204024 1063-8210 nnns volume:23 year:2015 number:11 pages:2581-2594 http://dx.doi.org/10.1109/TVLSI.2014.2371854 Volltext http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969 http://search.proquest.com/docview/1729394789 GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_2002 AR 23 2015 11 2581-2594 |
language |
English |
source |
Enthalten in IEEE transactions on very large scale integration (VLSI) systems 23(2015), 11, Seite 2581-2594 volume:23 year:2015 number:11 pages:2581-2594 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
loop nests polyhedral model Context Hardware Power demand Vectors coarse-grained reconfigurable architecture (CGRA) Affine transformation Arrays Optimization |
dewey-raw |
004 |
isfreeaccess_bool |
false |
container_title |
IEEE transactions on very large scale integration (VLSI) systems |
authorswithroles_txt_mv |
Liu, Dajiang @@aut@@ Yin, Shouyi @@oth@@ Peng, Yu @@oth@@ Liu, Leibo @@oth@@ Wei, Shaojun @@oth@@ |
publishDateDaySort_date |
2015-01-01T00:00:00Z |
hierarchy_top_id |
165670282 |
dewey-sort |
14 |
id |
OLC1959572741 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a2200265 4500</leader><controlfield tag="001">OLC1959572741</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230714151614.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">160206s2015 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1109/TVLSI.2014.2371854</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">PQ20160617</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC1959572741</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBVOLC1959572741</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="a">620</subfield><subfield code="q">DNB</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Liu, Dajiang</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield 
code="c">2015</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back-end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. 
The runtime complexity of our approach is also acceptable.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">loop nests</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">polyhedral model</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Context</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Hardware</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Power demand</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Vectors</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">coarse-grained reconfigurable architecture (CGRA)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Affine transformation</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Arrays</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Optimization</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yin, Shouyi</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Peng, Yu</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Liu, Leibo</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wei, Shaojun</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">IEEE transactions on very large scale integration (VLSI) systems</subfield><subfield code="d">New York, NY : Institute of Electrical and Electronics Engineers, 1993</subfield><subfield code="g">23(2015), 11, Seite 2581-2594</subfield><subfield code="w">(DE-627)165670282</subfield><subfield code="w">(DE-600)1151835-2</subfield><subfield 
code="w">(DE-576)034204024</subfield><subfield code="x">1063-8210</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:23</subfield><subfield code="g">year:2015</subfield><subfield code="g">number:11</subfield><subfield code="g">pages:2581-2594</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://dx.doi.org/10.1109/TVLSI.2014.2371854</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://search.proquest.com/docview/1729394789</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-TEC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2002</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">23</subfield><subfield code="j">2015</subfield><subfield code="e">11</subfield><subfield code="h">2581-2594</subfield></datafield></record></collection>
|
author |
Liu, Dajiang |
spellingShingle |
Liu, Dajiang ddc 004 misc loop nests misc polyhedral model misc Context misc Hardware misc Power demand misc Vectors misc coarse-grained reconfigurable architecture (CGRA) misc Affine transformation misc Arrays misc Optimization Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures |
authorStr |
Liu, Dajiang |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)165670282 |
format |
Article |
dewey-ones |
004 - Data processing & computer science 620 - Engineering & allied operations |
delete_txt_mv |
keep |
author_role |
aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
1063-8210 |
topic_title |
004 620 DNB Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures loop nests polyhedral model Context Hardware Power demand Vectors coarse-grained reconfigurable architecture (CGRA) Affine transformation Arrays Optimization |
topic |
ddc 004 misc loop nests misc polyhedral model misc Context misc Hardware misc Power demand misc Vectors misc coarse-grained reconfigurable architecture (CGRA) misc Affine transformation misc Arrays misc Optimization |
format_facet |
Articles Printed articles |
format_main_str_mv |
Text Journal/Article |
carriertype_str_mv |
nc |
author2_variant |
s y sy y p yp l l ll s w sw |
hierarchy_parent_title |
IEEE transactions on very large scale integration (VLSI) systems |
hierarchy_parent_id |
165670282 |
dewey-tens |
000 - Computer science, knowledge & systems 620 - Engineering |
hierarchy_top_title |
IEEE transactions on very large scale integration (VLSI) systems |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)165670282 (DE-600)1151835-2 (DE-576)034204024 |
title |
Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures |
ctrlnum |
(DE-627)OLC1959572741 (DE-599)GBVOLC1959572741 (PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0 (KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain |
title_full |
Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures |
author_sort |
Liu, Dajiang |
journal |
IEEE transactions on very large scale integration (VLSI) systems |
journalStr |
IEEE transactions on very large scale integration (VLSI) systems |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works 600 - Technology |
recordtype |
marc |
publishDateSort |
2015 |
contenttype_str_mv |
txt |
container_start_page |
2581 |
author_browse |
Liu, Dajiang |
container_volume |
23 |
class |
004 620 DNB |
format_se |
Articles |
author-letter |
Liu, Dajiang |
doi_str_mv |
10.1109/TVLSI.2014.2371854 |
dewey-full |
004 620 |
title_sort |
optimizing spatial mapping of nested loop for coarse-grained reconfigurable architectures |
title_auth |
Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures |
abstract |
Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back-end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. The runtime complexity of our approach is also acceptable. |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_2002 |
container_issue |
11 |
title_short |
Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures |
url |
http://dx.doi.org/10.1109/TVLSI.2014.2371854 http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969 http://search.proquest.com/docview/1729394789 |
remote_bool |
false |
author2 |
Yin, Shouyi Peng, Yu Liu, Leibo Wei, Shaojun |
author2Str |
Yin, Shouyi Peng, Yu Liu, Leibo Wei, Shaojun |
ppnlink |
165670282 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
author2_role |
oth oth oth oth |
doi_str |
10.1109/TVLSI.2014.2371854 |
up_date |
2024-07-03T17:55:49.090Z |
_version_ |
1803581494766075904 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a2200265 4500</leader><controlfield tag="001">OLC1959572741</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230714151614.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">160206s2015 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1109/TVLSI.2014.2371854</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">PQ20160617</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC1959572741</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBVOLC1959572741</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="a">620</subfield><subfield code="q">DNB</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Liu, Dajiang</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield 
code="c">2015</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back- end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. 
The runtime complexity of our approach is also acceptable.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">loop nests</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">polyhedral model</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Context</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Hardware</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Power demand</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Vectors</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">coarse-grained reconfigurable architecture (CGRA)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Affine transformation</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Arrays</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Optimization</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yin, Shouyi</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Peng, Yu</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Liu, Leibo</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wei, Shaojun</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">IEEE transactions on very large scale integration (VLSI) systems</subfield><subfield code="d">New York, NY : Institute of Electrical and Electronics Engineers, 1993</subfield><subfield code="g">23(2015), 11, Seite 2581-2594</subfield><subfield code="w">(DE-627)165670282</subfield><subfield code="w">(DE-600)1151835-2</subfield><subfield 
code="w">(DE-576)034204024</subfield><subfield code="x">1063-8210</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:23</subfield><subfield code="g">year:2015</subfield><subfield code="g">number:11</subfield><subfield code="g">pages:2581-2594</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://dx.doi.org/10.1109/TVLSI.2014.2371854</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://search.proquest.com/docview/1729394789</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-TEC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2002</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">23</subfield><subfield code="j">2015</subfield><subfield code="e">11</subfield><subfield code="h">2581-2594</subfield></datafield></record></collection>
|