Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures

Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and c...

Author:	Liu, Dajiang [author]; Yin, Shouyi; Peng, Yu; Liu, Leibo; Wei, Shaojun

Format:	Article
Language:	English

Published:	2015

Keywords:	loop nests; polyhedral model; Context; Hardware; Power demand; Vectors; coarse-grained reconfigurable architecture (CGRA); Affine transformation; Arrays; Optimization

Parent work:	Contained in: IEEE transactions on very large scale integration (VLSI) systems - New York, NY : Institute of Electrical and Electronics Engineers, 1993, 23(2015), 11, pages 2581-2594
Parent work:	volume:23 ; year:2015 ; number:11 ; pages:2581-2594

DOI / URN:	10.1109/TVLSI.2014.2371854

Catalog ID:	OLC1959572741

Internal format


LEADER	01000caa a2200265 4500
001	OLC1959572741
003	DE-627
005	20230714151614.0
007	tu
008	160206s2015 xx \|\|\|\|\| 00\| \|\|eng c
024	7		\|a 10.1109/TVLSI.2014.2371854 \|2 doi
028	5	2	\|a PQ20160617
035			\|a (DE-627)OLC1959572741
035			\|a (DE-599)GBVOLC1959572741
035			\|a (PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0
035			\|a (KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
082	0	4	\|a 004 \|a 620 \|q DNB
100	1		\|a Liu, Dajiang \|e verfasserin \|4 aut
245	1	0	\|a Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures
264		1	\|c 2015
336			\|a Text \|b txt \|2 rdacontent
337			\|a ohne Hilfsmittel zu benutzen \|b n \|2 rdamedia
338			\|a Band \|b nc \|2 rdacarrier
520			\|a Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back-end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. The runtime complexity of our approach is also acceptable.
650		4	\|a loop nests
650		4	\|a polyhedral model
650		4	\|a Context
650		4	\|a Hardware
650		4	\|a Power demand
650		4	\|a Vectors
650		4	\|a coarse-grained reconfigurable architecture (CGRA)
650		4	\|a Affine transformation
650		4	\|a Arrays
650		4	\|a Optimization
700	1		\|a Yin, Shouyi \|4 oth
700	1		\|a Peng, Yu \|4 oth
700	1		\|a Liu, Leibo \|4 oth
700	1		\|a Wei, Shaojun \|4 oth
773	0	8	\|i Enthalten in \|t IEEE transactions on very large scale integration (VLSI) systems \|d New York, NY : Institute of Electrical and Electronics Engineers, 1993 \|g 23(2015), 11, Seite 2581-2594 \|w (DE-627)165670282 \|w (DE-600)1151835-2 \|w (DE-576)034204024 \|x 1063-8210 \|7 nnns
773	1	8	\|g volume:23 \|g year:2015 \|g number:11 \|g pages:2581-2594
856	4	1	\|u http://dx.doi.org/10.1109/TVLSI.2014.2371854 \|3 Volltext
856	4	2	\|u http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969
856	4	2	\|u http://search.proquest.com/docview/1729394789
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_OLC
912			\|a SSG-OLC-TEC
912			\|a SSG-OLC-MAT
912			\|a GBV_ILN_70
912			\|a GBV_ILN_2002
951			\|a AR
952			\|d 23 \|j 2015 \|e 11 \|h 2581-2594

Index fields

author_variant	d l dl
matchkey_str	article:10638210:2015----::piiigptampigfetdopocasgandeo
hierarchy_sort_str	2015
publishDate	2015
allfields	10.1109/TVLSI.2014.2371854 doi PQ20160617 (DE-627)OLC1959572741 (DE-599)GBVOLC1959572741 (PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0 (KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain DE-627 ger DE-627 rakwb eng 004 620 DNB Liu, Dajiang verfasserin aut Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures 2015 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back- end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. The runtime complexity of our approach is also acceptable. 
loop nests polyhedral model Context Hardware Power demand Vectors coarse-grained reconfigurable architecture (CGRA) Affine transformation Arrays Optimization Yin, Shouyi oth Peng, Yu oth Liu, Leibo oth Wei, Shaojun oth Enthalten in IEEE transactions on very large scale integration (VLSI) systems New York, NY : Institute of Electrical and Electronics Engineers, 1993 23(2015), 11, Seite 2581-2594 (DE-627)165670282 (DE-600)1151835-2 (DE-576)034204024 1063-8210 nnns volume:23 year:2015 number:11 pages:2581-2594 http://dx.doi.org/10.1109/TVLSI.2014.2371854 Volltext http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969 http://search.proquest.com/docview/1729394789 GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_2002 AR 23 2015 11 2581-2594
language	English
source	Enthalten in IEEE transactions on very large scale integration (VLSI) systems 23(2015), 11, Seite 2581-2594 volume:23 year:2015 number:11 pages:2581-2594
sourceStr	Enthalten in IEEE transactions on very large scale integration (VLSI) systems 23(2015), 11, Seite 2581-2594 volume:23 year:2015 number:11 pages:2581-2594
format_phy_str_mv	Article
institution	findex.gbv.de
topic_facet	loop nests polyhedral model Context Hardware Power demand Vectors coarse-grained reconfigurable architecture (CGRA) Affine transformation Arrays Optimization
dewey-raw	004
isfreeaccess_bool	false
container_title	IEEE transactions on very large scale integration (VLSI) systems
authorswithroles_txt_mv	Liu, Dajiang @@aut@@ Yin, Shouyi @@oth@@ Peng, Yu @@oth@@ Liu, Leibo @@oth@@ Wei, Shaojun @@oth@@
publishDateDaySort_date	2015-01-01T00:00:00Z
hierarchy_top_id	165670282
dewey-sort	14
id	OLC1959572741
language_de	englisch
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a2200265 4500</leader><controlfield tag="001">OLC1959572741</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230714151614.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">160206s2015 xx \|\|\|\|\| 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1109/TVLSI.2014.2371854</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">PQ20160617</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC1959572741</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBVOLC1959572741</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="a">620</subfield><subfield code="q">DNB</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Liu, Dajiang</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield 
code="c">2015</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back- end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. 
The runtime complexity of our approach is also acceptable.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">loop nests</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">polyhedral model</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Context</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Hardware</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Power demand</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Vectors</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">coarse-grained reconfigurable architecture (CGRA)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Affine transformation</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Arrays</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Optimization</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yin, Shouyi</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Peng, Yu</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Liu, Leibo</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wei, Shaojun</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">IEEE transactions on very large scale integration (VLSI) systems</subfield><subfield code="d">New York, NY : Institute of Electrical and Electronics Engineers, 1993</subfield><subfield code="g">23(2015), 11, Seite 2581-2594</subfield><subfield code="w">(DE-627)165670282</subfield><subfield code="w">(DE-600)1151835-2</subfield><subfield 
code="w">(DE-576)034204024</subfield><subfield code="x">1063-8210</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:23</subfield><subfield code="g">year:2015</subfield><subfield code="g">number:11</subfield><subfield code="g">pages:2581-2594</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://dx.doi.org/10.1109/TVLSI.2014.2371854</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://search.proquest.com/docview/1729394789</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-TEC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2002</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">23</subfield><subfield code="j">2015</subfield><subfield code="e">11</subfield><subfield code="h">2581-2594</subfield></datafield></record></collection>
author	Liu, Dajiang
spellingShingle	Liu, Dajiang ddc 004 misc loop nests misc polyhedral model misc Context misc Hardware misc Power demand misc Vectors misc coarse-grained reconfigurable architecture (CGRA) misc Affine transformation misc Arrays misc Optimization Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures
authorStr	Liu, Dajiang
ppnlink_with_tag_str_mv	@@773@@(DE-627)165670282
format	Article
dewey-ones	004 - Data processing & computer science 620 - Engineering & allied operations
delete_txt_mv	keep
author_role	aut
collection	OLC
remote_str	false
illustrated	Not Illustrated
issn	1063-8210
topic_title	004 620 DNB Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures loop nests polyhedral model Context Hardware Power demand Vectors coarse-grained reconfigurable architecture (CGRA) Affine transformation Arrays Optimization
topic	ddc 004 misc loop nests misc polyhedral model misc Context misc Hardware misc Power demand misc Vectors misc coarse-grained reconfigurable architecture (CGRA) misc Affine transformation misc Arrays misc Optimization
topic_unstemmed	ddc 004 misc loop nests misc polyhedral model misc Context misc Hardware misc Power demand misc Vectors misc coarse-grained reconfigurable architecture (CGRA) misc Affine transformation misc Arrays misc Optimization
topic_browse	ddc 004 misc loop nests misc polyhedral model misc Context misc Hardware misc Power demand misc Vectors misc coarse-grained reconfigurable architecture (CGRA) misc Affine transformation misc Arrays misc Optimization
format_facet	Aufsätze Gedruckte Aufsätze
format_main_str_mv	Text Zeitschrift/Artikel
carriertype_str_mv	nc
author2_variant	s y sy y p yp l l ll s w sw
hierarchy_parent_title	IEEE transactions on very large scale integration (VLSI) systems
hierarchy_parent_id	165670282
dewey-tens	000 - Computer science, knowledge & systems 620 - Engineering
hierarchy_top_title	IEEE transactions on very large scale integration (VLSI) systems
isfreeaccess_txt	false
familylinks_str_mv	(DE-627)165670282 (DE-600)1151835-2 (DE-576)034204024
title	Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures
ctrlnum	(DE-627)OLC1959572741 (DE-599)GBVOLC1959572741 (PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0 (KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain
title_full	Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures
author_sort	Liu, Dajiang
journal	IEEE transactions on very large scale integration (VLSI) systems
journalStr	IEEE transactions on very large scale integration (VLSI) systems
lang_code	eng
isOA_bool	false
dewey-hundreds	000 - Computer science, information & general works 600 - Technology
recordtype	marc
publishDateSort	2015
contenttype_str_mv	txt
container_start_page	2581
author_browse	Liu, Dajiang
container_volume	23
class	004 620 DNB
format_se	Aufsätze
author-letter	Liu, Dajiang
doi_str_mv	10.1109/TVLSI.2014.2371854
dewey-full	004 620
title_sort	optimizing spatial mapping of nested loop for coarse-grained reconfigurable architectures
title_auth	Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures
abstract	Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back-end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. The runtime complexity of our approach is also acceptable.
collection_details	GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC SSG-OLC-MAT GBV_ILN_70 GBV_ILN_2002
container_issue	11
title_short	Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures
url	http://dx.doi.org/10.1109/TVLSI.2014.2371854 http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969 http://search.proquest.com/docview/1729394789
remote_bool	false
author2	Yin, Shouyi Peng, Yu Liu, Leibo Wei, Shaojun
author2Str	Yin, Shouyi Peng, Yu Liu, Leibo Wei, Shaojun
ppnlink	165670282
mediatype_str_mv	n
isOA_txt	false
hochschulschrift_bool	false
author2_role	oth oth oth oth
doi_str	10.1109/TVLSI.2014.2371854
up_date	2024-07-03T17:55:49.090Z
_version_	1803581494766075904
fullrecord_marcxml	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a2200265 4500</leader><controlfield tag="001">OLC1959572741</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230714151614.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">160206s2015 xx \|\|\|\|\| 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1109/TVLSI.2014.2371854</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">PQ20160617</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC1959572741</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBVOLC1959572741</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(PRQ)c1600-f9c4b257e760b3c23cd2c5ab09b0b41fcaa401f153ef32d38d394fa18b09ac0d0</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(KEY)0226264920150000023001102581optimizingspatialmappingofnestedloopforcoarsegrain</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="a">620</subfield><subfield code="q">DNB</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Liu, Dajiang</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Optimizing Spatial Mapping of Nested Loop for Coarse-Grained Reconfigurable Architectures</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield 
code="c">2015</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Coarse-grained reconfigurable architectures (CGRAs) have drawn increasing attention due to their flexibility and efficiency. Loops in applications are often mapped onto CGRAs for acceleration, and the mapping of loops onto CGRA is quite a challenging work due to the parallel execution paradigm and constrained hardware resource. To map loops onto CGRAs efficiently, it is important to transform loops into pieces that obey hardware resource constraints with less overhead (e.g., communication and configuration overhead). In this paper, we tackle this problem by establishing a performance optimization problem, including loop transformation and back- end placing and routing. A novel searching strategy is also designed to find the optimal result efficiently. Finally, we built a complete flow of mapping loop nests onto CGRA. Experiment results on most kernels of the Polybench show that our proposed approach can improve the performance of the kernels by 42% on average, as compared with the state-of-the-art methods. 
The runtime complexity of our approach is also acceptable.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">loop nests</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">polyhedral model</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Context</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Hardware</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Power demand</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Vectors</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">coarse-grained reconfigurable architecture (CGRA)</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Affine transformation</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Arrays</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Optimization</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Yin, Shouyi</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Peng, Yu</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Liu, Leibo</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Wei, Shaojun</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">IEEE transactions on very large scale integration (VLSI) systems</subfield><subfield code="d">New York, NY : Institute of Electrical and Electronics Engineers, 1993</subfield><subfield code="g">23(2015), 11, Seite 2581-2594</subfield><subfield code="w">(DE-627)165670282</subfield><subfield code="w">(DE-600)1151835-2</subfield><subfield 
code="w">(DE-576)034204024</subfield><subfield code="x">1063-8210</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:23</subfield><subfield code="g">year:2015</subfield><subfield code="g">number:11</subfield><subfield code="g">pages:2581-2594</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://dx.doi.org/10.1109/TVLSI.2014.2371854</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6977969</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://search.proquest.com/docview/1729394789</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-TEC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2002</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">23</subfield><subfield code="j">2015</subfield><subfield code="e">11</subfield><subfield code="h">2581-2594</subfield></datafield></record></collection>
score	7.400405

