Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems

The relaxed semantics and rich functionality of the one-sided communication primitives of MPI-3 make MPI an attractive candidate for the implementation of PGAS models. However, the performance of such implementations suffers from the fact that current MPI RMA implementations typically have a large over...
Detailed description

Author:	Zhou, Huan [author]; Idrees, Kamran; Gracia, José

Format:	Article
Language:	English

Published:	2016

Keywords:	Cluster Computing; Parallel; Distributed; Computer Science

Parent work:	Contained in: Lecture notes in computer science - Berlin, Germany : Springer, 1973, (2016)
Parent work:	year:2016

Links:	Full text

DOI / URN:	10.1007/978-3-662-48096-0_29

Catalog ID:	OLC1973550369

Internal format


LEADER	01000caa a2200265 4500
001	OLC1973550369
003	DE-627
005	20220224094737.0
007	tu
008	160430s2016 xx \|\|\|\|\| 00\| \|\|eng c
024	7		\|a 10.1007/978-3-662-48096-0_29 \|2 doi
028	5	2	\|a PQ20160430
035			\|a (DE-627)OLC1973550369
035			\|a (DE-599)GBVOLC1973550369
035			\|a (PRQ)a627-70ebbc031968e56581418beefc0b1eaccf00f1366b9105edf9a1fe8f523cd91c0
035			\|a (KEY)0013707320160000000000000000leveragingmpi3sharedmemoryextensionsforefficientpg
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
082	0	4	\|a 004 \|q DNB
082	0	4	\|a 620 \|q AVZ
100	1		\|a Zhou, Huan \|e verfasserin \|4 aut
245	1	0	\|a Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems
264		1	\|c 2016
336			\|a Text \|b txt \|2 rdacontent
337			\|a ohne Hilfsmittel zu benutzen \|b n \|2 rdamedia
338			\|a Band \|b nc \|2 rdacarrier
520			\|a The relaxed semantics and rich functionality of one-sided communication primitives of MPI-3 makes MPI an attractive candidate for the implementation of PGAS models. However, the performance of such implementation suffers from the fact, that current MPI RMA implementations typically have a large overhead when source and target of a communication request share a common, local physical memory. In this paper, we present an optimized PGAS-like runtime system which uses the new MPI-3 shared-memory extensions to serve intra-node communication requests and MPI-3 one-sided communication primitives to serve inter-node communication requests. The performance of our runtime system is evaluated on a Cray XC40 system through low-level communication benchmarks, a random-access benchmark and a stencil kernel. The results of the experiments demonstrate that the performance of our hybrid runtime system matches the performance of low-level RMA libraries for intra-node transfers, and that of MPI-3 for inter-node transfers.
650		4	\|a Cluster Computing
650		4	\|a Parallel
650		4	\|a Distributed
650		4	\|a Computer Science
700	1		\|a Idrees, Kamran \|4 oth
700	1		\|a Gracia, José \|4 oth
773	0	8	\|i Enthalten in \|t Lecture notes in computer science \|d Berlin, Germany : Springer, 1973 \|g (2016) \|w (DE-627)129300152 \|w (DE-600)121909-1 \|w (DE-576)014492687 \|x 0302-9743
773	1	8	\|g year:2016
856	4	1	\|u http://dx.doi.org/10.1007/978-3-662-48096-0_29 \|3 Volltext
856	4	2	\|u http://arxiv.org/abs/1603.02226
912			\|a GBV_USEFLAG_A
912			\|a SYSFLAG_A
912			\|a GBV_OLC
912			\|a SSG-OLC-TEC
912			\|a SSG-OLC-MAT
912			\|a SSG-OPC-BBI
912			\|a GBV_ILN_70
912			\|a GBV_ILN_2018
951			\|a AR
952			\|j 2016

Index fields

author_variant	h z hz
matchkey_str	article:03029743:2016----::eeaigp3hrdeoyxesosoefcet
hierarchy_sort_str	2016
publishDate	2016
allfields	10.1007/978-3-662-48096-0_29 doi PQ20160430 (DE-627)OLC1973550369 (DE-599)GBVOLC1973550369 (PRQ)a627-70ebbc031968e56581418beefc0b1eaccf00f1366b9105edf9a1fe8f523cd91c0 (KEY)0013707320160000000000000000leveragingmpi3sharedmemoryextensionsforefficientpg DE-627 ger DE-627 rakwb eng 004 DNB 620 AVZ Zhou, Huan verfasserin aut Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems 2016 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier The relaxed semantics and rich functionality of one-sided communication primitives of MPI-3 makes MPI an attractive candidate for the implementation of PGAS models. However, the performance of such implementation suffers from the fact, that current MPI RMA implementations typically have a large overhead when source and target of a communication request share a common, local physical memory. In this paper, we present an optimized PGAS-like runtime system which uses the new MPI-3 shared-memory extensions to serve intra-node communication requests and MPI-3 one-sided communication primitives to serve inter-node communication requests. The performance of our runtime system is evaluated on a Cray XC40 system through low-level communication benchmarks, a random-access benchmark and a stencil kernel. The results of the experiments demonstrate that the performance of our hybrid runtime system matches the performance of low-level RMA libraries for intra-node transfers, and that of MPI-3 for inter-node transfers. Cluster Computing Parallel Distributed Computer Science Idrees, Kamran oth Gracia, José oth Enthalten in Lecture notes in computer science Berlin, Germany : Springer, 1973 (2016) (DE-627)129300152 (DE-600)121909-1 (DE-576)014492687 0302-9743 year:2016 http://dx.doi.org/10.1007/978-3-662-48096-0_29 Volltext http://arxiv.org/abs/1603.02226 GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC SSG-OLC-MAT SSG-OPC-BBI GBV_ILN_70 GBV_ILN_2018 AR 2016
language	English
source	Enthalten in Lecture notes in computer science (2016) year:2016
sourceStr	Enthalten in Lecture notes in computer science (2016) year:2016
format_phy_str_mv	Article
institution	findex.gbv.de
topic_facet	Cluster Computing Parallel Distributed Computer Science
dewey-raw	004
isfreeaccess_bool	false
container_title	Lecture notes in computer science
authorswithroles_txt_mv	Zhou, Huan @@aut@@ Idrees, Kamran @@oth@@ Gracia, José @@oth@@
publishDateDaySort_date	2016-01-01T00:00:00Z
hierarchy_top_id	129300152
dewey-sort	14
id	OLC1973550369
language_de	englisch
fullrecord	<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a2200265 4500</leader><controlfield tag="001">OLC1973550369</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20220224094737.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">160430s2016 xx \|\|\|\|\| 00\| \|\|eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/978-3-662-48096-0_29</subfield><subfield code="2">doi</subfield></datafield><datafield tag="028" ind1="5" ind2="2"><subfield code="a">PQ20160430</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC1973550369</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-599)GBVOLC1973550369</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(PRQ)a627-70ebbc031968e56581418beefc0b1eaccf00f1366b9105edf9a1fe8f523cd91c0</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(KEY)0013707320160000000000000000leveragingmpi3sharedmemoryextensionsforefficientpg</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">004</subfield><subfield code="q">DNB</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">620</subfield><subfield code="q">AVZ</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Zhou, Huan</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime 
Systems</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2016</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">The relaxed semantics and rich functionality of one-sided communication primitives of MPI-3 makes MPI an attractive candidate for the implementation of PGAS models. However, the performance of such implementation suffers from the fact, that current MPI RMA implementations typically have a large overhead when source and target of a communication request share a common, local physical memory. In this paper, we present an optimized PGAS-like runtime system which uses the new MPI-3 shared-memory extensions to serve intra-node communication requests and MPI-3 one-sided communication primitives to serve inter-node communication requests. The performance of our runtime system is evaluated on a Cray XC40 system through low-level communication benchmarks, a random-access benchmark and a stencil kernel. 
The results of the experiments demonstrate that the performance of our hybrid runtime system matches the performance of low-level RMA libraries for intra-node transfers, and that of MPI-3 for inter-node transfers.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Cluster Computing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Parallel</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Distributed</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Computer Science</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Idrees, Kamran</subfield><subfield code="4">oth</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Gracia, José</subfield><subfield code="4">oth</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Lecture notes in computer science</subfield><subfield code="d">Berlin, Germany : Springer, 1973</subfield><subfield code="g">(2016)</subfield><subfield code="w">(DE-627)129300152</subfield><subfield code="w">(DE-600)121909-1</subfield><subfield code="w">(DE-576)014492687</subfield><subfield code="x">0302-9743</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">year:2016</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">http://dx.doi.org/10.1007/978-3-662-48096-0_29</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="856" ind1="4" ind2="2"><subfield code="u">http://arxiv.org/abs/1603.02226</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">SSG-OLC-TEC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2018</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="j">2016</subfield></datafield></record></collection>
author	Zhou, Huan
spellingShingle	Zhou, Huan ddc 004 ddc 620 misc Cluster Computing misc Parallel misc Distributed misc Computer Science Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems
authorStr	Zhou, Huan
ppnlink_with_tag_str_mv	@@773@@(DE-627)129300152
format	Article
dewey-ones	004 - Data processing & computer science 620 - Engineering & allied operations
delete_txt_mv	keep
author_role	aut
collection	OLC
remote_str	false
illustrated	Not Illustrated
issn	0302-9743
topic_title	004 DNB 620 AVZ Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems Cluster Computing Parallel Distributed Computer Science
topic	ddc 004 ddc 620 misc Cluster Computing misc Parallel misc Distributed misc Computer Science
topic_unstemmed	ddc 004 ddc 620 misc Cluster Computing misc Parallel misc Distributed misc Computer Science
topic_browse	ddc 004 ddc 620 misc Cluster Computing misc Parallel misc Distributed misc Computer Science
format_facet	Aufsätze Gedruckte Aufsätze
format_main_str_mv	Text Zeitschrift/Artikel
carriertype_str_mv	nc
author2_variant	k i ki j g jg
hierarchy_parent_title	Lecture notes in computer science
hierarchy_parent_id	129300152
dewey-tens	000 - Computer science, knowledge & systems 620 - Engineering
hierarchy_top_title	Lecture notes in computer science
isfreeaccess_txt	false
familylinks_str_mv	(DE-627)129300152 (DE-600)121909-1 (DE-576)014492687
title	Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems
ctrlnum	(DE-627)OLC1973550369 (DE-599)GBVOLC1973550369 (PRQ)a627-70ebbc031968e56581418beefc0b1eaccf00f1366b9105edf9a1fe8f523cd91c0 (KEY)0013707320160000000000000000leveragingmpi3sharedmemoryextensionsforefficientpg
title_full	Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems
author_sort	Zhou, Huan
journal	Lecture notes in computer science
journalStr	Lecture notes in computer science
lang_code	eng
isOA_bool	false
dewey-hundreds	000 - Computer science, information & general works 600 - Technology
recordtype	marc
publishDateSort	2016
contenttype_str_mv	txt
author_browse	Zhou, Huan
class	004 DNB 620 AVZ
format_se	Aufsätze
author-letter	Zhou, Huan
doi_str_mv	10.1007/978-3-662-48096-0_29
dewey-full	004 620
title_sort	leveraging mpi-3 shared-memory extensions for efficient pgas runtime systems
title_auth	Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems
abstract	The relaxed semantics and rich functionality of the one-sided communication primitives of MPI-3 make MPI an attractive candidate for the implementation of PGAS models. However, the performance of such implementations suffers from the fact that current MPI RMA implementations typically have a large overhead when the source and target of a communication request share a common, local physical memory. In this paper, we present an optimized PGAS-like runtime system which uses the new MPI-3 shared-memory extensions to serve intra-node communication requests and MPI-3 one-sided communication primitives to serve inter-node communication requests. The performance of our runtime system is evaluated on a Cray XC40 system through low-level communication benchmarks, a random-access benchmark and a stencil kernel. The results of the experiments demonstrate that the performance of our hybrid runtime system matches the performance of low-level RMA libraries for intra-node transfers, and that of MPI-3 for inter-node transfers.
collection_details	GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-TEC SSG-OLC-MAT SSG-OPC-BBI GBV_ILN_70 GBV_ILN_2018
title_short	Leveraging MPI-3 Shared-Memory Extensions for Efficient PGAS Runtime Systems
url	http://dx.doi.org/10.1007/978-3-662-48096-0_29 http://arxiv.org/abs/1603.02226
remote_bool	false
author2	Idrees, Kamran Gracia, José
author2Str	Idrees, Kamran Gracia, José
ppnlink	129300152
mediatype_str_mv	n
isOA_txt	false
hochschulschrift_bool	false
author2_role	oth oth
doi_str	10.1007/978-3-662-48096-0_29
up_date	2024-07-04T02:40:49.777Z
_version_	1803614525621010432
score	7.400075