Distributed top-k aggregation queries at large
Abstract Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods f...
Detailed description
Author: | Neumann, Thomas [author] |
---|---|
Format: | Article |
Language: | English |
Published: | 2009 |
Keywords: |
Note: | © The Author(s) 2009 |
Parent work: | Contained in: Distributed and parallel databases - Springer US, 1993, 26(2009), no. 1, 18 June, pages 3-27 |
Parent work: | volume:26 ; year:2009 ; number:1 ; day:18 ; month:06 ; pages:3-27 |
Links: |
DOI / URN: | 10.1007/s10619-009-7041-z |
Catalog ID: | OLC2027066659 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2027066659 | ||
003 | DE-627 | ||
005 | 20230503034744.0 | ||
007 | tu | ||
008 | 200819s2009 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s10619-009-7041-z |2 doi | |
035 | |a (DE-627)OLC2027066659 | ||
035 | |a (DE-He213)s10619-009-7041-z-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 070 |a 020 |a 004 |q VZ |
084 | |a 24,1 |2 ssgn | ||
100 | 1 | |a Neumann, Thomas |e verfasserin |4 aut | |
245 | 1 | 0 | |a Distributed top-k aggregation queries at large |
264 | 1 | |c 2009 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © The Author(s) 2009 | ||
520 | |a Abstract Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network. | ||
650 | 4 | |a Top- | |
650 | 4 | |a Distributed queries | |
650 | 4 | |a Query optimization | |
650 | 4 | |a Cost models | |
700 | 1 | |a Bender, Matthias |4 aut | |
700 | 1 | |a Michel, Sebastian |4 aut | |
700 | 1 | |a Schenkel, Ralf |4 aut | |
700 | 1 | |a Triantafillou, Peter |4 aut | |
700 | 1 | |a Weikum, Gerhard |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Distributed and parallel databases |d Springer US, 1993 |g 26(2009), 1 vom: 18. Juni, Seite 3-27 |w (DE-627)165664401 |w (DE-600)913166-8 |w (DE-576)038480352 |x 0926-8782 |7 nnns |
773 | 1 | 8 | |g volume:26 |g year:2009 |g number:1 |g day:18 |g month:06 |g pages:3-27 |
856 | 4 | 1 | |u https://doi.org/10.1007/s10619-009-7041-z |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-MAT | ||
912 | |a SSG-OPC-BBI | ||
912 | |a GBV_ILN_31 | ||
912 | |a GBV_ILN_65 | ||
912 | |a GBV_ILN_70 | ||
912 | |a GBV_ILN_100 | ||
912 | |a GBV_ILN_2005 | ||
912 | |a GBV_ILN_2010 | ||
912 | |a GBV_ILN_2021 | ||
912 | |a GBV_ILN_2244 | ||
912 | |a GBV_ILN_4012 | ||
912 | |a GBV_ILN_4036 | ||
912 | |a GBV_ILN_4126 | ||
912 | |a GBV_ILN_4305 | ||
912 | |a GBV_ILN_4318 | ||
951 | |a AR | ||
952 | |d 26 |j 2009 |e 1 |b 18 |c 06 |h 3-27 |
author_variant |
t n tn m b mb s m sm r s rs p t pt g w gw |
---|---|
matchkey_str |
article:09268782:2009----::itiuetpageainur |
hierarchy_sort_str |
2009 |
publishDate |
2009 |
allfields |
10.1007/s10619-009-7041-z doi (DE-627)OLC2027066659 (DE-He213)s10619-009-7041-z-p DE-627 ger DE-627 rakwb eng 070 020 004 VZ 24,1 ssgn Neumann, Thomas verfasserin aut Distributed top-k aggregation queries at large 2009 Text txt rdacontent ohne Hilfsmittel zu benutzen n rdamedia Band nc rdacarrier © The Author(s) 2009 Abstract Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network. Top- Distributed queries Query optimization Cost models Bender, Matthias aut Michel, Sebastian aut Schenkel, Ralf aut Triantafillou, Peter aut Weikum, Gerhard aut Enthalten in Distributed and parallel databases Springer US, 1993 26(2009), 1 vom: 18. Juni, Seite 3-27 (DE-627)165664401 (DE-600)913166-8 (DE-576)038480352 0926-8782 nnns volume:26 year:2009 number:1 day:18 month:06 pages:3-27 https://doi.org/10.1007/s10619-009-7041-z lizenzpflichtig Volltext GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OPC-BBI GBV_ILN_31 GBV_ILN_65 GBV_ILN_70 GBV_ILN_100 GBV_ILN_2005 GBV_ILN_2010 GBV_ILN_2021 GBV_ILN_2244 GBV_ILN_4012 GBV_ILN_4036 GBV_ILN_4126 GBV_ILN_4305 GBV_ILN_4318 AR 26 2009 1 18 06 3-27 |
language |
English |
source |
Enthalten in Distributed and parallel databases 26(2009), 1 vom: 18. Juni, Seite 3-27 volume:26 year:2009 number:1 day:18 month:06 pages:3-27 |
sourceStr |
Enthalten in Distributed and parallel databases 26(2009), 1 vom: 18. Juni, Seite 3-27 volume:26 year:2009 number:1 day:18 month:06 pages:3-27 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Top- Distributed queries Query optimization Cost models |
dewey-raw |
070 |
isfreeaccess_bool |
false |
container_title |
Distributed and parallel databases |
authorswithroles_txt_mv |
Neumann, Thomas @@aut@@ Bender, Matthias @@aut@@ Michel, Sebastian @@aut@@ Schenkel, Ralf @@aut@@ Triantafillou, Peter @@aut@@ Weikum, Gerhard @@aut@@ |
publishDateDaySort_date |
2009-06-18T00:00:00Z |
hierarchy_top_id |
165664401 |
dewey-sort |
270 |
id |
OLC2027066659 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2027066659</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230503034744.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2009 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10619-009-7041-z</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2027066659</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10619-009-7041-z-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">070</subfield><subfield code="a">020</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Neumann, Thomas</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Distributed top-k aggregation queries at large</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2009</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield 
code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s) 2009</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. 
The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Top-</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Distributed queries</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Query optimization</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Cost models</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Bender, Matthias</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Michel, Sebastian</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Schenkel, Ralf</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Triantafillou, Peter</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Weikum, Gerhard</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Distributed and parallel databases</subfield><subfield code="d">Springer US, 1993</subfield><subfield code="g">26(2009), 1 vom: 18. 
Juni, Seite 3-27</subfield><subfield code="w">(DE-627)165664401</subfield><subfield code="w">(DE-600)913166-8</subfield><subfield code="w">(DE-576)038480352</subfield><subfield code="x">0926-8782</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:26</subfield><subfield code="g">year:2009</subfield><subfield code="g">number:1</subfield><subfield code="g">day:18</subfield><subfield code="g">month:06</subfield><subfield code="g">pages:3-27</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10619-009-7041-z</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_31</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_65</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_100</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2005</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2010</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2021</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2244</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">GBV_ILN_4012</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4036</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4126</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4305</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4318</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">26</subfield><subfield code="j">2009</subfield><subfield code="e">1</subfield><subfield code="b">18</subfield><subfield code="c">06</subfield><subfield code="h">3-27</subfield></datafield></record></collection>
|
author |
Neumann, Thomas |
spellingShingle |
Neumann, Thomas ddc 070 ssgn 24,1 misc Top- misc Distributed queries misc Query optimization misc Cost models Distributed top-k aggregation queries at large |
authorStr |
Neumann, Thomas |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)165664401 |
format |
Article |
dewey-ones |
070 - News media, journalism & publishing 020 - Library & information sciences 004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut aut aut aut aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
0926-8782 |
topic_title |
070 020 004 VZ 24,1 ssgn Distributed top-k aggregation queries at large Top- Distributed queries Query optimization Cost models |
topic |
ddc 070 ssgn 24,1 misc Top- misc Distributed queries misc Query optimization misc Cost models |
topic_unstemmed |
ddc 070 ssgn 24,1 misc Top- misc Distributed queries misc Query optimization misc Cost models |
topic_browse |
ddc 070 ssgn 24,1 misc Top- misc Distributed queries misc Query optimization misc Cost models |
format_facet |
Aufsätze Gedruckte Aufsätze |
format_main_str_mv |
Text Zeitschrift/Artikel |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Distributed and parallel databases |
hierarchy_parent_id |
165664401 |
dewey-tens |
070 - News media, journalism & publishing 020 - Library & information sciences 000 - Computer science, knowledge & systems |
hierarchy_top_title |
Distributed and parallel databases |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)165664401 (DE-600)913166-8 (DE-576)038480352 |
title |
Distributed top-k aggregation queries at large |
ctrlnum |
(DE-627)OLC2027066659 (DE-He213)s10619-009-7041-z-p |
title_full |
Distributed top-k aggregation queries at large |
author_sort |
Neumann, Thomas |
journal |
Distributed and parallel databases |
journalStr |
Distributed and parallel databases |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2009 |
contenttype_str_mv |
txt |
container_start_page |
3 |
author_browse |
Neumann, Thomas Bender, Matthias Michel, Sebastian Schenkel, Ralf Triantafillou, Peter Weikum, Gerhard |
container_volume |
26 |
class |
070 020 004 VZ 24,1 ssgn |
format_se |
Aufsätze |
author-letter |
Neumann, Thomas |
doi_str_mv |
10.1007/s10619-009-7041-z |
dewey-full |
070 020 004 |
title_sort |
distributed top-k aggregation queries at large |
title_auth |
Distributed top-k aggregation queries at large |
abstract |
Abstract Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network. © The Author(s) 2009 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-MAT SSG-OPC-BBI GBV_ILN_31 GBV_ILN_65 GBV_ILN_70 GBV_ILN_100 GBV_ILN_2005 GBV_ILN_2010 GBV_ILN_2021 GBV_ILN_2244 GBV_ILN_4012 GBV_ILN_4036 GBV_ILN_4126 GBV_ILN_4305 GBV_ILN_4318 |
container_issue |
1 |
title_short |
Distributed top-k aggregation queries at large |
url |
https://doi.org/10.1007/s10619-009-7041-z |
remote_bool |
false |
author2 |
Bender, Matthias Michel, Sebastian Schenkel, Ralf Triantafillou, Peter Weikum, Gerhard |
author2Str |
Bender, Matthias Michel, Sebastian Schenkel, Ralf Triantafillou, Peter Weikum, Gerhard |
ppnlink |
165664401 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s10619-009-7041-z |
up_date |
2024-07-03T13:39:08.575Z |
_version_ |
1803565346139930624 |
fullrecord_marcxml |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2027066659</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230503034744.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2009 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10619-009-7041-z</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2027066659</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10619-009-7041-z-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">070</subfield><subfield code="a">020</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Neumann, Thomas</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">Distributed top-k aggregation queries at large</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2009</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield 
code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© The Author(s) 2009</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is spread across different nodes in a network. This paper introduces novel optimization methods for top-k aggregation queries in such distributed environments. The optimizations can be applied to all algorithms that fall into the frameworks of the prior TPUT and KLEE methods. The optimizations address three degrees of freedom: 1) hierarchically grouping input lists into top-k operator trees and optimizing the tree structure, 2) computing data-adaptive scan depths for different input sources, and 3) data-adaptive sampling of a small subset of input sources in scenarios with hundreds or thousands of query-relevant network nodes. All optimizations are based on a statistical cost model that utilizes local synopses, e.g., in the form of histograms, efficiently computed convolutions, and estimators based on order statistics. 
The paper presents comprehensive experiments, with three different real-life datasets and using the ns-2 network simulator for a packet-level simulation of a large Internet-style network.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Top-k</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Distributed queries</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Query optimization</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Cost models</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Bender, Matthias</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Michel, Sebastian</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Schenkel, Ralf</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Triantafillou, Peter</subfield><subfield code="4">aut</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Weikum, Gerhard</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Distributed and parallel databases</subfield><subfield code="d">Springer US, 1993</subfield><subfield code="g">26(2009), 1 vom: 18. 
Juni, Seite 3-27</subfield><subfield code="w">(DE-627)165664401</subfield><subfield code="w">(DE-600)913166-8</subfield><subfield code="w">(DE-576)038480352</subfield><subfield code="x">0926-8782</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:26</subfield><subfield code="g">year:2009</subfield><subfield code="g">number:1</subfield><subfield code="g">day:18</subfield><subfield code="g">month:06</subfield><subfield code="g">pages:3-27</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10619-009-7041-z</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OLC-MAT</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_31</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_65</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_70</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_100</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2005</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2010</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2021</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_2244</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">GBV_ILN_4012</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4036</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4126</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4305</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4318</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">26</subfield><subfield code="j">2009</subfield><subfield code="e">1</subfield><subfield code="b">18</subfield><subfield code="c">06</subfield><subfield code="h">3-27</subfield></datafield></record></collection>
|