A compressed self-index using a Ziv–Lempel dictionary
Abstract A compressed full-text self-index for a text T, of size u, is a data structure used to search for patterns P, of size m, in T, that requires reduced space, i.e. space that depends on the empirical entropy ($H_k$ or $H_0$) of T, and is, furthermore, able to reproduce any substring of T. In this pa...
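As background for the "Ziv–Lempel dictionary" named in the title, the following is a minimal sketch of standard LZ78 parsing, which splits a text into phrases, each formed by an earlier phrase plus one new character. It is not the index described in the article; the function names `lz78_parse` and `lz78_decode` and the tuple representation are assumptions chosen for illustration only.

```python
def lz78_parse(text):
    """Split text into LZ78 phrases.

    Each phrase is a pair (parent_id, char): the phrase equals the earlier
    phrase number parent_id (0 is the empty phrase) followed by char.
    """
    trie = {}      # (parent_id, char) -> phrase id; this is the LZ78 dictionary
    phrases = []
    node = 0       # current position in the dictionary trie (0 = root)
    for ch in text:
        child = trie.get((node, ch))
        if child is not None:
            node = child                      # keep extending a known phrase
        else:
            phrases.append((node, ch))        # new phrase = known phrase + ch
            trie[(node, ch)] = len(phrases)   # phrase ids start at 1
            node = 0
    if node != 0:
        # The text ended inside a known phrase; emit it with an empty
        # character. Real implementations usually append a unique
        # terminator to the text instead.
        phrases.append((node, ""))
    return phrases


def lz78_decode(phrases):
    """Rebuild the original text from the (parent_id, char) pairs."""
    expanded = [""]                           # phrase 0 is the empty phrase
    for parent, ch in phrases:
        expanded.append(expanded[parent] + ch)
    return "".join(expanded[1:])


if __name__ == "__main__":
    parsed = lz78_parse("abababa")
    print(parsed)
    print(lz78_decode(parsed))
```

The dictionary built here is a trie over the phrases; LZ78-based indexes generally build their search structures on top of a trie of this kind, though the specific structures used in the article are not shown in this record.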
Detailed description
Author: |
Russo, Luís M. S. [author] |
---|
Format: |
Article |
---|---|
Language: |
English |
Published: |
2008 |
---|
Keywords: |
---|
Note: |
© Springer Science+Business Media, LLC 2008 |
---|
Parent work: |
Contained in: Information retrieval journal - Springer Netherlands, 1999, 11(2008), 4, 1 May, pages 359-388 |
---|---|
Parent work: |
volume:11 ; year:2008 ; number:4 ; day:01 ; month:05 ; pages:359-388 |
Links: |
---|
DOI / URN: |
10.1007/s10791-008-9050-3 |
---|
Catalog ID: |
OLC2034065425 |
---|
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | OLC2034065425 | ||
003 | DE-627 | ||
005 | 20230503100535.0 | ||
007 | tu | ||
008 | 200819s2008 xx ||||| 00| ||eng c | ||
024 | 7 | |a 10.1007/s10791-008-9050-3 |2 doi | |
035 | |a (DE-627)OLC2034065425 | ||
035 | |a (DE-He213)s10791-008-9050-3-p | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
082 | 0 | 4 | |a 020 |a 070 |a 004 |q VZ |
084 | |a 24,1 |2 ssgn | ||
084 | |a 06.74$jInformationssysteme |2 bkl | ||
100 | 1 | |a Russo, Luís M. S. |e verfasserin |4 aut | |
245 | 1 | 0 | |a A compressed self-index using a Ziv–Lempel dictionary |
264 | 1 | |c 2008 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia | ||
338 | |a Band |b nc |2 rdacarrier | ||
500 | |a © Springer Science+Business Media, LLC 2008 | ||
520 | |a Abstract A compressed full-text self-index for a text T, of size u, is a data structure used to search for patterns P, of size m, in T, that requires reduced space, i.e. space that depends on the empirical entropy ($H_k$ or $H_0$) of T, and is, furthermore, able to reproduce any substring of T. In this paper we present a new compressed self-index able to locate the occurrences of P in $O((m + occ)\log u)$ time, where occ is the number of occurrences. The fundamental improvement over previous LZ78 based indexes is the reduction of the search time dependency on m from $O(m^2)$ to $O(m)$. To achieve this result we point out the main obstacle to linear time algorithms based on LZ78 data compression and expose and explore the nature of a recurrent structure in LZ-indexes, the $\mathcal{T}_{78}$ suffix tree. We show that our method is very competitive in practice by comparing it against other state of the art compressed indexes. | ||
650 | 4 | |a Pattern matching | |
650 | 4 | |a Text indexing | |
650 | 4 | |a Data compression | |
650 | 4 | |a Compressed index | |
700 | 1 | |a Oliveira, Arlindo L. |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Information retrieval journal |d Springer Netherlands, 1999 |g 11(2008), 4 vom: 01. Mai, Seite 359-388 |w (DE-627)245716939 |w (DE-600)1432556-1 |w (DE-576)066689066 |x 1386-4564 |7 nnns |
773 | 1 | 8 | |g volume:11 |g year:2008 |g number:4 |g day:01 |g month:05 |g pages:359-388 |
856 | 4 | 1 | |u https://doi.org/10.1007/s10791-008-9050-3 |z lizenzpflichtig |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a SYSFLAG_A | ||
912 | |a GBV_OLC | ||
912 | |a SSG-OLC-BUB | ||
912 | |a SSG-OPC-BBI | ||
912 | |a GBV_ILN_40 | ||
912 | |a GBV_ILN_90 | ||
912 | |a GBV_ILN_100 | ||
912 | |a GBV_ILN_4012 | ||
912 | |a GBV_ILN_4334 | ||
936 | b | k | |a 06.74$jInformationssysteme |q VZ |0 106415212 |0 (DE-625)106415212 |
951 | |a AR | ||
952 | |d 11 |j 2008 |e 4 |b 01 |c 05 |h 359-388 |
author_variant |
l m s r lms lmsr a l o al alo |
---|---|
matchkey_str |
article:13864564:2008----::cmrseslidxsnailm |
hierarchy_sort_str |
2008 |
bklnumber |
06.74$jInformationssysteme |
publishDate |
2008 |
language |
English |
source |
Enthalten in Information retrieval journal 11(2008), 4 vom: 01. Mai, Seite 359-388 volume:11 year:2008 number:4 day:01 month:05 pages:359-388 |
format_phy_str_mv |
Article |
institution |
findex.gbv.de |
topic_facet |
Pattern matching Text indexing Data compression Compressed index |
dewey-raw |
020 |
isfreeaccess_bool |
false |
container_title |
Information retrieval journal |
authorswithroles_txt_mv |
Russo, Luís M. S. @@aut@@ Oliveira, Arlindo L. @@aut@@ |
publishDateDaySort_date |
2008-05-01T00:00:00Z |
hierarchy_top_id |
245716939 |
dewey-sort |
220 |
id |
OLC2034065425 |
language_de |
englisch |
fullrecord |
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim"><record><leader>01000caa a22002652 4500</leader><controlfield tag="001">OLC2034065425</controlfield><controlfield tag="003">DE-627</controlfield><controlfield tag="005">20230503100535.0</controlfield><controlfield tag="007">tu</controlfield><controlfield tag="008">200819s2008 xx ||||| 00| ||eng c</controlfield><datafield tag="024" ind1="7" ind2=" "><subfield code="a">10.1007/s10791-008-9050-3</subfield><subfield code="2">doi</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-627)OLC2034065425</subfield></datafield><datafield tag="035" ind1=" " ind2=" "><subfield code="a">(DE-He213)s10791-008-9050-3-p</subfield></datafield><datafield tag="040" ind1=" " ind2=" "><subfield code="a">DE-627</subfield><subfield code="b">ger</subfield><subfield code="c">DE-627</subfield><subfield code="e">rakwb</subfield></datafield><datafield tag="041" ind1=" " ind2=" "><subfield code="a">eng</subfield></datafield><datafield tag="082" ind1="0" ind2="4"><subfield code="a">020</subfield><subfield code="a">070</subfield><subfield code="a">004</subfield><subfield code="q">VZ</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">24,1</subfield><subfield code="2">ssgn</subfield></datafield><datafield tag="084" ind1=" " ind2=" "><subfield code="a">06.74$jInformationssysteme</subfield><subfield code="2">bkl</subfield></datafield><datafield tag="100" ind1="1" ind2=" "><subfield code="a">Russo, Luís M. 
S.</subfield><subfield code="e">verfasserin</subfield><subfield code="4">aut</subfield></datafield><datafield tag="245" ind1="1" ind2="0"><subfield code="a">A compressed self-index using a Ziv–Lempel dictionary</subfield></datafield><datafield tag="264" ind1=" " ind2="1"><subfield code="c">2008</subfield></datafield><datafield tag="336" ind1=" " ind2=" "><subfield code="a">Text</subfield><subfield code="b">txt</subfield><subfield code="2">rdacontent</subfield></datafield><datafield tag="337" ind1=" " ind2=" "><subfield code="a">ohne Hilfsmittel zu benutzen</subfield><subfield code="b">n</subfield><subfield code="2">rdamedia</subfield></datafield><datafield tag="338" ind1=" " ind2=" "><subfield code="a">Band</subfield><subfield code="b">nc</subfield><subfield code="2">rdacarrier</subfield></datafield><datafield tag="500" ind1=" " ind2=" "><subfield code="a">© Springer Science+Business Media, LLC 2008</subfield></datafield><datafield tag="520" ind1=" " ind2=" "><subfield code="a">Abstract A compressed full-text self-index for a text T, of size u, is a data structure used to search for patterns P, of size m, in T, that requires reduced space, i.e. space that depends on the empirical entropy (Hk or H0) of T, and is, furthermore, able to reproduce any substring of T. In this paper we present a new compressed self-index able to locate the occurrences of P in O((m + occ)log u) time, where occ is the number of occurrences. The fundamental improvement over previous LZ78 based indexes is the reduction of the search time dependency on m from O(m2) to O(m). To achieve this result we point out the main obstacle to linear time algorithms based on LZ78 data compression and expose and explore the nature of a recurrent structure in LZ-indexes, the $${\mathcal{T}}_{78}$$ suffix tree. 
We show that our method is very competitive in practice by comparing it against other state of the art compressed indexes.</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Pattern matching</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Text indexing</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Data compression</subfield></datafield><datafield tag="650" ind1=" " ind2="4"><subfield code="a">Compressed index</subfield></datafield><datafield tag="700" ind1="1" ind2=" "><subfield code="a">Oliveira, Arlindo L.</subfield><subfield code="4">aut</subfield></datafield><datafield tag="773" ind1="0" ind2="8"><subfield code="i">Enthalten in</subfield><subfield code="t">Information retrieval journal</subfield><subfield code="d">Springer Netherlands, 1999</subfield><subfield code="g">11(2008), 4 vom: 01. Mai, Seite 359-388</subfield><subfield code="w">(DE-627)245716939</subfield><subfield code="w">(DE-600)1432556-1</subfield><subfield code="w">(DE-576)066689066</subfield><subfield code="x">1386-4564</subfield><subfield code="7">nnns</subfield></datafield><datafield tag="773" ind1="1" ind2="8"><subfield code="g">volume:11</subfield><subfield code="g">year:2008</subfield><subfield code="g">number:4</subfield><subfield code="g">day:01</subfield><subfield code="g">month:05</subfield><subfield code="g">pages:359-388</subfield></datafield><datafield tag="856" ind1="4" ind2="1"><subfield code="u">https://doi.org/10.1007/s10791-008-9050-3</subfield><subfield code="z">lizenzpflichtig</subfield><subfield code="3">Volltext</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_USEFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SYSFLAG_A</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_OLC</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield 
code="a">SSG-OLC-BUB</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">SSG-OPC-BBI</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_40</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_90</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_100</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4012</subfield></datafield><datafield tag="912" ind1=" " ind2=" "><subfield code="a">GBV_ILN_4334</subfield></datafield><datafield tag="936" ind1="b" ind2="k"><subfield code="a">06.74$jInformationssysteme</subfield><subfield code="q">VZ</subfield><subfield code="0">106415212</subfield><subfield code="0">(DE-625)106415212</subfield></datafield><datafield tag="951" ind1=" " ind2=" "><subfield code="a">AR</subfield></datafield><datafield tag="952" ind1=" " ind2=" "><subfield code="d">11</subfield><subfield code="j">2008</subfield><subfield code="e">4</subfield><subfield code="b">01</subfield><subfield code="c">05</subfield><subfield code="h">359-388</subfield></datafield></record></collection>
|
author |
Russo, Luís M. S. |
spellingShingle |
Russo, Luís M. S. ddc 020 ssgn 24,1 bkl 06.74$jInformationssysteme misc Pattern matching misc Text indexing misc Data compression misc Compressed index A compressed self-index using a Ziv–Lempel dictionary |
authorStr |
Russo, Luís M. S. |
ppnlink_with_tag_str_mv |
@@773@@(DE-627)245716939 |
format |
Article |
dewey-ones |
020 - Library & information sciences 070 - News media, journalism & publishing 004 - Data processing & computer science |
delete_txt_mv |
keep |
author_role |
aut aut |
collection |
OLC |
remote_str |
false |
illustrated |
Not Illustrated |
issn |
1386-4564 |
topic_title |
020 070 004 VZ 24,1 ssgn 06.74$jInformationssysteme bkl A compressed self-index using a Ziv–Lempel dictionary Pattern matching Text indexing Data compression Compressed index |
topic |
ddc 020 ssgn 24,1 bkl 06.74$jInformationssysteme misc Pattern matching misc Text indexing misc Data compression misc Compressed index |
format_facet |
Articles Printed articles |
format_main_str_mv |
Text Journal/Article |
carriertype_str_mv |
nc |
hierarchy_parent_title |
Information retrieval journal |
hierarchy_parent_id |
245716939 |
dewey-tens |
020 - Library & information sciences 070 - News media, journalism & publishing 000 - Computer science, knowledge & systems |
hierarchy_top_title |
Information retrieval journal |
isfreeaccess_txt |
false |
familylinks_str_mv |
(DE-627)245716939 (DE-600)1432556-1 (DE-576)066689066 |
title |
A compressed self-index using a Ziv–Lempel dictionary |
ctrlnum |
(DE-627)OLC2034065425 (DE-He213)s10791-008-9050-3-p |
title_full |
A compressed self-index using a Ziv–Lempel dictionary |
author_sort |
Russo, Luís M. S. |
journal |
Information retrieval journal |
journalStr |
Information retrieval journal |
lang_code |
eng |
isOA_bool |
false |
dewey-hundreds |
000 - Computer science, information & general works |
recordtype |
marc |
publishDateSort |
2008 |
contenttype_str_mv |
txt |
container_start_page |
359 |
author_browse |
Russo, Luís M. S. Oliveira, Arlindo L. |
container_volume |
11 |
class |
020 070 004 VZ 24,1 ssgn 06.74$jInformationssysteme bkl |
format_se |
Articles |
author-letter |
Russo, Luís M. S. |
doi_str_mv |
10.1007/s10791-008-9050-3 |
normlink |
106415212 |
normlink_prefix_str_mv |
106415212 (DE-625)106415212 |
dewey-full |
020 070 004 |
title_sort |
a compressed self-index using a ziv–lempel dictionary |
title_auth |
A compressed self-index using a Ziv–Lempel dictionary |
abstract |
Abstract A compressed full-text self-index for a text T, of size u, is a data structure used to search for patterns P, of size m, in T, that requires reduced space, i.e. space that depends on the empirical entropy ($H_k$ or $H_0$) of T, and is, furthermore, able to reproduce any substring of T. In this paper we present a new compressed self-index able to locate the occurrences of P in $O((m + occ)\log u)$ time, where occ is the number of occurrences. The fundamental improvement over previous LZ78 based indexes is the reduction of the search time dependency on m from $O(m^2)$ to $O(m)$. To achieve this result we point out the main obstacle to linear time algorithms based on LZ78 data compression and expose and explore the nature of a recurrent structure in LZ-indexes, the $\mathcal{T}_{78}$ suffix tree. We show that our method is very competitive in practice by comparing it against other state of the art compressed indexes. © Springer Science+Business Media, LLC 2008 |
collection_details |
GBV_USEFLAG_A SYSFLAG_A GBV_OLC SSG-OLC-BUB SSG-OPC-BBI GBV_ILN_40 GBV_ILN_90 GBV_ILN_100 GBV_ILN_4012 GBV_ILN_4334 |
container_issue |
4 |
title_short |
A compressed self-index using a Ziv–Lempel dictionary |
url |
https://doi.org/10.1007/s10791-008-9050-3 |
remote_bool |
false |
author2 |
Oliveira, Arlindo L. |
author2Str |
Oliveira, Arlindo L. |
ppnlink |
245716939 |
mediatype_str_mv |
n |
isOA_txt |
false |
hochschulschrift_bool |
false |
doi_str |
10.1007/s10791-008-9050-3 |
up_date |
2024-07-03T19:26:08.760Z |
_version_ |
1803587177694625794 |
score |
7.39787 |