Reducing the space requirement of LZ-index. Sadakane, K., Arroyuelo, D., & Navarro, G. In *Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)*, volume 4009 LNCS, pages 318-329, 2006.

doi abstract bibtex

doi abstract bibtex

The LZ-index is a compressed full-text self-index able to represent a text T1...u , over an alphabet of size σ = O(polylog(u)) and with k-th order empirical entropy Hk (T), using 4uHk (T) + o(u log σ) bits for any k = o(logσ u). It can report all the occ occurrences of a pattern P1...u in T in O(m^{3} log σ + (m + occ) log u) worst case time. Its main drawback is the factor 4 in its space complexity, which makes it larger than other state-of-the-art alternatives. In this paper we present two different approaches to reduce the space requirement of LZ-index. In both cases we achieve (2 + ε)uH k (T) + o(u log σ) bits of space, for any constant ε > 0, and we simultaneously improve the search time to O(m^{2} log m + (m + occ) log u). Both indexes support displaying any subtext of length ℓ in optimal O(ℓ/ logσ u) time. In addition, we show how the space can be squeezed to (1 + ε)uHk (T) + o(u log σ) to obtain a structure with O(m^{2}) average search time for m ≥ 2 logσ u. © Springer-Verlag Berlin Heidelberg 2006.

@inproceedings{10.1007/11780441_29, abstract = "The LZ-index is a compressed full-text self-index able to represent a text T<inf>1...u</inf>, over an alphabet of size σ = O(polylog(u)) and with k-th order empirical entropy H<inf>k</inf>(T), using 4uH<inf>k</inf>(T) + o(u log σ) bits for any k = o(log<inf>σ</inf> u). It can report all the occ occurrences of a pattern P<inf>1...u</inf> in T in O(m<sup>3</sup> log σ + (m + occ) log u) worst case time. Its main drawback is the factor 4 in its space complexity, which makes it larger than other state-of-the-art alternatives. In this paper we present two different approaches to reduce the space requirement of LZ-index. In both cases we achieve (2 + ε)uH <inf>k</inf>(T) + o(u log σ) bits of space, for any constant ε > 0, and we simultaneously improve the search time to O(m<sup>2</sup> log m + (m + occ) log u). Both indexes support displaying any subtext of length ℓ in optimal O(ℓ/ log<inf>σ</inf> u) time. In addition, we show how the space can be squeezed to (1 + ε)uH<inf>k</inf>(T) + o(u log σ) to obtain a structure with O(m<sup>2</sup>) average search time for m ≥ 2 log<inf>σ</inf> u. © Springer-Verlag Berlin Heidelberg 2006.", year = "2006", title = "Reducing the space requirement of LZ-index", volume = "4009 LNCS", pages = "318-329", doi = "10.1007/11780441\_29", booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)", author = "Sadakane, Kunihiko and Arroyuelo, Diego and Navarro, Gonzalo" }

Downloads: 0