Transposition Invariant String Matching. Mäkinen, V., Navarro, G., & Ukkonen, E. 56(2):124–153.
Transposition Invariant String Matching [link]Paper  doi  abstract   bibtex   
Given strings A=a1a2...am and B=b1b2...bn over an alphabet Σ⊆U, where U is some numerical universe closed under addition and subtraction, and a distance function d(A,B) that gives the score of the best (partial) matching of A and B, the transposition invariant distance is mint∈Ud(A+t,B), where A+t=(a1+t)(a2+t)...(am+t). We study the problem of computing the transposition invariant distance for various distance (and similarity) functions d, including Hamming distance, longest common subsequence (LCS), Levenshtein distance, and their versions where the exact matching condition is replaced by an approximate one. For all these problems we give algorithms whose time complexities are close to the known upper bounds without transposition invariance, and for some we achieve these upper bounds. In particular, we show how sparse dynamic programming can be used to solve transposition invariant problems, and its connection with multidimensional range-minimum search. As a byproduct, we give improved sparse dynamic programming algorithms to compute LCS and Levenshtein distance.
@article{makinenTranspositionInvariantString2005,
  title = {Transposition Invariant String Matching},
  author = {Mäkinen, Veli and Navarro, Gonzalo and Ukkonen, Esko},
  date = {2005-08},
  journaltitle = {Journal of Algorithms},
  volume = {56},
  pages = {124--153},
  issn = {0196-6774},
  doi = {10.1016/j.jalgor.2004.07.008},
  url = {https://doi.org/10.1016/j.jalgor.2004.07.008},
  abstract = {Given strings A=a1a2...am and B=b1b2...bn over an alphabet Σ⊆U, where U is some numerical universe closed under addition and subtraction, and a distance function d(A,B) that gives the score of the best (partial) matching of A and B, the transposition invariant distance is mint∈Ud(A+t,B), where A+t=(a1+t)(a2+t)...(am+t). We study the problem of computing the transposition invariant distance for various distance (and similarity) functions d, including Hamming distance, longest common subsequence (LCS), Levenshtein distance, and their versions where the exact matching condition is replaced by an approximate one. For all these problems we give algorithms whose time complexities are close to the known upper bounds without transposition invariance, and for some we achieve these upper bounds. In particular, we show how sparse dynamic programming can be used to solve transposition invariant problems, and its connection with multidimensional range-minimum search. As a byproduct, we give improved sparse dynamic programming algorithms to compute LCS and Levenshtein distance.},
  keywords = {*imported-from-citeulike-INRMM,~INRMM-MiD:c-13073877,categorical-variables,dynamic-programming,mathematics,similarity},
  number = {2}
}

Downloads: 0