Misplaced Pages

Maximal pair

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

In computer science, a maximal pair within a string is a pair of matching substrings that are maximal, where "maximal" means that it is not possible to make a longer matching pair by extending the range of both substrings to the left or right.

Example

Index 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Character x a b c y a b c w a b c y z

For example, in this table, the substrings at indices 2 to 4 (in red) and indices 6 to 8 (in blue) are a maximal pair, because they contain identical characters (abc), and they have different characters to the left (x at index 1 and y at index 5) and different characters to the right (y at index 5 and w at index 9). Similarly, the substrings at indices 6 to 8 (in blue) and indices 10 to 12 (in green) are a maximal pair.

However, the substrings at indices 2 to 4 (in red) and indices 10 to 12 (in green) are not a maximal pair, as the character y follows both substrings, and so they can be extended to the right to make a longer pair.

Formal definition

Formally, a maximal pair of substrings with starting positions p 1 {\displaystyle p_{1}} and p 2 {\displaystyle p_{2}} respectively, and both of length l {\displaystyle l} , is specified by a triple ( p 1 , p 2 , l ) {\displaystyle (p_{1},p_{2},l)} , such that, given a string S {\displaystyle S} of length n {\displaystyle n} , S [ p 1 . . p 1 + l 1 ] = S [ p 2 . . p 2 + l 1 ] {\displaystyle S=S} (meaning that the substrings have identical contents), but S [ p 1 1 ] S [ p 2 1 ] {\displaystyle S\neq S} (they have different characters to their left) and S [ p 1 + l ] S [ p 2 + l ] {\displaystyle S\neq S} (they also have different characters to their right; together, these two inequalities are the condition for being maximal). Thus, in the example above, the maximal pairs are ( 2 , 6 , 3 ) {\displaystyle (2,6,3)} (the red and blue substrings) and ( 6 , 10 , 3 ) {\displaystyle (6,10,3)} (the green and blue substrings), and ( 2 , 10 , 3 ) {\displaystyle (2,10,3)} is not a maximal pair.

Related concepts and time complexity

A maximal repeat is the string represented by a maximal pair. A supermaximal repeat is a maximal repeat never occurring as a proper substring of another maximal repeat. In the above example, abc and abcy are both maximal repeats, but only abcy is a supermaximal repeat.

Maximal pairs, maximal repeats and supermaximal repeats can each be found in Θ ( n + z ) {\displaystyle \Theta (n+z)} time using a suffix tree, if there are z {\displaystyle z} such structures.

References

  1. Gusfield, Dan (1999) . Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. USA: Cambridge University Press. p. 143. ISBN 0-521-58519-8.

External links


Stub icon

This computer science article is a stub. You can help Misplaced Pages by expanding it.

Categories: