Misplaced Pages

Unavoidable pattern

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

In mathematics and theoretical computer science, a pattern is an unavoidable pattern if it is unavoidable on any finite alphabet.

Definitions

Pattern

Like a word, a pattern (also called term) is a sequence of symbols over some alphabet.

The minimum multiplicity of the pattern p {\displaystyle p} is m ( p ) = min ( c o u n t p ( x ) : x p ) {\displaystyle m(p)=\min(\mathrm {count_{p}} (x):x\in p)} where c o u n t p ( x ) {\displaystyle \mathrm {count_{p}} (x)} is the number of occurrence of symbol x {\displaystyle x} in pattern p {\displaystyle p} . In other words, it is the number of occurrences in p {\displaystyle p} of the least frequently occurring symbol in p {\displaystyle p} .

Instance

Given finite alphabets Σ {\displaystyle \Sigma } and Δ {\displaystyle \Delta } , a word x Σ {\displaystyle x\in \Sigma ^{*}} is an instance of the pattern p Δ {\displaystyle p\in \Delta ^{*}} if there exists a non-erasing semigroup morphism f : Δ Σ {\displaystyle f:\Delta ^{*}\rightarrow \Sigma ^{*}} such that f ( p ) = x {\displaystyle f(p)=x} , where Σ {\displaystyle \Sigma ^{*}} denotes the Kleene star of Σ {\displaystyle \Sigma } . Non-erasing means that f ( a ) ε {\displaystyle f(a)\neq \varepsilon } for all a Δ {\displaystyle a\in \Delta } , where ε {\displaystyle \varepsilon } denotes the empty string.

Avoidance / Matching

A word w {\displaystyle w} is said to match, or encounter, a pattern p {\displaystyle p} if a factor (also called subword or substring) of w {\displaystyle w} is an instance of p {\displaystyle p} . Otherwise, w {\displaystyle w} is said to avoid p {\displaystyle p} , or to be p {\displaystyle p} -free. This definition can be generalized to the case of an infinite w {\displaystyle w} , based on a generalized definition of "substring".

Avoidability / Unavoidability on a specific alphabet

A pattern p {\displaystyle p} is unavoidable on a finite alphabet Σ {\displaystyle \Sigma } if each sufficiently long word x Σ {\displaystyle x\in \Sigma ^{*}} must match p {\displaystyle p} ; formally: if n N .   x Σ .   ( | x | n x  matches  p ) {\displaystyle \exists n\in \mathrm {N} .\ \forall x\in \Sigma ^{*}.\ (|x|\geq n\implies x{\text{ matches }}p)} . Otherwise, p {\displaystyle p} is avoidable on Σ {\displaystyle \Sigma } , which implies there exist infinitely many words over the alphabet Σ {\displaystyle \Sigma } that avoid p {\displaystyle p} .

By Kőnig's lemma, pattern p {\displaystyle p} is avoidable on Σ {\displaystyle \Sigma } if and only if there exists an infinite word w Σ ω {\displaystyle w\in \Sigma ^{\omega }} that avoids p {\displaystyle p} .

Maximal p-free word

Given a pattern p {\displaystyle p} and an alphabet Σ {\displaystyle \Sigma } . A p {\displaystyle p} -free word w Σ {\displaystyle w\in \Sigma ^{*}} is a maximal p {\displaystyle p} -free word over Σ {\displaystyle \Sigma } if a w {\displaystyle aw} and w a {\displaystyle wa} match p {\displaystyle p} a Σ {\displaystyle \forall a\in \Sigma } .

Avoidable / Unavoidable pattern

A pattern p {\displaystyle p} is an unavoidable pattern (also called blocking term) if p {\displaystyle p} is unavoidable on any finite alphabet.

If a pattern is unavoidable and not limited to a specific alphabet, then it is unavoidable for any finite alphabet by default. Conversely, if a pattern is said to be avoidable and not limited to a specific alphabet, then it is avoidable on some finite alphabet by default.

k-avoidable / k-unavoidable

A pattern p {\displaystyle p} is k {\displaystyle k} -avoidable if p {\displaystyle p} is avoidable on an alphabet Σ {\displaystyle \Sigma } of size k {\displaystyle k} . Otherwise, p {\displaystyle p} is k {\displaystyle k} -unavoidable, which means p {\displaystyle p} is unavoidable on every alphabet of size k {\displaystyle k} .

If pattern p {\displaystyle p} is k {\displaystyle k} -avoidable, then p {\displaystyle p} is g {\displaystyle g} -avoidable for all g k {\displaystyle g\geq k} .

Given a finite set of avoidable patterns S = { p 1 , p 2 , . . . , p i } {\displaystyle S=\{p_{1},p_{2},...,p_{i}\}} , there exists an infinite word w Σ ω {\displaystyle w\in \Sigma ^{\omega }} such that w {\displaystyle w} avoids all patterns of S {\displaystyle S} . Let μ ( S ) {\displaystyle \mu (S)} denote the size of the minimal alphabet Σ {\displaystyle \Sigma '} such that w Σ ω {\displaystyle \exists w'\in {\Sigma '}^{\omega }} avoiding all patterns of S {\displaystyle S} .

Avoidability index

The avoidability index of a pattern p {\displaystyle p} is the smallest k {\displaystyle k} such that p {\displaystyle p} is k {\displaystyle k} -avoidable, and {\displaystyle \infty } if p {\displaystyle p} is unavoidable.

Properties

  • A pattern q {\displaystyle q} is avoidable if q {\displaystyle q} is an instance of an avoidable pattern p {\displaystyle p} .
  • Let avoidable pattern p {\displaystyle p} be a factor of pattern q {\displaystyle q} , then q {\displaystyle q} is also avoidable.
  • A pattern q {\displaystyle q} is unavoidable if and only if q {\displaystyle q} is a factor of some unavoidable pattern p {\displaystyle p} .
  • Given an unavoidable pattern p {\displaystyle p} and a symbol a {\displaystyle a} not in p {\displaystyle p} , then p a p {\displaystyle pap} is unavoidable.
  • Given an unavoidable pattern p {\displaystyle p} , then the reversal p R {\displaystyle p^{R}} is unavoidable.
  • Given an unavoidable pattern p {\displaystyle p} , there exists a symbol a {\displaystyle a} such that a {\displaystyle a} occurs exactly once in p {\displaystyle p} .
  • Let n N {\displaystyle n\in \mathrm {N} } represent the number of distinct symbols of pattern p {\displaystyle p} . If | p | 2 n {\displaystyle |p|\geq 2^{n}} , then p {\displaystyle p} is avoidable.

Zimin words

Main article: Sesquipower

Given alphabet Δ = { x 1 , x 2 , . . . } {\displaystyle \Delta =\{x_{1},x_{2},...\}} , Zimin words (patterns) are defined recursively Z n + 1 = Z n x n + 1 Z n {\displaystyle Z_{n+1}=Z_{n}x_{n+1}Z_{n}} for n Z + {\displaystyle n\in \mathrm {Z} ^{+}} and Z 1 = x 1 {\displaystyle Z_{1}=x_{1}} .

Unavoidability

All Zimin words are unavoidable.

A word w {\displaystyle w} is unavoidable if and only if it is a factor of a Zimin word.

Given a finite alphabet Σ {\displaystyle \Sigma } , let f ( n , | Σ | ) {\displaystyle f(n,|\Sigma |)} represent the smallest m Z + {\displaystyle m\in \mathrm {Z} ^{+}} such that w {\displaystyle w} matches Z n {\displaystyle Z_{n}} for all w Σ m {\displaystyle w\in \Sigma ^{m}} . We have following properties:

  • f ( 1 , q ) = 1 {\displaystyle f(1,q)=1}
  • f ( 2 , q ) = 2 q + 1 {\displaystyle f(2,q)=2q+1}
  • f ( 3 , 2 ) = 29 {\displaystyle f(3,2)=29}
  • f ( n , q ) n 1 q = q q q n 1 {\displaystyle f(n,q)\leq {^{n-1}q}=\underbrace {q^{q^{\cdot ^{\cdot ^{q}}}}} _{n-1}}

Z n {\displaystyle Z_{n}} is the longest unavoidable pattern constructed by alphabet Δ = { x 1 , x 2 , . . . , x n } {\displaystyle \Delta =\{x_{1},x_{2},...,x_{n}\}} since | Z n | = 2 n 1 {\displaystyle |Z_{n}|=2^{n}-1} .

Pattern reduction

Free letter

Given a pattern p {\displaystyle p} over some alphabet Δ {\displaystyle \Delta } , we say x Δ {\displaystyle x\in \Delta } is free for p {\displaystyle p} if there exist subsets A , B {\displaystyle A,B} of Δ {\displaystyle \Delta } such that the following hold:

  1. u v {\displaystyle uv} is a factor of p {\displaystyle p} and u A {\displaystyle u\in A} u v {\displaystyle uv} is a factor of p {\displaystyle p} and v B {\displaystyle v\in B}
  2. x A B B A {\displaystyle x\in A\backslash B\cup B\backslash A}

For example, let p = a b c b a b {\displaystyle p=abcbab} , then b {\displaystyle b} is free for p {\displaystyle p} since there exist A = a c , B = b {\displaystyle A=ac,B=b} satisfying the conditions above.

Reduce

A pattern p Δ {\displaystyle p\in \Delta ^{*}} reduces to pattern q {\displaystyle q} if there exists a symbol x Δ {\displaystyle x\in \Delta } such that x {\displaystyle x} is free for p {\displaystyle p} , and q {\displaystyle q} can be obtained by removing all occurrence of x {\displaystyle x} from p {\displaystyle p} . Denote this relation by p x q {\displaystyle p{\stackrel {x}{\rightarrow }}q} .

For example, let p = a b c b a b {\displaystyle p=abcbab} , then p {\displaystyle p} can reduce to q = a c a {\displaystyle q=aca} since b {\displaystyle b} is free for p {\displaystyle p} .

Locked

A word w {\displaystyle w} is said to be locked if w {\displaystyle w} has no free letter; hence w {\displaystyle w} can not be reduced.

Transitivity

Given patterns p , q , r {\displaystyle p,q,r} , if p {\displaystyle p} reduces to q {\displaystyle q} and q {\displaystyle q} reduces to r {\displaystyle r} , then p {\displaystyle p} reduces to r {\displaystyle r} . Denote this relation by p r {\displaystyle p{\stackrel {*}{\rightarrow }}r} .

Unavoidability

A pattern p {\displaystyle p} is unavoidable if and only if p {\displaystyle p} reduces to a word of length one; hence w {\displaystyle \exists w} such that | w | = 1 {\displaystyle |w|=1} and p w {\displaystyle p{\stackrel {*}{\rightarrow }}w} .

Graph pattern avoidance

Avoidance / Matching on a specific graph

Given a simple graph G = ( V , E ) {\displaystyle G=(V,E)} , a edge coloring c : E Δ {\displaystyle c:E\rightarrow \Delta } matches pattern p {\displaystyle p} if there exists a simple path P = [ e 1 , e 2 , . . . , e r ] {\displaystyle P=} in G {\displaystyle G} such that the sequence c ( P ) = [ c ( e 1 ) , c ( e 2 ) , . . . , c ( e r ) ] {\displaystyle c(P)=} matches p {\displaystyle p} . Otherwise, c {\displaystyle c} is said to avoid p {\displaystyle p} or be p {\displaystyle p} -free.

Similarly, a vertex coloring c : V Δ {\displaystyle c:V\rightarrow \Delta } matches pattern p {\displaystyle p} if there exists a simple path P = [ c 1 , c 2 , . . . , c r ] {\displaystyle P=} in G {\displaystyle G} such that the sequence c ( P ) {\displaystyle c(P)} matches p {\displaystyle p} .

Pattern chromatic number

The pattern chromatic number π p ( G ) {\displaystyle \pi _{p}(G)} is the minimal number of distinct colors needed for a p {\displaystyle p} -free vertex coloring c {\displaystyle c} over the graph G {\displaystyle G} .

Let π p ( n ) = max { π p ( G ) : G G n } {\displaystyle \pi _{p}(n)=\max\{\pi _{p}(G):G\in G_{n}\}} where G n {\displaystyle G_{n}} is the set of all simple graphs with a maximum degree no more than n {\displaystyle n} .

Similarly, π p ( G ) {\displaystyle \pi _{p}'(G)} and π p ( n ) {\displaystyle \pi _{p}'(n)} are defined for edge colorings.

Avoidability / Unavoidability on graphs

A pattern p {\displaystyle p} is avoidable on graphs if π p ( n ) {\displaystyle \pi _{p}(n)} is bounded by c p {\displaystyle c_{p}} , where c p {\displaystyle c_{p}} only depends on p {\displaystyle p} .

  • Avoidance on words can be expressed as a specific case of avoidance on graphs; hence a pattern p {\displaystyle p} is avoidable on any finite alphabet if and only if π p ( P n ) c p {\displaystyle \pi _{p}(P_{n})\leq c_{p}} for all n Z + {\displaystyle n\in \mathrm {Z} ^{+}} , where P n {\displaystyle P_{n}} is a graph of n {\displaystyle n} vertices concatenated.

Probabilistic bound on πp(n)

There exists an absolute constant c {\displaystyle c} , such that π p ( n ) c n m ( p ) m ( p ) 1 c n 2 {\displaystyle \pi _{p}(n)\leq cn^{\frac {m(p)}{m(p)-1}}\leq cn^{2}} for all patterns p {\displaystyle p} with m ( p ) 2 {\displaystyle m(p)\geq 2} .

Given a pattern p {\displaystyle p} , let n {\displaystyle n} represent the number of distinct symbols of p {\displaystyle p} . If | p | 2 n {\displaystyle |p|\geq 2^{n}} , then p {\displaystyle p} is avoidable on graphs.

Explicit colorings

Given a pattern p {\displaystyle p} such that c o u n t p ( x ) {\displaystyle count_{p}(x)} is even for all x p {\displaystyle x\in p} , then π p ( K 2 k ) 2 k 1 {\displaystyle \pi _{p}'(K_{2}^{k})\leq 2^{k}-1} for all k 1 {\displaystyle k\geq 1} , where K n {\displaystyle K_{n}} is the complete graph of n {\displaystyle n} vertices.

Given a pattern p {\displaystyle p} such that m ( p ) 2 {\displaystyle m(p)\geq 2} , and an arbitrary tree T {\displaystyle T} , let S {\displaystyle S} be the set of all avoidable subpatterns and their reflections of p {\displaystyle p} . Then π p ( T ) 3 μ ( S ) {\displaystyle \pi _{p}(T)\leq 3\mu (S)} .

Given a pattern p {\displaystyle p} such that m ( p ) 2 {\displaystyle m(p)\geq 2} , and a tree T {\displaystyle T} with degree n 2 {\displaystyle n\geq 2} . Let S {\displaystyle S} be the set of all avoidable subpatterns and their reflections of p {\displaystyle p} , then π p ( T ) 2 ( n 1 ) μ ( S ) {\displaystyle \pi _{p}'(T)\leq 2(n-1)\mu (S)} .

Examples

  • The Thue–Morse sequence is cube-free and overlap-free; hence it avoids the patterns x x x {\displaystyle xxx} and x y x y x {\displaystyle xyxyx} .
  • A square-free word is one avoiding the pattern x x {\displaystyle xx} . The word over the alphabet { 0 , ± 1 } {\displaystyle \{0,\pm 1\}} obtained by taking the first difference of the Thue–Morse sequence is an example of an infinite square-free word.
  • The patterns x {\displaystyle x} and x y x {\displaystyle xyx} are unavoidable on any alphabet, since they are factors of the Zimin words.
  • The power patterns x n {\displaystyle x^{n}} for n 3 {\displaystyle n\geq 3} are 2-avoidable.
  • All binary patterns can be divided into three categories:
    • ε , x , x y x {\displaystyle \varepsilon ,x,xyx} are unavoidable.
    • x x , x x y , x y y , x x y x , x x y y , x y x x , x y x y , x y y x , x x y x x , x x y x y , x y x y y {\displaystyle xx,xxy,xyy,xxyx,xxyy,xyxx,xyxy,xyyx,xxyxx,xxyxy,xyxyy} have avoidability index of 3.
    • others have avoidability index of 2.
  • a b w b a x b c y a c z c a {\displaystyle abwbaxbcyaczca} has avoidability index of 4, as well as other locked words.
  • a b v a c w b a x b c y c d a z d c d {\displaystyle abvacwbaxbcycdazdcd} has avoidability index of 5.
  • The repetitive threshold R T ( n ) {\displaystyle RT(n)} is the infimum of exponents k {\displaystyle k} such that x k {\displaystyle x^{k}} is avoidable on an alphabet of size n {\displaystyle n} . Also see Dejean's theorem.

Open problems

  • Is there an avoidable pattern p {\displaystyle p} such that the avoidability index of p {\displaystyle p} is 6?
  • Given an arbitrarily pattern p {\displaystyle p} , is there an algorithm to determine the avoidability index of p {\displaystyle p} ?

References

  1. ^ Lothaire, M. (2002). Algebraic Combinatorics on Words. Cambridge University Press. ISBN 9780521812207.
  2. ^ Combinatorics on Words: Christoffel Words and Repetitions in Words. American Mathematical Soc. p. 127. ISBN 978-0-8218-7325-0.
  3. ^ Schmidt, Ursula (1987-08-01). "Long unavoidable patterns". Acta Informatica. 24 (4): 433–445. doi:10.1007/BF00292112. ISSN 1432-0525. S2CID 7928450.
  4. ^ Zimin, A. I. (1984). "Blocking Sets of Terms". Mathematics of the USSR-Sbornik. 47 (2): 353–364. Bibcode:1984SbMat..47..353Z. doi:10.1070/SM1984v047n02ABEH002647. ISSN 0025-5734.
  5. Joshua, Cooper; Rorabaugh, Danny (2013). Bounds on Zimin Word Avoidance. arXiv.org. arXiv:1409.3080. Bibcode:2014arXiv1409.3080C.
  6. ^ Baker, Kirby A.; McNulty, George F.; Taylor, Walter (1989-12-18). "Growth problems for avoidable words". Theoretical Computer Science. 69 (3): 319–345. doi:10.1016/0304-3975(89)90071-6. ISSN 0304-3975.
  7. Bean, Dwight R.; Ehrenfeucht, Andrzej; McNulty, George F. (1979). "Avoidable patterns in strings of symbols". Pacific Journal of Mathematics. 85 (2): 261–294. doi:10.2140/pjm.1979.85.261. ISSN 0030-8730.
  8. ^ Grytczuk, Jarosław (2007-05-28). "Pattern avoidance on graphs". Discrete Mathematics. The Fourth Caracow Conference on Graph Theory. 307 (11): 1341–1346. doi:10.1016/j.disc.2005.11.071. ISSN 0012-365X.
  9. Combinatorics on Words: Christoffel Words and Repetitions in Words. American Mathematical Soc. p. 97. ISBN 978-0-8218-7325-0.
  10. Fogg, N. Pytheas (2002-09-23). Substitutions in Dynamics, Arithmetics and Combinatorics. Springer Science & Business Media. p. 104. ISBN 978-3-540-44141-0.
  11. Allouche, Jean-Paul; Shallit, Jeffrey; Shallit, Professor Jeffrey (2003-07-21). Automatic Sequences: Theory, Applications, Generalizations. Cambridge University Press. p. 24. ISBN 978-0-521-82332-6.
  12. Clark, Ronald J. (2006-04-01). "The existence of a pattern which is 5-avoidable but 4-unavoidable". International Journal of Algebra and Computation. 16 (2): 351–367. doi:10.1142/S0218196706002950. ISSN 0218-1967.
Categories: