Zemor's decoding algorithm

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

In coding theory, Zemor's algorithm, designed and developed by Gilles Zemor, is an iterative low-complexity approach to decoding expander codes. It is an improvement over the algorithm of Sipser and Spielman.

Zemor considered a typical class of Sipser–Spielman constructions of expander codes, in which the underlying graph is a bipartite graph. Sipser and Spielman introduced a constructive family of asymptotically good linear error-correcting codes together with a simple parallel algorithm that always removes a constant fraction of errors. This article is based on Dr. Venkatesan Guruswami's course notes.

Code construction

Zemor's algorithm is based on a type of expander graph called a Tanner graph. The code construction was first proposed by Tanner. The codes are based on the double cover of a $d$-regular expander $G$, which is a bipartite graph. Write $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges, with $V = A \cup B$ and $A \cap B = \emptyset$, where $A$ and $B$ denote the two sets of vertices. Let $n$ be the number of vertices in each set, i.e., $|A| = |B| = n$. The edge set $E$ has size $N = nd$, and every edge in $E$ has one endpoint in $A$ and the other in $B$. $E(v)$ denotes the set of edges containing $v$.

Assume an ordering on $V$; this induces an ordering on the edges of $E(v)$ for every $v \in V$. Let the finite field be $\mathbb{F} = GF(2)$. For a word $x = (x_e)$, $e \in E$, in $\mathbb{F}^N$, let $(x)_v$ denote the subword of $x$ indexed by $E(v)$. The vertex set $A$ induces on every word $x \in \mathbb{F}^N$ a partition into $n$ non-overlapping subwords $(x)_v \in \mathbb{F}^d$, where $v$ ranges over the elements of $A$, and the vertex set $B$ does likewise. To construct a code $C$, consider a linear subcode $C_o$, which is a $[d, r_o d, \delta]$ code over an alphabet of size $q = 2$. For any vertex $v \in V$, let $v(1), v(2), \ldots, v(d)$ be some ordering of the $d$ edges of $E$ incident to $v$. In this code, each bit $x_e$ is associated with an edge $e$ of $E$.

We can define the code $C$ to be the set of binary vectors $x = (x_1, x_2, \ldots, x_N)$ of $\{0,1\}^N$ such that, for every vertex $v$ of $V$, $(x_{v(1)}, x_{v(2)}, \ldots, x_{v(d)})$ is a codeword of $C_o$. Here we consider the special case in which every edge of $E$ is incident to exactly $2$ vertices of $V$. This means that $V$ and $E$ make up, respectively, the vertex set and edge set of a $d$-regular graph $G$.

Let us call the code $C$ constructed in this way a $(G, C_o)$ code. For a given graph $G$ and a given code $C_o$, there are several $(G, C_o)$ codes, as there are different ways of ordering the edges incident to a given vertex $v$, i.e., $v(1), v(2), \ldots, v(d)$. In fact, our code $C$ consists of all words $x$ such that $(x)_v \in C_o$ for all $v \in A \cup B$. The code $C$ is a linear $[N, K, D]$ code over $\mathbb{F}$, as it is generated from a subcode $C_o$ that is linear. Formally, $C = \{c \in \mathbb{F}^N : (c)_v \in C_o \text{ for every } v \in V\}$.
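The membership condition above can be illustrated with a minimal Python sketch. It assumes a toy instance that is not from the article: $G$ is the complete bipartite graph $K_{3,3}$ (so $d = 3$, $N = 9$) and $C_o$ is the even-weight code of length 3. The names `EDGES`, `C_O`, `subword` and `in_code` are hypothetical helpers chosen for illustration.

```python
from itertools import product

# Assumed toy instance: G = K_{3,3}, so d = 3 and edge (a, b) joins a in A to b in B.
D_DEG = 3
EDGES = [(a, b) for a in range(D_DEG) for b in range(D_DEG)]

# Assumed local code C_o: the even-weight (single parity check) code of length 3.
C_O = {w for w in product((0, 1), repeat=D_DEG) if sum(w) % 2 == 0}

def subword(x, side, v):
    """Return (x)_v, the bits of x on the edges incident to vertex v.
    side = 0 means v is in A, side = 1 means v is in B."""
    return tuple(x[i] for i, e in enumerate(EDGES) if e[side] == v)

def in_code(x):
    """x belongs to C iff (x)_v is a codeword of C_o for every v in A and in B."""
    return all(subword(x, side, v) in C_O
               for side in (0, 1) for v in range(D_DEG))

# Enumerate C by brute force over all 2^9 words (only feasible for toy sizes).
codewords = [x for x in product((0, 1), repeat=len(EDGES)) if in_code(x)]
print(len(codewords))
```

Brute-force enumeration is used only because the instance is tiny; it is meant to make the definition concrete, not to suggest a practical encoder.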

Figure: Graph $G$ and code $C$.

In this figure, $(x)_v = (x_{e1}, x_{e2}, x_{e3}, x_{e4}) \in C_o$. It shows the graph $G$ and code $C$.

For the graph $G$, let $\lambda$ be the second largest eigenvalue of the adjacency matrix of $G$. Here the largest eigenvalue is $d$. Two important claims are made:

Claim 1

$$\frac{K}{N} \geq 2r_o - 1.$$
Let $R$ be the rate of a linear code constructed from a bipartite graph whose digit nodes have degree $m$ and whose subcode nodes have degree $n$. If a single linear code with parameters $(n, k)$ and rate $r = \frac{k}{n}$ is associated with each of the subcode nodes, then $R \geq 1 - (1 - r)m$.

Proof

Let $R$ be the rate of the linear code, which is equal to $K/N$. Let there be $S$ subcode nodes in the graph. If the degree of each subcode node is $n$, then the code must have $\frac{n}{m}S$ digits, as each digit node is connected to $m$ of the $nS$ edges in the graph. Each subcode node contributes $(n - k)$ equations to the parity-check matrix, for a total of $(n - k)S$. These equations may not be linearly independent. Therefore,
$$\frac{K}{N} \geq \frac{\frac{n}{m}S - (n - k)S}{\frac{n}{m}S} = 1 - m\left(\frac{n - k}{n}\right) = 1 - m(1 - r).$$
Since every digit node of this bipartite graph has degree $m = 2$, and here $r = r_o$, we can write:
$$\frac{K}{N} \geq 2r_o - 1.$$
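As a sanity check on Claim 1, the following Python sketch computes the exact dimension $K$ of a toy code by Gaussian elimination over $GF(2)$ and compares $K/N$ with $2r_o - 1$. The instance is an assumption made for illustration (the complete bipartite graph $K_{3,3}$ with the even-weight code of length 3 as $C_o$, so $r_o = 2/3$), and `gf2_rank` is a hypothetical helper, not something defined in the article.

```python
from fractions import Fraction

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        # Clear that bit from every remaining row.
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

# Assumed toy instance: G = K_{3,3}; each vertex contributes n - k = 1 parity check.
d = 3
edges = [(a, b) for a in range(d) for b in range(d)]
N = len(edges)

checks = []
for side in (0, 1):            # side 0: vertices of A, side 1: vertices of B
    for v in range(d):
        mask = 0
        for i, e in enumerate(edges):
            if e[side] == v:
                mask |= 1 << i
        checks.append(mask)

K = N - gf2_rank(checks)       # dimension of C
rate = Fraction(K, N)
r_o = Fraction(d - 1, d)       # rate of the assumed local code
bound = 2 * r_o - 1
print(rate, bound, rate >= bound)
```

In this instance the parity checks are linearly dependent, which is exactly why the claim is an inequality rather than an equality.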

Claim 2

$$D \geq N\left(\frac{\delta - \frac{\lambda}{d}}{1 - \frac{\lambda}{d}}\right)^2 = N\left(\delta^2 - O\left(\frac{\lambda}{d}\right)\right) \qquad (1)$$

If $S$ is a linear code of rate $r$, block length $d$, and minimum relative distance $\delta$, and if $B$ is the edge-vertex incidence graph of a $d$-regular graph with second largest eigenvalue $\lambda$, then the code $C(B, S)$ has rate at least $2r - 1$ (here $2r_o - 1$, taking $S = C_o$) and minimum relative distance at least $\left(\frac{\delta - \frac{\lambda}{d}}{1 - \frac{\lambda}{d}}\right)^2$.

Proof

Let $B$ be derived from the $d$-regular graph $G$. Then the number of variables of $C(B, S)$ is $\frac{dn}{2}$ and the number of constraints is $n$. According to Alon–Chung, if $X$ is a subset of vertices of $G$ of size $\gamma n$, then the number of edges contained in the subgraph induced by $X$ in $G$ is at most $\frac{dn}{2}\left(\gamma^2 + \frac{\lambda}{d}\gamma(1 - \gamma)\right)$.

As a result, any set of $\frac{dn}{2}\left(\gamma^2 + \frac{\lambda}{d}\gamma(1 - \gamma)\right)$ variables has at least $\gamma n$ constraints as neighbours. Since each variable is a neighbour of exactly two constraints, the average number of variables per constraint is at most
$$\frac{2 \cdot \frac{dn}{2}\left(\gamma^2 + \frac{\lambda}{d}\gamma(1 - \gamma)\right)}{\gamma n} = d\left(\gamma + \frac{\lambda}{d}(1 - \gamma)\right) \qquad (2)$$

So if $d\left(\gamma + \frac{\lambda}{d}(1 - \gamma)\right) < \delta d$, then some constraint sees fewer than $\delta d$ nonzero variables, which is impossible for a nonzero codeword of $S$; hence a word of relative weight $\gamma^2 + \frac{\lambda}{d}\gamma(1 - \gamma)$ cannot be a codeword of $C(B, S)$. Comparing with $(2)$, this inequality is satisfied for $\gamma < \frac{\delta - \frac{\lambda}{d}}{1 - \frac{\lambda}{d}}$. Therefore, $C(B, S)$ cannot have a nonzero codeword of relative weight $\left(\frac{\delta - \frac{\lambda}{d}}{1 - \frac{\lambda}{d}}\right)^2$ or less.

For the graph $G$, we can assume that $\lambda/d$ is bounded away from $1$. For those values of $d$ for which $d - 1$ is an odd prime, there are explicit constructions of sequences of $d$-regular bipartite graphs with an arbitrarily large number of vertices such that each graph $G$ in the sequence is a Ramanujan graph. It is called a Ramanujan graph because it satisfies the inequality $\lambda(G) \leq 2\sqrt{d - 1}$. The expansion properties of $G$ show up as the separation between the eigenvalues $d$ and $\lambda$. If the graph $G$ is a Ramanujan graph, then the $O\left(\frac{\lambda}{d}\right)$ term in expression $(1)$ tends to $0$ as $d$ becomes large, since $\frac{\lambda}{d} \leq \frac{2\sqrt{d - 1}}{d}$.
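To see how the relative distance bound of Claim 2 behaves numerically, here is a small Python sketch. The function name `relative_distance_bound` and the choice to substitute the Ramanujan value $2\sqrt{d-1}$ for $\lambda$ when none is given are assumptions made for this illustration; the bound is decreasing in $\lambda/d$, so using the largest allowed $\lambda$ gives the most pessimistic value.

```python
import math

def relative_distance_bound(delta, d, lam=None):
    """Lower bound ((delta - lam/d) / (1 - lam/d))^2 on the relative distance.

    If lam is omitted, the Ramanujan bound 2*sqrt(d - 1) is used as a
    worst case (an assumption of this sketch)."""
    if lam is None:
        lam = 2 * math.sqrt(d - 1)
    ratio = lam / d
    if ratio >= delta:
        return 0.0  # the bound says nothing useful in this regime
    return ((delta - ratio) / (1 - ratio)) ** 2

# With delta fixed, the bound approaches delta^2 as the degree d grows.
for degree in (100, 1000, 10000):
    print(degree, relative_distance_bound(0.5, degree))
```

For small degrees or small $\delta$ the ratio $\lambda/d$ can exceed $\delta$, in which case the sketch returns $0$ to signal that the bound is vacuous.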

Zemor's algorithm

The iterative decoding algorithm written below alternates between the vertex sets $A$ and $B$ of $G$: it corrects the codewords of $C_o$ at the vertices of $A$, and then switches to correct the codewords of $C_o$ at the vertices of $B$. The edges associated with a vertex on one side of the graph are not incident to any other vertex on that side. In fact, it does not matter in which order the nodes of $A$ and of $B$ are processed. The vertex processing can also be done in parallel.

The decoder $\mathbb{D} : \mathbb{F}^d \rightarrow C_o$ stands for a decoder for $C_o$ that correctly recovers any codeword with fewer than $\frac{\delta d}{2}$ errors, where $\delta$ is the relative distance of $C_o$.

Decoder algorithm

Received word: $w = (w_e)$, $e \in E$
$z \leftarrow w$
For $t \leftarrow 1$ to $m$ do   // $m$ is the number of iterations
{
    if ($t$ is odd)   // the algorithm alternates between its two vertex sets
        $X \leftarrow A$
    else
        $X \leftarrow B$
    Iteration $t$: for every $v \in X$, let $(z)_v \leftarrow \mathbb{D}((z)_v)$   // decode $(z)_v$ to its nearest codeword
}
Output: $z$
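The pseudocode above can be sketched in Python as follows. This is an illustration under stated assumptions, not the article's implementation: the graph is taken to be the complete bipartite graph $K_{7,7}$ (a toy choice, not an expander family), $C_o$ is a systematic $[7,4,3]$ Hamming code, and the local decoder $\mathbb{D}$ is a brute-force nearest-codeword search. The names `hamming_7_4`, `nearest_codeword` and `zemor_decode` are hypothetical.

```python
from itertools import product

def hamming_7_4():
    """All 16 codewords of a systematic [7,4,3] Hamming code (assumed C_o)."""
    gen = [(1, 0, 0, 0, 1, 1, 0),
           (0, 1, 0, 0, 1, 0, 1),
           (0, 0, 1, 0, 0, 1, 1),
           (0, 0, 0, 1, 1, 1, 1)]
    return [tuple(sum(m * g[j] for m, g in zip(msg, gen)) % 2 for j in range(7))
            for msg in product((0, 1), repeat=4)]

def nearest_codeword(word, code):
    """Brute-force local decoder: a codeword at minimum Hamming distance."""
    return min(code, key=lambda c: sum(a != b for a, b in zip(word, c)))

def zemor_decode(w, d, code, iterations):
    """Alternate local decoding over A (odd rounds) and B (even rounds).

    w maps each edge (a, b) of K_{d,d} to a bit; the input is not modified."""
    z = dict(w)
    for t in range(1, iterations + 1):
        side = 0 if t % 2 == 1 else 1
        for v in range(d):
            # Edges incident to v, in a fixed order.
            edges = [(v, u) if side == 0 else (u, v) for u in range(d)]
            fixed = nearest_codeword(tuple(z[e] for e in edges), code)
            for e, bit in zip(edges, fixed):
                z[e] = bit
    return z

# Usage: transmit the all-zeros codeword and flip two bits at one vertex of A.
code = hamming_7_4()
received = {(a, b): 0 for a in range(7) for b in range(7)}
received[(0, 0)] = 1
received[(0, 1)] = 1
decoded = zemor_decode(received, 7, code, iterations=2)
print(sum(decoded.values()))  # number of bits still in error
```

In this run the first round at vertex 0 of $A$ miscorrects, because two errors exceed what the local code can handle, and the second round over $B$ repairs the damage. The vertices within a round touch disjoint edge sets, so the inner loop over `v` could be parallelised.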

Explanation of the algorithm

Since $G$ is bipartite, the set $A$ of vertices induces the partition of the edge set $E = \cup_{v \in A} E_v$. The set $B$ induces another partition, $E = \cup_{v \in B} E_v$.

Let $w \in \{0,1\}^N$ be the received vector, and recall that $N = dn$. The first iteration of the algorithm consists of applying complete decoding for the code induced by $E_v$ for every $v \in A$. This means replacing, for every $v \in A$, the vector $(w_{v(1)}, w_{v(2)}, \ldots, w_{v(d)})$ by one of the closest codewords of $C_o$. Since the subsets of edges $E_v$ are disjoint for $v \in A$, the decoding of these $n$ subvectors of $w$ may be done in parallel.

The iteration yields a new vector $z$. The next iteration consists of applying the preceding procedure to $z$, but with $A$ replaced by $B$. In other words, it consists of decoding all the subvectors induced by the vertices of $B$. The subsequent iterations repeat those two steps, alternately applying parallel decoding to the subvectors induced by the vertices of $A$ and to the subvectors induced by the vertices of $B$.
Note: If $d = n$ and $G$ is the complete bipartite graph, then $C$ is a product code of $C_o$ with itself, and the above algorithm reduces to the natural hard iterative decoding of product codes.

Here, the number of iterations $m$ is $\frac{\log n}{\log(2 - \alpha)}$. In general, the above algorithm can correct an error pattern whose Hamming weight is no more than $\frac{1}{2}\alpha N \delta\left(\frac{\delta}{2} - \frac{\lambda}{d}\right) = \frac{1}{4}\alpha N\left(\delta^2 - O\left(\frac{\lambda}{d}\right)\right)$ for values of $\alpha < 1$. The decoding algorithm can be implemented as a circuit of size $O(N \log N)$ and depth $O(\log N)$ that returns the codeword, provided that the error vector has weight less than $\alpha N \delta^2 (1 - \epsilon)/4$.
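The two quantities in the paragraph above are easy to evaluate for concrete parameters. The Python sketch below does so; the function names and the rounding of the iteration count up to an integer are assumptions made for illustration, since the article gives $m$ only as the ratio of logarithms.

```python
import math

def num_iterations(n, alpha):
    """Iteration count m = log(n) / log(2 - alpha), rounded up (assumed rounding)."""
    if not alpha < 1:
        raise ValueError("alpha must be less than 1")
    return math.ceil(math.log(n) / math.log(2 - alpha))

def correctable_weight(N, delta, d, lam, alpha):
    """Error weight (1/2) * alpha * N * delta * (delta/2 - lam/d) from the text."""
    return 0.5 * alpha * N * delta * (delta / 2 - lam / d)

# Example parameters, chosen arbitrarily for illustration.
print(num_iterations(1000, 0.5))
print(correctable_weight(N=1000, delta=0.5, d=100, lam=20, alpha=0.5))
```

Choosing $\alpha$ closer to $1$ raises the correctable weight but also increases the number of rounds, because the base $2 - \alpha$ of the logarithm approaches $1$.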

Theorem

If $G$ is a Ramanujan graph of sufficiently high degree, then for any $\alpha < 1$ the decoding algorithm can correct $\frac{\alpha \delta_o^2}{4}(1 - \epsilon)N$ errors in $O(\log n)$ rounds (where the big-$O$ notation hides a dependence on $\alpha$). This can be implemented in linear time on a single processor; on $n$ processors each round can be implemented in constant time.

Proof

Since the decoding algorithm is insensitive to the values on the edges, and by linearity, we can assume that the transmitted codeword is the all-zeros vector. Let the received word be $w$. We consider the set of edges that have an incorrect value during decoding; here an incorrect value means a $1$ in that bit. Let $w = w^0$ be the initial value of the word, and let $w^1, w^2, \ldots, w^t$ be its values after the first, second, ..., $t$-th stages of decoding. Define $X^i = \{e \in E \mid w_e^i = 1\}$ and $S^i = \{v \in V^i \mid E_v \cap X^{i+1} \neq \emptyset\}$. Here $S^i$ is the set of vertices that were not able to decode their codeword successfully in the $i^{th}$ round.

From the above algorithm, $|S^1| < |S^0|$, as the number of unsuccessful vertices shrinks in every iteration. We can prove that $|S^0| > |S^1| > |S^2| > \cdots$ is a decreasing sequence. In fact, $|S^{i+1}| \leq \frac{1}{2 - \alpha}|S^i|$. As we are assuming $\alpha < 1$, the sequence decreases geometrically. So, since $|S^0| \leq n$, no more than $\log_{2 - \alpha} n$ rounds are necessary. Furthermore, $\sum_i |S^i| \leq n \sum_i \frac{1}{(2 - \alpha)^i} = O(n)$, and if we implement the $i^{th}$ round in $O(|S^i|)$ time, then the total sequential running time will be linear.
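The geometric sum behind the linear running time can be written out explicitly. The following derivation is a sketch that uses only the contraction $|S^{i+1}| \leq \frac{1}{2-\alpha}|S^i|$ and the trivial bound $|S^0| \leq n$; it makes visible the dependence on $\alpha$ that the $O(n)$ notation hides.

```latex
\begin{align*}
|S^{i}| &\le \frac{|S^{0}|}{(2-\alpha)^{i}} \le \frac{n}{(2-\alpha)^{i}}, \\
\sum_{i \ge 0} |S^{i}|
  &\le n \sum_{i \ge 0} \left(\frac{1}{2-\alpha}\right)^{i}
   = n \cdot \frac{1}{1 - \frac{1}{2-\alpha}}
   = n \cdot \frac{2-\alpha}{1-\alpha}.
\end{align*}
```

The constant $\frac{2-\alpha}{1-\alpha}$ is finite for every $\alpha < 1$ but grows without bound as $\alpha$ approaches $1$.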

Drawbacks of Zemor's algorithm

  1. It is a lengthy process, as the number of iterations $m$ that the decoder algorithm takes is $(\log n)/\log(2 - \alpha)$.
  2. Zemor's decoding algorithm finds it difficult to decode erasures. A detailed way of how we can improve the algorithm is given in.

References

  1. "Gilles Zémor". www.math.u-bordeaux.fr. Retrieved 9 April 2023.
  2. Guruswami, Venkatesan; Cary, Matt (January 27, 2003). "Lecture 5". CSE590G: Codes and Pseudorandom Objects. University of Washington. Archived from the original on 2014-02-24.
  3. "Lecture notes" (PDF). washington.edu. Retrieved 9 April 2023.
  4. N. Alon; F.R.K. Chung (December 1988). "Explicit construction of linear sized tolerant networks". Discrete Mathematics. 72 (1–3): 15–19. CiteSeerX 10.1.1.300.7495. doi:10.1016/0012-365X(88)90189-6.
  5. "Archived copy". Archived from the original on September 14, 2004. Retrieved May 1, 2012.