Misplaced Pages

Incomplete Cholesky factorization

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.

In numerical analysis, an incomplete Cholesky factorization of a symmetric positive definite matrix is a sparse approximation of the Cholesky factorization. An incomplete Cholesky factorization is often used as a preconditioner for algorithms like the conjugate gradient method.

The Cholesky factorization of a positive definite matrix A is A = LL* where L is a lower triangular matrix. An incomplete Cholesky factorization is given by a sparse lower triangular matrix K that is in some sense close to L. The corresponding preconditioner is KK*.

One popular way to find such a matrix K is to use the algorithm for finding the exact Cholesky decomposition in which K has the same sparsity pattern as A (any entry of K is set to zero if the corresponding entry in A is also zero). This gives an incomplete Cholesky factorization which is as sparse as the matrix A.

Motivation

Consider the following matrix as an example:

A = [ 5 2 0 2 2 2 5 2 0 0 0 2 5 2 0 2 0 2 5 2 2 0 0 2 5 ] {\displaystyle \mathbf {A} ={\begin{bmatrix}5&-2&0&-2&-2\\-2&5&-2&0&0\\0&-2&5&-2&0\\-2&0&-2&5&-2\\-2&0&0&-2&5\\\end{bmatrix}}}

If we apply the full regular Cholesky decomposition, it yields:

L = [ 2.24 0 0 0 0 0.89 2.05 0 0 0 0 0.98 2.02 0 0 0.89 0.39 1.18 1.63 0 0.89 0.39 0.19 1.95 0.45 ] {\displaystyle \mathbf {L} ={\begin{bmatrix}2.24&0&0&0&0\\-0.89&2.05&0&0&0\\0&-0.98&2.02&0&0\\-0.89&-0.39&-1.18&1.63&0\\-0.89&-0.39&-0.19&-1.95&0.45\\\end{bmatrix}}}

And, by definition:

A = L L {\displaystyle \mathbf {A} =\mathbf {L} \mathbf {L'} }

However, by applying Cholesky decomposition, we observe that some zero elements in the original matrix end up being non-zero elements in the decomposed matrix, like elements (4,2), (5,2) and (5,3) in this example. These elements are known as "fill-ins".

This is not an issue per se, but it is very problematic when working with sparse matrices, since the fill-ins generation is mostly unpredictable and reduces the matrix sparsity, impacting the efficiency of sparse matrix algorithms.

Therefore, given the importance of the Cholesky decomposition in matrix calculations, it is extremely relevant to repurpose the regular method, so as to eliminate the fill-ins generation. The incomplete Cholesky factorization does exactly that, by generating a matrix L similar to the one generated by the regular method, but conserving the zero elements in the original matrix.

Naturally:

A L i c h o l L i c h o l {\displaystyle \mathbf {A} \neq \mathbf {L_{ichol}} \mathbf {L_{ichol}'} }

Multiplying matrix L generated by incomplete Cholesky factorization by its transpose won't yield the original matrix, but a similar one.

Algorithm

For i {\displaystyle i} from 1 {\displaystyle 1} to N {\displaystyle N} :

L i i = ( a i i k = 1 i 1 L i k 2 ) 1 2 {\displaystyle L_{ii}=\left({a_{ii}-\sum \limits _{k=1}^{i-1}{L_{ik}^{2}}}\right)^{1 \over 2}}

For j {\displaystyle j} from i + 1 {\displaystyle i+1} to N {\displaystyle N} :

L j i = 1 L i i ( a j i k = 1 i 1 L i k L j k ) {\displaystyle L_{ji}={1 \over {L_{ii}}}\left({a_{ji}-\sum \limits _{k=1}^{i-1}{L_{ik}L_{jk}}}\right)}

Implementation

Implementation of the incomplete Cholesky factorization in the GNU Octave language. The factorization is stored as a lower triangular matrix, with the elements in the upper triangle set to zero.

function a = ichol(a)
	n = size(a,1);
	for k = 1:n
		a(k,k) = sqrt(a(k,k));
		for i = (k+1):n
		    if (a(i,k) != 0)
		        a(i,k) = a(i,k)/a(k,k);            
		    endif
		endfor
		for j = (k+1):n
		    for i = j:n
		        if (a(i,j) != 0)
		            a(i,j) = a(i,j) - a(i,k)*a(j,k);  
		        endif
		    endfor
		endfor
	endfor
    for i = 1:n
        for j = i+1:n
            a(i,j) = 0;
        endfor
    endfor            
endfunction

Sparse example

Consider again the matrix displayed in the beginning of this article. Since it is symmetric and the method only uses the lower triangular elements, we can represent it by:

A t r i = [ 5 0 0 0 0 2 5 0 0 0 0 2 5 0 0 2 0 2 5 0 2 0 0 2 5 ] {\displaystyle \mathbf {A_{tri}} ={\begin{bmatrix}5&0&0&0&0\\-2&5&0&0&0\\0&-2&5&0&0\\-2&0&-2&5&0\\-2&0&0&-2&5\end{bmatrix}}}

More specifically, in its sparse form as a coordinate list, sweeping rows first:

Value	5	-2	-2	-2	5	-2	5	-2	5	-2	5
Row		1	2	4	5	2	3	3	4	4	5	5
Col		1	1	1	1	2	2	3	3	4	4	5

Then, we take the square root of (1,1) and divide the other (i,1) elements by the result:

Value	2.24 -0.89 -0.89 -0.89	| 5 -2 5 -2 5 -2 5
Row		1	  2	   4	 5	    | 2 3  3 4  4 5  5
Col		1	  1	   1  	 1	    | 2 2  3 3  4 4  5

After that, for all the other elements with column greater than 1, calculate (i,j)=(i,j)-(i,1)*(j,1) if (i,1) and (j,1) exist. For example: (5,4) = (5,4)-(5,1)*(4,1) = -2 -(-0.89*-0.89) = -2.8.

Value	2.24 -0.89 -0.89 -0.89	| 4.2 -2 5 -2 4.2 -2.8 4.2
Row		1	  2	   4	 5	    | 2   3  3 4  4	  5    5
Col		1	  1	   1  	 1	    | 2   2  3 3  4	  4    5
                                  ↑           ↑   ↑    ↑

The elements (2,2), (4,4), (5,4) and (5,5) (marked with an arrow) have been recalculated, since they obey this rule. On the other hand, the elements (3,2), (3,3) and (4,3) won't be recalculated since the element (3,1) doesn't exist, even though the elements (2,1) and (4,1) exist.


Now, repeat the process, but for (i,2). Take the square root of (2,2) and divide the other (i,2) elements by the result:

Value	2.24 -0.89 -0.89 -0.89	| 2.05 -0.98 | 5 -2 4.2 -2.8 4.2
Row		1	  2	   4	 5	    | 2    3     | 3  4 4   5    5
Col		1	  1	   1  	 1	    | 2    2     | 3  3 4   4    5

Again, for elements with column greater than 2, calculate (i,j)=(i,j)-(i,2)*(j,2) if (i,2) and (j,2) exist:

Value	2.24 -0.89 -0.89 -0.89	| 2.05 -0.98 | 4.05 -2 4.2 -2.8 4.2
Row		1	  2	   4	 5	    | 2    3     | 3    4  4   5    5
Col		1	  1	   1  	 1	    | 2    2     | 3    3  4   4    5
                                               ↑

Repeat for (i,3). Take the square root of (3,3) and divide the other (i,3):

Value	2.24 -0.89 -0.89 -0.89	2.05 -0.98 | 2.01 -0.99 | 4.2 -2.8 4.2
Row		1	  2	   4	 5	    2    3     | 3    4     | 4	  5    5
Col		1	  1	   1  	 1	    2    2     | 3    3     | 4	  4    5

For elements with column greater than 3, calculate (i,j)=(i,j)-(i,3)*(j,3) if (i,3) and (j,3) exist:

Value	2.24 -0.89 -0.89 -0.89	2.05 -0.98 | 2.01 -0.99 | 3.21 -2.8 4.2
Row		1	  2	   4	 5	    2    3     | 3    4     | 4	   5    5
Col		1	  1	   1  	 1	    2    2     | 3    3     | 4	   4    5
                                                          ↑

Repeat for (i,4). Take the square root of (4,4) and divide the other (i,4):

Value	2.24 -0.89 -0.89 -0.89	2.05 -0.98 2.01 -0.99 | 1.79 -1.56 | 4.2
Row		1	  2	   4	 5	    2    3     3    4     | 4	   5   | 5
Col		1	  1	   1  	 1	    2    2     3    3     | 4	   4   | 5

For elements with column greater than 4, calculate (i,j)=(i,j)-(i,4)*(j,4) if (i,4) and (j,4) exist:

Value	2.24 -0.89 -0.89 -0.89	2.05 -0.98 2.01 -0.99 | 1.79 -1.56 | 1.76
Row		1	  2	   4	 5	    2    3     3    4     | 4	   5   | 5
Col		1	  1	   1  	 1	    2    2     3    3     | 4	   4   | 5
                                                                     ↑

Finally take the square root of (5,5):

Value	2.24 -0.89 -0.89 -0.89	2.05 -0.98 2.01 -0.99 1.79 -1.56 | 1.33
Row		1	  2	   4	 5	    2    3     3    4     4	   5     | 5
Col		1	  1	   1  	 1	    2    2     3    3     4	   4     | 5

Expanding the matrix to its full form:

L = [ 2.24 0 0 0 0 0.89 2.05 0 0 0 0 0.98 2.01 0 0 0.89 0 0.99 1.79 0 0.89 0 0 1.56 1.33 ] {\displaystyle \mathbf {L} ={\begin{bmatrix}2.24&0&0&0&0\\-0.89&2.05&0&0&0\\0&-0.98&2.01&0&0\\-0.89&0&-0.99&1.79&0\\-0.89&0&0&-1.56&1.33\end{bmatrix}}}

Note that in this case no fill-ins were generated compared to the original matrix. The elements (4,2), (5,2) and (5,3) are still zero.

However, if we perform the multiplication of L to its transpose:

L L = [ 5 2 0 2 2 2 5 2 0.8 0.8 0 2 5 2 0 2 0.8 2 5 2 2 0.8 0 2 5 ] {\displaystyle \mathbf {LL'} ={\begin{bmatrix}5&-2&0&-2&-2\\-2&5&-2&0.8&0.8\\0&-2&5&-2&0\\-2&0.8&-2&5&-2\\-2&0.8&0&-2&5\end{bmatrix}}}

We get a matrix slightly different from the original one, since the decomposition didn't take into account all the elements, in order to eliminate the fill-ins.

Sparse implementation

The sparse version for the incomplete Cholesky factorization (the same procedure presented above) implemented in MATLAB can be seen below. Naturally, MATLAB has its own means for dealing with sparse matrixes, but the code below was made explicit for pedagogic purposes. This algorithm is efficient, since it treats the matrix as a sequential 1D array, automatically avoiding the zero elements.

function A=Sp_ichol(A)
	n=size(A,1);
	ncols=A(n).col;
    c_end=0;
    for col=1:ncols
        is_next_col=0;
        c_start=c_end+1;
        for i=c_start:n
            if A(i).col==col % in the current column (col):
                if A(i).col==A(i).row 
                    A(i).val=sqrt(A(i).val); % take the square root of the current column's diagonal element
                    div=A(i).val;
                else
                    A(i).val=A(i).val/div; % divide the other current column's elements by the square root of the diagonal element
                end
            end
            if A(i).col>col % in the next columns (col+1 ... ncols):
                if is_next_col==0
                    c_end=i-1;
                    is_next_col=1;
                end
                v1=0;
                v2=0;
                for j=c_start:c_end
                    if A(j).col==col
                        if A(j).row==A(i).row % search for current column's (col) elements A(j) whose row index is the same as current element's A(i) row index
                            v1=A(j).val;
                        end
                        if A(j).row==A(i).col % search for current column's (col) elements A(j) whose row index is the same as current element's A(i) column index
                            v2=A(j).val;
                        end
                        if v1~=0 && v2~=0 % if these elements exist in the current column (col), recalculate the current element A(i):
                            A(i).val=A(i).val-v1*v2;
                            break;
                        end
                    end
                end
            end
        end
    end
end

References

Numerical linear algebra
Key concepts
Problems
Hardware
Software
Category: