Linear prediction

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Mathematical operation that predicts future values of a discrete-time signal

Linear prediction is a mathematical operation where future values of a discrete-time signal are estimated as a linear function of previous samples.

In digital signal processing, linear prediction is often called linear predictive coding (LPC) and can thus be viewed as a subset of filter theory. In system analysis, a subfield of mathematics, linear prediction can be viewed as a part of mathematical modelling or optimization.

The prediction model

The most common representation is

{\displaystyle {\widehat {x}}(n)=\sum _{i=1}^{p}a_{i}x(n-i)}

where {\displaystyle {\widehat {x}}(n)} is the predicted signal value, {\displaystyle x(n-i)} are the previously observed values, with {\displaystyle p\leq n}, and {\displaystyle a_{i}} are the predictor coefficients. The error generated by this estimate is

{\displaystyle e(n)=x(n)-{\widehat {x}}(n)}

where {\displaystyle x(n)} is the true signal value.

These equations are valid for all types of (one-dimensional) linear prediction. The differences are found in the way the predictor coefficients {\displaystyle a_{i}} are chosen.
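As a minimal sketch of the two equations above, the prediction and its error can be computed directly from a list of samples. The function names `predict` and `prediction_error`, the storage convention (`a[i - 1]` holds the coefficient of x(n-i)), and the sample data are assumptions made for illustration only.

```python
def predict(x, a, n):
    """Return the linear prediction of x[n] from the p previous samples.

    a[i - 1] holds the coefficient a_i, so p = len(a); requires n >= p.
    """
    p = len(a)
    if n < p:
        raise ValueError("need at least p previous samples")
    return sum(a[i - 1] * x[n - i] for i in range(1, p + 1))


def prediction_error(x, a, n):
    """Return e(n) = x(n) - x_hat(n)."""
    return x[n] - predict(x, a, n)


# Illustrative data: a straight line, which the order-2 predictor
# x_hat(n) = 2 x(n-1) - x(n-2) extrapolates without error.
x = [1.0, 3.0, 5.0, 7.0, 9.0]
a = [2.0, -1.0]
errors = [prediction_error(x, a, n) for n in range(len(a), len(x))]
```

Here every entry of `errors` is zero because the chosen coefficients happen to match the signal exactly; for real data the coefficients have to be estimated.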

For multi-dimensional signals the error metric is often defined as

{\displaystyle e(n)=\|x(n)-{\widehat {x}}(n)\|}

where {\displaystyle \|\cdot \|} is a suitably chosen vector norm. Predictions such as {\displaystyle {\widehat {x}}(n)} are routinely used within Kalman filters and smoothers to estimate current and past signal values, respectively, from noisy measurements.
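For vector-valued samples, the error metric above might be evaluated as follows. The Euclidean norm is only one possible choice of norm, and the function name `vector_prediction_error` is hypothetical.

```python
import math


def vector_prediction_error(x_n, x_hat_n):
    """Return ||x(n) - x_hat(n)|| using the Euclidean norm (an assumed choice)."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(x_n, x_hat_n)))


# Illustrative two-dimensional sample and its prediction.
error = vector_prediction_error([3.0, 4.0], [0.0, 0.0])
```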

Estimating the parameters

The most common choice for optimizing the parameters {\displaystyle a_{i}} is the root mean square criterion, which is also called the autocorrelation criterion. In this method we minimize the expected value of the squared error {\displaystyle E[e^{2}(n)]}, which yields the equation

{\displaystyle \sum _{i=1}^{p}a_{i}R(j-i)=R(j),}

for 1 ≤ j ≤ p, where R is the autocorrelation of the signal {\displaystyle x(n)}, defined as

{\displaystyle R(i)=E\{x(n)x(n-i)\},}

and E is the expected value. In the multi-dimensional case this corresponds to minimizing the L2 norm.

The above equations are called the normal equations or Yule-Walker equations. In matrix form the equations can be equivalently written as

{\displaystyle \mathbf {RA} =\mathbf {r} }

where the autocorrelation matrix {\displaystyle \mathbf {R} } is a symmetric {\displaystyle p\times p} Toeplitz matrix with elements {\displaystyle r_{ij}=R(i-j),0\leq i,j<p}, the vector {\displaystyle \mathbf {r} } is the autocorrelation vector with elements {\displaystyle r_{j}=R(j),0<j\leq p}, and {\displaystyle \mathbf {A} =[a_{1},a_{2},\,\cdots \,,a_{p-1},a_{p}]} is the parameter vector.
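The matrix form above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a reference implementation: the expectation in R(i) is replaced by a biased sample average, the system is solved by ordinary Gaussian elimination, and the names `autocorrelation`, `solve_linear_system` and `yule_walker` are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Biased sample estimate of R(0..max_lag): R(k) ~ (1/N) sum x[n] x[n-k]."""
    n_samples = len(x)
    return [
        sum(x[n] * x[n - k] for n in range(k, n_samples)) / n_samples
        for k in range(max_lag + 1)
    ]


def solve_linear_system(matrix, rhs):
    """Solve matrix @ sol = rhs by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular; no check is made.
    """
    size = len(rhs)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, size):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, size + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    sol = [0.0] * size
    for row in range(size - 1, -1, -1):
        tail = sum(aug[row][k] * sol[k] for k in range(row + 1, size))
        sol[row] = (aug[row][size] - tail) / aug[row][row]
    return sol


def yule_walker(R, p):
    """Solve sum_i a_i R(j-i) = R(j), 1 <= j <= p, given R[0..p]."""
    # R is symmetric in its lag, so R(j - i) is read as R[abs(j - i)].
    matrix = [[R[abs(i - j)] for i in range(p)] for j in range(p)]
    rhs = [R[j] for j in range(1, p + 1)]
    return solve_linear_system(matrix, rhs)


# Illustrative autocorrelation R(k) = 0.5**k, as for a first-order process.
R = [1.0, 0.5, 0.25]
a = yule_walker(R, 2)  # a_1 close to 0.5, a_2 close to 0
```

In practice `R` would come from `autocorrelation(x, p)` applied to recorded samples; the hand-picked values are used here so that the result is easy to check by substitution.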

Another, more general, approach is to minimize the sum of squares of the errors defined in the form

{\displaystyle e(n)=x(n)-{\widehat {x}}(n)=x(n)-\sum _{i=1}^{p}a_{i}x(n-i)=-\sum _{i=0}^{p}a_{i}x(n-i)}

where the optimisation problem, searching over all {\displaystyle a_{i}}, must now be constrained with {\displaystyle a_{0}=-1}.

On the other hand, if the mean square prediction error is constrained to be unity and the prediction error equation is included on top of the normal equations, the augmented set of equations is obtained as

{\displaystyle \mathbf {RA} =[1,0,...,0]^{\mathrm {T} }}

where the index {\displaystyle i} ranges from 0 to {\displaystyle p}, and {\displaystyle \mathbf {R} } is a {\displaystyle (p+1)\times (p+1)} matrix.

Specification of the parameters of the linear predictor is a wide topic, and a large number of other approaches have been proposed. Nevertheless, the autocorrelation method remains the most common; it is used, for example, for speech coding in the GSM standard.

Solving the matrix equation {\displaystyle \mathbf {RA} =\mathbf {r} } is computationally relatively expensive. Gaussian elimination for matrix inversion is probably the oldest solution, but this approach does not efficiently use the symmetry of {\displaystyle \mathbf {R} }. A faster algorithm is the Levinson recursion, proposed by Norman Levinson in 1947, which calculates the solution recursively. In particular, the autocorrelation equations above may be more efficiently solved by the Durbin algorithm.

In 1986, Philippe Delsarte and Y.V. Genin proposed an improvement to this algorithm called the split Levinson recursion, which requires about half the number of multiplications and divisions. It uses a special symmetrical property of parameter vectors on subsequent recursion levels. That is, calculations for the optimal predictor containing {\displaystyle p} terms make use of similar calculations for the optimal predictor containing {\displaystyle p-1} terms.
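The order-recursive idea, in which the order-p predictor is built from the order-(p-1) predictor, can be sketched with the classic Levinson-Durbin recursion. Note that this is the standard textbook form and not the split variant of Delsarte and Genin; the function name `levinson_durbin` and the example autocorrelation values are assumptions made for illustration.

```python
def levinson_durbin(R, p):
    """Solve the Yule-Walker equations for a_1..a_p given R[0..p].

    Returns (a, err), where a[i - 1] holds a_i and err is the mean
    square prediction error of the final order-p predictor.
    """
    a = []
    err = R[0]  # Error of the order-0 predictor (predicting zero).
    for m in range(1, p + 1):
        # Reflection coefficient for order m, from the order-(m-1) solution.
        acc = R[m] - sum(a[i] * R[m - 1 - i] for i in range(m - 1))
        k = acc / err
        # Update the existing coefficients and append the new one, a_m = k.
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


# Illustrative autocorrelation values (assumed, not measured data).
R = [1.0, 0.5, 0.7]
a, err = levinson_durbin(R, 2)
```

For these values, solving the 2 × 2 system by hand gives a_1 = 0.2 and a_2 = 0.6, and the residual error R(0) - a_1 R(1) - a_2 R(2) is 0.48, which the recursion reproduces up to rounding. Each order costs a number of operations proportional to m, so the total work grows with p² rather than the p³ of general Gaussian elimination.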

Another way of identifying model parameters is to iteratively calculate state estimates using Kalman filters and obtain maximum likelihood estimates within expectation–maximization algorithms.

For equally spaced values, a polynomial interpolation is a linear combination of the known values. If the discrete-time signal is estimated to obey a polynomial of degree {\displaystyle p-1,} then the predictor coefficients {\displaystyle a_{i}} are given by the corresponding row of the triangle of binomial transform coefficients. This estimate might be suitable for a slowly varying signal with low noise. The predictions for the first few values of {\displaystyle p} are

{\displaystyle {\begin{array}{lcl}p=1&:&{\widehat {x}}(n)=1x(n-1)\\p=2&:&{\widehat {x}}(n)=2x(n-1)-1x(n-2)\\p=3&:&{\widehat {x}}(n)=3x(n-1)-3x(n-2)+1x(n-3)\\p=4&:&{\widehat {x}}(n)=4x(n-1)-6x(n-2)+4x(n-3)-1x(n-4)\\\end{array}}}
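The rows listed above follow the pattern a_i = (-1)^(i+1) C(p, i), where C(p, i) is a binomial coefficient. A minimal sketch, with the hypothetical function names `polynomial_predictor_coefficients` and `predict_next`, generates the coefficients and applies them to a quadratic test signal:

```python
from math import comb


def polynomial_predictor_coefficients(p):
    """Return [a_1, ..., a_p] with a_i = (-1)**(i + 1) * C(p, i)."""
    return [(-1) ** (i + 1) * comb(p, i) for i in range(1, p + 1)]


def predict_next(history, p):
    """Predict the sample following `history` from its last p values."""
    a = polynomial_predictor_coefficients(p)
    return sum(a[i - 1] * history[-i] for i in range(1, p + 1))


# Illustrative signal x(n) = n**2, a polynomial of degree 2 = p - 1 for p = 3.
samples = [n * n for n in range(5)]    # 0, 1, 4, 9, 16
prediction = predict_next(samples, 3)  # 3*16 - 3*9 + 1*4 = 25 = 5**2
```

Because the test signal is itself a polynomial of degree p - 1, the prediction is exact; with noisy data the alternating coefficients of higher orders tend to amplify the noise, which is why the text recommends this estimate only for slowly varying, low-noise signals.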

References

  1. "Kalman Filter - an overview | ScienceDirect Topics". www.sciencedirect.com. Retrieved 2022-06-24.
  2. "Linear Prediction - an overview | ScienceDirect Topics". www.sciencedirect.com. Retrieved 2022-06-24.
  3. Ramirez, M. A. (2008). "A Levinson Algorithm Based on an Isometric Transformation of Durbin's" (PDF). IEEE Signal Processing Letters. 15: 99–102. doi:10.1109/LSP.2007.910319. S2CID 18906207.
  4. Delsarte, P. and Genin, Y. V. (1986), The split Levinson algorithm, IEEE Transactions on Acoustics, Speech, and Signal Processing, v. ASSP-34(3), pp. 470–478