Signal models
In this chapter, we model a signal as the output of a causal linear shift-invariant filter with a rational system function of the form
$$H(z)=\frac{B_{q}(z)}{A_{p}(z)}=\frac{\sum_{k=0}^{q} b_{q}(k) z^{-k}}{1+\sum_{k=1}^{p} a_{p}(k) z^{-k}}$$
Model identification
For deterministic signals, given observations $x[n],\ n = 0, \cdots, N-1$ and filter orders $p, q$, find the parameters of $H(z)$ such that the modeled output signal $\hat x[n] = h[n]$ best matches the observations.
For stochastic signals, we will try to match the correlation sequences:
$$r_x[k] = r_h[k]$$
Determine models: Least Squares
Minimize the error with the least-squares criterion:
$$\mathcal E_{LS}=\sum_{n=0}^\infty |x[n]-h[n]|^2$$
Using Parseval’s theorem, the least-squares error may be written in the frequency domain:
$$\mathcal E_{LS}=\frac{1}{2\pi}\int_{-\pi}^{\pi}\left|X(e^{j\omega})-H(e^{j\omega})\right|^2 d\omega$$
This representation allows us to express $\mathcal{E}_{LS}$ explicitly in terms of the model parameters $a_p(k)$ and $b_q(k)$:
$$\left\{\begin{array}{ll} \frac{\partial \mathcal{E}_{LS}}{\partial a^{*}[k]}=0, & k=1, \cdots, p \\ \frac{\partial \mathcal{E}_{LS}}{\partial b^{*}[k]}=0, & k=0, \cdots, q \end{array}\right.$$
The resulting equations are nonlinear in the model parameters: not mathematically tractable, and not amenable to real-time signal processing applications.
An alternative that leads to tractable results is to consider a “weighted” error:
Since $X(z)-H(z)=X(z)-B_q(z)/A_p(z)$ contains a fraction, we can instead consider $A_p(z)\left(X(z)-H(z)\right)=A_p(z)X(z)-B_q(z)$, i.e.,
$$E(z)=A_p(z)E'(z)=A_p(z)X(z)-B_q(z)$$
Now the error equation is linear.
Based on this, we can develop techniques such as the Padé approximation, Prony’s method, and Shanks’ method.
Determine models: Padé approximation
We have $p+q+1$ model parameters. They can be solved for by matching $p+q+1$ signal samples exactly:
$$h[n]=x[n], \quad n=0,\cdots,p+q$$
From the definition of $H(z)$,
$$\begin{aligned} H(z)=\frac{B(z)}{A(z)} & \Rightarrow H(z) A(z)=B(z) \Rightarrow h[n] * a[n]=b[n] \\ & \Rightarrow \quad h[n]+\sum_{k=1}^{p} a[k] h[n-k]=b[n] \end{aligned}$$
where $h[n]=0$ for $n<0$, and $b[n]=0$ for $n<0$ or $n>q$.
Since we assume $h[n]=x[n]$,
$$x[n]+\sum_{k=1}^{p} a[k] x[n-k]=\left\{\begin{array}{ll} b[n], & n=0, \cdots, q \\ 0, & n=q+1, \cdots, p+q \end{array}\right.$$
These equations can be rearranged in a matrix form:
$$\left[\begin{array}{cccc} x(0) & 0 & \cdots & 0 \\ x(1) & x(0) & \cdots & 0 \\ x(2) & x(1) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ x(q) & x(q-1) & \cdots & x(q-p) \\ \hline x(q+1) & x(q) & \cdots & x(q-p+1) \\ \vdots & \vdots & & \vdots \\ x(q+p) & x(q+p-1) & \cdots & x(q) \end{array}\right] \left[\begin{array}{c} 1 \\ a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]= \left[\begin{array}{c} b_{q}(0) \\ b_{q}(1) \\ b_{q}(2) \\ \vdots \\ b_{q}(q) \\ \hline 0 \\ \vdots \\ 0 \end{array}\right]$$
- In the first step of the Padé approximation method, solving for the coefficients $a_p(k)$, we use the last $p$ equations as indicated by the partitioning, i.e.,
$$\left[\begin{array}{cccc} x(q+1) & x(q) & \cdots & x(q-p+1) \\ x(q+2) & x(q+1) & \cdots & x(q-p+2) \\ \vdots & \vdots & & \vdots \\ x(q+p) & x(q+p-1) & \cdots & x(q) \end{array}\right]\left[\begin{array}{c} 1 \\ a_{p}(1) \\ \vdots \\ a_{p}(p) \end{array}\right]=\left[\begin{array}{c} 0 \\ 0 \\ \vdots \\ 0 \end{array}\right]$$
This can be rearranged as
$$\left[\begin{array}{cccc} x(q) & x(q-1) & \cdots & x(q-p+1) \\ x(q+1) & x(q) & \cdots & x(q-p+2) \\ \vdots & \vdots & & \vdots \\ x(q+p-1) & x(q+p-2) & \cdots & x(q) \end{array}\right]\left[\begin{array}{c} a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=-\left[\begin{array}{c} x(q+1) \\ x(q+2) \\ \vdots \\ x(q+p) \end{array}\right]$$
In matrix notation this may be expressed concisely as
$$\mathbf X_q\,\overline {\mathbf a}_p=-\mathbf x_{q+1}$$
- Plug the solution back to find the $b_q(k)$:
$$\left[\begin{array}{ccccc} x(0) & 0 & 0 & \cdots & 0 \\ x(1) & x(0) & 0 & \cdots & 0 \\ x(2) & x(1) & x(0) & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ x(q) & x(q-1) & x(q-2) & \cdots & x(q-p) \end{array}\right]\left[\begin{array}{c} 1 \\ a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=\left[\begin{array}{c} b_{q}(0) \\ b_{q}(1) \\ b_{q}(2) \\ \vdots \\ b_{q}(q) \end{array}\right],$$
which can also be written in matrix form:
$$\mathbf X_0\, \mathbf a_p=\mathbf b_q$$
Disadvantages of the Pade approximation
- The resulting model is not guaranteed to be stable.
- Outside the matched interval $[0, p+q]$, the approximation may be very bad.
- If $\mathbf{X}_{q}$ is singular, a solution matching all signal values does not always exist. (Sometimes a lower-order solution for $A(z), B(z)$ exists, i.e., smaller $p, q$, but then there is no guarantee of matching samples beyond $p+q$.)
Determine models: Prony’s Method
As before,
$$E(z)=A_p(z)E'(z)=A_p(z)X(z)-B_q(z),$$
i.e.,
$$e[n]=\left\{\begin{array}{ll} x[n]+\sum_{k=1}^{p} a[k] x[n-k]-b[n], & n=0, \cdots, q \\ x[n]+\sum_{k=1}^{p} a[k] x[n-k], & n>q \end{array}\right.$$
- For Padé, we first solved $e[n]=0,\ n=q+1, \cdots, q+p$ to find $A(z)$.
- For Prony, we solve
$$\min _{\{a[k]\}} \sum_{n=q+1}^{\infty}|e[n]|^{2}=\min _{\{a[k]\}} \sum_{n=q+1}^{\infty}\left|x[n]+\sum_{k=1}^{p} a[k] x[n-k]\right|^{2}$$
There are two ways to derive Prony’s Method.
- The first is in terms of finding the least-squares solution to a set of overdetermined linear equations. Specifically, in Prony’s method we would like to find a set of coefficients $a_p(k)$ such that
$$x(n)+\sum_{k=1}^p a_p(k)x(n-k)=0;\quad n>q$$
i.e., we would like $e(n)$ to be equal to zero for $n > q$. In matrix form, these equations may be expressed as
$$\left[\begin{array}{cccc} x(q) & x(q-1) & \cdots & x(q-p+1) \\ x(q+1) & x(q) & \cdots & x(q-p+2) \\ x(q+2) & x(q+1) & \cdots & x(q-p+3) \\ \vdots & \vdots & & \vdots \end{array}\right]\left[\begin{array}{c} a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=-\left[\begin{array}{c} x(q+1) \\ x(q+2) \\ x(q+3) \\ \vdots \end{array}\right]$$
or equivalently, as
$$\mathbf X_q\,\overline{\mathbf a}_p=-\mathbf x_{q+1}$$
The least-squares solution is found by solving the normal equations
$$(\mathbf X_q^H \mathbf X_q)\,\overline{\mathbf a}_p=-\mathbf X_q^H\mathbf x_{q+1}$$
- The second is in terms of setting the partial derivatives equal to zero. Define
$$\mathcal{E}_{p, q}=\sum_{n=q+1}^{\infty}|e(n)|^{2}=\sum_{n=q+1}^{\infty}\left|x(n)+\sum_{l=1}^{p} a_{p}(l) x(n-l)\right|^{2}$$
By setting the partial derivatives of $\mathcal E_{p,q}$ with respect to $a_p^*(k)$ equal to zero, we obtain
$$\sum_{n=q+1}^{\infty} e(n)x^*(n-k)=0;\qquad k=1,2,\cdots,p$$
which is known as the orthogonality principle. Substituting $e(n)=x(n)+\sum_{k=1}^p a_p(k)x(n-k)$ and defining
$$r_x(k,l)=\sum_{n=q+1}^\infty x(n-l)x^*(n-k),$$
which is a conjugate symmetric function, $r_x(k,l)=r_x^*(l,k)$, we have
$$\sum_{l=1}^p a_p(l)r_x(k,l)=-r_x(k,0);\qquad k=1,2,\cdots,p.$$
These are referred to as the Prony normal equations. In matrix form,
$$\left[\begin{array}{ccccc} r_{x}(1,1) & r_{x}(1,2) & r_{x}(1,3) & \cdots & r_{x}(1, p) \\ r_{x}(2,1) & r_{x}(2,2) & r_{x}(2,3) & \cdots & r_{x}(2, p) \\ r_{x}(3,1) & r_{x}(3,2) & r_{x}(3,3) & \cdots & r_{x}(3, p) \\ \vdots & \vdots & \vdots & & \vdots \\ r_{x}(p, 1) & r_{x}(p, 2) & r_{x}(p, 3) & \cdots & r_{x}(p, p) \end{array}\right]\left[\begin{array}{c} a_{p}(1) \\ a_{p}(2) \\ a_{p}(3) \\ \vdots \\ a_{p}(p) \end{array}\right]=-\left[\begin{array}{c} r_{x}(1,0) \\ r_{x}(2,0) \\ r_{x}(3,0) \\ \vdots \\ r_{x}(p, 0) \end{array}\right]$$
which may be written concisely as
$$\mathbf R_x\,\overline{\mathbf a}_p=-\mathbf r_x$$
where $\mathbf R_x$ is a $p\times p$ Hermitian matrix.
- In fact, comparing the matrices $\mathbf X_q$ and $\mathbf R_x$, one finds that
$$\mathbf R_x=\mathbf X_q^H \mathbf X_q,$$
and similarly
$$\mathbf r_x=\mathbf X_q^H\mathbf x_{q+1}.$$
Thus, $\mathbf R_x\,\overline{\mathbf a}_p=-\mathbf r_x$ can also be written as $(\mathbf X_q^H \mathbf X_q)\,\overline{\mathbf a}_p=-\mathbf X_q^H\mathbf x_{q+1}$, which is the normal equation of the first approach.
Once the coefficients $a_p(k)$ have been found, the coefficients $b_q(n)$ can be obtained by
$$b_q(n)=x(n)+\sum_{k=1}^p a_p(k)x(n-k),\qquad n=0,\cdots,q.$$
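On a finite data record the infinite sums must be truncated; the sketch below (illustrative, the function name is mine) solves the overdetermined system in the least-squares sense with `numpy.linalg.lstsq` and then reads off $b_q(n)$:

```python
import numpy as np

def prony(x, p, q):
    """Prony's method on a finite record x[0..N-1] (sums truncated at N-1).

    Solves X_q a = -x_{q+1} in the least-squares sense, then sets
    b_q(n) = x(n) + sum_k a_p(k) x(n-k) for n = 0..q.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)

    def xs(n):                       # x(n), with x(n) = 0 for n < 0
        return x[n] if n >= 0 else 0.0

    # Rows n = q+1 .. N-1 of: x(n) + sum_k a(k) x(n-k) ~ 0
    Xq = np.array([[xs(n - k) for k in range(1, p + 1)]
                   for n in range(q + 1, N)])
    a_tail, *_ = np.linalg.lstsq(Xq, -x[q + 1:N], rcond=None)
    a = np.concatenate(([1.0], a_tail))
    b = np.array([sum(a[k] * xs(n - k) for k in range(p + 1))
                  for n in range(q + 1)])
    return a, b
```

For data that is exactly the impulse response of a $(p, q)$ model, the least-squares solution reproduces the Padé result with zero error.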
Alternatively, Shanks’ method switches back to $e'[n]$ and minimizes it over the entire data; see Hayes, §4.4.2.
The minimum error is (due to the orthogonality principle)
$$\epsilon_{p,q}=\|\mathbf e\|^2=\mathbf e^H\mathbf e=(\mathbf X_q\overline {\mathbf a}_p+\mathbf x_{q+1})^H\mathbf e=\mathbf x_{q+1}^H \mathbf e$$
This can further be written as
$$\epsilon_{p, q}=\mathbf{x}_{q+1}^{H}\left(\mathbf{X}_{q} \overline{\mathbf{a}}_p+\mathbf{x}_{q+1}\right)=\left(\mathbf{x}_{q+1}^{H} \mathbf{X}_{q}\right) \overline{\mathbf{a}}_p+\mathbf{x}_{q+1}^{H} \mathbf{x}_{q+1}$$
In terms of the autocorrelation sequence,
$$r_{x}(0,0)=\sum_{n=q+1}^{\infty} x^{*}[n] x[n]=\mathbf{x}_{q+1}^{H} \mathbf{x}_{q+1}, \quad r_{x}(0, k)=\sum_{n=q+1}^{\infty} x^{*}[n] x[n-k]=\left[\mathbf{x}_{q+1}^{H} \mathbf{X}_{q}\right]_{k}$$
this can be written as
$$\epsilon_{p, q}=r_{x}(0,0)+\sum_{k=1}^{p} a[k] r_{x}(0, k)$$
This can be combined with the equation $\mathbf R_x \overline {\mathbf a}_p=-\mathbf r_x$ as
$$\left[\begin{array}{c|cccc} r_{x}(0,0) & r_{x}(0,1) & r_{x}(0,2) & \cdots & r_{x}(0, p) \\ \hline r_{x}(1,0) & r_{x}(1,1) & r_{x}(1,2) & \cdots & r_{x}(1, p) \\ r_{x}(2,0) & r_{x}(2,1) & r_{x}(2,2) & \cdots & r_{x}(2, p) \\ \vdots & \vdots & \vdots & & \vdots \\ r_{x}(p, 0) & r_{x}(p, 1) & r_{x}(p, 2) & \cdots & r_{x}(p, p) \end{array}\right]\left[\begin{array}{c} 1 \\ \hline a[1] \\ a[2] \\ \vdots \\ a[p] \end{array}\right]=\left[\begin{array}{c} \epsilon_{p, q} \\ \hline 0 \\ 0 \\ \vdots \\ 0 \end{array}\right]$$
or, concisely, as
$$\mathbf R_x \mathbf a_p=\epsilon_{p,q}\mathbf u_1$$
where $\mathbf u_1=[1, 0, \cdots, 0]^T$.
These are the augmented normal equations.
Determine models: Special case of all-pole modeling
If $q=0$, we have an all-pole model:
$$H(z)=\frac{b(0)}{1+\sum_{k=1}^{p} a_{p}(k) z^{-k}}$$
Recall that the solution is given by solving the overdetermined system $(q=0)$
$$\left[\begin{array}{ccccc} x[0] & 0 & 0 & \cdots & 0 \\ x[1] & x[0] & 0 & \cdots & 0 \\ x[2] & x[1] & x[0] & \ddots & 0 \\ x[3] & x[2] & x[1] & \ddots & x[0] \\ \vdots & \vdots & \vdots & & \vdots \end{array}\right]\left[\begin{array}{c} a[1] \\ a[2] \\ a[3] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{c} x[1] \\ x[2] \\ x[3] \\ \vdots \end{array}\right]$$
The normal equations are obtained by premultiplying with $\mathbf X_0^H$:
$$\left[\begin{array}{cccc} x^{*}[0] & x^{*}[1] & x^{*}[2] & \cdots \\ 0 & \ddots & \ddots & \\ 0 & 0 & x^{*}[0] & \cdots \end{array}\right]\left[\begin{array}{ccc} x[0] & 0 & 0 \\ x[1] & \ddots & 0 \\ x[2] & \ddots & x[0] \\ \vdots & \vdots & \vdots \end{array}\right]\left[\begin{array}{c} a[1] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{cccc} x^{*}[0] & x^{*}[1] & x^{*}[2] & \cdots \\ 0 & \ddots & \ddots & \\ 0 & 0 & x^{*}[0] & \cdots \end{array}\right]\left[\begin{array}{c} x[1] \\ x[2] \\ x[3] \\ \vdots \end{array}\right]$$
From this structure it is seen that the normal equations become
$$\left[\begin{array}{cccc} r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p-1) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r_{x}(p-1) & r_{x}(p-2) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} a[1] \\ a[2] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{c} r_{x}(1) \\ r_{x}(2) \\ \vdots \\ r_{x}(p) \end{array}\right]$$
where the autocorrelation sequence is in fact “stationary”:
$$\begin{aligned} r_x(k,l)&=\sum_{n=0}^\infty x[n-k]x[n-l]^*=x[k-l]x[0]^*+x[k-l+1]x[1]^*+\cdots\\ &=\sum_{n=0}^\infty x[n+k-l]x[n]^*=r_x(k-l) \end{aligned}$$
For an all-pole model, the matrix $\mathbf R_x$ has a Toeplitz structure; it can be efficiently inverted using the Levinson recursion (later, in Ch. 5).
Finally, the minimum modeling error for the all-pole model is given by
$$\{\mathcal E_p\}_{\min}=\epsilon_p=r_x(0)+\sum_{k=1}^p a_p(k)r^*_x(k)$$
Incorporating it into the normal equations,
$$\left[\begin{array}{ccccc} r_{x}(0) & r_{x}^{*}(1) & r_{x}^{*}(2) & \cdots & r_{x}^{*}(p) \\ r_{x}(1) & r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p-1) \\ r_{x}(2) & r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-2) \\ \vdots & \vdots & \vdots & & \vdots \\ r_{x}(p) & r_{x}(p-1) & r_{x}(p-2) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} 1 \\ a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=\left[\begin{array}{c} \epsilon_{p} \\ 0 \\ 0 \\ \vdots \\ 0 \end{array}\right]$$
or
$$\mathbf R_x \mathbf a_p=\epsilon_p \mathbf u_1$$
The numerator is chosen as $b[0]=\sqrt{\epsilon_p}$. It can then be shown that
$$r_x(k)=r_h(k),\qquad |k|\le p.$$
Meaning: the autocorrelation sequence of the filter matches the first $p$ lags of the autocorrelation sequence of the given data (“moment matching”).
All-pole modeling with finite data
Since Prony’s method assumes that $x(n)$ is known for all $n$, the coefficients $a_p(k)$ are found by minimizing the sum of the squares of $e(n)$ for all $n > q$. What happens, however, when $x(n)$ is only known or measured for values of $n$ over a finite interval such as $[0, N]$?
Autocorrelation method
Assume that $x(n) = 0$ for $n < 0$ and for $n > N$, and then use Prony’s method to find the coefficients $a_p(k)$. In other words, we form a new signal $\tilde x(n)$ by applying a rectangular window to $x(n)$, i.e.,
$$\tilde x(n)=x(n)w_R(n)$$
where
$$w_R(n)=\left\{\begin{array}{ll}1; & n=0,1,\cdots,N\\ 0; & \text{otherwise} \end{array} \right.$$
and then use Prony’s method to find an all-pole model for $\tilde x(n)$:
$$r_x(k)=\sum_{n=0}^\infty \tilde x(n)\tilde x^*(n-k)=\sum_{n=k}^{N}x(n)x^*(n-k);\quad k=0,1,\cdots,p$$
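A minimal numerical sketch of the autocorrelation method (assuming real data; a generic solver stands in for the Levinson recursion, and the function name is mine):

```python
import numpy as np

def acm(x, p):
    """All-pole model by the autocorrelation method (real data).

    Forms r_x(k) = sum_n x(n) x(n-k) from the windowed data and solves
    the Toeplitz normal equations with a generic solver.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    # r_x(k) = sum_{n=k}^{N-1} x(n) x(n-k), k = 0..p
    r = np.array([np.dot(x[k:], x[:N - k]) for k in range(p + 1)])
    # Toeplitz normal-equation matrix: R[i, j] = r(|i - j|)
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.concatenate(([1.0], np.linalg.solve(R, -r[1:p + 1])))
    eps = r[0] + np.dot(a[1:], r[1:p + 1])   # minimum modeling error
    return a, eps
```

The returned `eps` is the minimum modeling error $\epsilon_p=r_x(0)+\sum_k a_p(k)r_x(k)$ for real data.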
It can also be interpreted as solving the overdetermined system
$$\left[\begin{array}{ccccc} x(0) & 0 & 0 & \cdots & 0 \\ x(1) & x(0) & 0 & \cdots & 0 \\ x(2) & x(1) & x(0) & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ \hline x(p-1) & x(p-2) & x(p-3) & \cdots & x(0) \\ x(p) & x(p-1) & x(p-2) & \cdots & x(1) \\ \vdots & \vdots & \vdots & & \vdots \\ x(N-2) & x(N-3) & x(N-4) & \cdots & x(N-p-1) \\ x(N-1) & x(N-2) & x(N-3) & \cdots & x(N-p) \\ \hline x(N) & x(N-1) & x(N-2) & \cdots & x(N-p+1) \\ 0 & x(N) & x(N-1) & \cdots & x(N-p+2) \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & x(N) \end{array}\right]\left[\begin{array}{c} a[1] \\ a[2] \\ a[3] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{c} x[1] \\ x[2] \\ x[3] \\ \vdots \\ x[N] \\ 0 \\ \vdots \\ 0 \end{array}\right]$$
Covariance method
If we know that $x[n]$ is not zero for $n < 0$ and $n > N$, it is more accurate to omit those equations that involve an extension with zeros:
$$\left[\begin{array}{ccc} x[p-1] & \cdots & x[0] \\ x[p] & \cdots & x[1] \\ \vdots & & \vdots \\ x[N-1] & \cdots & x[N-p] \end{array}\right]\left[\begin{array}{c} a[1] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{c} x[p] \\ x[p+1] \\ \vdots \\ x[N] \end{array}\right]$$
We obtain slightly different normal equations than before, with $r_x(k, l)$ now defined as
$$r_x(k,l)=\sum_{n=p}^N x(n-l)x^*(n-k)$$
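A corresponding sketch for the covariance method (illustrative; the normal equations are formed implicitly via a least-squares solve rather than explicitly):

```python
import numpy as np

def covm(x, p):
    """All-pole model by the covariance method (no zero-padding).

    Uses only the equations for n = p..N, so the data matrix contains
    no implicit zeros; the normal equations are not Toeplitz.
    """
    x = np.asarray(x, dtype=float)
    N = len(x) - 1                   # data known on [0, N]
    X = np.array([[x[n - k] for k in range(1, p + 1)]
                  for n in range(p, N + 1)])
    a_tail, *_ = np.linalg.lstsq(X, -x[p:N + 1], rcond=None)
    return np.concatenate(([1.0], a_tail))
```

For data that is exactly autoregressive, such as $x(n)=\beta^n$, this recovers the pole exactly, as the worked example below confirms.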
Unlike the autocorrelation normal equations, the covariance normal equations are not Toeplitz.
Examples
Pole estimation
Consider the finite data sequence
$$x=\left[1, \beta, \beta^{2}, \cdots, \beta^{N}\right]^{T}.$$
Estimate an $\mathrm{AR}(1)$ model for this signal $(p=1)$:
$$H(z)=\frac{b(0)}{1+a[1] z^{-1}}$$
Autocorrelation method
The normal equations collapse to $r_{x}(0)\, a[1]=-r_{x}(1)$, where
$$r_{x}(0)=\sum_{n=0}^{N}|\beta|^{2 n}=\frac{1-|\beta|^{2 N+2}}{1-|\beta|^{2}}, \qquad r_{x}(1)=\sum_{n=1}^{N} x(n) x^{*}(n-1)=\sum_{n=0}^{N-1} \beta|\beta|^{2 n}=\beta \frac{1-|\beta|^{2 N}}{1-|\beta|^{2}}$$
so that
$$a[1]=-\beta \frac{1-|\beta|^{2 N}}{1-|\beta|^{2 N+2}}$$
Note that $|a[1]|<|\beta|$ for finite $N$ (the estimated pole is biased toward the origin); as $N \rightarrow \infty$, $a[1] \rightarrow -\beta$.
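As a quick numerical check of the closed-form result (the values of $\beta$ and $N$ below are arbitrary example choices):

```python
# Verify a[1] = -beta (1 - beta^{2N}) / (1 - beta^{2N+2}) on a sample case
beta, N = 0.9, 6
x = [beta**n for n in range(N + 1)]
r0 = sum(xn * xn for xn in x)                        # r_x(0)
r1 = sum(x[n] * x[n - 1] for n in range(1, N + 1))   # r_x(1)
a1 = -r1 / r0
closed_form = -beta * (1 - beta**(2 * N)) / (1 - beta**(2 * N + 2))
assert abs(a1 - closed_form) < 1e-12
assert abs(a1) < beta            # pole biased toward the origin
```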
Covariance method
The normal equations collapse to $r_{x}(1,1)\, a[1]=-r_{x}(1,0)$, where
$$r_{x}(1,1)=\sum_{n=1}^{N}\left|x(n-1)\right|^{2}=\sum_{n=0}^{N-1}|\beta|^{2 n}=\frac{1-|\beta|^{2 N}}{1-|\beta|^{2}}, \qquad r_{x}(1,0)=\sum_{n=1}^{N} x(n) x^{*}(n-1)=\sum_{n=0}^{N-1} \beta|\beta|^{2 n}=\beta \frac{1-|\beta|^{2 N}}{1-|\beta|^{2}}$$
Thus, $a[1]=-\beta$. Set $b(0)=1$ so that $x(0)=\hat x(0)$; then
$$H(z)=\frac{1}{1-\beta z^{-1}}$$
The pole location is found exactly.
Channel inversion
Consider a communication channel with a known (estimated) transfer function $G(z)$.
At the receiver, we wish to equalize (invert) the channel using an equalizer $H(z)$:
$$G(z) H(z)=1 \quad \Leftrightarrow \quad g[n] * h[n]=\delta[n]$$
Furthermore, $H(z)$ must be causal and stable, typically FIR.
If $G(z)$ is not minimum-phase (it has zeros outside the unit circle), causal inversion is not possible, and we allow for a delay:
$$G(z) H(z)=z^{-M} \quad \Leftrightarrow \quad g[n] * h[n]=\delta[n-M]=: d[n]$$
Design of an FIR equalizer $H(z)$ of length $N$:
$$e[n]=d[n]-h[n] * g[n]=d[n]-\sum_{\ell=0}^{N-1} h[\ell] g[n-\ell]$$
Minimize
$$\mathcal{E}=\sum_{n=0}^{\infty}|e[n]|^{2}=\sum_{n=0}^{\infty}\left|d[n]-\sum_{\ell=0}^{N-1} h[\ell] g[n-\ell]\right|^{2}:$$
$$\min _{\{h[\ell]\}}\left\|\left[\begin{array}{ccc} g[0] & 0 & 0 \\ g[1] & \ddots & 0 \\ \vdots & \ddots & g[0] \\ g[N] & \ddots & g[1] \\ \vdots & \ddots & \vdots \end{array}\right]\left[\begin{array}{c} h[0] \\ \vdots \\ h[N-1] \end{array}\right]-\left[\begin{array}{c} d[0] \\ \vdots \\ d[N-1] \\ d[N] \\ \vdots \end{array}\right]\right\|^{2}$$
The LS solution of this overdetermined system $\mathbf{G}\mathbf{h}=\mathbf{d}$ is
$$\mathbf{h}=\mathbf{G}^{\dagger} \mathbf{d}=\left(\mathbf{G}^{H} \mathbf{G}\right)^{-1} \mathbf{G}^{H} \mathbf{d}$$
Also, the corresponding normal equations are $\left(\mathbf{G}^{H} \mathbf{G}\right) \mathbf{h}=\mathbf{G}^{H} \mathbf{d}$, or
$$\left[\begin{array}{ccc} r_{g}(0) & \cdots & r_{g}^{*}(N-1) \\ \vdots & & \vdots \\ r_{g}(N-1) & \cdots & r_{g}(0) \end{array}\right]\left[\begin{array}{c} h[0] \\ \vdots \\ h[N-1] \end{array}\right]=\left[\begin{array}{c} r_{d g}(0) \\ \vdots \\ r_{d g}(N-1) \end{array}\right]$$
where $r_{g}(k)=\sum_{n=0}^{\infty} g[n] g^{*}[n-k]$ and $r_{d g}(k)=\sum_{n=0}^{\infty} d[n] g^{*}[n-k]$. Because the matrix $\mathbf{R}_{g}=\mathbf{G}^{H} \mathbf{G}$ is Toeplitz, it can be inverted efficiently.
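The design can be sketched as follows (a hypothetical helper, not from the notes; it forms the convolution matrix $\mathbf G$ explicitly and lets `lstsq` handle the normal equations):

```python
import numpy as np

def fir_equalizer(g, N, M):
    """Length-N FIR least-squares equalizer for channel g, target delay M.

    Builds the convolution matrix G (columns are shifts of g) and solves
    G h ~ d = delta[n - M] in the least-squares sense.
    """
    g = np.asarray(g, dtype=float)
    L = len(g) + N - 1               # length of g * h
    G = np.zeros((L, N))
    for col in range(N):             # column `col` holds g delayed by `col`
        G[col:col + len(g), col] = g
    d = np.zeros(L)
    d[M] = 1.0
    h, *_ = np.linalg.lstsq(G, d, rcond=None)
    return h
```

For a minimum-phase channel such as $g=[1, 0.5]$ the delay $M=0$ already works well; for non-minimum-phase channels a larger $M$ typically reduces the residual substantially.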
Stochastic models
We now consider modeling a stochastic process $x[n]$ via a rational filter $H(z)$.
Instead of minimizing $\sum\left|e^{\prime}[n]\right|^{2}$, it makes more sense to minimize
$$E\left(\left|e^{\prime}[n]\right|^{2}\right)=E\left(|x[n]-\hat x[n]|^{2}\right)$$
As before, minimizing this error is not tractable.
Instead, we can try to match the correlation sequence $r_x(k)$ of $x[n]$ with the correlation sequence of $\hat{x}[n]$.
Autoregressive moving average processes
Suppose we filter white noise $v(n)$ of variance $\sigma_{v}^{2}$ with the filter
$$H(z)=\frac{B(z)}{A(z)}=\frac{\sum_{k=0}^{q} b(k) z^{-k}}{1+\sum_{k=1}^{p} a(k) z^{-k}}$$
Assuming that the filter is stable, the output process $x(n)$ will be wide-sense stationary, and the power spectrum of $x(n)$ can then be written as
P_{x}(z)=\sigma_{v}^{2} \frac{B(z) B^{*}\left(1 / z^{*}\right)}{A(z) A^{*}\left(1 / z^{*}\right)}, \quad P_{x}\left(e^{j \omega}\right)=\sigma_{v}^{2} \frac{\left|B\left(e^{j \omega}\right)\right|^{2}}{\left|A\left(e^{j \omega}\right)\right|^{2}}
- Such a process is known as an autoregressive moving average process of order $(p, q)$, or $\mathrm{ARMA}(p, q)$.
- The power spectrum of an $\mathrm{ARMA}(p, q)$ process has $2p$ poles and $2q$ zeros with conjugate reciprocal symmetry.
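The power spectrum formula above can be evaluated numerically straight from the coefficient vectors; a minimal sketch with hypothetical ARMA(1,1) coefficients:

```python
import numpy as np

def arma_power_spectrum(b, a, sigma_v2, omega):
    """Evaluate P_x(e^{jw}) = sigma_v^2 |B(e^{jw})|^2 / |A(e^{jw})|^2."""
    z = np.exp(1j * np.asarray(omega, dtype=float))
    # B(e^{jw}) = sum_k b(k) e^{-jwk}, and likewise for A with a(0) = 1
    B = sum(bk * z ** (-k) for k, bk in enumerate(b))
    A = sum(ak * z ** (-k) for k, ak in enumerate(a))
    return sigma_v2 * np.abs(B) ** 2 / np.abs(A) ** 2

# Hypothetical coefficients: B(z) = 1 + 0.5 z^-1, A(z) = 1 - 0.9 z^-1
Pw = arma_power_spectrum(b=[1.0, 0.5], a=[1.0, -0.9], sigma_v2=2.0,
                         omega=[0.0, np.pi])
```

At $\omega=0$ this evaluates to $\sigma_v^2 (\sum_k b(k))^2 / (\sum_k a(k))^2$, which is a quick sanity check on the implementation.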
Yule-Walker equations
From the LCCDE between $v(n)$ and $x(n)=h(n) * v(n)$:
x(n)+\sum_{\ell=1}^{p} a(\ell) x(n-\ell)=\sum_{\ell=0}^{q} b(\ell) v(n-\ell)
we can multiply both sides with $x^{*}(n-k)$ and take the expectation:
r_{x}(k)+\sum_{\ell=1}^{p} a(\ell) r_{x}(k-\ell)=\sum_{\ell=0}^{q} b(\ell) E\left\{v(n-\ell) x^{*}(n-k)\right\}=\sum_{\ell=0}^{q} b(\ell) r_{v x}(k-\ell)
The cross-correlation between $v(n)$ and $x(n)$ can further be expressed as
r_{v x}(k-\ell)=E\left\{v(k) x^{*}(\ell)\right\}=\sum_{m=-\infty}^{\infty} E\left\{v(k) v^{*}(\ell-m)\right\} h^{*}(m)=\sigma_{v}^{2} h^{*}(\ell-k)
For $k \geq 0$, this leads to the Yule-Walker equations
r_{x}(k)+\sum_{\ell=1}^{p} a(\ell) r_{x}(k-\ell)=\left\{\begin{array}{ll} \sigma_{v}^{2} \sum_{\ell=0}^{q} b(\ell) h^{*}(\ell-k)=\sigma_{v}^{2} c(k), & 0 \leq k \leq q \\ 0, & k>q \end{array}\right.
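The generating LCCDE translates directly into a simulation loop. Below is a minimal sketch (the coefficients are hypothetical); driving it with a unit impulse instead of white noise exposes the impulse response $h(n)$:

```python
import numpy as np

def arma_filter(a, b, v):
    """Run the LCCDE x(n) = -sum_{l=1}^{p} a(l) x(n-l) + sum_{l=0}^{q} b(l) v(n-l).

    a : [a(1), ..., a(p)]  (the leading coefficient 1 is implicit)
    b : [b(0), ..., b(q)]
    v : driving sequence; white noise of variance sigma_v^2 gives an ARMA process
    """
    x = np.zeros(len(v))
    for n in range(len(v)):
        ar = sum(a[l - 1] * x[n - l] for l in range(1, len(a) + 1) if n >= l)
        ma = sum(b[l] * v[n - l] for l in range(len(b)) if n >= l)
        x[n] = ma - ar
    return x

# Driving with a unit impulse exposes the impulse response h(n)
h = arma_filter(a=[-0.5], b=[1.0], v=[1.0, 0.0, 0.0, 0.0])
```

With $a(1)=-0.5$ the recursion is $x(n)=0.5\,x(n-1)+v(n)$, so the impulse response decays geometrically.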
Autoregressive processes
An $\mathrm{ARMA}(p, 0)$ process is an autoregressive process, or $\mathrm{AR}(p)$:
H(z)=\frac{b(0)}{1+\sum_{k=1}^{p} a(k) z^{-k}}, \quad P_{x}(z)=\sigma_{v}^{2} \frac{|b(0)|^{2}}{A(z) A^{*}\left(1 / z^{*}\right)}, \quad P_{x}\left(e^{j \omega}\right)=\sigma_{v}^{2} \frac{|b(0)|^{2}}{\left|A\left(e^{j \omega}\right)\right|^{2}}
The Yule-Walker equations are given by
r_{x}(k)+\sum_{\ell=1}^{p} a(\ell) r_{x}(k-\ell)=\sigma_{v}^{2}|b(0)|^{2} \delta(k), \quad k \geq 0
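As a quick sanity check (a hypothetical AR(1) toy case, not from the text), the equations for $k=1,\ldots,p$ determine the $a(\ell)$, and the $k=0$ equation then gives $\sigma_{v}^{2}|b(0)|^{2}$:

```python
import numpy as np

def yule_walker_ar(r, p):
    """Solve r_x(k) + sum_l a(l) r_x(k-l) = 0 for k = 1..p (real-valued case).

    r : [r_x(0), ..., r_x(p)]
    Returns (a, s2) with s2 = sigma_v^2 |b(0)|^2 from the k = 0 equation.
    """
    R = np.array([[r[abs(k - l)] for l in range(1, p + 1)]
                  for k in range(1, p + 1)])
    a = np.linalg.solve(R, -np.asarray(r[1:p + 1]))
    s2 = r[0] + np.dot(a, r[1:p + 1])
    return a, s2

# Hypothetical AR(1): a(1) = -0.8 implies r_x(k) proportional to 0.8^|k|
a, s2 = yule_walker_ar([1.0, 0.8, 0.64], p=1)
```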
Moving average processes
An $\mathrm{ARMA}(0, q)$ process is a moving average process, or $\mathrm{MA}(q)$:
H(z)=B(z)=\sum_{k=0}^{q} b(k) z^{-k}, \quad P_{x}(z)=\sigma_{v}^{2} B(z) B^{*}\left(1 / z^{*}\right), \quad P_{x}\left(e^{j \omega}\right)=\sigma_{v}^{2}\left|B\left(e^{j \omega}\right)\right|^{2}
The Yule-Walker equations are given by
r_{x}(k)=\sigma_{v}^{2} \sum_{\ell=0}^{q} b(\ell) b^{*}(k-\ell)=\sigma_{v}^{2}\, b(k) * b^{*}(-k)
The autocorrelation function is zero outside $[-q, q]$. Estimating $b(k)$ from the Yule-Walker equations is not easy, since they are nonlinear in the coefficients $b(k)$.
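In the forward direction, however, the MA autocorrelation is just the scaled deterministic autocorrelation of the coefficient sequence; a quick numeric illustration with hypothetical MA(2) coefficients:

```python
import numpy as np

# Hypothetical MA(2) coefficients and driving-noise variance
b = np.array([1.0, 0.5, -0.25])
sigma_v2 = 2.0
q = len(b) - 1

# r_x(k) = sigma_v^2 * (b(k) * b*(-k)); for real b this is an autocorrelation
r_full = sigma_v2 * np.correlate(b, b, mode="full")   # lags k = -q .. q
r_x = dict(zip(range(-q, q + 1), r_full))
```

All lags outside $[-q, q]$ vanish identically, as stated above, so only the $2q+1$ central values need to be stored.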
Stochastic model: Pade
In matrix form, the Yule-Walker equations become
\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p) \\ \hline r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(L) & r_{x}(L-1) & \cdots & r_{x}(L-p)\\ \vdots & \vdots & & \vdots \\ \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ a(2)\\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2} \left[\begin{array}{c} c(0) \\ c(1) \\ \vdots \\ c(q) \\ \hline 0 \\ \vdots \\ 0\\ \vdots \end{array}\right]
Keep the first $p+q+1$ equations, i.e. those for $k=0,1,\cdots,p+q$, because we only have $p+q+1$ unknowns:
\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p) \\ \hline r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q+p) & r_{x}(q+p-1) & \cdots & r_{x}(q) \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2} \left[\begin{array}{c} c(0) \\ c(1) \\ \vdots \\ c(q) \\ \hline 0 \\ \vdots \\ 0 \end{array}\right]
We can solve for $a(1),\cdots,a(p)$ using the bottom half of the equations,
\left[\begin{array}{cccc} r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p+1) \\ r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+2) \\ \vdots & \vdots & & \vdots \\ r_{x}(q+p-1) & r_{x}(q+p-2) & \cdots & r_{x}(q) \end{array}\right]\left[\begin{array}{c} a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=-\left[\begin{array}{c} r_{x}(q+1) \\ r_{x}(q+2) \\ \vdots \\ r_{x}(q+p) \end{array}\right]
and plug them back in to obtain $c(0),\cdots,c(q)$:
\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p) \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2}\left[\begin{array}{c} c(0) \\ c(1) \\ \vdots \\ c(q) \end{array}\right]
Since $c(k)=b(k) * h^{*}(-k)$, where $h^{*}(-k)$ depends on $b(k)$ as well, estimating $b(k)$ from the Yule-Walker equations is not easy.
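The two-step procedure for the denominator, solving the bottom $p$ equations for the $a(\ell)$ and substituting into the top $q+1$ rows, can be sketched as follows (the autocorrelation model below is hypothetical, not from the text):

```python
import numpy as np

def modified_yule_walker(r, p, q):
    """Two-step solve: a(1..p) from rows k = q+1..q+p, then sigma_v^2 c(0..q).

    r : callable returning r_x(k) for any integer k
    """
    # Bottom half: r(q+k) + sum_l a(l) r(q+k-l) = 0 for k = 1..p
    R = np.array([[r(q + k - l) for l in range(1, p + 1)]
                  for k in range(1, p + 1)])
    a = np.linalg.solve(R, -np.array([r(q + k) for k in range(1, p + 1)]))
    # Top rows: sigma_v^2 c(k) = r(k) + sum_l a(l) r(k-l) for k = 0..q
    c = np.array([r(k) + sum(a[l - 1] * r(k - l) for l in range(1, p + 1))
                  for k in range(q + 1)])
    return a, c

r = lambda k: 0.8 ** abs(k)     # hypothetical AR(1)-like autocorrelation
a, c = modified_yule_walker(r, p=1, q=0)
```

The returned `c` contains $\sigma_{v}^{2} c(k)$; recovering the $b(k)$ from it remains the hard, nonlinear part of the problem.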
Stochastic model: Prony
Prony follows by taking $L>p+q$; this results in an overdetermined set of equations for estimating $a(k)$:
\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p) \\ \hline r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(L) & r_{x}(L-1) & \cdots & r_{x}(L-p) \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ a(2)\\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2} \left[\begin{array}{c} c(0) \\ c(1) \\ \vdots \\ c(q) \\ \hline 0 \\ \vdots \\ 0 \end{array}\right]
From the last $L-q$ equations we have
\left[\begin{array}{cccc} r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p+1) \\ r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+2) \\ \vdots & \vdots & & \vdots \\ r_{x}(L-1) & r_{x}(L-2) & \cdots & r_{x}(L-p) \end{array}\right]\left[\begin{array}{c} a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=-\left[\begin{array}{c} r_{x}(q+1) \\ r_{x}(q+2) \\ \vdots \\ r_{x}(L) \end{array}\right]
or $\mathbf{R}_{q} \overline{\mathbf{a}}=-\mathbf{r}_{q+1}$.
We may find the least squares solution by solving
\left(\mathbf{R}_{q}^{H} \mathbf{R}_{q}\right) \overline{\mathbf{a}}=-\mathbf{R}_{q}^{H} \mathbf{r}_{q+1}
where $\mathbf{R}_{q}^{H} \mathbf{R}_{q}$ is a $p \times p$ Hermitian matrix (in general no longer Toeplitz).
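In practice the overdetermined system can be handed to a least-squares solver directly; a compact sketch with a hypothetical autocorrelation model (not from the text):

```python
import numpy as np

def prony_stochastic(r, p, q, L):
    """LS estimate of a(1..p) from the last L - q Yule-Walker equations.

    r : callable returning r_x(k)
    """
    # Rows k = q+1..L:  sum_l a(l) r(k-l) = -r(k)
    R = np.array([[r(k - l) for l in range(1, p + 1)]
                  for k in range(q + 1, L + 1)])
    rhs = -np.array([r(k) for k in range(q + 1, L + 1)])
    a, *_ = np.linalg.lstsq(R, rhs, rcond=None)
    return a

r = lambda k: 0.8 ** abs(k)     # hypothetical AR(1)-like autocorrelation
a = prony_stochastic(r, p=1, q=0, L=5)
```

`np.linalg.lstsq` solves the normal equations internally, so the explicit $\mathbf{R}_{q}^{H} \mathbf{R}_{q}$ product never needs to be formed.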
SM-Special case: All-pole modeling
For autoregressive processes, stack the Yule-Walker equations for $k=0,1, \ldots, p$:
\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(p) & r_{x}(p-1) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2}|b(0)|^{2}\left[\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right]
Using the equations for $k=1, \ldots, p$, it can also be written as
\left[\begin{array}{cccc} r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p-1) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-2) \\ \vdots & \vdots & & \vdots \\ r_{x}(p-1) & r_{x}(p-2) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=-\left[\begin{array}{c} r_{x}(1) \\ r_{x}(2) \\ \vdots \\ r_{x}(p) \end{array}\right]
This is exactly the same as the normal equations for [all-pole modeling with Prony’s method](# DM-Special case: All-pole modeling) in the deterministic case, except that here $r_{x}(k)$ is a statistical autocorrelation.
If it is estimated from the data as $\hat{r}_{x}(k)=\frac{1}{N} \sum_{n=0}^{N-1} x[n] x^{*}[n-k]$, the methods become identical. Also:
\sigma_{v}^{2}|b(0)|^{2}=r_{x}(0)+\sum_{k=1}^{p} a(k) r_{x}(-k)
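A sketch of the complete pipeline on synthetic data (the AR(1) parameters below are hypothetical): estimate the autocorrelation from samples, then solve the normal equations. With enough samples the estimates approach the true coefficients:

```python
import numpy as np

def estimate_ar(x, p):
    """All-pole model from data: estimate r_x, then solve the normal equations."""
    N = len(x)
    # Biased autocorrelation estimate r_hat(k) = (1/N) sum_n x[n] x*[n-k]
    r = np.array([np.dot(x[k:], np.conj(x[:N - k])).real / N
                  for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, -r[1:])
    b0_sq = r[0] + np.dot(a, r[1:])      # estimates sigma_v^2 |b(0)|^2
    return a, b0_sq

# Hypothetical AR(1) process x(n) = 0.8 x(n-1) + v(n)
rng = np.random.default_rng(0)
v = rng.standard_normal(20000)
x = np.zeros_like(v)
for n in range(1, len(v)):
    x[n] = 0.8 * x[n - 1] + v[n]
a, b0_sq = estimate_ar(x, p=1)           # a[0] should be close to -0.8
```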
Summary
- Deterministic model: Pade
- Deterministic model: Prony
- Deterministic model: Shank
- DM-Special case: All-pole modeling
- DM-Special case: All-pole modeling with finite data