Signal models

In this chapter, a signal is modeled as the output of a causal linear shift-invariant filter with a rational system function of the form
$$H(z)=\frac{B_{q}(z)}{A_{p}(z)}=\frac{\sum_{k=0}^{q} b_{q}(k)\, z^{-k}}{1+\sum_{k=1}^{p} a_{p}(k)\, z^{-k}}$$

Signal Modeling

Model identification

For deterministic signals: given observations $x[n],\ n=0,\dots,N-1$, and filter orders $p, q$, find the parameters of $H(z)$ such that the modeled output signal $\hat x[n] = h[n]$ best matches the observations.
For stochastic signals, we will instead try to match the correlation sequences: $r_x[k] = r_h[k]$.

Determine models: Least Squares


Minimizing the error with the least-squares criterion:
$$\mathcal E_{LS}=\sum_{n=0}^\infty |x[n]-h[n]|^2$$
Using Parseval's theorem, the least-squares error may be written in the frequency domain:
$$\mathcal E_{LS}=\frac{1}{2\pi}\int_{-\pi}^{\pi}\left|X(e^{j\omega})-H(e^{j\omega})\right|^2 d\omega$$
This representation allows us to express $\mathcal{E}_{LS}$ explicitly in terms of the model parameters $a_p(k)$ and $b_q(k)$, and to set the derivatives to zero:
$$\begin{cases} \dfrac{\partial \mathcal{E}_{LS}}{\partial a^{*}[k]}=0, & k=1,\dots,p \\[6pt] \dfrac{\partial \mathcal{E}_{LS}}{\partial b^{*}[k]}=0, & k=0,\dots,q \end{cases}$$
The resulting equations are nonlinear in the model parameters: not mathematically tractable, and not amenable to real-time signal processing applications.

An alternative that leads to tractable results is to consider a “weighted” error instead.

Since $X(z)-H(z)=X(z)-B_q(z)/A_p(z)$ contains a fraction, we consider $A_p(z)\left(X(z)-H(z)\right)=A_p(z)X(z)-B_q(z)$, i.e.,
$$E(z)=A_p(z)\,E'(z)=A_p(z)X(z)-B_q(z)$$


Now the error equation is linear in the model parameters.

Based on this, we can develop techniques such as the Padé approximation, Prony's method, and Shanks' method.

Determine models: Padé Approximation

We have $p+q+1$ model parameters. They can be solved for by matching $p+q+1$ signal samples exactly:
$$h[n]=x[n], \qquad n=0,\dots,p+q$$
From the definition of $H(z)$,
$$H(z)=\frac{B(z)}{A(z)} \;\Rightarrow\; H(z)A(z)=B(z) \;\Rightarrow\; h[n]*a[n]=b[n] \;\Rightarrow\; h[n]+\sum_{k=1}^{p} a[k]\, h[n-k]=b[n]$$
where $h[n]=0$ for $n<0$, and $b[n]=0$ for $n<0$ or $n>q$.

Since we assume $h[n]=x[n]$,
$$x[n]+\sum_{k=1}^{p} a[k]\,x[n-k]=\begin{cases} b[n], & n=0,\dots,q \\ 0, & n=q+1,\dots,p+q \end{cases}$$
These equations can be rearranged in a matrix form:
$$\left[\begin{array}{cccc} x(0) & 0 & \cdots & 0 \\ x(1) & x(0) & \cdots & 0 \\ x(2) & x(1) & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ x(q) & x(q-1) & \cdots & x(q-p) \\ \hline x(q+1) & x(q) & \cdots & x(q-p+1) \\ \vdots & \vdots & & \vdots \\ x(q+p) & x(q+p-1) & \cdots & x(q) \end{array}\right]\left[\begin{array}{c} 1 \\ a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=\left[\begin{array}{c} b_{q}(0) \\ b_{q}(1) \\ b_{q}(2) \\ \vdots \\ b_{q}(q) \\ \hline 0 \\ \vdots \\ 0 \end{array}\right]$$

  • In the first step of the Padé approximation method, to solve for the coefficients $a_p(k)$ we use the last $p$ equations indicated by the partitioning, i.e.,
    $$\left[\begin{array}{cccc} x(q+1) & x(q) & \cdots & x(q-p+1) \\ x(q+2) & x(q+1) & \cdots & x(q-p+2) \\ \vdots & \vdots & & \vdots \\ x(q+p) & x(q+p-1) & \cdots & x(q) \end{array}\right]\left[\begin{array}{c} 1 \\ a_{p}(1) \\ \vdots \\ a_{p}(p) \end{array}\right]=\left[\begin{array}{c} 0 \\ 0 \\ \vdots \\ 0 \end{array}\right]$$
    This can be rearranged as
    $$\left[\begin{array}{cccc} x(q) & x(q-1) & \cdots & x(q-p+1) \\ x(q+1) & x(q) & \cdots & x(q-p+2) \\ \vdots & \vdots & & \vdots \\ x(q+p-1) & x(q+p-2) & \cdots & x(q) \end{array}\right]\left[\begin{array}{c} a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=-\left[\begin{array}{c} x(q+1) \\ x(q+2) \\ \vdots \\ x(q+p) \end{array}\right]$$
    Using matrix notation it may be expressed concisely as
    $$\mathbf X_q\,\overline{\mathbf a}_p=-\mathbf x_{q+1}$$

  • Plug the solution back into the first $q+1$ equations to find the $b_q(k)$:
    $$\left[\begin{array}{ccccc} x(0) & 0 & 0 & \cdots & 0 \\ x(1) & x(0) & 0 & \cdots & 0 \\ x(2) & x(1) & x(0) & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ x(q) & x(q-1) & x(q-2) & \cdots & x(q-p) \end{array}\right]\left[\begin{array}{c} 1 \\ a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=\left[\begin{array}{c} b_{q}(0) \\ b_{q}(1) \\ b_{q}(2) \\ \vdots \\ b_{q}(q) \end{array}\right],$$
    which can also be written in matrix form:
    $$\mathbf X_0\, \mathbf a_p=\mathbf b_q$$
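As an illustrative sketch (in NumPy; `pade` is a hypothetical helper name, not from the text), the two-step solve looks like this, assuming the signal has at least $p+q+1$ samples:

```python
import numpy as np

def pade(x, p, q):
    """Pade approximation: match x[0..p+q] exactly with a (p, q) model.

    Returns a = [1, a(1), ..., a(p)] and b = [b(0), ..., b(q)].
    """
    x = np.asarray(x, dtype=float)
    # Step 1: the last p equations, X_q abar = -x_{q+1},
    # with X_q[i, j] = x(q + i - j) (zero for negative indices)
    Xq = np.array([[x[q + i - j] if q + i - j >= 0 else 0.0
                    for j in range(p)] for i in range(p)])
    abar = np.linalg.solve(Xq, -x[q + 1:q + 1 + p])
    a = np.concatenate(([1.0], abar))
    # Step 2: the first q+1 equations give b(n) = sum_k a(k) x(n-k)
    b = np.array([sum(a[k] * x[n - k] for k in range(p + 1) if n - k >= 0)
                  for n in range(q + 1)])
    return a, b
```

For example, the impulse response of $H(z)=(1+2z^{-1})/(1-0.5z^{-1})$ starts $1, 2.5, 1.25$, and `pade([1, 2.5, 1.25], 1, 1)` recovers $a=[1,-0.5]$ and $b=[1,2]$.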


Disadvantages of the Padé approximation

  • The resulting model is not guaranteed to be stable
  • Outside the matched interval $[0, p+q]$, the approximation may be very poor
  • If $\mathbf X_q$ is singular, a solution that matches all signal values does not always exist. (Sometimes a lower-order solution exists for $A(z)$ and $B(z)$, i.e., smaller $p, q$, but then there is no guarantee of matching samples beyond $p+q$.)

Determine models: Prony’s Method

As before, $E(z)=A_p(z)E'(z)=A_p(z)X(z)-B_q(z)$, i.e.,
$$e[n]=\begin{cases} x[n]+\sum_{k=1}^{p} a[k]\,x[n-k]-b[n], & n=0,\dots,q \\ x[n]+\sum_{k=1}^{p} a[k]\,x[n-k], & n>q \end{cases}$$

  • For Padé, we first solved $e[n]=0,\ n=q+1,\dots,q+p$ to find $A(z)$.

  • For Prony, solve
    $$\min_{\{a[k]\}} \sum_{n=q+1}^{\infty}|e[n]|^{2}=\min_{\{a[k]\}} \sum_{n=q+1}^{\infty}\left|x[n]+\sum_{k=1}^{p} a[k]\,x[n-k]\right|^{2}$$


There are two ways to derive Prony’s Method.

  • In terms of finding the least-squares solution to a set of overdetermined linear equations. Specifically, in Prony's method we would like to find a set of coefficients $a_p(k)$ such that
    $$x(n)+\sum_{k=1}^p a_p(k)\,x(n-k)=0, \qquad n>q$$
    i.e., we would like $e(n)$ to be equal to zero for $n>q$. In matrix form, these equations may be expressed as
    $$\left[\begin{array}{cccc} x(q) & x(q-1) & \cdots & x(q-p+1) \\ x(q+1) & x(q) & \cdots & x(q-p+2) \\ x(q+2) & x(q+1) & \cdots & x(q-p+3) \\ \vdots & \vdots & & \vdots \end{array}\right]\left[\begin{array}{c} a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=-\left[\begin{array}{c} x(q+1) \\ x(q+2) \\ x(q+3) \\ \vdots \end{array}\right]$$
    or equivalently, as
    $$\mathbf X_q\,\overline{\mathbf a}_p=-\mathbf x_{q+1}$$
    The least-squares solution is found by solving the normal equations
    $$\left(\mathbf X_q^H \mathbf X_q\right)\overline{\mathbf a}_p=-\mathbf X_q^H\mathbf x_{q+1}$$

  • In terms of setting the partial derivatives equal to zero. Define
    $$\mathcal{E}_{p,q}=\sum_{n=q+1}^{\infty}|e(n)|^{2}=\sum_{n=q+1}^{\infty}\left|x(n)+\sum_{l=1}^{p} a_{p}(l)\, x(n-l)\right|^{2}$$
    By setting the partial derivatives of $\mathcal E_{p,q}$ with respect to $a_p^*(k)$ equal to zero, we obtain
    $$\sum_{n=q+1}^{\infty} e(n)\,x^*(n-k)=0, \qquad k=1,2,\dots,p$$
    which is known as the orthogonality principle.

    Substituting $e(n)=x(n)+\sum_{k=1}^p a_p(k)\,x(n-k)$ and defining
    $$r_x(k,l)=\sum_{n=q+1}^\infty x(n-l)\,x^*(n-k),$$
    which is a conjugate symmetric function, $r_x(k,l)=r_x^*(l,k)$, we have
    $$\sum_{l=1}^p a_p(l)\,r_x(k,l)=-r_x(k,0), \qquad k=1,2,\dots,p.$$
    These are referred to as the Prony normal equations. In matrix form,
    $$\left[\begin{array}{cccc} r_{x}(1,1) & r_{x}(1,2) & \cdots & r_{x}(1,p) \\ r_{x}(2,1) & r_{x}(2,2) & \cdots & r_{x}(2,p) \\ \vdots & \vdots & & \vdots \\ r_{x}(p,1) & r_{x}(p,2) & \cdots & r_{x}(p,p) \end{array}\right]\left[\begin{array}{c} a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=-\left[\begin{array}{c} r_{x}(1,0) \\ r_{x}(2,0) \\ \vdots \\ r_{x}(p,0) \end{array}\right]$$
    which may be written concisely as follows
    $$\mathbf R_x\,\overline{\mathbf a}_p=-\mathbf r_x$$
    where $\mathbf R_x$ is a $p\times p$ Hermitian matrix.

  • In fact, comparing the matrices $\mathbf X_q$ and $\mathbf R_x$, it can be seen that
    $$\mathbf R_x=\mathbf X_q^H \mathbf X_q,$$
    and similarly
    $$\mathbf r_x=\mathbf X_q^H\mathbf x_{q+1}$$
    Thus, $\mathbf R_x\overline{\mathbf a}_p=-\mathbf r_x$ can also be written as $(\mathbf X_q^H \mathbf X_q)\overline{\mathbf a}_p=-\mathbf X_q^H\mathbf x_{q+1}$, which is the normal equation from the first derivation.


Once the coefficients $a_p(k)$ have been found, the coefficients $b_q(n)$ can be obtained from
$$b_q(n)=x(n)+\sum_{k=1}^p a_p(k)\,x(n-k), \qquad n=0,\dots,q.$$
Alternatively, Shanks' method switches back to $e'[n]$ and minimizes over the entire data record; see Hayes, Section 4.4.2.


The minimum error is (by the orthogonality principle)
$$\epsilon_{p,q}=\|\mathbf e\|^2=\mathbf e^H\mathbf e=\left(\mathbf X_q\,\overline{\mathbf a}_p+\mathbf x_{q+1}\right)^H\mathbf e=\mathbf x_{q+1}^H \mathbf e$$
This can further be written as
$$\epsilon_{p,q}=\mathbf{x}_{q+1}^{H}\left(\mathbf{X}_{q}\,\overline{\mathbf{a}}_p+\mathbf{x}_{q+1}\right)=\left(\mathbf{x}_{q+1}^{H}\mathbf{X}_{q}\right)\overline{\mathbf{a}}_p+\mathbf{x}_{q+1}^{H}\mathbf{x}_{q+1}$$
In terms of the autocorrelation sequence, with
$$r_{x}(0,0)=\sum_{n=q+1}^{\infty} x^{*}[n]\,x[n]=\mathbf{x}_{q+1}^{H}\mathbf{x}_{q+1}, \qquad r_{x}(0,k)=\sum_{n=q+1}^{\infty} x^{*}[n]\,x[n-k]=\left[\mathbf{x}_{q+1}^{H}\mathbf{X}_{q}\right]_{k}$$
this can be written as
$$\epsilon_{p,q}=r_{x}(0,0)+\sum_{k=1}^{p} a[k]\,r_{x}(0,k)$$
This can be combined with the equation $\mathbf R_x \overline{\mathbf a}_p=-\mathbf r_x$ as
$$\left[\begin{array}{c|cccc} r_{x}(0,0) & r_{x}(0,1) & r_{x}(0,2) & \cdots & r_{x}(0,p) \\ \hline r_{x}(1,0) & r_{x}(1,1) & r_{x}(1,2) & \cdots & r_{x}(1,p) \\ r_{x}(2,0) & r_{x}(2,1) & r_{x}(2,2) & \cdots & r_{x}(2,p) \\ \vdots & \vdots & \vdots & & \vdots \\ r_{x}(p,0) & r_{x}(p,1) & r_{x}(p,2) & \cdots & r_{x}(p,p) \end{array}\right]\left[\begin{array}{c} 1 \\ \hline a[1] \\ a[2] \\ \vdots \\ a[p] \end{array}\right]=\left[\begin{array}{c} \epsilon_{p,q} \\ \hline 0 \\ 0 \\ \vdots \\ 0 \end{array}\right]$$
or, written concisely,
$$\mathbf R_x\, \mathbf a_p=\epsilon_{p,q}\,\mathbf u_1$$
where $\mathbf u_1=[1,0,\dots,0]^T$ is the first unit vector. These are the augmented normal equations.
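The whole procedure (normal equations for $a_p(k)$, direct computation of $b_q(n)$, and the minimum error) can be sketched in NumPy as follows. This is a minimal sketch, not a production routine: `prony` is a hypothetical helper name, and the infinite sums are truncated at the available data length.

```python
import numpy as np

def prony(x, p, q):
    """Prony's method with the sums truncated to the data length.

    Returns a = [1, a(1), ..., a(p)], b = [b(0), ..., b(q)], and the
    (truncated) minimum error eps.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Overdetermined system X_q abar = -x_{q+1}, rows n = q+1 .. N-1
    Xq = np.array([[x[q + i - j] if 0 <= q + i - j < N else 0.0
                    for j in range(p)] for i in range(N - 1 - q)])
    xq1 = x[q + 1:]
    abar = np.linalg.lstsq(Xq, -xq1, rcond=None)[0]
    a = np.concatenate(([1.0], abar))
    # Numerator from the first q+1 equations: b(n) = sum_k a(k) x(n-k)
    b = np.array([sum(a[k] * x[n - k] for k in range(p + 1) if n - k >= 0)
                  for n in range(q + 1)])
    # Minimum error: eps = x_{q+1}^H (X_q abar + x_{q+1})
    eps = float(xq1 @ (Xq @ abar + xq1))
    return a, b, eps
```

For data that is exactly the impulse response of an order-$(p,q)$ model, the error is zero and the model parameters are recovered exactly.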

Determine models, special case: All-pole modeling

If $q=0$, we have an all-pole model:
$$H(z)=\frac{b(0)}{1+\sum_{k=1}^{p} a_{p}(k)\, z^{-k}}$$
Recall that the solution is given by solving the overdetermined system ($q=0$)
$$\left[\begin{array}{ccccc} x[0] & 0 & 0 & \cdots & 0 \\ x[1] & x[0] & 0 & \cdots & 0 \\ x[2] & x[1] & x[0] & \cdots & 0 \\ x[3] & x[2] & x[1] & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \end{array}\right]\left[\begin{array}{c} a[1] \\ a[2] \\ a[3] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{c} x[1] \\ x[2] \\ x[3] \\ \vdots \\ \vdots \end{array}\right]$$
The normal equations are obtained by premultiplying with $\mathbf X_0^H$:
$$\left[\begin{array}{cccc} x^{*}[0] & x^{*}[1] & x^{*}[2] & \cdots \\ 0 & x^{*}[0] & x^{*}[1] & \cdots \\ 0 & 0 & x^{*}[0] & \cdots \end{array}\right]\left[\begin{array}{ccc} x[0] & 0 & 0 \\ x[1] & \ddots & 0 \\ x[2] & \ddots & x[0] \\ \vdots & \vdots & \vdots \end{array}\right]\left[\begin{array}{c} a[1] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{cccc} x^{*}[0] & x^{*}[1] & x^{*}[2] & \cdots \\ 0 & x^{*}[0] & x^{*}[1] & \cdots \\ 0 & 0 & x^{*}[0] & \cdots \end{array}\right]\left[\begin{array}{c} x[1] \\ x[2] \\ x[3] \\ \vdots \end{array}\right]$$
From this structure it is seen that the normal equations become
$$\left[\begin{array}{cccc} r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p-1) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-2) \\ \vdots & \vdots & & \vdots \\ r_{x}(p-1) & r_{x}(p-2) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} a[1] \\ a[2] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{c} r_{x}(1) \\ r_{x}(2) \\ \vdots \\ r_{x}(p) \end{array}\right]$$
where the autocorrelation sequence is in fact “stationary”:
$$\begin{aligned} r_x(k,l)&=\sum_{n=0}^\infty x[n-l]\,x^*[n-k]=x[k-l]\,x^*[0]+x[k-l+1]\,x^*[1]+\cdots\\ &=\sum_{n=0}^\infty x[n+k-l]\,x^*[n]=r_x(k-l) \end{aligned}$$
For an all-pole model, the matrix $\mathbf R_x$ has a Toeplitz structure; it can be inverted efficiently using the Levinson algorithm (see Chapter 5).
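As a sketch of why the Toeplitz structure matters in practice: SciPy exposes a Levinson-type Toeplitz solver as `scipy.linalg.solve_toeplitz` (this assumes SciPy is available; the autocorrelation values below are made up for illustration):

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# Hypothetical autocorrelation values r_x(0) .. r_x(p), with p = 3
r = np.array([2.0, 1.0, 0.5, 0.25])

# All-pole normal equations: Toeplitz(r[0..p-1]) abar = -r[1..p].
# solve_toeplitz applies a Levinson-type recursion, O(p^2) flops
# rather than the O(p^3) of a generic dense solver.
abar = solve_toeplitz((r[:-1], r[:-1]), -r[1:])
```

A dense solve with `toeplitz(r[:-1])` gives the same result, just less efficiently for large $p$.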

Finally, the minimum modeling error for the all-pole model is given by
$$\{\mathcal E_p\}_{\min}=\epsilon_p=r_x(0)+\sum_{k=1}^p a_p(k)\,r^*_x(k)$$
Incorporating it into the normal equations,
$$\left[\begin{array}{ccccc} r_{x}(0) & r_{x}^{*}(1) & r_{x}^{*}(2) & \cdots & r_{x}^{*}(p) \\ r_{x}(1) & r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p-1) \\ r_{x}(2) & r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-2) \\ \vdots & \vdots & \vdots & & \vdots \\ r_{x}(p) & r_{x}(p-1) & r_{x}(p-2) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} 1 \\ a_{p}(1) \\ a_{p}(2) \\ \vdots \\ a_{p}(p) \end{array}\right]=\left[\begin{array}{c} \epsilon_{p} \\ 0 \\ 0 \\ \vdots \\ 0 \end{array}\right]$$
or
$$\mathbf R_x\, \mathbf a_p=\epsilon_p\, \mathbf u_1$$
The numerator is chosen as $b[0]=\sqrt{\epsilon_p}$. It can then be shown that
$$r_x(k)=r_h(k), \qquad |k|\le p$$
Meaning: the autocorrelation sequence of the filter matches the first $p$ lags of the autocorrelation sequence of the data (“moment matching”).

All-pole modeling with finite data

Since Prony's method assumes that $x(n)$ is known for all $n$, the coefficients $a_p(k)$ are found by minimizing the sum of the squares of $e(n)$ for all $n>q$. What happens, however, when $x(n)$ is only known or measured over a finite interval such as $[0, N]$?

Autocorrelation method

Assume that $x(n)=0$ for $n<0$ and for $n>N$, and then use Prony's method to find the coefficients $a_p(k)$. In other words, we form a new signal $\tilde x(n)$ by applying a rectangular window to $x(n)$, i.e.,
$$\tilde x(n)=x(n)\,w_R(n)$$
where
$$w_R(n)=\begin{cases}1, & n=0,1,\dots,N\\ 0, & \text{otherwise}\end{cases}$$
and then use Prony's method to find an all-pole model for $\tilde x(n)$:
$$r_x(k)=\sum_{n=0}^\infty \tilde x(n)\,\tilde x^*(n-k)=\sum_{n=k}^{N}x(n)\,x^*(n-k), \qquad k=0,1,\dots,p$$
It can also be interpreted as solving the overdetermined system
$$\left[\begin{array}{ccccc} x(0) & 0 & 0 & \cdots & 0 \\ x(1) & x(0) & 0 & \cdots & 0 \\ x(2) & x(1) & x(0) & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ \hline x(p-1) & x(p-2) & x(p-3) & \cdots & x(0) \\ x(p) & x(p-1) & x(p-2) & \cdots & x(1) \\ \vdots & \vdots & \vdots & & \vdots \\ x(N-2) & x(N-3) & x(N-4) & \cdots & x(N-p-1) \\ x(N-1) & x(N-2) & x(N-3) & \cdots & x(N-p) \\ \hline x(N) & x(N-1) & x(N-2) & \cdots & x(N-p+1) \\ 0 & x(N) & x(N-1) & \cdots & x(N-p+2) \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & x(N) \end{array}\right]\left[\begin{array}{c} a[1] \\ a[2] \\ a[3] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{c} x[1] \\ x[2] \\ x[3] \\ \vdots \\ x[N] \\ \hline 0 \\ \vdots \\ 0 \end{array}\right]$$
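A minimal sketch of the autocorrelation method in NumPy (the helper name `acm` is hypothetical; the input is the windowed record itself, and real-valued data is assumed for simplicity):

```python
import numpy as np

def acm(x, p):
    """All-pole modeling by the autocorrelation method (real data).

    Returns a = [1, a(1), ..., a(p)] and the modeling error eps.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    # r_x(k) = sum_n x(n) x(n-k) over the windowed (zero-extended) data
    r = np.array([np.dot(x[k:], x[:N - k]) for k in range(p + 1)])
    # Toeplitz normal equations: R abar = -[r(1), ..., r(p)]
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    abar = np.linalg.solve(R, -r[1:])
    eps = r[0] + np.dot(abar, r[1:])
    return np.concatenate(([1.0], abar)), eps
```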

Covariance method

If we know that $x[n]$ is not zero for $n<0$ or $n>N$, it is more accurate to omit the equations that involve a zero extension:
$$\left[\begin{array}{ccc} x[p-1] & \cdots & x[0] \\ x[p] & \cdots & x[1] \\ \vdots & & \vdots \\ x[N-1] & \cdots & x[N-p] \end{array}\right]\left[\begin{array}{c} a[1] \\ \vdots \\ a[p] \end{array}\right]=-\left[\begin{array}{c} x[p] \\ x[p+1] \\ \vdots \\ x[N] \end{array}\right]$$
This yields slightly different normal equations, with $r_x(k,\ell)$ now defined as
$$r_x(k,\ell)=\sum_{n=p}^N x(n-k)\,x^*(n-\ell)$$
Unlike the autocorrelation normal equations, the covariance normal equations are not Toeplitz.
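The covariance method differs only in which equations are kept; a sketch (the helper name `covm` is hypothetical, real data assumed):

```python
import numpy as np

def covm(x, p):
    """All-pole modeling by the covariance method (no zero extension).

    Keeps only the equations for n = p .. N-1 (0-based data of length N).
    Returns a = [1, a(1), ..., a(p)].
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Data matrix rows n = p .. N-1: X[i, j] = x(p - 1 + i - j)
    X = np.array([[x[p - 1 + i - j] for j in range(p)] for i in range(N - p)])
    abar = np.linalg.lstsq(X, -x[p:], rcond=None)[0]
    return np.concatenate(([1.0], abar))
```

For a decaying exponential $x(n)=\beta^n$, `covm` recovers the pole exactly, matching the worked example below.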

Examples

Pole estimation

Consider the finite data sequence $x=\left[1, \beta, \beta^{2}, \dots, \beta^{N}\right]^{T}$. Estimate an $\mathrm{AR}(1)$ model for this signal ($p=1$):
$$H(z)=\frac{b(0)}{1+a[1]\, z^{-1}}$$
Autocorrelation method
The normal equations collapse to $r_{x}(0)\,a[1]=-r_{x}(1)$, where
$$r_{x}(0)=\sum_{n=0}^{N}|\beta|^{2n}=\frac{1-|\beta|^{2N+2}}{1-|\beta|^{2}}, \qquad r_{x}(1)=\sum_{n=0}^{N-1}\beta\,|\beta|^{2n}=\beta\,\frac{1-|\beta|^{2N}}{1-|\beta|^{2}}$$
$$\Rightarrow\quad a[1]=-\beta\,\frac{1-|\beta|^{2N}}{1-|\beta|^{2N+2}}$$
Covariance method

The normal equations collapse to $r_{x}(1,1)\,a[1]=-r_{x}(1,0)$, where
$$r_{x}(1,1)=\sum_{n=0}^{N-1}|\beta|^{2n}=\frac{1-|\beta|^{2N}}{1-|\beta|^{2}}, \qquad r_{x}(1,0)=\sum_{n=0}^{N-1}\beta\,|\beta|^{2n}=\beta\,\frac{1-|\beta|^{2N}}{1-|\beta|^{2}}$$
Thus $a[1]=-\beta$. Setting $b(0)=1$ so that $x(0)=\hat x(0)$, we obtain
$$H(z)=\frac{1}{1-\beta z^{-1}}$$
The pole location is found exactly.
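These closed-form expressions can be checked numerically (the values of $\beta$ and $N$ below are arbitrary):

```python
import numpy as np

beta, N = 0.8, 10
x = beta ** np.arange(N + 1)            # x = [1, beta, ..., beta^N]

# Autocorrelation method: r_x(0) a[1] = -r_x(1)
a1_acm = -np.dot(x[1:], x[:-1]) / np.dot(x, x)

# Covariance method: r_x(1,1) a[1] = -r_x(1,0)
a1_cov = -np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])
```

Here `a1_acm` matches $-\beta\,(1-\beta^{2N})/(1-\beta^{2N+2})$, while `a1_cov` equals $-\beta$ exactly.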

Channel inversion


Consider a communication channel with a known (estimated) transfer function $G(z)$.
At the receiver, we wish to equalize (invert) the channel using an equalizer $H(z)$:
$$G(z)\,H(z)=1 \quad\Leftrightarrow\quad g[n]*h[n]=\delta[n]$$
Furthermore, $H(z)$ must be causal and stable, typically FIR.
If $G(z)$ is not minimum-phase (has zeros outside the unit circle), causal inversion is not possible, and we allow for a delay:
$$G(z)\,H(z)=z^{-M} \quad\Leftrightarrow\quad g[n]*h[n]=\delta[n-M]=:d[n]$$
Design of an FIR equalizer $H(z)$ of length $N$:
$$e[n]=d[n]-h[n]*g[n]=d[n]-\sum_{\ell=0}^{N-1} h[\ell]\, g[n-\ell]$$
Minimize $\mathcal{E}=\sum_{n=0}^{\infty}|e[n]|^{2}=\sum_{n=0}^{\infty}\left|d[n]-\sum_{\ell=0}^{N-1} h[\ell]\, g[n-\ell]\right|^{2}$:
$$\min_{\{h[\ell]\}}\left\|\left[\begin{array}{ccc} g[0] & 0 & 0 \\ g[1] & \ddots & 0 \\ \vdots & \ddots & g[0] \\ g[N] & \ddots & g[1] \\ \vdots & \ddots & \vdots \end{array}\right]\left[\begin{array}{c} h[0] \\ \vdots \\ h[N-1] \end{array}\right]-\left[\begin{array}{c} d[0] \\ \vdots \\ d[N-1] \\ d[N] \\ \vdots \end{array}\right]\right\|^{2}$$
The LS solution of this overdetermined system $\mathbf G\mathbf h=\mathbf d$ is $\mathbf h=\mathbf G^{\dagger}\mathbf d=\left(\mathbf G^{H}\mathbf G\right)^{-1}\mathbf G^{H}\mathbf d$.
The corresponding normal equations are $\left(\mathbf G^{H}\mathbf G\right)\mathbf h=\mathbf G^{H}\mathbf d$, or
$$\left[\begin{array}{ccc} r_{g}(0) & \cdots & r_{g}^{*}(N-1) \\ \vdots & & \vdots \\ r_{g}(N-1) & \cdots & r_{g}(0) \end{array}\right]\left[\begin{array}{c} h[0] \\ \vdots \\ h[N-1] \end{array}\right]=\left[\begin{array}{c} r_{dg}(0) \\ \vdots \\ r_{dg}(N-1) \end{array}\right]$$
where $r_{g}(k)=\sum_{n=0}^{\infty} g[n]\,g^{*}[n-k]$ and $r_{dg}(k)=\sum_{n=0}^{\infty} d[n]\,g^{*}[n-k]$. Because the matrix $\mathbf R_{g}=\mathbf G^{H}\mathbf G$ is Toeplitz, it can be inverted efficiently.
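A sketch of the LS equalizer design in NumPy (the helper name `ls_equalizer` is hypothetical; the infinite error sum is truncated at `L` samples, which is harmless once `L` exceeds the support of $g*h$):

```python
import numpy as np

def ls_equalizer(g, N, M, L=200):
    """Length-N FIR least-squares equalizer for channel g, target z^{-M}."""
    g = np.asarray(g, dtype=float)
    # Convolution matrix G (L x N): G[n, l] = g[n - l]
    G = np.array([[g[n - l] if 0 <= n - l < len(g) else 0.0
                   for l in range(N)] for n in range(L)])
    d = np.zeros(L)
    d[M] = 1.0                           # desired response delta[n - M]
    return np.linalg.lstsq(G, d, rcond=None)[0]
```

For a minimum-phase channel such as $g=[1, 0.5]$, the equalizer closely approximates the stable inverse $h[n]=(-0.5)^n$ truncated to $N$ taps, so $g*h$ is close to a unit impulse.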

Stochastic models

We now consider modeling a stochastic process $x[n]$ via a rational filter $H(z)$.


Instead of minimizing $\sum\left|e'[n]\right|^{2}$, it makes more sense to minimize
$$E\left(\left|e'[n]\right|^{2}\right)=E\left(|x[n]-\hat x[n]|^{2}\right)$$
As before, minimizing this error is not tractable.
Instead, we can try to match the correlation sequence $r_{x}(k)$ of $x[n]$ with that of $\hat{x}[n]$.

Autoregressive moving average processes

Suppose we filter white noise $v(n)$ of variance $\sigma_{v}^{2}$ with the filter
$$H(z)=\frac{B(z)}{A(z)}=\frac{\sum_{k=0}^{q} b(k)\, z^{-k}}{1+\sum_{k=1}^{p} a(k)\, z^{-k}}$$
Assuming the filter is stable, the output process $x(n)$ will be wide-sense stationary, and its power spectrum can be written as
$$P_{x}(z)=\sigma_{v}^{2}\, \frac{B(z)\, B^{*}(1/z^{*})}{A(z)\, A^{*}(1/z^{*})}, \qquad P_{x}\left(e^{j\omega}\right)=\sigma_{v}^{2}\, \frac{\left|B\left(e^{j\omega}\right)\right|^{2}}{\left|A\left(e^{j\omega}\right)\right|^{2}}$$

  • Such a process is known as an autoregressive moving average process of order $(p, q)$, or $\mathrm{ARMA}(p, q)$.

  • The power spectrum of an $\mathrm{ARMA}(p, q)$ process has $2p$ poles and $2q$ zeros with conjugate reciprocal symmetry.
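As a numerical sanity check (with illustrative ARMA(2,1) coefficients, not ones from the text), we can filter white noise and compare a Welch spectral estimate against the theoretical $\sigma_v^2\left|B(e^{j\omega})\right|^2/\left|A(e^{j\omega})\right|^2$:

```python
# Sketch: simulate an ARMA(2,1) process and compare its estimated power
# spectrum with sigma_v^2 |B(e^jw)|^2 / |A(e^jw)|^2. Coefficients are
# illustrative, not taken from the text.
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
sigma_v = 1.0
b = [1.0, 0.5]                # B(z) = 1 + 0.5 z^-1          (q = 1)
a = [1.0, -0.9, 0.2]          # A(z) = 1 - 0.9 z^-1 + 0.2 z^-2 (p = 2)

v = sigma_v * rng.standard_normal(1 << 16)   # white noise input
x = signal.lfilter(b, a, v)                  # ARMA(2,1) output

# Theoretical power spectrum on a frequency grid
w, H = signal.freqz(b, a, worN=512)
P_theory = sigma_v**2 * np.abs(H) ** 2

# Averaged periodogram (Welch) as an estimate of P_x(e^jw); with fs = 1 the
# one-sided density is ~ 2 * P_theory away from DC and Nyquist
f, P_welch = signal.welch(x, window="hann", nperseg=1024, scaling="density")
```

Both curves are low-pass here (real poles inside the unit circle), so the estimated spectrum should peak near DC just like the theoretical one.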

Yule-Walker equations

From the LCCDE between $v(n)$ and $x(n)=h(n) * v(n)$:
$$x(n)+\sum_{\ell=1}^{p} a(\ell) x(n-\ell)=\sum_{\ell=0}^{q} b(\ell) v(n-\ell)$$
we can multiply both sides with $x^{*}(n-k)$ and take the expectation:
$$r_{x}(k)+\sum_{\ell=1}^{p} a(\ell) r_{x}(k-\ell)=\sum_{\ell=0}^{q} b(\ell) E\left\{v(n-\ell) x^{*}(n-k)\right\}=\sum_{\ell=0}^{q} b(\ell) r_{vx}(k-\ell)$$
The cross-correlation between $v(n)$ and $x(n)$ can further be expressed as
$$r_{vx}(k-\ell)=E\left\{v(n-\ell) x^{*}(n-k)\right\}=\sum_{m=-\infty}^{\infty} E\left\{v(n-\ell) v^{*}(n-k-m)\right\} h^{*}(m)=\sigma_{v}^{2} h^{*}(\ell-k)$$
For $k \geq 0$, this leads to the Yule-Walker equations
$$r_{x}(k)+\sum_{\ell=1}^{p} a(\ell) r_{x}(k-\ell)=\begin{cases} \sigma_{v}^{2} \sum_{\ell=0}^{q} b(\ell) h^{*}(\ell-k)=\sigma_{v}^{2} c(k), & 0 \leq k \leq q \\ 0, & k>q \end{cases}$$
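The Yule-Walker recursion can be verified numerically for a small model (illustrative ARMA(2,1) coefficients) by computing $r_x(k)$ directly from a long impulse response:

```python
# Numerical check of the Yule-Walker equations for an ARMA(2,1) model.
# r_x(k) is obtained from the impulse response as sigma_v^2 * sum_n h(n) h(n-k).
import numpy as np
from scipy import signal

b = np.array([1.0, 0.5])          # q = 1
a = np.array([1.0, -0.9, 0.2])    # p = 2
sigma2 = 1.0
N = 4096                          # long enough for h(n) to have decayed

# Impulse response h(n) of H(z) = B(z)/A(z)
h = signal.lfilter(b, a, np.r_[1.0, np.zeros(N - 1)])

def r_x(k):
    """Output autocorrelation sigma_v^2 * sum_n h(n) h(n-k) (real h)."""
    k = abs(k)                    # real case: r_x(-k) = r_x(k)
    return sigma2 * np.dot(h[k:], h[:N - k])

def yw_lhs(k):
    """Left-hand side r_x(k) + sum_l a(l) r_x(k-l)."""
    return r_x(k) + sum(a[l] * r_x(k - l) for l in range(1, 3))

def c(k):
    """c(k) = sum_l b(l) h(l-k) for 0 <= k <= q (causal h)."""
    return np.dot(b[k:], h[:2 - k])

# Expect: yw_lhs(k) == sigma2 * c(k) for 0 <= k <= q, and 0 for k > q
```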

Autoregressive processes

An $\mathrm{ARMA}(p,0)$ process is an autoregressive process, or $\mathrm{AR}(p)$:
$$H(z)=\frac{b(0)}{1+\sum_{k=1}^{p} a(k) z^{-k}}, \qquad P_{x}(z)=\sigma_{v}^{2} \frac{|b(0)|^{2}}{A(z) A^{*}\left(1 / z^{*}\right)}, \qquad P_{x}\left(e^{j \omega}\right)=\sigma_{v}^{2} \frac{|b(0)|^{2}}{\left|A\left(e^{j \omega}\right)\right|^{2}}$$

The Yule-Walker equations are given by
$$r_{x}(k)+\sum_{\ell=1}^{p} a(\ell) r_{x}(k-\ell)=\sigma_{v}^{2}|b(0)|^{2} \delta(k), \qquad k \geq 0$$
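For the AR case the Yule-Walker system is Toeplitz and can be solved directly. A minimal sketch (illustrative AR(2) coefficients) that recovers $a(k)$ from the exact autocorrelation sequence:

```python
# Sketch: recover AR(2) coefficients by solving the Yule-Walker (Toeplitz)
# system built from the exact autocorrelation. Coefficients are illustrative.
import numpy as np
from scipy import signal
from scipy.linalg import solve_toeplitz

a_true = np.array([1.0, -0.75, 0.5])     # A(z), p = 2
sigma2, b0 = 1.0, 1.0

# Exact r_x(k) from a long impulse response of H(z) = b(0)/A(z)
h = signal.lfilter([b0], a_true, np.r_[1.0, np.zeros(8191)])
r = sigma2 * np.array([np.dot(h[k:], h[:h.size - k]) for k in range(3)])

# Yule-Walker rows k = 1, ..., p:  R_x [a(1) ... a(p)]^T = -[r(1) ... r(p)]^T
a_hat = solve_toeplitz((r[:2], r[:2]), -r[1:3])   # (first column, first row)

# Row k = 0 then gives the noise power: sigma_v^2 |b(0)|^2 = r(0) + a . r(1:)
```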

Moving average processes

An $\mathrm{ARMA}(0,q)$ process is a moving average process, or $\mathrm{MA}(q)$:
$$H(z)=B(z)=\sum_{k=0}^{q} b(k) z^{-k}, \qquad P_{x}(z)=\sigma_{v}^{2} B(z) B^{*}\left(1 / z^{*}\right), \qquad P_{x}\left(e^{j \omega}\right)=\sigma_{v}^{2}\left|B\left(e^{j \omega}\right)\right|^{2}$$
The Yule-Walker equations are given by
$$r_{x}(k)=\sigma_{v}^{2} \sum_{\ell=0}^{q} b(\ell) b^{*}(\ell-k)=\sigma_{v}^{2}\, b(k) * b^{*}(-k)$$
The autocorrelation function is zero outside $[-q, q]$.
Estimating $b(k)$ from the Yule-Walker equations is not easy, since they are nonlinear in $b(k)$.
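The MA correlation relation $r_x(k)=\sigma_v^2\, b(k)*b^*(-k)$, and its finite support, can be illustrated with a short sketch (illustrative coefficients):

```python
# Sketch: for an MA(2) process, r_x(k) = sigma_v^2 * (b correlated with b),
# and r_x(k) = 0 for |k| > q. Coefficients are illustrative.
import numpy as np

b = np.array([1.0, -0.4, 0.25])     # q = 2
sigma2 = 2.0

# b(k) * b*(-k) as a full correlation over lags -q..q (index i <-> lag i - q)
r_theory = sigma2 * np.correlate(b, b, mode="full")

# Compare with the sample autocorrelation of a simulated MA(2) process
rng = np.random.default_rng(1)
v = np.sqrt(sigma2) * rng.standard_normal(200_000)   # white noise, var sigma2
x = np.convolve(v, b, mode="full")[: v.size]         # x(n) = sum_k b(k) v(n-k)
r_hat = np.array([np.dot(x[k:], x[: x.size - k]) / x.size for k in range(5)])
# r_hat[:3] should match r_theory at lags 0..2; r_hat[3], r_hat[4] should be ~0
```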

Stochastic model: Pade

In matrix form, the Yule-Walker equations become
$$\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p) \\ \hline r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(L) & r_{x}(L-1) & \cdots & r_{x}(L-p) \\ \vdots & \vdots & & \vdots \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2}\left[\begin{array}{c} c(0) \\ c(1) \\ \vdots \\ c(q) \\ \hline 0 \\ \vdots \\ 0 \\ \vdots \end{array}\right]$$
Keep the first $p+q+1$ equations, $k=0,1,\cdots,p+q$, because we only have $p+q+1$ unknowns:
$$\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p) \\ \hline r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q+p) & r_{x}(q+p-1) & \cdots & r_{x}(q) \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2}\left[\begin{array}{c} c(0) \\ c(1) \\ \vdots \\ c(q) \\ \hline 0 \\ \vdots \\ 0 \end{array}\right]$$
We can solve for $a(1),\cdots,a(p)$ using the bottom $p$ equations,
$$\left[\begin{array}{cccc} r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p+1) \\ r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+2) \\ \vdots & \vdots & & \vdots \\ r_{x}(q+p-1) & r_{x}(q+p-2) & \cdots & r_{x}(q) \end{array}\right]\left[\begin{array}{c} a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=-\left[\begin{array}{c} r_{x}(q+1) \\ r_{x}(q+2) \\ \vdots \\ r_{x}(q+p) \end{array}\right]$$
and plug them back into the top $q+1$ equations to obtain $c(0),\cdots,c(q)$:
$$\left[\begin{array}{cccc} r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p) \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2}\left[\begin{array}{c} c(0) \\ c(1) \\ \vdots \\ c(q) \end{array}\right]$$
Since $c(k)=b(k)*h^{*}(-k)$, where $h^{*}(-k)$ depends on $b(k)$ as well, estimating $b(k)$ from the Yule-Walker equations is not easy.
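The two linear steps above — solve the bottom $p$ rows for $a(k)$, then substitute back for $\sigma_v^2 c(k)$ — can be sketched as follows (illustrative ARMA(2,1) coefficients, exact $r_x(k)$ computed from the impulse response):

```python
# Sketch of the stochastic Pade step: solve the bottom p Yule-Walker rows for
# a(k), then substitute back into the top q+1 rows to get sigma_v^2 c(k).
import numpy as np
from scipy import signal

b_true = np.array([1.0, 0.5])          # q = 1
a_true = np.array([1.0, -0.9, 0.2])    # p = 2
p, q, sigma2 = 2, 1, 1.0

# Exact r_x(k) from a long impulse response of H(z) = B(z)/A(z)
h = signal.lfilter(b_true, a_true, np.r_[1.0, np.zeros(8191)])

def r(k):
    k = abs(k)                         # real process: r_x(-k) = r_x(k)
    return sigma2 * np.dot(h[k:], h[: h.size - k])

# Bottom p rows (k = q+1, ..., q+p) determine a(1), ..., a(p)
R = np.array([[r(q + 1 + i - j) for j in range(1, p + 1)] for i in range(p)])
a_hat = np.linalg.solve(R, -np.array([r(q + 1 + i) for i in range(p)]))

# Top q+1 rows then give sigma_v^2 c(k) for k = 0, ..., q
a_ext = np.r_[1.0, a_hat]
sig2_c = np.array([sum(a_ext[l] * r(k - l) for l in range(p + 1))
                   for k in range(q + 1)])
```

For these coefficients $h(0)=1$, $h(1)=b(1)-a(1)=1.4$, so $c(0)=b(0)h(0)+b(1)h(1)=1.7$ and $c(1)=b(1)h(0)=0.5$, which the substitution reproduces.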

Stochastic model: Prony

The stochastic Prony method follows by taking $L>p+q$; this results in an overdetermined set of equations for estimating $a(k)$:

$$\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p) \\ \hline r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(L) & r_{x}(L-1) & \cdots & r_{x}(L-p) \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2}\left[\begin{array}{c} c(0) \\ c(1) \\ \vdots \\ c(q) \\ \hline 0 \\ \vdots \\ 0 \end{array}\right]$$
From the last $L-q$ equations we have
$$\left[\begin{array}{cccc} r_{x}(q) & r_{x}(q-1) & \cdots & r_{x}(q-p+1) \\ r_{x}(q+1) & r_{x}(q) & \cdots & r_{x}(q-p+2) \\ \vdots & \vdots & & \vdots \\ r_{x}(L-1) & r_{x}(L-2) & \cdots & r_{x}(L-p) \end{array}\right]\left[\begin{array}{c} a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=-\left[\begin{array}{c} r_{x}(q+1) \\ r_{x}(q+2) \\ \vdots \\ r_{x}(L) \end{array}\right]$$
or
$$\mathbf{R}_q \bar{\mathbf{a}}=-\mathbf{r}_{q+1}$$
We may find the least squares solution by solving
$$(\mathbf{R}_q^H \mathbf{R}_q)\bar{\mathbf{a}}=-\mathbf{R}_q^H \mathbf{r}_{q+1}$$
where $(\mathbf{R}_q^H \mathbf{R}_q)$ is a $p \times p$ Hermitian matrix.
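A minimal sketch of the overdetermined solve, using sample autocorrelations from simulated data (illustrative model, illustrative lag count $L=20$):

```python
# Stochastic Prony sketch: estimate a(k) from sample autocorrelations using
# L > p + q lag equations and a least-squares solve. Model is illustrative.
import numpy as np
from scipy import signal

b_true, a_true = [1.0, 0.5], [1.0, -0.9, 0.2]
p, q, L = 2, 1, 20
rng = np.random.default_rng(2)
x = signal.lfilter(b_true, a_true, rng.standard_normal(100_000))

def r_hat(k):
    """Biased sample autocorrelation (real data)."""
    k = abs(k)
    return np.dot(x[k:], x[: x.size - k]) / x.size

# Rows k = q+1, ..., L of the Yule-Walker equations
Rq = np.array([[r_hat(k - j) for j in range(1, p + 1)]
               for k in range(q + 1, L + 1)])
rq1 = np.array([r_hat(k) for k in range(q + 1, L + 1)])

# Least-squares solution of Rq a = -r_{q+1}, i.e. (Rq^H Rq) a = -Rq^H r_{q+1}
a_bar, *_ = np.linalg.lstsq(Rq, -rq1, rcond=None)
# a_bar should approximate a_true[1:]
```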

SM-Special case: All-pole modeling

For autoregressive processes, stack the Yule-Walker equations for $k=0,1,\ldots,p$:
$$\left[\begin{array}{cccc} r_{x}(0) & r_{x}(-1) & \cdots & r_{x}(-p) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}(-p+1) \\ \vdots & \vdots & & \vdots \\ r_{x}(p) & r_{x}(p-1) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} 1 \\ a(1) \\ \vdots \\ a(p) \end{array}\right]=\sigma_{v}^{2}|b(0)|^{2}\left[\begin{array}{c} 1 \\ 0 \\ \vdots \\ 0 \end{array}\right]$$
It can also be written as
$$\left[\begin{array}{cccc} r_{x}(0) & r_{x}^{*}(1) & \cdots & r_{x}^{*}(p-1) \\ r_{x}(1) & r_{x}(0) & \cdots & r_{x}^{*}(p-2) \\ \vdots & \vdots & & \vdots \\ r_{x}(p-1) & r_{x}(p-2) & \cdots & r_{x}(0) \end{array}\right]\left[\begin{array}{c} a(1) \\ a(2) \\ \vdots \\ a(p) \end{array}\right]=-\left[\begin{array}{c} r_{x}(1) \\ r_{x}(2) \\ \vdots \\ r_{x}(p) \end{array}\right]$$
This is exactly the same as the normal equations for [all-pole modeling with Prony's method](#dm-special-case-all-pole-modeling) in the deterministic case, except that here $r_{x}(k)$ is a statistical autocorrelation.

If it is estimated from the data as $\hat{r}_{x}(k)=\frac{1}{N} \sum_{n=0}^{N-1} x[n] x^{*}[n-k]$, the two methods become identical. The $k=0$ equation also gives the noise power: $\sigma_{v}^{2}|b(0)|^{2}=r_{x}(0)+\sum_{k=1}^{p} a(k) r_{x}^{*}(k)$.

Summary

  • Deterministic model: Pade

  • Deterministic model: Prony

  • Deterministic model: Shanks

  • DM-Special case: All-pole modeling

  • DM-Special case: All-pole modeling with finite data
