As will be seen below, the maximum-likelihood sequence detector in fading channels typically has prohibitive computational complexity. The EM algorithm is an iterative technique for solving complex maximum-likelihood estimation problems. In this section we discuss sequence detection in fading channels based on the EM algorithm. Both a batch algorithm and a sequential algorithm are discussed.

9.3.1 Expectation-Maximization Algorithm

Suppose that θ is a set of parameters to be estimated from some observed data Y. The maximum-likelihood (ML) estimate of θ is given by

Equation 9.15: θ_ML = arg max_θ p(Y | θ),
where p(Y | θ) denotes the probability density of Y with θ fixed. In many cases, an explicit expression for the conditional density p(Y | θ) does not exist. In other cases, the maximization problem above is very difficult to solve, even though the conditional density can be expressed explicitly. The EM algorithm [248, 315] is an iterative procedure for solving the ML estimation problem above in many such situations. In the EM algorithm, the observation Y is termed the incomplete data. The algorithm postulates that one has access to complete data X, from which Y can be obtained through a many-to-one mapping. Typically, the complete data are chosen such that the conditional density p(X | θ) is easy to obtain and maximize over θ. Starting from some initial estimate θ^(0), the EM algorithm solves the ML estimation problem (9.15) by the following iterative procedure:

Equation 9.16 (E-step): compute Q(θ; θ^(j)) = E{ log p(X | θ) | Y, θ^(j) };

Equation 9.17 (M-step): θ^(j+1) = arg max_θ Q(θ; θ^(j)).
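As a concrete illustration of the E- and M-steps, the following sketch runs the EM iteration on a classical example: estimating the two means of a Gaussian mixture with known unit variances and equal weights. Here the complete data are the observations together with their (missing) component labels; all data and parameter values are hypothetical.

```python
import math
import random

random.seed(0)

# Hypothetical incomplete data Y: samples from a mixture of N(-2, 1) and N(3, 1).
y = [random.gauss(-2.0, 1.0) for _ in range(200)] + \
    [random.gauss(3.0, 1.0) for _ in range(200)]

def em_gaussian_mixture(y, mu0, mu1, iters=50):
    """EM for the two means of a Gaussian mixture (equal weights, unit
    variances). Complete data X = (observations, component labels)."""
    for _ in range(iters):
        # E-step: posterior probability that each sample came from component 1,
        # i.e., the conditional expectation of the missing labels given Y.
        resp = []
        for v in y:
            p0 = math.exp(-0.5 * (v - mu0) ** 2)
            p1 = math.exp(-0.5 * (v - mu1) ** 2)
            resp.append(p1 / (p0 + p1))
        # M-step: responsibility-weighted sample means maximize Q(theta; theta^(j)).
        w1 = sum(resp)
        w0 = len(y) - w1
        mu0 = sum((1 - r) * v for r, v in zip(resp, y)) / w0
        mu1 = sum(r * v for r, v in zip(resp, y)) / w1
    return mu0, mu1

mu0, mu1 = em_gaussian_mixture(y, mu0=-1.0, mu1=1.0)
print(mu0, mu1)  # estimates approach the true means -2 and 3
```

Each pass through the loop increases the incomplete-data likelihood, consistent with the monotonicity property stated next.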
It is known that the sequence {θ^(j)} obtained in the EM algorithm above monotonically increases the incomplete-data likelihood function:

Equation 9.18: p(Y | θ^(j+1)) >= p(Y | θ^(j)), j = 0, 1, ....
Moreover, if the function Q(θ; θ') is continuous in both θ and θ', all limit points of an EM sequence {θ^(j)} are stationary points of p(Y | θ) (i.e., local maxima or saddle points), and p(Y | θ^(j)) converges monotonically to p(Y | θ*) for some stationary point θ* [248, 315].

9.3.2 EM-Based Receiver in Flat-Fading Channels

We consider the following discrete-time flat-fading channel:

Equation 9.19: r[i] = a[i] b[i] + n[i], i = 0, 1, ..., M - 1,
where {a[i]} is the complex Gaussian fading process, {b[i]} is a sequence of transmitted phase-shift-keying (PSK) symbols (|b[i]| = 1), and {n[i]} is a sequence of i.i.d. complex Gaussian noise samples. Define the following notation:

r = [ r[0], r[1], ..., r[M - 1] ]^T,
a = [ a[0], a[1], ..., a[M - 1] ]^T,
n = [ n[0], n[1], ..., n[M - 1] ]^T,
b = [ b[0], b[1], ..., b[M - 1] ]^T,
B = diag{ b[0], b[1], ..., b[M - 1] }.
Then (9.19) can be written as

Equation 9.20: r = B a + n.
Note that both a and n are complex Gaussian vectors:

Equation 9.21: a ~ N_c(0, E_s Σ),

Equation 9.22: n ~ N_c(0, σ² I_M),
where E_s is the average received signal energy and Σ is the normalized M × M autocorrelation matrix of the fading process. For mobile fading channels, the elements of Σ are given by the Jakes model as

Equation 9.23: [Σ]_{m,n} = J_0( 2π B_d T (m - n) ), m, n = 0, 1, ..., M - 1,
where B_d T is the symbol-rate-normalized Doppler shift and J_0(·) is the Bessel function of the first kind and zeroth order. Hence r in (9.20) has the following complex Gaussian distribution:

Equation 9.24: r ~ N_c(0, E_s B Σ B^H + σ² I_M),
and the log-likelihood function of r given B is thus given by

Equation 9.25: log p(r | B) = -log det[ π (E_s B Σ B^H + σ² I_M) ] - r^H (E_s B Σ B^H + σ² I_M)^{-1} r.
Note that

Equation 9.26: (E_s B Σ B^H + σ² I_M)^{-1} = B (E_s Σ + σ² I_M)^{-1} B^H,

Equation 9.27: det(E_s B Σ B^H + σ² I_M) = det(E_s Σ + σ² I_M),
where we have used the facts that B B^H = B^H B = I_M and det(B) det(B^H) = det(B B^H) = 1, since B is a diagonal matrix containing PSK symbols. Hence the ML estimate of b becomes

Equation 9.28: b_ML = arg min_b r^H B (E_s Σ + σ² I_M)^{-1} B^H r.
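For a short block, the maximization in (9.28) can be carried out directly by enumerating every BPSK sequence and evaluating the quadratic cost r^H B (E_s Σ + σ² I)^{-1} B^H r (the determinant term does not depend on b). The sketch below simulates the channel of (9.19)-(9.20) and performs this search; all parameter values are illustrative, and the Bessel function is computed from its power series.

```python
import numpy as np

rng = np.random.default_rng(1)

def j0(x):
    # Zeroth-order Bessel function of the first kind (power series,
    # adequate for the small arguments used here).
    term, total = 1.0, 1.0
    for m in range(1, 30):
        term *= -(x / 2.0) ** 2 / m ** 2
        total += term
    return total

M, Es, sigma2, BdT = 6, 1.0, 0.1, 0.05
# Normalized Jakes autocorrelation matrix of the fading process.
Sigma = np.array([[j0(2 * np.pi * BdT * (m - n)) for n in range(M)]
                  for m in range(M)])

# Simulate r = B a + n for a random BPSK sequence.
Lc = np.linalg.cholesky(Sigma + 1e-9 * np.eye(M))
a = np.sqrt(Es / 2) * Lc @ (rng.standard_normal(M) + 1j * rng.standard_normal(M))
b_true = rng.choice([-1.0, 1.0], size=M)
r = b_true * a + np.sqrt(sigma2 / 2) * (rng.standard_normal(M)
                                        + 1j * rng.standard_normal(M))

# Exhaustive ML search over all 2^M BPSK sequences.
Cinv = np.linalg.inv(Es * Sigma + sigma2 * np.eye(M))
best_b, best_cost = None, np.inf
for idx in range(2 ** M):
    b = np.array([1.0 if (idx >> k) & 1 else -1.0 for k in range(M)])
    v = b * r                                  # B^H r for diagonal, real B
    cost = np.real(np.conj(v) @ Cinv @ v)
    if cost < best_cost:
        best_b, best_cost = b, cost

# b and -b yield identical costs, so without pilots the sequence is
# identifiable only up to a common sign.
print(best_b * best_b[0])
print(b_true * b_true[0])
```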
The optimal solution involves an exhaustive enumeration of all possible PSK sequences of length M, which is certainly prohibitively complex even for moderate M. The EM algorithm was applied to solve the fading-channel detection problem above in [138]. To use the EM algorithm, we define the complete data as consisting of the incomplete data r together with the fading process a [i.e., x = (r, a)]. Then the log-likelihood function of the complete data is

Equation 9.29: log p(r, a | B) = -(1/σ²) ||r - B a||² - a^H (E_s Σ)^{-1} a + constant.
Hence the E-step computes the following quantity:

Equation 9.30: Q(b; b^(j)) = E{ log p(r, a | B) | r, b^(j) }.
Since, given b = b^(j), r and a are jointly Gaussian, we then have

Equation 9.31: a^(j) = E{ a | r, b^(j) } = E_s Σ (B^(j))^H [ E_s B^(j) Σ (B^(j))^H + σ² I_M ]^{-1} r,

where B^(j) = diag{ b^(j)[0], ..., b^(j)[M - 1] }.
The maximization step becomes

Equation 9.32: b^(j+1) = arg max_b Re{ r^H B a^(j) },

which decouples into independent per-symbol decisions: b^(j+1)[i] is the PSK symbol closest in phase to r[i] a^(j)[i]*.
An initial estimate of the data symbol sequence, b^(0), can be obtained with the aid of pilot symbols as follows. Suppose that we choose M such that M = NL + 1, where N and L are positive integers. Suppose further that the symbols in positions 0, N, 2N, ..., LN are known pilot symbols. Then the initial channel estimates at these positions are given by

Equation 9.33: a^(-1)[mN] = b[mN]* r[mN], m = 0, 1, ..., L,
and the initial channel estimates at other positions are obtained by linear interpolation; that is,

Equation 9.34: a^(-1)[mN + n] = (1 - n/N) a^(-1)[mN] + (n/N) a^(-1)[(m + 1)N], n = 1, ..., N - 1, m = 0, 1, ..., L - 1.
Substituting the initial channel estimate a^(-1) above into (9.32), we obtain the initial symbol estimate b^(0). Finally, we summarize the EM-based pilot-symbol-aided receiver algorithm in flat-fading channels as follows.

Algorithm 9.1: [EM algorithm for pilot-symbol-aided receiver in flat-fading channels]
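A minimal Python sketch of such a receiver follows. It is an illustrative interpretation, not an exact listing: the E-step is implemented as the conditional-mean (MMSE) estimate of the fading vector given the current symbol decisions, the M-step as per-symbol BPSK decisions, and all parameter values (N, L, SNR, Doppler) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def j0(x):
    # Zeroth-order Bessel function of the first kind (power series).
    term, total = 1.0, 1.0
    for m in range(1, 40):
        term *= -(x / 2.0) ** 2 / m ** 2
        total += term
    return total

# Illustrative parameters: M = N*L + 1 symbols, pilots every N positions.
N, L = 4, 8
M = N * L + 1
Es, sigma2, BdT = 1.0, 0.05, 0.02
Sigma = np.array([[j0(2 * np.pi * BdT * (m - n)) for n in range(M)]
                  for m in range(M)])

# Simulate the flat-fading channel r = B a + n with BPSK symbols.
Lc = np.linalg.cholesky(Sigma + 1e-9 * np.eye(M))
a = np.sqrt(Es / 2) * Lc @ (rng.standard_normal(M) + 1j * rng.standard_normal(M))
b_true = rng.choice([-1.0, 1.0], size=M)
pilots = np.arange(0, M, N)        # pilot positions 0, N, ..., L*N
b_true[pilots] = 1.0               # pilot symbols are known
r = b_true * a + np.sqrt(sigma2 / 2) * (rng.standard_normal(M)
                                        + 1j * rng.standard_normal(M))

# Initialization: channel estimates at the pilot positions, then
# linear interpolation in between.
a_hat = np.zeros(M, dtype=complex)
a_hat[pilots] = np.conj(b_true[pilots]) * r[pilots]
for m in range(L):
    for n in range(1, N):
        a_hat[m * N + n] = ((1 - n / N) * a_hat[m * N]
                            + (n / N) * a_hat[(m + 1) * N])

# EM iterations: per-symbol BPSK decisions given the channel estimate,
# then the conditional-mean re-estimate of the fading vector.
C = Es * Sigma
for _ in range(5):
    b_hat = np.sign(np.real(np.conj(a_hat) * r))
    b_hat[pilots] = b_true[pilots]
    B = np.diag(b_hat)
    a_hat = C @ B.conj().T @ np.linalg.solve(B @ C @ B.conj().T
                                             + sigma2 * np.eye(M), r)

b_hat = np.sign(np.real(np.conj(a_hat) * r))
b_hat[pilots] = b_true[pilots]
frac = np.mean(b_hat == b_true)
print(frac)  # fraction of correctly detected symbols
```

At this illustrative SNR and Doppler, the pilot-aided initialization is accurate enough that the EM iterations refine, rather than rescue, the symbol decisions.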
9.3.3 Linear Multiuser Detection in Flat-Fading Synchronous CDMA Channels

The EM-based receiver discussed above can easily be applied to synchronous CDMA systems in flat-fading channels [555]. The basic idea is to use a linear multiuser detector (e.g., the decorrelating detector) to separate the multiuser signals and then to employ an EM receiver for each user to demodulate its data. This procedure is discussed briefly next. We consider the following simple K-user synchronous CDMA system signaling through flat-fading channels. The received signal during the ith symbol interval is given by

Equation 9.35: r(t) = Σ_{k=1}^{K} A_k a_k[i] b_k[i] s_k(t - iT) + n(t), iT <= t < (i + 1)T,
where a_k[i] is the complex fading gain of the kth user's channel during the ith symbol interval; A_k is the amplitude of the kth user; b_k[i] ∈ {+1, -1} is the ith bit of the kth user; {s_k(t), 0 <= t <= T} is the unit-energy spreading waveform of the kth user; and n(t) is white complex Gaussian noise with power spectral density σ². The received signal is correlated with the signature waveform of each user to obtain the decision statistic

Equation 9.36: y_k[i] = ∫_{iT}^{(i+1)T} r(t) s_k(t - iT) dt = A_k a_k[i] b_k[i] + Σ_{j≠k} ρ_{jk} A_j a_j[i] b_j[i] + n_k[i],
where ρ_{jk} = ∫_0^T s_j(t) s_k(t) dt and n_k[i] = ∫_{iT}^{(i+1)T} n(t) s_k(t - iT) dt. On denoting y[i] = [ y_1[i] y_2[i] ... y_K[i] ]^T, we can write

Equation 9.37: y[i] = R A F[i] b[i] + n[i],
where R = [ρ_{jk}], A = diag{A_1, ..., A_K}, F[i] = diag{a_1[i], ..., a_K[i]}, b[i] = [ b_1[i] b_2[i] ... b_K[i] ]^T, and n[i] = [ n_1[i] n_2[i] ... n_K[i] ]^T. Note that n[i] ~ N_c(0, σ² R). The multiuser signals y[i] in (9.37) can be separated by a linear decorrelator, to obtain

Equation 9.38: z[i] = R^{-1} y[i] = A F[i] b[i] + u[i],
with u[i] ~ N_c(0, σ² R^{-1}). We can write (9.38) in scalar form as

Equation 9.39: z_k[i] = A_k a_k[i] b_k[i] + u_k[i], k = 1, ..., K,
with u_k[i] ~ N_c(0, σ² [R^{-1}]_{kk}). We see that for each user, the output of the decorrelating detector (9.39) is of exactly the same form as (9.19). Hence, with the aid of pilot symbols, the EM receiver discussed in Section 9.3.2 can be employed to demodulate the kth user's data {b_k[i]}_i, k = 1, ..., K. An alternative suboptimal receiver structure for demodulating the kth user's data uses a Kalman filter to track the fading channel {a_k[i]}_i, based on training symbols or decision feedback [547, 578, 579]. For example, in the simplest setting, the fading coefficients {a_k[i]}_i may be modeled by a second-order autoregressive (AR) process:

Equation 9.40: a_k[i] = a_1 a_k[i - 1] + a_2 a_k[i - 2] + w[i],
where w[i] is a zero-mean white complex Gaussian process. The parameters a_1 and a_2 are chosen to fit the spectrum of the AR process to that of the underlying Rayleigh fading process. On the other hand, a statistically equivalent representation of the linear decorrelator output (9.39) is

Equation 9.41: b_k[i] z_k[i] = A_k a_k[i] + u_k[i],
where we have invoked the symmetry of the distribution of u_k[i]. Based on the state equation (9.40) and the observation equation (9.41), we can use a Kalman filter to track the fading channel coefficients {a_k[i]}_i and subsequently detect the data symbols. Note that in (9.41) the data symbols {b_k[i]}_i are assumed known, corresponding to the case in which they are training symbols. When these symbols are unknown, they are replaced by the detected symbols. Such a decision-directed scheme is subject to error propagation, of course, and thus requires periodic insertion of training symbols.

9.3.4 Sequential EM Algorithm

The EM algorithm discussed above is a batch algorithm. We next briefly describe a sequential version of the EM algorithm [482, 563]. Suppose that y_1, y_2, ... is a sequence of observations with marginal pdf f(y | θ), where θ ∈ C^m is a static parameter vector. A class of sequential estimators derived from the maximum-likelihood principle is given by

Equation 9.42: θ^(i+1) = θ^(i) + P(y_{i+1}, θ^(i)) s(y_{i+1}, θ^(i)),
where θ^(i) is the estimate of θ at the ith step, P(y_{i+1}, θ^(i)) is an m × m matrix defined later in this section, and

Equation 9.43: s(y_{i+1}, θ^(i)) = ∇_θ log f(y_{i+1} | θ) |_{θ = θ^(i)}
is the update score (i.e., the gradient of the log-likelihood function). Let H(y_i, θ^(i)) denote the Hessian matrix of log f(y_i | θ^(i)):

Equation 9.44: H(y_i, θ^(i)) = ∇²_θ log f(y_i | θ) |_{θ = θ^(i)}.
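To make the score (9.43) and Hessian (9.44) concrete, the following sketch uses a hypothetical scalar Gaussian model, log f(y | θ) = -(y - θ)²/2 up to a constant, and checks the analytic score and Hessian against finite-difference approximations.

```python
def loglik(y, theta):
    # log f(y | theta) for y ~ N(theta, 1), up to an additive constant.
    return -0.5 * (y - theta) ** 2

def score(y, theta):
    # Gradient of the log-likelihood: d/dtheta log f(y | theta) = y - theta.
    return y - theta

def hessian(y, theta):
    # Second derivative of the log-likelihood: constant -1 for this model.
    return -1.0

y, theta, h = 1.7, 0.4, 1e-5
num_score = (loglik(y, theta + h) - loglik(y, theta - h)) / (2 * h)
num_hess = (loglik(y, theta + h) - 2 * loglik(y, theta)
            + loglik(y, theta - h)) / h ** 2
print(abs(num_score - score(y, theta)) < 1e-6)   # True
print(abs(num_hess - hessian(y, theta)) < 1e-3)  # True
```

For this model the negative expected Hessian, i.e., the Fisher information, is simply 1 per observation.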
Let x_i denote a "complete" data set related to y_i, for i = 1, 2, .... The complete data set x_i is selected in the (sequential) EM algorithm such that y_i can be obtained through a many-to-one mapping x_i → y_i, and such that its knowledge makes the estimation problem easy [e.g., the conditional density f(x_i | θ) can easily be obtained]. Denote the Fisher information matrices of the data y_i and x_i, respectively, as

I_y(θ) = -E{ H(y_i, θ) } and I_x(θ) = -E{ ∇²_θ log f(x_i | θ) }.
Different versions of sequential estimation algorithms are characterized by different choices of the function P(y_{i+1}, θ^(i)) in (9.42), as follows.
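As a simple illustration of the generic update (9.42), take P(y_{i+1}, θ^(i)) to be the inverse of the accumulated Fisher information (one particular choice, assumed here for illustration). For the scalar Gaussian-mean model, whose score is y - θ and whose per-sample Fisher information is 1, the recursion then reduces to the running sample mean:

```python
import random

random.seed(3)

# Model: y_i ~ N(theta, 1) with true theta = 2; score s(y, theta) = y - theta.
true_theta = 2.0
theta = 0.0       # initial estimate theta^(0)
fisher = 0.0      # accumulated Fisher information (1 per sample here)
for i in range(2000):
    y = random.gauss(true_theta, 1.0)
    fisher += 1.0
    # Sequential update: theta^(i+1) = theta^(i) + P * score,
    # with P = 1 / (accumulated Fisher information).
    theta = theta + (y - theta) / fisher

print(theta)  # close to the true value 2.0
```

The decaying gain 1/i is what lets the estimate of a static parameter settle down; the choices of P catalogued in [482, 563] generalize this idea.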