Appendix 11AMultiuser Detection

Appendix 11A ”Multiuser Detection

Multiuser detection (MUD) is a technique that can be used without any need for standardization or coordination of any type, and simply enhances receiver performance by trying to detect and eliminate all large crosstalking signals. MUD has limited benefit as it typically only works for a few large crosstalkers and has high complexity. Sections 11.3 and 11.4 show DMT systems can migrate naturally to mutually beneficial spectra that simplify the design so much that effectively MUD is not necessary and would add nothing. However, not all DSLs are DMT, thus leaving MUD as the only solution sometimes. Static SM standards attempt to impose some regulation on future DSL systems that do not coordinate in any way. Debate in creation of SM standards, and compromise in setting this debate, led to each type of DSL having considerable crosstalk into and from many other DSL services. Fortunately, multiuser detection can often provide some significant improvement even when there is no coordination and many different types of DSLs share a binder.

The rate regions of Section 11.1 are of less interest in MUD because there is no ability to change the data rate of any particular DSL system. Clearly if the designer of any system did have access to rate regions , then that designer would at least know whether his design was feasible or not. Figure 11.44 shows a revised GDFE structure where for each (downstream or upstream) receiver, the equalizer W accepts one channel input sample stream of N samples in a packet, and outputs up to L signal streams that can each be ( unfortunately ) linear combinations of all U = L users. The GDFE is also not unique in this case, and there are actually an infinite number of possibilities with the same performance. The ensuing ML detector or approximations thereof may be complicated or simplified, depending on the choice. A special case of all users being synchronized and DMT leads to a linear combination of the L user 's tone- n signals only on tone n of the output. The ML detector can be easily implemented with an L -dimensional user-feedback structure augmented by ML detection separately on each tone. Otherwise , any DSL input symbol from any user within the packet time slot can and often will affect all of every other user's samples in the time slot. Nonetheless, given the proliferation of DSL types in use, the latter situation is the most likely at least in existing or older DSL deployments.

Figure 11.44. Use of W, B for MUD with GDFE processing.

graphics/11fig44.gif

Section 11.A.1 discusses some basic prerequisites for the multiuser detection problem, and then defines the problem mathematically. Complexity (except possibly in some cases with DMT as noted above) is then shown to be very large and not feasible. Section 11.A.2 then proceeds with a feasible approximation to ML detection derived by Wu-Fan-Cheong-Choi [15],[16],[17] and abbreviated the soft canceler. Some limitations to this canceller are observed as the number of crosstalkers increases , motivating Zeng's Canceler in Section 11.A.3. All the MUD methods tend to peform increasingly less well as the number of significant crosstalkers exceeds two, motivating simpler and higher-performance methods that require some level of coordination in Sections 11.3 and 11.4.

11.A.1 Basic Receiver Prerequisites for MUD

This subsection introduces some topics that will appear as difficult practical problems for the designer in the use of MUD only. Both synchronization issues and channel-identification issues are discussed. It is important to note up front that these issues ONLY arise when there is NO COORDINATION.

Different DSLs may be asynchronous, even if of the same type. The actual DAC sampling clock may not be synchronized to a central office clock (even if the data are) and stuffing /robbing may be used to align data with such an asynchronous crystal. Fortunately, if a signal is strong enough to be a significant crosstalker, then at the very least its clock sampling phase can be derived by simple phase lock loops . The block model H will need to interpolate for each packet a model of the type:

for each packet, so that H is a time-varying matrix with its values in each successive packet being different if there are clock differences between DSL users' sampling/symbol rates. Assuming that clock differences from packet to packet on any user are relatively small, a frequency domain linear-phase interpolation (as in [1]) or rotor is one mechanism by which to simply interpolate all the individual terms in the model H to the correct values for each block if the relative clock offset is known from basic phase-lock loops (PLLs) for all possible inputs. Two clocks that are relatively close in both frequency and/or phase may interpolate by the same factor until they have accumulated a significant phase difference that can be discerned. This text will not deal further with phase differentiation other than to say this is a problem well understood in the signal processing community, and there are wealth of methods for deriving two clocks that are close in frequency (see [18]). If the two DSLs are different types, clocks will be more easily discerned at widely disparate frequencies (but crosstalk reduction then otherwise becomes more complex).

Crosstalk types can be identified through spectral estimation of the noise/error signals on the line. Both the types of and numbers each of crosstalkers that are significant can be estimated through profiling. Profiling is discussed in [10], [19] at length. Figure 11.45 illustrates the basic concept. A state machine is constructed adaptively for each of the possible line states ever encountered by any DSL receiver. There is always one state that corresponds to no other DSLs. Other states may correspond to situations like 1 ADSL and 1 HDSL of significance to this line/receiver, 2 VDSLs of significance, 1 ISDN and 2 T1's of significance, and so on. Each of these services has a distinct bandwidth use that is defined by earlier spectrum management standards such as T1.417 [5]. Any time a new spectrum is measured for noise (see [1]) that significantly differs by more than some threshold related to desired data rate and margin on the loop of interest, a new state is added to Figure 11.45. The state of the channel is always either the newly defined state or some earlier determined state, depending on essentially the difference between the identified spectrum and that associated with each state. If the difference metric for any existing state is the smallest and below the threshold, the channel is defined to be in that state; then the number, type, and crosstalk coupling for the crosstalkers in that state has been previously defined (or must be determined for a new state, see Section 11.5), and PLLs are used to find timing differences.

PLLs and profiling are used to establish the channel matrix H for each packet. Once H is known, the maximum likelihood estimate of X is simply satisfies.

a very easily written and understood concept that may take an eternity to compute if the packet length is large and the number of inputs crosstalking is significant. Maximum likelihood detection is the best possible MUD (for equally likely inputs) and often is very impressive in terms of its ability to reconstruct a given user in the presence of severe crosstalk. Figure 11.46(a) illustrates an example of a G.pnt home network signal and a VDSL signal that are on adjacent lines in a binder group . Both overlap in the 5 “10 MHz frequency range. If the normal DMT VDSL detector is used, the margin versus length plot in the lower curve of Figure 11.44(b) occurs. Thus, G.pnt dramatically impairs normal VDSL performance, and no effort was made to ensure spectral compatibility, for instance, of these two signals. Thus, they are a good example for multiuser methods. The upper curves in Figure 11.46(b) correspond to no G.pnt crosstalk (obviously an upper bound on performance) and the ML detector. They are indiscernible in terms of VDSL performance, so the ML detector has a very large improvement in range. For more details, see [17].

Figure 11.46(a). VDSL/G.pnt crosstalk problem.

graphics/11fig46a.jpg

Figure 11.46(b). MUD ML detector for VDSL at 26 Mbps.

graphics/11fig46b.gif

11.A.2 Soft Cancellation

Unfortunately, multiuser ML detection becomes extremely complex for most practical situations of interest and provides only a motivation to search for approximate methods that do nearly as well. An aspect of ML detection often not well appreciated is that it minimizes the probability of a packet error as illustrated in Section 11.A.1. In practice, one might argue that minimizing the probability of error separately for each symbol in a packet might be better, and indeed such minimization leads to iterative construction of the best data symbol for each and every user sample within a packet. Iterative decoding recursively constructs the probability distribution of each and every element of the vector of all the users' input symbols in X. The channel matrix H and estimates of the other elements of the vector X are used to update an estimate of the probability distribution of any element in X. The iterative-decoding procedure proceeds sequentially through the elements of X, one by one, each time updating its probability distribution (which amounts to computing its average value and variance about that value in various approximations rather than computing the distribution itself). When each element has been processed once, the cycle continues through each element again. After about 5 “10 passes , the distributions converge to a stationary setting, and then the X estimates themselves (not probabilities) are detected by finding those that maximize each probability distribution. For crosstalking channels, this process will closely approximate ML detection. Figure 11.47 illustrates the basic process.

Figure 11.47. Iterative decoding procedure for soft cancellation MUD.

graphics/11fig47.gif

Iterative decoding for crosstalk and ISI is somewhat more complicated than that for codes (turbo or LDPC) in that the actual probability distribution of symbol values (rather than for bits) is used. The process can be rederived for individual bits as well, but that is a natural extension of the algorithm here, and therefore is not presented here. The basic criterion for each symbol value is

where x _i is a place holding variable for the discrete values in the distribution of the i th symbol x _i in X . Such a detector is called a MAP detector (see [1]) and minimizes probability of symbol error. Realizing that the observed channel output Y is common to the criterion for all possible values of x _i , the criterion can be written instead as

or in its preferred log likelihood form where LR = log( p )

The log-likelihood follows a form that is iteratively computed in interative decoding as

The idea is to get a new estimate of the LR by taking its old estimate and adding an update or "extrinsic" term that is based on the current probability distributions of all the other symbols.

The initializing old value can be the log of a uniform distribution. The extrinsic values can be computed at any iteration by soft cancellation. Soft cancellation observes that the matrix G = WH may be rewritten as G = [g ₁ g ₂ ... g _MN ] so that . If all the other symbols were known exactly, then a "hard" canceler would produce

MAP detection would be simply finding the value of x _i = x _i that minimizes Z _i -g _i x _i ² , essentially a handful of computations , or perhaps a simple slicing. However, the other values of x _j _i are not known, so hard cancellation is not possible ”any kind of ordering and hoping for correction decisions is prone to severe error propagation should any of the previously estimated values be incorrect. Thus, the objective becomes computing the distribution or LR of symbol x _i from the distributions (LRs) of all the other symbols and the old distribution (LR) of this symbol. This can actually be exactly computed, but with enormous effort so soft-cancellation makes an approximation where the soft or average value of each x _j _i is used in the cancellation rather than a poor hard value. Only the LR _new,i ( x _i ) is computed, and no decision is made (yet). Given LR _old,i from the initial uniform distribution assumption, or from a previous interation of the algorithm once started, the decoder then needs to compute only . The decoder computes the average or soft value of any symbol as X _i = E [ x _i ] from the current distribution estimate. Then this value is used to compute the soft value for

and also the variance about the soft value

where is the variance of symbol x _i as computed from its most recent distribution (LR). Consequently the extrinsic LR is then approximated by

Thus, the GDFE structure is used for soft cancellation with the average values (based on iterated distributions) used instead of previous decisions. The new LR is computed by adding the extrinsic to the old LR, and then this new LR can be inverted to the probability distribution, from which the new soft or average value X _i is computed for use in subsequent iterations. This process converges in 5 “10 passes, despite the approximations.

Some Practical Cautions

If the soft value X _i exceeds the maximum value for the constellation significantly, it should be discarded and not used as clearly there is a "bad" value in the distribution that would cause this. Thus, for the pass of all the other symbols that follows, this value is not used in the soft cancellation, and a value of zero contribution to the variance is used. This has been found to speed convergence of the algorithm significantly.

In designs that may not coordinate, it may still be of value to leave some parts of different user's bands nonoverlapping or equivalently to have a symbol or two from one or more of the users in any packet otherwise free of interference. This helps "get the iterative decoder going" in the right direction. Usually the amount necessary is a tiny fraction of the total user bandwidth.

With many, many crosstalkers, iterative decoding and even ML tend to perform poorly. This is not a problem with the soft canceler, but rather the basic information theory fundamentals of the system being such that multiuser detection can not be expected to work. The total transmitted information exceeds the capacity region of the system, or at least is structured so poorly that low-probability-of-error detection of all signals is not possible. Thus, soft cancellation can be expected to work when fundamental information and communication theory limits for the ML detector suggest that near-error-free performance is possible.

Initialization

The very first soft values for cancellation would be based on uniform distributions for the input and therefore somewhat useless. Clearly, more conventional receiver design could be used to obtain estimates for the soft values on the first iteration. This can be achieved by processing the channel output according to

and then limiting the resulting values for any element of X to the largest point in the constellation if it is large and then (and otherwise) using the values as average values for the soft canceler. The initial estimate of can be found from the diagonal elements of s ² WG ^-2 W ^* .

11.A.3 Simplied Cancellation by Grouping of Excess Dimensions

Although a large simplification with respect to ML detection, soft-cancellation's iterative decoding can still be excessively complex for reasonable implementation. Further complexity reduction profits from specific structure of the specific channel that may be obvious in certain situations. Examples include HDSL or SDSL crosstalk into ADSL and also ADSL into VDSL. This concept of this subsection was motivated by Dr. Steve Zeng, but represents a generalization of the concepts in his Stanford dissertation. When viewed in a general perspective, one also finds the work of Pederson and Falconer [20], Pal et al. [21], Im and Werner [22], and Voyan Technology [23] to also be examples of the same general principle, although the general previous view of this work was that each individually was different.

Returning to the GDFE, the settings are a function of ordering of the various elements of the vector X . One possible grouping that can lead to simplification in many cases is to segregate all the values of the channel of interest into a vector X ₁ and all the rest into another vector X ₂ . The basic idea is to try to estimate without iterative methods the net effect of the elements X ₂ in terms of their interference into Y and cancel that interference from Y . Then X ₁ is estimated alone. Clearly, a higher-level iterative decoding could be used once X ₁ is estimated to produce a second better estimate of X ₂ and so on. This kind of method is effective when the second signal has components that are relatively free of crosstalk. Examples occur in DSL when a system such as SDSL or HDSL introduces NEXT that has upperband sidelobes that are captured by the higher sampling rates of ADSL. These bands may be significantly larger than the ADSL signal being transmitted over a long line. Thus, a reasonable estimate of the entire HDSL or SDSL signal may be constructed from just viewing the upper band. If that estimate is accurate, it can be used to reconstruct the NEXT over the entire band of the ADSL signal and eliminate it, thus allowing considerably higher performance.

The channel is modeled specifically as

where H ₁ is the channel from X ₁ to output, and H ₂ is the channel from X ₂ to output.

Figure 11.48 redraws the GDFE with the specific separation into the two components. A GDFE estimates X ₂ and that estimate is then filtered by H ₂ prior to its subtraction from Y before a second GDFE based on the model Y = H ₁ X ₁ + N refines the estimate of X ₁ . If the first signal is a DMT signal, then the second GDFE simplifies to essentially just a "pass-through" if the IFFT and FFT are absorbed into the channel model for H. There are various choices for the feedback section of the first GDFE, and B ₂ = I is the typical choice for this first GDFE and equivalent to those made by Falconer and Pederson, Werner, and Zeng. Zeng investigated for this particular choice to determine whether using decisions, or simply making no decision before passing to the matrix H ₂ , works better. Neither Falconer and Pederson, Pal, nor Werner addressed this concept and in fact did not show that decisions might be made at this point, choosing instead to find a matrix corresponding to W ₂ H ₂ without the possibility of an intermediate decision. Voyan Technology [23] appears to have been so cognizant, and released performance results that are similar to Zeng's without a detailed description of the processing. Zeng found in some cases making decisions works better whereas in other cases it does not. The approach taken here with the GDFE notes that B ₂ can be any lower triangular (monic) matrix and the performance will then be better. However, as the number of users increases beyond two, that first GDFE has increasing problems with error propagation; and even with no decisions, the system often breaks down. Some level of higher-level interative decoding may be used, but generally it also breaks down as does ML detection with large numbers of users.

Figure 11.48. Simplified canceler configuration.

graphics/11fig48.gif

Zeng's final conclusion was that the system works, perhaps with the need of some higher-level iterative decoding as in the soft canceler, as long as fundamental ML would also work. However, this is usually just for one additional crosstalk signal, thus limiting iterative decoding cases. To perform better, some level of coordination is necessary with many users and the gains may be larger.

Previous investigators failed to realize the full potential of the general viewpoint in Figure 11.13. For instance, Figure 11.49 from Zeng [23] illustrates the effect of using part of an HDSL signal out of band to cancel its in-band content. In this case, X ₂ corresponds to signals only in one of the two nulled bands shown and not the entire HDSL signal. Although the method has a very good elimination of signal in that particular band, there are other choices of X ₂ that would lead to yet more complete cancellation of the HDSL signal. Similar observations would occur for all previous work, but nonetheless Figure 11.49 illustrates the concept works very well for one dominant crosstalker. One would find that as the number of crosstalkers increases beyond two, it is exceedingly difficult to recover X ₂ because even ML detection will not do well in this case. Figure 11.50 for this same situation illustrates the range/rate improvement for the situation shown in Figure 11.49. Again, the improvement is very significant and similar improvements have been introduced by Voyan Technology [23].

Figure 11.49. Illustration of reduction of HDSL NEXT into ADSL in the 150 to 200 kHz band that accrues to choosing X ₂ to correspond to signals in the 250 to 350 kHz band. Effect of both decisions and no decisions with B ₂ = 0 is shown.

graphics/11fig49.gif

Figure 11.50. Illustration or rate/range improvement for situation illustrated in Figure 11.24.

graphics/11fig50.gif

Top

Appendix 11AMultiuser Detection