# **Channel Equalization using a Decision Feedback Equalizer**

A Project Report

submitted by

## KARTHIK T J

### (EE03B030)

*in partial fulfilment of the requirements for the award of the Dual degrees of* 

### **BACHELOR OF TECHNOLOGY**

and

### MASTER OF TECHNOLOGY

Under the guidance of

### Dr. Nagendra Krishnapura



## DEPARTMENT OF ELECTRICAL ENGINEERING INDIAN INSTITUTE OF TECHNOLOGY, MADRAS.

May 2008

## THESIS CERTIFICATE

This is to certify that the thesis titled **Channel Equalization using a Decision Feedback Equalizer**, submitted by **Karthik Tripurari Jayaraman**, to the Indian Institute of Technology, Madras, for the award of the degrees of **B.Tech** and **M.Tech**, is a bona fide record of the research work done by him under my supervision. The contents of this thesis, in full or in parts, have not been submitted to any other Institute or University for the award of any degree or diploma.

**Dr. Nagendra Krishnapura**, Research Guide, Assistant Professor, Dept. of Electrical Engineering, IIT-Madras, 600 036

## ACKNOWLEDGEMENTS

I would like to thank Dr. Nagendra Krishnapura, who guided me right through this project and my study of Analog Circuits. He has always had time for my doubts and would entertain discussions whenever I barged into his room, even when they were not academic. He replies to mails within minutes, which I have found useful on many occasions. Apart from being a good teacher and guide, he is also a good friend of mine.

Dr. Shanthi Pavan is a huge source of inspiration for anyone who is interested in Analog IC Design. With his witty one liners and analogies, he has always kept his classes interesting. His course is the only course in IIT Madras which has not put me to sleep. Dr. Shanthi Pavan has directly or indirectly inspired many of us to become Professors at IIT Madras.

I would also like to express my gratitude to Mr. J. Nanda Govind, my project teammate and a good friend, who has helped me sort out many problems during the project.

I extend this opportunity to thank all my wing mates - Maro (The rich bugger), Kmap (my close friend since school days however an absolute stud unlike me), God (author of most of my codes - I tell algo and he codes), Officer (this is one of the infinite names I have given him. A laughing stock.), Gokul (impractical optimist hence pessimist), Dodo (Another stud), Baba (public relations guru), Puneet (inhouse entrepreneur), Muski (My Hindi guru) and Adida (for gult movie discussions); my classmates - Screwny (Source of fun), Das (Analog guru - so he thinks), Muggu (for his antics) and Mani (my co-traveller in many journeys); and all other friends (I am skipping names due to lack of space) and crushes (I am skipping names for safety purposes) who have made my stay in IIT Madras a memorable one.

Last, but certainly not the least, I am eternally indebted to my parents, sister and grandparents who have played a huge role in shaping my life.

## ABSTRACT

10GBASE-KR, a physical layer for backplanes, is of interest in the industry for the transmission of Ethernet traffic over a backplane. The total cost of implementing a reliable communication link is the major issue when developing a physical layer capable of delivering 10 Gbps across a backplane. One industry position is to make the channel as inexpensive as possible and force the circuit designers to develop smart circuits to transmit and receive data over this channel.

This project involves the design of the receiver of a 10Gbps transceiver for such a channel, implemented in TSMC 65nm CMOS technology. The receiver consists of an amplifier system, a decision feedback equalizer (DFE), a deserializer and an LMS engine to control the gain of the amplifier and the coefficients of the DFE.

The **Amplifier** system amplifies the incoming signal to a level large enough for the equalizer to deliver the desired eye-opening. Differential pairs were loaded with resistive, inductive and active inductive loads, and the bandwidth in each case was observed. In one of the architectures, feedback was used to move the poles away from the real axis and increase the bandwidth.

The purpose of a **Decision feedback equalizer** (DFE) is to minimize the error due to Inter Symbol Interference (ISI) and noise. In this design, a four-tap DFE is implemented with resolutions of 5-bits for the first tap and 4-bits for the other taps. Full rate and half rate architectures were explored. In addition to the conventional current summing DFE, a switch-capacitor based DFE was also attempted. It was found that the former produced an eye-opening of 400mV p-p consuming 5mW of power while the latter consumed more power (with ideal switches) for a lesser eye-opening. Hence the DFE based on current-mode summer was chosen over its switch-capacitor counterpart.

It was observed that inductors were not affordable and that either an active inductive load or a feedback technique was necessary to achieve high bandwidth. The active inductive load required a bias point of 1.5V and resistors of 120 kΩ. Hence the feedback technique was implemented to improve the bandwidth. The final amplifier system was a cascade of a gain cell implemented using the feedback technique, a differential pair with resistive load and a couple of VGAs. The entire front-end amplifier system along with the VGAs consumes about 600 $\mu$W of power.

The **Deserializer** is used to split the high speed data stream into eight parallel low speed data streams. A combination of CML at high frequency and CMOS logic at low frequency was used. The deserializer circuit consumes about 4mW of power.

## TABLE OF CONTENTS

- Acknowledgements
- Abstract
- List of Tables
- List of Figures
- Abbreviations
- 1 Introduction
  - 1.1 Channel
  - 1.2 Additive White Gaussian Noise Channel
  - 1.3 Channel with memory
  - 1.4 Pre-cursor and Post-Cursor
  - 1.5 Eye Diagram
  - 1.6 Equalization
  - 1.7 Zero Forcing and MMSE Equalizers
  - 1.8 Decision feedback equalizer
    - 1.8.1 Advantages and disadvantages of DFE
  - 1.9 Block diagram of the 10Gbps receiver
- 2 Front-end Amplifiers
  - 2.1 Introduction
  - 2.2 Differential pair with resistor load
    - 2.2.1 Gain
    - 2.2.2 Bandwidth
    - 2.2.3 Guidelines
  - 2.3 Differential pair with inductive load
  - 2.4 Differential pair with active inductive load
    - 2.4.1 Theory
    - 2.4.2 Amplifier using an active inductive load
    - 2.4.3 Results
  - 2.5 Gain cell with feedback
    - 2.5.1 Theory
    - 2.5.2 The implementation
    - 2.5.3 Results
  - 2.6 Variable Gain Amplifier
  - 2.7 The final architecture
  - 2.8 Results
- 3 Design of decision feedback equalizer
  - 3.1 Basic building blocks
  - 3.2 Latch design
    - 3.2.1 Results
  - 3.3 Design of adder
  - 3.4 Design of DFE using current mode Adder
    - 3.4.1 Sizing the input differential pair and the resistors
    - 3.4.2 Importance of the Phase of the Clock
    - 3.4.3 Designing the feedback taps
    - 3.4.4 Programmable $G_m$
    - 3.4.5 Round Trip Delay of the first feedback tap
    - 3.4.6 Pre-amplifier
    - 3.4.7 Maintaining the common mode voltage
  - 3.5 Half Rate Architecture
    - 3.5.1 Basics of Unfolding
  - 3.6 Switched Capacitor DFE
    - 3.6.1 Disadvantages of Switch Capacitor DFE
  - 3.7 Eye opening
    - 3.7.1 Fullrate Architecture
    - 3.7.2 Half rate Architecture
- 4 Deserializer
  - 4.1 1 to 2 Demultiplexer
  - 4.2 Divide by Two - Circuit
  - 4.3 Differential to single ended converter
- 5 Adaptation Algorithm
- 6 Summary and Future Work
  - 6.1 Future work

## LIST OF TABLES

| Table | Title |
|-------|-------|
| 2.1 | Gain and bandwidth of a differential amplifier loaded with resistors |
| 2.2 | Gain and bandwidth of a differential amplifier loaded with active inductive load |
| 2.3 | Gain and bandwidth of cascaded gain cells implementing feedback |
| 2.4 | Gain and Bandwidth of the frontend Amplifier - Highest Gain setting |
| 2.5 | Gain and Bandwidth of the frontend Amplifier - Lowest Gain setting |
| 3.1 | Sizes of the transistors in the Latch |
| 3.2 | Parameters of the latch |
| 3.3 | Variation in signal current of a DFE |
| 3.4 | Round Trip Delay around the first tap of DFE |
| 3.5 | Eye opening for Fullrate Architecture |
| 3.6 | Eye opening for Half-rate Architecture |

## LIST OF FIGURES

| Figure | Title |
|--------|-------|
| 1.1 | AWGN Channel |
| 1.2 | Channel with memory |
| 1.3 | Response in a band-limited channel |
| 1.4 | Eye with no ISI |
| 1.5 | Eye with ISI |
| 1.6 | Feed Forward Equalizer |
| 1.7 | Front-end of a receiver |
| 2.1 | The basic idea in front-end amplifiers |
| 2.2 | A simple differential pair with resistive load |
| 2.3 | A simple differential pair with inductive load |
| 2.4 | Active inductive load |
| 2.5 | Impedance curve of the active inductive load, neglecting $g_{ds}$ and $C_{gd}$ |
| 2.6 | Impedance curve observed |
| 2.7 | Amplifier with active inductive load |
| 2.8 | Structure of each gain cell |
| 2.9 | Implementation of a gain cell with feedback |
| 2.10 | Variable Gain Amplifier |
| 2.11 | The final architecture for front-end amplifiers |
| 3.1 | The latch used in DFE |
| 3.2 | Current mode adder |
| 3.3 | A Half circuit representation of a switch-cap adder |
| 3.4 | A 1-tap DFE using the current mode adder |
| 3.5 | Linear $G_m$ |
| 3.6 | Effect of phase of sampling clock |
| 3.7 | Programmable feedback tap |
| 3.8 | Effect of round trip delay |
| 3.9 | Circuit to measure the round trip delay of the first tap |
| 3.10 | Speculative DFE |
| 3.11 | Common mode feedback in DFE |
| 3.12 | Amplifier for the Common mode feedback |
| 3.13 | Fullrate DFE |
| 3.14 | Unfolding: Example 1 |
| 3.15 | Unfolding: Example 2 |
| 3.16 | Adder with weighted inputs |
| 3.17 | Eye opening in the full rate DFE |
| 3.18 | The eye-diagram for half rate DFE |
| 4.1 | 1 to 8 Deserializer |
| 4.2 | 1 to 2 Demultiplexer |
| 4.3 | Divide by Two - Circuit |
| 4.4 | Differential to single ended converter |
| 5.1 | A one tap filter |

## ABBREVIATIONS

| Abbreviation | Meaning |
|------|------------------------------------------------------------|
| ISI  | Inter symbol interference                                  |
| DFE  | Decision feedback equalizer                                |
| FFE  | Feed forward equalizer                                     |
| VGA  | Variable gain amplifier                                    |
| GBW  | Gain bandwidth product                                     |
| MMSE | Minimum mean squared error                                 |
| ZF   | Zero forcing                                               |
| LMS  | Least mean square                                          |
| DFF  | D flip-flop                                                |
| SS   | Refers to corner of slow mosfets and high resistance       |
| sf   | Refers to corner of slow mosfets and low resistance        |
| fs   | Refers to corner of fast mosfets and high resistance       |
| ff   | Refers to corner of fast mosfets and low resistance        |
| tt   | Refers to corner of typical mosfets and typical resistance |

## **CHAPTER 1**

## Introduction

## 1.1 Channel

A channel can be described as a path which the signal takes. Ideally, the signal at the receiver is expected to be an exact replica of the signal which is transmitted. If this is the case, then the channel is said to be an ideal channel. But this is never the case in reality. The channel modifies the signal and the received signal is different from the transmitted one.

## 1.2 Additive White Gaussian Noise Channel



Figure 1.1: AWGN Channel

The properties of the Additive White Gaussian Noise (AWGN) channel are illustrated in Fig 1.1. The output of the channel is given by

$$r(t) = ca(t - t_0) + n(t)$$

where a(t) is the transmitted signal, c is a constant gain, $t_0$ is the delay of the channel and n(t) is Gaussian noise. These channels do not exhibit phenomena like fading, interference, dispersion, etc. A good example of this channel is a satellite communication link.
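The channel equation above can be sketched in discrete time as follows. This is a minimal illustration, not part of the original design flow: the function name `awgn_channel` and the values chosen for the gain `c`, the delay and the noise standard deviation `sigma` are assumptions made only for the example.

```python
import random

def awgn_channel(a, c=0.8, delay=2, sigma=0.05, seed=0):
    """Return r[k] = c * a[k - delay] + n[k] for a sampled signal a.

    c, delay, sigma and seed are illustrative values, not taken from the report.
    """
    rng = random.Random(seed)
    r = []
    for k in range(len(a) + delay):
        # Scaled and delayed copy of the transmitted sample (zero before it arrives).
        s = c * a[k - delay] if k >= delay else 0.0
        # Independent zero-mean Gaussian noise sample added to every output sample.
        r.append(s + rng.gauss(0.0, sigma))
    return r

tx = [1.0, -1.0, 1.0, 1.0, -1.0]
rx = awgn_channel(tx)                  # noisy, scaled, delayed copy of tx
rx_clean = awgn_channel(tx, sigma=0.0)  # same channel with the noise switched off
```

With the noise switched off, the output is only a scaled and delayed replica of the input, which is why such a channel causes no inter-symbol interference.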

## 1.3 Channel with memory



Figure 1.2: Channel with memory

A channel with memory is also known as a band-limited channel, i.e., one where the frequency response is zero above a certain frequency (the cutoff frequency). Passing a signal through such a channel removes the frequency components above this cutoff frequency; in addition, the frequency components below the cutoff frequency may also be attenuated by the channel.



Figure 1.3: Response in a band-limited channel

This filtering of the transmitted signal affects the shape of the pulse that arrives at the receiver. Fig 1.3 demonstrates this by showing the effects of filtering a rectangular pulse; not only is the shape of the pulse within the first symbol period changed, but it is also spread out over the subsequent symbol periods. When a message is transmitted through such a channel, the spread pulse of each individual symbol interferes with the following symbols. This phenomenon, where an individual symbol is affected by the neighboring symbols, is known as inter-symbol interference (ISI).
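The spreading of a pulse into later symbol periods can be reproduced with a very simple model. The sketch below assumes a first-order RC low-pass channel, which is only a stand-in for a real backplane; the oversampling ratio `osr` and the time constant `tau_over_T` (in units of the symbol period) are hypothetical values.

```python
import math

def lowpass_pulse_response(osr=16, tau_over_T=0.5, n_symbols=4):
    """Response of a first-order low-pass channel to one rectangular symbol pulse.

    Returns the response sampled once at the end of each symbol period.
    """
    # Per-sample update factor of a step-invariant discrete RC low-pass.
    alpha = 1.0 - math.exp(-1.0 / (osr * tau_over_T))
    y, samples = 0.0, []
    for n in range(osr * n_symbols):
        x = 1.0 if n < osr else 0.0   # the pulse occupies the first symbol period only
        y += alpha * (x - y)          # low-pass filtering
        if (n + 1) % osr == 0:
            samples.append(y)         # value at the end of each symbol period
    return samples

cursors = lowpass_pulse_response()
# cursors[0] is the main sample; cursors[1:] are non-zero residues that
# land on the following symbols and act as ISI.
```

In this model the residue decays by a fixed factor per symbol period, so every transmitted symbol leaks a geometrically decreasing amount into the symbols that follow it.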

## 1.4 Pre-cursor and Post-Cursor

The inter-symbol interference explained above can be classified into post-cursor and pre-cursor ISI. The effect of the current bit on all the following bits is known as post-cursor ISI. In an impulse response, it is the part which follows the main cursor. The ISI caused by future bits on the current bit is known as pre-cursor ISI. In an impulse response, it is the part which precedes the main cursor.

## 1.5 Eye Diagram

An eye-diagram is a figure which indicates the amount of ISI in the received data. It is formed by breaking the signal into segments of length 2T (where T is the symbol period) and superimposing them. The eye-diagram of a signal without ISI looks like the eye in Fig 1.4. As the ISI increases, the eye tends to close and looks more like the eye in Fig 1.5.
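The construction described above can be sketched in a few lines. This is an illustrative example under simple assumptions: an ideal NRZ waveform with levels +1 and -1, an integer number of samples per symbol, and helper names (`nrz_waveform`, `eye_traces`, `vertical_opening`) that are introduced here and do not come from the report.

```python
def nrz_waveform(bits, samples_per_symbol):
    """Ideal NRZ waveform: each symbol is held for samples_per_symbol samples."""
    return [float(b) for b in bits for _ in range(samples_per_symbol)]

def eye_traces(signal, samples_per_symbol):
    """Break the waveform into consecutive segments of length 2T."""
    span = 2 * samples_per_symbol
    return [signal[i:i + span] for i in range(0, len(signal) - span + 1, span)]

def vertical_opening(traces, sample_index):
    """Gap between the lowest 'high' and the highest 'low' at one sampling instant."""
    highs = [t[sample_index] for t in traces if t[sample_index] > 0]
    lows = [t[sample_index] for t in traces if t[sample_index] <= 0]
    return min(highs) - max(lows)

bits = [1, -1, 1, 1, -1, -1, 1, -1]
wave = nrz_waveform(bits, 4)
traces = eye_traces(wave, 4)            # superimposing these traces gives the eye
opening = vertical_opening(traces, 2)   # measured at the centre of the first symbol
```

For the ideal waveform the opening equals the full signal swing. Passing `wave` through a band-limited channel before folding reduces `opening`, which is the eye closure caused by ISI.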

## 1.6 Equalization

The theoretical optimum detector for recovering a data sequence with ISI is the maximum-likelihood (ML) sequence detector. The Viterbi algorithm is used to obtain the most likely sequence of symbols. The computational complexity of the Viterbi algorithm makes it difficult to implement at this speed. In such cases, sub-optimal methods are used to detect the transmitted symbols in the presence of ISI.



Figure 1.5: Eye with ISI



Figure 1.6: Feed Forward Equalizer

The simplest equalization technique that has been used over the years is linear feed-forward equalization. Feed-forward equalization typically involves the use of a linear transversal finite impulse response (FIR) filter as shown in Fig 1.6. The FIR filter consists of adjustable tap coefficients, c0, c1, c2, c3, ... and the output is the sum of the input signal and its scaled and delayed versions. Depending on the values of the tap coefficients, the equalizer can be used to cancel the pre-cursor or the post-cursor or both.

To understand this in the frequency domain, consider the channel response to be C(z). To annul the ISI, the equalizer response E(z) must satisfy

$$E(z) = \frac{1}{C(z)}$$

E(z) is then truncated to a finite number of terms and the corresponding FIR filter is implemented.
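As a worked illustration of this truncation, assume a hypothetical channel with a single post-cursor, $C(z) = 1 + a z^{-1}$. Its inverse expands as $1 - a z^{-1} + a^2 z^{-2} - \dots$, so keeping the first few terms gives the FIR taps. The channel, the value of `a` and the function names below are assumptions made for the example.

```python
def zf_taps(a, n_taps):
    """First n_taps terms of 1/C(z) for the assumed channel C(z) = 1 + a*z^-1."""
    return [(-a) ** k for k in range(n_taps)]

def convolve(x, h):
    """Plain discrete convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

channel = [1.0, 0.5]                  # main cursor and one post-cursor
taps = zf_taps(0.5, 4)                # truncated inverse with four taps
combined = convolve(channel, taps)    # response of channel followed by equalizer
```

The combined response is a unit main cursor followed by zeros, except for one small residual term at the end that is left over by the truncation. Adding taps pushes this residue further out and makes it smaller.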

## 1.7 Zero Forcing and MMSE Equalizers

The method explained above for obtaining the tap coefficients of an FFE is called zero forcing (ZF) equalization. While this might sound like an ideal solution, ZF equalization implies gain in the frequency range where the channel response is small. Any additive noise in that frequency range is also amplified. So in noisy channels with deep spectral nulls, the ZF-FFE can result in very poor SNR at the output. An alternative is the minimum-mean-square-error (MMSE) criterion, which relaxes the zero-ISI criterion and instead minimizes the combined power of the ISI components and the noise components.
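The noise amplification of a zero forcing equalizer can be quantified with a small sketch. It assumes white noise at the equalizer input, for which an FIR filter scales the noise variance by the sum of the squared taps, and a hypothetical channel $C(z) = 1 + a z^{-1}$ whose response at half the symbol rate is $1 - a$ and therefore dips deeper as `a` approaches 1. The names and values are illustrative only.

```python
def zf_taps(a, n_taps):
    """Truncated inverse of the assumed channel C(z) = 1 + a*z^-1."""
    return [(-a) ** k for k in range(n_taps)]

def noise_gain(taps):
    """Factor by which an FIR filter scales the variance of white input noise."""
    return sum(c * c for c in taps)

# Noise gain of the ZF equalizer for a mild, a moderate and a deep spectral dip.
gains = {a: noise_gain(zf_taps(a, 64)) for a in (0.2, 0.5, 0.9)}
for a, g in gains.items():
    print(f"a = {a}: noise variance scaled by {g:.2f}")
```

The gain stays close to one for the mild channel and grows to several times the input noise variance for the channel with the deep dip, which is the situation in which the MMSE criterion is preferred.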

## 1.8 Decision feedback equalizer



To recover the actual data, the unwanted ISI component is subtracted from the received waveform.



Decision-feedback equalization was first proposed by M. E. Austin in 1967, who used a decision-theory approach to solve the problem of digital communication over known dispersive channels. This work was the first to describe the use of past decisions to correct the current symbol and thereby cancel post-cursor ISI. With advances in integrated circuit technology, it became possible to implement the decision-feedback equalization functionality in silicon. The DFE was used extensively to combat ISI in disk-drive read channels. As data rates increased, DFEs were adopted in multi-Gb/s copper-based transmission systems to cancel the ISI induced by channel loss.

The decision feedback equalizer is a symbol-spaced FIR filter with tap coefficients set to cancel post-cursor ISI. By definition, decision-feedback equalization can only remove post-cursor ISI, i.e., ISI caused by previous symbols. Therefore, a practical equalizer usually consists of a feed-forward filter that can remove the pre-cursor ISI and provide some eye-opening to the DFE. The decision element makes a decision at each symbol and sends this symbol information to the feedback filter. The feedback filter removes the post-cursor ISI, without enhancing noise, to completely open the eye.
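The feedback operation described above can be sketched behaviourally. The example assumes a noiseless channel with a unit main cursor and two hypothetical post-cursors, binary symbols of +1 and -1, and feedback taps set exactly equal to the post-cursors; a real receiver would have to find these taps by adaptation. All names and values are illustrative.

```python
def channel(bits, post_cursors):
    """Noiseless channel: unit main cursor followed by the given post-cursors."""
    h = [1.0] + list(post_cursors)
    return [sum(h[j] * bits[k - j] for j in range(len(h)) if k - j >= 0)
            for k in range(len(bits))]

def dfe(received, taps):
    """Slice each sample after subtracting the ISI predicted from past decisions."""
    decisions, slicer_inputs = [], []
    for r in received:
        # taps[i] weights the decision made i + 1 symbols ago.
        isi = sum(t * d for t, d in zip(taps, reversed(decisions)))
        v = r - isi
        slicer_inputs.append(v)
        decisions.append(1 if v >= 0 else -1)
    return decisions, slicer_inputs

bits = [1, -1, -1, 1, 1, 1, -1, 1]
post = [0.5, 0.25]
rx = channel(bits, post)
decisions, slicer_inputs = dfe(rx, post)
worst_unequalized = min(abs(v) for v in rx)           # closest approach to the threshold
worst_equalized = min(abs(v) for v in slicer_inputs)
```

In this idealized case the slicer input returns to the full +1 or -1 level for every symbol, whereas the unequalized samples come much closer to the decision threshold. Because the correction is built from decided symbols rather than from the noisy input, it adds no noise of its own.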

The tap coefficients of the feedback filter are selected to optimize a desired performance measure. The MMSE class of algorithms (the LMS algorithm in particular) is often used to obtain the DFE tap coefficients. The adaptation engine monitors the channel and adapts the tap coefficients accordingly. Furthermore, the same equalizer circuit can be used for a variety of channels, with the adaptation engine deciding the optimum tap coefficients for each case.

### 1.8.1 Advantages and disadvantages of DFE

The main advantage of a decision feedback equalizer is the immunity of its feedback path to noise. When the magnitude of the noise at the input of the slicer is smaller than the magnitude of the signal (which is almost always true), the sign of the input does not change and the slicer detects the bit correctly. The output of this non-linear element is either +1 or -1, which implies that the noise present at the instant of sampling has no effect on the value that is fed back. However, when the noise amplitude is not negligible, there is a possibility of error.

The existence of the DFE relaxes the requirements on the feed-forward equalizer. Without a DFE, the feed-forward filter taps are set to cancel both pre- and post-cursor ISI, and the FFE requires more taps to cancel the same number of post-cursors. With a DFE, the number of feed-forward taps can be decreased, or more pre-cursor ISI can be cancelled with the same number of taps.

Error propagation is the main problem with decision feedback equalizers. A single error made by the slicer causes wrong post-cursor values to be subtracted from the subsequent bits. This may cause further errors, so the errors propagate and the BER increases. Another disadvantage of the DFE is that pre-cursor cancellation is not feasible.

## 1.9 Block diagram of the 10Gbps receiver

This design is based on Decision Feedback Equalization (DFE). The tap coefficients of the DFE are set according to the MMSE criterion by an adaptive algorithm (LMS). The DFE is driven by an amplifier system with a variable gain. The gain of the amplifier is also set by the adaptive algorithm. Once the data stream is equalized, the 10Gbps stream is split into 8 parallel streams of 1.25Gbps each.
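The final splitting step can be pictured with a short behavioural sketch. It assumes a simple round-robin assignment of bits to lanes; the lane ordering produced by the actual circuit may differ, and the function name is introduced here only for illustration.

```python
def deserialize(bits, n_lanes=8):
    """Distribute a serial bit stream round-robin over n_lanes parallel streams."""
    return [bits[i::n_lanes] for i in range(n_lanes)]

serial = list(range(16))          # stand-in for 16 consecutive received bits
lanes = deserialize(serial)       # lane i carries bits i, i + 8, i + 16, ...
lane_rate_gbps = 10.0 / 8         # each lane runs at one eighth of the line rate
```

Each lane sees only every eighth bit, so the logic that follows can run at 1.25Gbps instead of 10Gbps.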



Figure 1.7: Front-end of a receiver

## **CHAPTER 2**

## **Front-end Amplifiers**



Figure 2.1: The basic idea in front-end amplifiers

## 2.1 Introduction

Front-end amplifiers are high-bandwidth, open-loop cascaded amplifiers with variable gain. The amplifier is expected not to introduce any non-linearity or ISI. Hence, as a rule of thumb, the 3-dB bandwidth of the cascaded system is designed to be around 10GHz and the third harmonic to be 30dB below the signal level. In addition, the amplifier must have a variable gain to compensate for the gain variations across corners and temperature and for the amplitude variations in the signal. Hence the amplifier is implemented in two parts.

- The component with constant gain.
- The component with variable gain.

The initial part of the chapter discusses the implementation of the constant gain component. The variable gain component is implemented by making minor modifications to the constant gain part and is explained in the latter part of this chapter. The final architecture of the front-end amplifiers is described at the end of the chapter.

## 2.2 Differential pair with resistor load



Figure 2.2: A simple differential pair with resistive load.

### 2.2.1 Gain

The gain of this circuit can be written as

$$A_{dc} = g_m R = \frac{2IR}{V_{GS} - V_T}$$

It can be observed that the gain is directly proportional to the voltage drop $IR$ across the resistor and inversely proportional to $(V_{GS} - V_T)$. When N such amplifiers are cascaded, the dc-gain of the system is $A_{dc}^{N}$.
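A short numerical sketch of the gain expression is given below. It assumes the square-law relation used above, takes $I$ as the bias current of each transistor, and uses hypothetical values (50µA, 6kΩ and an overdrive of 0.2V) that are not the design values of this report.

```python
import math

def dc_gain(i_drain, r_load, v_ov):
    """A_dc = g_m * R with the square-law g_m = 2*I / (V_GS - V_T)."""
    gm = 2.0 * i_drain / v_ov
    return gm * r_load

def cascade_gain_db(a_cell, n_stages):
    """dc-gain of n identical cascaded cells, expressed in dB."""
    return 20.0 * math.log10(a_cell ** n_stages)

a_cell = dc_gain(i_drain=50e-6, r_load=6e3, v_ov=0.2)   # IR drop of 0.3V over 0.2V overdrive
a_four_db = cascade_gain_db(a_cell, 4)
```

Only the ratio of the IR drop to the overdrive voltage matters for the gain of one cell, which is why both quantities are constrained by the available supply headroom.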

### 2.2.2 Bandwidth

The 3-dB bandwidth is given by

$$f_{3-db} = \frac{1}{2\pi RC}$$

where C is the sum of all parasitic capacitances at the output of the amplifier. This includes the input capacitance of the next stage, the wiring capacitance and the output capacitance of the amplifier itself. Upon cascading the amplifiers, the bandwidth is expected to go down. Assuming the amplifiers to be first order systems, the resulting 3-dB bandwidth can be calculated to be

$$\omega_{3-db,N} = \omega_{3-db,cell}\sqrt{2^{1/N} - 1}$$
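The bandwidth shrinkage formula can be evaluated directly. The sketch below assumes identical, non-interacting first-order cells, as the text does; the 10GHz target in the example is taken from the rule of thumb in the introduction of this chapter, and the function names are illustrative.

```python
import math

def cascade_bandwidth(f_cell, n_stages):
    """3-dB bandwidth of n identical first-order stages in cascade."""
    return f_cell * math.sqrt(2.0 ** (1.0 / n_stages) - 1.0)

def cell_bandwidth_needed(f_total, n_stages):
    """Per-cell bandwidth required to reach f_total after cascading."""
    return f_total / math.sqrt(2.0 ** (1.0 / n_stages) - 1.0)

shrink_four = cascade_bandwidth(1.0, 4)          # fraction of the cell bandwidth left
needed_ghz = cell_bandwidth_needed(10.0, 4)      # per-cell bandwidth for 10GHz overall
```

For four stages the cascade keeps well under half of the single-cell bandwidth, so each cell has to be designed for more than twice the overall target.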

### 2.2.3 Guidelines

#### Increasing the gain bandwidth product

The gain bandwidth product can be increased by increasing the current density in the differential pair. This can be done by

- Increasing the current keeping the sizes constant. This increases  $g_m$  keeping the capacitance constant. The value of dc-gain  $g_m R$  increases while the pole remains at  $\frac{1}{2\pi RC}$ .
- Decreasing the size of the differential pair (say N-times). This decreases  $g_m$  by  $\sqrt{N}$  times and capacitance by N-times. Similarly the dc-gain is reduced by  $\sqrt{N}$  times and the new bandwidth is N-times the old one.
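The two bullets above can be checked with a short scaling argument. It assumes the long-channel square-law model, which is only approximate in 65nm, and a load capacitance that scales with the width of the differential pair. The symbols $\mu C_{ox}$, $W/L$ and the current scale factor $k$ are introduced here for illustration.

$$g_m = \sqrt{2\mu C_{ox}\frac{W}{L}I}, \qquad GBW = A_{dc}\, f_{3-db} = g_m R \cdot \frac{1}{2\pi RC} = \frac{g_m}{2\pi C}$$

Increasing the current $k$ times at constant size:

$$g_m \rightarrow \sqrt{k}\, g_m, \quad C \rightarrow C \quad \Rightarrow \quad GBW \rightarrow \sqrt{k}\, GBW$$

Decreasing the size $N$ times at constant current:

$$g_m \rightarrow \frac{g_m}{\sqrt{N}}, \quad C \rightarrow \frac{C}{N} \quad \Rightarrow \quad A_{dc} \rightarrow \frac{A_{dc}}{\sqrt{N}}, \quad f_{3-db} \rightarrow N f_{3-db}, \quad GBW \rightarrow \sqrt{N}\, GBW$$

In both cases the gain bandwidth product grows as the square root of the increase in current density.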

#### Increasing the gain at a given GB product

The gain of the differential pair can be increased at given GB product by,

- Increasing the resistance keeping the other parameters unaltered.
- Increasing the current without affecting the current density.

The bandwidth can be increased by doing the opposite.

#### Operating point

The operating point of this circuit was found to vary widely across process corners and temperature. One of the reasons was the $\pm 30\%$ variation in the value of the resistance. In a cascaded amplifier system, the output common mode voltage of one stage becomes the input common mode voltage of the following stage. When the resistance is high, the common mode voltage reduces and the transistors of the following stages do not operate in the correct region. Hence, the tail current used in this circuit is made to vary according to the resistance. This biasing shall be referred to as "Constant-IR" biasing. The constant-IR biasing fixes the quiescent output common mode voltage at 650mV. The current consumed by each amplifier varies across corners from $50\mu A$ to $100\mu A$. As explained previously, increasing the quiescent drop across the resistor results in higher gain. However, it was observed that a common mode voltage of less than 650mV leaves very little headroom for the tail current source. Hence the value was fixed at 650mV.

#### Results

A cascade of four amplifiers was tested. The results are tabulated in Table 2.1.

| Corner | Temperature | Gain | Bandwidth |
|--------|-------------|------|-----------|
| SS     | 0           | 35.2 | 7.5       |
| sf     | 0           | 27.5 | 12.4      |
| fs     | 0           | 34.3 | 8.3       |
| ff     | 0           | 27.2 | 13.5      |
| tt     | 0           | 31.8 | 9.7       |
| SS     | 100         | 32.5 | 7.7       |
| sf     | 100         | 23.9 | 12        |
| fs     | 100         | 32   | 8.6       |
| ff     | 100         | 24.5 | 13.4      |
| tt     | 100         | 29.2 | 10        |

Table 2.1: Gain and bandwidth of a differential amplifier loaded with resistors

This is the simplest of all the designs. Its gain and bandwidth vary considerably across corners and temperature. Since the bandwidth of the system is bound to become lower after layout, other techniques were explored to increase the bandwidth.

## 2.3 Differential pair with inductive load

The issue with loading the differential pair with resistors is that the bandwidth is not very high. One way of increasing the bandwidth of such circuits is to place an inductor in series with each load resistor. Increasing the inductance increases the bandwidth, and for a particular inductance the response becomes maximally flat. If the inductance is increased further, the response starts to peak. So, the inductance can be chosen to be around the value where the response becomes maximally flat. The value of the inductors for a near maximally flat response was found to be around 5nH. This makes the scheme impractical, since each amplifier would require two inductors and a cascade of three amplifiers would occupy a huge area. Moreover the presence of other inductors for amplifiers.
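The behaviour described above can be reproduced with a normalized model of the load. The sketch assumes an ideal inductor L in series with R, the pair shunted by the load capacitance C, and works with the dimensionless quantities x = ωRC and k = L/(R²C). These symbols and the sample values of k are introduced here for illustration and are not taken from the design.

```python
import math

def shunt_peaked_gain(x, k):
    """|Z(jw)|/R for (R in series with L) in parallel with C.

    x = w*R*C is the normalized frequency, k = L/(R^2 * C) the normalized inductance.
    """
    num = 1.0 + (k * x) ** 2
    den = (1.0 - k * x * x) ** 2 + x * x
    return math.sqrt(num / den)

def bandwidth_ratio(k, step=1e-4):
    """-3 dB frequency relative to the plain RC bandwidth (which is x = 1)."""
    x = 0.0
    while shunt_peaked_gain(x, k) > 1.0 / math.sqrt(2.0):
        x += step
    return x

no_inductor = bandwidth_ratio(0.0)       # plain resistive load
near_flat = bandwidth_ratio(0.414)       # close to the maximally flat condition
peak_large_k = max(shunt_peaked_gain(i * 0.01, 0.7) for i in range(300))
```

In this idealized model the bandwidth improves by roughly 70% near the maximally flat point, while a larger inductance pushes the normalized gain above one, which is the peaking mentioned in the text. The absolute inductance follows from L = kR²C once R and C are known.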



Figure 2.3: A simple differential pair with inductive load.

## 2.4 Differential pair with active inductive load

### 2.4.1 Theory



Figure 2.4: Active inductive load

Neglecting  $g_{ds}$  and  $C_{gd}$  the looking in impedance at the source of the transistor in the Fig 2.4 is

$$Z_{in} = \frac{1 + sCR}{g_m + sC}$$



Figure 2.5: Impedance curve of the active inductive load, neglecting  $g_{ds}$  and  $C_{gd}$ 
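The expression for $Z_{in}$ can be evaluated numerically to see the inductive behaviour. The following is a minimal sketch: the values of `gm` and `c` are hypothetical, and `r` uses the 120k order of magnitude quoted for the gate resistor later in this chapter.

```python
import math

def active_load_impedance(freq_hz, gm, c, r):
    """Z_in = (1 + sCR) / (g_m + sC) at s = j*2*pi*f, neglecting g_ds and C_gd."""
    s = 1j * 2 * math.pi * freq_hz
    return (1 + s * c * r) / (gm + s * c)

# ASSUMPTION: gm and c are hypothetical values for illustration only.
gm = 1e-3   # siemens (hypothetical)
c = 1e-15   # farads (hypothetical)
r = 120e3   # ohms (order of magnitude quoted in the text)

frequencies_hz = [0.0, 1e9, 10e9, 100e9]
magnitudes = [abs(active_load_impedance(f, gm, c, r)) for f in frequencies_hz]
for f, mag in zip(frequencies_hz, magnitudes):
    print(f"{f / 1e9:6.1f} GHz: |Z_in| = {mag:10.1f} ohm")
```

The magnitude starts at $1/g_m$ at dc and rises towards R at high frequency whenever $R > 1/g_m$, which is the inductive behaviour the load relies on.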

Though the assumptions are highly inaccurate in 65nm technology and at a frequency of 10GHz, it has been observed that the circuit in Fig 2.4 shows peaking in the impedance versus frequency curve as shown in Fig 2.6.



Figure 2.6: Impedance curve observed

### 2.4.2 Amplifier using an active inductive load



Figure 2.7: Amplifier with active inductive load

Fig 2.7 shows the implementation of the amplifier with an active inductive load. The PMOS transistors in parallel with the NMOS transistors of the load are used as current bleeders. In their absence, all the current flows through the NMOS transistors of the load, which results in

- high  $g_m$  of the load NMOS.
- low common mode voltage.

It can be shown that dc-gain of this amplifier is roughly  $\frac{g_{m,in}}{g_{m,load}}$ . High value of  $g_{m,load}$  reduces the dc-gain. The low common mode voltage causes the following stages to operate sub-optimally.

The parallel current source was designed to bleed off roughly 50% of the current. A higher percentage would increase the dc-gain further but at the expense of more prominent mismatch effects between the NMOS transistors. The tail current used for this circuit was constant  $250\mu A$ . The bias voltage at the gate of NMOS is 1.5V.

#### 2.4.3 Results

The bandwidth and gain of a cascade of four stages of the amplifier described above are tabulated in Table 2.2.

| Corner | Temperature | Gain  | Bandwidth |
|--------|-------------|-------|-----------|
| SS     | 0           | 29.7  | 10        |
| sf     | 0           | 29.7  | 9.1       |
| fs     | 0           | 25    | 11.5      |
| ff     | 0           | 25.8  | 13        |
| tt     | 0           | 27.8  | 12        |
| SS     | 100         | 30.8  | 8.8       |
| sf     | 100         | 30.87 | 7.7       |
| fs     | 100         | 25.8  | 13        |
| ff     | 100         | 25    | 10.5      |
| tt     | 100         | 28.2  | 10        |

 Table 2.2: Gain and bandwidth of a differential amplifier loaded with active inductive load

This idea has been used for front-end amplification by Krishnapura *et al.* (2005). The disadvantages of this scheme are that it requires additional circuitry to hold the gate of the load NMOS at a voltage higher than  $V_{dd}$, and that the resistor required at the gate of the NMOS is of the order of 120k. The bandwidth is observed to be lower in the sf corner because the resistance is lower. Upon increasing the resistance, too much peaking was observed in the high resistance corners.

A bandwidth increase of the order of 10% has been observed. The variation in common mode is not negligible across process corners and temperature. In certain corners, the tail current sources get crushed and the tail current deviates a lot from the designed value. The variation in bandwidth is probably attributable to this. However, the variation in dc-gain across process corners and temperature has been found to be smaller, since the gain is a ratio of transconductances.

## 2.5 Gain cell with feedback

### 2.5.1 Theory

This circuit is an inspiration from Galal and Razavi (2002). The authors claim that the bandwidth upon cascading many gain cells goes down as

$$\omega_{3-db,tot} = \omega_0 \sqrt[4]{2^{1/N} - 1}$$

if the gain cell is a second order one as compared to  $\omega_0 \sqrt{2^{1/N} - 1}$  for a first order cell. Hence the bandwidth required of each gain cell is less stringent.
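The two shrinkage expressions can be compared numerically. The sketch below simply evaluates them; the function name is illustrative, and the second order case assumes a maximally flat (Butterworth) cell.

```python
def cascade_bandwidth_factor(n_stages, order):
    """Ratio of the cascade -3 dB bandwidth to omega_0 for N identical cells."""
    if order == 1:
        exponent = 0.5    # first order cell: square root
    elif order == 2:
        exponent = 0.25   # second order (maximally flat) cell: fourth root
    else:
        raise ValueError("only first and second order cells are covered")
    return (2 ** (1 / n_stages) - 1) ** exponent

for n in (2, 3, 4):
    first = cascade_bandwidth_factor(n, 1)
    second = cascade_bandwidth_factor(n, 2)
    print(f"N={n}: first order {first:.3f}, second order {second:.3f}")
```

For a cascade of three cells the factor is about 0.71 for second order cells against about 0.51 for first order cells.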



Figure 2.8: Structure of each gain cell

The structure of the gain cell is shown in the Fig 2.8. For simplicity let us consider capacitances C in parallel with each of the resistors R. The overall transfer function of this circuit can be reduced to,

$$H(s) = \frac{G_m^2 R^2}{(1 + sRC)^2 + G_m G_{mf} R^2}$$

$$A_{dc} = \frac{G_m^2 R^2}{(1 + G_m G_{mf} R^2)}$$

It can be observed that by varying  $G_{mf}$  the poles can be made complex and approximately near butterworth poles for a second order thereby making the response maximally flat.
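The condition for Butterworth poles follows from the denominator of H(s): the poles lie at $s = (-1 \pm j\sqrt{G_m G_{mf}}R)/RC$, so they sit at 45 degrees when $G_m G_{mf} R^2 = 1$. The sketch below checks this numerically; all component values are hypothetical and chosen only for illustration.

```python
import math

def gain_cell_response(freq_hz, gm, gmf, r, c):
    """H(s) = Gm^2 R^2 / ((1 + sRC)^2 + Gm Gmf R^2) at s = j*2*pi*f."""
    s = 1j * 2 * math.pi * freq_hz
    return (gm * r) ** 2 / ((1 + s * r * c) ** 2 + gm * gmf * r * r)

def butterworth_gmf(gm, r):
    """Feedback transconductance that makes Gm*Gmf*R^2 equal to 1."""
    return 1.0 / (gm * r * r)

# ASSUMPTION: hypothetical component values, not taken from the design.
gm, r, c = 1e-3, 2e3, 10e-15
gmf = butterworth_gmf(gm, r)

dc_gain = abs(gain_cell_response(0.0, gm, gmf, r, c))
f_3db_hz = math.sqrt(2) / (2 * math.pi * r * c)  # natural frequency sqrt(2)/(RC)
ratio = abs(gain_cell_response(f_3db_hz, gm, gmf, r, c)) / dc_gain
print(f"dc gain {dc_gain:.2f}, |H| at f_3dB relative to dc {ratio:.4f}")
```

With this choice the dc gain is $G_m^2R^2/2$ and the response is 3 dB down at $\omega = \sqrt{2}/RC$.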

For the reasons of convenience, this architecture will be referred to as second order gain cell or simply gain cell in this report.

#### 2.5.2 The implementation

The common modes at all stages of the circuit are maintained at 650mV using the constant-IR biasing. The circuit diagram is shown in Fig 2.9.

Each gain cell consumes a current of  $125\mu A$  to  $300\mu A$  depending on the corner.

#### 2.5.3 Results

The bandwidth and gain of a cascade of three stages of the amplifier described above are tabulated in Table 2.3.

| Corner | Temperature | Gain | Bandwidth |
|--------|-------------|------|-----------|
| SS     | 0           | 35.2 | 8.8       |
| sf     | 0           | 27   | 9.7       |
| fs     | 0           | 34   | 13.5      |
| ff     | 0           | 28.8 | 18.6      |
| tt     | 0           | 32.8 | 13.6      |
| SS     | 100         | 36.4 | 8.9       |
| sf     | 100         | 29.4 | 10.6      |
| fs     | 100         | 34   | 10.6      |
| ff     | 100         | 29   | 15        |
| tt     | 100         | 33   | 11        |

Table 2.3: Gain and bandwidth of cascaded gain cells implementing feedback



Figure 2.9: Implementation of a gain cell with feedback

### 2.6 Variable Gain Amplifier

The idea behind the variable gain amplifier circuit shown in Fig 2.10 is to vary the resistance in the differential path without affecting the quiescent operating conditions. The gain of the amplifier is maximum when all the switches are open and minimum when the lowermost switch is closed. The bandwidth varies inversely with gain.

In this design, the VGA was designed with the amplifier in Fig 2.2 as the starting point. The VGA drives the DFE, whose input common mode is 800mV.



Figure 2.10: Variable Gain Amplifier

Hence the value of the resistance was reduced such that the common mode is 800mV. This resistor is then divided into nine parts and switches are connected as shown in Fig 2.10. The switches are implemented using PMOS transistors since they act as better switches at voltage levels closer to  $V_{dd}$. These switches add capacitance at every node, degrading the bandwidth. To overcome this effect, the tail current was increased and the resistance was decreased, which increases the bandwidth while keeping the common mode constant.

The lowest gain achievable is limited by the number of switches that can be used. A smaller gain requires more switches, and as the number of switches increases, the capacitance at each node increases. This causes the bandwidth to deteriorate drastically. To achieve a larger gain range, a cascade of two such amplifiers was used.

A difference of about 9dB between the maximum and minimum gains was observed over all corners. However, the absolute gain range varied from [-6.5dB, +2dB] to [-4dB, +5dB].

## 2.7 The final architecture

The final architecture could have a combination of the amplifiers described above, depending on the range of gain required for the system. For more range in gain, more than one VGA can be cascaded. In this design, it is a cascade of two gain cells, a simple differential pair with resistive load and a couple of VGAs.



Figure 2.11: The final architecture for front-end amplifiers

# 2.8 Results

The power consumed by the amplifier system is nominally  $600\mu W$ . A variation of  $\pm 200\mu W$  was observed across corners.

| Corner | Temperature | Gain (dB) | Bandwidth (GHz) |
|--------|-------------|----------|-----------------|
| sf     | 0           | 22.7     | 12.7            |
| SS     | 0           | 33.7     | 9.7             |
| fs     | 0           | 32.3     | 13              |
| ff     | 0           | 24.5     | 18.6            |
| tt     | 0           | 29.6     | 14.6            |
| SS     | 100         | 32.7     | 9.6             |
| sf     | 100         | 22.6     | 11.7            |
| fs     | 100         | 32.1     | 11              |
| ff     | 100         | 23.4     | 15.6            |
| tt     | 100         | 28.4     | 12.4            |

 Table 2.4: Gain and Bandwidth of the frontend Amplifier - Highest Gain setting

Table 2.5: Gain and Bandwidth of the frontend Amplifier - Lowest Gain setting

| Corner | Temperature | Gain (dB) | Bandwidth (GHz) |
|--------|-------------|----------|-----------------|
| sf     | 0           | 8.7      | 15              |
| SS     | 0           | 17.5     | 15              |
| fs     | 0           | 13.6     | 17.6            |
| ff     | 0           | 7.3      | 23              |
| tt     | 0           | 12.4     | 17.8            |
| SS     | 100         | 16.5     | 11.6            |
| sf     | 100         | 9        | 15              |
| fs     | 100         | 14.4     | 14              |
| ff     | 100         | 7.3      | 20              |
| tt     | 100         | 11.8     | 15              |

## **CHAPTER 3**

# Design of decision feedback equalizer



# 3.1 Basic building blocks

The important blocks in a DFE are

- Slicer.
- Delay Element.
- Adder (or Subtracter).
- Scaling element.

The function of the slicer is performed by a latch and hence no separate slicer is necessary.

### 3.2 Latch design

In high speed circuits, it is customary to use CML latches instead of the traditional  $C^2MOS$  or TSPC latches. The advantages of the CML latch are that the voltage swing is not required to be rail to rail and the input capacitance is considerably lower. In terms of power,  $C^2MOS$  latches consume  $C_LV_{dd}^2f$, while CML latches consume  $V_{dd}I_{tail}$, which is frequency independent. At frequencies as high as those in this design,  $C^2MOS$  logic does not offer any great advantage in terms of power. These factors make CML the preferred choice.

The circuit of a CML latch is shown in Fig 3.1. The functioning of this latch can be divided into two phases. When the clock is high, the regenerating differential pair is turned off and all the current is steered through the sampling differential pair. The input is sampled in this phase. When the clock goes from high to low, all the current is steered through the regenerative branch and, because of the positive feedback, the output amplitude grows (till all the current flows in M2a/M2b).

The common mode level chosen for data is 800mV and the common mode level of clock is 600mV. Each latch uses about  $250\mu A$  of current and can drive 5fF of capacitance.

| Component  | Value                             |
|------------|-----------------------------------|
| M1, M2, M3 | $10\left(\frac{240n}{60n}\right)$ |
| R          | 2k                                |
| M0         | $20\left(\frac{240n}{60n}\right)$ |
| M4,M5,M6   | $2\left(\frac{240n}{60n}\right)$  |

Table 3.1: Sizes of the transistors in the Latch

### Important parameters of a latch

#### Setup time

The input to the latch is expected to stabilize some time before the clock edge. The minimum time before which the input has to stabilize is known as the setup time. The setup time is dependent on the  $g_m$  of sampling differential pair and the capacitance at the output node.



Figure 3.1: The latch used in DFE

#### Hold time

The hold time is the minimum time for which the input has to be held unaltered after the clock edge. It depends on the time required for the current to completely switch from the sampling branch to the regeneration branch.

#### Clock to Q delay

The clock to Q delay is the time difference between the instant the clock crosses zero and the instant the data crosses zero. This is a function of the sizes of the sampling differential pair because the output can switch direction only during the sampling phase and not during the regeneration phase.

#### Sensitivity

The smallest amplitude of the input which the latch regenerates to full logic levels without an error is called the sensitivity. This is directly dependent on the size of the sampling differential pair.

#### 3.2.1 Results

The latch was tested for performance using a sinusoidal clock of period 85ps and 400mV amplitude. The sensitivity across corners and temperature was observed to be around 40 mV and the setup-time was observed to be negligible. Other parameters of the latch are tabulated in the table 3.2.

| Corner | Temperature | Clk-Q delay | Hold Time |
|--------|-------------|-------------|-----------|
| SS     | 0           | 17.6p       | 16p       |
| sf     | 0           | 11.8p       | 20p       |
| fs     | 0           | 17p         | 12p       |
| ff     | 0           | 10.9p       | 13p       |
| tt     | 0           | 14.5p       | 14p       |
| SS     | 100         | 18.7p       | 19p       |
| sf     | 100         | 12p         | 20p       |
| fs     | 100         | 17p         | 14p       |
| ff     | 100         | 12p         | 15p       |
| tt     | 100         | 15p         | 16p       |

Table 3.2: Parameters of the latch.

### **3.3 Design of adder**

The addition operation can be implemented in a couple of ways. One idea is to convert the voltages into currents using transconductors. Currents can be summed by connecting all the branches. The summed current is passed through a resistor to convert it to voltage.

The alternative method is to use a switched capacitor technique where voltages are converted to charges and the charges are added/subtracted. The operation of this scheme is explained with the help of Figure 3.3. For simplicity, only the half circuit of the differential structure is shown in this circuit.



Figure 3.2: Current mode adder

Figure 3.3: A Half circuit representation of a switch-cap adder

The operation of this circuit can be divided into two phases. During the first phase, switches S1a and S1b are closed. The charge on the sampling capacitor,  $C_s$, is  $C_s \times (V_1 - V_{cm})$. Now S1a and S1b are opened and S2 is closed. Applying charge conservation to the plates of all capacitors connected to  $V_{out}$  (note that there is no dc path to ground from here),

$$V_{out} = \frac{C_s(V_2 - V_1)}{C_s + C_p}$$
(3.1)

where  $C_p$  is the input parasitic capacitance of the next stage.

Switch S1b is turned OFF slightly earlier than S1a, i.e. the signal S1a is slightly delayed with respect to S1b. Once S1b opens, one plate of the capacitor floats, so the charge on the capacitor stays almost constant. Therefore, the sampling of the input signal ends when switch S1b is turned OFF, and the switching time of S1a, which depends on the input signal, is not critical. Moreover, because the charge is fixed, the signal dependent charge injection from switch S1a does not disturb the sampled value. This makes the charge injection and the sampling instant signal independent.
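Equation 3.1 shows that the parasitic capacitance attenuates the difference. A minimal sketch evaluating it is given below; the capacitor values and input voltages are hypothetical.

```python
def switch_cap_adder_output(v1, v2, c_s, c_p):
    """Evaluate equation 3.1: V_out = C_s (V_2 - V_1) / (C_s + C_p)."""
    return c_s * (v2 - v1) / (c_s + c_p)

# ASSUMPTION: hypothetical values (20 fF sampling capacitor, 5 fF parasitic load).
v_out = switch_cap_adder_output(v1=0.0, v2=0.1, c_s=20e-15, c_p=5e-15)
print(f"V_out = {v_out * 1e3:.1f} mV")
```

With these values a 100 mV difference is attenuated to 80 mV, i.e. by the factor $C_s/(C_s + C_p)$.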

### **3.4** Design of DFE using current mode Adder



Figure 3.4: A 1-tap DFE using the current mode adder.

#### **3.4.1** Sizing the input differential pair and the resistors

The design of the DFE starts off with the determination of the values of the resistors (R) and the transconductance  $(G_{m0})$. The transistors in the  $G_m$-block and the resistors were designed such that the gain from the input to the output of the transconductor was maximum. The input tap of the DFE is required to operate in the linear region to preserve all the ISI information. This implies that the tail current in the transconductor,  $G_{m0}$, should never switch completely. One way to improve linearity is to increase the overdrive  $(V_{GS} - V_{TH})$  of the transistors of the differential pair.

Gain is directly proportional to the quiescent voltage across the resistors and inversely proportional to the overdrive of the transistors of the differential pair of the transconductor. To increase the gain, the common mode voltage at the output of the transconductor is reduced, while the overdrive of the differential pair is increased to improve the linearity.



Figure 3.5: Linear  $G_m$ 

In this design, the value of common mode was chosen to be 600mV and the value of  $V_{GS} - V_{TH}$  was chosen to be the highest possible with the given headroom.
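The dependence of the gain on the resistor drop and the overdrive follows from the square-law MOSFET model, which is only an approximation in a short-channel process. The symbols $I_D$ (quiescent current in each half of the pair) and $V_R = I_D R$ (quiescent drop across each load resistor) are introduced here for illustration:

$$g_m = \frac{2 I_D}{V_{GS} - V_{TH}}, \qquad A_v = g_m R = \frac{2 I_D R}{V_{GS} - V_{TH}} = \frac{2 V_R}{V_{GS} - V_{TH}}$$

A lower output common mode means a larger $V_R$ and hence more gain, while a larger overdrive trades gain for linearity.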

#### **3.4.2** Importance of the Phase of the Clock

The signal in this design is sampled at baud-rate, that is once every symbol (or bit in this case). The baud rate equalizers are extremely sensitive to phase offsets in the clock. Figure 3.6 illustrates how a data stream of alternating ones and zeros can totally be misinterpreted if the sampling instants are shifted.

In the process of designing the DFE, clock phase has to be constantly monitored and adjusted if necessary such that the sampling edge falls in the middle of the data. On chip, a clock recovery circuit is implemented to ensure that the clock edge is aligned to the middle of the bit.

### 3.4.3 Designing the feedback taps

Once the sizing of  $G_{m0}$  and the value of the resistor are fixed, the currents that the feedback taps inject have to be determined. The feedback taps act as current switches and the amount of current they switch is determined by the tap weight. Before fixing the magnitude of current for each bit, the signal current ( $I_0$ ) generated by  $G_{m0}$  was measured (Table 3.3). With a knowledge of the resolution required for each tap and the range of values for the post-cursor, the value of the LSB current was calculated.



Figure 3.6: Effect of phase of sampling clock.

In this design the signal current was nominally  $64\mu A$. The first tap is expected to remove a post cursor of up to half the main cursor, which translates to  $32\mu A$  in current. For a resolution of 5 bits, the LSB current in this design is therefore  $32\mu A / 2^5 = 1\mu A$.

| Corner | Temperature | Signal current ( $\mu A$ ) |
|--------|-------------|----------------------------|
| SS     | 0           | 66                         |
| sf     | 0           | 74                         |
| fs     | 0           | 56                         |
| ff     | 0           | 67                         |
| tt     | 0           | 65                         |
| SS     | 100         | 59                         |
| sf     | 100         | 66                         |
| fs     | 100         | 52                         |
| ff     | 100         | 61                         |
| tt     | 100         | 57                         |

Table 3.3: Variation in signal current of a DFE
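The LSB calculation, and the effect of the corner variation in Table 3.3, can be sketched as follows. The assumption that the tap full scale stays fixed at 32 µA in every corner is made only for illustration; in the actual design the tap currents may track the signal current.

```python
def lsb_current_ua(signal_current_ua, max_post_cursor_ratio, n_bits):
    """LSB current for a tap that cancels up to max_post_cursor_ratio of the cursor."""
    full_scale_ua = signal_current_ua * max_post_cursor_ratio
    return full_scale_ua / (2 ** n_bits)

nominal_lsb_ua = lsb_current_ua(64.0, 0.5, 5)

# Signal currents from Table 3.3, keyed by (corner, temperature).
signal_current_ua = {
    ("ss", 0): 66, ("sf", 0): 74, ("fs", 0): 56, ("ff", 0): 67, ("tt", 0): 65,
    ("ss", 100): 59, ("sf", 100): 66, ("fs", 100): 52, ("ff", 100): 61, ("tt", 100): 57,
}

# ASSUMPTION: the tap full scale is a fixed 32 uA in all corners.
coverage = {corner: 32.0 / i0 for corner, i0 in signal_current_ua.items()}
worst_corner = min(coverage, key=coverage.get)
print(f"LSB = {nominal_lsb_ua} uA, smallest cancellable ratio "
      f"{coverage[worst_corner]:.2f} at {worst_corner}")
```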

### **3.4.4 Programmable** $G_m$

The programmability in  $G_m$  is introduced by varying the tail current and is implemented as shown in Fig 3.7. The size of the input differential pair does not matter as long as it does not crush the tail current source. This is because the differential pairs operate in the completely switched region and not in the linear region. Hence it is prudent to keep the sizes small to reduce the parasitics. However, one has to make sure that the output voltage of the latch completely switches the current in the differential pair to one side.



Figure 3.7: Programmable feedback tap

Another possible method is to have weighted differential pairs connected in parallel and turn a few of them on depending on the requirement. But this scheme adds high parasitic capacitance, decreasing the eye-opening. Hence it was not adopted.

#### **3.4.5** Round Trip Delay of the first feedback tap

The round trip delay of the first feedback tap is the sum of clk-Q delay of the latch and delay across all other elements in the loop. This quantity can be around  $\frac{T}{2}$  (where T is the bit period). The reason for this is explained using the Fig 3.8.



Figure 3.8: Effect of round trip delay

The input is assumed to arrive at every positive edge of the clock and the round trip delay is assumed to be  $t_0$. Hence the input to the flip-flop is ready only  $t_0$  after the clock edge. An ideal flip-flop samples the input at the shaded instants. Considering setup time, the time available is even shorter. During this time, the master latch has to sample the data correctly, i.e., this time should be long enough for the output of the latch to be able to switch. Hence it is desired that  $t_0$  is not greater than  $\frac{T}{2}$.

The round trip delay was measured using the circuit in Fig 3.9. A replica of the DFE is made and the input terminals of the DFE are shorted, thereby making the signal component zero. The output of the latch is fed into an ideal VCVS of gain 1 which drives the first feedback tap of the replica circuit. The delay is the time difference between the instant the clock crosses zero and the instant the feedback crosses zero.



Figure 3.9: Circuit to measure the round trip delay of the first tap.

If the round trip delay is high, a speculative feedback scheme can be used for the first feedback. Parallel calculations are made for the two anticipated decisions: either 'HIGH' or 'LOW'; and then a multiplexer (MUX) is used to select the actual choice.
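The speculative scheme can be illustrated with a behavioural model of a one-tap DFE. This is a sketch under simplifying assumptions (ideal slicer, decisions of ±1, a hypothetical tap weight `c`), not the circuit used in this design. It shows that precomputing both candidate decisions and selecting one with a MUX gives the same output as direct feedback.

```python
def sgn(value):
    """Ideal slicer: +1 for non-negative inputs, -1 otherwise."""
    return 1 if value >= 0 else -1

def dfe_direct(samples, c, initial_decision=1):
    """One-tap DFE with the previous decision fed back directly."""
    previous, decisions = initial_decision, []
    for x in samples:
        previous = sgn(x - c * previous)
        decisions.append(previous)
    return decisions

def dfe_speculative(samples, c, initial_decision=1):
    """One-tap DFE that computes both candidate decisions, then selects one."""
    previous, decisions = initial_decision, []
    for x in samples:
        if_previous_high = sgn(x - c)  # candidate assuming previous bit was HIGH
        if_previous_low = sgn(x + c)   # candidate assuming previous bit was LOW
        previous = if_previous_high if previous == 1 else if_previous_low  # MUX
        decisions.append(previous)
    return decisions

# ASSUMPTION: hypothetical sample values and tap weight.
samples = [0.9, -0.2, 0.3, -1.1, 0.1, -0.4, 0.6]
print(dfe_direct(samples, 0.5), dfe_speculative(samples, 0.5))
```

The benefit in hardware is that the feedback subtraction is removed from the critical loop, which then contains only the MUX.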

| Corner | Temperature | Round Trip Delay |
|--------|-------------|------------------|
| SS     | 0           | 36.6p            |
| sf     | 0           | 26.9p            |
| fs     | 0           | 34.3p            |
| ff     | 0           | 22.5p            |
| tt     | 0           | 29.7p            |
| SS     | 100         | 37.8p            |
| sf     | 100         | 24.6p            |
| fs     | 100         | 38p              |
| ff     | 100         | 22.5p            |
| tt     | 100         | 30.9p            |

Table 3.4: Round Trip Delay around the first tap of DFE



Figure 3.10: Speculative DFE

### 3.4.6 Pre-amplifier

A pre-amplifier is used in front of the first latch. This serves two purposes.

- Amplifies the eye.
- Shifts the common-mode from 600mV to 800mV.

The designer should be careful about the parasitic capacitance that is added at the current summing node and the amount of extra delay it introduces in the loop. The round trip delay values tabulated in Table 3.4 include this additional delay.
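The values in Table 3.4 can be checked against the $\frac{T}{2}$ requirement. The sketch below assumes the 10Gbps bit rate of this design, so that T is 100 ps.

```python
# Round trip delays from Table 3.4 in picoseconds, keyed by (corner, temperature).
round_trip_delay_ps = {
    ("ss", 0): 36.6, ("sf", 0): 26.9, ("fs", 0): 34.3, ("ff", 0): 22.5, ("tt", 0): 29.7,
    ("ss", 100): 37.8, ("sf", 100): 24.6, ("fs", 100): 38.0, ("ff", 100): 22.5,
    ("tt", 100): 30.9,
}

bit_rate_gbps = 10.0
bit_period_ps = 1000.0 / bit_rate_gbps   # T = 100 ps
limit_ps = bit_period_ps / 2             # t_0 should not exceed T/2

worst_corner = max(round_trip_delay_ps, key=round_trip_delay_ps.get)
margin_ps = limit_ps - round_trip_delay_ps[worst_corner]
print(f"worst case {round_trip_delay_ps[worst_corner]} ps at {worst_corner}, "
      f"margin {margin_ps:.1f} ps")
```

All corners meet the limit, with the smallest margin in the fs corner at 100 degrees.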

#### 3.4.7 Maintaining the common mode voltage

The common mode voltage at the input of the pre-amplifier depends on the co-efficients of the feedback taps. To overcome this problem a common mode feedback is implemented such that the common mode is maintained at 600mV.



Figure 3.11: Common mode feedback in DFE



Figure 3.12: Amplifier for the Common mode feedback

The resistors used to sense the common mode voltage were  $25k\Omega$. A lower value of these resistors decreased the differential mode gain and a higher value made the feedback less stable. To ensure stability, the loop was compensated using a 100fF capacitor connected between the output of the amplifier and its positive input. The amplifier used in the common mode feedback was implemented using a single stage differential pair loaded by a differential to single ended converter.



Figure 3.13: Fullrate DFE

## 3.5 Half Rate Architecture

The previous section discussed the full rate DFE architecture. It is the most intuitive architecture and a very simple one to implement. However, at higher speeds this architecture poses the following issues.

- The switching in the feedback taps might not be clean.
- The design of latch is tougher. The sampling stage of the latch should be able to switch its output in time before regeneration starts.
- Eye opening is small.

Due to the above reasons, a half rate architecture was explored. The half rate architecture is the result of unfolding the loop of the full rate DFE. Since the process of unfolding does not increase the number of delay elements, the half rate DFE also has only four flip flops. Since the majority of the power consumption in this circuit is in the delay elements, the half rate DFE consumes only a little more power than the full rate DFE.

#### 3.5.1 Basics of Unfolding

In essence unfolding is a version of parallelism. A better throughput is achieved at the cost of more hardware. Consider the following equation,

$$y[n] = ay[n-2] + x[n]$$

This can be reduced to two parallel processes,

$$y[2k] = ay[2k-2] + x[2k]$$
$$y[2k+1] = ay[2k-1] + x[2k+1]$$



Figure 3.14: Unfolding: Example 1

As the next example let us consider a more complicated filter and how it is reduced into two parallel subsystems.

$$y[n] = a_1 y[n-1] + a_2 y[n-2] + x[n]$$
  

$$y[2k] = a_1 y[2(k-1)+1] + a_2 y[2(k-1)] + x[2k]$$
  

$$y[2k+1] = a_1 y[2k] + a_2 y[2(k-1)+1] + x[2k+1]$$



Figure 3.15: Unfolding: Example 2
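The equivalence of the original and unfolded recursions of Example 2 can be checked numerically. The sketch below assumes zero initial conditions and an even number of input samples; the coefficient values are hypothetical.

```python
def iir_direct(x, a1, a2):
    """y[n] = a1*y[n-1] + a2*y[n-2] + x[n], one output per iteration."""
    y = []
    for n, x_n in enumerate(x):
        y_n1 = y[n - 1] if n >= 1 else 0.0
        y_n2 = y[n - 2] if n >= 2 else 0.0
        y.append(a1 * y_n1 + a2 * y_n2 + x_n)
    return y

def iir_unfolded_by_2(x, a1, a2):
    """Same filter unfolded by two: y[2k] and y[2k+1] are produced per iteration."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch expects an even number of samples")
    y = [0.0] * len(x)
    for k in range(len(x) // 2):
        y_odd_prev = y[2 * k - 1] if k >= 1 else 0.0    # y[2(k-1)+1]
        y_even_prev = y[2 * k - 2] if k >= 1 else 0.0   # y[2(k-1)]
        y[2 * k] = a1 * y_odd_prev + a2 * y_even_prev + x[2 * k]
        y[2 * k + 1] = a1 * y[2 * k] + a2 * y_odd_prev + x[2 * k + 1]
    return y

# ASSUMPTION: hypothetical input and coefficients.
x = [1.0, 0.0, -1.0, 0.5, 0.25, -0.75, 1.0, 0.0]
print(iir_direct(x, 0.5, -0.25) == iir_unfolded_by_2(x, 0.5, -0.25))
```

Each iteration of the unfolded loop produces two outputs, so the loop runs at half the rate for the same throughput.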

Now extending this example to this design,

$$y[n] = a_1 y[n-1] + a_2 y[n-2] + a_3 y[n-3] + a_4 y[n-4] + x[n]$$

$$y[2k] = a_1 y[2(k-1)+1] + a_2 y[2(k-1)] + a_3 y[2(k-2)+1] + a_4 y[2(k-2)] + x[2k]$$

$$y[2k+1] = a_1 y[2k] + a_2 y[2(k-1)+1] + a_3 y[2(k-1)] + a_4 y[2(k-2)+1] + x[2k+1]$$

In the unfolded version, the latches and feedback operate at half the previous speed and hence have enough time to switch cleanly. The eye-opening is observed to be better (see Tables 3.5 and 3.6). One more hidden advantage is that the deserializer will have one stage fewer (the highest speed stage). Hence there could be power savings overall. All these come at the cost of more complicated routing of feedback taps.

### **3.6 Switched Capacitor DFE**

The main distinction between the previous architecture and the switched capacitor DFE is in the implementation of the adder. The main motivation behind using this scheme is to reduce the power consumption since it involves only passive components and switches [Emami Neyestanak *et al.* (2007)].

The basic concept of the adder was described previously: the quantity being added or subtracted is the charge. In order to add weighted ratios of the inputs, the charge has to be weighted accordingly. This can be done either by charging the capacitance to a proportional voltage or by charging a part of a capacitor bank to a given voltage. The two schemes can be summarized as

- Simple Adder with weighted inputs.
- A more complicated adder in which the capacitance connected is varied to get different ratios. Inputs are just +1 or -1.

### **Adder with Weighted Inputs**

The circuit in Fig 3.16 illustrates the first of the two schemes. For simplicity only the half circuit has been represented. The terms v[n]p are either +V or -V (representing  $\pm 1$ ), and the DAC does the operation of scaling.

#### Adder with the Capacitor Bank

This scheme does not require the DAC, but requires a capacitor bank to be connected in place of every capacitor. Each capacitor in Fig 3.16 is replaced by  $2^n$  capacitors where n is the number of bits of resolution. Depending on the ratio of the feedback, the number of capacitors connected to the feedback can be varied. For instance, if the feedback co-efficient of a particular tap is required to be half the maximum, only half the capacitors in that bank are connected to the feedback while the remaining are connected to the common mode voltage.



Figure 3.16: Adder with weighted inputs
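The capacitor bank scheme described above amounts to choosing how many unit capacitors of a bank go to the feedback. A minimal sketch is given below; the function name and the 5-bit resolution are assumptions for illustration.

```python
def capacitor_bank_split(relative_weight, n_bits):
    """Return (units connected to feedback, units connected to common mode).

    relative_weight is the tap co-efficient as a fraction of its maximum (0 to 1).
    """
    if not 0.0 <= relative_weight <= 1.0:
        raise ValueError("relative_weight must lie between 0 and 1")
    total_units = 2 ** n_bits
    to_feedback = round(relative_weight * total_units)
    return to_feedback, total_units - to_feedback

# ASSUMPTION: 5 bits of resolution, tap set to half of its maximum.
print(capacitor_bank_split(0.5, 5))
```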

#### **3.6.1** Disadvantages of Switch Capacitor DFE

The switch capacitor DFE relies on matching of capacitors. This is a reliable scheme only if the capacitors are 20fF or more. In such a case, the driving stage, either the DAC or the latch depending on the scheme, will have to drive a large capacitance and hence the power consumed by it goes up. This defeats the purpose of attempting such an architecture in the first place. Secondly, the eye-opening in this scheme cannot be higher than the input voltage. These issues make the switched capacitor approach a poor choice for a four tap DFE.

# 3.7 Eye opening

## 3.7.1 Fullrate Architecture

| Corner | Temperature | Eye Opening (mV) |
|--------|-------------|------------------|
| SS     | 0           | 400mV            |
| sf     | 0           | 320mV            |
| fs     | 0           | 300mV            |
| ff     | 0           | 240mV            |
| tt     | 0           | 280mV            |
| SS     | 100         | 320mV            |
| sf     | 100         | 280mV            |
| fs     | 100         | 210mV            |
| ff     | 100         | 200mV            |
| tt     | 100         | 240mV            |

Table 3.5: Eye opening for Fullrate Architecture



Figure 3.17: Eye opening in the full rate DFE

## 3.7.2 Half rate Architecture

| Corner | Temperature | Eye Opening (mV) |
|--------|-------------|------------------|
| SS     | 0           | 410mV            |
| sf     | 0           | 400mV            |
| fs     | 0           | 370mV            |
| ff     | 0           | 370mV            |
| tt     | 0           | 360mV            |
| SS     | 100         | 430mV            |
| sf     | 100         | 340mV            |
| fs     | 100         | 390mV            |
| ff     | 100         | 320mV            |
| tt     | 100         | 380mV            |

Table 3.6: Eye opening for Half-rate Architecture




## **CHAPTER 4**

## Deserializer

The purpose of this circuit is to split the data stream into 8 parallel paths. This was implemented in a three stage tree like structure wherein data stream is split into two parallel streams at every stage. The following diagram illustrates the method. The first stage of demultiplexing which converts the 10Gbps stream into two 5Gbps is implemented in CML logic while the subsequent stages are implemented in CMOS logic.



Figure 4.1: 1 to 8 Deserializer

## 4.1 1 to 2 Demultiplexer

The purpose of this circuit is to split the incoming data stream into two streams. The clock frequency to be used in this circuit is half the bit-rate of the incoming data stream.



Figure 4.2: 1 to 2 Demultiplexer

The latches used in this circuit could be CMOS/ CML depending on the speed of operation, but the idea behind demultiplexing is the same in all the stages of the deserializer.
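The tree structure can be modelled behaviourally by repeated 1 to 2 demultiplexing. This is a sketch that ignores clocking and latch timing; the function names are illustrative.

```python
def demux_1_to_2(stream):
    """Split a bit stream into its even-indexed and odd-indexed bits."""
    return stream[0::2], stream[1::2]

def deserialize_tree(stream, n_stages=3):
    """Apply n_stages of 1 to 2 demultiplexing, giving 2**n_stages lanes."""
    lanes = [list(stream)]
    for _ in range(n_stages):
        next_lanes = []
        for lane in lanes:
            even, odd = demux_1_to_2(lane)
            next_lanes.extend([even, odd])
        lanes = next_lanes
    return lanes

# Use the bit index itself as the data so the routing is visible.
lanes = deserialize_tree(list(range(16)))
for index, lane in enumerate(lanes):
    print(index, lane)
```

In this model the lanes come out in bit-reversed order (the first lane carries bits 0, 8, ..., the second carries bits 4, 12, ...), so the outputs of a tree deserializer have to be labelled accordingly.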

## 4.2 Divide by Two - Circuit

The divide by two circuit is implemented by feeding back  $\overline{Q}$  to D in a D-flipflop. After dividing the clock down to the lower frequency, four phases of the clock separated by 90 degrees are available and the proper one has to be selected.



Figure 4.3: Divide by Two - Circuit
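A behavioural sketch of the divider is given below. It models an ideal rising-edge triggered D flip-flop with $\overline{Q}$ fed back to D, and assumes that the clock starts low.

```python
def divide_by_two(clock_levels, q_initial=0):
    """Q toggles on every rising clock edge because D is tied to Q-bar."""
    q = q_initial
    previous_level = 0  # ASSUMPTION: the clock starts low
    q_levels = []
    for level in clock_levels:
        if previous_level == 0 and level == 1:  # rising edge
            q = 1 - q
        q_levels.append(q)
        previous_level = level
    return q_levels

divided = divide_by_two([0, 1, 0, 1, 0, 1, 0, 1])
print(divided)
```

The output repeats every four input samples while the input repeats every two, i.e. the frequency is halved.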

## 4.3 Differential to single ended converter



Figure 4.4: Differential to single ended converter

### **CHAPTER 5**

## **Adaptation Algorithm**

The co-efficients of the DFE are set by an adaptation engine which tries to minimize the error due to the sum of ISI and noise. The adaptation engine works on Least Mean Square (LMS) principle.



Figure 5.1: A one tap filter

For simplicity let us consider a one-tap filter in the Fig 5.1 whose feedback coefficient at an instant is c[n]. The input to the slicer, y[n] is

$$y[n] = x[n] - c[n]y1[n-1]$$

where x[n] is the input. The output of the slicer is sgn(y[n])

$$y1[n] = sgn(y[n])$$

Error due to ISI and noise is the difference between y[n] and y1[n].

$$e[n] = y[n] - y1[n]$$

$$e^{2}[n] = (x[n] - c[n]y1[n - 1] - y1[n])^{2}$$

The LMS algorithm tries to find a co-efficient which minimizes the square of the error. We can see that  $e^2(c)$  is a parabola with a minimum at a certain  $c_0$. For any  $c[n] > c_0$, the slope of  $e^2(c)$  is positive and vice versa. The slope of  $e^2$  can be shown to be

$$\frac{de^2}{dc} = -2e[n]y1[n-1]$$
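This follows from the chain rule, under the assumption that the decisions y1[n-1] and y1[n] are treated as constants with respect to c:

$$\frac{de}{dc} = \frac{d}{dc}\left(x[n] - c\,y1[n-1] - y1[n]\right) = -y1[n-1]$$

$$\frac{de^2}{dc} = 2e[n]\frac{de}{dc} = -2e[n]y1[n-1]$$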

To reach $c_0$, c[n] should be incremented if the slope is negative and decremented if the slope is positive. Hence,

$$c[n+1] = c[n] - \mu \frac{de^2}{dc} \tag{5.1}$$

$$= c[n] + \mu\, sgn(e[n])\, y1[n-1] \tag{5.2}$$

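where $\mu$ is a constant which determines the rate of convergence. In going from (5.1) to (5.2), only the sign of the error is retained, so the step size no longer depends on the magnitude of e[n], and the factor of 2 is absorbed into $\mu$. For large values of $\mu$ the algorithm converges faster, but there are more oscillations in the value of c[n] around the optimum $c_0$. For smaller values of $\mu$, the rate of convergence is slow but the oscillations around $c_0$ are also smaller.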
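The trade-off in the choice of $\mu$ can be seen in a small behavioural simulation of equation 5.2. This is only a sketch under stated assumptions: the channel is taken to be noiseless with a single post-cursor `h1 = 0.3`, the data is random binary, and all names and values are hypothetical rather than taken from the design.

```python
import random


def sgn(v):
    """Slicer: +1 for non-negative input, -1 otherwise."""
    return 1.0 if v >= 0 else -1.0


def adapt_one_tap(mu, h1=0.3, n_bits=2000, seed=0):
    """Adapt a one-tap DFE with equation 5.2; returns the history of c[n]."""
    rng = random.Random(seed)
    c = 0.0
    d_prev = 1.0    # previous transmitted bit (assumed start-up value)
    y1_prev = 1.0   # previous slicer decision y1[n-1]
    history = []
    for _ in range(n_bits):
        d = rng.choice([-1.0, 1.0])
        x = d + h1 * d_prev               # assumed channel: one post-cursor
        y = x - c * y1_prev               # y[n] = x[n] - c[n] y1[n-1]
        y1 = sgn(y)                       # y1[n] = sgn(y[n])
        e = y - y1                        # e[n] = y[n] - y1[n]
        c = c + mu * sgn(e) * y1_prev     # equation 5.2
        history.append(c)
        d_prev, y1_prev = d, y1
    return history


fast = adapt_one_tap(mu=0.05)
slow = adapt_one_tap(mu=0.005)


def ripple(history, target=0.3):
    """Largest deviation from the target over the second half of the run."""
    tail = history[len(history) // 2:]
    return max(abs(c - target) for c in tail)


print("after 10 bits:", fast[9], slow[9])
print("steady-state ripple:", ripple(fast), ripple(slow))
```

With these assumptions the coefficient settles near the post-cursor value: the larger $\mu$ gets there in fewer bits, while the smaller $\mu$ ends with a smaller ripple around it.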

Equation 5.2 describes the algorithm for adaptation of the coefficients of a single-tap DFE. This concept can be extended to a multi-tap DFE, and the modified equation for the $k^{th}$ tap is

$$c_k[n+1] = c_k[n] + \mu sgn(e[n])y1[n-k]$$
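A sketch of the multi-tap update for a 4-tap DFE follows. The post-cursor values, step size and noiseless channel are assumptions chosen for illustration (they lie within the post-cursor limits this design targets), and the function name is hypothetical.

```python
import random


def sgn(v):
    """Slicer: +1 for non-negative input, -1 otherwise."""
    return 1.0 if v >= 0 else -1.0


def adapt_multi_tap(post_cursors, mu=0.002, n_bits=20000, seed=1):
    """Adapt all taps with c_k[n+1] = c_k[n] + mu sgn(e[n]) y1[n-k]."""
    rng = random.Random(seed)
    k_taps = len(post_cursors)
    c = [0.0] * k_taps
    sent = [1.0] * k_taps       # d[n-1], ..., d[n-K] (assumed start-up values)
    decided = [1.0] * k_taps    # y1[n-1], ..., y1[n-K]
    for _ in range(n_bits):
        d = rng.choice([-1.0, 1.0])
        # Assumed channel: main cursor of 1 plus the given post-cursors.
        x = d + sum(h * s for h, s in zip(post_cursors, sent))
        y = x - sum(ck * dk for ck, dk in zip(c, decided))
        y1 = sgn(y)
        e = y - y1
        c = [ck + mu * sgn(e) * dk for ck, dk in zip(c, decided)]
        sent = [d] + sent[:-1]
        decided = [y1] + decided[:-1]
    return c


post_cursors = [0.4, 0.2, 0.1, 0.05]   # hypothetical channel
taps = adapt_multi_tap(post_cursors)
print(taps)
```

Each tap uses the same error sign and differs only in which past decision it is multiplied by, so the per-tap cost of the update does not grow with the number of taps.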

## **CHAPTER 6**

## **Summary and Future Work**

| Feature                           | Value                |
|-----------------------------------|----------------------|
| Clock frequency                   | 5 GHz                |
| Bit rate                          | 10 Gbps              |
| Power consumption                 | 10 mW                |
| Eye opening                       | 400 mV p-p diff      |
| Input signal amplitude range      | 30 to 50 mV p-p diff |
| No. of feedback taps              | 4                    |
| Max $1^{st}$ post-cursor          | 0.5                  |
| Max $2^{nd} - 4^{th}$ post-cursor | 0.25                 |

## 6.1 Future work

This design is certainly not the most power-efficient one. The latches used in the half-rate architecture were designed for performance with a 10 GHz clock, but they are required to operate only at 5 GHz. Hence significant power savings can be achieved there.

One challenging task is to design a switched-capacitor equalizer. As of now, a full-rate design does not seem to be possible at a lower power than the current-mode design. However, a half-rate or even a quarter-rate equalizer could be explored. A half-rate or quarter-rate architecture eliminates the most power-consuming stages in the deserializer. A 4-tap DFE for 10 Gbps using switched-capacitor circuits has not been presented, which makes the task all the more interesting.

# REFERENCES

- Krishnapura, N. et al. (2005). A 5 Gbps NRZ transceiver with adaptive equalization for backplane transmission. *In IEEE International Solid State Circuits Conference*.
- Emami Neyestanak, A. et al. (2007). A 6.0-mW 10.0 Gbps receiver with switched-capacitor summation DFE. *IEEE Journal of Solid-State Circuits*, Volume: 42, Issue: 4, 889–896.
- Galal, S. and B. Razavi (2002). 10-Gb/s limiting amplifier and laser/modulator driver in 0.18 μm CMOS technology. *IEEE Journal of Solid-State Circuits*, Volume: 38, Issue: 12, 2138–2146.