# Design of a High Speed Equalizer with an Eye Opening Monitor

A THESIS

submitted by

# HARIHARAN SRINIVASAN

for the award of the degree

of

## **MASTER OF SCIENCE**

(by Research)



Department Of Electrical Engineering Indian Institute of Technology Madras, India

December 2011

# CERTIFICATE

This is to certify that the thesis titled **Design of a High Speed Equalizer with an Eye Opening Monitor**, submitted by **Hariharan Srinivasan**, to the Indian Institute of Technology Madras, for the award of the degree of **Master of Science**, is a bonafide record of the work done by him under my supervision. The contents of this thesis, in full or in parts, have not been submitted to any other Institute or University for the award of any degree or diploma.

#### Dr. Y. Shanthi Pavan

Research Advisor, Professor, Dept. of Electrical Engineering, IIT-Madras, Chennai 600 036

Place: Chennai Date:

### ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my advisor, Dr. Y. Shanthi Pavan, for giving me the opportunity to work on analog circuits. I thank him for his technical guidance and for reviewing this thesis. I have always admired his way of teaching analog circuits and will remember it for a long time. I also thank him for providing state-of-the-art instruments for my chip measurements and for funding the latter portion of my stay at IIT-M.

I would like to thank Dr. Nagendra Krishnapura for his excellent Broadband Communication Circuits course and for various other discussions. This course helped me a lot during the course of my project and was a good starting point for my MS work. I take this opportunity to thank Dr. M. Manivannan for serving on my graduate committee. I thank one and all of the EE office staff for being very cordial.

I thank my thesis reviewers Dr. Chetan Parikh and Dr. Bharadwaj Amruthur for taking time to review my thesis and for giving their valuable comments.

I want to thank every member of the TI lab, over various periods of time, for keeping the lab active and busy. Specifically, I want to thank Shankar, Deepa, Ashwin, Prabu Sankar, Mrunmay, Nanda Govind, Vallab, Vikas, Prashanth, Animesh, Amrith, Vikas, Kunal, Sumit, Radha, IR, Naga and the list goes on. I thank Lokesh for discussions on testing and layout, and for helping me test Siva's chip to check the ideas relating to frequency response measurements. Special thanks to Shankar for spending time maintaining the tools and systems in the TI lab. I also thank him for helping me with the digital IC flow, which saved me time on tool-related issues. I thank Mrunmay for giving me company during tapeouts, for reviewing my thesis and for many discussions. I thank Animesh for taking care of the computers and systems in the lab during the final stages of my MS. Thanks to Mohan for helping me take the die photograph.

I want to express my sincere gratitude towards Madras Institute of Technology (MIT), where I did my undergraduate studies. The inspiration I got from my seniors at MIT kindled my passion for taking up a career in analog design. I also thank my professors at MIT, viz. Dr. C.N. Krishnan, Dr. Mala John and Dr. D. Meganathan, for kindling my interest in electronics. Special thanks also go to my seniors Lakshmanan, Srikanth, Perumal, KK, Deepa and Sundar from MIT, with whom I have had many useful discussions during my BE and MS. Lakshmanan has always been patient with me and has been my guiding force and torch bearer all along, and I will cherish his friendship for a very long time to come. I thank Dr. P.V. Ramakrishna for many technical discussions. I thank Shankar (Texas Instruments) for various discussions on chip testing and PCB design. I thank Mr. Giri Rangan (IBM) for his inputs on writing my MS thesis.

I thank Texas Instruments for providing free samples of their chips, which were used in our test boards. I also thank the Centellax online application engineers for their prompt responses to my queries, which helped me in measurements. I thank Mr. Narahari of Shree International for helping me solder QFN packages. I thank Mr. Shashiraj of HiQ electronics for manufacturing good quality PCBs for my chip testing. I thank the engineers from Primeasure for arranging a demo of a LeCroy oscilloscope.

Last but not the least, it is my bounden duty to thank my parents and my sister for solidly standing behind me during my ups and downs. Without them, I can't imagine writing this thesis and I take this opportunity to dedicate this work to them. They are the ones who give me constant motivation and unconditional support.

# ABSTRACT

The data rate of signals transmitted over channels such as backplanes, optical fibers and cables is limited by inter-symbol interference, crosstalk and noise. Equalization is used to mitigate this problem. Implementing such equalizers entirely with digital circuits dissipates a lot of power. Also, the ADCs required up front, operating at multiple gigabits per second, are power hungry. Of late, there has been increasing interest in exploring analog techniques to address this problem. This thesis describes an implementation of a continuous time equalizer architecture suitable for broadband digital data communication at a 10 Gbps data rate.

High speed links are characterized by plotting an eye diagram of their output. A high frequency oscilloscope, which is generally used for this purpose, is an expensive option. In this work, an on-chip eye opening monitor is used to capture eye diagrams of high speed signals.

The equalizer is implemented in a UMC 0.13 $\mu$m CMOS process along with the eye opening monitor. The equalizer dissipates 90 mW from a 2 V supply, while the eye opening monitor dissipates 75 mW from a 1.2 V supply; together they occupy an active area of 2.134 mm<sup>2</sup>. Measurements from the chip show that the equalizer and the eye opening monitor are functional up to a data rate of 5 Gbps.

# TABLE OF CONTENTS

| Section | Title |
|---------|-------|
|         | ABSTRACT |
|         | LIST OF TABLES |
|         | LIST OF FIGURES |
|         | ABBREVIATIONS |
| 1       | Introduction |
| 1.1     | Motivation |
| 1.2     | Organization |
| 2       | Continuous time equalizer |
| 2.1     | Equalizer terminology |
| 2.1.1   | Channel |
| 2.1.2   | Inter symbol interference (ISI) |
| 2.1.3   | Eye diagram |
| 2.1.4   | Bit error rate (BER) |
| 2.2     | Equalizer |
| 2.2.1   | Equalization at the transmitter |
| 2.2.2   | Equalization at the receiver |
| 2.2.2.1 | Analog equalizers |
| 2.3     | Minimum mean square error solution to find the tap weights |
| 2.4     | Simulation results |
| 3       | Design of equalizer in 0.13 $\mu$m CMOS technology |
| 3.1     | Implementation |
| 3.1.1   | Tunable transconductors |
| 3.1.2   | Second stage transconductor |
| 3.1.3   | Input transconductor |
| 3.1.4   | Realization of LC filter |
| 3.2     | Equalizer frequency response measurement |
| 3.2.1   | Design of test buffer |
| 3.2.2   | 2:1 MUX |
| 3.3     | Digital control block |
| 3.4     | Simulation results |
| 4       | Design of Eye Opening Monitor |
| 4.1     | Principle of operation |
| 4.2     | Implementation |
| 4.3     | High speed blocks |
| 4.3.1   | High speed comparator |
| 4.3.2   | Phase interpolator |
| 4.4     | Low speed blocks |
| 4.4.1   | Design of a 5-bit DAC |
| 4.4.2   | Design of averaging circuit |
| 4.4.3   | Design of differential to single-ended converter |
| 4.4.4   | Design of a 10-bit SAR ADC |
| 4.4.4.1 | Capacitor DAC |
| 4.4.4.2 | Comparator |
| 4.4.4.3 | Simulating the ADC |
| 4.5     | Digital control for EOM |
| 4.6     | Simulation results |
| 5       | Design of a 10 Gbps PRBS |
| 5.1     | PRBS |
| 5.2     | Integration of blocks |
| 5.3     | Layout techniques |
| 5.3.1   | Layout of the capacitor array in SAR ADC |
| 6       | Measurement techniques and IC characterization |
| 6.1     | Test setup |
| 6.2     | Making differential measurements with a two port VNA |
| 6.3     | PRBS outputs |
| 6.4     | Testing the equalizer |
| 6.5     | Testing the EOM |
| 7       | Conclusion |
| 7.1     | Future work |
| A       | Pin details of the chip |

# LIST OF TABLES

| Table | Title |
|-------|-------|
| 3.1 | Inductor dimensions |
| 3.2 | Summary of equalizer performance |
| 4.1 | Specifications for various blocks in EOM |
| 4.2 | Summary of ADC performance |
| 4.3 | Summary of EOM performance |
| 5.1 | Summary of PRBS performance |
| 6.1 | Summary of measured equalizer performance |
| 6.2 | Comparison with other EOMs |
| A.1 | Functionality of pins |

# **LIST OF FIGURES**

| Figure | Title |
|--------|-------|
| 2.1  | ISI |
| 2.2  | General block diagram of a serial link |
| 2.3  | Eye diagram |
| 2.4  | ISI cancellation |
| 2.5  | A 3-tap FFE |
| 2.6  | A digital equalizer |
| 2.7  | A 3-tap DFE |
| 2.8  | Transversal equalizer |
| 2.9  | Travelling wave equalizer |
| 2.10 | Equalization at the receiver with CTE |
| 2.11 | Singly terminated LC ladder filter |
| 2.12 | Tap pulse responses |
| 2.13 | Eye diagram without equalization |
| 2.14 | Eye diagram with equalization |
| 3.1  | Implementation of CTE |
| 3.2  | Tunable transconductor |
| 3.3  | Absolute sum of tap weights vs. summing node bandwidth |
| 3.4  | Second stage transconductor |
| 3.5  | CMFB opamp |
| 3.6  | Input transconductor |
| 3.7  | Tunable 100 $\Omega$ load |
| 3.8  | Fully differential LC ladder |
| 3.9  | Circuit model of differential inductor |
| 3.10 | Layout of a fully differential inductor |
| 3.11 | Equalizer frequency response measurement |
| 3.12 | Test buffer |
| 3.13 | Source followers and 2:1 MUX |
| 3.14 | Equalizer with test circuits |
| 3.15 | Eye diagram without equalization |
| 3.16 | Eye diagram with equalization |
| 4.1  | EOM - operation |
| 4.2  | Block diagram of EOM |
| 4.3  | Comparator |
| 4.4  | Master latch |
| 4.5  | Slave latch |
| 4.6  | Multiple phase clock generation |
| 4.7  | Coarse phase generation |
| 4.8  | 8:1 MUX |
| 4.9  | Fine phase generation |
| 4.10 | First 14 phases of phase interpolator |
| 4.11 | A 5-bit DAC |
| 4.12 | DAC reference generation |
| 4.13 | Fully differential opamp |
| 4.14 | First stage CMFB |
| 4.15 | Resistor tracking current source / sink |
| 4.16 | Amplifier used in resistor tracking |
| 4.17 | Averaging circuit |
| 4.18 | Differential to single-ended conversion |
| 4.19 | Opamp used in D/S |
| 4.20 | Block diagram of SAR ADC |
| 4.21 | Capacitor DAC |
| 4.22 | Capacitor DAC control signals |
| 4.23 | Pre-amplifier |
| 4.24 | Latch |
| 4.25 | ADC output spectrum |
| 4.26 | Timing of various blocks |
| 4.27 | Input eye diagram |
| 4.28 | Reconstructed eye diagram |
| 4.29 | Input eye diagram |
| 4.30 | Reconstructed eye diagram |
| 5.1  | PRBS-7 |
| 5.2  | Flip-flop used in PRBS |
| 5.3  | X-OR used in PRBS |
| 5.4  | CML buffer |
| 5.5  | Eye diagram at the output of PRBS |
| 5.6  | Block diagram of the chip |
| 5.7  | Layout of the CTE_EOM chip |
| 5.8  | Layout of the capacitor array |
| 6.1  | Pin diagram of the chip |
| 6.2  | Die photograph of the chip |
| 6.3  | Test setup for eye diagram measurement |
| 6.4  | Test setup for frequency response measurement |
| 6.5  | Snapshot of the test board |
| 6.6  | Differential measurements with baluns |
| 6.7  | Differential measurements |
| 6.8  | PRBS output at 4.4 Gbps, Scale: 65 mV/div, 50 ps/div |
| 6.9  | Frequency response of individual taps |
| 6.10 | Comparison with ideal seventh order Butterworth response |
| 6.11 | Frequency response of channel, equalizer and channel+equalizer |
| 6.12 | Pulse response of tap cascaded with channel |
| 6.13 | Pulse response of channel+equalizer |
| 6.14 | Eye diagram without equalization, Scale: 20 mV/div, 50 ps/div |
| 6.15 | Eye diagram with equalization, Scale: 10 mV/div, 50 ps/div |
| 6.16 | Eye diagram from EOM, without equalization for a 4 Gbps PRBS input |
| 6.17 | Eye diagram from EOM, without equalization for a 5 Gbps PRBS input |
| 6.18 | Eye diagram from EOM, for a 5 GHz sinusoidal input |
| 6.19 | Eye diagram from EOM, after equalization for a 5 Gbps PRBS input |

# ABBREVIATIONS

| Abbreviation | Expansion |
|--------------|-----------|
| ADC    | Analog to Digital Converter |
| DAC    | Digital to Analog Converter |
| CTE    | Continuous Time Equalizer |
| SAR    | Successive Approximation Register |
| EOM    | Eye Opening Monitor |
| CMOS   | Complementary Metal Oxide Semiconductor |
| NRZ    | Non Return to Zero |
| LSB    | Least Significant Bit |
| MSB    | Most Significant Bit |
| MOSFET | Metal Oxide Semiconductor Field Effect Transistor |
| Opamp  | Operational Amplifier |
| FIR    | Finite Impulse Response |
| FPGA   | Field Programmable Gate Array |
| SNDR   | Signal to Noise and Distortion Ratio |
| SNR    | Signal to Noise Ratio |
| FFT    | Fast Fourier Transform |

# **CHAPTER 1**

## Introduction

# **1.1 Motivation**

The demand for high data rates has increased over the last decade. Advancements in Integrated Circuit (IC) fabrication processes, along with efficient design techniques, enable designers to make faster ICs. Present day microprocessors already work at clock frequencies of several gigahertz. But the transmission media pose several challenges that limit the data rate. For example, the data rate in a high speed chip-to-chip communication link via copper traces on a Printed Circuit Board (PCB) is limited by the frequency dependent loss of the traces. In an optical communication system, the optical fibers suffer from dispersion, which leads to Inter-Symbol Interference (ISI). The medium of transmission is, in general, called the channel. Equalization is done to mitigate such channel impairments. Equalization can be implemented in the analog or the digital domain. For data rates of about 10 Gbps, fully digital equalizers dissipate a lot of power. Also, the analog to digital converter (ADC) that is required up front will be power hungry. Hence, the trend now is to explore analog techniques for solving these problems.

The signal integrity of high speed links is tested by plotting eye diagrams. Eye diagrams are typically measured with high frequency oscilloscopes, which are expensive. Also, accurate measurements require the bandwidth of the circuits that follow the Device Under Test (DUT) to be much larger than the data rate. This makes testing of high speed circuits difficult. To overcome the problem, the key is to make most of the high speed measurements on-chip and take the measured data off-chip at a lower rate. An Eye Opening Monitor (EOM) is one such solution. Since the measurements are now done on-chip, the on-chip high speed drivers with a 50 $\Omega$ load can potentially be eliminated, thereby reducing power dissipation. These test circuits can be shut down after measurements are taken.

Traditionally, high speed designs have been implemented in bipolar or compound semiconductor processes, since they offer a higher transition frequency $(f_T)$. However, due to recent advancements in CMOS scaling, sufficiently fast devices can be obtained from CMOS processes themselves. Moreover, CMOS processes are less expensive, and the high speed analog blocks can be easily integrated with digital designs on the same chip. Presently, many researchers are putting a lot of effort into implementing these high speed designs in CMOS processes.

In this work, a 10 Gbps equalizer based on [1] is implemented in UMC 0.13 $\mu$m CMOS technology. The chip has an EOM [2] to plot the eye diagram at the equalizer output and an on-chip Pseudo Random Bit Sequence (PRBS) generator to generate random digital data.

# 1.2 Organization

The rest of the thesis is organized as follows.

**Chapter 2** introduces the concepts and terminology involved in equalizers. A brief summary of different equalizer topologies is given. It then discusses various aspects of the working of the continuous time equalizer in detail.

**Chapter 3** presents the design of the equalizer in 0.13 $\mu$m CMOS technology, giving circuit details.

**Chapter 4** covers the design of the EOM.

**Chapter 5** discusses the implementation of an on-chip PRBS generator and the layout techniques used in this chip design.

**Chapter 6** describes the PCB design and test setup and presents measurement results.

**Chapter 7** concludes the thesis and gives directions for future work.

# **CHAPTER 2**

# **Continuous time equalizer**

# 2.1 Equalizer terminology

#### 2.1.1 Channel

The path that a signal takes from a transmitter to a receiver is called the channel. Practical channels include copper traces on a PCB, optical fibers and, in the case of wireless transmission, air. Digital data is transmitted into the channel. Ideally, the received signal should be a delayed replica of the transmitter output. But this can happen only if the channel has infinite bandwidth. Since all practical channels are band-limited, the received signal is no longer an exact replica of the transmitted signal.

#### 2.1.2 Inter symbol interference (ISI)

ISI is a phenomenon, by which a transmitted symbol interferes with the previously transmitted symbols as well as with the subsequent symbols. Figure 2.1 represents ISI pictorially.



Figure 2.1: ISI

A general block diagram of a serial link is shown in Figure 2.2 [1]. The source is modelled as a random impulse train with symbols taking values $+\frac{1}{2}$ or $-\frac{1}{2}$. The input data rate is $f_b = \frac{1}{T_b}$ and we assume it to be 10 Gbps. p(t) models the pulse shape of the input data convolved with the impulse response of a fourth order Bessel filter of bandwidth 7 GHz, which accounts for the finite rise time of the data. The pulse shape is assumed to be Non Return to Zero (NRZ). At the receiver, the sampler is typically preceded by an anti-aliasing filter (not shown in the figure).

If the channel were ideal, the received signal would exactly match the transmitted signal, apart from a delay. Since a practical channel has finite bandwidth, the received signal has finite rise and fall times, as shown in Figure 2.1. The pulse therefore spreads into neighbouring bit periods, which can cause bit errors. The effect a symbol has on previously transmitted symbols is called pre-cursor ISI, and the effect it has on subsequent symbols is called post-cursor ISI. Consider a single NRZ pulse fed to the channel. In an ISI-free channel, the sampled output of the channel contains only the cursor term, while all other samples are zero.



Figure 2.2: General block diagram of a serial link

#### 2.1.3 Eye diagram

An eye diagram is a figure that indicates the amount of ISI in the received signal. It is constructed by breaking the received signal into segments of length $T_b$ and superimposing them, where $T_b$ is the bit period of the signal. A typical eye diagram is shown in Figure 2.3.



Figure 2.3: Eye diagram
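The folding procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the band-limited channel is approximated by a one-pole low-pass filter (not the channel used in this work), and the helper names, the oversampling factor and the filter coefficient `alpha` are hypothetical choices made for the example.

```python
import random

def nrz_waveform(bits, samples_per_bit):
    """Ideal NRZ waveform: each bit is held at +0.5 or -0.5 for one bit period."""
    wave = []
    for b in bits:
        wave.extend([0.5 if b else -0.5] * samples_per_bit)
    return wave

def one_pole_lowpass(wave, alpha):
    """Assumed band-limited channel: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    out, y = [], wave[0]
    for x in wave:
        y += alpha * (x - y)
        out.append(y)
    return out

def fold_into_eye(wave, samples_per_bit):
    """Break the waveform into segments that are one bit period (T_b) long."""
    n_seg = len(wave) // samples_per_bit
    return [wave[k * samples_per_bit:(k + 1) * samples_per_bit] for k in range(n_seg)]

def vertical_eye_opening(traces, sample_index):
    """Gap between the lowest 'high' trace and the highest 'low' trace at one instant."""
    highs = [t[sample_index] for t in traces if t[sample_index] > 0]
    lows = [t[sample_index] for t in traces if t[sample_index] <= 0]
    return min(highs) - max(lows)

random.seed(0)
SAMPLES_PER_BIT = 16
bits = [random.randint(0, 1) for _ in range(200)]
received = one_pole_lowpass(nrz_waveform(bits, SAMPLES_PER_BIT), alpha=0.15)
traces = fold_into_eye(received, SAMPLES_PER_BIT)   # superimpose these to draw the eye
opening = vertical_eye_opening(traces, SAMPLES_PER_BIT - 1)
print(f"{len(traces)} traces, vertical opening at the end of the bit: {opening:.3f}")
```

Plotting every entry of `traces` on the same axes gives the eye diagram. An ideal channel would give an opening of 1.0 for these symbol levels; the low-pass filter reduces it because each bit still carries a remnant of the previous one.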

#### 2.1.4 Bit error rate (BER)

In digital data transmission, a bit error is said to have occurred if the received bit is not equal to the transmitted bit. BER is defined as the ratio of the number of bit errors to the total number of bits transmitted during the period of observation. ISI and noise in the received signal cause bit errors.
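The BER definition translates directly into code. The following minimal sketch uses a hypothetical helper `bit_error_rate` and made-up bit sequences purely for illustration.

```python
def bit_error_rate(transmitted, received):
    """Ratio of the number of bit errors to the number of bits observed."""
    if len(transmitted) != len(received):
        raise ValueError("sequences must have the same length")
    if not transmitted:
        raise ValueError("at least one bit is required")
    errors = sum(1 for t, r in zip(transmitted, received) if t != r)
    return errors / len(transmitted)

tx = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
rx = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]   # the fourth bit was received in error
print(bit_error_rate(tx, rx))         # one error in ten bits
```

In practice the observation period must be long enough to collect a meaningful number of errors, which is why low BER values require very long bit sequences.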

# 2.2 Equalizer

An equalizer is a circuit used to mitigate the effects of ISI on the received signal. Most practical channels are low pass in nature. An equalizer can be thought of as a circuit that provides a high frequency boost so that the overall frequency response (channel+equalizer) is flat in the band of interest. In the time domain, it can be understood with the help of Figure 2.4. Let the output of the channel for a single NRZ pulse input be y(t). Let us also assume that $y(t - T_b)$ and $y(t - 2T_b)$ are available. If they are scaled and added as shown in Figure 2.4, the ISI terms are minimized, thus 'equalizing' the channel.



Figure 2.4: ISI cancellation
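The scale-and-add idea of Figure 2.4 can be illustrated on bit-spaced samples of the pulse response. The sketch below is an assumption-laden example, not the method used in this work: the sampled pulse response is made up, and the weights are chosen by simply forcing the first two post-cursor samples to zero. The function names are hypothetical.

```python
def scale_and_add(pulse, weights):
    """Sum of weights[j] * (pulse delayed by j bit periods), on bit-spaced samples."""
    out = [0.0] * (len(pulse) + len(weights) - 1)
    for j, w in enumerate(weights):
        for n, p in enumerate(pulse):
            out[n + j] += w * p
    return out

def total_isi(samples, cursor_index):
    """Sum of the magnitudes of all samples other than the cursor."""
    return sum(abs(s) for i, s in enumerate(samples) if i != cursor_index)

def cancel_first_postcursors(pulse, n_taps):
    """Weights that keep the cursor at 1 and null the next n_taps - 1 samples."""
    weights = [1.0 / pulse[0]]
    for n in range(1, n_taps):
        acc = sum(weights[j] * pulse[n - j] for j in range(n) if n - j < len(pulse))
        weights.append(-acc / pulse[0])
    return weights

pulse = [1.0, 0.5, 0.2]                       # assumed samples: cursor, then post-cursors
weights = cancel_first_postcursors(pulse, 3)  # scales for y(t), y(t - T_b), y(t - 2T_b)
equalized = scale_and_add(pulse, weights)
print("weights:", weights)
print("ISI before:", total_isi(pulse, 0), "ISI after:", total_isi(equalized, 0))
```

The nulled samples are cancelled, but small residual terms appear further out because the delayed copies bring their own tails. With only a few taps the ISI is reduced rather than removed.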

Equalization can be done at the transmitter or at the receiver or at both the ends.

#### 2.2.1 Equalization at the transmitter

Equalizers at the transmitter are called Feed Forward Equalizers (FFEs), and this way of equalization is often known as pre-emphasis, as the transmitted data is pre-distorted before being fed to the channel. Figure 2.5 shows a 3-tap FFE. Each delay $z^{-1}$ is implemented with a flip-flop. Each delay output (or tap output) is scaled, and the scaled outputs are added in such a way as to cancel the ISI terms. An FFE can cancel both pre-cursor and post-cursor ISI. However, FFEs are not well suited to adaptation, since the channel information is not known at the transmitter. If the FFE has to be adapted, a reverse path from the receiver back to the transmitter is needed to vary the tap weights. Hence, most FFE implementations rely on a priori knowledge of the channel characteristics.



Figure 2.5: A 3-tap FFE
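A behavioural sketch of a 3-tap FFE follows. It is illustrative only: the class name, the bit-spaced channel model and the numeric values are assumptions, and the tap weights are fixed in advance from the assumed channel, mirroring the a priori knowledge mentioned above.

```python
class ThreeTapFFE:
    """Symbol-rate FIR filter: two z^-1 delays (flip-flops) and three tap weights."""

    def __init__(self, weights):
        if len(weights) != 3:
            raise ValueError("expected exactly three tap weights")
        self.weights = list(weights)
        self.delayed = [0.0, 0.0]          # holds a(n-1) and a(n-2)

    def step(self, symbol):
        taps = [symbol] + self.delayed
        output = sum(w * t for w, t in zip(self.weights, taps))
        self.delayed = [symbol, self.delayed[0]]   # shift the delay line
        return output

def through_channel(levels, pulse):
    """Bit-spaced channel model: received(n) = sum_j pulse[j] * levels[n - j]."""
    return [sum(pulse[j] * levels[n - j] for j in range(len(pulse)) if n - j >= 0)
            for n in range(len(levels))]

CHANNEL = [1.0, 0.4, 0.1]       # assumed sampled pulse response, cursor first
WEIGHTS = [1.0, -0.4, 0.06]     # fixed offline from the assumed channel
symbols = [0.5 if b else -0.5 for b in [1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1]]

ffe = ThreeTapFFE(WEIGHTS)
pre_emphasised = [ffe.step(a) for a in symbols]
plain_error = max(abs(r - a)
                  for r, a in zip(through_channel(symbols, CHANNEL), symbols))
ffe_error = max(abs(r - a)
                for r, a in zip(through_channel(pre_emphasised, CHANNEL), symbols))
print(f"worst-case error without FFE: {plain_error:.3f}, with FFE: {ffe_error:.3f}")
```

The transmitted levels in `pre_emphasised` are larger right after a transition than during a run of identical bits, which is the pre-emphasis effect. If the real channel differs from `CHANNEL`, the fixed weights no longer cancel the ISI, which is the adaptation difficulty noted in the text.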

#### 2.2.2 Equalization at the receiver

Equalizers at the receiver are often adaptive: they can be tuned to different channels with the help of a Least Mean Square (LMS) engine. Equalization at the receiver can be done in multiple ways. One such way is shown in Figure 2.6. The received signal is first sampled and quantized with an Analog to Digital Converter (ADC), which is followed by a Finite Impulse Response (FIR) filter. The tap weights in the digital FIR filter are adapted with the help of an LMS engine. This ADC must sample at the full 10 Gbps data rate, and ADCs operating at such rates are power hungry. Also, this type of equalizer is not easily scalable to higher data rates such as 40 Gbps.



Figure 2.6: A digital equalizer

Another way of carrying out equalization at the receiver is through the Decision Feedback Equalizer (DFE) [3], shown in Figure 2.7. It uses previous decisions to estimate and cancel the ISI introduced by the channel. One of the main advantages of a DFE is that it does not amplify noise, whereas other equalizer topologies at the receiver suffer from noise enhancement [3]. However, a DFE can cancel only post-cursor ISI terms [3]. Hence, in many practical implementations a DFE is used along with an FFE.

Figure 2.7: A 3-tap DFE
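The decision feedback principle can be sketched as follows. This is a noise-free behavioural illustration under assumptions: the bit-spaced channel samples and the bit pattern are made up, the feedback taps are set equal to the assumed post-cursor samples, and the function names are hypothetical.

```python
def channel_output(symbols, pulse):
    """Bit-spaced channel model: received(n) = sum_j pulse[j] * symbols[n - j]."""
    return [sum(pulse[j] * symbols[n - j] for j in range(len(pulse)) if n - j >= 0)
            for n in range(len(symbols))]

def slicer(value):
    """Decide between the two symbol levels +0.5 and -0.5."""
    return 0.5 if value > 0 else -0.5

def dfe_receive(received, feedback_taps):
    """Subtract the ISI predicted from past decisions, then slice."""
    past = [0.0] * len(feedback_taps)       # most recent decision first
    decisions = []
    for r in received:
        corrected = r - sum(b * d for b, d in zip(feedback_taps, past))
        d = slicer(corrected)
        decisions.append(d)
        past = [d] + past[:-1]
    return decisions

CHANNEL = [1.0, 0.6, 0.3, 0.2]              # assumed: cursor plus three post-cursors
bits = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0] * 4
symbols = [0.5 if b else -0.5 for b in bits]
received = channel_output(symbols, CHANNEL)

plain = [slicer(r) for r in received]
with_dfe = dfe_receive(received, CHANNEL[1:])   # 3 feedback taps
plain_errors = sum(1 for d, a in zip(plain, symbols) if d != a)
dfe_errors = sum(1 for d, a in zip(with_dfe, symbols) if d != a)
print(f"errors without DFE: {plain_errors}, with 3-tap DFE: {dfe_errors}")
```

Because the correction is built from sliced decisions rather than from the noisy received signal, the feedback path adds no noise of its own. The sketch also shows the limitation stated above: only samples that follow the cursor can be cancelled this way.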

#### 2.2.2.1 Analog equalizers

Equalization at the receiver can be performed using continuous time circuits as well. There are a number of ways of realizing analog equalizers. In [4], a passive boost stage is used to achieve equalization. In a few equalizer architectures, the tapped delay line structure is implemented using analog hardware [5]. The delays are usually implemented with on-chip transmission lines. In some implementations, they are mimicked with passive LC circuits. The output of each delay cell will be referred to as the basis response or tap response, and the delays are typically $\frac{T_b}{2}$. Such equalizers are called Fractionally Spaced Equalizers (FSEs). An FSE has the advantage of being insensitive to the phase of the sampling clock [5]. The tapped delay line structure is predominantly realized using either transversal or travelling wave (TWA) architectures, shown in Figures 2.8 and 2.9 [6]. The tap responses are scaled and added using tunable transconductors.



Figure 2.8: Transversal equalizer



Figure 2.9: Travelling wave equalizer

Some of the issues related to transversal and TWA equalizers are as follows. A doubly terminated delay line architecture suffers from attenuation by a factor of two. To compensate for this, the transconductors have to be scaled up, leading to increased power dissipation. A transversal equalizer is severely affected by transmission line nonidealities such as series loss and mistermination of the transmission lines [6]. The input parasitic capacitances of the transconductors load the delay lines.

Equalizers at the receiver should be suitable for LMS adaptation. The complexity of LMS adaptation is lower when the basis responses are available. The basis responses are readily available in a transversal equalizer, but not in a TWA equalizer. Alternative equalizer architectures addressing the problems in TWA equalizers can be found in [6] and [7].

The reader will now be introduced to another equalizer architecture, used at the receiver, which we will call the 'Continuous Time Equalizer (CTE)'. It is based on [1]. To understand the working of this architecture, Figure 2.2 is redrawn with the equalizer at the receiver included, as shown in Figure 2.10.



Figure 2.10: Equalization at the receiver with CTE

Here, p(t) represents the impulse response of the transmitter pulse shape convolved with the impulse response of the channel.  $x_i(t)$  represents the impulse response of each tap in the equalizer. Figure 2.10 can be used to analyze most of the analog equalizers at the receiver. Different architectures differ only by their tap responses. n(t) represents the input referred noise of the equalizer and it is assumed to be white with double sided spectral density of  $\frac{N_0}{2}$ .

In an FSE, the different tap impulse responses used for equalization differ from each other by a delay of $\frac{T_b}{2}$. Also, these tap responses hardly span more than a bit period of the input. Hence, the circuit that generates them must have a large bandwidth. This can be tough to achieve at high data rates, and even where it is possible, it can lead to large power dissipation.

If it is possible to find a set of tap impulse responses that last over several bit periods and can still accomplish equalization, then they can potentially be implemented at a much lower power, since the bandwidth requirements are relaxed. One such solution is given in [1]. Here, the first five state variables (capacitor voltages and inductor currents) of a singly terminated seventh order Butterworth filter of bandwidth 5 GHz are used for equalization. The ladder is shown in Figure 2.11, while the pulse responses of the individual taps are shown in Figure 2.12.



Figure 2.11: Singly terminated LC ladder filter

These responses are linearly combined with appropriate tap weights to accomplish equalization. The state responses in a singly terminated LC ladder filter are orthogonal [1]. The CTE has several advantages in comparison to a transversal or TWA architecture. They are described as follows.

Since the bandwidth of the filter used to generate these tap responses is low, there is no need for an anti-alias filter before the sampler. The tap responses can easily be scaled and added using tunable transconductors. The way to sense the inductor currents will be discussed in Chapter 3. The input parasitic capacitance of the transconductors is not a problem, as it can be 'absorbed' into the ladder capacitors. Since all the tap responses are readily available in the LC ladder filter, this architecture is suitable for LMS based adaptation. A singly terminated ladder is used in the CTE, so the tap responses are not attenuated by a factor of two as they are in a doubly terminated ladder, which leads to lower power dissipation. Like an FSE, the CTE is insensitive to the sampling phase. The reader is referred to [1] for a comparison of the CTE and the FSE and for further explanation of the working of the CTE.

Figure 2.12: Tap pulse responses

# 2.3 Minimum mean square error solution to find the tap weights

One of the ways to find the tap weights is through 'Zero forcing' [3], where the sampled output after equalization is ensured to have only the cursor term, while other samples are made zero. However, this method suffers from noise enhancement. To minimize noise enhancement, the Minimum Mean Square Error (MMSE) solution is used. This is dealt with in detail in [1] and is repeated here briefly for completeness. Consider the case of equalization done at the receiver as shown in Figure 2.10. The tap responses have to be scaled with appropriate weights and then added in order to perform equalization. The tap weights are computed so as to minimize the mean square error between the transmitted symbols a(n) and the sampled equalizer output y(n).

The output of equalizer is given by

$$y(t) = \sum_{i=1}^{N} \sum_{k=-\infty}^{+\infty} w_i a(k) c_i(t - kT_b) + \sum_{i=1}^{N} w_i n(t) * x_i(t)$$
(2.1)

where  $c_i(t) = p(t) * x_i(t)$  and \* denotes convolution. The equalizer output y(t) is sampled at integral multiples of  $T_b$ . Let  $t_0$  be the phase offset between the transmitter and receiver clocks. The sampled output of the equalizer is

$$y(n) = \sum_{i=1}^{N} \sum_{k=-\infty}^{n} w_i a(k) c_i (nT_b + t_0 - kT_b) + \sum_{i=1}^{N} w_i (n(t) * x_i(t))|_{t=nT_b + t_0}$$
(2.2)

This can be cast in matrix form as

$$y(n) = a^{T}(n) \begin{bmatrix} c_{1}(0 \cdot T_{b}) & \dots & c_{N}(0 \cdot T_{b}) \\ c_{1}(1 \cdot T_{b}) & \dots & c_{N}(1 \cdot T_{b}) \\ \vdots & \ddots & \vdots \\ c_{1}(L \cdot T_{b}) & \dots & c_{N}(L \cdot T_{b}) \end{bmatrix} \begin{bmatrix} w_{1} \\ w_{2} \\ \vdots \\ w_{N} \end{bmatrix} + \eta^{T}(n) \begin{bmatrix} w_{1} \\ w_{2} \\ \vdots \\ w_{N} \end{bmatrix}$$

$$y(n) = a^{T}(n)Cw + \eta^{T}(n)w$$
(2.3)

The various matrices used in Equation 2.3 are described as follows. C is an $(L+1) \times N$ matrix, where N is the number of taps used and L is the number of bit periods up to which $c_i(t)$ exists and after which it is negligible. The $i^{th}$ column of matrix C corresponds to the sampled pulse response of the channel cascaded with the $i^{th}$ tap, that is, the samples of $c_i(t)$ taken at $nT_b + t_0$. $a^T(n) = [a(n) \ a(n-1) \ \cdots \ a(n-L)]$ denotes the transmitted symbols and $\eta^T(n) = [(n(t) * x_1(t))|_{t=nT_b+t_0} \ \cdots \ (n(t) * x_N(t))|_{t=nT_b+t_0}]$. w is an $N \times 1$ matrix denoting the weights for each tap.

The tap weights are chosen to minimize the mean square error between the samples of the equalizer output y(n) and the ideal output $a(n - \delta)$, where $\delta$ is the delay after which the output is expected. The ideal output corresponds to the overall sampled response $h_{\delta}^{T} = [0 \ 0 \ 1 \ 0 \ \cdots \ 0]$, a $1 \times (L+1)$ matrix whose only nonzero entry is a 1 at the position corresponding to the delay $\delta$.

The optimum tap weights  $(w_{opt})$  and Minimum Mean Square Error (MMSE) are given by [1]

$$w_{opt} = A^{-1}C^T h_\delta \tag{2.4}$$

$$MMSE = \sigma^2 h_\delta^T (I - CA^{-1}C^T) h_\delta$$
(2.5)

where $A=C^TC+\frac{M}{\sigma^2}$ and $\sigma$ is the standard deviation of the input. M is the $N \times N$ correlation matrix of the sampled noise at the tap outputs, $M = E[\eta(n)\eta^T(n)]$.

The optimum delay corresponds to the minimum diagonal element of  $I - CA^{-1}C^{T}$  [1].
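The computation in Equations 2.4 and 2.5 can be sketched in a few lines of Python. This is only an illustrative sketch and not the code used in this work: the function names, the small Gaussian elimination solver and the numbers in the example are assumptions, and M is taken to be supplied by the caller as an $N \times N$ matrix. It uses the fact that $C^T h_\delta$ is simply row $\delta$ of C written as a column.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def mmse_taps(C, M, sigma2):
    """Return (best_delay, w_opt, mmse) following Equations 2.4 and 2.5."""
    CtC = matmul(transpose(C), C)
    n = len(CtC)
    A = [[CtC[i][j] + M[i][j] / sigma2 for j in range(n)] for i in range(n)]
    best = None
    for delta, row in enumerate(C):
        w = solve(A, list(row))                       # w = A^-1 C^T h_delta
        # sigma^2 times the delta-th diagonal element of (I - C A^-1 C^T)
        mse = sigma2 * (1.0 - sum(c * wi for c, wi in zip(row, w)))
        if best is None or mse < best[2]:
            best = (delta, w, mse)
    return best


# Hypothetical single-tap example: three samples of c_1(t) and a noise term.
C = [[0.2], [1.0], [0.3]]
M = [[0.1]]
delay, w_opt, mmse = mmse_taps(C, M, sigma2=1.0)
```

In this made-up example the search picks the delay that aligns with the largest sample of the pulse response, which is the behaviour one would expect from the minimum diagonal element rule.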

# 2.4 Simulation results

The performance of CTE is analyzed by using it to equalize a known channel in simulation. A channel model corresponding to Polarization Mode Dispersion (PMD) in single-mode optical fiber is used for this purpose. The impulse response of the channel is given by

$$h(t) = 0.5\delta(t) + 0.5\delta(t-\tau)$$
(2.6)

where $\tau$ is called the Differential Group Delay (DGD); for this simulation $\tau$ is assumed to be $\frac{12}{16}T_b$. The weights for the five taps in the CTE are computed for the PMD channel as described in Section 2.3. The equalizer is tested by transmitting PRBS data into the channel and plotting the eye diagram after equalization. The eye diagrams without and with equalization are shown in Figures 2.13 and 2.14 respectively.



Figure 2.13: Eye diagram without equalization



Figure 2.14: Eye diagram with equalization
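The effect of the PMD channel of Equation 2.6 on the unequalized eye can be illustrated with a short Python sketch. It is not the simulation used to produce the figures above: it assumes ideal rectangular NRZ pulses with levels of -1 and +1, 16 samples per bit (so that $\tau$ is 12 samples) and a PRBS-7 input, and the function names are hypothetical.

```python
def prbs7(n_bits, state=0x7F):
    """Generate n_bits of a PRBS-7 sequence (x^7 + x^6 + 1) as 0/1 values."""
    bits = []
    for _ in range(n_bits):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


def pmd_eye_opening(bits, samples_per_bit=16, dgd_samples=12):
    """Vertical eye opening at each sampling phase after h(t) = 0.5 d(t) + 0.5 d(t - tau)."""
    # Ideal rectangular NRZ waveform (an assumption; no rise-time filtering).
    x = [2.0 * b - 1.0 for b in bits for _ in range(samples_per_bit)]
    # y[i] is the channel output at sample index i + dgd_samples.
    y = [0.5 * x[n] + 0.5 * x[n - dgd_samples] for n in range(dgd_samples, len(x))]
    openings = []
    for phase in range(samples_per_bit):
        ones, zeros = [], []
        for k in range(1, len(bits)):            # bit 0 has no predecessor
            sample = y[k * samples_per_bit + phase - dgd_samples]
            (ones if bits[k] else zeros).append(sample)
        openings.append(min(ones) - max(zeros))
    return openings


openings = pmd_eye_opening(prbs7(127))
```

Under these idealized assumptions the eye is completely closed for the first 12 of the 16 sampling phases and fully open only for the last 4, which is why a DGD of $\frac{12}{16}T_b$ is a demanding test case for the equalizer.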

# **CHAPTER 3**

# Design of equalizer in 0.13 $\mu$ m CMOS technology

The CTE described in Chapter 2 is implemented in UMC 0.13  $\mu$ m CMOS process. It uses the first five state responses from a seventh order Butterworth filter of bandwidth 5 GHz. These state responses have to be scaled and added to equalize the channel. The tap weights for equalization are found using the method described in Section 2.3.

In this chapter, we will discuss how the CTE is implemented, the various challenges involved, before concluding with the simulation results of the design. All the designs carried out are fully differential unless otherwise mentioned.

## 3.1 Implementation

The first major task is the weighted summation of the state responses. The capacitor voltages can be easily sensed with transconductors, which are often realized using differential pairs. Sensing inductor current is not a trivial task. One can think of introducing a resistor in series with the inductor and sensing the voltage across it. However, this alters the transfer function of the filter (which can be taken care of by pre-distorting the transfer function) and also introduces additional thermal noise. A simple idea that employs duality was proposed in [1]. In this method, a dual ladder is used to sense the inductor current. An inductor current in the original network appears as a capacitor voltage in the dual network. Now, all the state variables appear as capacitor voltages, which can be sensed with transconductors. Addition of these scaled responses is performed by simply shorting the transconductor outputs.

The bond-wires which connect the equalizer inputs to the package will alter the transfer function of the ladder, since typical bond-wire inductances are of the order of a couple of nH and are comparable to those of the ladder inductors. This can be fixed as follows. The input voltage to the equalizer is fed to a transconductor, and the transconductor's output current is fed into the ladder. This is shown in Figure 3.1.



Figure 3.1: Implementation of CTE

#### **3.1.1** Tunable transconductors

The state responses are scaled and added using transconductors. These transconductors have to be tunable since the CTE should be able to adapt to different channels. A differential pair operating in its linear region operates as a transconductor - the incremental output current is proportional to the incremental input voltage. The linearity of the transconductor is better, if the transconductors are biased at a higher overdrive. It is usually biased at a smaller overdrive in case of using a differential pair as a Current Mode Logic (CML) buffer [3]. In this design typical overdrives are of the order of 250 mV. To change the transconductance of the differential pair, four binary weighted differential pairs are connected in parallel and are controlled by four bits. Each differential pair can be selectively enabled by controlling the gate of the tail transistor as shown in Figure 3.2. When a differential pair is to be enabled, the gate node of the tail transistor is connected to the biasing node and to disable it, the gate node is connected to ground.

The tap weights of the equalizer can be positive or negative depending upon the channel characteristics. Hence, a provision to change the polarity of the transconductance is added. This is done by having two four-bit transconductors, one with normal inputs and the other with the inputs reversed. The tunable transconductor for every tap is thus controlled by eight bits.



Figure 3.2: Tunable transconductor
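As an illustration of how a signed tap weight could map onto the eight control bits, the following Python sketch quantizes a weight to a 4-bit magnitude and routes it to either the normal or the reversed bank. The mapping is an assumption made for illustration (in particular, that the unused bank is fully disabled and that `full_scale` corresponds to all four pairs being enabled); the function names are hypothetical.

```python
def weight_to_code(weight, full_scale, bits=4):
    """Return (normal_bank_code, reversed_bank_code) for a signed tap weight."""
    max_code = (1 << bits) - 1
    magnitude = min(max_code, round(abs(weight) / full_scale * max_code))
    # Positive weights use the bank with normal inputs, negative weights
    # use the bank with reversed inputs; the other bank is switched off.
    return (magnitude, 0) if weight >= 0 else (0, magnitude)


def code_to_weight(normal_code, reversed_code, full_scale, bits=4):
    """Weight realized by a pair of binary-weighted bank codes."""
    max_code = (1 << bits) - 1
    return (normal_code - reversed_code) * full_scale / max_code


codes = weight_to_code(-0.4, full_scale=1.0)
realized = code_to_weight(*codes, full_scale=1.0)
```

Because the summed current is proportional to the difference of the two bank codes, the same weight could also be realized with both banks partly enabled, at the cost of extra supply current.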

Various trade-offs in this implementation are discussed below. The summing point of the transconductors is subject to bandwidth limitations. All the tunable transconductor outputs are shorted and converted to voltage by passing the current through a resistor. The other end of the resistor is connected to the supply in order to bias the differential pairs properly. Since the summing node is loaded by several blocks, it suffers from poor bandwidth. A simple solution is to reduce the load resistance, but this reduces the dc gain of the differential pairs. The gain can be restored by increasing the transconductance through a higher bias current, but this increases power dissipation. Transconductance can also be increased by increasing the device size; however, this again loads the summing node. So, the key to reducing power dissipation is to make the load resistance as large as possible and to reduce the device parasitics to get good bandwidth. The load resistance was chosen to be $100 \Omega$.
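The gain versus bandwidth trade-off at the summing node can be made concrete with a first-order estimate. The sketch below assumes a single-pole RC model of the node; the transconductance and the node capacitance used in the example are hypothetical values chosen only for illustration, and only the $100 \Omega$ load comes from the design.

```python
import math


def summing_node(gm_total, r_load, c_node):
    """Return (dc_gain, bandwidth_hz) of a single-pole summing node."""
    dc_gain = gm_total * r_load
    bandwidth_hz = 1.0 / (2.0 * math.pi * r_load * c_node)
    return dc_gain, bandwidth_hz


# Hypothetical numbers: 10 mS total transconductance, 200 fF node capacitance.
gain, bw = summing_node(gm_total=10e-3, r_load=100.0, c_node=200e-15)
gain_half_r, bw_half_r = summing_node(gm_total=10e-3, r_load=50.0, c_node=200e-15)
```

Halving the load resistance doubles the bandwidth but halves the gain, so the lost gain has to be bought back with bias current, as described above.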

The finite bandwidth at the summing node is still a problem, which when not taken care of, will result in a smaller eye opening in the equalized eye diagram. The effect of finite bandwidth at the summing node is similar to that of the channel. Hence the new tap weights can be computed taking into account this 'additional' channel.

A comparison of the absolute sum of tap weights for different bandwidths at the summing node is shown in Figure 3.3. The motivation for this simulation is as follows. For a given data rate, the absolute sum of tap weights increases with increasing ISI. A large absolute sum of tap weights means that more differential pairs (in the weighted summation block) are turned ON, which implies that more current is drawn from the supply through the load resistor, thereby resulting in increased power dissipation. Hence, it always helps to have as high a bandwidth as possible at the summing node.

#### 3.1.2 Second stage transconductor

It is difficult to achieve the required gain in a single stage of differential pair, since the tap weights are too large to be implemented with just a single stage amplifier. Only a scaled down version of the tap weights was implemented in the first stage. This is not a problem since the ISI cancellation depends upon the ratio of the tap weights rather than the absolute numbers. Hence, the first stage output will only be a scaled down version of the ideal output. So, the signal at the summing point of the transconductors is amplified by using another differential pair. The circuit schematic of this differential pair is shown in Figure 3.4. The opamp used in the Common Mode Feedback (CMFB) is shown in Figure 3.5. The opamp is Miller compensated to ensure stability of the amplifier [8]. Ideally, a differential pair with resistive load does not need a CMFB. The reason why a CMFB is introduced at the output of the differential pair will be evident from Section 3.2.

Figure 3.3: Absolute sum of tap weights vs. summing node bandwidth

#### **3.1.3** Input transconductor

The input transconductor is implemented using a simple differential pair. It has a tunable load which brings the resistance close to its required value of $100 \Omega$ to account for process variations. The circuit schematic of the input transconductor is shown in Figure 3.6. The schematic of the tunable load is shown in Figure 3.7. The control bits b[3:0] come from a binary to thermometric code converter. The cascode transistor is biased through a resistor to Vdd. This is done to avoid any common mode oscillation of the cascode that can happen with the bond-wire inductance.

Figure 3.4: Second stage transconductor

### 3.1.4 Realization of LC filter

The main component in the realization of the LC ladder is the design of the inductors. Inductors are designed using a CAD tool called ASITIC [9]. On-chip inductors are realized as spirals of various shapes. ASITIC is used to find the dimensions of the spirals. For the obtained dimensions, the parasitics are extracted for use in circuit design. Since we have a fully differential operation, rather than using separate inductors for the two ladders, mutual inductance is exploited to realize the same inductance in a smaller area as shown in Figure 3.8. The two inductors are laid out such that their fluxes add up. The finite quality factor 'Q' of the inductor is taken care of in the ladder by introducing shunt capacitive loss [1]. This technique is widely used in transmission line based equalizers. The loss in dc gain due to this can be corrected by appropriately scaling the tap weights in the tunable transconductors.

Figure 3.5: CMFB opamp

The circuit model of the inductor used in simulations is shown in Figure 3.9. Capacitor $C_p$ is added from nodes $v_{ip}$ to $v_{op}$ and from $v_{im}$ to $v_{om}$. This is done to take care of the coupling capacitors $C_p$ from nodes $v_{ip}$ to $v_{om}$ and from $v_{im}$ to $v_{op}$. The dimensions of the inductors found from ASITIC are given in Table 3.1, where dout denotes the outer diameter of the spiral, W is the width of each turn, S is the spacing between successive turns, N is the number of turns of the spiral, L is the self inductance, M is the mutual inductance and R is the series resistance of the inductor.

Figure 3.6: Input transconductor

Figure 3.7: Tunable $100 \Omega$ load

In this design, inductors are laid out in Metal-8 (the topmost metal layer in the process). The thickness of the top metal layer is about $2\,\mu m$ and its sheet resistance is $15\,m\Omega$ per square. Since the inductors are laid out in the top metal layer, they see less parasitic capacitance to ground and achieve a better Q due to the lower sheet resistance.



Figure 3.8: Fully differential LC ladder



Figure 3.9: Circuit model of differential Inductor





Figure 3.10: Layout of a fully differential inductor

| S.No | dout ($\mu m$) | W ($\mu m$) | S ($\mu m$) | N   | L (nH) | M (nH) | R ($\Omega$) |
|------|---------------|------------|---------------|-----|-------|-------|----------------------|
| 1    | 113           | 4.5        | 2             | 3.5 | 1.21  | 0.88  | 3.9                  |
| 2    | 127           | 3          | 2             | 4.5 | 2.45  | 1.983 | 8.4                  |
| 3    | 140           | 2.8        | 2             | 4.5 | 3.12  | 2.58  | 10.4                 |
| 4    | 133           | 3          | 2             | 4.5 | 2.70  | 2.202 | 8.9                  |
| 5    | 138           | 3          | 2             | 4.5 | 2.92  | 2.39  | 9.4                  |
| 6    | 120           | 3.6        | 2             | 4.5 | 1.86  | 1.46  | 6.2                  |
| 7    | 92            | 7.1        | 2             | 2.5 | 0.45  | 0.25  | 1.4                  |

Table 3.1: Inductor dimensions
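The entries of Table 3.1 can be turned into a rough per-branch estimate with a few lines of Python. The sketch assumes that, with the fluxes of the two coupled spirals adding in differential operation, each half of the ladder sees an effective inductance of L + M, and it estimates Q from the series resistance alone. Substrate and other high frequency losses are ignored, so the Q values are optimistic upper bounds rather than results from this work; the function names are hypothetical.

```python
import math

# (L in nH, M in nH, R in ohm) for inductors 1 to 7, copied from Table 3.1
INDUCTORS = [
    (1.21, 0.88, 3.9),
    (2.45, 1.983, 8.4),
    (3.12, 2.58, 10.4),
    (2.70, 2.202, 8.9),
    (2.92, 2.39, 9.4),
    (1.86, 1.46, 6.2),
    (0.45, 0.25, 1.4),
]


def effective_inductance_nh(l_nh, m_nh):
    """Per-side inductance in differential mode, assuming the fluxes add."""
    return l_nh + m_nh


def series_q(l_nh, m_nh, r_ohm, freq_hz):
    """Q set by series resistance only (ignores substrate loss)."""
    l_eff = effective_inductance_nh(l_nh, m_nh) * 1e-9
    return 2.0 * math.pi * freq_hz * l_eff / r_ohm


l_eff_all = [effective_inductance_nh(l, m) for l, m, _ in INDUCTORS]
q_at_5ghz = [series_q(l, m, r, 5e9) for l, m, r in INDUCTORS]
```

For the first inductor this gives an effective inductance of about 2.09 nH, which illustrates how the mutual coupling saves area compared with two uncoupled spirals of the same value.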

# 3.2 Equalizer frequency response measurement

The equalizer works at 10 Gbps. It is hard to ensure that the circuits that follow the equalizer (including the bond-wire inductance and package parasitics) have a bandwidth greater than 10 GHz so as to introduce negligible ISI. Hence, the following method is used to characterize the frequency response of the equalizer, which takes into account the transfer functions of the circuits that follow the equalizer.

In this measurement, the equalizer is followed by a test buffer terminated with 50 $\Omega$ to drive off-chip loads. The idea used here to measure the frequency response is similar to the one described in [10]. This is shown in Figure 3.11. The technique usually employed to measure the frequency response is as follows. First, the frequency response of the equalizer and the test buffer in cascade, $V_{eqbuff}(f)$, is measured. Then, the frequency response of the test buffer alone, $V_{buff}(f)$, is measured. The ratio of these two frequency responses gives the frequency response of the equalizer, $V_{eq}(f)$.

$$V_{eq}(f) = \frac{V_{eqbuff}(f)}{V_{buff}(f)}$$
(3.1)

This method is effective in characterizing the frequency response of filters in the pass band. However, in the stop band of filters this method cannot accurately characterize the frequency response. One of the reasons for this is the feedthrough in the package [10]. By having a simple polarity reversal in the gain of the test buffer, the package feedthrough effects can be mitigated to a great degree [10]. There is a provision in the test buffer to change the polarity of its gain, i.e., the output of the test buffer will be either $k(v_{ip} - v_{im})$ or $k(v_{im} - v_{ip})$, where k is the dc gain of the test buffer and $v_{ip}$, $v_{im}$ are the inputs of the test buffer.

Now, four measurements have to be taken to measure the frequency response of the equalizer. First, the frequency response of the equalizer and test buffer in cascade, $V_{eqbuff}(f)$, is measured. Then the polarity of the test buffer gain is reversed and the frequency response of the equalizer and test buffer in cascade, $V_{eqbuffb}(f)$, is measured again. The difference between these two measurements gives the frequency response of the equalizer and test buffer in cascade. Similarly, the test buffer's frequency response $V_{buff}(f) - V_{buffb}(f)$ is measured. The ratio of these two differences gives the frequency response of the equalizer that is free from package feedthrough effects [10].

$$V_{eq}(f) = \frac{V_{eqbuff}(f) - V_{eqbuffb}(f)}{V_{buff}(f) - V_{buffb}(f)}$$
(3.2)
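Why Equation 3.2 removes the feedthrough can be seen from a small numerical sketch at a single frequency. The model below is an assumption made for illustration: the package feedthrough is taken to be an additive term that does not change when the test buffer polarity is reversed, and all the complex values are made up.

```python
def equalizer_response(v_eqbuff, v_eqbuffb, v_buff, v_buffb):
    """Equation 3.2 evaluated at one frequency point."""
    return (v_eqbuff - v_eqbuffb) / (v_buff - v_buffb)


# Hypothetical stop-band values at one frequency.
h_eq = 0.01 * complex(0.6, -0.8)      # true equalizer response (about -40 dB)
h_buf = complex(0.5, -0.2)            # test buffer and package response
feed_filter = complex(0.004, 0.003)   # feedthrough seen in the filter path
feed_direct = complex(0.002, -0.001)  # feedthrough seen in the direct path

# The wanted signal flips sign with the buffer polarity, the feedthrough does not.
v_eqbuff = h_eq * h_buf + feed_filter
v_eqbuffb = -h_eq * h_buf + feed_filter
v_buff = h_buf + feed_direct
v_buffb = -h_buf + feed_direct

naive = v_eqbuff / v_buff                                    # Equation 3.1
corrected = equalizer_response(v_eqbuff, v_eqbuffb, v_buff, v_buffb)
```

In this model the factor of two produced by each subtraction cancels in the ratio, so `corrected` recovers `h_eq`, whereas `naive` is dominated by the feedthrough.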

The CMFB used at the output of the second stage transconductor ensures that the input common mode of the test buffer in the filter path and that in the direct path are the same, so that the frequency responses of the two test buffers are identical.



Figure 3.11: Equalizer frequency response measurement

The following are circuits that are designed to enable the testability of the equalizer.

#### **3.2.1** Design of test buffer

Test buffer helps in measuring the frequency response of equalizer. The circuit diagram for test buffer is shown in Figure 3.12. The test buffer is implemented using two differential pairs whose outputs are shorted together. Input to one of the differential pairs is reversed.

Both the differential pairs can be selectively enabled which helps in changing the polarity of the gain. This is done by connecting the tail transistor gate to either ground or to the biasing node.

#### 3.2.2 2:1 MUX

The second stage transconductor drives a test buffer and a source follower. The source follower is used to level shift the output of the equalizer so that it can drive the EOM, which operates with a 1.2 V supply. The output of the source follower drives one of the inputs of a 2:1 multiplexer (MUX). The other input of the 2:1 MUX is driven by the input of the equalizer. This helps in standalone testing of the EOM. The schematics of the source follower and the 2:1 MUX are shown in Figure 3.13. The 2:1 MUX is implemented using two differential pairs with their outputs shorted together, with each input given to one differential pair. To select a particular input, the gate node of the cascode transistor of the corresponding differential pair is connected to Vdd. The overall test setup of the equalizer is shown in Figure 3.14.

Figure 3.12: Test buffer

## **3.3 Digital control block**

Each tunable transconductor requires an 8-bit digital code. Hence, for 5 taps, 40 bits are required. The tuning bits for the equalizer are brought from off-chip through a single pin and converted to parallel data using an on-chip serial to parallel converter. The serial to parallel converter is coded in Verilog and synthesized using 'Design Compiler' from Synopsys. Place and Route is done using 'SOC Encounter' from Cadence.
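The behaviour of such a serial to parallel converter can be modelled in Python as a 40-bit shift register. This is only a behavioural sketch and not the Verilog used on chip: the bit order (most significant bit first, tap 1 first) and the function names are assumptions.

```python
def serialize_codes(codes, bits_per_tap=8):
    """Flatten per-tap codes into a serial bit stream, MSB first."""
    stream = []
    for code in codes:
        for i in reversed(range(bits_per_tap)):
            stream.append((code >> i) & 1)
    return stream


def shift_register_load(stream, n_taps=5, bits_per_tap=8):
    """Shift the stream into a register and read back the parallel tap codes."""
    total_bits = n_taps * bits_per_tap
    reg = 0
    for bit in stream:
        reg = ((reg << 1) | bit) & ((1 << total_bits) - 1)
    mask = (1 << bits_per_tap) - 1
    # The code shifted in first ends up in the most significant byte.
    return [(reg >> (bits_per_tap * (n_taps - 1 - t))) & mask for t in range(n_taps)]


tap_codes = [0x0F, 0xF0, 0x3C, 0x00, 0xA5]   # hypothetical 8-bit codes
stream = serialize_codes(tap_codes)
parallel = shift_register_load(stream)
```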



Figure 3.13: Source followers and 2:1 MUX



Figure 3.14: Equalizer with test circuits

## **3.4** Simulation results

Equalizer is simulated by passing a PRBS-7 sequence through a Bessel filter of bandwidth 7 GHz (models the rise time of input data) and then through a Polarization Mode Dispersion (PMD) Channel whose impulse response is given in Equation 2.6. The eye diagram at the input of equalizer and at the output of equalizer are shown in Figures 3.15 and 3.16. A summary of the simulated performance of the equalizer is given in Table 3.2.







Figure 3.16: Eye diagram with equalization

| Parameters                      | Achieved performance |
|---------------------------------|----------------------|
| Data rate                       | 10 Gbps              |
| Power dissipation               | 90 mW                |
| Technology                      | $0.13\,\mu m$ CMOS   |
| Supply voltage (Analog/Digital) | 2 V/1.2 V            |

Table 3.2: Summary of equalizer performance

## **CHAPTER 4**

## **Design of Eye Opening Monitor**

An Eye Opening Monitor (EOM) is a system which gives an estimate of the eye diagram of a signal. Initially, EOMs were used for measuring BER and later one dimensional (1-D) EOMs were reported that measure the vertical eye opening. In [11], EOM is part of an adaptive equalizer that is used to compensate for the dispersion in an optical fiber. A way to measure the horizontal eye opening was demonstrated in [12]. Recently, [13] illustrated a method to capture two-dimensional (2-D) map of the eye diagram. They used 2-D rectangular masks of various sizes that are swept across the eye. The number of signal transitions inside the mask is recorded as an error. The contour of rectangular masks of the same mask error gives the 2-D map of the eye.

Basically, an eye diagram is made up of several bit periods of the input laid one over the other. Hence to capture the eye diagram it is not necessary to know the waveform completely. Rather, the input can be sampled once in few bit periods and still the eye diagram can be captured. This idea of undersampling is used in the present design [2].

### 4.1 **Principle of operation**

The working of the EOM is as follows. Consider an eye diagram as shown in Figure 4.1 (a) and assume that there are a finite number of phases between 0 and $T_b$ and a finite number of voltage levels on the vertical axis. The input signal (for which the eye diagram is to be captured) is sampled at a particular phase of the clock and compared with a voltage level. Sampling and comparison are done for several bits of the input and the output of the comparator is averaged over a large number of bits. This is repeated for every voltage level for a given phase. The averaged values thus obtained form the Cumulative Distribution Function (CDF), i.e., the probability of the input signal lying below a particular level. When we differentiate the CDF, we obtain the Probability Density Function (PDF), i.e., the probability of the input waveform lying between two particular voltage levels at a given phase. The same procedure is repeated for every phase from 0 to $T_b$. Thus, we obtain a PDF matrix where each column of the matrix corresponds to the PDF of the input data for a particular phase. An intensity mapping is done with this matrix to get a grey scale image of the eye diagram, where the intensity of a point depends upon the PDF calculated as above. The number of phases within $T_b$ and the number of reference levels used directly determine the clarity of the eye diagram generated.



Figure 4.1: EOM - operation

To illustrate the principle of the EOM, consider phases P1 and P2 as shown in Figure 4.1 (a). The PDFs of the input waveform at phases P1 and P2 are shown in Figures 4.1 (b) and 4.1 (c) respectively. As expected, the probability is high for voltage levels where the possibility of the waveform lying between those levels is high. As we traverse along the vertical axis at phase P1, we find that the input waveform lies predominantly around four voltage levels and hence we see four peaks in Figure 4.1 (b). A similar explanation can be given for the PDF at phase P2. An image which is developed with this PDF will have a brighter pixel where the probability is high and a darker pixel where the probability is low. The reconstructed eye diagram is shown in Figure 4.1 (d).
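The CDF to PDF procedure described above can be sketched in Python as follows. This is an idealized software model and not the MATLAB post-processing used in this work: the comparator is assumed to output a 1 whenever the sampled input lies below the reference level, the averaging is an exact mean, and the function names are hypothetical.

```python
def cdf_at_phase(samples, levels):
    """Average the comparator decisions: fraction of samples below each level."""
    return [sum(1 for s in samples if s < ref) / len(samples) for ref in levels]


def pdf_from_cdf(cdf):
    """Differentiate the CDF: probability of lying between adjacent levels."""
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


def eye_pdf_matrix(samples_by_phase, levels):
    """One PDF per clock phase; each entry becomes a column of the eye image."""
    return [pdf_from_cdf(cdf_at_phase(s, levels)) for s in samples_by_phase]


# Toy example: at this phase the signal sits at -1 or +1 equally often.
levels = [-1.5, -0.5, 0.5, 1.5]
cdf = cdf_at_phase([-1.0, -1.0, 1.0, 1.0], levels)
pdf = pdf_from_cdf(cdf)
```

In the toy example the PDF has two peaks, one around each signal level, and is zero in between, which corresponds to an open eye at that phase.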

# 4.2 Implementation



Figure 4.2: Block diagram of EOM

The block diagram of the EOM is shown in Figure 4.2. A clocked comparator performs sampling and comparison. Different voltage levels are obtained from a Digital to Analog Converter (DAC). Multiple phases of the clock are obtained using a phase interpolator. Averaging is done using a continuous time low pass filter. To get a digital estimate of the average value, the output of the low pass filter is quantized using an Analog to Digital Converter (ADC). The only high speed blocks in the EOM are the comparator and the phase interpolator. The comparator, phase interpolator and low pass filter are fully differential whereas the ADC takes a single-ended input. Hence, the fully differential output from the low pass filter is converted to single-ended before driving the ADC. Differentiation and intensity mapping are done with the quantized output from the ADC using MATLAB. The tuning of the reference level using the DAC, phase selection and analog to digital conversion are all automated with the help of a digital control logic. The specifications for the various blocks, derived from system level simulations of the EOM in MATLAB, are given in Table 4.1 [2].

| Blocks             | Specifications |
|--------------------|----------------|
| Comparator         | 5 bits         |
| DAC                | 5 bits         |
| Phase Interpolator | 32 phases      |
| ADC                | 8 bits         |

Table 4.1: Specifications for various blocks in EOM

## 4.3 High speed blocks

#### 4.3.1 High speed comparator

The comparator plays a vital part in the working of the EOM. The working of the comparator is described as follows. First, the difference between the input (for which the eye diagram is to be reconstructed) and the DAC output is amplified using a differential pair. The circuit diagram of this differential pair is shown in Figure 4.3. This circuit works well even when there is a difference in the common mode voltages of the DAC and the input signal. The output of this differential pair is fed to a flip-flop which samples it at the rising edge of a clock which comes from a phase interpolator. The comparator is similar to the one implemented in [13]. The flip-flop is realized using a master-slave topology. The latches are implemented using the standard current mode logic (CML) architecture [3]. The circuit schematics of the master and slave latches are shown in Figures 4.4 and 4.5 respectively. The working of the master latch is as follows. When clkp is at logic '1', the differential pair amplifies the input difference $(v_{ip} - v_{im})$. When clkp is at logic '0' (clkm='1'), the output regenerates with the help of transistors M1 and M2. The working of the slave latch is similar to that of the master latch.

Figure 4.3: Comparator

The input amplitude to the slave latch is quite large compared to that in the first latch. Hence it takes less time for slave latch to settle to its final value. So, to reduce power dissipation, the load resistance in the slave latch is increased to  $600 \Omega$  and tail current is reduced to 0.5 mA.



Figure 4.4: Master latch

#### 4.3.2 Phase interpolator

The phase interpolator generates the multiple phases required by the EOM. The simplest way to generate multiple phases of a clock is to pass the clock through a series of buffers having delays equal to the phase resolution required. It is found that the buffer delays are larger than the resolution required. Hence, interpolation is used to generate finer phases. In this design, fifty six phases are generated against thirty two phases (from specifications) to account for on-chip delay variation. The basic idea of the phase interpolator is shown in Figure 4.6. The input clock goes through a series of buffers which generate eight coarse phases. Seven phases have to be generated between each of these coarse phases. To do this, one of the eight coarse phases is selected using an 8:1 MUX. The output of the MUX goes through another set of buffers which are identical to the coarse buffers. The outputs of two adjacent buffers are fed to the fine interpolator. These two outputs are identical to two adjacent coarse phases. The fine interpolator generates seven phases between these two. These are then amplified before being fed to the comparator. The approach to fine interpolation used herein is analogous to that in [14], except that the interpolator gain is varied with the help of binary weighted transconductors.

Figure 4.5: Slave latch

The time span of the reconstructed eye diagram generated by the EOM is equal to the sum of the delays in the coarse buffer chain. Thus, the eye diagram generated need not span one complete bit period of the input, because the on-chip delay varies with process. Hence, the buffer chain was designed considering the worst case delay variation. The circuit diagrams of the coarse buffer chain and the 8:1 MUX are shown in Figures 4.7 and 4.8 respectively.



Figure 4.6: Multiple phase clock generation

The phase interpolator is controlled by six bits; three MSBs are used to select the coarse phase and the three LSBs are used to select the fine phase. The three MSBs are fed to a 3 to 8 decoder. Only one of the eight outputs of the decoder is high at a time. The MUX shown in Figure 4.8 is implemented using eight differential pairs whose outputs are connected together. The eight coarse phases are given as inputs to these differential pairs. The selection of a particular coarse phase is done by connecting the cascode node of that differential pair to Vdd. All other differential pair cascodes are connected to ground. This way of realizing the MUX helps in isolating inputs from the output of MUX. The output of decoder drives the cascode nodes of the differential pairs.



Figure 4.7: Coarse phase generation

outputs are given as input to the fine interpolator. The operation of the fine interpolator can be thought of as a weighted summation of two inputs  $V_1$  and  $V_2$ ,  $\alpha$  being the weight of  $V_1$  and  $(1-\alpha)$  being the weight of  $V_2$ .

$$V_{out} = \alpha V_1 + (1 - \alpha)V_2 \tag{4.1}$$

where  $0 \le \alpha \le 1$ . Let  $V_2 = V_1 e^{j\phi}$ . For small values of  $\phi$ , the shift in output phase  $\theta$  with respect to  $V_1$  is linearly related to the phase difference  $\phi$  between the two inputs.

$$\theta \approx (1 - \alpha)\phi \tag{4.2}$$
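The range over which the linear approximation of Equation 4.2 holds can be checked numerically. The sketch below treats $V_1$ and $V_2$ as unit phasors, which is an idealization (equal amplitude sinusoids), and compares the exact phase of the weighted sum with $(1-\alpha)\phi$; the function name is hypothetical.

```python
import cmath


def interpolated_phase(alpha, phi):
    """Exact phase of alpha*V1 + (1 - alpha)*V2 for V1 = 1 and V2 = exp(j*phi)."""
    v1 = 1.0 + 0.0j
    v2 = cmath.exp(1j * phi)
    return cmath.phase(alpha * v1 + (1.0 - alpha) * v2)


small_phi_exact = interpolated_phase(0.25, 0.2)    # phi = 0.2 rad
small_phi_linear = (1.0 - 0.25) * 0.2

large_phi_exact = interpolated_phase(0.25, 1.5)    # phi = 1.5 rad
large_phi_linear = (1.0 - 0.25) * 1.5
```

For the small phase difference the exact and linear values agree closely, while for the large one they differ noticeably, which is why the coarse phases fed to the fine interpolator need to be closely spaced.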

Note that the fine interpolator is controlled by three bits and hence it is possible to generate 8 phases with the fine interpolator. However, only the first seven phases are used. It is found that skipping the last phase generated by the fine interpolator (for a particular coarse phase) ensures monotonicity of the phases generated by the phase interpolator, which is critical for the clarity of the reconstructed eye diagram.
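The six-bit control of the phase interpolator can be modelled as below. The sketch assumes ideal linear interpolation in which the eighth fine setting would coincide with the next coarse phase, so the used fine steps are one seventh of a coarse delay; the coarse delay value in the example and all function names are hypothetical.

```python
def decode_3to8(msbs):
    """One-hot decoder for the three MSBs that select the coarse phase."""
    return [1 if i == msbs else 0 for i in range(8)]


def phase_codes():
    """All used 6-bit codes: 8 coarse phases times the first 7 fine phases."""
    return [(coarse << 3) | fine for coarse in range(8) for fine in range(7)]


def ideal_phase(code, t_coarse):
    """Ideal clock delay for a code, with t_coarse the delay of one coarse buffer."""
    coarse, fine = code >> 3, code & 0b111
    return (coarse + fine / 7.0) * t_coarse


codes = phase_codes()
delays = [ideal_phase(c, t_coarse=14e-12) for c in codes]   # 14 ps is a made-up value
```

In this ideal model the 56 delays increase strictly with the position in the list, whereas including the eighth fine setting would repeat the delay of the next coarse phase.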



Figure 4.8: 8:1 MUX

The fine interpolator is implemented using two 3-bit binary weighted tunable transconductors whose outputs are shorted together. The tunable transconductor is shown in Figure 4.9. In order to get linear phases, the transconductance should scale linearly with the weights. The two tunable transconductors used in the fine interpolator are controlled by complementary signals. This ensures that the current through the summing resistor is the same for all codes, thereby eliminating any nonlinearity related to varying bandwidth in the fine interpolator. The first 14 phases generated using the phase interpolator are shown in Figure 4.10.

## 4.4 Low speed blocks

The following sections give the circuit details of the low speed blocks in the EOM.



Figure 4.9: Fine phase generation

### 4.4.1 Design of a 5-bit DAC

The input to the comparator is differential. Hence the tunable reference to the comparator is also made differential. The DAC is realized using a resistor string due to its simple implementation and inherently monotonic characteristic. To generate a differential output from the DAC, two such ladders are used for ease of layout. The schematic of the DAC is shown in Figure 4.11. Each resistor ladder has thirty two resistors and requires a differential reference. The differential references are generated by passing currents into a fully differential transimpedance amplifier as shown in Figure 4.12. These currents are generated in such a way that they track an on-chip resistor. Thus, a reference voltage which is insensitive to resistor variations is obtained.

A fully differential opamp is used in the transimpedance amplifier. It is a two stage opamp. Each stage uses a separate CMFB circuit. The circuit diagram of the opamp is shown in Figure 4.13. The CMFB for the second stage is provided by transistors M1 and M2. The first stage CMFB is shown in Figure 4.14. Both the CMFB circuits do not load



Figure 4.10: First 14 phases of phase interpolator

the amplifier resistively. The resistor tracking current sources are obtained by having an amplifier in negative feedback as shown in Figure 4.15. It uses the same resistor as the feedback resistor in the transimpedance amplifier to ensure better tracking. The amplifier used in resistor tracking is shown in Figure 4.16.

### 4.4.2 Design of averaging circuit

The output of the high speed comparator is averaged using a low pass filter. The bandwidth of the low pass filter is 75 kHz. The output of the low pass filter is reset at the end of every DAC level. The circuit diagram of the averaging circuit is shown in Figure 4.17. Dummy switches are added to take care of charge injection of the reset switch.
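A rough settling check relates the 75 kHz bandwidth to the duration of a DAC level. The sketch below assumes a single-pole filter and that the full 12 $\mu$s DAC-level duration quoted in Section 4.6 is available for settling (the reset time is ignored); both are simplifications.

```python
import math

F_3DB = 75e3      # low pass filter bandwidth from the text, Hz
T_LEVEL = 12e-6   # duration of one DAC level quoted in Section 4.6, s
N_BITS = 8        # resolution required after averaging (system simulations)

tau = 1 / (2 * math.pi * F_3DB)        # time constant of a single-pole filter
residual = math.exp(-T_LEVEL / tau)    # unsettled fraction after one DAC level
lsb_fraction = 1 / 2**N_BITS           # one 8-bit LSB as a fraction of full scale

print(f"tau = {tau * 1e6:.2f} us, T_level/tau = {T_LEVEL / tau:.2f}")
print(f"residual = {residual:.4%}, 8-bit LSB = {lsb_fraction:.4%}")
```

Under these assumptions the filter settles to within one 8-bit LSB in a DAC level, with little margin.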



Figure 4.11: A 5-bit DAC



Figure 4.12: DAC reference generation

### 4.4.3 Design of differential to single-ended converter

The output of the averaging circuit is differential. The ADC is designed for a full scale of 1.2 V and it takes single-ended input. The differential to single-ended converter (D/S) is designed for a gain of two since the (p-p) output swing of the comparator is 600 mV. The circuit diagram of the D/S is shown in Figure 4.18.



Figure 4.13: Fully differential opamp



Figure 4.14: First stage CMFB

The source follower is introduced to ensure that the D/S circuit does not load the averaging circuit. The source followers are biased with a large enough current to ensure



Figure 4.15: Resistor tracking current source / sink



Figure 4.16: Amplifier used in resistor tracking

linearity. From the system level simulations of the EOM, the resolution of the quantizer that follows the averaging circuit should be 8 bits.

Hence, the signal to noise ratio from the output of the averaging circuit to the output of the ADC should be at least 48 dB. So the D/S is designed to meet this noise specification. The resistors are chosen considering their noise contribution and the linearity of the source follower. The opamp used in this circuit is shown in Figure 4.19. The integrated noise



Figure 4.17: Averaging circuit



Figure 4.18: Differential to Single-ended conversion

at the output of the D/S is 490  $\mu$ Vrms  $\approx 0.4$  LSB, where 1 LSB =  $\frac{1.2V}{1024}$ .
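The numbers above can be reproduced with a few lines of arithmetic. The sketch uses the common rule of thumb of about 6.02 dB per bit for the 8-bit requirement; the variable names are illustrative.

```python
V_FS = 1.2          # ADC full scale, V
ADC_BITS = 10       # ADC resolution
NOISE_RMS = 490e-6  # integrated noise at the D/S output, V rms
EOM_BITS = 8        # resolution required by the EOM system simulations

lsb = V_FS / 2**ADC_BITS          # one ADC LSB in volts
noise_in_lsb = NOISE_RMS / lsb    # noise expressed in ADC LSBs
snr_needed_db = 6.02 * EOM_BITS   # approx. 6 dB per bit

print(f"1 LSB = {lsb * 1e3:.3f} mV")
print(f"noise = {noise_in_lsb:.2f} LSB")
print(f"SNR for {EOM_BITS} bits = {snr_needed_db:.1f} dB")
```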



Figure 4.19: Opamp used in D/S

### 4.4.4 Design of a 10-bit SAR ADC

D/S drives the SAR (Successive Approximation Register) ADC. Low speed ADCs are often implemented using SAR architecture due to its simple implementation. It works based on the binary search algorithm. A typical SAR ADC takes N cycles per conversion, where N is the resolution of the ADC. The main components of a SAR ADC are the sample and hold (SAH), DAC, comparator and the digital state machine. SAH and DAC are implemented in a single block. The DAC is implemented using an array of binary weighted capacitors. The capacitor DAC is segmented as 5 bits (MSBs) and 5 bits (LSBs) to reduce the area of the capacitor array. The overall block diagram of the ADC is shown in Figure 4.20. The input to the ADC can vary from 0 to 1.2 V. The ADC takes 12 cycles per conversion. The design works for a minimum clock period of 80 ns, though in practice it is 1  $\mu$ s.
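The binary search performed by a SAR converter can be sketched behaviourally as follows. This is an idealized model (perfect DAC and comparator, single-ended input), not a description of the circuit; the function name is hypothetical.

```python
def sar_convert(vin, vref=1.2, nbits=10):
    """Ideal successive approximation: resolve one bit per cycle, MSB first."""
    code = 0
    for bit in reversed(range(nbits)):
        trial = code | (1 << bit)            # tentatively set the current bit
        if vin >= trial * vref / 2**nbits:   # comparator decision against the DAC
            code = trial                     # keep the bit, otherwise it stays 0
    return code

# One sampling cycle, ten bit decisions and one dummy cycle give the
# 12 cycles per conversion mentioned in the text.
CYCLES_PER_CONVERSION = 1 + 10 + 1

for v in (0.0, 0.61, 0.7, 1.2):
    print(f"vin = {v:.2f} V -> code {sar_convert(v)}")
```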



Figure 4.20: Block Diagram of SAR ADC

#### 4.4.4.1 Capacitor DAC



Figure 4.21: Capacitor DAC

Two identical 5 bit DACs are used in this ADC which are connected together through

a coupling capacitor. The circuit diagram of the DAC is shown in Figure 4.21. The working of the ADC is as follows.

**Sampling :** First, the input is sampled on the capacitor array for one clock cycle. The bottom plates of all the capacitors are connected to the input. The MSB DAC output is connected to  $V_{bias} = 0.6 \text{ V} = \frac{V_{dd}}{2}$  through an nmos switch as shown in Figure 4.20. By the end of the sampling phase, the nmos switch that connects to  $V_{bias}$  is turned off before the sampling switch. This is done to eliminate signal dependent charge injection just as it is done in a bottom plate sampling. Sampling switches are implemented using simple transmission gates.

**Bit cycling :** After the input is sampled, the quantized bits are obtained from the Most Significant Bit (MSB) to the Least Significant Bit (LSB), one in each clock cycle. After sampling, the bottom plate of the 16C capacitor in the MSB array is connected to  $V_{ref}$ . The input is disconnected from the DAC. All other capacitor bottom plates are connected to ground. After the DAC settles, the comparator samples it and compares it with  $V_{bias}$  (0.6 V). This decision of the comparator is the MSB of the quantized value. When the output of the comparator is high, the bottom plate of the 16C capacitor in the MSB DAC remains connected to  $V_{ref}$  during the following cycles. Otherwise, it is connected to ground at the rising edge of the following clock cycle and remains connected to ground for the following cycles. This operation amounts to finding whether the input lies between 0 and  $\frac{V_{ref}}{2}$  or between  $\frac{V_{ref}}{2}$  and  $V_{ref}$ . The other bits are computed in the same way as the MSB. An nmos switch is used to connect the bottom plates of the capacitors to ground and a pmos switch is used to connect them to  $V_{ref}$ . It can be seen from Figure 4.21 that the switch sizes are binary weighted to ensure equal time constants for all bits. The control signals p[9:0] and n[9:0] control the pmos and nmos transistors respectively. These are generated by the digital logic which controls the various blocks of the EOM. After the 10 bits are resolved, all the pmos and nmos switches are turned off for a clock cycle. This is only a dummy phase that ensures proper working of the converter. A more detailed explanation of the working of the ADC can be found in [15], [16] and [17]. Control signals for the first two capacitors are shown in Figure 4.22 for  $V_{in} = 0$  V.

The unit capacitance is chosen considering  $\frac{kT}{C}$  noise associated with sampling and the mismatch between the capacitors. In actual implementation, all the capacitors are realized as multiples of a unit capacitor which is chosen as 50 fF.
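To give a feel for the sampling noise, the sketch below evaluates $\sqrt{kT/C}$ for the 50 fF unit capacitor. The total sampling capacitance of 32 units and the temperature of 300 K are assumptions made for illustration only; the effective sampling capacitance of the segmented array is not stated here.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
TEMP = 300.0         # assumed temperature, K
C_UNIT = 50e-15      # unit capacitor from the text, F
N_UNITS = 32         # assumed number of unit capacitors that sample the input

c_total = N_UNITS * C_UNIT
v_noise = math.sqrt(K_B * TEMP / c_total)   # rms sampling noise, V
lsb = 1.2 / 1024                            # one LSB of the 10-bit ADC, V

print(f"C_total = {c_total * 1e12:.2f} pF")
print(f"kT/C noise = {v_noise * 1e6:.1f} uV rms = {v_noise / lsb:.3f} LSB")
```

With these assumed values the sampling noise is a small fraction of an LSB, which suggests that capacitor matching rather than $\frac{kT}{C}$ noise would dominate the choice of the unit capacitor.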



Figure 4.22: Capacitor DAC control signals

#### 4.4.4.2 Comparator

The MSB DAC output goes to one of the inputs of the comparator, whose other input is 0.6 V. The comparator is implemented as pre-amplifier followed by latch. The circuit diagram of pre-amplifier is shown in Figure 4.23. It is fully differential. CMFB is provided by the pmos transistors M1 and M2. The dc-gain of the pre-amplifier is 32 dB. The latch is implemented using back to back connected inverters. The schematic of the latch is shown in Figure 4.24. When the clock to the latch is high, the back to back

connected inverters regenerate the input and when the clock is low, the inputs (outputs of pre-amplifier) are shorted. The output of latch is sampled using a D-flip-flop to ensure that the output of latch does not change within a clock cycle.



Figure 4.23: Pre-amplifier

#### 4.4.4.3 Simulating the ADC

The ADC is tested by giving a sinusoid of 1.2 V (p-p) at a frequency close to  $\frac{f_s}{2}$ . As already mentioned, the ADC takes 12 cycles per conversion; hence the sampling period  $\frac{1}{f_s}$  here is  $12 \times 80$  ns = 960 ns. The input frequency is chosen as

$$\frac{f_{in}}{f_s} = \frac{511}{1024} \tag{4.3}$$

so that the input tone falls on a single bin. The signal to (quantization+distortion) ratio (SNDR) of the ADC is found to be 61 dB while taking a 1024 point FFT. It varies by less than 1 dB across process corners. The spectrum of the ADC is shown in Figure 4.25. A summary of the ADC performance is given in Table 4.2.
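The choice in Equation 4.3 can be illustrated with an ideal 10-bit quantizer. Because 511 cycles fit exactly in 1024 samples, the tone occupies a single bin and the SNDR can be computed without windowing. The sketch below projects the record onto that bin and uses Parseval's relation for the remaining bins, which is equivalent to summing FFT bins; the quantizer model is ideal and is not the simulated circuit.

```python
import math

N = 1024      # record length
M = 511       # input cycles in the record, from Equation 4.3
NBITS = 10
VFS = 1.2     # full scale, V

def ideal_adc(v):
    """Ideal 10-bit quantizer with codes clipped to 0..1023."""
    code = int(v / VFS * 2**NBITS)
    return min(max(code, 0), 2**NBITS - 1)

def sndr_db(codes, m):
    """SNDR for a coherently sampled tone occupying bin m (0 < m < N/2)."""
    n = len(codes)
    mean = sum(codes) / n
    ac = [c - mean for c in codes]                     # remove the DC bin
    i = sum(a * math.cos(2 * math.pi * m * k / n) for k, a in enumerate(ac))
    q = sum(a * math.sin(2 * math.pi * m * k / n) for k, a in enumerate(ac))
    p_signal = 2 * (i * i + q * q) / n**2              # power in the signal bin
    p_total = sum(a * a for a in ac) / n               # power in all non-DC bins
    return 10 * math.log10(p_signal / (p_total - p_signal))

codes = [ideal_adc(VFS / 2 + VFS / 2 * math.sin(2 * math.pi * M * k / N))
         for k in range(N)]
print(f"gcd(M, N) = {math.gcd(M, N)}")
print(f"ideal 10-bit SNDR = {sndr_db(codes, M):.1f} dB")
```

Since 511 and 1024 share no common factor, every sample in the record falls on a distinct phase of the input. The result for the ideal quantizer should lie close to the usual $6.02N + 1.76$ dB estimate.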



Figure 4.25: ADC output spectrum

| Parameters        | Achieved performance |
|-------------------|----------------------|
| Input swing       | 0 to 1.2 V           |
| Sampling rate     | 1 MS/s               |
| SNDR              | 61 dB                |
| Power dissipation | $300\mu\mathrm{W}$   |

Table 4.2: Summary of ADC performance

The differential to single-ended converter loaded with the ADC is tested by giving a sinusoid of 600 mV (p-p differential) at the above mentioned frequency. The signal to (quantization+distortion) ratio at the output of the ADC is found to be 55.33 dB.

## 4.5 Digital control for EOM



Figure 4.26: Timing of various blocks

The operation of the ADC, DAC, low pass filter and phase interpolator is controlled by a digital control block. The ADC should sample the output of the averaging circuit (after the D/S) by the end of each DAC level. After the ADC samples the value, the averaging circuit is reset. During the reset period, the DAC output settles to the next level. This is shown in Figure 4.26. The ADC gives a serial digital data output. To identify the start of each quantized value, the 'sample' signal is brought off-chip. For each of the 56 phases, the

average value for 32 DAC levels is quantized by the ADC. The digital logic is coded using Verilog and synthesized using 'Design Compiler' from Synopsys. The place and route is done using 'SOC Encounter' from Cadence.
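The scan order described above can be sketched as a pair of nested loops that fill a 32 × 56 matrix of averaged comparator outputs (56 × 32 = 1792 ADC conversions per eye diagram). The function names and the toy input are hypothetical, and differencing adjacent DAC levels to obtain a histogram is an assumed post-processing step, shown only as one plausible way to use the data.

```python
N_PHASES = 56   # phases generated by the phase interpolator
N_LEVELS = 32   # levels of the 5-bit DAC

def scan(average_fn):
    """Visit every (phase, DAC level) pair and return matrix[level][phase]
    of averaged comparator outputs."""
    matrix = [[0.0] * N_PHASES for _ in range(N_LEVELS)]
    for phase in range(N_PHASES):
        for level in range(N_LEVELS):
            # The ADC samples at the end of the DAC level; the averaging
            # circuit is then reset while the DAC settles to the next level.
            matrix[level][phase] = average_fn(phase, level)
    return matrix

def pdf_from_averages(matrix):
    """Difference adjacent DAC levels (assumed post-processing step)."""
    return [[abs(matrix[l + 1][p] - matrix[l][p]) for p in range(N_PHASES)]
            for l in range(N_LEVELS - 1)]

# Toy input: a constant voltage sitting between DAC levels 10 and 11, so the
# comparator output averages to 1 for levels 0..10 and to 0 above that.
toy = scan(lambda phase, level: 1.0 if level <= 10 else 0.0)
pdf = pdf_from_averages(toy)
print(len(toy) * len(toy[0]), "conversions per eye diagram")
```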

## 4.6 Simulation results

The EOM consists of several blocks which operate at different speeds. For example, the phase interpolator works at 5 GHz whereas the DAC works at 1 MHz. Hence, simulating the entire EOM is time consuming, so the high speed and the low speed blocks are simulated separately. The high speed blocks are tested by giving different eye diagrams to the input, with the low frequency blocks replaced by their ideal equivalents.

The input eye diagrams are generated by passing a PRBS-7 sequence through a filter. To speed up the simulation, each DAC level now lasts for only 24 ns against 12  $\mu$ s in practice. The low pass filter (averaging circuit) is reset using an ideal switch. Figure 4.27 is the eye diagram at the input of EOM when PRBS data is passed through a fourth order Bessel filter of bandwidth 7.5 GHz. The output of the filter is given to the 2:1 MUX which drives the EOM. The reconstructed eye diagram is shown in Figure 4.28.





Figure 4.28: Reconstructed eye diagram

Figure 4.29 is the eye diagram at the input of the EOM when the PRBS data is passed through a Polarization Mode Dispersion (PMD) channel, the channel model which is used for equalizer simulations. The reconstructed eye diagram is shown in Figure 4.30.


Figure 4.29: Input eye diagram

Figure 4.30: Reconstructed eye diagram

| Parameters        | Achieved performance |
|-------------------|----------------------|
| Input data rate   | 10 Gbps              |
| Power dissipation | 72 mW                |
| Technology        | $0.13\,\mu\mathrm{m}$ CMOS |
| Supply            | 1.2 V                |

Table 4.3: Summary of EOM performance

The results of the EOM are summarized in Table 4.3.

## **CHAPTER 5**

#### **Design of a 10 Gbps PRBS**

## 5.1 PRBS

A Pseudo Random Bit Sequence (PRBS) is used to test the performance of equalizers. A PRBS-7 is designed to work at 10 GHz clock rate, where '7' indicates that the length of the PRBS is  $2^7 - 1$ . The implementation of the PRBS is shown in Figure 5.1.



Figure 5.1: PRBS-7

The output of the PRBS is fed to a cascade of two CML buffers, to amplify the signal before driving off-chip loads. The first buffer has a load of  $250 \Omega$  and the second buffer, which drives the off-chip load, has a  $50 \Omega$  load. The flip-flops shown in Figure 5.1 are realized using a master-slave topology. The latches are implemented using the standard CML architecture. Though a single-ended version of the PRBS is shown in Figure 5.1 for simplicity, the implementation is fully differential. The circuit schematics for the flip-flop, X-OR and CML buffers are shown in Figures 5.2, 5.3 and 5.4 respectively.

Figure 5.1 shows that the clock is routed in a direction opposite to that of data. This is done to ensure that there is no timing violation in the PRBS. PRBS has a problem with regard to starting up. When the outputs of all the flip-flops are at logic '0', PRBS

gets stuck to that state and it is not possible to generate a PRBS sequence. Hence a provision has been made in the X-OR to give a trigger to the input of the first flip-flop. This can be seen in Figure 5.3. When 'trig' is high, the load resistors are disconnected from Vdd. The gate of the tail transistor is pulled low and it is turned off. Now,  $V_{op}$  is pulled high and  $V_{om}$  is pulled low. This makes the output of X-OR high when 'trig' is applied. When 'trig' goes low, the load resistors are connected to Vdd, tail transistor gate is connected to bias thereby working as normal X-OR. The eye diagram of the output of PRBS is shown in Figure 5.5. A summary of achieved performance is shown in Table 5.1.
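The sequence length and the lock-up in the all-zero state can be seen in a behavioural model of a 7-stage shift register with X-OR feedback. The sketch assumes the commonly used PRBS-7 polynomial $x^7 + x^6 + 1$, with the feedback taken from the last two stages; the taps actually used in Figure 5.1 may differ.

```python
def prbs7_step(state):
    """Advance a 7-bit shift register by one clock.
    Feedback is the X-OR of the last two stages (assumed taps)."""
    feedback = ((state >> 6) ^ (state >> 5)) & 1
    return ((state << 1) | feedback) & 0x7F

def period(seed):
    """Number of clocks after which the register returns to the seed state."""
    state, count = prbs7_step(seed), 1
    while state != seed:
        state = prbs7_step(state)
        count += 1
    return count

def prbs7_bits(seed, n):
    """First n output bits, taken from the last stage of the register."""
    bits, state = [], seed
    for _ in range(n):
        bits.append((state >> 6) & 1)
        state = prbs7_step(state)
    return bits

print("period from a non-zero seed:", period(0x01))
print("period from the all-zero state:", period(0x00))
print("ones in one period:", sum(prbs7_bits(0x01, 127)))
```

Any non-zero state cycles through all $2^7 - 1$ non-zero states, while the all-zero state maps to itself. Forcing the X-OR output high for one clock, as 'trig' does, moves the register out of this state.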



Figure 5.2: Flip-flop used in PRBS



Figure 5.4: CML Buffer



Figure 5.5: Eye diagram at the output of PRBS

Table 5.1: Summary of PRBS performance

| Parameters        | Achieved performance             |
|-------------------|----------------------------------|
| Data Rate         | 10 Gbps                          |
| Power Dissipation | 31 mW                            |
| Technology        | $0.13\mu\mathrm{m}\mathrm{CMOS}$ |
| Supply            | 1.2 V                            |

## 5.2 Integration of blocks

The equalizer, EOM and PRBS are integrated on a single chip which will be called 'CTE\_EOM'. The block diagram of the chip is shown in Figure 5.6. The output of the PRBS drives a channel which is off-chip. The output of the channel is fed to the equalizer. There are provisions on the chip to test the frequency response of the equalizer and to capture the eye diagram of the output of the equalizer. The frequency response is measured with the help of the test buffers shown in Figure 5.6. The eye diagram is captured using the EOM. The EOM can also be tested in stand-alone mode by feeding the input to the 2:1 MUX.



Figure 5.6: Block diagram of the chip

## 5.3 Layout techniques



Figure 5.7: Layout of the CTE\_EOM chip

Layout of the various circuit blocks is extremely critical for the performance of the chip. Especially, the high frequency blocks have to be laid out carefully in order to meet the bandwidth requirements. Some of the techniques that were used in the layout of high speed circuit blocks will be described in this section.

All the high frequency routings are done with Metal-8. This metal layer has a smaller sheet resistance owing to its greater thickness compared to the other layers. Also, since

it is far away from the substrate (which is grounded), the parasitic capacitance to ground is smaller. The differential output routings were kept as far apart as possible in order to minimize the coupling capacitance between them. This coupling capacitance is particularly important since the effective capacitance to ground that it adds to each net of a differential pair is twice the coupling capacitance. Each half of a differential pair is laid out as symmetrically as possible with respect to the other in order to get good common mode rejection. Unlike in a conventional layout, the signal carrying nets are not shielded, in order to minimize the parasitic capacitance to ground. Nets which carry high currents were made wider to minimize IR drop and also to meet the electromigration rules. The low speed blocks were routed using conventional layout techniques. The active area of the CTE\_EOM chip is  $2.134\ \mathrm{mm}^2$ . The chip is fabricated in the UMC 0.13  $\mu$ m CMOS process through Europractice.
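The factor of two associated with the coupling capacitance follows from a short argument, sketched here under the assumption of ideal differential signals $+v$ and $-v$ on two nets coupled by a capacitance $C_c$ (a symbol introduced only for this illustration). The current drawn from one net through the coupling capacitance is

$$i = C_c \frac{d}{dt}\left[v - (-v)\right] = 2 C_c \frac{dv}{dt}$$

so each net sees an equivalent capacitance of $2C_c$ to ground, in addition to its own parasitic capacitance to ground.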



### 5.3.1 Layout of the capacitor array in SAR ADC

Figure 5.8: Layout of the capacitor array

The layout of capacitor array in the DAC of SAR ADC is critical to the operation of the

ADC. It is illustrated in Figure 5.8. Care was taken to ensure that the parasitic capacitances from the top plate of each capacitor to its bottom plate are also binary weighted. The top plates of the capacitors are shielded to minimize coupling to other signals. Binary weighted capacitors were realized as multiples of unit capacitors connected in parallel.

## **CHAPTER 6**

#### **Measurement techniques and IC characterization**

## 6.1 Test setup

The CTE\_EOM chip is packaged with a Quad Flat No leads (QFN) package with fifty six pins. The pin diagram and the die photograph of the chip are shown in Figures 6.1 and 6.2 respectively. The test setup for transient measurements is illustrated in Figure 6.3.

PRBS data at different frequencies is generated from Centellax TG1B1-A Bit Error Rate Tester (BERT). This instrument can also be used as an error detector to find the BER of the given signal. Output of EOM, which is a serial low speed digital data, is captured with Agilent logic analyzer 1682-AD. The serial data is then taken to a PC where it is processed in MATLAB to obtain the reconstructed eye diagram. The high speed clock required by the phase interpolator in EOM and the on-chip PRBS is provided by Centellax TG1C1-A clock synthesizer. The PRBS source and the EOM are clocked by the same clock synthesizer. It is also possible to get a divided clock from the synthesizer which is synchronized to the high speed clock. The eye diagram of the output of equalizer and the on-chip PRBS are probed with Lecroy WaveExpert 100H oscilloscope.

Figure 6.4 shows the test setup for frequency response measurement. Single-ended measurements were done since only a two port Vector Network Analyzer (VNA) was available. However, the provision to change the polarity of the gain of the test buffers helps in making differential measurements. This will be explained in Section 6.2. Each of the five tap frequency responses was measured by appropriately setting the tap weights



Figure 6.1: Pin diagram of the chip

in the on-chip serial to parallel converter with the help of a Xilinx Virtex-5 Field Programmable Gate Array (FPGA).

A four layer PCB is used to test the chip. The dielectric height between the first two layers is 6 mils and the board thickness is 0.8 mm. All the high speed inputs and outputs are ac-coupled and are routed with impedance controlled PCB traces to match  $50 \Omega$  characteristic impedance in the top most (first) layer. Edge mount SMA connectors are used for high speed inputs and outputs. Vias are avoided in the high speed signal traces to avoid impedance discontinuities. The dielectric material used in the PCB is FR-4 glass epoxy. The first layer and the fourth layer of the PCB contain signal routings;



Figure 6.2: Die photograph of the chip

the second layer is a dedicated ground plane while the third layer is used for various supplies needed by the chip. The different supply voltages needed by the chip are generated using on-board Low Drop-out regulators (LDOs). A snapshot of the test board is shown in Figure 6.5.

## 6.2 Making differential measurements with a two port VNA

The equalizer in the chip is a fully differential structure. In order to characterize it (measuring its frequency response), one would require a four port network analyzer which is expensive. Another way to accomplish differential measurements is through the use of broadband baluns and a two port VNA as shown in Figure 6.6. The input from Port-1 of the VNA is converted to differential and fed to the equalizer, while the differential outputs from the equalizer are converted to single-ended to be fed to Port-2. However, baluns working for a wideband are again expensive. The following method is an economical way of doing differential measurements with only a two port VNA.

Let the differential outputs of the equalizer (before the test buffers) be  $v_{op}$  and  $v_{om}$  and that of the test buffer be  $v_{op1}$  and  $v_{om1}$  as shown in Figure 6.7. One of the differential



Figure 6.3: Test setup for eye diagram measurement

inputs of the equalizer is excited with a single-ended signal while the other input is tied to the input common mode voltage. There will be a finite common mode component in both  $v_{op}$  and  $v_{om}$  which has to be subtracted in order to get the desired response. Let the differential dc-gain of the test buffer be k. It can be changed between +k and -k, a provision in the test buffer that helped us eliminate the package feedthrough effects. The frequency response at the test buffer's output  $v_{op1}$  or at  $v_{om1}$  is measured, first with the test buffer gain equal to +k,  $V_{eqbuffp}(f)$ , and then with the gain being -k,  $V_{eqbuffm}(f)$ . When the difference between these two responses is taken, we get the differential response in the filter path. The same procedure is repeated in the direct path to obtain  $V_{buffp}(f)$  and  $V_{buffm}(f)$ . Once the direct path and the filter path responses are known, the equalizer's



Figure 6.4: Test setup for frequency response measurement

frequency response can be found easily.

$$V_{eqdiff}(f) = \frac{V_{eqbuffp}(f) - V_{eqbuffm}(f)}{V_{buffp}(f) - V_{buffm}(f)} \tag{6.1}$$

It is to be noted that this procedure does not disturb the existing test setup and requires no additional hardware to make measurements. This method was used in all the frequency domain measurements carried out in this chip.
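Equation 6.1 can be applied point by point to the four measured complex responses. The sketch below checks the idea on synthetic data. It assumes a simple model in which every measurement is the sum of a polarity-independent feedthrough term and a term that changes sign with the buffer gain; the numerical values and names are hypothetical.

```python
def equalizer_response(v_eqbuffp, v_eqbuffm, v_buffp, v_buffm):
    """Equation 6.1 evaluated at each frequency point (complex values)."""
    return [(ep - em) / (bp - bm)
            for ep, em, bp, bm in zip(v_eqbuffp, v_eqbuffm, v_buffp, v_buffm)]

# Synthetic check with hypothetical numbers at three frequency points.
K = 0.5                                    # magnitude of the test buffer gain
FEEDTHROUGH = 0.01 + 0.02j                 # same for both gain polarities (assumed)
h_eq = [1 + 0j, 0.8 - 0.3j, 0.5 - 0.5j]    # "true" equalizer response
h_buf = [1 + 0j, 0.9 - 0.1j, 0.7 - 0.2j]   # test buffer and path response

eq_p = [FEEDTHROUGH + K * b * h for b, h in zip(h_buf, h_eq)]
eq_m = [FEEDTHROUGH - K * b * h for b, h in zip(h_buf, h_eq)]
buf_p = [FEEDTHROUGH + K * b for b in h_buf]
buf_m = [FEEDTHROUGH - K * b for b in h_buf]

recovered = equalizer_response(eq_p, eq_m, buf_p, buf_m)
for true, rec in zip(h_eq, recovered):
    print(f"true = {true:.3f}   recovered = {rec:.3f}")
```

In this model both the feedthrough and the buffer response cancel, leaving only the equalizer response.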

## 6.3 PRBS outputs

The on-chip PRBS was tested at different frequencies and was found to work up to 4.4 Gbps. The input clock to the PRBS was attenuated so strongly by the I/O parasitics that the PRBS could work only up to 4.4 Gbps, against 10 Gbps in



Figure 6.5: Snapshot of the test board



Figure 6.6: Differential measurements with baluns

simulation. The PRBS needs a differential clock having an amplitude of at least 350 mV (p-p single-ended) as per the design. The input clock amplitude for frequencies above



Figure 6.7: Differential measurements



Figure 6.8: PRBS output at 4.4 Gbps, Scale: 65 mV/div, 50 ps/div

4.4 GHz was found to be lower. This was verified with the results obtained from the EOM while testing it for sinusoidal inputs. The eye diagram of the PRBS output at 4.4 Gbps captured using Lecroy oscilloscope is shown in Figure 6.8. The vertical eye opening is 152.8 mV and the horizontal eye opening is 164 ps.

## 6.4 Testing the equalizer

Testing the equalizer involves the following steps. The sampled NRZ pulse response of the channel cascaded with each of the taps has to be obtained. This is then used to

compute the appropriate tap weights for the equalizer for a given channel. Getting the sampled NRZ pulse response can be carried out in a number of ways. The simplest way one can think of, is to feed a periodic NRZ pulse train having a small duty cycle (ON time is 100 ps for 10 Gbps data rate) to the channel-tap cascade and obtain the sampled response from the Lecroy oscilloscope. The 'OFF' time of the pulse train should be long enough so that the pulse response of the circuit settles before the arrival of the next pulse. However, generating a low duty cycle pulse train is not a trivial task. The instruments that were available to us did not have the capability to generate a pre-defined pattern. However, a PRBS source was available. If we are able to get the step response, we can easily compute the pulse response. The step response can be obtained with the help of the PRBS source as follows. One can feed a low frequency PRBS signal to the channel-tap cascade and obtain the response of it with the Lecroy sampling oscilloscope that locks to the PRBS signal. The response to a long string of zeros followed by long string of ones can be obtained, which is basically the step response. It was possible to get the samples of the step response with very small sampling time (0.78125 ps) with the Lecroy oscilloscope.
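Computing the pulse response from the step response amounts to subtracting a copy of the step response delayed by one bit period. With the 0.78125 ps sampling step, a 100 ps bit period corresponds to exactly 128 samples. The sketch below uses a first-order step response with an assumed 50 ps time constant in place of measured data, and assumes the response is zero before the edge.

```python
import math

DT = 0.78125e-12                      # oscilloscope sampling step from the text, s
T_BIT = 100e-12                       # bit period at 10 Gbps, s
SAMPLES_PER_BIT = round(T_BIT / DT)   # samples in one bit period

def pulse_from_step(step, samples_per_bit):
    """p[n] = s[n] - s[n - samples_per_bit], with s taken as 0 before the edge."""
    return [s - (step[n - samples_per_bit] if n >= samples_per_bit else 0.0)
            for n, s in enumerate(step)]

TAU = 50e-12   # assumed time constant of a stand-in first-order channel
step = [1 - math.exp(-n * DT / TAU) for n in range(1024)]
pulse = pulse_from_step(step, SAMPLES_PER_BIT)

peak = max(pulse)
print(f"samples per bit = {SAMPLES_PER_BIT}")
print(f"pulse peak = {peak:.4f} at t = {pulse.index(peak) * DT * 1e12:.1f} ps")
```

Sampling the resulting pulse response once per bit period gives the samples needed for the tap weight computation.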

Another way to obtain the pulse response is from the frequency response. The frequency response of each channel-tap cascade is measured using techniques mentioned in Section 6.2. The frequency response thus obtained is multiplied with the frequency response of an NRZ pulse. Inverse Fourier transform is then computed numerically to obtain the pulse response.
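The frequency domain route can be sketched in discrete time. The frequency response, sampled on the DFT grid, is multiplied by the DFT of a one-bit-wide NRZ pulse and the product is transformed back. A direct DFT is used to keep the example self-contained; the first-order response stands in for a measured one, and the record length and samples per bit are arbitrary illustrative choices. Measured data would first have to be interpolated onto the DFT grid.

```python
import cmath

def dft(x):
    """Direct discrete Fourier transform (adequate for short records)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_real(spectrum):
    """Inverse DFT, keeping the real part of the result."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def pulse_response(h_bins, samples_per_bit):
    """Multiply the frequency response by the NRZ pulse spectrum and invert."""
    n = len(h_bins)
    nrz = [1.0 if t < samples_per_bit else 0.0 for t in range(n)]
    return idft_real([h * p for h, p in zip(h_bins, dft(nrz))])

N, SPB = 64, 8
FC = 0.05   # assumed corner frequency of the stand-in response, cycles/sample
freqs = [k / N if k <= N // 2 else (k - N) / N for k in range(N)]
h_lowpass = [1 / (1 + 1j * f / FC) for f in freqs]

ideal = pulse_response([1.0] * N, SPB)      # flat response returns the NRZ pulse
filtered = pulse_response(h_lowpass, SPB)
print([round(v, 3) for v in filtered[:16]])
```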

Both the methods were found to work well when the measurements were taken. The direct path in the chip was used for measuring the frequency response of the equalizer alone, de-embedding the test buffer. However, the direct path is actually the channel which the equalizer has to compensate for. The channel basically includes the cable from the PRBS source to the board, the PCB trace up to the chip input on the board, the parasitics up to the on-chip equalizer input, and then the path from the output of the equalizer up to the oscilloscope. The frequency response of the channel can be obtained by taking

the frequency response of the direct path buffer.

The frequency response of the individual taps de-embedding the frequency response in the direct path are shown in Figure 6.9. The frequency response is shown only up to 6.5 GHz since the responses became noisier after that. From these frequency responses, the pulse responses were computed and the tap weights were found and programmed into the on-chip serial to parallel converter. The equalizer was found to work up to 5 Gbps equalizing the channel in the direct path.



Figure 6.9: Frequency response of individual taps

A comparison of tap1's normalized frequency response with that of an ideal seventh order Butterworth response having a 3-dB bandwidth of 5 GHz is shown in Figure 6.10. The measured 3-dB bandwidth of tap1 is 1 GHz, whereas ideally it is 5 GHz. Even in the worst case process corner (in simulation), the bandwidth of tap1 was found to be 2.5 GHz. A possible reason for the reduction in bandwidth is a higher series resistance of the ladder inductors. Also, only a narrow band model was used for inductor

in simulations involving the ladder filters. A more accurate, broadband model should be used to model the inductors.

As one can see from Figure 6.10, the high frequency roll-off of tap1 is not as sharp as it is in the Butterworth response. Hence we could get the equalizer to work up to 5 Gbps even though the bandwidth of tap1 is reduced by almost a factor of five.



Figure 6.10: Comparison with Ideal seventh order Butterworth response

The normalized frequency responses of the channel, equalizer and the cascaded response of channel and equalizer are shown in Figure 6.11. The attenuation of the channel is 15.5 dB at 2.5 GHz and 27.5 dB at 5 GHz. Equalizer gives a boost of 10.22 dB at 2.5 GHz and 7.16 dB at 5 GHz.

The pulse response of each tap cascaded with the channel is shown in Figure 6.12, while the pulse response of the channel cascaded with the equalizer is shown in Figure 6.13. These pulse responses were obtained from the Lecroy oscilloscope by feeding low frequency PRBS data to the equalizer input. The amplitudes of the pulse responses are smaller



Figure 6.11: Frequency response of channel, equalizer and channel+equalizer

because of the smaller dc-gain of the test buffer.

The equalizer was tested with a PRBS-31 sequence at a 5 Gbps data rate and the eye diagram was plotted with the Lecroy oscilloscope. The eye diagrams before and after equalization are shown in Figures 6.14 and 6.15 respectively. The vertical eye opening in the equalized eye is 21.5 mV while the horizontal eye opening is 104.9 ps. The small vertical eye opening is due to the smaller dc-gain of the test buffers.

To measure the BER of the equalizer, the differential outputs of the equalizer are fed to the differential inputs of the receiver (error detector) in the Centellax TG1B1-A BER tester (BERT). Since the BERT has a finite input sensitivity and the equalizer output amplitude is small, the equalizer outputs could not be fed directly to the BERT. Hence they are first amplified with a broadband amplifier, the ZKL-2R5 from Minicircuits, before driving the BERT. The BERT has a provision to find the optimum sampling point in the received signal that gives the lowest BER. The BER of the equalizer at the optimum



Figure 6.12: Pulse response of tap cascaded with channel



Figure 6.13: Pulse response of channel+equalizer



Figure 6.14: Eye diagram without equalization, Scale: 20 mV/div, 50 ps/div



Figure 6.15: Eye diagram with equalization, Scale: 10 mV/div, 50 ps/div

sampling point was found to be  $2.31 \times 10^{-8}$ . However, the bandwidth of the broadband amplifier is only 2.5 GHz. Hence the actual BER of the equalizer will be smaller than

| Parameters               | Achieved performance  |
|--------------------------|-----------------------|
| Data rate                | 5 Gbps                |
| Channel loss @ Nyquist   | 15.5 dB               |
| BER without equalization | $4.81 \times 10^{-1}$ |
| BER with equalization    | $1.63 \times 10^{-9}$ |

Table 6.1: Summary of measured equalizer performance

the measured one. To fix this problem, the finite bandwidth of the broadband amplifier was treated as part of the channel and the equalizer taps were recomputed for the new channel. The measured BER was then  $1.63 \times 10^{-9}$ . However, it was found from measurements that the amplifier was beginning to saturate even for an input swing of 35 mV (p-p single-ended). So, there is a possibility of the amplifier becoming non-linear, and hence the actual BER can potentially be lower than that reported here. The BER in the channel (without equalization) is  $4.81 \times 10^{-1}$ . A summary of the performance of the equalizer measured from the fabricated chip is given in Table 6.1.

## 6.5 Testing the EOM

The CTE\_EOM chip has a provision to test the EOM in stand-alone mode and also in a mode where it gives the eye diagram of the equalizer output (before the test buffer). The test setup in Figure 6.3 is used for this measurement. The provision to get a divided clock from the clock synthesizer enabled us to test the undersampling feature exploited by the EOM. The on-chip digital logic which controls the various blocks in the EOM is controlled by the FPGA. The on-chip digital logic is run at a very low frequency in order to ensure that the averaging of the comparator output (that is done on-chip) is carried out over a large number of input bits.

The reconstructed eye diagrams obtained from the EOM in stand-alone mode for 4 Gbps and 5 Gbps PRBS data are shown in Figures 6.16 and 6.17 respectively. It was found that for data rates above 5 Gbps the eye diagram generated by the EOM is completely closed because the high frequency components of the input are attenuated by the channel up to the EOM input. It is to be noted that the eye diagram generated by the EOM in the stand-alone mode is nothing but the eye diagram at the input of the equalizer, since the EOM and the equalizer share their inputs. The reconstructed eye diagram at the input of the equalizer, for a 5 GHz sinusoidal input having an amplitude of 2 V (p-p single-ended), is shown in Figure 6.18. The eye diagram at the output of the equalizer, generated by the EOM for 5 Gbps data, is shown in Figure 6.19.



Figure 6.16: Eye diagram from EOM, without equalization for a 4 Gbps PRBS input

The time span of the generated eye diagram depends on the sum of the delays in the coarse buffer chain of the multiple phase generator used in the EOM. Hence, for PRBS data at rates lower than 4 Gbps, only a fraction of a bit period of the eye is obtained. To find the time span of the generated eye diagram, a sinusoid of known frequency is given as input to the EOM. The time span can then be estimated from the fraction of a period of the sine wave that is captured. Alternatively, the samples of the continuous-time input signal can be recovered from the PDF matrix obtained for the sinusoidal input, as follows. For each column of the PDF matrix, find the voltage level at which the PDF is maximum and assign that voltage to the corresponding phase.



Figure 6.17: Eye diagram from EOM, without equalization for a 5 Gbps PRBS input



Figure 6.18: Eye diagram from EOM, for a 5 GHz sinusoidal input

Figure 6.19: Eye diagram from EOM, after equalization for a 5 Gbps PRBS input

When this is repeated for all phases, we obtain the samples of the input sinusoid. A Fast Fourier Transform (FFT) can then be applied to the sampled sequence to find the time step (time span/number of phases), since the record length and the input frequency are known. The time step of the EOM was found to be 3.3 ps.
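The time-step estimate described above can be sketched as follows. This is a minimal illustration, not the measurement script used for the chip: the PDF matrix is synthetic (one-hot columns for an ideal sinusoid), and the function names, the 64 phases, the 33 voltage levels and the 12.5 ps step are all assumptions, chosen so that the record spans a whole number of cycles. A plain DFT stands in for a library FFT to keep the block self-contained.

```python
import cmath
import math


def samples_from_pdf_matrix(pdf, levels):
    """For each phase (column), return the voltage level whose PDF entry is largest."""
    n_levels = len(pdf)
    n_phases = len(pdf[0])
    samples = []
    for col in range(n_phases):
        best_row = max(range(n_levels), key=lambda r: pdf[r][col])
        samples.append(levels[best_row])
    return samples


def dominant_bin(x):
    """Index (1..N/2) of the largest-magnitude DFT bin, with the mean removed."""
    n = len(x)
    mean = sum(x) / n
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        acc = sum((x[i] - mean) * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        if abs(acc) > best_mag:
            best_k, best_mag = k, abs(acc)
    return best_k


def estimate_time_step(pdf, levels, f_in_hz):
    """A peak at bin k of an N-point record means f_in * dt = k / N."""
    x = samples_from_pdf_matrix(pdf, levels)
    k = dominant_bin(x)
    return k / (len(x) * f_in_hz)


# Synthetic PDF matrix (illustrative values, not measured data):
# 4 cycles of a 5 GHz sinusoid captured over 64 phases, i.e. dt = 12.5 ps.
N_PHASES, N_LEVELS, CYCLES, F_IN = 64, 33, 4, 5e9
levels = [-1.0 + 2.0 * r / (N_LEVELS - 1) for r in range(N_LEVELS)]
pdf = [[0.0] * N_PHASES for _ in range(N_LEVELS)]
for n in range(N_PHASES):
    v = math.sin(2 * math.pi * CYCLES * n / N_PHASES)
    row = round((v + 1.0) / 2.0 * (N_LEVELS - 1))  # nearest voltage level
    pdf[row][n] = 1.0

dt = estimate_time_step(pdf, levels, F_IN)
print(f"estimated time step: {dt * 1e12:.2f} ps")
```

Because the peak bin is an integer in this sketch, the estimate is quantized in steps of $1/(N f_{in})$. When the record covers a non-integer number of cycles, a practical script would likely need zero-padding or interpolation around the peak.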

A summary of the EOM performance in comparison with state-of-the-art designs is given in Table 6.2.

| Ref.      | Data rate (Gbps) | Power (mW) | Feature size (µm) | Supply |
|-----------|------------------|------------|-------------------|--------|
| [12]      | 10               | 4950       | 50 GHz $f_T$ SiGe | 5 V    |
| [13]      | 10               | 330        | 0.13              | 1.2 V  |
| [18]      | 10               | 171        | 0.18              | 1.8 V  |
| This work | 5                | 72         | 0.13              | 1.2 V  |

Table 6.2: Comparison with other EOMs
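As a rough aid for reading Table 6.2, the power figures can be normalized by data rate, since this work is reported at 5 Gbps while the other designs operate at 10 Gbps. The sketch below computes power divided by data rate (mW/Gbps, numerically equal to pJ/bit) from the table entries. This figure of merit is an assumption introduced here for illustration only, and EOM power need not scale linearly with data rate.

```python
# Entries copied from Table 6.2: (reference, data rate in Gbps, power in mW).
eoms = [
    ("[12]", 10, 4950),
    ("[13]", 10, 330),
    ("[18]", 10, 171),
    ("This work", 5, 72),
]


def power_per_gbps(rate_gbps, power_mw):
    """Power normalized by data rate; mW/Gbps is numerically pJ/bit."""
    return power_mw / rate_gbps


normalized = {ref: power_per_gbps(rate, power) for ref, rate, power in eoms}
for ref, value in normalized.items():
    print(f"{ref:>9}: {value:.1f} mW/Gbps")
```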

#### **CHAPTER 7**

## Conclusion

The new equalizer architecture proposed in [1] has been successfully designed and implemented in a 0.13 $\mu$m CMOS technology. This continuous-time equalizer architecture is more power efficient than a fully digital implementation of the equalizer. By exploiting mutual inductance, the area occupied by the spiral inductors in the LC ladder filter is reduced. The tested equalizer chip is functional up to a data rate of 5 Gbps, achieving a BER of $1.63 \times 10^{-9}$. The on-chip eye opening monitor, which captures the eye diagram of the equalized signal, offers an economical alternative to sophisticated instruments such as a high-frequency oscilloscope. The EOM dissipates less power than the state-of-the-art designs. The simple measurement techniques proposed in Chapter 6 are found to work well and can be used in a variety of applications.

#### 7.1 Future work

The main contributors to power dissipation in the equalizer are the input transconductors and the output buffer. More power-efficient implementations of the transconductors can be investigated. The same equalizer architecture can also be implemented in different ways. For example, at higher data rates such as 40 Gbps, one can try a distributed implementation of the singly terminated ladder, like that in a travelling wave amplifier (TWA) [7], since the parasitic capacitances can be 'absorbed' into the ladder design. This also removes the problem associated with the finite bandwidth at the summing node.

The inductors occupy most of the area in the equalizer. To reduce the area of the LC ladder, the ladder inductors can be realized as a single large inductor whose value is equal to the sum of all the inductors in the ladder. Inductors of various values are then obtained from the larger spiral by tapping it at appropriate points [19]. This saves considerable area compared with realizing each inductor as a separate spiral.

In order to sense the inductor currents, we used a dual ladder, in which each inductor current of the normal ladder appears as a capacitor voltage. However, this requires an additional ladder to be implemented and hence increases the area. Alternative ways of sensing the inductor currents that reduce the area occupied by the ladder can be explored.

The equalizer that is implemented in the current work can be adapted manually to work for different channels. An LMS feedback engine that can automatically change the tap weights depending upon the channel can be tried.
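To illustrate the kind of tap-weight update an LMS engine performs, a minimal discrete-time sketch is given below. It is not the continuous-time LC-ladder equalizer of this work: the two-tap channel, the five-tap FIR equalizer, the step size `mu`, the use of known training symbols and all names are assumptions chosen only to show the update rule $w \leftarrow w + \mu\, e\, x$.

```python
import random


def lms_adapt(received, desired, n_taps, mu):
    """Adapt FIR tap weights with the LMS rule w <- w + mu * e * x."""
    w = [0.0] * n_taps
    delay_line = [0.0] * n_taps
    sq_errors = []
    for r, d in zip(received, desired):
        delay_line = [r] + delay_line[:-1]          # shift in the new sample
        y = sum(wi * xi for wi, xi in zip(w, delay_line))
        e = d - y                                   # error against training symbol
        w = [wi + mu * e * xi for wi, xi in zip(w, delay_line)]
        sq_errors.append(e * e)
    return w, sq_errors


rng = random.Random(0)
symbols = [rng.choice((-1.0, 1.0)) for _ in range(5000)]
channel = [1.0, 0.4]  # hypothetical two-tap channel introducing ISI
received = [
    sum(h * symbols[n - k] for k, h in enumerate(channel) if n - k >= 0)
    for n in range(len(symbols))
]

weights, sq_errors = lms_adapt(received, symbols, n_taps=5, mu=0.01)
early_mse = sum(sq_errors[:50]) / 50
late_mse = sum(sq_errors[-500:]) / 500
print("weights:", [round(w, 3) for w in weights])
print("early MSE:", early_mse, "late MSE:", late_mse)
```

In a receiver without training symbols, the desired value would typically be replaced by the slicer decision; whether that suits this architecture is left open here.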

Implementing the EOM at high data rates will be challenging. Though the same architecture can be used, the high frequency blocks will pose several challenges. For example, obtaining fine phase resolutions will be difficult at high data rates. Techniques to achieve this can be explored.

High speed blocks are predominantly realized as some form of the differential pair. Hence, most bandwidth related problems can be fixed by improving the bandwidth of a differential pair. There are several techniques to do that. One such technique is inductive peaking [20]. One can also try 'Negative Miller compensation' [21] to enhance bandwidth.

Though we were able to implement the equalizer, the advantages of this equalizer architecture can be reaped only if a complete receiver chain is implemented; this remains future work.

# **APPENDIX A**

# Pin details of the chip

Table A.1: Functionality of pins

| Pin         | Name                    | Functionality                                          |
|-------------|-------------------------|--------------------------------------------------------|
| 1,2         | DAC_IOm,DAC_IOp         | To probe DAC's output                                  |
| 3,4,20,50,51 | Vddd | 1.2 V supply voltage for EOM and digital logic |
| 5           | vavg_pin                | Output of differential to single-ended converter       |
| 6,9,12,21,24,27,37,40,53,56 | gnda | Ground |
| 7,8         | vom, vop                | Differential outputs of equalizer                      |
| 10,11       | NA                      | Not applicable                                         |
| 13          | sdataout_EOM            | Serial data output from EOM                            |
| 14          | sample_offchip          | Indicates the start bit of each 12 bit word            |
| 15          | clk1M_EOM_input         | 1 MHz clock for digital control logic for EOM          |
| 16          | EOM_reset               | Synchronous reset for digital control logic for EOM    |
| 17          | adcpowerdown            | Active high signal to power down the SAR ADC           |
| 18          | EOM_sdataout_clk        | Synchronous clock for serial data output               |
| 19          | done_offchip            | Indicates end of eye capture                           |
| 22,23       | prbsm,prbsp             | Differential outputs of PRBS                           |
| 25,26       | clk10Gm,clk10Gp         | Differential 10 GHz clock input to PRBS                |
| 28          | prbstrig                | Trigger to start the PRBS                              |
| 29,30,33,41 | Vdda                    | 2 V supply for Equalizer                               |
| 31,32       | testbuff[0],testbuff[1] | Controls the direct path and filter path buffers       |
| 34          | currin                  | $100\mu\text{A}$ current reference                     |
| 35,36       | ch2[0],ch2[1]           | Controls variable input termination resistor           |
| 38,39       | vim,vip                 | Differential inputs to equalizer                       |
| 42,43       | res2[0],res2[1]         | Controls ladder termination resistance                 |
| 44          | eq                      | To test EOM in standalone mode/equalizer mode          |
| 45          | CTE_sdataout            | One of the outputs of the serial to parallel converter |
| 46          | CTE_reset               | Asynchronous reset for serial to parallel converter    |
| 47          | CTE_clk_digital         | Input clock for the serial to parallel converter       |
| 48          | CTE_sdatain             | Serial input data to the serial to parallel converter  |
| 49          | CTE_shiftenable         | Enable signal for the serial to parallel converter     |
| 52          | cm_600mV                | 0.6 V reference voltage                                |
| 54,55       | clk5Gm,clk5Gp           | 5 GHz clock for EOM                                    |

#### REFERENCES

- [1] S. Pavan, "Power and Area Efficient Adaptive Equalization at Microwave Frequencies," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 55, no. 6, pp. 1412–1420, Jul. 2008.
- [2] Mrinmay Vyankatesh Talegaonkar, "Electronic Eye Diagram Reconstruction System for 10 Gbps Data Transmission Systems," Master's thesis, Indian Institute of Technology Madras, 2007.
- [3] N. Krishnapura, "EE685, VLSI Broadband Communication Circuits," Indian Institute of Technology Madras, Aug-Dec 2007.
- [4] S. Gondi and B. Razavi, "Equalization and Clock and Data Recovery Techniques for 10-Gb/s CMOS Serial-Link Receivers," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 9, pp. 1999–2011, Sep. 2007.
- [5] S. Pavan and S. Shivappa, "Nonidealities in Traveling Wave and Transversal FIR Filters Operating at Microwave Frequencies," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 53, no. 1, pp. 177–192, Jan. 2006.
- [6] S. Pavan and R. Tiruvuru, "Analysis and Design of Singly Terminated Transmission-Line FIR Adaptive Equalizers," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 54, no. 2, pp. 401–410, Feb. 2007.
- [7] J. Sewter and A. Carusone, "A CMOS Finite Impulse Response Filter With a Crossover Traveling Wave Topology for Equalization up to 30 Gb/s," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 4, pp. 909–917, Apr. 2006.
- [8] B. Razavi, *Design of Analog CMOS Integrated Circuits*. McGraw-Hill, Inc. New York, NY, USA, 2000.

- [9] A. Niknejad and R. Meyer, "Analysis, Design, and Optimization of Spiral Inductors and Transformers for Si RF ICs," *IEEE Journal of Solid-State Circuits*, vol. 33, no. 10, pp. 1470–1481, Oct. 1998.
- [10] S. Pavan and T. Laxminidhi, "Accurate Characterization of Integrated Continuous-Time Filters," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 8, pp. 1758–1766, Aug. 2007.
- [11] F. Buchali, S. Lanne, J. Thiery, W. Baumert, and H. Bulow, "Fast Eye Monitor for 10 Gbit/s and its Application for Optical PMD Compensation," in *Proc. Optical Fiber Communication Conference and Exhibit (OFC)*, vol. 2, Anaheim, CA, 2001, pp. Tu5/1–Tu5/3.
- [12] T. Ellermeyer, U. Langmann, B. Wedding, and W. Pohlmann, "A 10-Gb/s Eye-Opening Monitor IC for Decision-Guided Adaptation of the Frequency Response of an Optical Receiver," *IEEE Journal of Solid-State Circuits*, vol. 35, no. 12, pp. 1958–1963, Dec. 2000.
- [13] B. Analui, A. Rylyakov, S. Rylov, M. Meghelli, and A. Hajimiri, "A 10-Gb/s Two-Dimensional Eye-Opening Monitor in 0.13-μm Standard CMOS," *IEEE Journal of Solid-State Circuits*, vol. 40, no. 12, p. 2689, Dec. 2005.
- [14] S. Sidiropoulos and M. Horowitz, "A Semidigital Dual Delay-Locked Loop," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 11, pp. 1683–1692, Nov. 1997.
- [15] P.K. Sundararajan, "E3 239-Advanced VLSI Design." [Online]. Available: http://sindhu.ece.iisc.ernet.in/vlsilab/VlsiLab\_Courses\_E3\_239.html
- [16] B. Ginsburg and A. Chandrakasan, "Dual Time-Interleaved Successive Approximation Register ADCs for an Ultra-Wideband Receiver," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 2, p. 247, Feb. 2007.
- [17] N. Verma and A. Chandrakasan, "An Ultra Low Energy 12-bit Rate-Resolution Scalable SAR ADC for Wireless Sensor Nodes," *IEEE Journal of Solid-State Circuits*, vol. 42, no. 6, pp. 1196–1205, Jun. 2007.

- [18] D. Bhatta, K. Lee, H. Kim, E. Gebara, and J. Laskar, "A 10 Gb/s Two Dimensional Scanning Eye Opening Monitor in 0.18 μm CMOS process," in *IEEE MTT-S International Microwave Symposium Digest (MTT'09)*, 2009, pp. 1141–1144.
- [19] N. Krishnapura, V. Gupta, and N. Agrawal, "Compact lowpass ladder filters using tapped coils," in 2009 International Symposium on Circuits and Systems (ISCAS), 31 May-2 Jun. 2010, pp. 53–56.
- [20] T. Lee, *The Design of CMOS Radio-Frequency Integrated Circuits*. Cambridge University Press, 1998.
- [21] S. Galal and B. Razavi, "10-Gb/s Limiting Amplifier and Laser/Modulator Driver in 0.18-μm CMOS Technology," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 12, pp. 2138–2146, Dec. 2003.