Replica Bias Scheme for Efficient Power Utilization in High-Frequency CMOS Digital Circuits

Saravanan Kathiah and Sankaran Aniruddhan
Dept. of Electrical Engineering, Indian Institute of Technology Madras, Chennai, India
Email: ee13d007@ee.iitm.ac.in, ani@ee.iitm.ac.in

Abstract—Digital circuits exhibiting rail-to-rail voltage swings display large spreads in current consumption and delay over variations in process, voltage and temperature (PVT). A circuit technique is proposed to enable optimal current consumption and low delay distribution in high frequency digital circuits. A typical RF application is chosen at 5 GHz frequency, for which a divider is designed and simulated in a UMC 130nm CMOS process. With the proposed scheme, the circuit shows up to 52% reduction in current, while the relative variation in delay over PVT reduces by 70%.

I. INTRODUCTION

Radio frequency (RF) transceiver designs favour standard CMOS processes for ease of integration and to reduce cost and power consumption. With ever-reducing gate delays, an increasing number of high-frequency digital functions are implemented in CMOS circuits that swing completely between the rails. These circuits scale readily and are easier to implement than those that work under limited swings. While achieving the required functions with minimal power dissipation is always desirable, it is especially important in mobile RF applications, where low power over all operating conditions reduces the number of recharge cycles.

Variations in process corners reduce the advantages gained through technology scaling and this is predicted to get worse in the future [1]. Changes in temperature and supply voltage also affect cell delays significantly. Raising the current consumption to accommodate the large spread in cell delays is the straightforward but inefficient solution.

This work proposes a circuit technique that reduces the spread in delay and power consumption and applies it to a frequency divider used in RF frequency synthesizers. In general, any standard CMOS digital circuit could be made efficient and robust through the proposed approach.

In this paper, section II introduces a typical application for the proposed technique and explains the usual implementation of the digital circuit. Section III explains the variations in key parameters over PVT and section IV introduces the proposed architecture and discusses the results.

II. FEEDBACK DIVIDER IN RF FREQUENCY SYNTHESIZERS

In an RF system, the local oscillator (LO) generates the carrier required to perform modulation and demodulation. The frequency synthesizer multiplies the clean frequency reference by the division ratio to synthesize the required LO output. The frequency divider operates at the highest frequency in the transceiver and hence consumes significant power.

A. Conventional Divider Design

The chosen RF application targets the 2.4 GHz ISM band. This requires the VCO and divider to operate nominally at 4.8 GHz (to generate I and Q components at 2.4 GHz), so the divider is specified to function at 5.6 GHz over corners to allow for loop transients along with some margin. Fig. 1 shows the divider architecture based on [2].

![Divider architecture](image)

The synthesizer input frequency comes from a 40 MHz crystal oscillator. The required division factor (N) is nominally 120, so the design requires six divide-by-2/3 cells. The circuit design deviates a little from [2] in that it uses true single-phase clocking (TSPC) flops rather than source-coupled logic (SCL) flops. The internal topological details of the 2/3 cell are shown in figs. 2 and 3. The combinational logic, except for an inverter, is absorbed inside the flop to reduce propagation delay and current consumption. Only the first two 2/3 cells employ TSPC logic, while the succeeding cells use static CMOS flops to reduce power.
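The cell count quoted above can be checked with a short sketch. It assumes the cascade follows the commonly used 2/3-cell chain, in which n cells cover division ratios from 2^n to 2^(n+1) - 1 and N = 2^n + sum(p_i * 2^i). The helper names `cells_needed` and `modulus_bits` are hypothetical and are not taken from [2].

```python
def cells_needed(n_div: int) -> int:
    """Return n such that 2**n <= n_div <= 2**(n + 1) - 1."""
    if n_div < 2:
        raise ValueError("division ratio must be at least 2")
    return n_div.bit_length() - 1


def modulus_bits(n_div: int) -> list[int]:
    """Programming bits p_0..p_(n-1), LSB first, assuming
    n_div = 2**n + sum(p_i * 2**i) (an assumption about the cascade)."""
    n = cells_needed(n_div)
    offset = n_div - (1 << n)
    return [(offset >> i) & 1 for i in range(n)]


f_vco_hz = 4.8e9   # nominal VCO and divider input frequency from the text
f_ref_hz = 40e6    # crystal reference frequency from the text

n_div = round(f_vco_hz / f_ref_hz)          # nominal division factor
n_cells = cells_needed(n_div)               # number of 2/3 cells
ratio_range = (1 << n_cells, (1 << (n_cells + 1)) - 1)

print(n_div, n_cells, ratio_range, modulus_bits(n_div))
```

Under this assumption, a six-cell chain covers ratios 64 to 127, which contains the nominal factor of 120.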

![Implementation of 2/3 cell](image)

B. Conventional Divider Simulation Results

The simulation results on the divider are summarized in Table I. The power dissipation and the clock-to-output delay show 2x and 2.5x variation over PVT respectively. The sensitivity of current consumption to PVT reduces as the frequency of operation decreases, due to the increase in idle time between transitions. If the entire circuit were to work at the maximum frequency, the spreads experienced would be much larger.

### Table I: Summary of divider results.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Min</th>
<th>Typ</th>
<th>Max</th>
<th>Units</th>
</tr>
</thead>
<tbody>
<tr>
<td>Supply voltage</td>
<td>1.08</td>
<td>1.2</td>
<td>1.32</td>
<td>V</td>
</tr>
<tr>
<td>Temperature</td>
<td>-40</td>
<td>27</td>
<td>100</td>
<td>°C</td>
</tr>
<tr>
<td>Frequency of operation</td>
<td>0.02</td>
<td>4.8</td>
<td>6.5</td>
<td>GHz</td>
</tr>
<tr>
<td>Current consumption</td>
<td>1.5</td>
<td>2</td>
<td>2.51</td>
<td>mA</td>
</tr>
<tr>
<td>Power dissipation</td>
<td>1.62</td>
<td>2.4</td>
<td>3.3</td>
<td>mW</td>
</tr>
<tr>
<td>Clk to Out delay</td>
<td>145</td>
<td>235</td>
<td>360</td>
<td>ps</td>
</tr>
</tbody>
</table>
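As a quick check of the spreads quoted in the text, the sketch below recomputes the max-to-min ratios directly from the Table I entries. The dictionary layout and the `spread` helper are illustrative choices, not part of the original work.

```python
# (min, typ, max) values copied from Table I
TABLE_I = {
    "current_mA": (1.5, 2.0, 2.51),
    "power_mW": (1.62, 2.4, 3.3),
    "delay_ps": (145.0, 235.0, 360.0),
}


def spread(row: tuple[float, float, float]) -> float:
    """Max-to-min ratio of a (min, typ, max) table row."""
    lo, _typ, hi = row
    return hi / lo


for name, row in TABLE_I.items():
    print(f"{name}: {spread(row):.2f}x")
```

The power and delay rows give ratios close to the 2x and 2.5x figures stated for the conventional divider.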

III. CURRENT CONSUMPTION AND DELAY VARIATIONS OVER PVT

The operation of the TSPC flop is similar to a three stage ring oscillator, except that the former is clock-triggered. With this observation, an estimate of the variation in current consumption may be obtained by observing the current through a 3-stage ring oscillator.

A 3-stage ring oscillator with inverter strengths similar to the ones used in the TSPC 2/3 cell is designed and its frequency and current consumption across corners are extracted (fig. 4). The ring oscillator frequency varies from 6.5 GHz to 15 GHz and the current consumption from 560uA to 1.7mA. This 3x spread results in sub-optimal operation. When the transistors are skewed towards the faster corner, the module consumes more current than what is required in the typical case.
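The frequency range above can be translated into a per-stage delay with the standard ring-oscillator relation f = 1 / (2 N t_d), where N is the number of stages. The sketch below applies it to the two extreme frequencies and also recomputes the current spread; the function name `stage_delay` is hypothetical.

```python
def stage_delay(f_osc_hz: float, n_stages: int = 3) -> float:
    """Per-stage delay of an N-stage ring oscillator, from f = 1 / (2 * N * t_d)."""
    return 1.0 / (2.0 * n_stages * f_osc_hz)


f_min_hz, f_max_hz = 6.5e9, 15e9      # oscillation frequency range from the text
i_min_a, i_max_a = 560e-6, 1.7e-3     # current consumption range from the text

t_slow = stage_delay(f_min_hz)        # longest stage delay
t_fast = stage_delay(f_max_hz)        # shortest stage delay
current_spread = i_max_a / i_min_a

print(f"stage delay: {t_fast * 1e12:.1f} ps to {t_slow * 1e12:.1f} ps")
print(f"current spread: {current_spread:.2f}x")
```

The current ratio comes out at roughly 3x, in line with the spread quoted in the text.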

It is observed that the oscillation frequency is much more sensitive to supply voltage than to process and temperature. This suggests that the change in the oscillation frequency (i.e. the delay of a unit cell) due to process and temperature can be nulled through appropriate control of the supply voltage.

IV. POWER AND DELAY OPTIMIZED DIVIDER

A reference is required to make the supply voltage of the divider track the process and temperature variations. Fig. 5 shows a self-biased inverter and a ring oscillator, both biased with a temperature- and process-independent current ($I_{BG}$). The voltage $V_{ref}$ generated by $I_{BG}$ behaves like the desired reference: when the process corner or temperature reduces the drive strength of the devices, $V_{ref}$ increases. If this $V_{ref}$ is used to supply the divider through a regulator, the resulting supply voltage can compensate for the variations in process and temperature. This is a replica-bias scheme in which the inverters are forced to operate on a fixed current set by a master inverter biased at a constant current. As the inverters work on a current that is independent of PVT, they are expected to have constant delays too.
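The direction of this tracking can be illustrated with a toy model. The sketch below assumes an alpha-power-law relation I = k (V - V_th)^alpha for the replica and solves it for the voltage that sustains a fixed bias current. The model, the corner parameters and the function name `reference_voltage` are all assumptions made for illustration; none of them are taken from the simulated design.

```python
def reference_voltage(i_bias: float, k: float, vth: float, alpha: float = 1.3) -> float:
    """Voltage at which a replica modelled by I = k * (V - vth)**alpha
    draws exactly i_bias (toy model, not the simulated circuit)."""
    return vth + (i_bias / k) ** (1.0 / alpha)


I_BG = 100e-6  # fixed bias current in amperes (illustrative value)

# Hypothetical corner parameters: weaker drive means lower k and higher vth.
corners = {
    "slow": {"k": 0.7e-3, "vth": 0.45},
    "typical": {"k": 1.0e-3, "vth": 0.40},
    "fast": {"k": 1.3e-3, "vth": 0.35},
}

v_ref = {name: reference_voltage(I_BG, **p) for name, p in corners.items()}
for name, v in v_ref.items():
    print(f"{name}: V_ref = {v:.3f} V")
```

In this model the slow corner needs the highest reference voltage and the fast corner the lowest, which is the qualitative behaviour the paragraph describes.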

This work focuses on the use of replica bias derived supply for digital circuits. Similar works have been reported before for analog circuits but targeting supply rejection [3],[4]. Sensitivity to process and temperature mainly comes from the bias current in circuits performing analog functions. Therefore, a supply generated through replica-bias would not be effective in controlling process- and temperature-induced variations. Successful implementations of PVT-robust analog designs change the bias current in accordance with the supply [5].

A. Reference selection

In order to reduce the spread, the I-V characteristic of the reference should match that of the inverters in the flop and must also track any variations arising from supply and temperature changes. Fig. 6 shows inverter current plotted against input voltage at a constant power supply ($V_{DD}$). The profile of current drawn by the inverter for a full-scale sine wave at the input\(^1\) is also shown. As the inverters swing, they get biased at an average current of $I_1$, corresponding to the point $\left( \frac{V_{DD}}{2}, I_1 \right)$. However, the transistors in the reference branch in fig. 5(a) are static and bias the inverter at $\left( \frac{V_{DD}}{2}, I_0 \right)$, so this reference cannot closely match the inverters in the flop.

\(^1\)For the frequencies discussed here, a sinusoid approximates the input waveform better than a square wave does.

The ring oscillator based reference in fig. 5(b) does not suffer from the problems described above and its inverters can be made to match the characteristics of those inside the flop reasonably well. The complete scheme is illustrated in fig. 7. The ring oscillator bias current is chosen such that the generated supply voltage ($V_{DD_{DIV}}$) allows the divider to operate over the required frequency range, while providing enough drop-out voltage for the regulator transistor to remain in saturation. The ring oscillator is designed to minimise this current while maintaining good match with the inverters in the flop.

A level shifter may be used to translate the output (OUT) to VDD levels. However, in this application, it would be beneficial to operate the phase-frequency detector and charge-pump on the same regulated supply. The delays in those modules too would be tightly controlled, leading to low reset path delays ultimately resulting in tighter static phase error distribution.

B. Regulator design

The low drop-out regulator (LDO) has an NMOS input stage and a PMOS common-source second stage, chosen because of the input voltage range and drop-out voltage limitations respectively (fig. 8). The dominant and first non-dominant poles are located at the gate of the pass transistor (MPT) and the output node respectively. Miller capacitor $C_b$ performs pole-splitting to provide a phase margin of $50^\circ$. As the dominant pole is not at the output, the load transient response is relatively poor, but is considered adequate for this application (fig. 9). $C_{bypass}$ denotes the parasitic capacitance on the supply of the divider. It is assumed to be limited to 50 pF in the simulations. The specifications of the regulator are shown in Table II.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Min</th>
<th>Typ</th>
<th>Max</th>
<th>Units</th>
</tr>
</thead>
<tbody>
<tr>
<td>Supply voltage</td>
<td>1.08</td>
<td>1.2</td>
<td>1.32</td>
<td>V</td>
</tr>
<tr>
<td>Temperature</td>
<td>-40</td>
<td>27</td>
<td>100</td>
<td>°C</td>
</tr>
<tr>
<td>Output voltage</td>
<td>0.55</td>
<td>1</td>
<td></td>
<td>V</td>
</tr>
<tr>
<td>Load</td>
<td>0</td>
<td>3</td>
<td></td>
<td>mA</td>
</tr>
<tr>
<td>Current consumption</td>
<td></td>
<td>10</td>
<td></td>
<td>uA</td>
</tr>
<tr>
<td>Bandwidth</td>
<td>10</td>
<td>16</td>
<td>20</td>
<td>MHz</td>
</tr>
<tr>
<td>Phase margin</td>
<td>50</td>
<td>63</td>
<td>75</td>
<td>degree</td>
</tr>
<tr>
<td>Supply rejection (DC)</td>
<td>50</td>
<td>60</td>
<td>68</td>
<td>dB</td>
</tr>
</tbody>
</table>

The ring oscillator consumes 100 uA while the LDO consumes only 10 uA. Therefore, the power overhead for the additional blocks is not very significant. The start-up transient of the regulated supply voltage is shown in fig. 9 for three representative cases. In the typical transistor corner with the power supply and temperature set to 1.2 V and 27°C respectively, the regulated voltage settles to 867 mV. In the slow transistor corner, with a 1.08 V power supply and 100°C temperature, the transistors display poor drive strength, so the regulator output voltage increases to 1 V to maintain the same delay in the inverters. In the fast transistor corner, with a 1.32 V power supply and -40°C temperature, the regulated voltage reduces to 767 mV appropriately.
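The three cases above also fix the drop-out headroom left for the pass transistor. The sketch below subtracts the regulated voltage from the external supply at each quoted corner and expresses the 110 uA bias overhead as a fraction of the 2 mA typical current of the conventional divider in Table I. Choosing that baseline is an assumption made here for illustration.

```python
# (external supply in V, regulated output in V) for the three cases in the text
corners = {
    "typical": (1.20, 0.867),
    "slow": (1.08, 1.000),
    "fast": (1.32, 0.767),
}

# Headroom across the pass transistor, rounded to 1 mV
dropout_v = {name: round(vdd - vout, 3) for name, (vdd, vout) in corners.items()}

i_ring_osc_a = 100e-6          # ring oscillator reference current from the text
i_ldo_a = 10e-6                # LDO quiescent current from the text
i_divider_typ_a = 2e-3         # conventional divider, typical (Table I); baseline is an assumption
overhead_fraction = (i_ring_osc_a + i_ldo_a) / i_divider_typ_a

for name, v in dropout_v.items():
    print(f"{name}: drop-out = {v * 1e3:.0f} mV")
print(f"bias overhead: {overhead_fraction * 100:.1f}% of typical divider current")
```

The slow corner leaves the least headroom, which is where the drop-out requirement on the pass transistor is tightest.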

C. Results and Summary

The results for the proposed divider are shown in Table III. The current consumption and clock-to-output delay are shown across the typical and worst-case corners in fig. 10.

The reduction in current consumption is 43% and 52% in the typical and extreme cases respectively. The variation in clock-to-output delay in the original divider was -39% to 53%; this reduces to -11% to 19% in the proposed circuit. It must be noted that, as the supply voltage available to the divider is reduced in this scheme, the guaranteed maximum frequency of operation falls slightly.
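The improvement in delay spread can be worked out from the percentages above. The sketch below takes the width of each variation window (in percentage points about the typical delay) and compares the two. The helper name `window_width` is hypothetical.

```python
def window_width(lo_pct: float, hi_pct: float) -> float:
    """Width of a relative-variation window, in percentage points."""
    return hi_pct - lo_pct


conventional = window_width(-39.0, 53.0)   # original divider, from the text
proposed = window_width(-11.0, 19.0)       # proposed divider, from the text

reduction = 1.0 - proposed / conventional
print(f"conventional window: {conventional:.0f} points")
print(f"proposed window:     {proposed:.0f} points")
print(f"reduction in relative delay variation: {reduction * 100:.0f}%")
```

The window shrinks from 92 to 30 percentage points, a reduction of about two thirds, which is consistent with the roughly 70% figure quoted in the abstract.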

The current consumption is plotted across input clock frequency for the same division ratio (120) in the typical transistor corner (1.2V, 27°C) in fig. 11(a). The currents increase linearly with frequency, as expected. The efficiency of the proposed circuit over the normal divider is depicted in fig. 11(b). This efficiency improves as circuit activity increases, either through increase in switching rate or by switching more nodes at a fixed rate. Therefore, as the required optimal current increases, the proposed circuit appears increasingly attractive, making it well-suited for high power or high frequency digital circuit applications.

The advantages of the proposed divider may be summarised as below:

1) Current consumption is minimized across variations in PVT. This reduction gets better if a larger portion of the circuit switches at high frequencies.
2) The variation in delay over PVT becomes smaller. This leads to better utilization of the available time in those applications where timing is critical.
3) The divider module works under a regulated supply and is therefore shielded from noise on the external supply. This results in low supply-induced jitter. Coupled with low delay variation, this translates to a linear divider, which is highly desirable in a fractional-N synthesizer.
4) The power efficiency over the conventional divider increases linearly with the required current consumption.
5) The divider encapsulates a process monitor; the LDO output is an analog measure of the process. This information could be used for other purposes such as trimming an analog module or serving as a record during test.

The performance of the divider can be further improved by employing feedback. For example, the frequency of the reference ring oscillator could be compared to a fixed frequency reference. If the amplified error voltage were used as the power supply for the divider and the reference, the resulting spread in delay and current would be further reduced.
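The feedback idea can be sketched behaviourally. The code below assumes a hypothetical linear ring-oscillator model f = k_f (V - V_th) and a discrete-time integrating loop that nudges the supply until the oscillator frequency matches a fixed target. The model, the loop gain, the corner parameters and the names `ro_freq` and `lock_supply` are illustrative assumptions, not a description of a circuit from this work.

```python
def ro_freq(v: float, kf: float, vth: float) -> float:
    """Hypothetical linear model of ring-oscillator frequency versus supply."""
    return max(0.0, kf * (v - vth))


def lock_supply(f_target: float, kf: float, vth: float,
                v0: float = 0.8, gain: float = 2e-11, steps: int = 200) -> float:
    """Integrate the frequency error into the supply voltage until it settles."""
    v = v0
    for _ in range(steps):
        error_hz = f_target - ro_freq(v, kf, vth)
        v += gain * error_hz          # gain in V/Hz, chosen so the loop converges
    return v


F_TARGET_HZ = 10e9  # illustrative target frequency

# Hypothetical corners: (kf in Hz/V, vth in V)
corners = {"slow": (15e9, 0.45), "typical": (20e9, 0.40), "fast": (25e9, 0.35)}

locked = {name: lock_supply(F_TARGET_HZ, kf, vth) for name, (kf, vth) in corners.items()}
for name, v in locked.items():
    kf, vth = corners[name]
    print(f"{name}: supply = {v:.3f} V, f = {ro_freq(v, kf, vth) / 1e9:.3f} GHz")
```

In this toy model every corner settles to the same oscillation frequency at a different supply voltage, which is the intent of the feedback scheme described above.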

V. CONCLUSION

The sub-optimal utilization of power in standard CMOS digital circuits was analysed. A technique to improve power efficiency was demonstrated using the example of the frequency divider in an RF frequency synthesizer. It was shown that operating digital circuits with replica bias derived from a similar low-power module manages to keep the power consumption close to the minimum possible over manufacturing and operational variations. An additional favourable result was improvement in immunity from supply noise.

REFERENCES