© 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. This manuscript was presented at 2021 16th European Microwave Integrated Circuits Conference (EuMIC), 03-04 April 2022, London, United Kingdom.

# Multi-Phase Clock Path Circuit up to 57 GHz Including 5 bit Programmable Phase Interpolators for Time-Interleaved Broadband Data Converters in a 28 nm FD-SOI CMOS Technology

Daniel Widmann, Tobias Tannert, Xuan-Quang Du, Markus Grözing, Manfred Berroth

University of Stuttgart, Institute of Electrical and Optical Communications Engineering, Germany

{daniel.widmann, tobias.tannert, markus.groezing, manfred.berroth}@int.uni-stuttgart.de

*Abstract* — Clock paths in mixed-signal integrated circuits are critical building blocks that can determine the performance of the entire circuit. A precisely controllable clock phase is highly desirable, e.g. for monolithic, ultra high-speed data converters with time-interleaving, i.e. digital-to-analog converters (DACs) and analog-to-digital converters (ADCs), to adjust the timing of the time-interleaved converter channels. More precisely, these converters use analog multiplexing at the DAC outputs or analog demultiplexing at the ADC inputs, respectively. A broadband, low-jitter clock path for frequencies up to 57 GHz is presented. It includes 5 bit programmable phase interpolators operating at half the input frequency with a phase delay resolution of about 1.25 ps and is realized in a 28 nm FD-SOI CMOS technology. The proposed circuit combines current mode logic and CMOS logic.

*Keywords* — CMOS integrated circuits, phase shifters, mixed analog digital integrated circuits, analog-digital conversion, digital-analog conversion.

# I. INTRODUCTION

Mixed-signal integrated data converter circuits, such as time-interleaved digital-to-analog converters (DACs) and analog-to-digital converters (ADCs) with ultra high conversion rates, place challenging demands on the clock path circuits, which can determine the performance of the entire circuit. First, frequency dividers for different circuit parts are required. Secondly, several clock signal properties such as common mode level, duty cycle and magnitude have to be optimized. Finally, for proper timing in clocked, time-interleaved front-end systems, the clock phase relations have to be precisely adjustable to avoid intersymbol interference and ensure ideal sampling instants. In particular, DACs that use analog multiplexing at their outputs [1] and sets of ADCs that use demultiplexing at their inputs [2] require precise relative clock timing (skew) between the clocks of the individual converters. Additionally, jitter is a critical aspect that increasingly limits the effective number of bits (ENOB) as the signal frequency rises.

Here, a low-jitter clock path up to 57 GHz is presented. It includes a 5 bit programmable phase interpolator and is realized in a 28 nm fully-depleted silicon-on-insulator (FD-SOI) CMOS technology [3] that allows forward body-biasing. It is a key element for paving the way to higher sampling rates of monolithic, time-interleaved data converters in CMOS. The input circuit parts at the highest frequencies ($f_{\rm clk}$ and $f_{\rm clk}/2$) are realized in (inductively peaked) current mode logic (CML). After level conversion, the design passes into common CMOS logic.

# II. CLOCK PATH CONCEPT

Fig. 1 shows a block diagram of the entire clock system. Although the real circuit provides two equivalent branches of $f_{\rm clk}/2$ channels, only one channel is discussed here.
As an example, in a time-interleaved data converter, $f_{\rm clk}$ is used for the high-speed output multiplexer (MUX), whereas the clocks in the lower frequency domains provide clock signals for the digital MUXs at its input. Besides differential amplifiers (DAs), source followers (SFs) and different frequency dividers in different architectures, the key components of the system are an offset control circuit, a start circuit, a programmable phase interpolator and a common mode control circuit. After the first clock division and phase rotation, a level conversion provides the signals for the CMOS logic part, which is much more compact and consumes less power.

Fig. 1. Block diagram of the clock path (DA: differential amplifier, SF: source follower). The gray part is realized in common CMOS logic whereas the other parts are realized in CML, mostly with inductive peaking. Dashed arrows indicate the junctions in different clock domains where the clock signal can be retrieved for the data converter. Externally adjustable circuit parts are marked with $\circlearrowleft$. In the CMOS logic part, different frequency dividers with initialization (*init*) or synchronous start (*res*) are used.

# III. CML PART

The CML part uses inductive peaking techniques in most components. The clock path starts with a passive offset control circuit depicted in Fig. 2. This circuit allows controlling the common mode voltages of the two input clock signals and thus the duty cycle of the differential clock signal. On the right-hand side ($R_3 \ldots R_5$, $C_3$ and $L$), a DC bias network sets the common mode voltage $V_{\rm cm}$ of both clock signals. On the left-hand side, the external offset control voltage is shown, which allows detuning the two common mode voltages for precise alignment. The resistor values are chosen to achieve a total termination of $50\,\Omega$ single-ended (SE), i.e. $100\,\Omega$ differential.

Fig. 2. Passive offset and duty cycle control circuit.

Next, a cascade of DAs with inductive series as well as shunt peaking (see e.g. [4]) amplifies the signal before it reaches a start circuit (see Fig. 3). This chain of amplifiers ensures that the signal is limited even for small input clock power levels. The start process is of particular importance for the first frequency divider in the CMOS logic part in the clock domain $f_{\rm clk}/2$. Since it is run at its upper frequency limit, the clock signal has to be initialized and started at a dedicated phase. Moreover, transient settling effects at the start can cause short pulses leading to a nonfunctional static state of the divider at $f_{\rm clk}/2$ and thus to a breakdown of divider activity. The circuit in Fig. 3 is responsible for both a proper phase for initialization and a sampling of the reset signal $\overline{R_{\rm CML}}$ to control the length of the first clock half-pulses. The frequency divider consists of two CML latches with initialization transistors in negative feedback configuration and modified by shunt peaking.

Fig. 3. CML start circuit. Latches are defined in such a way that they are opaque for clk = H. On start, the inductively peaked MUX switches from static levels to the clock input signal (dashed arrow).

Another key component is the 5 bit programmable phase interpolator, similar to e.g. [5]. Its core is a weighted adder, which is shown in Fig. 4. In the following, the phase interpolator's output delay is not considered in the output phase. Assuming ideal, single-tone signals and a perfectly linear system, the output signal of the weighted adder, depending on the input amplitudes $\hat{V}_{\rm clk1}$ and $\hat{V}_{\rm clk2}$, can be described as the following phasor:

$$\underline{V}_{\rm clk,\,out} \sim \alpha \cdot \hat{V}_{\rm clk1} - j \cdot (1 - \alpha) \cdot \hat{V}_{\rm clk2} \tag{1}$$

Without loss of generality, the phase difference between $\underline{V}_{\rm clk1}$ and $\underline{V}_{\overline{\rm clk2}}$ is assumed to be $+\pi/2$.
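
As a numerical illustration of the phasor in (1), the following sketch evaluates the output phase and magnitude of an ideal weighted adder for a few weighting factors. It is a minimal sketch that assumes equal, normalized input amplitudes and a perfectly linear adder; the function name `weighted_adder_phasor` and the sampled values of $\alpha$ are illustrative and not taken from the circuit.

```python
import cmath
import math


def weighted_adder_phasor(alpha, v_clk1=1.0, v_clk2=1.0):
    """Ideal weighted-adder output phasor following (1).

    alpha  : weighting factor in [0, 1]
    v_clk1 : amplitude of the first input (assumed normalized)
    v_clk2 : amplitude of the second input (assumed normalized)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # The second input lags the first one by pi/2, hence the factor -j.
    return alpha * v_clk1 - 1j * (1.0 - alpha) * v_clk2


if __name__ == "__main__":
    # Illustrative sweep of the weighting factor (not the chip's code steps).
    for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
        v_out = weighted_adder_phasor(alpha)
        phase_deg = math.degrees(cmath.phase(v_out))
        print(f"alpha={alpha:4.2f}  phase={phase_deg:7.2f} deg  |V|={abs(v_out):.3f}")
```

In this idealized model the phase moves from $-90^\circ$ to $0^\circ$ as $\alpha$ goes from 0 to 1, while the magnitude drops to about 0.71 at $\alpha = 0.5$, so the output amplitude is not constant over the interpolation range.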
The weighting factor $\alpha$ can be tuned by three programmable current sources with binary weighting. A programmable register allows setting the logical inputs ${\rm sel}_0$, ${\rm sel}_1$ and ${\rm sel}_2$. For binary weighting, the transistor widths of the current sources in cascode configuration are scaled ($w$, $2w$, $4w$) and, for symmetry reasons, the resistor values ($R_0$, $2R_0$, $4R_0$) are scaled accordingly. Ideally, assuming equal amplitudes of the input signals, the phase of the output signal referred to $\underline{V}_{\rm clk1}$ is

$$\arg\left\{\underline{V}_{\rm clk,\,out}\right\} = -\arctan\left(\frac{1-\alpha}{\alpha}\right) \tag{2}$$

with

$$\alpha = \frac{\text{sel}_2 \cdot 2^2 + \text{sel}_1 \cdot 2^1 + \text{sel}_0 \cdot 2^0}{2^3 - 1}, \tag{3}$$

where $\alpha \in [0,1]$ and $\mathrm{sel}_i \in \{0,1\}$ holds for the logical selection inputs. Generally, the output phase $\phi_{\mathrm{clk,\,out}}$ of $\underline{V}_{\mathrm{clk,\,out}}$ can be expressed as

$$\phi_{\text{clk, out}} = \frac{\sum_{i=0}^{N-1} \left( \text{sel}_i \cdot 2^i \cdot \phi_{\text{clk1}} + (1 - \text{sel}_i) \cdot 2^i \cdot \phi_{\overline{\text{clk2}}} \right)}{2^N - 1}. \tag{4}$$

Here, an $N=3$ bit weighted adder is discussed. The weighted adder can only interpolate phases between those of the input signals (e.g. $[-\pi/2,0]$), their phases included, i.e. only phases within one quadrant can be reached. For full phase control, two MUXs with two additional control inputs $\mathrm{sel}_3$ and $\mathrm{sel}_4$ are required in front of the weighted adder. They enable the choice of the desired quadrant by selecting the appropriate input signals (see Fig. 5). Therefore, all four phases (I, $\bar{\mathrm{I}}$, Q, $\bar{\mathrm{Q}}$) provided by the frequency divider are required. It should be noted that the phase interpolator works best for sinusoidal signals, which is especially the case at high frequencies where this circuit part no longer operates in limitation.

Fig. 4. Schematic of the weighted adder. Programmable current sources with binary weighting of currents realize the weighting function.

Fig. 5. Block diagram of the $5\,\mathrm{bit}$ programmable phase interpolation circuit. The choice of connections is determined by layout considerations.

Finally, the interface between the CML and the CMOS logic part is of special importance. The common mode level, and thus the duty cycle referred to the switching point of the CMOS logic part, has to be adapted. Fig. 6 shows the common mode shift in two steps. First, an amplifier stage with an additional load resistor $R_1=10\,\Omega$ shifts the common mode output level by $R_1\cdot I_{01}$. An extra control input $I_{\rm BiasCML2CMOS}$ allows detuning the shift for precise level adaptation at this critical interface. Secondly, a source follower for each clock signal causes another fixed level shift of one gate-source voltage. After the common mode level adaptation, the architecture changes to CMOS logic for reasons of power consumption and chip area.

Fig. 6. Common mode control for CML to CMOS interface.

# IV. CMOS LOGIC PART

The CMOS logic part starts with two inverters for voltage level regeneration, which provide a limited signal to the first frequency divider. There are two types of frequency dividers in this part, as depicted in [6]. Their basic latch structures are the same. However, the first frequency divider at $f_{\rm clk}/2$ is driven at its frequency limit, which is why initialization and a correct start clock phase are required. Minimum initialization transistors in combination with the proper initial phase delivered by the CML part are essential at the start. As mentioned before, the first clock pulses are critical and there is only little tolerance for short pulses caused by transient settling effects. Exceeding this tolerance leads to a common mode shift away from $\sim V_{\rm DD}/2$, and the divider stops operating and ends in a nonfunctional static state.
To avoid a critical voltage drop at start due to the sudden current demand through supply lines with inherent inductances and bonding wires, the frequency dividers have their own supply voltage with large block capacitors, so that they are not disturbed by other circuit parts, and are switched on sequentially. In the lower frequency domains, resettable frequency dividers are used [6] with AND/NAND gates implemented as differential cascode voltage switch logic (DCVSL) gates. Their input reset signal is sampled and operation starts synchronously with the input clock signal. In case several $f_{\rm clk}/2$ channels are required (see "other $f_{\rm clk}/2$ channels" in Fig. 1), a synchronous start by sampling of the reset signal is important for the synchronization of the different channels. Therefore, the initialization and reset concept enables a multi-clock, multi-phase system that can start a clocked CMOS converter with a large power consumption jump on startup.

# V. MEASUREMENT RESULTS

The successive start sequence is given in Table 1 and runs automatically on-chip, using analog delay elements for the global reset signal. External supply voltages vary from $0.9\,\mathrm{V}$ to $1.15\,\mathrm{V}$ (overdrive) for different CMOS logic parts. For the CML parts, the externally applied positive voltage is $V_{\mathrm{DD,\,CML}}=1.7\,\mathrm{V}$, and the bottom ones are $V_{\mathrm{SS1,\,CML}}=0.0\,\mathrm{V}$ and $V_{\mathrm{SS2,\,CML}}=-0.7\,\mathrm{V}$, respectively. On the one hand, a slight overdrive is applied to the transistors. On the other hand, the given voltage values compensate the voltage drops caused by series resistances on the supply lines. The estimated total power consumption, including two equivalent branches of $f_{\mathrm{clk}}/2$ channels but not considering the CMOS inverters and the $50\,\Omega$ output drivers for the $f_{\mathrm{clk}}/16$ signal, is about $1.1\,\mathrm{W}$.
The CMOS frequency dividers' contribution is less than $6\,\%$.

Table 1. Stepwise start sequence.

| Step | Part | Signal |
|------|------|--------|
| 1 | divider $f_{\rm clk} \rightarrow f_{\rm clk}/2$ | init.: on $\rightarrow$ off |
| 1 | divider $f_{\rm clk}/2 \rightarrow f_{\rm clk}/4$ | init.: on $\rightarrow$ off |
| 2 | start circuit $f_{\rm clk}$ | MUX: static $\rightarrow$ clock |
| 3 | divider $f_{\rm clk}/4 \rightarrow f_{\rm clk}/8$ | reset: on $\rightarrow$ off (sampled) |
| 4 | divider $f_{\rm clk}/8 \rightarrow f_{\rm clk}/16$ | reset: on $\rightarrow$ off (sampled) |

Fig. 7. (a) Photograph of the bonded die in the RF board cavity. In blue color, the CML part including GSSG input is shown. The small CMOS logic parts are shown in white color. Next to the clock path, the die also contains other circuits not presented here. (b) and (c) show the measurement results of the divided SE output clock signal ${\rm clk_{out}}$ at $f_{\rm clk}/16 = 3.5625\,\rm GHz$ for (b) all phase interpolator positions within $360^\circ$ and (c) the phase positions within one quadrant in a closer view. For measurement reasons, rising and falling edges appear together in (b).

Fig. 7 shows the $f_{\rm clk}=57\,{\rm GHz}$ SE measurement results of the output clock signal ${\rm clk_{out}}$ at $f_{\rm clk}/16=3.5625\,{\rm GHz}$ for different phase interpolator settings. Measurements were done using a subsampling oscilloscope with a sampling module bandwidth of 70 GHz and a phase reference module driven at $f_{\rm clk}$ for low jitter measurements. Fig. 7b shows all phase interpolator positions inside a phase window of $360^\circ$ referred to the $f_{\rm clk}/2$ domain. In principle, a 5 bit programmable phase interpolator can generate 32 different phases.
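
The number of distinct phase positions can be checked with a short enumeration. The following is a minimal sketch, assuming an idealized interpolator in which ${\rm sel}_3$ and ${\rm sel}_4$ step the output by one quadrant each and the three weighting bits follow (2) and (3); the quadrant encoding, the function names and the constants are illustrative assumptions and not the chip's actual control mapping.

```python
import math

F_CLK_HZ = 57e9      # input clock frequency used in the measurement
N_WEIGHT_BITS = 3    # sel0..sel2 set the weighting factor alpha
N_QUADRANTS = 4      # sel3, sel4 select the quadrant (assumed encoding)


def ideal_phase_deg(quadrant, k):
    """Ideal output phase in degrees for a quadrant index and weighting code k.

    Uses alpha = k / (2**N - 1) as in (3) and the arctan law of (2). The
    offset of 90 degrees per quadrant is an assumed, idealized encoding.
    """
    alpha = k / (2**N_WEIGHT_BITS - 1)
    phase_in_quadrant = -math.degrees(math.atan2(1.0 - alpha, alpha))
    return (phase_in_quadrant - 90.0 * quadrant) % 360.0


def distinct_positions():
    """Sorted distinct ideal phase positions over all control settings."""
    phases = set()
    for quadrant in range(N_QUADRANTS):
        for k in range(2**N_WEIGHT_BITS):
            # Round to suppress floating-point noise, then wrap 360 to 0.
            phases.add(round(ideal_phase_deg(quadrant, k), 6) % 360.0)
    return sorted(phases)


positions = distinct_positions()
n_settings = N_QUADRANTS * 2**N_WEIGHT_BITS
n_distinct = len(positions)

# Average step referred to the f_clk/2 domain, whose period is 2 / f_clk.
delta_t_s = 2.0 / (n_distinct * F_CLK_HZ)
delta_phi_deg = 360.0 / n_distinct

print(n_settings, n_distinct)
print(f"delta_t = {delta_t_s * 1e12:.2f} ps, delta_phi = {delta_phi_deg:.1f} deg")
```

Under these assumptions the script reports 32 settings but 28 distinct positions, which corresponds to an average step of about 1.25 ps, or $12.9^\circ$ referred to $f_{\rm clk}/2$.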
As the four phase positions at the corners of the four quadrants coincide, 28 different positions can be observed in one cycle, resulting in a phase delay resolution of $\Delta t=\frac{2}{28\cdot f_{\rm clk}}\approx 1.25\,{\rm ps}$ or an angle resolution referred to $f_{\rm clk}/2$ of $\Delta\phi=360^\circ/28\approx 12.9^\circ$, respectively. Due to asymmetries, the two corresponding values at each quadrant corner differ slightly, which leads to broader curves at these positions. A closer view of all positions within one quadrant is shown in Fig. 7c, revealing phase delay resolutions of about $1.0\,\mathrm{ps}$ to $1.4\,\mathrm{ps}$. Furthermore, a low RMS jitter of $\sim 180\,\mathrm{fs}$ can be determined. It is comparable to the value of $(<)150\,\mathrm{fs}$ in [7], which discusses a state-of-the-art FinFET DAC. However, it is determined differently. It should be noted that this value represents the jitter of the whole system, including CMOS logic parts (also parts for different chip functionality not discussed here) that dynamically load the supply voltage, and additionally that of the measurement setup. Consequently, the on-chip jitter at the CML part might be even lower. The lowest operation frequency verified by measurements is 2 GHz. However, this value is limited by the measurement setup.

# VI. CONCLUSION

A low-jitter clock path up to 57 GHz including a 5 bit programmable phase interpolator realized in 28 nm FD-SOI CMOS technology is presented. The key aspects are the separation of the circuit into a partially inductively peaked CML part and a common CMOS logic part as well as a clock startup concept based on a successive start sequence with initialization and reset parts. Furthermore, separated supply voltage domains with large block capacitors that avoid critical voltage drops have been implemented to ensure proper startup operation of all sensitive circuit parts.
The presented clock path enables time-interleaving of ultra high-speed CMOS integrated sampling systems, i.e. broadband data converters.

#### ACKNOWLEDGMENT

This project is funded by the *Deutsche Forschungsgemeinschaft* (DFG, German Research Foundation) – 276016065. We thank our partners from Nokia Bell Labs Stuttgart for their extensive help in packaging of this integrated circuit. Special thanks to Fred Buchali and Peter Klose for their support.

#### REFERENCES

- [1] K. Schuh, Q. Hu, M. Collisi et al., "100 GSa/s BiCMOS Analog Multiplexer Based 100 GBd PAM Transmission over 20 km Single-Mode Fiber in the C-Band," in *European Conference on Optical Communications* (ECOC), 2020, pp. 1–4.
- [2] X.-Q. Du, M. Grözing, A. Uhl et al., "A 112-GS/s 1-to-4 ADC front-end with more than 35-dBc SFDR and 28-dB SNDR up to 43-GHz in 130-nm SiGe BiCMOS," in *IEEE Radio Frequency Integrated Circuits Symposium* (RFIC), 2019, pp. 215–218.
- [3] N. Planes, O. Weber, V. Barral et al., "28nm FDSOI technology platform for high-speed low-voltage digital applications," in *Symposium on VLSI Technology* (VLSIT), 2012, pp. 133–134.
- [4] J. S. Walling, S. Shekhar, and D. J. Allstot, "Wideband CMOS Amplifier Design: Time-Domain Considerations," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 55, no. 7, pp. 1781–1793, 2008.
- [5] S. Sidiropoulos and M. Horowitz, "A semidigital dual delay-locked loop," *IEEE Journal of Solid-State Circuits*, vol. 32, no. 11, pp. 1683–1692, 1997.
- [6] D. Widmann, M. Grözing, and M. Berroth, "High-Speed Serializer for a 64 GS s$^{-1}$ Digital-to-Analog Converter in a 28 nm Fully-Depleted Silicon-on-Insulator CMOS Technology," *Advances in Radio Science*, vol. 16, pp. 99–108, 2018.
- [7] R. L. Nguyen, A. M. Castrillon, A. Fan et al., "8.6 A Highly Reconfigurable 40-97GS/s DAC and ADC with 40GHz AFE Bandwidth and Sub-35fJ/conv-step for 400Gb/s Coherent Optical Applications in 7nm FinFET," in *IEEE International Solid-State Circuits Conference* (ISSCC), vol. 64, 2021, pp. 136–138.