© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

# 18.06.2021

This manuscript was presented at

2020 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), Monterey, CA, USA, November 16-19, 2020.

# 128-GS/s 1-to-4 SiGe Analog Demultiplexer with 36-GHz Bandwidth for 6-bit Data Converters

Philipp Thomas, Tobias Tannert, Markus Grözing, and Manfred Berroth Institute of Electrical and Optical Communications Engineering University of Stuttgart

Pfaffenwaldring 47, 70569 Stuttgart, Germany philipp.thomas@int.uni-stuttgart.de

Abstract—Coherent optical transceivers cover multiple wavelengths to meet the growing demand for ultra-wideband data links, e.g. from ultra-high definition video-on-demand. However, costs increase with the number of wavelengths per channel, so that higher baud rates are used to reduce the receiver's complexity, while simultaneously increasing electrical bandwidth requirements. In particular, the sampling rate and the analog input bandwidth between photodiode and data converters need to be improved. For that purpose, we present a 4-way time-interleaving analog demultiplexer in one of the most advanced SiGe-BiCMOS technologies to date, operating at the highest reported sampling rate of 128 GS/s. The total harmonic distortion of -37 to -22 dB indicates an accuracy of 5.9 to 3.3 ENOB<sub>THD</sub> across the entire 36-GHz bandwidth of the sampled signal path, and the signal-to-noise ratio of 28 dB at 2 GHz enables 4.4 ENOB<sub>SNR</sub>. Each of the sampling front end's four output paths can drive a 32-GS/s 6-bit analog-to-digital converter that can be connected to commercially available 32-Gbit/s digital interfaces. Combining an ultra-high symbol rate with medium accuracy allows for a data rate beyond 1 Tbit/s per wavelength with dual polarization and quadrature amplitude modulation in a cost-efficient coherent optical receiver.

Keywords—analog-digital conversion, bicmos integrated circuits, demultiplexing, sampled data circuits, silicon germanium.

# I. INTRODUCTION

FinFET-CMOS technology provides unmatched digital computing speeds thanks to its high integration level, but its analog performance is rather poor compared with rival technologies such as FDSOI-CMOS and SiGe-BiCMOS. In both of these technologies, major efforts are being made to increase the transistor switching speed and thus to provide solutions for the increasingly demanding challenges of the information age. However, the costs of developing advanced FDSOI-CMOS solutions that go beyond 28-nm nodes are not always economically reasonable. SiGe-BiCMOS can therefore play an important role in broadband analog and mixed-signal circuits. Recent advances in scaling and process optimization have led to the development of cutting-edge SiGe HBTs with transit frequencies and maximum oscillation frequencies of up to 470 GHz and 610 GHz, respectively [1], [2], [3].

These performance levels make it possible to address advanced ultra-wideband applications such as data center interconnects and optical wide-area and metro networks under reasonable constraints on cost and power consumption. We demonstrate a 1-to-4 analog demultiplexer sampling front-end targeted at these fiber-optical data links. The circuit is realized in an advanced pre-production 90-nm SiGe-BiCMOS process from Infineon Technologies with 300-GHz  $f_T$  and 480-GHz  $f_{max}$ . Owing to this technology improvement, we are able to operate the circuit at 128 GS/s, the highest reported sampling rate to date, with a total harmonic distortion (THD) as low as -37 dB, a signal-to-noise ratio (SNR) of 28 dB at 2 GHz, and a 36-GHz bandwidth of the signal path at full sampling rate. Compared with the solution described in [4], which uses the B11HFC predecessor technology, the power consumption is halved and the sampling rate is increased from 116 GS/s to 128 GS/s while achieving similar linearity. We discuss the architecture of the analog sampling core in Section II and the clock conditioning circuit block in Section III. Experimental results are summarized in Section IV, before the paper is concluded in Section V.

# II. ANALOG DEMULTIPLEXER SAMPLING CORE

A circuit schematic of the presented 1-to-4 analog demultiplexer is shown in Fig. 1. It makes use of the current-mode or charge-sampling topology that was investigated in a single-core track-and-hold [5] and in other analog demultiplexer realizations [4], [6]. It relies on integrating the signal current on hold capacitors  $C_H = 60\,\text{fF}$  during the integration period I, keeping the voltage constant for the duration of the hold interval H, and finally resetting the hold capacitor's voltage to a defined value through switched emitter followers  $Q_{9-10}$  during the reset phase R to avoid inter-symbol interference. The capacitor  $C_2 = 100\,\text{fF}$  balances the load of the differential reset signal driver in the clock conditioning block. Typical goals are 25 % of the single-channel sampling period  $(4f_s)^{-1}$  for each of the R and I intervals and 50 % for the H phase. This partitioning of the clock period is beneficial for the 1-to-4 analog demultiplexer. Once the initial 25 % of the sampling period has passed, the I interval is finished; the signal currents are no longer needed in the first branch and are steered into the second branch. After completing a full cycle of four track-and-hold I intervals, the signal current returns to the first branch. The 50 % hold time results in overlapping hold intervals and allows enough settling time for the analog-to-digital converters that can be connected to the four sampled channels.
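The interval partitioning described above can be illustrated numerically. The following is a minimal sketch that assumes ideal timing, the phase order I, H, R with a 25 % / 50 % / 25 % split, and the 128-GS/s aggregate rate; the function and constant names are illustrative and are not taken from the design.

```python
from fractions import Fraction

F_S_TOTAL_GHZ = 128   # aggregate sampling rate in GS/s (from the text)
N_BRANCHES = 4        # 1-to-4 demultiplexer

# Single-branch sampling period in picoseconds: 4 / 128 GHz = 31.25 ps
T_BRANCH_PS = Fraction(1000 * N_BRANCHES, F_S_TOTAL_GHZ)


def phase_of_branch(branch, t_ps):
    """Return 'I', 'H' or 'R' for one branch at time t_ps (ideal timing)."""
    # Each branch starts integrating one quarter period after the previous one.
    offset = branch * T_BRANCH_PS / N_BRANCHES
    frac = ((t_ps - offset) % T_BRANCH_PS) / T_BRANCH_PS
    if frac < Fraction(1, 4):
        return "I"   # integration: signal current charges C_H
    if frac < Fraction(3, 4):
        return "H"   # hold: voltage is kept constant for the ADC
    return "R"       # reset: C_H is returned to a defined voltage


def schedule(t_ps):
    """Phases of all four branches at time t_ps."""
    return [phase_of_branch(b, t_ps) for b in range(N_BRANCHES)]


if __name__ == "__main__":
    for quarter in range(4):
        t = (Fraction(quarter) + Fraction(1, 2)) * T_BRANCH_PS / 4
        print(f"t = {float(t):6.3f} ps -> {schedule(t)}")
```

Under these assumptions, exactly one branch integrates at any time, the branch that integrates next is being reset, and the remaining two branches hold, which reflects the overlapping hold intervals mentioned in the text.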

The current-steering transistors  $Q_{11-12}$  are realized as common-base switches below the charge sampling capacitors. The switches for the other three branches are emitter-coupled to the first ones. The switch pair with the highest clock voltage at a given point in time conducts the signal current. A transconductance stage  $Q_{6-7}$ , linearized by  $R_{6-7}$ , converts the input voltage into the signal current that is required by the four charge-sampling track-and-hold branches. An emitter-follower buffer  $Q_{2-3}$  is added at the signal input to ease matching to differential 100- $\Omega$  terminations. A second emitter-follower buffer  $Q_{15-16}$  limits the charge drained from  $C_H$  by the differential output buffer  $Q_{23-24}$ , which uses linearization resistors  $R_{14-15}$  and cascode transistors  $Q_{21-22}$  for improved bandwidth. Capacitors  $C_{0-1}$  and  $C_{3-4}$  add some peaking to mitigate the rise of the associated resistances at high signal and clock frequencies. The current source control voltages  $V_{CS,DA0-270}$  of the output buffers can be adjusted individually to balance gain mismatch.

This work was funded in part by the European Union in the ECSEL-JU project TARANTO under grant no. 737454 and in part by the German Federal Ministry of Education and Research under grant no. 16ESE0210.



Fig. 1. Circuit schematic of the 1-to-4 analog demultiplexer front-end. Emitter lengths are given for each stage. Gain mismatch between the four output channels can be balanced through separate current source control voltages in the differential output buffers.

# III. QUADRATURE CLOCK CONDITIONING

Fig. 2 shows the block diagram of the 25 % duty cycle quadrature clock conditioning circuit. As already discussed, the sampling period of the signal current integrator is divided into three phases R, I, and H. A simple clock signal with 50 % duty cycle would not be able to provide this division. In [6], a second clock signal with half of the primary frequency is utilized. Switches of both frequency levels are cascaded in a tree topology, in a way that ensures safe edge transitioning. The corresponding clock switching transistors are therefore stacked. At least twice the base-emitter voltage  $V_{BE}$  and the clock signal amplitude  $\hat{v}_{clk}$  are required to set the correct operating point for the clock switches. This scheme increases the supply voltage by at least  $V_{BE} + \hat{v}_{clk}$ , which accounts for an additional 1 V, compared with a single switching layer.

In this work, we employ a single direct switching layer based on a quadrature clock signal with 25 % duty cycle, as in [4] and [7]. The gain of this switching stage must be sufficiently large to keep edge transition times at a minimum. The required 25 % duty cycle can be obtained by a simple *AND* or *NOR* structure whose inputs are two phases of the quadrature clock signal, as pictured in Fig. 2, e.g.  $\overline{v_{clk,90,50\%} + v_{clk,180,50\%}} = v_{clk,0,25\%}$ . The same logic as in clock conditioning is used for the reset signals, with the phase allocation chosen for correct timing. This ensures that 25 % of the sampling period is reserved for reset and for integration, respectively. The hold mode accounts for the remaining 50 % of the sampling period, as with a conventional track-and-hold amplifier. The hold capacitance  $C_H$  determines the maximum achievable voltage difference  $\Delta v_{H,max}$ , together with the maximum signal current  $i_{H,max}$  and the integration period  $T_I$  of a single track-and-hold channel:  $\Delta v_{H,max} = \frac{i_{H,max} \cdot T_I}{C_H}$ .
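The NOR-based duty-cycle generation and the hold-voltage relation can be checked with a short logic-level sketch. It assumes ideal square waves and ideal gates. The 1-mA signal current in the usage example is a purely hypothetical value, since the paper does not state  $i_{H,max}$ . All names are illustrative.

```python
def square_wave(phase_deg, angle_deg):
    """Ideal 50 % duty-cycle clock: high for 180 degrees starting at phase_deg."""
    return 1 if (angle_deg - phase_deg) % 360 < 180 else 0


def nor(a, b):
    """Logic NOR of two binary values."""
    return 0 if (a or b) else 1


def clk_25(phase_deg, angle_deg):
    """25 % duty-cycle clock starting at phase_deg, formed as the NOR of the
    50 % clock phases that lag by 90 and 180 degrees."""
    return nor(square_wave(phase_deg + 90, angle_deg),
               square_wave(phase_deg + 180, angle_deg))


def duty_cycle(phase_deg, steps=360):
    """Fraction of one clock period during which clk_25 is high."""
    return sum(clk_25(phase_deg, k * 360 / steps) for k in range(steps)) / steps


def delta_v_hold(i_signal_a, t_int_s=7.8125e-12, c_hold_f=60e-15):
    """Voltage step on C_H: delta_v = i * T_I / C_H.
    Defaults: T_I = 25 % of the 31.25-ps channel period, C_H = 60 fF."""
    return i_signal_a * t_int_s / c_hold_f


if __name__ == "__main__":
    for phase in (0, 90, 180, 270):
        print(phase, duty_cycle(phase))
    # Hypothetical 1-mA signal current, chosen only for illustration.
    print(f"{delta_v_hold(1e-3) * 1e3:.1f} mV per mA")
```

In this idealized model the four 25 % clocks do not overlap and together cover the full period, so exactly one branch receives the signal current at any time.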

The four phases of the 32-GHz clock with 50 % duty cycle for the *NOR* logic gates are obtained from a frequency divider, based on a pair of feedback latches, as shown in Fig. 2. These latches need to process the highest frequency level – the differential 64-GHz quasi-square wave. Timing issues of the latch feedback path present the bottleneck for higher sampling rates in this design. This can be observed shortly above 128 GS/s, where stable operation is no longer possible. Whereas this structure requires a high signal frequency at the differential clock input and dissipates a rather high amount of DC power, the most important benefits are an ultra-broadband clock driver adaptable to any sampling rate up to 128 GS/s, a high gain and a stable quadrature phase difference of  $90^{\circ}$ .
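The way a pair of feedback latches yields quadrature phases at half the input frequency can be sketched at the logic level. This is a behavioral idealization that assumes two level-sensitive latches clocked on opposite clock levels with inverted feedback; it is not a model of the actual differential bipolar circuit, and the names are illustrative.

```python
def divide_by_two(n_half_cycles):
    """Logic-level model of a divide-by-2 built from two latches in a
    feedback loop. One list entry is produced per input-clock half-cycle."""
    a, b = 0, 0                  # latch outputs, arbitrary start state
    trace_a, trace_b = [], []
    for n in range(n_half_cycles):
        clk_high = (n % 2 == 0)
        if clk_high:
            a = 1 - b            # first latch transparent: inverted feedback
        else:
            b = a                # second latch transparent: follows first latch
        trace_a.append(a)
        trace_b.append(b)
    return trace_a, trace_b


def four_phases(n_half_cycles):
    """0, 90, 180 and 270 degree outputs of the idealized divider."""
    a, b = divide_by_two(n_half_cycles)
    return {0: a, 90: b, 180: [1 - x for x in a], 270: [1 - x for x in b]}


if __name__ == "__main__":
    for phase, trace in four_phases(8).items():
        print(f"{phase:3d} deg: {trace}")
```

Each output repeats every four input half-cycles, i.e. every two input periods, so a 64-GHz input gives 32-GHz outputs. The second latch output lags the first by one input half-cycle, which is a quarter of the output period and therefore a 90° phase difference.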



Fig. 2. Simplified block diagram of the inductorless 25 % duty cycle quadrature clock conditioning circuit of the 1-to-4 analog demultiplexer [4]. A 64-GHz sinewave input is required for 4 x 32 GS/s = 128 GS/s. Clock dividers are based on latches and drive the output fd16 that can be used as a 2-GHz trigger source for a broadband oscilloscope. The complementary dummy clock divider at the top provides a symmetrical load and layout. A similar approach is shown in [8].

# IV. EXPERIMENTAL RESULTS

The analog demultiplexer IC was measured on a wafer prober station with 40-GHz RF probes including power pins. The differential input voltage swing at the probe connectors was fixed at 500 mV<sub>pp</sub>, and the signal frequency was swept across the signal generator's operational range. DC blocks were included in the signal and clock paths to protect the measurement equipment from the DC operating point of the IC. The negative clock input port was terminated and the positive port was driven by the single-ended clock signal generator. A broadband balun was used at the signal generator on the input side to convert the single-ended signal into a differential signal. With supply voltages of 3.6 V for the clock and 3.1 V for the signal path, a dissipated power of 2.59 W was measured: 2.16 W for the quadrature clock generation and only 430 mW for the analog core and drivers. Linearity and amplitude measurements were captured at  $4 \times 32 \text{ GS/s} = 128 \text{ GS/s}$ . The IC's clock divider output running at 2 GHz was used as an external trigger for the subsampling oscilloscope with 60-GHz bandwidth. Two differential probes were used for the signal and clock inputs, respectively, which leaves only one differential probe for the output channel measurement, as shown in Fig. 3. A total of 64 captured traces were averaged to filter noise components and derive accurate harmonic distortion levels. Frequency spectra were calculated by a 128-point DFT with one point per hold interval, in a 4-ns window of the output channel's signal.
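The spectral evaluation can be illustrated with a small sketch. A 4-ns window at 32 GS/s contains 128 hold intervals, giving a bin spacing of 0.25 GHz, so a 2-GHz input falls exactly on bin 8. The code below is a sketch under these assumptions, with no windowing and a synthetic test signal; the function names are illustrative and this is not the authors' evaluation script.

```python
import cmath
import math

N_POINTS = 128      # one sample per hold interval in the 4-ns window
F_CH_GHZ = 32.0     # per-channel sampling rate in GS/s


def dft_magnitudes(samples):
    """Single-sided amplitude spectrum from a direct DFT (no windowing)."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        scale = 1 if k in (0, n // 2) else 2
        mags.append(scale * abs(acc) / n)
    return mags


def thd_db(samples, f_in_ghz, n_harmonics=5):
    """THD in dB from harmonics 2..n_harmonics of a coherently sampled tone.
    Harmonics above Nyquist are folded back; harmonics that land on DC,
    on Nyquist or on a shared bin are not treated specially."""
    n = len(samples)
    mags = dft_magnitudes(samples)
    bin_fund = round(f_in_ghz * n / F_CH_GHZ)
    harm_power = 0.0
    for h in range(2, n_harmonics + 1):
        k = (h * bin_fund) % n
        k = min(k, n - k)          # fold into the first Nyquist zone
        harm_power += mags[k] ** 2
    return 10 * math.log10(harm_power / mags[bin_fund] ** 2)


if __name__ == "__main__":
    # Synthetic example: 2-GHz tone with a third harmonic at 1 % amplitude.
    sig = [math.cos(2 * math.pi * 8 * i / N_POINTS)
           + 0.01 * math.cos(2 * math.pi * 24 * i / N_POINTS)
           for i in range(N_POINTS)]
    print(f"THD = {thd_db(sig, 2.0):.1f} dB")
```

Because an integer number of signal periods fits into the window, the fundamental and its harmonics fall on exact bins and no window function is needed in this idealized case.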

The captured signal of the analog demultiplexer's output channel 0° is shown in Fig. 4 for a 2-GHz, 500-mV<sub>pp</sub> input signal, sampled at 32 GS/s by each of the four output channels for an overall sampling rate of 128 GS/s. The RMS noise voltage of the channel at 32 GS/s without input signal is 1.17 mV, corresponding to 28 dB SNR at 2 GHz. Harmonics in the frequency domain are shown up to the 5th order; the third harmonic is the largest distortion component, as expected from a differential circuit, and the max. SFDR is 47 dBc. The THD and the associated effective number of bits (ENOB) at 500 mV<sub>pp</sub> are -37 dBc and 5.9, respectively. The transfer characteristic shows a 1-dB compression point at 4 dBm and a third-order intercept point (IP3) at 3 dBm, taking into account a penalty of -5 dB for the single-to-dual-tone IP3 conversion. Fig. 5 shows the frequency response for a 500-mV<sub>pp</sub> input voltage swing. The 3-dB bandwidth is 36 GHz at the full sampling rate of 128 GS/s. This is in the range of the 40-GHz 1-dB bandwidth of the single-core charge-sampling track-and-hold in [5]. In this work, we use faster transistors, but four track-and-hold cores in parallel. For signal frequencies above 2 GHz, linearity values are shown in Fig. 6, with a minimum of 3.3 ENOB at 26 GHz, calculated from harmonic distortion.
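The conversion between the dB figures and the effective number of bits can be reproduced with the usual full-scale sine-wave relation ENOB = (X − 1.76 dB) / 6.02 dB, where X is the magnitude of the THD or the SNR in dB. The sketch below assumes this standard relation is the one used here; the helper names are illustrative.

```python
def enob_from_db(ratio_db):
    """Effective number of bits for a full-scale sine wave,
    given the SNR or the magnitude of the THD in dB."""
    return (ratio_db - 1.76) / 6.02


def implied_signal_rms(noise_rms_v, snr_db):
    """RMS signal voltage implied by an RMS noise voltage and an SNR in dB."""
    return noise_rms_v * 10 ** (snr_db / 20)


if __name__ == "__main__":
    print(f"ENOB_THD at -37 dB: {enob_from_db(37):.2f}")
    print(f"ENOB_THD at -22 dB: {enob_from_db(22):.2f}")
    print(f"ENOB_SNR at  28 dB: {enob_from_db(28):.2f}")
    print(f"Implied signal RMS: {implied_signal_rms(1.17e-3, 28) * 1e3:.1f} mV")
```

With the rounded dB values quoted in the text, the results agree with the reported ENOB figures to within about 0.1 bit; the remaining difference is attributable to rounding of the dB values.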

Table I provides a comparison to state-of-the-art sampling front-ends around 100 GBaud. The presented circuit operates at the highest reported sampling rate of 128 GS/s. This rate was already stated in [7], but the present work demonstrates actual sampling at this rate for the first time. The trigger output provides the divided clock up to 134 GS/s, limited by the max. frequency of 67 GHz of the clock signal generator. The achievable SFDR of the presented analog demultiplexer is 1–6 dB lower than that of the other time-interleaving sampling front-ends in [4] and [6], but can be as high as 47 dBc for a smaller input voltage swing. The power consumption of the analog core is 430 mW including input, output, and clock buffers. This is the lowest value reported for time-interleaved charge-sampling front-ends, 76 % below the core power consumption in [4] and 81 % below the core power consumption in [6].



Fig. 3. RF probe measurement configuration and chip micrograph of the 128-GS/s analog demultiplexer with dimensions of 3.10 mm x 1.55 mm.



Fig. 4. Measurement results of the analog demultiplexer output  $0^{\circ}$  for a 2-GHz input signal, sampled at 32 GS/s (128-GS/s system sampling rate), (a) transient signals and (b) frequency spectrum for a 500-mV<sub>pp</sub> input voltage swing, (c) harmonic distortion with max. SFDR, (d) transfer characteristic with 1-dB compression point and third-order intercept point.



Fig. 5. Frequency response of the analog demultiplexer output  $0^{\circ}$  for a 500-mV<sub>pp</sub> input voltage swing, sampled at 32 GS/s (128-GS/s system sampling rate).

This demonstrates the architectural advantage compared with [6] and the technological advantage compared with [4].
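The power figures quoted in this section can be cross-checked with simple arithmetic. The sketch below uses only the values given in the text and in Table I; the dictionary keys and function name are illustrative.

```python
# Core DC power in watts, taken from Table I.
CORE_POWER_W = {"this work": 0.43, "[4]": 1.8, "[6]": 2.28}

# Power split of this work in watts, taken from the measurement description.
CLOCK_POWER_W = 2.16
TOTAL_POWER_W = 2.59


def reduction_percent(reference_w, new_w):
    """Relative power reduction of new_w with respect to reference_w, in percent."""
    return 100 * (1 - new_w / reference_w)


if __name__ == "__main__":
    for ref in ("[4]", "[6]"):
        saving = reduction_percent(CORE_POWER_W[ref], CORE_POWER_W["this work"])
        print(f"Core power vs. {ref}: {saving:.0f} % lower")
    print(f"Clock + core = {CLOCK_POWER_W + CORE_POWER_W['this work']:.2f} W")
```

The clock generation and the analog core add up to the measured total, and the clock path accounts for the larger share of the DC power.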

# V. CONCLUSION

The 1-to-4 analog demultiplexer sampling front-end operates at 4 x 32 GS/s = 128 GS/s with 500 mV<sub>pp</sub> and features a THD of -37 to -22 dB and an SNR of 28 dB at 2 GHz, indicating more than 3 ENOB across the signal path bandwidth of 36 GHz at full sampling rate. The max. SFDR at 128 GS/s is 38 dBc for a 500-mV<sub>pp</sub> input signal and 47 dBc for 300 mV<sub>pp</sub>, making it a suitable front-end for 6-bit analog-to-digital converters. We demonstrate improvements in comparison with other charge-sampling front-ends thanks to both architecture and technology adjustments. The substantially lower sampling core power consumption facilitates the application of the described circuit in compact optical transceiver modules with strict power constraints. The time-interleaving topology simplifies the connection of digital interfaces with optical interconnects. It enables 128-GBaud transceivers with commercially available 32-GS/s data converters and 32-Gbit/s digital data links, in order to build cost-efficient ultra-wideband coherent optical transceivers.
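The data rate per wavelength mentioned in the abstract can be estimated from the symbol rate. The sketch below assumes 16-QAM and ignores coding and framing overhead; the modulation order is an assumption made here for illustration and is not specified in the paper, and the function name is hypothetical.

```python
def raw_line_rate_gbps(baud_gbd, qam_order, n_polarizations=2):
    """Raw line rate in Gbit/s: symbol rate x bits per symbol x polarizations.
    qam_order must be a power of two (e.g. 4, 16, 64)."""
    if qam_order < 2 or qam_order & (qam_order - 1):
        raise ValueError("qam_order must be a power of two")
    bits_per_symbol = qam_order.bit_length() - 1
    return baud_gbd * bits_per_symbol * n_polarizations


if __name__ == "__main__":
    # Assumed example: 128 GBaud, dual polarization, 16-QAM (not from the paper).
    print(raw_line_rate_gbps(128, 16), "Gbit/s")
```

Under these assumptions the raw rate slightly exceeds 1 Tbit/s per wavelength; the net rate would be lower once overhead is taken into account.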

# ACKNOWLEDGMENT

We thank Julia Krause for support during design and tapeout and Infineon Technologies for the manufacturing and donation of the dies.

# REFERENCES

- [1] H. Rücker and B. Heinemann, "Device architectures for high-speed SiGe HBTs," in 2019 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), Nashville, TN, USA, 3–6 Nov. 2019, pp. 1–7.



Fig. 6. Total harmonic distortion (THD) and corresponding ENOB<sub>THD</sub> of the analog demultiplexer output  $0^{\circ}$  for a 500-mV<sub>pp</sub> input voltage swing, sampled at 32 GS/s (128-GS/s system sampling rate).

- [2] D. Manger *et al.*, "Integration of SiGe HBT with  $f_T = 305$ GHz,  $f_{max} = 537$ GHz in 130nm and 90nm CMOS," in 2018 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), San Diego, CA, USA, 15–17 Oct. 2018, pp. 76–79.
- [3] A. Gauthier et al., "450 GHz f<sub>T</sub> SiGe:C HBT featuring an implanted collector in a 55-nm CMOS node," in 2018 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), San Diego, CA, USA, 15–17 Oct. 2018, pp. 72–75.
- [4] P. Thomas, T. Tannert, M. Grözing, X.-Q. Du and M. Berroth, "A 1-to-4 SiGe BiCMOS analog demultiplexer sampling front-end for a 116 GBaud-receiver," in 2020 15th European Microwave Integrated Circuits Conference (EuMIC), Utrecht, The Netherlands, 11–12 Jan. 2021, accepted for publication.
- [5] X.-Q. Du, M. Grözing and M. Berroth, "A 25.6-GS/s 40-GHz 1-dB BW current-mode track and hold circuit with more than 5-ENOB," in 2018 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), San Diego, CA, USA, 15–17 Oct. 2018, pp. 56–59.
- [6] X.-Q. Du et al., "A 112-GS/s 1-to-4 ADC front-end with more than 35-dBc SFDR and 28-dB SNDR up to 43-GHz in 130-nm SiGe BiCMOS," in 2019 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), Boston, MA, USA, 2–4 June 2019, pp. 215–218.
- [7] A. Zandieh, N. Weiss, T. Nguyen, D. Harame and S. P. Voinigescu, "128-GS/s ADC front-end with over 60-GHz input bandwidth in 22-nm Si/SiGe FDSOI CMOS," in 2018 IEEE BiCMOS and Compound Semiconductor Integrated Circuits and Technology Symposium (BCICTS), San Diego, CA, USA, 15–17 Oct. 2018, pp. 271–274.
- [8] N. Weiss, S. Shopov, P. Schvan, P. Chevalier, A. Cathelin and S. P. Voinigescu, "DC-62 GHz 4-phase 25% duty cycle quadrature clock generator," in 2017 IEEE Compound Semiconductor Integrated Circuit Symposium (CSICS), Miami, FL, USA, 22–25 Oct. 2017, pp. 1–4.
- [9] K. Vasilakopoulos, A. Cathelin, P. Chevalier, T. Nguyen and S. P. Voinigescu, "A 108GS/s track and hold amplifier with MOS-HBT switch," in 2016 IEEE MTT-S International Microwave Symposium (IMS), San Francisco, CA, USA, 22–27 May 2016, pp. 1–4.

| Ref. | Sampling<br>Rate | Bandwidth | Peak-Peak<br>Input Voltage<br>Swing | Spurious-Free<br>Dynamic Range | Total Harmonic<br>Distortion | Total /<br>Core DC<br>Power | Supply<br>Voltage | Technology,<br>f<sub>T</sub> / f<sub>max</sub> |
|------|------------------|-----------|-------------------------------------|--------------------------------|------------------------------|-----------------------------|-------------------|------------------------------------------------|
| [4] | 4 x 29 GS/s =<br>116 GS/s | - | 500 mV | 44 dBc @ 2 GHz<br>29 dBc @ 27 GHz | -41 dB @ 2 GHz<br>-28 dB @ 27 GHz | 5.5 W /<br>1.8 W | 4.7 V /<br>4.6 V | 130-nm SiGe,<br>250 / 400 GHz |
| [6] | 4 x 28 GS/s =<br>112 GS/s | - | 500 mV | 44 dBc @ 1 GHz<br>35 dBc @ 38 GHz | -46 dB @ 1 GHz <sup>a</sup><br>-35 dB @ 43 GHz <sup>a</sup> | 3.34 W /<br>2.28 W | 3.5 V /<br>6.5 V | 130-nm SiGe,<br>300 / 450 GHz |
| [9] | 108 GS/s /<br>90 GS/s <sup>b</sup> | 40 GHz <sup>c</sup> | 800 mV | 55 dBc @ 1 GHz<br>43 dBc @ 15 GHz | -49 dB @ 1 GHz<br>-38 dB @ 15 GHz | 0.09 W /<br>0.02 W | 2.5 V /<br>1.8 V | 55-nm SiGe,<br>330 / 350 GHz |
| This Work | 4 x 32 GS/s =<br>128 GS/s | 36 GHz | 500 mV | 38 dBc @ 2 GHz<br>30 dBc @ 30 GHz | -37 dB @ 2 GHz<br>-22 dB @ 26 GHz | 2.59 W /<br>0.43 W | 3.6 V /<br>3.1 V | 90-nm SiGe,<br>300 / 480 GHz |

TABLE I. COMPARISON OF STATE-OF-THE-ART SAMPLING FRONT-ENDS WITH HIGHEST REPORTED SAMPLING RATES.

<sup>a.</sup> Assuming THD = -SNDR.

<sup>b.</sup> Max. sampling rate 108 GS/s, linearity measured at 90 GS/s.

<sup>c.</sup> Track mode bandwidth.