© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

This manuscript is published in

IEEE Solid-State Circuits Letters

Reference:

D. Widmann, T. Tannert, M. Grözing and M. Berroth, "Analog Multiplexer for Performance Enhancement of Digital-to-Analog Converters and Experimental 2-to-1 Time Interleaving in 28-nm FD-SOI CMOS," in IEEE Solid-State Circuits Letters, vol. 6, pp. 277-280, 2023.

DOI: 10.1109/LSSC.2023.3323857

# Analog Multiplexer for Performance Enhancement of Digital-to-Analog Converters and Experimental 2-to-1 Time Interleaving in 28-nm FD-SOI CMOS

Daniel Widmann, Tobias Tannert, Markus Grözing, Member, IEEE, and Manfred Berroth, Senior Member, IEEE

Abstract-To enhance the performance of digital-to-analog converters (DACs), time interleaving by an analog multiplexer (AMUX) provides a powerful concept. Next to an increased sampling rate, potential signal quality improvement as well as a sin(x)/x roll-off shift due to the nonlinear switching operation enabling a true bandwidth extension can be achieved. In this work, an integrated AMUX in a 28-nm CMOS technology is presented. The fundamental roll-off shift is deduced from a general mathematical model. In measurements, the roll-off shift as well as improvements of the edge jitter of pulse-amplitude modulated (PAM) signals due to the AMUX are demonstrated at a sampling rate of 100 GS/s. Compared to single-DAC operation at 50 GS/s, the total edge jitter of a PAM-2 signal can be improved from a standard deviation of about 1.27 ps to about 0.56 ps at 100 GS/s with AMUX operation in the given system. Finally, switching operation of the AMUX at 126 GS/s is shown demonstrating the potential of the concept.

Index Terms—Analog-digital integrated circuits, analog multiplexer, arbitrary waveform generator, CMOS integrated circuits, digital-analog conversion, digital-to-analog converter, mixed-signal integrated circuits, pulse-amplitude modulation, transmitters.

## I. INTRODUCTION

Digital-to-analog converters (DACs) are critical components in transmitter front-ends. In particular, optical communication systems targeting 1 Tbit/s and above per wavelength demand high-speed DACs with sampling rates of $100\,\mathrm{GS/s}$ and beyond [1]. To enhance the performance of DACs, time interleaving of several sub-DACs by an analog multiplexer (AMUX) is a promising concept and can be compared to analog demultiplexing front-ends on the receiver side. Performance enhancement comprises an increased sampling rate $f_s = 1/T_s$ and bandwidth as well as improved signal quality. In this work, a mathematical description of the AMUX operation is given, and the bandwidth extension due to the $\sin(x)/x$ roll-off shift according to the total output sampling rate is proven from a system-model-based view. An integrated 2:1 AMUX in an arbitrary waveform generator (AWG) system implemented in 28-nm fully-depleted silicon-on-insulator (FD-SOI) CMOS technology is investigated experimentally in terms of bandwidth extension as well as signal quality improvement compared to the sub-DACs' performance. Experimental results are shown at an AMUX output sampling rate of $100\,\mathrm{GS/s}$.



Fig. 1. Interleaving concepts of DACs in time domain for an example of two DACs [1] including schematic timing diagrams. (a) Time interleaving by summation of phase-shifted DAC output signals (linear superposition) and (b) time interleaving by a clocked AMUX (nonlinear switching operation). Solid boxes indicate fully integrated systems, dashed boxes show hybrid concepts.

## II. INTERLEAVING CONCEPTS OF DIGITAL-TO-ANALOG CONVERTERS IN TIME DOMAIN

To enhance the performance of DACs, different concepts are applicable. Fig. 1 depicts two interleaving approaches in time domain at different levels of integration. In Fig. 1a, time interleaving by active or passive summation of phase-shifted DAC outputs to increase the sampling rate is shown. A detailed mathematical analysis of hold-interleaving, data-interleaving as well as data- and hold-interleaving concepts is given in [2]. There are two variants of this concept: superposition of non-return-to-zero (NRZ) pulses with overlapping subsignals, and superposition of return-to-zero (RZ) pulses without overlap but at the expense of higher performance demands on the sub-DACs. The sampling rate and consequently the potentially usable synthesis bandwidth as well as the signal quality can be enhanced. The latter is achieved by suppression of image replicas. However, the superposition concept using NRZ pulses does not relax the fundamental bandwidth limitation defined by the $\sin(x)/x$ roll-off. The bandwidth is still limited by the low-pass characteristic of the sub-DACs and their $\sin(x)/x$ roll-off [1]-[3]. Especially in CMOS DACs, achieving high output bandwidths is a major challenge [1] and the $\sin(x)/x$ roll-off contribution may be significant.

An integrated and clocked AMUX is able to overcome the disadvantages of the linear superposition concepts by its nonlinear switching operation. Due to this operation, sampling rate as well as output bandwidth can be increased assuming a higher output bandwidth of the AMUX compared to the sub-DACs [3] at the cost of additional clocking circuitry. An AMUX enables a conceptual path for performance enhancement of DACs for any given DAC architecture next to technology advances. In a hybrid concept, different technologies for the

This project is funded by the *Deutsche Forschungsgemeinschaft* (DFG, German Research Foundation) — 276016065.

Daniel Widmann, Tobias Tannert, Markus Grözing, and Manfred Berroth are with the Institute of Electrical and Optical Communications Engineering, University of Stuttgart, 70569 Stuttgart, Germany (e-mail: daniel.widmann@int.uni-stuttgart.de).



Fig. 2. Block diagram of the integrated system [11]. For precise clock phase alignment of the sub-DACs, 28 different phase positions with ideal timing steps of  $\sim 1.4$  ps at 100 GS/s are available.

sub-DACs and the AMUX can be chosen, and in a monolithic approach, at least different circuit topologies can be applied. In an ideal model, the AMUX is transparent for a DAC output signal only in the center of the respective sample, so the sub-DACs' output signals are isolated from the output during their switching periods. Hence, the sub-DACs can perform a transition without affecting the AMUX output, i.e., glitches are hidden [3], [4]. Owing to this isolation, DAC artifacts like glitches and jitter can potentially be reduced in the output signal to a certain degree, assuming the AMUX itself does not contribute significant artifacts to the output signal. The operation can be viewed as another sampling process or retiming in the analog domain [4]. "Sampling" in the center of each hold period ensures maximum margin for timing inaccuracies [3]. Finally, the parasitic capacitance at the AMUX output is typically reduced compared to a direct combination of several sub-DACs at the output node in a summation concept. On the other hand, artifacts may arise from asymmetries leading to distortions at $\pm n \cdot f_s/2 \pm m \cdot f_{\mathrm{sig}}$, where $n, m \in \mathbb{N}$ and $f_{\mathrm{sig}}$ denotes the signal frequency, from clock feed-through, as well as from a possibly nonlinear transfer characteristic. A detailed discussion of AMUX impairments is given in [3], [5].
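As an illustration of the distortion frequencies $\pm n \cdot f_s/2 \pm m \cdot f_{\mathrm{sig}}$, the following minimal sketch lists the candidate positions that fall into the first Nyquist zone of the AMUX output. The values $f_s = 100\,\mathrm{GS/s}$ and $f_{\mathrm{sig}} = 49.70703125\,\mathrm{GHz}$ are taken from the measurement in Fig. 5; the function name, the restriction to $n, m \geq 1$, the maximum order of 2 and the limitation to $0 \ldots f_s/2$ are assumptions made for this example only. The sketch says nothing about the amplitudes of these spurs.

```python
def amux_spur_candidates(f_s: float, f_sig: float, max_order: int = 2) -> list:
    """Frequencies |n*f_s/2 +/- m*f_sig| for n, m = 1..max_order within 0..f_s/2."""
    found = set()
    for n in range(1, max_order + 1):
        for m in range(1, max_order + 1):
            for sign in (1.0, -1.0):
                f = abs(n * f_s / 2.0 + sign * m * f_sig)
                if f <= f_s / 2.0:  # keep only the first Nyquist zone (assumption)
                    found.add(f)
    return sorted(found)


F_S_GHZ = 100.0           # AMUX output sampling rate in GS/s
F_SIG_GHZ = 49.70703125   # signal frequency from the Fig. 5 caption in GHz
spurs = amux_spur_candidates(F_S_GHZ, F_SIG_GHZ)
for f in spurs:
    print(f"{f:.8f} GHz")
```

The lowest entry corresponds to $f_s/2 - f_{\mathrm{sig}}$, i.e., a spur close to DC for a signal close to the Nyquist frequency.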

Examples for realizations according to Fig. 1a are demonstrated in [6] at 200 GS/s (hybrid) as well as in [7] at 100 GS/s (integrated). Hybrid AMUX realizations according to Fig. 1b are presented in [8] at 56 GS/s, in [5] at 120 GS/s, in [9] at 150 GS/s as well as in [10] at 168 GS/s (different scheme). In these realizations, the AMUXs are not implemented in CMOS. The concept of a monolithically integrated AMUX has already been implemented e.g. in [4] at 28 GS/s in 16-nm CMOS. The AMUX of this work is also able to operate at 118 GS/s [11]. So far, no other integrated CMOS solution of an AMUX at comparable sampling rates has been presented.

## III. SYSTEM AND ANALOG MULTIPLEXER

The system's block diagram is depicted in Fig. 2. It is a universal AWG comprising two on-chip memory blocks, two sub-DACs and the AMUX. It is presented in [11]. The AMUX as well as the shown clock path components are implemented in current-mode topology whereas in the sub-DACs, CMOS topologies are used. AMUX and sub-DACs operate at half-rate clock. A broadband clock path allows adjusting the clock



Fig. 3. Schematic of the AMUX [11] with indicated transistor widths. The sub-DACs' single-ended output voltage range is between 250 mV and 750 mV and their signals are connected to the AMUX via source followers with 50-$\Omega$ input termination resistors. The AMUX's power consumption is about 174 mW.

offset (clk offs) and hence the duty cycle of the AMUX. It includes two 5-bit programmable phase rotators  $(\phi_0/\phi_1)$  in front of the sub-DACs. The schematic of the active AMUX is presented in Fig. 3. It is based on two interleaved Gilbert cells. At the bottom, two linearized transconductance stages serve as data inputs. Cascode current switches driven by the differential half-rate clock signal are stacked above performing the nonlinear switching operation by current steering either to the output path or to a dummy path. Both paths are isolated from the switches by common-gate transistors also providing a common-gate input. Shunt-series peaking is applied for bandwidth enhancement. For single sub-DAC characterization, the AMUX's clock signal can be set to a static state.

## IV. THEORY OF THE AMUX OPERATION

In this work, the following definitions and notations apply. The Fourier transform of a signal $x(t)$ is denoted as $x(t) \circ\!\!-\!\!\bullet X(j\omega)$. The notation $X(e^{j\Omega})$ describes the discrete-time Fourier transform of a signal $x[n]$ using the normalized angular frequency $\Omega = \omega T_s$. The si function is defined as $\operatorname{si}(x) \coloneqq \sin(x)/x$. Finally, the Fourier series representation of a rectangular pulse train $r(t)$ with a period of $T_p$ and a pulse width of $T_p/N$ is given in (1).

$$r(t) = \sum_{\nu = -\infty}^{\infty} \operatorname{rect}\left(\frac{t - \nu T_{\mathrm{p}}}{T_{\mathrm{p}}/N}\right) = \sum_{\nu = -\infty}^{\infty} \frac{1}{N} \operatorname{si}\left(\frac{\pi\nu}{N}\right) e^{j\nu \frac{2\pi}{T_{\mathrm{p}}}t} \tag{1}$$
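The series in (1) can be checked numerically. The following minimal sketch evaluates a truncated version of the right-hand side of (1) and compares it with the expected pulse values of 1 inside a pulse and 0 between pulses. The period, the choice $N = 2$, the number of terms and the two test instants are assumptions for illustration; the truncated sum only approximates the ideal rectangular shape and shows ripple near the pulse edges.

```python
import cmath
import math


def si(x: float) -> float:
    """si(x) = sin(x)/x with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def pulse_train_series(t: float, t_p: float, n: int, terms: int) -> float:
    """Truncated Fourier series of (1), summed over nu = -terms..terms."""
    total = 0j
    for nu in range(-terms, terms + 1):
        coeff = si(math.pi * nu / n) / n
        total += coeff * cmath.exp(1j * nu * 2.0 * math.pi * t / t_p)
    return total.real


T_P = 1.0  # period in arbitrary units (assumed)
N = 2      # pulse width T_P / N (assumed, as for a 2:1 AMUX)
inside = pulse_train_series(0.1 * T_P, T_P, N, terms=5000)   # |t| < T_P / (2 N)
outside = pulse_train_series(0.4 * T_P, T_P, N, terms=5000)  # between two pulses
print(f"inside pulse: {inside:.4f}, outside pulse: {outside:.4f}")
```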

In the following, the $\sin(x)/x$ roll-off shift due to an AMUX according to the total output sampling rate $f_s = 1/T_s$ is proven. The starting point for the following consideration is the general model according to Fig. 4 [3]. Generally, $N$ DACs ($N \in \mathbb{N}$) with $\Lambda = \{0, \ldots, N-1\}$ and $\lambda \in \Lambda$ are assumed, where $N = 1$ corresponds to a converter structure without AMUX. Furthermore, the output symbol period of the individual DACs is $T_{\text{DAC}}$, the one of the total output signal is $T_s$, and their relation is assumed to be $T_{\text{DAC}} = N \cdot T_s$. Assuming NRZ pulses, the sub-DAC output signals are given by (2).

$$Y_{\lambda}(j\omega) = \underbrace{X_{\lambda}\left(e^{j\omega T_{\mathrm{DAC}}}\right)}_{\substack{\text{ideal digital-to-}\\ \text{analog conversion}}} \cdot \underbrace{T_{\mathrm{DAC}} \operatorname{si}\left(\frac{\omega T_{\mathrm{DAC}}}{2}\right)}_{\text{zero-order hold}} \cdot \underbrace{e^{-j\omega\lambda T_{\mathrm{s}}}}_{\text{delay}} \tag{2}$$



Fig. 4. Block diagram of a DAC-AMUX system model [3]. Low-pass filters being inherent or realized explicitly are depicted as well.

Assuming a cutting-out operation in time domain in the center of the sub-DACs' output signals by the AMUX, the clock signals can be written as

$$C_{\lambda}(j\omega) = \frac{2\pi}{N} \cdot \operatorname{si}\left(\frac{\omega T_{\mathrm{s}}}{2}\right) \cdot e^{-j\omega\lambda T_{\mathrm{s}}} \cdot \sum_{\mu=-\infty}^{\infty} \delta\left(\omega - \mu \frac{2\pi}{NT_{\mathrm{s}}}\right) \tag{3}$$

in frequency domain. The output signal is described by

$$Y(j\omega) = \frac{1}{2\pi} \left( \sum_{\lambda \in \Lambda} C_{\lambda}(j\omega) * Y_{\lambda}(j\omega) \right) . \tag{4}$$

So far, the model from [3] has been summed up. In the equations, bandwidth limiting effects can be considered by additional filter transfer functions. The proof of the  $\sin(x)/x$  roll-off shift due to an AMUX is based on (4). The output spectrum is given by (5).

$$\begin{aligned}
Y(j\omega) = {} & T_{\mathrm{s}} \sum_{\lambda \in \Lambda} X_{\lambda} \left( e^{j\omega T_{\mathrm{DAC}}} \right) e^{-j\omega\lambda T_{\mathrm{s}}} \\
& \cdot \underbrace{\sum_{\mu = -\infty}^{\infty} \operatorname{si}\left(\frac{\left(\omega - \mu \frac{2\pi}{NT_{\mathrm{s}}}\right) T_{\mathrm{DAC}}}{2}\right) \operatorname{si}\left(\mu \frac{\pi}{N}\right)}_{F_{1}(j\omega)}
\end{aligned} \tag{5}$$
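The step from (4) to (5) can be made explicit as follows. This is a sketch reconstructed from (2) to (4) and not taken from [3]; it only uses the sifting property of the Dirac impulse, the $2\pi/T_{\mathrm{DAC}}$ periodicity of $X_\lambda$ and the relation $T_{\mathrm{DAC}} = N \cdot T_{\mathrm{s}}$. Convolving with the impulse train in (3) evaluates the remaining factors of $C_\lambda$ at $\omega = \mu \frac{2\pi}{NT_{\mathrm{s}}}$ and shifts $Y_\lambda$:

$$\frac{1}{2\pi}\, C_{\lambda}(j\omega) * Y_{\lambda}(j\omega) = \frac{1}{N} \sum_{\mu = -\infty}^{\infty} \operatorname{si}\left(\mu \frac{\pi}{N}\right) e^{-j\mu \frac{2\pi}{N}\lambda} \; Y_{\lambda}\left(j\left(\omega - \mu \frac{2\pi}{NT_{\mathrm{s}}}\right)\right)$$

Inserting (2) into the shifted term gives

$$Y_{\lambda}\left(j\left(\omega - \mu \frac{2\pi}{NT_{\mathrm{s}}}\right)\right) = X_{\lambda}\left(e^{j\omega T_{\mathrm{DAC}}}\right) \cdot T_{\mathrm{DAC}} \operatorname{si}\left(\frac{\left(\omega - \mu \frac{2\pi}{NT_{\mathrm{s}}}\right) T_{\mathrm{DAC}}}{2}\right) \cdot e^{-j\omega\lambda T_{\mathrm{s}}}\, e^{+j\mu \frac{2\pi}{N}\lambda}$$

The two phase factors $e^{\mp j\mu \frac{2\pi}{N}\lambda}$ cancel, and $T_{\mathrm{DAC}}/N = T_{\mathrm{s}}$ yields the prefactor in (5) after summation over $\lambda \in \Lambda$.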

Considering the last part $F_1(j\omega)$ in (5) in time domain, $F_1(j\omega) \bullet\!\!-\!\!\circ f_1(t)$, it holds

$$\begin{aligned}
f_{1}(t) &= \sum_{\mu = -\infty}^{\infty} \operatorname{si}\left(\mu \frac{\pi}{N}\right) \cdot e^{j\mu \frac{2\pi}{T_{\mathrm{DAC}}}t} \cdot \frac{1}{T_{\mathrm{DAC}}} \operatorname{rect}\left(\frac{t}{T_{\mathrm{DAC}}}\right) \\
&\stackrel{(1)}{=} \frac{1}{T_{\mathrm{s}}} \cdot \operatorname{rect}\left(\frac{t}{T_{\mathrm{s}}}\right) \circ\!\!-\!\!\bullet \operatorname{si}\left(\frac{\omega T_{\mathrm{s}}}{2}\right) = F_{1}(j\omega) \; .
\end{aligned} \tag{6}$$
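The identity $F_1(j\omega) = \operatorname{si}(\omega T_{\mathrm{s}}/2)$ from (6) can also be checked numerically. The following minimal sketch compares a truncated version of the sum $F_1(j\omega)$ in (5) with the closed form for $N = 2$ and $T_{\mathrm{s}} = 10\,\mathrm{ps}$ (100 GS/s). The test frequencies and the truncation limit are assumptions for illustration; since the summands decay with $1/\mu^2$, the truncated sum agrees with the closed form only up to a small residual.

```python
import math


def si(x: float) -> float:
    """si(x) = sin(x)/x with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def f1_sum(omega: float, t_s: float, n: int, terms: int) -> float:
    """Truncated sum F_1(j omega) from (5), summed over mu = -terms..terms."""
    t_dac = n * t_s
    total = 0.0
    for mu in range(-terms, terms + 1):
        shifted = omega - mu * 2.0 * math.pi / (n * t_s)
        total += si(shifted * t_dac / 2.0) * si(mu * math.pi / n)
    return total


T_S = 10e-12  # output sample period for 100 GS/s
N = 2         # 2:1 AMUX
results = []
for f_ghz in (10.0, 25.0, 40.0):  # test frequencies (assumed)
    omega = 2.0 * math.pi * f_ghz * 1e9
    truncated = f1_sum(omega, T_S, N, terms=20000)
    closed_form = si(omega * T_S / 2.0)
    results.append((truncated, closed_form))
    print(f"{f_ghz:5.1f} GHz: sum = {truncated:.5f}, si = {closed_form:.5f}")
```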

Finally, the output spectrum

$$Y(j\omega) = T_{\mathrm{s}} \cdot \sum_{\lambda \in \Lambda} X_{\lambda} \left( e^{j\omega T_{\mathrm{DAC}}} \right) \cdot e^{-j\omega\lambda T_{\mathrm{s}}} \cdot \operatorname{si}\left(\frac{\omega T_{\mathrm{s}}}{2}\right) \tag{7a}$$

$$= T_{\mathrm{s}} \cdot X \left( e^{j\Omega} \right) \cdot \operatorname{si} \left( \frac{\omega T_{\mathrm{s}}}{2} \right) \tag{7b}$$

is obtained. In (7a) and (7b), the multiplication with $\operatorname{si}\left(\frac{\omega T_s}{2}\right)$ proves that the total output sampling rate $f_s = 1/T_s$ including the AMUX determines the $\sin(x)/x$ roll-off. This is in accordance with the expectation resulting from an RZ interleaving model [2]. However, the proof in this work originates from the real circuit topology and operations and provides an extension to [2] for AMUX systems. That is, NRZ sub-DAC output symbols at $f_s/N$ as well as the multiplication (mixing) operation of the AMUX are considered in the model and the derivation.
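The model can also be illustrated in time domain. The following minimal sketch builds ideal NRZ sub-DAC signals with a hold width of $T_{\text{DAC}} = N \cdot T_s$ and a delay of $\lambda T_s$, multiplies each of them with a clock window of width $T_s$ centered in the hold period, and sums the products. The result is compared with an NRZ signal at the full rate $1/T_s$. All function names, the example data and the normalized period $T_s = 1$ are assumptions for illustration; bandwidth limitations and all non-idealities of a real circuit are ignored.

```python
def rect(x: float) -> float:
    """Unit-width rectangle: 1 for -0.5 <= x < 0.5, else 0."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0


def sub_dac(t: float, samples: list, lam: int, n: int, t_s: float) -> float:
    """Ideal NRZ output of sub-DAC lam: hold width n*t_s, delayed by lam*t_s."""
    t_dac = n * t_s
    return sum(x * rect((t - k * t_dac - lam * t_s) / t_dac)
               for k, x in enumerate(samples))


def clock(t: float, lam: int, n: int, t_s: float, periods: int) -> float:
    """Clock windows of width t_s centered in the hold periods of sub-DAC lam."""
    t_dac = n * t_s
    return sum(rect((t - mu * t_dac - lam * t_s) / t_s)
               for mu in range(-1, periods + 1))


def amux(t: float, streams: list, t_s: float) -> float:
    """Sum of the clock-gated sub-DAC signals (multiplication in time domain)."""
    n = len(streams)
    periods = len(streams[0])
    return sum(clock(t, lam, n, t_s, periods)
               * sub_dac(t, streams[lam], lam, n, t_s)
               for lam in range(n))


def interleaved_nrz(t: float, samples: list, t_s: float) -> float:
    """Reference: ideal NRZ signal at the full output rate 1/t_s."""
    return sum(x * rect((t - k * t_s) / t_s) for k, x in enumerate(samples))


T_S = 1.0                                  # normalized output sample period
data = [1.0, -1.0, -1.0, 1.0, 1.0, 1.0, -1.0, 1.0]
streams = [data[0::2], data[1::2]]         # even samples to DAC0, odd to DAC1
for k in range(len(data)):
    t = k * T_S + 0.3 * T_S                # instant inside output sample k
    print(k, amux(t, streams, T_S), interleaved_nrz(t, data, T_S))
```

Within each output sample, only one clock window is open, so the output follows the sample of exactly one sub-DAC while the other one may change its value unnoticed.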

## V. MEASUREMENT RESULTS

To examine the performance enhancement by the AMUX operation in the given system (micrograph in Fig. 5a), a single sub-DAC output at $50\,\mathrm{GS/s}$ is compared to full operation at $100\,\mathrm{GS/s}$. In both measurements, the sub-DACs are consistently running at 50 GS/s. For single sub-DAC operation, static clock signals put the AMUX into a static, transparent state for the respective channel, so that the sub-DAC can be investigated through the transparent AMUX, while the passive sub-DAC outputs a constant value to prevent crosstalk from affecting the output signal. In this case, the AMUX can be viewed as a linear buffer. To prove bandwidth enhancement, the impulse responses are estimated from the averaged responses to jumps from mid-scale to the maximum and minimum output values with a length of one sampling period. Fig. 5b shows the corresponding transfer functions, in which the AMUX advantage becomes obvious. According to (7b), the $\sin(x)/x$ roll-off is determined by the effective sampling rate of the given system and leads to a zero at $f = 50\,\mathrm{GHz}$ for the case of NRZ single-core operation. At $f = 25\,\mathrm{GHz}$, a bandwidth improvement due to the AMUX of more than 2 dB is observed (ideally $\sim 3\,\mathrm{dB}$). Because the predistortion of broadband signals up to the Nyquist frequency is governed by the highest attenuation in the respective frequency range, the AMUX already provides an advantage in the lower frequency region. In total, the figure illustrates the powerful bandwidth enhancement ability of an AMUX. In case of linear time interleaving of the two NRZ sub-DACs [2], the sampling rate would also double to $100\,\mathrm{GS/s}$. However, the transfer function would still be limited to the sub-DACs' transfer functions and the symbol rate could not increase accordingly, i.e., it would have to be far less than 50 GBd due to the roll-off.
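The ideal value of about 3 dB at 25 GHz follows directly from the si term in (7b). The following minimal sketch evaluates the zero-order-hold attenuation $20 \log_{10} |\operatorname{si}(\pi f / f_s)|$ for NRZ pulses at 50 GS/s and at 100 GS/s. Only the ideal roll-off is considered; the analog bandwidth of the sub-DACs and of the AMUX is ignored, and the function names are assumptions for this example.

```python
import math


def si(x: float) -> float:
    """si(x) = sin(x)/x with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def nrz_rolloff_db(f_hz: float, f_s_hz: float) -> float:
    """Ideal NRZ roll-off 20*log10|si(pi*f/f_s)| in dB (omega*T_s/2 = pi*f/f_s)."""
    return 20.0 * math.log10(abs(si(math.pi * f_hz / f_s_hz)))


F_HZ = 25e9
single_db = nrz_rolloff_db(F_HZ, 50e9)    # single sub-DAC at 50 GS/s
amux_db = nrz_rolloff_db(F_HZ, 100e9)     # AMUX output at 100 GS/s
gain_db = amux_db - single_db
print(f"single DAC: {single_db:.2f} dB, AMUX: {amux_db:.2f} dB, "
      f"difference: {gain_db:.2f} dB")
```

The ratio of the two si values at this frequency is $\operatorname{si}(\pi/4)/\operatorname{si}(\pi/2) = \sqrt{2}$, which corresponds to about 3.01 dB.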

A second comparative analysis is based on a two-level pulse-amplitude modulated signal (PAM-2) of length 1024. Figures 5c and 5d show PAM-2 signals at 50 GS/s for single-DAC operation as well as at 100 GS/s for AMUX operation. The drop in magnitude corresponds to the difference in the transfer functions according to Fig. 5b considering a signal up to the Nyquist frequency of 25 GHz for DAC0 and a signal up to the Nyquist frequency of 50 GHz for the AMUX. In both cases, linear predistortion is applied. As a characteristic parameter, the jitter around $V_{\mathrm{d}} = 0\,\mathrm{V}$ is determined from histogram data. It provides a quantitative value for the total jitter affecting the horizontal eye opening. In the given system, the AMUX improves the standard deviation of the jitter from $\sigma_j \approx 1.27\,\mathrm{ps}$ in Fig. 5c to $\sigma_j \approx 0.56\,\mathrm{ps}$ in Fig. 5d despite doubling the output sampling rate.

In another measurement, a PAM-2 eye diagram for single-DAC operation is compared to AMUX operation with all samples doubled and shown in Fig. 5e and Fig. 5f. Hence, in both cases, the output symbol rate is 50 GBd. Still, there is an advantage due to the AMUX improving the jitter



Fig. 5. (a) Micrograph of the bonded die in RF board cavity and (b) to (h) measurement results. (b) Normalized transfer functions for single-DAC operation at 50 GS/s and AMUX operation at 100 GS/s. Comparison of PAM-2 signals for (c) single-DAC operation at 50 GS/s and (d) AMUX operation at 100 GS/s. 50-GBd PAM-2 signals are shown in (e) for single-DAC operation at 50 GS/s and in (f) for AMUX operation at 100 GS/s, scaled to approximately the same output swings for comparison. (g) Spectrum normalized to full-scale at 100 GS/s AMUX operation for a signal frequency of 49.707031250 GHz. In (b) to (g), the sub-DACs are consistently running at 50 GS/s. (h) AMUX switching operation for static data input signals at 126 GS/s.

from  $\sigma_j \approx 1.36 \,\mathrm{ps}$  to  $\sigma_j \approx 0.47 \,\mathrm{ps}$ . As the sub-DACs are implemented in CMOS topologies, they provide a highly dynamic load to the supply voltage. Limited on-chip decoupling causes the analog output signals of the sub-DACs to be affected by supply voltage artifacts. On the contrary, the AMUX is implemented in current-mode topology and hence, it provides a constant load. In summary, sub-DAC artifacts can be reduced by an AMUX in combination with a change in circuit topology from CMOS to current-mode.
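The relative jitter improvement follows from the standard deviations reported above. The following minimal sketch computes the reduction $1 - \sigma_{j,\mathrm{AMUX}}/\sigma_{j,\mathrm{DAC}}$ for both comparisons; the function name and the dictionary layout are assumptions for this example, and the input values are the rounded measurement results from the text.

```python
def relative_reduction(before_ps: float, after_ps: float) -> float:
    """Relative reduction of the jitter standard deviation (0..1)."""
    return 1.0 - after_ps / before_ps


# (single-DAC sigma_j, AMUX sigma_j) in ps, as reported in the text
cases = {
    "Fig. 5c -> Fig. 5d": (1.27, 0.56),
    "Fig. 5e -> Fig. 5f": (1.36, 0.47),
}
reductions = {name: relative_reduction(*pair) for name, pair in cases.items()}
for name, value in reductions.items():
    print(f"{name}: {100.0 * value:.1f} % lower jitter standard deviation")
```

Both comparisons give a reduction of more than 50 %.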

A spectrum for an averaged sinusoidal signal at a frequency $f_{\mathrm{sig}} \approx 50\,\mathrm{GHz}$ is depicted in Fig. 5g. The expected AMUX distortions at $f_{\mathrm{s}}/2 - f_{\mathrm{sig}}$ and $f_{\mathrm{clk}}$ appear and prove to be of the same order of magnitude as other sub-DAC spurs in this case. At other frequencies, AMUX distortions contribute even less than other effects. Clock feed-through in the differential signal can be reduced by an improved layout.

As a final result, the AMUX switching operation at  $f_s = 126 \text{ GS/s}$  for static DAC0/1 input signals is presented in Fig. 5h illustrating the concept's potential for much higher sampling rates.

## VI. CONCLUSION

In this work, the influence of an AMUX in a DAC front-end is investigated theoretically as well as experimentally. Two sub-DACs in CMOS topology are combined using an integrated AMUX in current-mode topology. As a matter of principle, the $\sin(x)/x$ roll-off shifts according to the effective output sampling rate. Hence, an AMUX is able to relax the bandwidth limitation of DACs, provided that the AMUX itself offers sufficient analog bandwidth. In a second analysis, the AMUX's "sampling" process is evaluated in terms of edge jitter. In the given system, sub-DAC artifacts due to their dynamic supply voltage loading can be reduced by the AMUX. The standard deviation of the jitter is reduced by more than 50%. A current-mode AMUX can improve the signal quality if the output signal's artifacts are dominated by the sub-DACs. To further improve performance, broadband amplifiers that compensate the frequency response are of special interest.

## REFERENCES

- [1] F. Buchali, "Beyond 1 Tbit/s transmission using high-speed DACs and analog multiplexing," in 2021 Optical Fiber Communications Conference and Exhibition (OFC), 2021.
- [2] S. Balasubramanian et al., "Systematic Analysis of Interleaved Digital-to-Analog Converters," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 58, no. 12, pp. 882–886, 2011.
- [3] C. Schmidt, Interleaving Concepts for Digital-to-Analog Converters: Algorithms, Models, Simulations and Experiments, 1st ed. Wiesbaden: Springer Vieweg, 2020.
- [4] P. Caragiulo, O. E. Mattia, A. Arbabian, and B. Murmann, "A 2× Time-Interleaved 28-GS/s 8-Bit 0.03-mm<sup>2</sup> Switched-Capacitor DAC in 16-nm FinFET CMOS," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 8, pp. 2335–2346, 2021.
- [5] M. Collisi and M. Möller, "Resolution-Related Design Considerations for a 120-GS/s 8-bit 2:1 Analog Multiplexer in SiGe-BiCMOS Technology," *IEEE Journal of Solid-State Circuits*, vol. 56, no. 9, pp. 2624–2634, 2021.
- [6] H. Hettrich, R. Schmid, L. Altenhain, J. Würtele, and M. Möller, "A Linear Active Combiner Enabling an Interleaved 200 GS/s DAC with 44 GHz Analog Bandwidth," in 2017 IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), 2017, pp. 142–145.
- [7] H. Huang et al., "An 8-bit 100-GS/s Distributed DAC in 28-nm CMOS for Optical Communications," *IEEE Transactions on Microwave Theory and Techniques*, vol. 63, no. 4, pp. 1211–1218, 2015.
- [8] T. Tannert et al., "A SiGe-HBT 2:1 Analog Multiplexer with more than 67 GHz Bandwidth," in 2017 IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), 2017, pp. 146–149.
- [9] J. Schostak et al., "150 GBd PAM-4 Electrical Signal Generation using SiGe-Based Analog Multiplexer IC," in 2022 17th European Microwave Integrated Circuits Conference (EuMIC), 2022, pp. 173–176.
- [10] M. Nagatani *et al.*, "A Beyond-1-Tb/s Coherent Optical Transmitter Front-End Based on 110-GHz-Bandwidth 2:1 Analog Multiplexer in 250-nm InP DHBT," *IEEE Journal of Solid-State Circuits*, vol. 55, no. 9, pp. 2301–2315, 2020.
- [11] D. Widmann, T. Tannert, X.-Q. Du, T. Veigel, M. Grözing, and M. Berroth, "A Time-Interleaved Digital-to-Analog Converter up to 118 GS/s with Integrated Analog Multiplexer in 28-nm FD-SOI CMOS Technology," *IEEE Journal of Solid-State Circuits*, 2023.