© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

This manuscript was presented at 2022 IEEE International Symposium on Circuits and Systems (ISCAS), 2022 DOI: 10.1109/ISCAS48785.2022.9937421

# Mixed-Signal Integrated Circuit for Direct Raised-Cosine Filter Waveform Synthesis of Digital Signals up to 24 GS/s in 22 nm FD-SOI CMOS Technology

Daniel Widmann, Raphael Nägele, Markus Grözing, and Manfred Berroth

University of Stuttgart, Institute of Electrical and Optical Communications Engineering, Germany {daniel.widmann, raphael.naegele, markus.groezing, manfred.berroth}@int.uni-stuttgart.de

Abstract—Pulse shaping for signal transmission over bandwidth limited channels and for sensor systems is very important to control intersymbol interference and to comply with spectrum emission mask specifications by reducing the occupied bandwidth. In this work, an efficient, low-power concept for digital-to-waveform conversion is presented on a 22 nm CMOS node. A key characteristic is the approximation concept of a raised-cosine filter for waveform synthesis by non-binary weighting in the digital-to-analog converter (DAC) keeping hardware complexity and thus power consumption low. Due to the proposed pulse shaping method, spectral side lobes of a pseudo-random bit stream example can be reduced by more than 20 dB at 24 GS/s and a power consumption of only about 30 mW. In summary, this concept replaces high-resolution DACs or analog filters, respectively, in pulse shaping circuits by simple CMOS logic and an application-centric DAC.

*Index Terms*—Baseband, CMOS integrated circuits, digital-analog conversion, pulse shaping methods, waveform synthesis.

#### I. INTRODUCTION

Pulse shaping of baseband signals is important for data transmission systems and for sensor systems due to bandwidth limitations and spectral emission masks [1]. Reference [2] addresses performance limitations of analog-to-digital converters and refers to the umbrella term of "Analog-to-Information (A-to-I) conversion" for future efficiency improvements. In accordance with this view, an appropriate "more holistic and application-centric approach" [2] for digital-to-analog converters (DACs) is presented here. In this spirit, the target waveform of a raised-cosine filtered digital signal is generated by a customized digital-to-waveform approach rather than by an application-agnostic DAC. Adapting the circuit to the requirements of the target waveform simplifies its implementation and potentially improves performance as well as power and chip area efficiency. An externally supplied bit sequence at a data rate of 6 Gbit/s is digitally pre-processed and internally converted to an upsampled 3-bit digital signal at 24 GS/s. Finally, the symbols are converted to an analog signal by a non-binary DAC which, complemented by a simple filter logic, approximates a raised-cosine shape.

Classically, the preferred implementation of a pulse shaping filter is given by a digital FIR filter followed by a binary DAC [3], [4]. Area and hardware efficiency (and accordingly the capability of integration) are critical aspects of such systems. The method proposed in this work is an efficient way to perform spectral shaping at a power consumption of only about 30 mW and very low hardware effort requiring neither the high power consuming combination of digital signal processors and conventional high-resolution DACs with binary weighting nor area intensive analog filters.

The fully differential pulse shaping circuit is realized in 22 nm fully-depleted silicon-on-insulator (FD-SOI) CMOS technology of Globalfoundries providing flip-well transistors and can be part of baseband circuits for analog front-end transmitters. Thus, possible applications can be found in the field of data communication as well as in sensing applications like radar transmitter systems with ultra-wideband (UWB) pseudo-random noise (PRN) baseband signals [5] in CMOS technology. Radar systems are of special interest for automotive applications [6], [7]. The ongoing progress in CMOS technology, e.g. the 22 nm FD-SOI technology [8], [9], paves the way to a technology evolution towards pure CMOS processes for millimeter-wave circuits and radar systems [10] even if SiGe bipolar devices can still outperform CMOS transistors [7].

#### II. THEORY

This work concentrates on pseudo-random bit stream (PRBS) signals as stimulus which can be used in radar systems as described e.g. in [5]. In principle, the concept for binary signals can be extended to more output levels. However, the advantage over high-resolution DAC concepts decreases with increasing number of output levels. Pulse shaping for signal transmission over bandwidth limited channels is very important to control intersymbol interference [1] and to comply with spectrum emission mask specifications. The proposed system implements an approximation of the raised-cosine filter, a special variant of a Nyquist filter with the following frequency characteristic [11]:

$$X(f) = \begin{cases} T_b & 0 \le |f| \le \frac{1-\beta}{2T_b} \\ \frac{T_b}{2} \left\{ 1 + \cos\left[\frac{\pi T_b}{\beta} \left(|f| - \frac{1-\beta}{2T_b}\right)\right] \right\} & \frac{1-\beta}{2T_b} \le |f| \le \frac{1+\beta}{2T_b} \\ 0 & |f| > \frac{1+\beta}{2T_b} \end{cases} \tag{1}$$

This project is partly funded by the *Deutsche Forschungsgemeinschaft* (DFG, German Research Foundation) – 276016065.

TABLE I Normalized analog DAC output levels for an input signal transition  $0 \rightarrow 1$  at  $t = -\frac{T_b}{2}$ .



Fig. 1. Principle of pulse shaping. A  $0\rightarrow 1\rightarrow 0$  transition at an input bit period of  $T_b$  is ideally transformed to a time-continuous staircase signal with continuous values according to (2). In the system presented here, the shaping is approximated by five values, ideally having a hold time of  $T_s = T_b/4$ .

 $T_b$  is the bit period and  $\beta$  is called the roll-off factor with  $0 \le \beta \le 1$ . The corresponding pulse to this spectrum in time domain is given as

$$x(t) = \frac{\sin(\pi t/T_b)}{\pi t/T_b} \frac{\cos(\pi \beta t/T_b)}{1 - 4\beta^2 t^2/T_b^2} . \tag{2}$$

The approximation approach of this work sets the roll-off factor to  $\beta = 1$  and neglects the parts for  $|t| > T_b$  because the pulse decays with  $1/t^3$ . Furthermore, the developed architecture delivers approximated values for the theoretical pulse function. Table I compares the theoretical values of the ideal pulse function  $x_{\text{ideal}}(t)$  ( $\beta = 1$ ) in the interval  $-T_b \leq t \leq 0$  to the approximated values  $x_{\text{approx}}(t)$  delivered by the system for a  $0 \rightarrow 1$  transition at  $t = -T_b/2$ . Fig. 1 illustrates the pulse shaping for a normalized  $0 \rightarrow 1 \rightarrow 0$  transition with an ideal hold time of  $T_s = T_b/4$ . In the implemented system, this theoretical staircase signal is additionally smoothed by the analog bandwidth limitation.
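The comparison between ideal and approximated values can be reproduced numerically. The following minimal sketch evaluates (2) for $\beta = 1$ at the five sampling instants spaced by $T_s = T_b/4$ and compares the result with the output levels 0, 1/6, 1/2, 5/6 and 1 listed in Table II. The function name `raised_cosine` and the normalization $T_b = 1$ are assumptions made for illustration only.

```python
import math

def raised_cosine(t, T_b=1.0, beta=1.0):
    """Raised-cosine pulse x(t) according to (2), with removable singularities handled."""
    u = t / T_b
    # sin(pi*u)/(pi*u) tends to 1 for u -> 0
    sinc = 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)
    denom = 1.0 - 4.0 * beta**2 * u**2
    if abs(denom) < 1e-12:
        # cos(pi*beta*u)/(1 - 4*beta^2*u^2) tends to pi/4 for |u| -> 1/(2*beta)
        return sinc * math.pi / 4.0
    return sinc * math.cos(math.pi * beta * u) / denom

# Sampling instants in units of T_b and the approximated levels of Table II
times = [-1.0, -0.75, -0.5, -0.25, 0.0]
approx = [0.0, 1 / 6, 1 / 2, 5 / 6, 1.0]

ideal = [raised_cosine(t) for t in times]
deviation = [abs(i - a) for i, a in zip(ideal, approx)]
worst = max(range(len(times)), key=lambda k: deviation[k])

for t, i, a, d in zip(times, ideal, approx, deviation):
    print(f"t = {t:+.2f} T_b   ideal = {i:.4f}   approx = {a:.4f}   |diff| = {d:.4f}")
```

Under these assumptions, the largest deviation occurs at $t = -T_b/4$, in agreement with the value of about 0.016 used in Section III.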

#### III. DIGITAL-TO-ANALOG CONVERSION

Fig. 2. DAC output structure (SE) with non-binary weighting (external load resistor not shown). It converts the 3-bit input  $(b_0, b_1, b_2)$  generated by the digital processing unit to an analog output signal  $V_{out}$  approximating a raised-cosine response. One stage comprises a CMOS style inverter (see right part) and a series resistor. The different output driver capabilities are depicted as w, 2w and 3w.

The concept presented here uses an upsampling factor of four, i.e. a symbol rate of 24 GS/s is generated out of an input signal of 6 Gbit/s. Consequently, five different analog output values are required to approximate raised-cosine filtering. To keep the circuitry small and power consumption low, a special, application-specific DAC with non-binary weighting is applied as depicted in Fig. 2 instead of usual binary weighting. More precisely, quantization steps are adapted to a sampled raised-cosine pulse shape. Owing to this weighting, the implementation requires a very low number of stages. Considering the largest deviation between the ideal value and the approximated value in Table I of about 0.016 for  $t = -\frac{T_b}{4}$ , a voltage precision of about  $6.4 \,\mathrm{mV}$  would be required for a single-ended (SE) peak-to-peak value of  $V_{\text{pp,SE}} = 400 \,\text{mV}$ . Generally, this would be reached by a common binary weighted DAC with at least 6 bits. This comparison illustrates the efficient structure of the proposed application-centric design. In the system shown here, six  $6R = 300 \,\Omega$  output stages are connected to form the  $2R$-$3R$-$6R$  structure where  $R = 50 \,\Omega$  is defined by the requirement of  $50\,\Omega$  output matching ( $100\,\Omega$  differential). Another aspect of the chosen architecture is the implementation of a voltage based DAC using CMOS style inverters with series resistors as output drivers instead of commonly used current weighting.
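The numbers of this section can be checked with a few lines of arithmetic. The sketch below assumes ideal resistors and assigns one, two and three parallel $6R$ unit stages to $b_0$, $b_1$ and $b_2$, which follows from the driver strengths w, 2w and 3w in Fig. 2. It computes the output resistance of the network, the resulting bit weights and the resolution a binary weighted DAC would need for the same precision. All variable names are illustrative.

```python
import math

R = 50.0                 # ohm, defined by the output matching requirement
UNIT = 6 * R             # ohm, one unit output stage (6R = 300 ohm)

# Number of parallel unit stages per bit (driver strengths w, 2w, 3w)
stages = {"b0": 1, "b1": 2, "b2": 3}

# Branch resistances: 6R, 3R and 2R
branch_res = {bit: UNIT / n for bit, n in stages.items()}

# All branches appear in parallel at the output node
g_total = sum(1.0 / r for r in branch_res.values())
r_out = 1.0 / g_total

# The weight of each bit is its share of the total conductance
weights = {bit: (1.0 / r) / g_total for bit, r in branch_res.items()}

# Required precision and equivalent binary resolution
v_pp_se = 0.400          # V, single-ended peak-to-peak swing
max_deviation = 0.016    # largest normalized deviation quoted from Table I
precision = max_deviation * v_pp_se
bits_binary = math.ceil(math.log2(v_pp_se / precision))

print(f"R_out = {r_out:.1f} ohm")
print({bit: round(w, 4) for bit, w in weights.items()})
print(f"precision = {precision * 1e3:.1f} mV, binary DAC needs >= {bits_binary} bits")
```

The parallel connection of $2R$, $3R$ and $6R$ yields $R = 50\,\Omega$, and the weights 1/6, 1/3 and 1/2 correspond to the bottom row of Table II.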

Several aspects have to be considered in the design of the output structure. First, the output inverters have an inherent output resistance  $r_{\rm DS}$  which has to be compensated for in the output stage resistors of the weighting network. Secondly, for matching reasons, all output paths are derived from the 6R path by parallelization, although this is not a constraint of the concept: adjusted unit cells with arbitrary weights are also possible. The different output driver capabilities are indicated by w, 2w and 3w in Fig. 2. At an external supply voltage of  $800 \,\mathrm{mV}$ , the SE output voltage swing for  $50 \,\Omega$  termination ranges from  $200 \,\mathrm{mV}$  to  $600 \,\mathrm{mV}$  with a common mode voltage of  $V_{\rm cm, SE} = 400 \,\mathrm{mV}$ .
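The stated output swing can be reproduced with a simple Thevenin model. The sketch below rests on two assumptions that are not given in the text: the inverters switch ideally between 0 V and the supply voltage, and the $50\,\Omega$ termination is referenced to the common mode voltage of 400 mV (an AC-coupled load would behave equivalently for the signal). The value of `R_DS_UNIT` is purely hypothetical and only illustrates how the series resistor of a unit stage would be reduced to compensate for the inverter output resistance.

```python
VDD = 0.800      # V, supply voltage from the text
R_OUT = 50.0     # ohm, output resistance of the 2R-3R-6R network
R_TERM = 50.0    # ohm, external termination
V_TERM = 0.400   # V, ASSUMPTION: termination referenced to the common mode voltage

WEIGHTS = (1 / 6, 1 / 3, 1 / 2)   # weights of b0, b1, b2 (Table II)

def output_voltage(b0, b1, b2):
    """Loaded single-ended output voltage for one 3-bit code (ideal Thevenin model)."""
    v_open = VDD * (b0 * WEIGHTS[0] + b1 * WEIGHTS[1] + b2 * WEIGHTS[2])
    # Resistive divider between the Thevenin source and the termination voltage
    return (v_open * R_TERM + V_TERM * R_OUT) / (R_OUT + R_TERM)

# The five valid codes of Table II in ascending order of output level
codes = [(0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]
levels_mv = [1e3 * output_voltage(*c) for c in codes]
print([round(v, 1) for v in levels_mv])

# HYPOTHETICAL inverter output resistance of one unit stage, not taken from the paper
R_DS_UNIT = 60.0
r_series_unit = 6 * 50.0 - R_DS_UNIT   # series resistor so that r_DS + R_series = 6R
print(f"series resistor per unit stage: {r_series_unit:.0f} ohm")
```

With these assumptions the five levels span 200 mV to 600 mV around 400 mV, which matches the swing and common mode voltage given above.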

#### IV. DIGITAL PROCESSING

Fig. 3. Simplified block diagram of the system (external load resistor not shown). Most of the required buffers are neglected for simplification. The digital-to-analog converter (D/A) contains a differential version of the structure of Fig. 2. For proper timing, the coded signal  $b_0$ ,  $b_1$ ,  $b_2$  is sampled by flip-flops (FF) before digital-to-analog conversion. Clock and signal inputs consist of termination resistors followed by CMOS inverters.

Sampling and coding are performed in the digital processing unit depicted in Fig. 3, which shows the simplified block diagram of the whole fully differential system. The external input data is sampled at a clock frequency of 6 GHz and fed into a unidirectional shift register. A phase switch allows controlling the setup time of the first sampling flip-flop. For any transition of the input signal, the shift register state changes in a thermometer code manner in the 24 GHz domain. The following logic maps this state to the 3-bit digital output  $b_0$ ,  $b_1$ ,  $b_2$ . Table II gives an overview of this allocation. Due to the direct digital synthesis of the approximated raised-cosine response, neither a lookup table (and therefore no memory) nor a high order FIR filter has to be implemented.
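The behavior of the shift register can be illustrated with a short behavioral model. The sketch below assumes ideal timing, a register that starts in the all-zero state and a four-stage register in which each 6 Gbit/s input bit is shifted in four times at 24 GHz, with $a_0$ as the first stage. The mapping from register state to output level is taken from Table II. The function name `shape` is hypothetical.

```python
from fractions import Fraction

UPSAMPLING = 4   # 24 GS/s output rate / 6 Gbit/s input rate

# Table II: shift register state (a0, a1, a2, a3) -> normalized output level
LEVEL = {
    (0, 0, 0, 0): Fraction(0),
    (1, 0, 0, 0): Fraction(1, 6),
    (1, 1, 0, 0): Fraction(1, 2),
    (1, 1, 1, 0): Fraction(5, 6),
    (1, 1, 1, 1): Fraction(1),
    (0, 1, 1, 1): Fraction(5, 6),
    (0, 0, 1, 1): Fraction(1, 2),
    (0, 0, 0, 1): Fraction(1, 6),
}

def shape(bits):
    """Return the normalized 24 GS/s staircase for a 6 Gbit/s bit sequence."""
    state = (0, 0, 0, 0)   # assumed initial register content
    samples = []
    for bit in bits:
        for _ in range(UPSAMPLING):
            # The new sample enters a0, older samples move towards a3
            state = (bit,) + state[:3]
            samples.append(LEVEL[state])
    return samples

print([str(v) for v in shape([0, 1, 0])])
```

Because every input bit occupies four consecutive clock cycles and the register holds four samples, at most one transition is inside the register at any time. Only the eight thermometer-like states of Table II can therefore occur, so the dictionary lookup never fails for a binary input sequence.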

TABLE II Allocation of shift register states to 3-bit digital output code.

| Serial code in shift register |       |       |       | Output code |       |       |                         |
|-------------------------------|-------|-------|-------|-------------|-------|-------|-------------------------|
| $a_0$                         | $a_1$ | $a_2$ | $a_3$ | $b_0$       | $b_1$ | $b_2$ | $x_{\text{approx}}(t)$  |
| 0                             | 0     | 0     | 0     | 0           | 0     | 0     | 0                       |
| 1                             | 0     | 0     | 0     | 1           | 0     | 0     | 1/6                     |
| 1                             | 1     | 0     | 0     | 0           | 0     | 1     | 1/2                     |
| 1                             | 1     | 1     | 0     | 0           | 1     | 1     | 5/6                     |
| 1                             | 1     | 1     | 1     | 1           | 1     | 1     | 1                       |
| 0                             | 1     | 1     | 1     | 0           | 1     | 1     | 5/6                     |
| 0                             | 0     | 1     | 1     | 0           | 0     | 1     | 1/2                     |
| 0                             | 0     | 0     | 1     | 1           | 0     | 0     | 1/6                     |
| Weight:                       |       |       |       | 1/6         | 1/3   | 1/2   | -                       |

From this allocation, the logic functions (3) to (5) can be derived. They are realized in differential cascode voltage switch logic [12]. The terms in parentheses denote dummy gates that are included for symmetry reasons.

$$b_0 = a_0 \cdot \overline{a_1} + a_0 \cdot a_3 + \overline{a_2} \cdot a_3 \tag{3}$$

$$b_1 = a_1 \cdot a_2 \left( +1 \cdot 0 + 1 \cdot 0 \right) \tag{4}$$

$$b_2 = a_0 \cdot a_1 + a_2 \cdot a_3 \left(+1 \cdot 0\right) \tag{5}$$
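The logic functions can be checked exhaustively against Table II. The following sketch implements (3) to (5) without the dummy terms, which do not change the logic value, and verifies for all eight register states that both the output code and the weighted sum of the bits agree with the table. The function name `encode` is chosen for illustration.

```python
from fractions import Fraction

WEIGHTS = (Fraction(1, 6), Fraction(1, 3), Fraction(1, 2))   # weights of b0, b1, b2

def encode(a0, a1, a2, a3):
    """Logic functions (3) to (5) without dummy terms; all inputs are 0 or 1."""
    not_a1, not_a2 = 1 - a1, 1 - a2
    b0 = (a0 & not_a1) | (a0 & a3) | (not_a2 & a3)
    b1 = a1 & a2
    b2 = (a0 & a1) | (a2 & a3)
    return b0, b1, b2

# Rows of Table II: register state, expected output code, expected output level
TABLE_II = [
    ((0, 0, 0, 0), (0, 0, 0), Fraction(0)),
    ((1, 0, 0, 0), (1, 0, 0), Fraction(1, 6)),
    ((1, 1, 0, 0), (0, 0, 1), Fraction(1, 2)),
    ((1, 1, 1, 0), (0, 1, 1), Fraction(5, 6)),
    ((1, 1, 1, 1), (1, 1, 1), Fraction(1)),
    ((0, 1, 1, 1), (0, 1, 1), Fraction(5, 6)),
    ((0, 0, 1, 1), (0, 0, 1), Fraction(1, 2)),
    ((0, 0, 0, 1), (1, 0, 0), Fraction(1, 6)),
]

mismatches = []
for state, code, level in TABLE_II:
    got_code = encode(*state)
    got_level = sum(b * w for b, w in zip(got_code, WEIGHTS))
    if got_code != code or got_level != level:
        mismatches.append(state)

print("all rows of Table II reproduced" if not mismatches else mismatches)
```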

Compared to area intensive passive analog filters, the concept can be integrated on a very small chip area at the expense of a higher clock frequency due to upsampling and higher power consumption.

#### V. MEASUREMENT RESULTS

A photograph of the bonded die in the RF board cavity is shown in Fig. 4 with outer chip dimensions of about  $1450\,\mu m \times 1400\,\mu m$  containing the pulse shaping core, supply voltage block capacitors and other circuitry not presented here. The core size itself is only about  $45\,\mu m \times 105\,\mu m$  (marked in Fig. 4) including only part of the supply voltage block capacitors. Fig. 5 illustrates the measurement setups for time and frequency domain measurements.

A pulse pattern generator (PPG) is used as external PRBS generator clocked by the circuit's clock output  $(f_{clk}/4)$ .



Fig. 4. Photograph of the bonded die in the RF board cavity. The pulse shaping core is marked in white color.





Fig. 5. Measurement setups for differential time domain and SE frequency domain measurements (PT: prescale trigger, PR: phase reference).

For differential time domain measurements, a two-channel subsampling oscilloscope with a sampling module bandwidth of 70 GHz is used running in a combined prescale trigger (PT) and phase reference (PR) mode for low jitter measurements (< 100 fs RMS system jitter). The chip clock output also serves as input for the PT whereas the PR is driven at the input clock frequency  $f_{\rm clk}$ . Frequency domain measurements are performed with a single-channel spectrum analyzer and therefore with the SE output signal. To monitor proper chip functionality during frequency measurements, the second SE output channel is simultaneously observed with the subsampling oscilloscope. Resolution and video bandwidth of the spectrum analyzer are set to 1 kHz each and measurements are performed with a sweep time of 1000 s.

In Fig. 6, a differential eye diagram in persistence mode as well as the frequency domain measurement results compared to theoretical and simulated spectra are presented for a PRBS-9 input bit sequence (meaning the length of the sequence is  $N = 2^9 - 1$ ). On the one hand, the eye diagram shows the pulse shaping at 24 GS/s with different DAC output voltage levels which are smoothed due to the system's bandwidth limitation given by the core itself as well as the on-chip output line, output pads, bond wires, RF board, cables and connectors. The RMS jitter is less than 1.4 ps. Furthermore, the measurement validates proper functionality of the shaping concept because only the five valid voltage levels as well as transitions of neighboring levels appear in the eye diagram. On the other hand, the measured spectra (Fig. 6b) confirm the theoretically expected side lobe suppression of about 20 dB indicated in the figure illustrating the spectral shaping ability of the concept. The remaining spectral side lobes above 18 GHz can be easily suppressed by an integrated analog low pass filter with very relaxed demands concerning attenuation slope. It has to be mentioned that the measured spectra of the PPG signal as well as the one of the DAC output are affected each by an additional bandwidth limitation which is why their magnitudes are well below the theoretical curves with an assumed infinite bandwidth at higher frequencies. Moreover, the simulated spectrum only considers the extracted core including resistive as well as capacitive parasitics without the output structure.

To conclude, the measurements show the theoretically predicted side lobe suppression of about 20 dB, which illustrates the spectral shaping ability of the concept.
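The order of magnitude of this suppression can be estimated from the output levels of Table II alone. For the register states that actually occur, the ideal staircase is identical to the held 6 Gbit/s bit stream filtered by a four-tap FIR filter at 24 GS/s whose taps are the differences of consecutive levels, i.e. 1/6, 1/3, 1/3 and 1/6. This equivalence is derived here for illustration and is not stated in the paper. The sketch assumes an ideal staircase without analog bandwidth limitation and evaluates the filter gain near the centers of the side lobes of an unshaped 6 Gbit/s signal, which has spectral nulls at multiples of 6 GHz.

```python
import cmath
import math

F_S = 24e9   # Hz, sampling rate of the staircase

# Output levels for a rising transition (Table II) and the equivalent FIR taps
levels = (0.0, 1 / 6, 1 / 2, 5 / 6, 1.0)
taps = [hi - lo for lo, hi in zip(levels, levels[1:])]   # 1/6, 1/3, 1/3, 1/6

def gain_db(f):
    """Magnitude response of the equivalent FIR filter at frequency f in dB."""
    w = 2 * math.pi * f / F_S
    h = sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps))
    return 20 * math.log10(abs(h))

# Approximate centers of the first three side lobes of the unshaped signal
for f in (9e9, 15e9, 21e9):
    print(f"{f / 1e9:4.0f} GHz: {gain_db(f):6.1f} dB")
```

In this idealized model the side lobes around 9 GHz and 15 GHz are attenuated by roughly 25 dB relative to the unshaped signal, whereas the lobe around 21 GHz is attenuated by less than 3 dB. This is consistent with the observation that side lobes above 18 GHz remain and are left to a relaxed analog low pass filter.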

Finally, the overall average power consumption for a supply voltage of  $800 \,\mathrm{mV}$  is about  $30.4 \,\mathrm{mW}$ . For lower sampling rates such as  $20 \,\mathrm{GS/s}$  and  $16 \,\mathrm{GS/s}$  it is  $27.2 \,\mathrm{mW}$  and  $24.0 \,\mathrm{mW}$ , respectively. The CMOS implementation as well as the efficient approximated data processing at low hardware effort are the key aspects for the system's compactness and low power consumption.
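The three measured operating points lie on a straight line, which can be verified quickly. The sketch below fits a linear model to the power values given above and reports the energy per output sample. Interpreting the slope as a rate-dependent contribution and the intercept as a rate-independent contribution is an assumption of this sketch, since only three points are available and the paper does not give such a breakdown.

```python
# (sampling rate in GS/s, measured average power in mW) as stated in the text
points = [(16.0, 24.0), (20.0, 27.2), (24.0, 30.4)]

(f_lo, p_lo), (f_hi, p_hi) = points[0], points[-1]

# Slope in mW per GS/s, which is numerically equal to pJ per sample
slope = (p_hi - p_lo) / (f_hi - f_lo)
# Extrapolated power at zero sampling rate in mW
offset = p_lo - slope * f_lo

# Largest deviation of any measured point from the straight line
residual = max(abs(offset + slope * f - p) for f, p in points)

# Total energy per output sample at 24 GS/s in pJ
energy_per_sample_pj = 30.4 / 24.0

print(f"slope = {slope:.2f} pJ/sample, offset = {offset:.1f} mW")
print(f"max residual = {residual:.2e} mW")
print(f"energy per sample at 24 GS/s = {energy_per_sample_pj:.2f} pJ")
```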

#### VI. CONCLUSION

In this work, an efficient CMOS pulse shaping circuit for digital signals including digital-to-analog conversion with adapted quantization steps up to  $24 \,\mathrm{GS/s}$  is presented. From a more holistic and application-centric point of view, this customized digital-to-waveform approach aims at generating a raised-cosine filtered digital signal efficiently by using a reduced number of bits. Spectral side lobes can be suppressed by more than 20 dB in the given example. This suppression is especially important to prevent interference with other systems. The concept has potential for much higher sampling rates. This can be achieved by adapting setup timings and by implementing an initialization circuit for the first clock divider. Furthermore, the concept can be adapted to other spectral requirements and complemented by an application-specific on-chip data source. All system parts are compatible with common static CMOS logic. Due to the compact and efficient implementation using only simple and basic circuit elements in one voltage domain, the circuit is a favorable design for e.g. CMOS based radar systems omitting a high-resolution, high power consumption DAC or a large analog filter, respectively, for spectral shaping.

Fig. 6. Measurement results at 24 GS/s of (a) time domain and (b) frequency domain measurements for a PRBS-9 input signal.

#### ACKNOWLEDGMENT

We thank Ingmar Kallfass, Sébastien Chartier and Athanasios Gatzastras from ILH at the University of Stuttgart for the valuable discussions on pseudo-random noise radar transmitters and for supporting a shared MPW.

#### REFERENCES

- [1] N. S. Alagha and P. Kabal, "Generalized raised-cosine filters," *IEEE Transactions on Communications*, vol. 47, no. 7, pp. 989–997, 1999.
- [2] B. Murmann and B. Hoefflinger, Eds., NANO-CHIPS 2030: On-Chip AI for an Efficient Data-Driven World, ser. The Frontiers Collection. Cham: Springer, 2020.
- [3] W.-P. Zhu, M. O. Ahmad, and M. N. S. Swamy, "ASIC implementation architecture for pulse shaping FIR filters in 3G mobile communications," in *IEEE International Symposium on Circuits and Systems. Proceedings* (Cat. No.02CH37353), 2002.
- [4] R. Schmogrow, S. Ben-Ezra, P. C. Schindler *et al.*, "Pulse-Shaping With Digital, Electrical, and Optical Filters—A Comparison," *J. Lightwave Technol.*, vol. 31, no. 15, pp. 2570–2577, 2013.
- [5] H. J. Ng, R. Feger, and A. Stelzer, "A Fully-Integrated 77-GHz UWB Pseudo-Random Noise Radar Transceiver With a Programmable Sequence Generator in SiGe Technology," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 61, no. 8, pp. 2444–2455, 2014.
- [6] J. Hasch, E. Topak, R. Schnabel *et al.*, "Millimeter-Wave Technology for Automotive Radar Sensors in the 77 GHz Frequency Band," *IEEE Transactions on Microwave Theory and Techniques*, vol. 60, no. 3, pp. 845–860, 2012.
- [7] J. Hasch, "Driving towards 2020: Automotive radar technology trends," in *IEEE MTT-S International Conference on Microwaves for Intelligent Mobility (ICMIM)*, 2015.
- [8] M. Sadegh Dadash, S. Bonen, U. Alakusu, D. Harame, and S. P. Voinigescu, "DC-170 GHz Characterization of 22nm FDSOI Technology for Radar Sensor Applications," in *13th European Microwave Integrated Circuits Conference (EuMIC)*, 2018, pp. 158–161.
- [9] S. N. Ong, S. Lehmann, W. H. Chow et al., "A 22nm FDSOI Technology Optimized for RF/mmWave Applications," in *IEEE Radio Frequency Integrated Circuits Symposium (RFIC)*, 2018, pp. 72–75.
- [10] A. Margomenos, "A Comparison of Si CMOS and SiGe BiCMOS Technologies for Automotive Radars," in *IEEE Topical Meeting on Silicon Monolithic Integrated Circuits in RF Systems*, 2009.
- [11] J. G. Proakis and M. Salehi, Eds., *Digital communications*, 5th ed. McGraw-Hill, 2014.
- [12] L. Heller, W. Griffin, J. Davis, and N. Thoma, "Cascode voltage switch logic: A differential CMOS logic family," in *1984 IEEE International Solid-State Circuits Conference. Digest of Technical Papers*, 1984, pp. 16–17.