High Temperature CMOS Silicon Carbide Asynchronous Circuit Design

Landon John Caley

University of Arkansas, Fayetteville



A dissertation submitted in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy in Computer Engineering

By

Landon Caley
University of Arkansas
Bachelor of Science in Computer Engineering, 2010

May 2015
University of Arkansas

This dissertation is approved for recommendation to the Graduate Council.

Dr. Jia Di
Dissertation Director

Dr. H. Alan Mantooth
Committee Member
Dr. Dale Thompson
Committee Member

Dr. J. Patrick Parkerson
Committee Member
ABSTRACT
Designing a digital circuit to operate across an extreme temperature range is a challenge with increasing demand for a solution. Large variations in temperature have a distinct impact on electron mobilities, causing substantial changes to the threshold voltage of the devices. These physical changes affect the setup and hold times of clocked components, such as D flip-flops, in a traditional synchronous digital circuit. Focusing primarily on high temperature circuit operation, this dissertation presents a digital circuit design methodology that pairs an asynchronous circuit design paradigm called NULL Convention Logic (NCL), as well as traditional Boolean circuitry, with a wide-bandgap semiconductor material, Silicon Carbide (SiC). A total of nineteen circuits have been designed and fabricated. Chip testing results show correct operation for all circuits returned from fabrication, with most performing at or above the targeted temperature of 300°C.
ACKNOWLEDGEMENTS

I would like to thank my advisor, Dr. Jia Di, for all of his guidance and support through the most enjoyable years of my academic career. The opportunity he has given me to pursue an advanced degree will prove to be an invaluable resource throughout my entire career. I would also like to thank my parents, Tim and Susan Caley, who provided encouragement along every step of the way. Finally, I would like to thank my colleagues at the University of Arkansas who were pivotal in developing the technical skills to make this work possible.
DEDICATION

To my beautiful wife, Muna. Your sacrifices, encouragement, and steadfast love have made this possible. Thank you for your patience and unwavering support.
# TABLE OF CONTENTS

1. **Introduction**  
   1.1. Problem  
   1.2. Dissertation Statement  
   1.3. Objectives  
   1.4. Dissertation Organization  

2. **Background**  
   2.1. NULL Convention Logic (NCL)  
      2.1.1. NCL Gates  
      2.1.2. NCL Registers  
   2.2. Silicon Carbide (SiC)  
      2.2.1. Wide-Bandgap Semiconductors  
      2.2.2. Design Challenges and Benefits  
      2.2.3. Existing SiC Digital ICs  

3. **Approach**  
   3.1. Design Challenges  
      3.1.1. Channel Routing  
      3.1.2. POLY Crossovers  
   3.2. Design Flow  
   3.3. Tapeout 1  
      3.3.1. Cell Layouts  
      3.3.2. Tapeout 1 Circuits  
   3.4. Tapeout 2  
      3.4.1. Cell Layouts  
      3.4.2. Tapeout 2 Circuits  
         3.4.2.1. Design for Testability (DFT)  
         3.4.2.2. NCL Flyback Controller  
         3.4.2.3. Synchronous DAC Controller FSM  
   3.5. Physical Testing Setup  
      3.5.1. Probe Testing  
      3.5.2. Packaged Testing  
         3.5.2.1. Packaging  
         3.5.2.2. FPGA Testing Implementation  
         3.5.2.3. Level Shifter PCB  

4. **Results and Analysis**  
   4.1. Tapeout 1  
      4.1.1. Simulation Results  
      4.1.2. Physical Testing Results  
      4.1.3. Yield Issues  
         4.1.3.1. Circuit Functionality  
         4.1.3.2. Packaging Difficulty  
         4.1.3.3. DRC Concessions for Tapeout 2  
      4.1.4. Tapeout 1 Analysis  
   4.2. Tapeout 2  
      4.2.1. Simulation Results  

5. **Conclusions**  
   5.1. Summary  
LIST OF TABLES

Table 1: Dual-Rail Encoding
Table 2: Quad-Rail Encoding
Table 3: 27 Fundamental NCL Gates
Table 4: Bandgap of Common Semiconductors
Table 5: Test Cell Transistor Combinations
Table 6: Cells Selected for Transistor Sizing Analysis
LIST OF FIGURES

Figure 1: TH23 Gate
Figure 2: TH34w32 Gate
Figure 3: TH13b Gate
Figure 4: TH33n Gate
Figure 5: TH22 Transistor-Level Schematic
Figure 6: Single-Bit Dual-Rail NCL Register
Figure 7: 8-bit NCL Register with Completion Logic
Figure 8: Single-Stage NCL Pipeline
Figure 9: Electrons and holes as carriers through 2-D representation of silicon crystal lattice structure
Figure 10: Energy-Band Diagram
Figure 11: Traditional Routing versus Channel Routing
Figure 12: POLY Crossovers in Circuit Core
Figure 13: POLY Crossovers in Pad Ring
Figure 14: TH12 Cell using Tapeout 1 Layout Style
Figure 15: 8+4×4 NCL Multiply Accumulate Unit
Figure 16: NCL Counter
Figure 17: Boolean FSM
Figure 18: NCL RCA
Figure 19: Boolean RCA
Figure 20: Ring Oscillator
Figure 21: RO (Probe Pad)
Figure 22: SR (Trans.)
Figure 23: SR (NAND)
Figure 24: SR (Static)
Figure 25: Boolean Library
Figure 26: NCL Library 1
Figure 54: NCL RCA Simulation Waveforms
Figure 55: Boolean FSM Physical Testing Data
Figure 56: NCL Counter Physical Testing Data
Figure 57: Boolean RCA Physical Testing Data
Figure 58: Transmission Gate Shift Register Physical Testing Data
Figure 59: Average Ring Oscillator Physical Testing Data
Figure 60: Oscilloscope Waveforms of Boolean RCA Operating at 300°C
Figure 61: Oscilloscope Waveforms of NCL Counter Operating at 300°C
Figure 62: Oscilloscope Waveforms of Boolean FSM Operating in excess of 300°C
Figure 63: Oscilloscope Waveforms Showing Correct Operation of the NCL MAC at room temperature
Figure 64: DAC Controller PEX Simulation Results
Figure 65: Flyback Controller PEX Simulation Results – Control and Test Signals
Figure 66: Flyback Controller PEX Simulation Results – Lambda On A Inputs
Figure 67: Flyback Controller PEX Simulation Results – Lambda On B Inputs
Figure 68: Flyback Controller PEX Simulation Results – Lambda On C Inputs
Figure 69: Flyback Controller PEX Simulation Results – Lambda On D Inputs
Figure 70: Flyback Controller PEX Simulation Results – Lambda Off A Inputs
Figure 71: Flyback Controller PEX Simulation Results – Lambda Off B Inputs
Figure 72: Flyback Controller PEX Simulation Results – Lambda Off C Inputs
Figure 73: Flyback Controller PEX Simulation Results – Lambda Off D Inputs
Figure 74: Microscope View of NCL MAC Undergoing Testing with Multi-Contact Wedge Probes
1. Introduction

1.1. Problem
As the need for high temperature digital circuitry increases, so does the demand for a solution to the challenges associated with designing these circuits. Traditionally, for a circuit to operate correctly at room temperature as well as at a target temperature, complex control mechanisms are required. With applications emerging in automotive, aerospace, and power electronics industries, high temperature circuit design has become a lucrative topic for researchers. As the temperature increases, the electron mobilities inherent to the semiconductor material change, creating a drastic swing in the threshold voltage of the devices. In turn, this causes the delicate setup and hold times of synchronous components to change. This change in timing may be accounted for during the synthesis process, but this method is only effective over a narrow temperature range. As a result, special care must be taken to control and protect currently available circuitry from these extreme temperatures, resulting in degraded efficiency and reliability.

A common solution for analog circuit designers is to use a wide-bandgap semiconductor rather than traditional Silicon (Si). In recent years, Silicon Carbide (SiC) has been rapidly increasing in popularity as an alternative semiconductor material due to its resilience to high temperatures [1]. Throughout this work, all references to SiC imply the 4H polytype. Higher operating temperatures allow the circuit to function in hotter ambient conditions, such as in an automotive or aerospace engine compartment, with the advantage of reduced active cooling requirements [2]. However, SiC is still a developing technology, and SiC processes often offer only NMOS devices and a single metal layer, and lack reliable device models. Therefore, these processes are rarely suited for digital integrated solutions.

Still under active development, the High Temperature Silicon Carbide (HTSIC) process, developed by Raytheon, offers many attractive features for digital integrated circuits (ICs). It is a 1.2µm CMOS process (i.e., with both NMOS and PMOS transistors) with one metal layer, two POLY layers (one highly resistive), a 15V nominal voltage, and an N-type substrate, and it has been proven at temperatures above 350°C [3]. It is also the first CMOS SiC process with a high-fidelity Process Design Kit (PDK), developed by [4].

Even though SiC is a reliable semiconductor material for high temperature applications, it does not resolve the timing problem inherent to clocked synchronous digital circuits. NULL Convention Logic (NCL) is an asynchronous quasi delay-insensitive (QDI) architecture introduced in [5]. By eliminating the need for a clock signal, sequential components no longer depend on rigid timing requirements for correct circuit operation. NCL is a correct-by-construction architecture controlled by handshaking signals; thus, as long as the transistors retain functionality, the circuit will continue to operate error-free. NCL circuits are composed of 27 fundamental threshold gates with hysteresis, which ensures that all signals have arrived before a gate transitions. Most NCL circuits are encoded in a dual-rail configuration; however, quad-rail encodings are not unusual.

1.2. Dissertation Statement

The goal of this dissertation is to develop a design methodology using an automated design flow that is capable of producing digital ICs with proven operation from room temperature to 300°C.
1.3. Objectives

This design methodology was developed and utilized during the completion of a two-tapeout grant from the National Science Foundation (NSF). The objective of these tapeouts was to create an intelligent integrated gate driver capable of stable operation up to 300°C. Arkansas Power Electronics, Incorporated (APEI) provided the specification guiding circuit design decisions. The digital components of this system make up the control logic for various functions of the circuit. The first tapeout primarily focused on building block circuits and test structures to learn the process strengths and limitations as well as device behavior. The second tapeout focused on creating the components for the intelligent gate driver. This was done in five discrete chips that are to be integrated at the package level.

1.4. Dissertation Organization

Chapter 2 provides background information introducing the main enablers of this work: NCL and SiC. Chapter 3 contains a detailed report on the approach taken to design these circuits using the proposed design methodology. Chapter 4 presents simulation results for both tapeouts as well as physical testing results for Tapeout 1, along with an analysis of this data. Chapter 5 summarizes the findings and concepts discussed in this dissertation and examines future possibilities for this work.
2. Background

2.1. NULL Convention Logic (NCL)

NCL achieves delay insensitivity by utilizing multi-rail logic encodings. Traditional Boolean circuitry utilizes two logic levels, logic1 = true and logic0 = false. NCL also utilizes these two data values; however, it requires a third value as well. Called the NULL state, this third logic value indicates the absence of data. These logic values, referred to as DATA1, DATA0, and NULL, are encoded in a one-hot scheme, which requires two wires per signal. These two wires, or rails, are mutually exclusive; thus, asserting both rails simultaneously is forbidden. This scheme is referred to as dual-rail logic, and its signal encodings can be seen in Table 1. The NULL state is used to flush all logic values out of the state-holding NCL gates, and indicates that DATA is not yet available.

<table>
<thead>
<tr>
<th></th>
<th>NULL</th>
<th>DATA0</th>
<th>DATA1</th>
<th>INVALID</th>
</tr>
</thead>
<tbody>
<tr>
<td>Rail0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>Rail1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
</tbody>
</table>

Table 1. Dual-Rail Encoding
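
The encoding in Table 1 can be sketched in a few lines of Python. This is only an illustrative model, not part of the design flow described in this dissertation; the names `DualRail`, `encode`, and `decode` are hypothetical.

```python
from enum import Enum


class DualRail(Enum):
    """Dual-rail values from Table 1, stored as (rail0, rail1)."""
    NULL = (0, 0)
    DATA0 = (1, 0)
    DATA1 = (0, 1)


def encode(bit):
    """Encode a Boolean value as DATA0 or DATA1."""
    return DualRail.DATA1 if bit else DualRail.DATA0


def decode(rail0, rail1):
    """Decode a rail pair; asserting both rails is the INVALID state."""
    if rail0 and rail1:
        raise ValueError("INVALID: both rails asserted")
    return DualRail((int(rail0), int(rail1)))
```

For example, `decode(0, 0)` yields `DualRail.NULL`, indicating that no data is present yet, while `decode(1, 1)` raises an error because the rails are mutually exclusive.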

Dual-rail encoding is the most common form; however, other encodings are possible. For example, quad-rail logic is not uncommon and offers some gained efficiency in certain situations. Much like its dual-rail counterpart, quad-rail logic encodes a signal with four values in a mutually exclusive one-hot relationship. All encodings not found in Table 2 are considered invalid for quad-rail logic.
Signals encoded in a multi-rail format are typically denoted in one of two ways. Most commonly used in the design tool flow, the first method refers to a signal as \( \text{name}(i).\text{rail}<X> \), where \( \text{name} \) corresponds to the designation given to the signal, \( i \) labels the index of the signal within a bus, if applicable, and \( X \) is the indicator for the particular rail/wire being referenced. For example, \( Z(2).\text{rail}0 \) indicates signal \( Z \), bit 2, wire 0. The second method, which is more convenient for logic equations, uses the form \( \text{name}^X_i \); for instance, \( Z^0_2 \) is how the previous signal would be denoted in this notation. Throughout this dissertation, the first notation is used.

### 2.1.1. NCL Gates

NCL circuits are composed of 27 fundamental logic gates. Called threshold gates, each
of these gates follows a simple formula. Each gate transitions from logic0 to logic1 only
when a certain threshold of asserted inputs is achieved. Due to the naming convention
implemented with these logic gates, it is simple to understand which function each gate
performs. Denoted \( THmn \), where \( 1 \leq m \leq n \), these gates have \( n \) inputs. When at least \( m \)
of these inputs are asserted, the output transitions from logic0 to logic1. For example, a
TH23 is a three-input gate that requires two or more inputs to be asserted before the
output is asserted. The symbol for the TH23 is shown in Figure 1.
While all threshold gates follow the $THmn$ notation, there are several addendums to indicate special functionality. Most commonly, threshold gates use weighted inputs to perform the required function. This is noted by $THmnWx_1x_2...x_n$ where $1 < x \leq m$; inputs with a weight of one ($x = 1$) are omitted from the list. The values of $x_1, x_2, ..., x_n$ refer to the weights of the inputs in order, i.e., $x_1$ is the weight of input A, $x_2$ is the weight of input B, etc. For example, a TH34w32 is a gate with four inputs that asserts its output when a threshold of three is achieved. The A input has a weight of three and thus may assert the output by itself, while the B input has a weight of two and therefore requires only one other asserted input to assert the output. The C and D inputs have a weight of one and therefore are not indicated in the list of weights. This concept is most easily understood by studying the symbol assigned to weighted threshold gates, shown in Figure 2.
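
The weighted threshold rule can be illustrated with a short Python sketch. It models only the condition under which the output becomes asserted and ignores hysteresis; the function name `threshold_met` is hypothetical.

```python
def threshold_met(m, weights, inputs):
    """Return True when the summed weights of the asserted inputs reach m.

    weights lists the weight of every input in order (A, B, C, ...),
    including the implicit weights of one that the gate name omits.
    """
    total = sum(w for w, x in zip(weights, inputs) if x)
    return total >= m


# TH34w32: threshold 3, inputs A..D with weights 3, 2, 1, 1
TH34W32_WEIGHTS = (3, 2, 1, 1)
```

With these weights, `threshold_met(3, TH34W32_WEIGHTS, (1, 0, 0, 0))` is true because A alone reaches the threshold, whereas B needs one additional asserted input and C with D alone is not sufficient.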
Other addendums to the $THmn$ notation are required for the last two types of NCL gates, inverting and resettable gates. Inverting gates take the form $TH1nb$. Typically, inverting gates have a threshold of one and are almost identical to their $TH1n$ counterparts at the transistor level, simply omitting the output inverter. An example of an inverting gate is the TH13b. Resettable gates are used in NCL storage elements and typically are of the form $THnn$. Appending an $n$ or $d$ after $THnn$ indicates whether the gate resets to NULL or DATA. For example, a TH33n is a three-input resettable gate with a threshold of three that resets to ‘0’, while a TH33d resets to ‘1’. Figures 3 and 4 show examples of inverting and resettable threshold gates, respectively.

![Figure 3. TH13b Gate](image)

![Figure 4. TH33n Gate](image)

All threshold gates use hysteresis such that, once the output is asserted, all inputs must be deasserted before the output is deasserted. This is the functionality behind the previously mentioned NULL cycle that provides the delay-insensitive behavior of the NCL architecture. This suite of 27 gates, shown in Table 3, expresses all Boolean functions of four or fewer variables [5].
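
The hysteresis behavior can be modeled behaviorally as a small state machine. The following Python sketch is an assumption-laden illustration (the class name `ThresholdGate` and its interface are hypothetical) and does not represent the transistor-level implementation.

```python
class ThresholdGate:
    """Behavioral model of a THmn gate with hysteresis."""

    def __init__(self, m, weights):
        self.m = m
        self.weights = weights
        self.out = 0  # assume the gate starts deasserted (NULL)

    def update(self, inputs):
        total = sum(w for w, x in zip(self.weights, inputs) if x)
        if total >= self.m:
            self.out = 1          # threshold reached: assert the output
        elif not any(inputs):
            self.out = 0          # all inputs deasserted: deassert the output
        # otherwise the previous output is held (hysteresis)
        return self.out
```

For a TH23 (`ThresholdGate(2, (1, 1, 1))`), asserting two inputs sets the output, dropping back to a single asserted input holds it, and only the all-zero input pattern clears it.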
<table>
<thead>
<tr>
<th>NCL Gate</th>
<th>Boolean Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>TH12</td>
<td>A+B</td>
</tr>
<tr>
<td>TH22</td>
<td>AB</td>
</tr>
<tr>
<td>TH13</td>
<td>A+B+C</td>
</tr>
<tr>
<td>TH23</td>
<td>AB + AC + BC</td>
</tr>
<tr>
<td>TH33</td>
<td>ABC</td>
</tr>
<tr>
<td>TH23w2</td>
<td>A + BC</td>
</tr>
<tr>
<td>TH33w2</td>
<td>AB + AC</td>
</tr>
<tr>
<td>TH14</td>
<td>A+B+C+D</td>
</tr>
<tr>
<td>TH24</td>
<td>AB + AC + AD + BC + BD + CD</td>
</tr>
<tr>
<td>TH34</td>
<td>ABC + ABD + ACD + BCD</td>
</tr>
<tr>
<td>TH44</td>
<td>ABCD</td>
</tr>
<tr>
<td>TH24w2</td>
<td>A + BC + BD + CD</td>
</tr>
<tr>
<td>TH34w2</td>
<td>AB + AC + AD + BCD</td>
</tr>
<tr>
<td>TH44w2</td>
<td>ABC + ABD + ACD</td>
</tr>
<tr>
<td>TH34w3</td>
<td>A + BCD</td>
</tr>
<tr>
<td>TH44w3</td>
<td>AB + AC + AD</td>
</tr>
<tr>
<td>TH24w22</td>
<td>A + B + CD</td>
</tr>
<tr>
<td>TH34w22</td>
<td>AB + AC + AD + BC + BD</td>
</tr>
<tr>
<td>TH44w22</td>
<td>AB + ACD + BCD</td>
</tr>
<tr>
<td>TH54w22</td>
<td>ABC + ABD</td>
</tr>
<tr>
<td>TH34w32</td>
<td>A + BC + BD</td>
</tr>
<tr>
<td>TH54w32</td>
<td>AB + ACD</td>
</tr>
<tr>
<td>TH44w322</td>
<td>AB + AC + AD + BC</td>
</tr>
<tr>
<td>TH54w322</td>
<td>AB + AC + BCD</td>
</tr>
<tr>
<td>THxor0</td>
<td>AB + CD</td>
</tr>
<tr>
<td>THand0</td>
<td>AB + BC + AD</td>
</tr>
<tr>
<td>TH24comp</td>
<td>AC + BC + AD + BD</td>
</tr>
</tbody>
</table>

**Table 3. 27 Fundamental NCL Gates**

At the transistor level, NCL gates are constructed of four major blocks. These blocks, named *set*, *reset*, *hold0*, and *hold1*, each perform a specific function for the circuit. The *set* and *reset* blocks are responsible for controlling the logical function of the circuit, i.e., changing the output from logic0 to logic1 and logic1 to logic0, respectively. As their names imply, the *hold0* and *hold1* blocks maintain the value of the output once determined by the *set* and *reset* blocks. The hysteresis-inducing transistors are then placed in series with the *hold0* and *hold1* blocks. The *hold0* and *set* blocks form a complementary CMOS function while the *hold1* and *reset* blocks form another. A transistor-level representation of a TH22 gate may be found in Figure 5.

![TH22 Transistor-Level Schematic](image)

**Figure 5. TH22 Transistor-Level Schematic**
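
As a rough illustration of the four blocks, the sketch below writes out the commonly used block equations for a TH22 gate in Python and shows how the next output value follows from them. These equations are an assumption based on the standard static NCL gate formulation rather than a transcription of Figure 5, and all function names are hypothetical.

```python
def th22_set(a, b):
    return a and b                 # both inputs asserted

def th22_hold1(a, b):
    return a or b                  # at least one input still asserted

def th22_reset(a, b):
    return (not a) and (not b)     # both inputs deasserted

def th22_hold0(a, b):
    return (not a) or (not b)      # at least one input still deasserted

def th22_next(z_prev, a, b):
    """Next output: set, or hold the previous 1 while hold1 is true."""
    return bool(th22_set(a, b) or (z_prev and th22_hold1(a, b)))
```

Under these equations, *hold0* is the complement of *set* and *hold1* is the complement of *reset* for every input combination, which matches the complementary pairing described above.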

An additional requirement for NCL circuits to remain delay-insensitive is that they exhibit input-completeness and observability [5]. In order to be input-complete, a circuit must satisfy two requirements:

1) All inputs of the circuit must transition from NULL to DATA before any outputs may transition from NULL to DATA;

2) All inputs of the circuit must transition from DATA to NULL before any outputs may transition from DATA to NULL.
Additionally, the circuit must be fully observable. This means that every gate transition within the circuit must have an impact on an output of the circuit. Wires that transition but are not used to determine the output are called orphans. Orphans that do not cause a gate transition may be neglected by means of the isochronic fork assumption [5].

2.1.2. NCL Registers

In order to maintain delay insensitivity during sequential operation, NCL utilizes a specialized register. Comparable to a Boolean D-Flip Flop, these registers utilize resettable and inverting NCL gates, as mentioned previously. The schematic for a single-bit dual-rail register may be seen in Figure 6.

![Figure 6. Single-Bit Dual-Rail NCL Register](image)

In addition to holding a dual-rail signal, this register is capable of performing the handshaking necessary for asynchronous sequential operation by broadcasting which state the register is currently in, either DATA or NULL. This functionality is provided by the control signals $K_o$ and $K_i$. The register accepts $K_i$ as an input; this signal indicates whether DATA or NULL should be passed next. When $K_i$ is ‘1’, only DATA is allowed to pass. Conversely, when $K_i$ is ‘0’, the circuit must pass a NULL. $K_o$ acts as an acknowledgement and indicates which wavefront the register requires next. When a complete DATA wavefront is received, $K_o$ becomes ‘0’, communicating a Request for NULL ($rfn$). Similarly, when a complete NULL wavefront is received, $K_o$ rises, indicating a Request for DATA ($rfd$).

When multiple single-bit registers are combined to form an $n$-bit register, an additional circuit block is required to combine each $K_o$ bit into a single bit. This extra circuitry, called completion logic, consists of a tree of $THnn$ gates of height $\lceil \log_4 m \rceil$, where $m$ is the register width. For example, an 8-bit register would use a completion logic tree of height two, consisting of two $TH44$ gates and one $TH22$. The schematic for the 8-bit register and completion logic can be seen in Figure 7.
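
The structure of such a completion tree can be sketched as follows. This Python example assumes a simple greedy grouping into gates of at most four inputs, with a single leftover signal passed up to the next level unchanged; the actual grouping used in the fabricated circuits may differ, and `completion_tree` is a hypothetical helper.

```python
def completion_tree(width):
    """Return the THnn gate sizes per level needed to merge `width` Ko bits.

    Each level is a list of gate input counts, e.g. [4, 4] means two TH44 gates.
    """
    levels = []
    signals = width
    while signals > 1:
        full, rem = divmod(signals, 4)
        gates = [4] * full
        if rem >= 2:
            gates.append(rem)      # TH22 or TH33 for the remaining signals
        # rem == 1: the lone signal is forwarded to the next level as a wire
        levels.append(gates)
        signals = full + (1 if rem else 0)
    return levels
```

For an 8-bit register, `completion_tree(8)` gives `[[4, 4], [2]]`, i.e., two TH44 gates feeding one TH22, for a tree height of two.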

In order to understand sequential NCL circuit operation, it is important to understand how $K_o$ and $K_i$ interact to create the handshaking protocol that allows for asynchronous operation. The simplest form of this interaction is a combinational circuit between two NCL registers, creating a single-stage pipeline.

As seen in Figure 8, the primary inputs are connected to the input register, which then passes the signals to the combinational circuit. The resulting outputs are fed into the output register; this register then outputs the signals as the circuit’s primary outputs. The output register accepts an external input for its $K_i$, and its $K_o$ is fed directly into the input register’s $K_i$ input. The input register’s $K_o$ is then given as a primary output of the circuit.

Figure 7. 8-bit NCL Register with Completion Logic
Upon reset, typically both registers will be set to NULL. They both will be ready to accept DATA upon starting normal circuit operation, thus their Ko signals will both be ‘1’ indicating an rfd. Since the input register is accepting a ‘1’ from its Ki (the Ko from the output register), the register will be able to accept the DATA as soon as it is presented.

When all bits have received the DATA wavefront, all of the individual Ko bits will be ‘0’, or rfn. By means of the completion logic, the primary Ko output will fall. This indicates that the circuit is ready to accept the NULL wavefront. However, this NULL will not flow through the circuit until the output register has received the complete DATA wavefront from the combinational circuit, at which time its Ko will fall. The order in which the output register presents its rfn and the NULL wavefront arrives at the input register is insignificant. This cycle is then reversed for the next DATA wavefront.
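
The handshaking sequence described above can be traced with a small wavefront-level model. The Python sketch below abstracts away individual bits, the completion logic, and the combinational circuit (the output register simply sees whatever the input register holds); the class `NclRegister` and the function `run_cycle` are hypothetical names.

```python
class NclRegister:
    """Wavefront-level model of an NCL register."""

    def __init__(self):
        self.state = "NULL"  # registers are assumed to reset to NULL

    @property
    def ko(self):
        # 1 = request for DATA (rfd), 0 = request for NULL (rfn)
        return 1 if self.state == "NULL" else 0

    def present(self, wavefront, ki):
        # DATA passes only while ki is 1; NULL passes only while ki is 0
        if wavefront == "DATA" and ki == 1:
            self.state = "DATA"
        elif wavefront == "NULL" and ki == 0:
            self.state = "NULL"
        return self.state


def run_cycle():
    """Trace (step, input Ko, output Ko) over one DATA/NULL cycle."""
    inp, out = NclRegister(), NclRegister()
    trace = [("reset", inp.ko, out.ko)]
    inp.present("DATA", ki=out.ko)
    trace.append(("DATA at input", inp.ko, out.ko))
    inp.present("NULL", ki=out.ko)      # blocked: output still requests DATA
    trace.append(("early NULL blocked", inp.ko, out.ko))
    out.present(inp.state, ki=1)        # external Ki requests DATA
    trace.append(("DATA at output", inp.ko, out.ko))
    inp.present("NULL", ki=out.ko)      # output now requests NULL
    trace.append(("NULL at input", inp.ko, out.ko))
    out.present(inp.state, ki=0)        # external Ki requests NULL
    trace.append(("NULL at output", inp.ko, out.ko))
    return trace
```

In this trace, the NULL wavefront presented early is held off at the input register until the output register has latched DATA and dropped its Ko, after which both registers return to requesting DATA.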

The time it takes the circuit to complete one cycle of this operation is called the DATA-to-DATA cycle time, which is denoted as T_{DD}. This time value is analogous to the clock period of a traditional synchronous circuit. However, unlike a clock period, the T_{DD} is a dynamic time and can change from cycle to cycle. This results in the circuit operating as fast as possible under the current conditions. Synchronous circuitry demands rigorous timing analysis to determine the slowest path through the circuit. The clock period is then determined by taking this timing requirement and adding a small amount of time to act as a buffer for unforeseen environmental or manufacturing irregularities. This gives synchronous circuits worst-case performance, as they operate at a speed slightly slower than their slowest path allows. In contrast, due to their dynamic operation, NCL circuits operate with average-case performance. In addition to offering better performance, the ability to automatically adjust timing also makes the circuit very robust and highly resilient to process variation and environmental changes, such as supply voltage and temperature [7].
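
The difference between worst-case and average-case operation can be shown with simple arithmetic. The delay values and the 10% margin in the Python sketch below are purely hypothetical numbers chosen for illustration; they are not measurements from this work.

```python
# Hypothetical per-operation path delays in nanoseconds (illustrative only)
path_delays_ns = [40.0, 55.0, 70.0, 100.0]
margin = 0.10  # assumed synchronous timing margin

# Synchronous: every cycle pays for the slowest path plus the margin
sync_period_ns = max(path_delays_ns) * (1 + margin)

# NCL: each cycle takes only as long as the exercised path requires,
# so the average T_DD tracks the mean delay over the operations performed
ncl_average_tdd_ns = sum(path_delays_ns) / len(path_delays_ns)
```

With these assumed values, the synchronous period is set by the 100 ns path plus margin for every operation, while the average T_{DD} is governed by the mean of the delays actually exercised.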

2.2. Silicon Carbide (SiC)

Silicon Carbide is an alternative semiconductor to Silicon (Si), which is most commonly used in integrated circuit design. Known as a wide-bandgap semiconductor, SiC differs at the molecular level from Si. As indicated by their name, all semiconductors possess the properties of both conductors and insulators. These properties originate from how their molecular structure reacts when subjected to an electric field. Both Si and SiC have what is known as a crystalline lattice structure [8]. This is a highly regular molecular structure that lends itself well to the movement of free electrons, known as carriers, through the structure.

While at rest within their parent atoms, electrons naturally exist in one of several energy bands. As with most natural processes, electrons come to rest with the least amount of energy possible. However, due to the atomic structure of their parent atoms and the quantum forces at work, an electron’s lowest attainable energy is not always possible.
Electrons that are captured into a suitable energy level are at rest within an atom, covalently bonded to the surrounding atoms in the lattice structure. These electrons are within a set of energy bands referred to as the *valence band*. If an electron in the upper range of the valence band receives some energy from an outside source such as heat, light, or an electric field, the electron can break free of its covalent bond and travel freely through the crystalline lattice. These electrons are now within a set of energy bands called the *conduction band* [8].

When these negatively charged electrons break free of the bonds to their parent atom, this atom is left with an available location for an atomic bond and a net positive charge. This is referred to as a *hole*. As other free electrons within the lattice, which have also created their own hole elsewhere in the material, are captured by this hole, the positive charge is effectively moved in the opposite direction of the electron flow. This creates two types of carriers, electrons and holes. As these carriers move across the semiconductor, they create an electric current in the opposing direction of electron travel. Figure 9 [8] illustrates this interaction.

Depending on the semiconductor material in which this interaction is taking place, a different amount of energy is required to break the electrons free of their bonds. This energy is known as the *bandgap*, as it literally is the amount of energy that creates the gap between the valence and conduction bands. By using this bandgap to create carriers within the semiconductor on demand by applying an electric field, it is possible to control the current within a semiconductor. An energy-band diagram, shown in Figure 10 [8], is commonly used to model these characteristics. In this diagram, the carrier electrons reside in the conduction band, while the holes created by these freed electrons are shown in the valence band. The energy stored in the carriers can be seen in this diagram as well; lower energy electrons in the conduction band "sink" to the bottom of the band like stones in water, while lower energy holes in the valence band tend to "float" to the top like bubbles [8].

Figure 9. Electrons (e⁻) and holes (h⁺) as carriers through 2-D representation of silicon crystal lattice structure [8]
2.2.1. Wide-Bandgap Semiconductors

The energy of an electron is measured in *electron volts*, or eV. Requiring only 1.12 eV to free an electron, Si makes a very good semiconductor capable of extremely fast circuit operation. However, when placed within an environment where energy is abundant, such as a location with high ambient temperature, thermally generated carriers can inundate the lattice. When this happens, the applied electric field can no longer control the current, and the device cannot be shut off. The solution to this problem is to use a semiconductor material that requires more energy to free electrons from their bonds, and thus has a *wide bandgap*. SiC is one such material. With a bandgap of 3.26 eV, nearly triple that of Si, SiC makes an excellent semiconductor for high temperature applications. See Table 4 [8] for other common semiconductors and their associated bandgaps.

<table>
<thead>
<tr>
<th>Material</th>
<th>Bandgap (eV)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Germanium</td>
<td>0.66</td>
</tr>
<tr>
<td>Silicon</td>
<td>1.12</td>
</tr>
<tr>
<td>Gallium Arsenide</td>
<td>1.42</td>
</tr>
<tr>
<td>4H-Silicon Carbide</td>
<td>3.26</td>
</tr>
<tr>
<td>Gallium Nitride</td>
<td>3.4</td>
</tr>
<tr>
<td>Silicon Nitride</td>
<td>5</td>
</tr>
<tr>
<td>Diamond</td>
<td>5.5</td>
</tr>
<tr>
<td>Silicon Dioxide</td>
<td>9</td>
</tr>
</tbody>
</table>

Table 4. Bandgap of Common Semiconductors [8]
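To give a rough sense of why a wider bandgap matters at high temperature, the sketch below compares the Boltzmann factor exp(-E_g / 2kT), which dominates the thermally generated (intrinsic) carrier concentration, for Si and 4H-SiC using the bandgaps from Table 4. This is a back-of-the-envelope illustration only: it assumes the bandgap does not vary with temperature and ignores the material-dependent density-of-states prefactor, and the function name is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

# Bandgaps taken from Table 4 (eV)
BANDGAP_EV = {"Si": 1.12, "4H-SiC": 3.26}


def thermal_generation_factor(bandgap_ev, temp_c):
    """Return exp(-Eg / (2kT)), the dominant term of the intrinsic carrier density.

    Assumption: Eg is treated as temperature independent and the
    density-of-states prefactor is ignored, so only ratios are meaningful.
    """
    temp_k = temp_c + 273.15
    return math.exp(-bandgap_ev / (2.0 * BOLTZMANN_EV_PER_K * temp_k))


if __name__ == "__main__":
    for temp_c in (25, 300):
        si = thermal_generation_factor(BANDGAP_EV["Si"], temp_c)
        sic = thermal_generation_factor(BANDGAP_EV["4H-SiC"], temp_c)
        print(f"{temp_c} C: Si factor = {si:.3e}, 4H-SiC factor = {sic:.3e}, "
              f"ratio = {si / sic:.3e}")
```

Under these simplifying assumptions, the factor for SiC at 300°C remains many orders of magnitude below that of Si at the same temperature, which is consistent with the qualitative argument above.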

2.2.2. Design Challenges and Benefits

Most SiC integrated circuit processes, particularly those suitable for digital circuit designs, are still under active development. Due to the immaturity of these processes, feature sizes remain large, metallization options are limited, the devices are difficult to model, and projected yields are low. However, despite these difficulties, the potential benefits of this technology are vast. One application of this technology is in modern hybrid electric vehicles (HEVs). Currently, automakers are required to include an additional cooling loop rated for 70°C to cool commercially available electronics. If these circuits were implemented in SiC, the existing engine coolant system could also be utilized to cool the electronics with little to no overhead. This change results in the overall mass and volume of the electronics module being reduced by an order of magnitude [9].

2.2.3. Existing SiC Digital ICs

Very little research has been reported on sequential digital circuitry in SiC. Flip-flops and a binary counter are reported in [2, 10, 11]. Researchers from [12] also report a working flip-flop. More work has been reported in combinational circuitry, but mostly single gate test structures and simple digital circuits such as half adders and ring oscillators [2, 3, 10, 11, 12]. All mentioned previous work has been reported to function correctly at or above 300°C.

3. Approach

3.1. Design Challenges

Despite being one of the most advanced SiC processes currently available, HTSIC is still a very young process. It is under active development by Raytheon, and as such, there are many opportunities for innovation when designing integrated circuits in this process.

3.1.1. Channel Routing

Perhaps the most challenging issue from the physical design aspect is that currently only a single layer of metal is available. This proved difficult in many aspects of physical design, particularly with place-and-route (P&R). Modern electronic design automation (EDA) tools are designed to utilize P&R algorithms that assume multiple layers of metal. As such, these algorithms do not work as well on this process. Very early in this work a viable P&R scheme was developed. This scheme is based on some very specific requirements for the cell layouts, notably requiring that all pins be located on the left and right edges of the cell, as discussed in greater detail in section 3.3.1. This strategy also included adding an empty space on both sides of every placed cell. Since only one metal is available, routing must be done in Metal 1 and POLY. This P&R scheme guarantees that the automatic routing tool has access to the pins without needing to cross over the cell boundary where Metal 1 and POLY were used to create the devices in the gate. This method was tested using a simple netlist containing 1000 TH12 gates. By using a single cell, layout adjustments to increase routability were trivial. Once a satisfactory P&R was achieved, a template was created from this TH12 gate, and full NCL and Boolean libraries were created.

During a scheduled design review, Raytheon suggested a P&R technique that had worked well for them in the past. This older method, called *channel routing*, abuts the cells horizontally and typically requires two layers of metal. This technique requires the cells to be placed on every other row of the circuit, leaving a routing “channel” between each row of cells. Following this method strictly was not possible because only one metal layer was available. Horizontal abutment was also not possible because of connectivity issues with the pins placed along the vertical edges of all the cells in the completed libraries, so a hybrid method was developed. The cells were placed in channels, but were still allowed a padding of empty space between each placed cell. Applying this method increased cell placement density by 10%, reduced cell padding from 180µm to 36µm, and reduced overall core area by 30%. Figure 11 shows the same circuit before and after channel routing is applied for comparison.
3.1.2. POLY Crossovers

A method was developed to use as little POLY as possible in the $V_{DD}$ and $V_{SS}$ networks. POLY is a highly resistive material, and therefore is an unattractive option for routing signals. However, its use is unavoidable at this time in the HTSIC process. Special care was taken to use no POLY in the $V_{SS}$ ring, rails, and stripes within the core. When POLY is used to route a signal, the high resistance induces a voltage drop, thereby impacting the voltage on the net. Propagated across the core, this results in devices in different areas of the core receiving notably different voltages. Due to this effect, whenever the $V_{DD}$ and $V_{SS}$ rails must cross, the $V_{DD}$ rail is selected to via down to POLY, allowing $V_{SS}$ to cross undisturbed. This causes substantial voltage droop on the $V_{DD}$ net, but the inherent robustness of NCL circuit design will mitigate these effects on the circuit. This is strongly preferable to having devices with non-zero voltages on their $V_{SS}$ rail. Figure 12 shows the implemented solution.

![Figure 12. POLY Crossovers in Circuit Core](image)

To create a metal-only $V_{SS}$ net, it was necessary to place the $V_{SS}$ ring inside the $V_{DD}$ ring. Thus, whenever routing the rails from the core to pad ring, POLY was required to go over the $V_{DD}$ ring. This POLY was made to be as wide as possible to reduce resistance, as shown in Figure 13. Additionally, in Tapeout 1 every circuit has multiple $V_{DD}$ and $V_{SS}$ pads to improve reliability; consequently, the resistances of all the connected $V_{SS}$ pads are in parallel, and become negligible. Furthermore, when placed over the N-Type bulk used by the HTSIC process, the large pieces of grounded POLY create significant decoupling capacitance. This technique was not necessary in Tapeout 2.

![Figure 13. POLY Crossovers in Pad Ring](image)
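The two resistance arguments in this section (widening a POLY crossover lowers its resistance, and several pad connections in parallel lower it further) follow from the standard sheet-resistance and parallel-resistance relations. The sketch below illustrates both. The sheet resistance and geometry values are placeholders chosen only for illustration; they are not HTSIC process data, and the function names are hypothetical.

```python
def trace_resistance(sheet_resistance_ohm_per_sq, length_um, width_um):
    """Resistance of a rectangular trace: R = R_sheet * (length / width)."""
    return sheet_resistance_ohm_per_sq * length_um / width_um


def parallel_resistance(resistances_ohm):
    """Equivalent resistance of several resistances connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances_ohm)


if __name__ == "__main__":
    R_SHEET = 20.0  # ohm/square, placeholder value (NOT a process parameter)
    narrow = trace_resistance(R_SHEET, length_um=100.0, width_um=10.0)
    wide = trace_resistance(R_SHEET, length_um=100.0, width_um=50.0)
    print(f"narrow crossover: {narrow:.1f} ohm, wide crossover: {wide:.1f} ohm")

    # Four identical wide crossovers feeding the same net in parallel
    print(f"four in parallel: {parallel_resistance([wide] * 4):.1f} ohm")
```

For a fixed crossover length, resistance scales inversely with width, and N identical connections in parallel divide the resistance by N.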

3.2. Design Flow

Using modern CAD tools in a process with only a single layer of metal presented many challenges. These tools required extensive configuration file modifications for adaptation to the needs of this process. Tools used include Mentor Modelsim for HDL creation and simulation, Cadence Virtuoso for gate schematic creation, verification, and layout design; in addition, DRC, LVS, and PEX were accomplished using Mentor Calibre. Cadence Abstract was used for cell abstraction and LEF file creation. Synopsys Design Vision and Mentor Leonardo were utilized for physical netlist creation of asynchronous circuits. Also, synchronous RTL synthesis as well as all P&R was performed using Cadence Encounter. Encounter does not officially support channel routing, as modern techniques are far more efficient when more metal layers are available. However, there are options available to skip 0, 1, or 2 rows when placing cells, so the effect is not difficult to achieve.

3.3. Tapeout 1

A key advantage of a two-tapeout project is that the first tapeout can be used to learn valuable information about how devices and building block circuits behave. The models for the HTSIC process received extensive work in [4], but prior to the first tapeout, they remained largely unproven. The circuits developed in the first tapeout provided the data necessary to make informed design decisions for the second tapeout.

An important first step was deciding on a minimum sized PFET and NFET. For a well-established process, this is available in the design guide included in the PDK. However, this is not the case for a process in its infancy. Based on physical test data provided by Raytheon, the NFET was selected to be $4 \mu m \times 2 \mu m$. The data also suggested a PFET:NFET device-width ratio of 6:1; therefore, the minimum sized PFET was $24 \mu m \times 2 \mu m$.
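As a worked check of the sizing above, reading the $4 \mu m \times 2 \mu m$ dimensions as width by length (an assumption, though it is consistent with the stated width ratio), the minimum PFET dimensions follow directly from the 6:1 ratio:

\[
W_{PFET} = 6 \times W_{NFET} = 6 \times 4 \mu m = 24 \mu m, \qquad L_{PFET} = L_{NFET} = 2 \mu m
\]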

3.3.1. Cell Layouts

As briefly described in section 3.1.1, specialized layout techniques were required to create a cell library compatible with modern P&R tools due to the single layer of metal available in this process. The input and output pins were moved to the outside vertical edges of the cell to prevent the router from needing to place any Metal or POLY routing traces over the edge of the cell boundary. This has the added benefit of allowing unrestricted access to the use of both interconnect layers when designing the cells. In order to guarantee access to these pins, a cell padding was introduced on both sides of every cell. The $V_{DD}$ and $V_{SS}$ rails are located at the top and bottom of each cell respectively, and the substrate and P-WELL contacts are placed within the rails. Since these cells were designed with traditional placement methods in mind, there is enough room for a POLY trace to go through the rails horizontally between the substrate and P-WELL contacts when the cells are abutted vertically. However, due to the introduction of channel routing, this feature was not utilized. An example of a TH12 cell layout using this technique can be seen in Figure 14.

Figure 14. TH12 Cell using Tapeout 1 Layout Style
3.3.2. Tapeout 1 Circuits

A total of 16 circuits were included in Tapeout 1. Top-level layout images for these circuits can be seen in Figures 15-30. These circuits include:

- 8+4×4 NCL Multiply Accumulate Unit (MAC)
- 4-Bit NCL Counter
- Boolean Finite State Machine (FSM)
- 4-Bit NCL Ripple Carry Adder (RCA)
- 4-Bit Boolean RCA
- 11-Stage Ring Oscillator (RO)
- 11-Stage Ring Oscillator with probe pads
- 8-Bit Boolean Shift Register (SR) using transmission gate D-Flip Flops (DFF)
- 8-Bit Boolean Shift Register using DFFs constructed of NAND gates
- 8-Bit Boolean Shift Register using optimized static DFFs
- Complete Boolean library
- Complete NCL library (two circuits)
- Transistor sizing test cells (three circuits)

Each circuit included in this tapeout was carefully selected for a specific purpose. The largest circuit included in this tapeout, the NCL MAC, is composed primarily of a 4-bit multiplier and an 8-bit RCA. This provides a highly regular circuit structure, which allowed Cadence Encounter to achieve an efficient P&R of the circuit. This circuit was included in the tapeout to serve as an example of a moderately complex circuit of a modest size.
Figure 15. 8+4×4 NCL Multiply Accumulate Unit

Figure 16. NCL Counter

Figure 17. Boolean FSM

Figure 18. NCL RCA
Despite being a much larger circuit, the MAC is mostly comprised of combinational logic. The NCL Counter serves as an example of a sequential NCL circuit to study how NCL handshaking functionality changes as circuit temperature increases. Similarly, the Boolean FSM was included to have an example of how a synchronous circuit in SiC operates when tested over temperature. Both the NCL and Boolean RCAs provide a purely combinational circuit for testing and comparison between the two circuit design techniques. The Ring Oscillators and Shift Registers give valuable test data about device speed and reliability. The data collected from the shift registers also provided invaluable information when deciding which DFF design to use in the second tapeout.

The remaining six circuits were included to gather information on gate design and transistor sizing choices for use in the second tapeout. In the Boolean and NCL library circuits, each cell in the libraries was individually testable to verify gate functionality as well as provide individual gate characterization over temperature. The transistor sizing test circuits consist of eleven different cells, each implemented with various transistor sizing combinations. These eleven cells were selected because they exhibit characteristics representative of the entire library. Table 5 details the different transistor combinations employed, while Table 6 lists the chosen cells along with the reasoning behind their selection.

<table>
<tbody>
<tr>
<td>PFET:NFET = 2:1, L = 1.2µm</td>
<td>PFET:NFET = 4:1, L = 1.2µm</td>
<td>PFET:NFET = 6:1, L = 1.2µm</td>
</tr>
<tr>
<td>PFET:NFET = 2:1, L = 1.5µm</td>
<td>PFET:NFET = 4:1, L = 1.5µm</td>
<td>PFET:NFET = 6:1, L = 1.5µm</td>
</tr>
<tr>
<td>PFET:NFET = 2:1, L = 2µm</td>
<td>PFET:NFET = 4:1, L = 2µm</td>
<td>PFET:NFET = 6:1, L = 2µm</td>
</tr>
</tbody>
</table>

Table 5. Test Cell Transistor Combinations
<table>
<thead>
<tr>
<th>Cell</th>
<th>Reasoning</th>
</tr>
</thead>
<tbody>
<tr>
<td>TH12</td>
<td>Simple NCL cell</td>
</tr>
<tr>
<td>TH22</td>
<td>Average complexity, 2-input NCL cell</td>
</tr>
<tr>
<td>TH24</td>
<td>Above average complexity, 4-input NCL cell</td>
</tr>
<tr>
<td>TH33n</td>
<td>Above average complexity, 3-input NCL cell</td>
</tr>
<tr>
<td>buffx4</td>
<td>Drive strength testing</td>
</tr>
<tr>
<td>buffx8</td>
<td>Drive strength testing</td>
</tr>
<tr>
<td>buffx16</td>
<td>Drive strength testing</td>
</tr>
<tr>
<td>Inv</td>
<td>Transistor characterization</td>
</tr>
<tr>
<td>xor2</td>
<td>Above average complexity Boolean cell</td>
</tr>
<tr>
<td>Gate-level DFF</td>
<td>Analysis on DFF choice for Tapeout 2</td>
</tr>
<tr>
<td>Static DFF</td>
<td>Analysis on DFF choice for Tapeout 2</td>
</tr>
</tbody>
</table>

Table 6. Cells Selected for Transistor Sizing Analysis

Figure 31 shows the full reticle measuring 21mm × 12.5mm which includes all circuits, both digital and analog, submitted in Tapeout 1.

Figure 31. Tapeout 1 Reticle
3.4. Tapeout 2

Using all the information learned from Tapeout 1, a second design phase focused on creating the digital control logic for the intelligent gate driver outlined in the specification from APEI. The controller operates as a closed-loop feedback system, as seen in Figure 32. System sensor data is first filtered through analog conditioning circuitry before being sent through an Analog-to-Digital Converter (ADC). The resulting data is then processed through a digital block called a Flyback Controller, which controls the charging and discharging profile of the Gate Driver. This results in a fully SiC circuit capable of producing a high-current, self-regulating signal, which may then be used in various extreme environment power electronics applications. Figure 33 shows the Tapeout 2 reticle containing all analog and digital circuits submitted for fabrication.
3.4.1. Cell Layouts

In preparation for the second tapeout, the cell library was completely redesigned to take full advantage of the channel routing P&R technique. An example layout of a redesigned TH12 may be seen in Figure 34. Using this new layout technique, the $V_{DD}$ and $V_{SS}$ rails were placed in the middle of the cell and are wide enough to encapsulate a substrate or P-WELL contact for each $V_{DD}$ or $V_{SS}$ connection. This allows all of the pins to be placed at both top and bottom of the cell, allowing the automatic routing tool full access to the cell regardless of whether the routing comes from the channel above or below the cell. Instances were even observed of Encounter routing through the cell, utilizing both top and bottom pins. The cells were laid out to optimize for vertical spacing, but a maximum cell height was not enforced. This allowed smaller cells to contribute to the routing channel rather than wasting extra space. Although cell height was not constrained during layout, Encounter requires a common cell height when placing cells. In order for this technique to accommodate the requirements of the tools, empty space was added to the top and bottom of the smaller cells such that all cells in the library have a common height of 76µm, which is 20µm shorter than the cells used in Tapeout 1. In theory, these changes allow the padding between cells to be removed. However, in practice, routability was greatly increased on larger circuits by allowing a 3µm cell padding, down from 20µm in Tapeout 1. In order to provide empirical data on the effectiveness of these changes, the new library was used to P&R the Boolean RCA from Tapeout 1. As shown in Figure 35, the new library increased core cell placement density by 39% and reduced the core area from 534,916µm² to 318,019µm², a 41% decrease.

Figure 34. TH12 Cell Using Tapeout 2 Layout Style

Figure 35. Tapeout 1 Boolean RCA (Left) versus Tapeout 2 Boolean RCA (Right)
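The reported core area reduction for the Boolean RCA can be checked directly from the two areas quoted in the text:

\[
\frac{534{,}916 - 318{,}019}{534{,}916} = \frac{216{,}897}{534{,}916} \approx 0.405
\]

This is a reduction of roughly 40.5%, which rounds to the 41% decrease stated above.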

3.4.2. Tapeout 2 Circuits

In order to perform the required digital control functions, two circuits were designed for this tapeout. The first, a building block for the ADC circuit, is a synchronous FSM to control Digital-to-Analog Converter (DAC) operation, and the second, the Flyback Controller, is a large NCL block to control the Gate Driver’s charging and discharging profile. The FSM was designed in two different configurations. The first is a stand-alone circuit component for detailed testing, while the second is in a form factor for integration directly into the ADC layout; as such, this second design does not have a pad frame, and has had all testing functionality removed to provide the most compact layout possible.

3.4.2.1. Design for Testability (DFT)

After encountering difficulties during Tapeout 1 testing, detailed in section 4.1.3, several accommodations for testing were included in the Flyback Controller and in the version of the DAC Controller equipped with a pad frame. By including input and output shift registers, both circuits may have a heartbeat test performed at a probe station using only 7 pins. It was important that this functionality neither prevent the circuit from being packaged nor diminish the performance of a packaged circuit, so a control signal was included to place the circuit into either test or production mode by enabling or bypassing the shift registers, as shown in Figure 36. It was also necessary to add additional pads for any signals that were required for both probe testing and packaged testing, such as \( V_{DD} \), \( V_{SS} \), and the mode-control signal, for reasons also fully explained in section 4.1.3. Additionally, in order to minimize the number of pins required to test the circuit, the separate \( V_{DD} \) and \( V_{SS} \) rails for core and pad power were connected together. This decision removes the ability to measure the power consumption of the core alone, without the pad buffers skewing the data. However, due to the importance of DFT in this tapeout, this was a reasonable trade-off.
Due to a large number of inputs and outputs, 51 clock cycles are required to shift the data into the Flyback Controller. As normal circuit operation only requires 1 $T_{DD}$, this circuit exhibits significantly reduced performance in test mode. However, the DAC Controller does not have any performance penalty in test mode as the inputs and outputs only take 9 clock cycles to cycle through the shift register while a single circuit operation requires 10 cycles.
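The test/production mode selection described above can be sketched behaviorally as follows. This is a minimal illustration, not the fabricated design: the class and function names are hypothetical, the register width in the example is arbitrary, and the actual structure is the one shown in Figure 36. In test mode the core inputs are loaded serially, one bit per clock; in production mode the shift register is bypassed and the inputs are taken directly from the parallel pins.

```python
class ShiftRegister:
    """Serial-in, parallel-out shift register (behavioral model)."""

    def __init__(self, width):
        self.bits = [0] * width

    def clock(self, serial_in):
        """Shift by one position on a clock edge; return the bit shifted out."""
        serial_out = self.bits[-1]
        self.bits = [serial_in] + self.bits[:-1]
        return serial_out


def drive_core_inputs(test_mode, parallel_pins, serial_bits):
    """Return (core_inputs, clock_cycles_spent_loading).

    test_mode=True  : inputs are shifted in serially (one clock per bit).
    test_mode=False : shift register is bypassed (no loading overhead).
    """
    if not test_mode:
        return list(parallel_pins), 0
    register = ShiftRegister(len(parallel_pins))
    cycles = 0
    for bit in serial_bits:
        register.clock(bit)
        cycles += 1
    return register.bits, cycles


if __name__ == "__main__":
    word, cycles = drive_core_inputs(True, [0, 0, 0, 0], [1, 0, 1, 1])
    print("test mode:", word, "loaded in", cycles, "cycles")
    word, cycles = drive_core_inputs(False, [1, 0, 1, 1], [])
    print("production mode:", word, "loaded in", cycles, "cycles")
```

The loading cost in test mode grows linearly with the number of shifted bits, which is why a circuit with many inputs and outputs pays a large test-mode penalty while a small one does not.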

3.4.2.2. NCL Flyback Controller

The Flyback Controller layout is shown in Figure 37. With more than 13,000 transistors and core dimensions of 4000µm $\times$ 4000µm, this represents the largest known digital circuit fabricated in SiC. Despite being a fully delay-insensitive NCL circuit design, the Flyback Controller accepts single rail inputs, gives single rail outputs, and requires a clock signal. This is due to the synchronous wrapper placed on the circuit. The wrapper adds some overhead, but was necessary for top-level integration into the system. Detailed in Figure 38, the wrapper generates the dual-rail inputs as well as the NULL cycles for the NCL circuit. The clock input is connected directly to the $K_i$ signal. Thus, as long as the clock period is longer than the circuit delay, the circuit will continue to function as if the handshaking were automatic.
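The role of the synchronous wrapper can be illustrated with a small behavioral sketch. It uses the standard NCL dual-rail encoding (DATA0, DATA1, and NULL) and assumes, purely for illustration, that a high clock level requests a DATA wavefront and a low level requests a NULL wavefront; the actual polarity and structure are those of Figure 38, and all function names here are hypothetical.

```python
NULL = (0, 0)  # both rails deasserted: no data present


def to_dual_rail(bit):
    """Encode one single-rail bit as (rail1, rail0): DATA1=(1,0), DATA0=(0,1)."""
    return (1, 0) if bit else (0, 1)


def from_dual_rail(pair):
    """Decode a dual-rail pair; returns None while the signal is NULL."""
    if pair == (1, 0):
        return 1
    if pair == (0, 1):
        return 0
    if pair == NULL:
        return None
    raise ValueError("illegal dual-rail state: both rails asserted")


def wrapper_inputs(single_rail_bits, clock):
    """Model the wrapper's input side.

    Assumption: clock high presents a DATA wavefront,
    clock low presents a NULL wavefront.
    """
    if clock:
        return [to_dual_rail(b) for b in single_rail_bits]
    return [NULL] * len(single_rail_bits)


if __name__ == "__main__":
    inputs = [1, 0, 1]
    for clock in (1, 0, 1, 0):
        print("clock =", clock, "->", wrapper_inputs(inputs, clock))
```

Because each clock phase must last long enough for the corresponding wavefront to propagate through the NCL logic, the clock period bounds the circuit delay from above, as noted in the text.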

Figure 37. NCL Flyback Controller Layout
The Flyback Controller performs two mathematical functions detailed in [13] called Lambda off ($\lambda_{\text{off}}$), shown in Equation 1 [13] and Lambda on ($\lambda_{\text{on}}$), shown in Equation 2 [13]. Due to area concerns, analysis was performed and showed some of the operations could be done more efficiently with the analog signals before analog-to-digital conversion took place. This is the purpose of the Analog Data Conditioning block discussed in Section 3.4. This simplified the $\lambda_{\text{off}}$ and $\lambda_{\text{on}}$ functions substantially, as shown in Equations 3 and 4, respectively.
3.4.2.3. Synchronous DAC Controller FSM

This 17-state FSM implements a simple algorithm to control the DAC while performing conversions from within the ADC. Besides Reset and Enable signals, the circuit accepts only one digital input from the comparator, COMP. It outputs an 8-bit signal called DAC to the DAC component of the ADC, as well as a single-bit Valid signal to indicate a completed conversion. Upon deasserting Reset, the FSM waits for Enable to be asserted, at which time DAC is set to "10000000". This value is then processed by the DAC and compared to the analog value being converted. If the analog value is larger, the comparator outputs a '1' to the COMP input of the FSM. If this is the case, DAC[7] remains asserted, and DAC[6] is then asserted, making "11000000" the new value of DAC. This value is then processed by the DAC, and the process continues until all eight bits have been compared to the analog value. At any time in the conversion, if the digital value presented to the DAC by the FSM is larger than the analog input value, then the comparator will return a '0', and the bit of DAC currently being computed is deasserted before asserting the next bit and continuing the algorithm. Each conversion requires ten clock cycles to complete, at which time the circuit will pause, waiting for the next assertion of Enable. The state diagram of this process may be seen in Figure 39.
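The bit-by-bit search described above is a successive-approximation procedure, and can be sketched in software as follows. This is a behavioral illustration only, not the FSM implementation: the comparator is modeled with an ideal DAC and a hypothetical 15V reference, the case where the two values are exactly equal is treated as '0' (an assumption, since the text does not specify it), and the Reset, Enable, and Valid signaling is omitted.

```python
def sar_convert(comparator, bits=8):
    """Successive approximation from the MSB down to the LSB.

    comparator(dac_code) must return 1 if the analog input is larger than
    the DAC output for dac_code, and 0 otherwise (the COMP signal).
    """
    dac = 0
    for bit in range(bits - 1, -1, -1):
        dac |= 1 << bit              # assert the bit currently being computed
        if comparator(dac) == 0:     # DAC output exceeds the analog input
            dac &= ~(1 << bit)       # deassert the bit before moving on
    return dac


def make_comparator(analog_v, v_ref=15.0, bits=8):
    """Ideal DAC plus comparator model (v_ref is a hypothetical value)."""
    def comparator(dac_code):
        dac_v = v_ref * dac_code / (1 << bits)
        return 1 if analog_v > dac_v else 0
    return comparator


if __name__ == "__main__":
    for analog_v in (0.0, 10.0, 15.0):
        code = sar_convert(make_comparator(analog_v))
        print(f"{analog_v:5.2f} V -> {code:08b}")
```

Each of the eight bits requires exactly one comparison, so the number of comparisons is fixed regardless of the input value.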

\[
\lambda_{\text{off}} = v_{on}^2 + (i_{mn} - i_{on})^2 - 1 - i_{on}^2
\]

Equation 1

\[
\lambda_{\text{on}} = i_{mn}i_{on} + v_{ccn}v_{on} - v_{ccn} \times 1
\]

Equation 2

\[
\lambda_{\text{off}} = A + B - C - D
\]

Equation 3

\[
\lambda_{\text{on}} = A + B \times C - B \times D
\]

Equation 4

Two versions of this circuit were submitted for fabrication. The first, which can be seen in Figure 40, was designed specifically for integration into the ADC layout. Fitting into the form factor given by the specification, this circuit was made to be as compact as possible. As such, there are no pads, non-essential buffers, or DFT shift registers in this version of the FSM, making it impossible to test separately from the ADC circuit. To remedy this, a second version of the circuit was also included, seen in Figure 41. This circuit is fully testable, both when packaged for temperature testing and at the probe station for heartbeat verification.

Figure 40. DAC Controller FSM for Integration Layout
3.5. Physical Testing Setup

Upon receiving the fabricated wafers from Raytheon, the testing process began at a probe station to perform heartbeat testing on all circuits with 8 or fewer pins. Once the circuits passed heartbeat testing, packaging began. In order to test the packaged circuits, there were several difficulties to resolve. Among these difficulties was how to deal with the high nominal voltage of the HTSIC process and the high operating temperatures required for testing. Digital testing equipment is rarely equipped to handle circuits with a \( V_{DD} \) of 15V. In order to resolve this, a level shifter board was designed, fabricated, and constructed. Also, a specialized testing apparatus, designed specifically to apply heat directly to the packaged chip without damaging the surrounding components, was utilized.

3.5.1. Probe Testing

A probe station is a piece of equipment that is used for testing unpackaged chips and therefore does not require the assembly steps required for packaging. Depending on the number of pins and the type of testing required, access to a probe station can eliminate the need to package a circuit to perform the desired testing. This works by securing the chip via suction to a metallic base called a chuck. The chuck can be biased to a voltage or electrically floated. For the purposes of this testing, the chuck was floated at all times. Once the chip is secure, using a microscope and very precise manipulators, a needle is carefully placed on each pad of the circuit. These needles, or probes, are then attached to various testing equipment, such as power supplies, signal generators, or oscilloscopes. Probe testing has the added benefit of requiring smaller pads than traditional bonding, which can enable access to signals internal to the circuit as well as primary inputs and outputs as long as the probe pad has a buffer sufficient to drive the testing equipment. Probe testing provides very rapid testing results at a lower cost in both time and materials. Pictured in Figure 42, this Semiprobe probe station is equipped with eight probes, can test circuits at temperatures up to 300°C, and was used extensively to test unpackaged material returned from fabrication throughout this work.
3.5.2. Packaged Testing

For the circuits included on the first tapeout, several required packaging to perform all testing functions for various reasons including pin count, temperature testing requirements, and to properly control the NCL handshaking. Due to the DFT circuitry included in the circuits on the second tapeout, packaging is not required for testing.

3.5.2.1. Packaging

All packaging for this work was performed using the facilities at the University of Arkansas High Density Electronics Center (HiDEC). The circuits were returned on five 4-inch wafers from the foundry, as shown in Figure 43. The wafer was then diced, with each reticle placed into a custom 3D-printed die pack printed using a conductive resin, as shown in Figure 44. Due to how compactly the different circuits were integrated into the reticle, subdicing the reticles required sacrificing some circuits, so careful planning was required to dice the desired circuits while damaging as few other circuits as possible. Once the dicing was complete, the next step was to epoxy the bare die into the cavity of the package. This was accomplished using a conductive epoxy and placing the package inside a vacuum oven at 150°C for 4 hours in a 15 mmHg Nitrogen (N₂) environment. Once the epoxy was cured, the circuit was ready for wire bonding. Using 1mil Gold wire, the circuits were bonded to the package utilizing the ball bonding technique. As seen in Figure 45, this technique welds a wire with a small Gold ball at the end to the pads in order to make a strong electrical connection to the circuit. When bonding is complete, the package is then soldered to a custom printed circuit board (PCB) made of high-temperature Rogers material, shown in Figure 46. This PCB was designed to attach to the high-temperature testing fixture. Once all solder joints and wire bonds have been checked for shorts, the circuit is ready to be tested. Figure 47 shows a chip attached to the high-temperature testing fixture undergoing temperature testing using a hotplate as a heat source.
Figure 43. One of five 4-inch SiC Tapeout 1 Wafers (Photograph by author)

Figure 44. Diced Reticles in 3D-Printed Die Pack (Photograph by author)

Figure 45. Gold Ball Bonding on NCL MAC (Photograph by author)

Figure 46. Package Soldered to Rogers High Temperature PCB (Photograph by author)
3.5.2.2. FPGA Testing Implementation

A Xilinx Virtex 7 FPGA was utilized to provide inputs to the packaged circuits. The FPGA also accepts signals from the circuits to provide support for the handshaking protocols required for the delay-insensitive NCL circuit operation. Equipped with FPGA Mezzanine Card (FMC) adapter cards, the FPGA offers over 200 bi-directional channels to use as inputs and outputs for the circuit under test.

3.5.2.3. Level Shifter PCB

While the FPGA implementation offers great control over the circuit, the FPGA is limited to 1.8V signals. To allow connection to the 15V test circuit, a level shifter PCB was designed. Offering a total of 64 level shifter channels, the circuit is capable of shifting 32 signals from 1.8V to 15V and 32 signals from 15V to 1.8V. This was done in two stages. Using 16 4-bit Pericom PI4ULS5V104ZBEX level shifters, signals were shifted from 1.8V to 5V. Then by utilizing 12 6-bit Texas Instruments CD4504BM96 level shifters, signals were converted from 5V to 15V. These were soldered to a 4-layer PCB designed using CadSoft Eagle PCB Design Software.

Figure 48 shows the full testing setup:

- Top Left – Xilinx Virtex 7 FPGA
- Middle Left – FMC adapter card
- Bottom Left – Level shifter PCB
- Bottom Right – Circuit under test mounted on high-temperature testing fixture, resting on a hotplate
4. Results and Analysis

4.1. Tapeout 1

4.1.1. Simulation Results

During design verification before submitting Tapeout 1 for fabrication, full-length extracted simulations were performed across temperature at all available corners on all of the major test circuits. This included the NCL MAC, NCL Counter, Boolean FSM, NCL RCA, Boolean RCA, and Ring Oscillator. All waveforms included within this chapter were taken while the circuit under test was operating at 275°C, the highest temperature the models allowed, using the slowest corner available. Under these conditions, the Ring Oscillator operated at 3.7 MHz, as seen in Figure 49. In a subset of the simulation waveforms, shown in Figure 50, the NCL MAC can be seen operating correctly across all patterns with a $T_{DD}$ of 1.73µs, or approximately 570 kHz. Figure 51 shows an exhaustive test of the NCL Counter, which operates flawlessly with a $T_{DD}$ of 1.09µs, or 910 kHz. The Boolean FSM shows correct operation up to 2.25 MHz in the partial waveform shown in Figure 52. Due to some fortuitous timing effects discussed in greater detail in section 4.1.4, the Boolean RCA operates at 6.32 MHz, shown in Figure 53. Finally, the NCL RCA operated correctly up to 1.22 MHz, shown in Figure 54.

Figure 49. Ring Oscillator Simulation Waveforms
Figure 50. NCL MAC Simulation Waveforms
Figure 51. NCL Counter Simulation Waveforms
Figure 52. Boolean FSM Simulation Waveforms

Figure 53. Boolean RCA Simulation Waveforms
4.1.2. Physical Testing Results

The FSM operated at a maximum temperature in excess of 300 °C. Its propagation delay and operating frequency ranged from 360ns and 1.35MHz at room temperature to 390ns and 1.2MHz at maximum temperature, with a performance peak of 240ns and 1.95MHz at 275 °C, as seen in Figure 55.

**Figure 54. NCL RCA Simulation Waveforms**

**Figure 55. Boolean FSM Physical Testing Data**
Shown in Figure 56, the NCL Counter had a propagation delay of 1240ns and a $T_{DD}$ of 2.94μs (340KHz) at room temperature and 880ns delay and $T_{DD}$ of 2.24μs (446KHz) at 300°C. The NCL Counter experienced peak performance at a temperature of 250°C showing an 800ns delay and a 2.04μs $T_{DD}$ (490KHz).

**Figure 56. NCL Counter Physical Testing Data**

The Boolean RCA exhibited performance that was not in line with the other circuits. At room temperature, the circuit provided correct output with a propagation delay of 276ns with an effective output rate of 9.5MHz. This circuit exhibited peak performance at its maximum operating temperature of 200°C with 198ns of propagation delay and 10.85MHz circuit operation. This data can be seen in Figure 57.
Selected as the DFF design for Tapeout 2, the Transmission Gate Shift Register was the only Shift Register to undergo detailed temperature testing, shown in Figure 58. This circuit functioned properly at room temperature, exhibiting a 3.7MHz switching frequency; it reached a maximum data rate of 5.9MHz at 275°C and ran at 5.5MHz at 300°C.
Due to the importance of accurate Ring Oscillator testing information, a total of five ring oscillators were subjected to full temperature testing. Due to the large process variation, the values presented are the average across all five devices. At room temperature, they exhibited an average frequency of 3.16MHz, 3.78MHz at 300°C, and an average peak of 4.00MHz at 225°C, displayed in Figure 59.
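To put the temperature dependence in relative terms, the averaged Ring Oscillator frequencies can be expressed as a percentage change from the room-temperature value. This is only an illustrative calculation on the three averages quoted above; the function and variable names are not from the original work.

```python
ROOM_MHZ = 3.16  # average frequency at room temperature (from the text)

# Average frequencies at elevated temperature (from the text)
measurements_mhz = {"225C (peak)": 4.00, "300C": 3.78}

def percent_change(value: float, baseline: float) -> float:
    """Signed percentage change of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline

for label, freq in measurements_mhz.items():
    print(f"{label}: {percent_change(freq, ROOM_MHZ):+.1f}% vs room temperature")
```

The averages correspond to a gain of about 26.6% at the 225°C peak and about 19.6% at 300°C relative to room temperature.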

**Figure 59. Average Ring Oscillator Physical Testing Data**

Oscilloscope screenshots of select circuits showing logically-correct maximum-temperature circuit operation are shown in Figures 60-62. Figure 63 shows preliminary testing results showing the NCL MAC functioning at room temperature.
Figure 60. Oscilloscope Waveforms of Boolean RCA operating at 300°C. Note, bit S2_2 became stuck at 200°C
Figure 61. Oscilloscope Waveforms of NCL Counter operating at 300°C
Figure 62. Oscilloscope Waveforms of Boolean FSM operating in excess of 300 °C
Figure 63. Oscilloscope Waveforms showing correct operation of the NCL MAC at room temperature

4.1.3. Yield issues

As with any process still undergoing active development, there were several factors that impacted circuit yield. These are separated into two categories based on their effects to either circuit functionality or packaging.

4.1.3.1. Circuit Functionality

While testing other circuits included on the first tapeout, it was discovered that the major driving factor to circuit yield in the first tapeout was a defect in the process dealing with wide transistors. When attempting to test designs with large transistors, there were many circuits that exhibited shorts in the gate-source as well as in the gate-drain junctions. In the digital circuits produced by this work, this had major implications due to the output pad buffers falling victim to this defect. Because of this, a number of packaged circuits output 7V on many if not all of the output pads. This made finding working samples of some of the larger circuits very difficult. Eventually, the decision was made to perform heartbeat checks on individual pads at probe prior to packaging, to avoid wasting resources on packaging chips that did not yield.

4.1.3.2. Packaging Difficulty

Packaging the material from Tapeout 1 also revealed some defects in the materials used to create the bond pads. During wire bonding, the gold plating over the pads would rip off when attempting to weld the gold ball to the pad. The problem was originally attributed to an improperly configured bonding machine or poor bonding technique, and many troubleshooting steps were taken to remedy it, including additional training. It was discovered that increasing the ball size, as well as being extremely careful to ensure the bond was in the center of the pad, made the plating rip off less often. However, it did not fully fix the issue.

4.1.3.3. DRC Concessions for Tapeout 2

After discussing the gate-source and gate-drain shorts with Raytheon, they performed some analysis on material they had on hand and discovered the shorts were happening between gate POLY and the openings in the oxide for the source or drain contacts. Due to the Tapeout 2 deadline approaching, they decided the best approach would be to create a concession in the DRC rule for gate POLY to contact spacing for the second run.

4.1.4. Tapeout 1 Analysis

Despite the experimental nature of the models included in the PDK, the simulations showed impressive similarity to the physical testing results; apart from a few irregularities, most measured results are comparable to the simulated ones. One circuit that performed uncharacteristically well is the Boolean RCA. The delay inherent in the chain-like logic structure of an RCA is believed to have acted as pipeline stages, greatly increasing the throughput of the circuit. However, this is an unstable effect that creates race conditions within the circuit at this speed. Because of this instability, a circuit designed using these techniques should not be expected to be reliable. The Boolean RCA was also the only circuit that did not operate flawlessly up to the target temperature of 300°C. During testing, between the data point collected at 200°C and the next data point to be collected at 225°C, the bit S2_2 became stuck in the '1' position. This was attributed to the yield issue with the pad buffers because, when measured, this bit output 7V; the remaining outputs of the circuit continued to operate correctly up to 300°C (see Figure 60).

Due to the physical requirements of the process, circuit complexity had to be kept to a minimum. The simple synchronous circuits tested did not experience hold time violations as a more complex circuit would. Because of this, testing results show stronger than expected performance from the Boolean circuits. Conversely, the NCL circuits underperformed relative to expectations from simulations. This is believed to be due to weaker than expected drive strength from the FETs: the added circuitry in the NCL designs, while allowing more circuits to achieve stable operation at temperatures exceeding 300°C, resulted in slower than expected performance. However, increased process stability, manufacturing yield, and drive strength analysis will allow for more advanced gate design possibilities to remedy these issues in the future, thus allowing the NCL designs to perform to their fullest capability.
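The gap between simulation and measurement can be summarized as a ratio of measured peak rate to simulated rate. The sketch below uses only figures quoted in sections 4.1.1 and 4.1.2 and should be read as a rough comparison: the simulated values are for the slowest corner at 275°C, while the measured peaks occur at different temperatures (between 200°C and 275°C), so the conditions do not match exactly.

```python
# (simulated MHz at 275 C slow corner, measured peak MHz) taken from the text
results_mhz = {
    "Boolean FSM": (2.25, 1.95),
    "NCL Counter": (0.910, 0.490),
    "Boolean RCA": (6.32, 10.85),
    "Ring Oscillator": (3.7, 4.00),
}

# Ratio above 1 means the hardware outran the simulation; below 1 means it lagged.
ratios = {name: measured / simulated
          for name, (simulated, measured) in results_mhz.items()}

for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: measured/simulated = {ratio:.2f}")
```

On these numbers the NCL Counter reaches roughly 0.54 of its simulated rate, while the Boolean RCA reaches about 1.72 times its simulated rate.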

4.2. Tapeout 2

4.2.1. Simulation Results

Prior to submitting Tapeout 2, a full simulation was performed on each of the circuits included in the submission in both production and test mode. All simulations shown in Figures 64 to 73 were performed with extracted parasitics running at 300°C. These circuits all show exemplary performance running at 1MHz despite the requirements of the specification being much slower. The DAC Controller will need to run at approximately 500KHz in the full system, and the Flyback Controller will only need to perform at 50KHz. Due to the speed issues with the NCL circuits in the first tapeout, these circuits were designed to operate correctly at much higher speeds than required, so that full-system performance is not affected if simulation results and physical testing results differ.
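The design margin described above can be stated as the ratio of simulated speed to required speed. This is a small illustrative calculation using the three figures quoted in the paragraph; the variable names are not from the original work.

```python
SIMULATED_KHZ = 1000.0  # simulated operating speed at 300 C (from the text)

# Required operating speeds in the full system (from the text)
required_khz = {"DAC Controller": 500.0, "Flyback Controller": 50.0}

# Margin factor: how many times faster the simulation runs than required
margins = {name: SIMULATED_KHZ / required
           for name, required in required_khz.items()}

for name, margin in margins.items():
    print(f"{name}: {margin:.0f}x margin")
```

The DAC Controller has a 2x margin and the Flyback Controller a 20x margin, so the DAC Controller is the more sensitive of the two to a shortfall like the one observed for the NCL circuits in Tapeout 1.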
Figure 64. DAC Controller PEX Simulation Results

Figure 65. Flyback Controller PEX Simulation Results - Control and Test Signals
Figure 66. Flyback Controller PEX Simulation Results - Lambda On A Inputs

Figure 67. Flyback Controller PEX Simulation Results - Lambda On B Inputs
Figure 68. Flyback Controller PEX Simulation Results - Lambda On C Inputs

Figure 69. Flyback Controller PEX Simulation Results - Lambda On D Inputs
Figure 70. Flyback Controller PEX Simulation Results - Lambda Off A Inputs

Figure 71. Flyback Controller PEX Simulation Results - Lambda Off B Inputs
Figure 72. Flyback Controller PEX Simulation Results - Lambda Off C Inputs

Figure 73. Flyback Controller PEX Simulation Results - Lambda Off D Inputs
5. Conclusions

5.1. Summary

This dissertation presents a novel digital design methodology using SiC semiconductor material with synchronous and asynchronous digital design in a fully automated design flow. A total of two tapeouts were completed using this methodology. The first tapeout consisted of 16 circuits providing valuable test data in order to produce an operational digital controller for a fully integrated SiC intelligent gate driver. The circuits from the first tapeout have returned from fabrication and the results from physical testing have been presented. All circuits submitted in the first tapeout have shown logically-correct circuit operation for all test patterns over a wide range of temperatures. The circuits from the second tapeout are scheduled to return from fabrication in April 2015.

The results show that the circuits used in the first tapeout were not complex enough to show the limiting effects that hinder synchronous circuit operation, such as hold time violations. The circuit complexity is a function of the level of development in the design kit available for the process. Once SiC processing has had more time to mature, feature sizes begin to shrink, metallization options increase, and design kits become more stable, the theoretical analysis of the NCL circuit design paradigm will begin to align with measured circuit performance. The correct-by-construction aspects of the NCL circuitry, as well as its asynchronous handshaking protocols, will allow for autonomous speed adjustments as ambient temperatures vary greatly, thereby providing true average-case performance while also eliminating the need for complex control circuitry to maintain circuit operation.
5.2. Future Work

While this dissertation presents a design methodology that has resulted in fully functional operation from all circuits tested to date, there is more physical testing required to gather important information from these circuits. The NCL MAC, with the aid of new multi-contact wedge probes seen in Figure 74, will receive additional testing to provide performance data over temperature. Also, upon receipt of material from the foundry, detailed testing of the Tapeout 2 material at the circuit- and the system-level will be required. The data shows great promise for digital circuit design in SiC. In order to explore these exciting possibilities in this emerging technology further, more work should be done in other SiC processes to provide comparison data to the results from the HTSIC process.

Figure 74. Microscope View of NCL MAC Undergoing Testing with Multi-Contact Wedge Probes (Photograph by author)
6. References


