This article looks at how aggressive DTCO can advance 3-nm technology node design

BY PAUL MCLELLAN, EDA Industry Blogger
Cadence
www.cadence.com

One challenge that the semiconductor
industry faces is that traditional scaling is not enough on its own to drive
performance improvements and size reductions at advanced nodes. Although
semiconductor manufacturers make slightly different tradeoffs in their
design processes, the results end up broadly similar because they all use the
same equipment and materials. This article looks at how aggressive design-technology
co-optimization (DTCO) can advance 3-nm technology node design. The numbers for
an actual manufacturer’s process may not be exactly the same as those in this
example, but they will be close.

Here are the rules: From 5 nm to 3 nm,
the contacted poly pitch (CPP) can’t be reduced; it remains at 42 nm. However,
the metal pitch is scaled down from 32 nm to 21 nm. Overall, we want to keep on
track with Moore’s Law, which here means a 50% reduction in component area
with each new technology node. This cannot be achieved with the traditional
method, in which we scale the process and then toss the 50% smaller design
rules over the wall to the design groups. It requires aggressive DTCO, which
means making changes to the process that shrink the standard cells and
memories.
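
To make the shortfall concrete, the following is a minimal arithmetic sketch, not a model of any real process. It assumes that, for an unchanged track count, cell footprint scales as CPP times metal pitch; the function name `area_ratio` is illustrative and not from the article.

```python
# Sketch: how much area does pitch scaling alone recover from 5 nm to 3 nm?
# Assumption (not stated in the article): with the track count unchanged,
# cell footprint scales as CPP x metal pitch.

def area_ratio(cpp_old_nm, cpp_new_nm, mp_old_nm, mp_new_nm):
    """Return new_area / old_area for a cell with an unchanged track count."""
    return (cpp_new_nm / cpp_old_nm) * (mp_new_nm / mp_old_nm)

# Pitches from the example: CPP stays at 42 nm, metal pitch goes 32 nm -> 21 nm.
ratio = area_ratio(cpp_old_nm=42, cpp_new_nm=42, mp_old_nm=32, mp_new_nm=21)
pitch_only_reduction = 1 - ratio

print(f"area ratio from pitch scaling:  {ratio:.3f}")
print(f"reduction from pitch scaling:   {pitch_only_reduction:.1%}")
print(f"shortfall against a 50% target: {0.5 - pitch_only_reduction:.1%}")
```

Under that assumption, pitch scaling on its own delivers roughly a third of the area back, which leaves a gap of about 15 percentage points that has to come from somewhere other than lithography.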

From 7 nm down to 5 nm, pitch scaling
provides a 35% reduction. Adding a self-aligned gate contact (SAGC) provides
another 15%, reaching the desired 50% target while keeping a
6.5-track (T) standard cell library. But that’s a one-time gain. We have to
stick with SAGC going forward, but it doesn’t provide any additional scaling.
The designer needs to consider which process features and scaling boosters
could, for example, reduce the height of standard cells by cutting one or more
tracks.

With 6.5-T standard cell libraries, one-and-a-half
tracks are used for the power and ground, and the other five are
intermediate metal (MINT) tracks in the middle-end-of-line interconnect,
used for the signals. The first challenge is to cut the requirement down to
just four MINT tracks.
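
The track budget can be written down as a small sketch. It assumes that cell height is simply the track count multiplied by the metal pitch; the `TrackBudget` class and its field names are hypothetical and used only for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrackBudget:
    """Hypothetical helper describing how a cell's tracks are spent."""
    power_tracks: float  # tracks taken by the power and ground rails
    signal_tracks: int   # MINT tracks available for signals

    @property
    def total_tracks(self) -> float:
        return self.power_tracks + self.signal_tracks

    def height_nm(self, metal_pitch_nm: float) -> float:
        # Assumption: cell height = track count x metal pitch.
        return self.total_tracks * metal_pitch_nm

baseline = TrackBudget(power_tracks=1.5, signal_tracks=5)  # 6.5-T library
reduced = TrackBudget(power_tracks=1.5, signal_tracks=4)   # one MINT track removed

# Fraction of cell height saved by dropping a single signal track.
saving = 1 - reduced.total_tracks / baseline.total_tracks
print(f"{baseline.total_tracks}-T -> {reduced.total_tracks}-T "
      f"saves {saving:.1%} of cell height")
```

Removing one track out of 6.5 saves a little over 15% of the cell height, which is in line with the 15% step the example budgets for each track cut.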

The difficulty is the very tight cut
needed in the middle of every cell, where the lowest MINT layer must connect
the upper two tracks to each other and the lower two tracks to each other
without joining all four together. The process needs to be enhanced with a
spacer-defined cut to do that because traditional lithography can’t deliver the
precision necessary. That spacer-defined cut is one example of a scaling
booster.

But how low can the industry go? Is it
possible to get down to just three MINT tracks? If the designer sticks with the
standard two-fin transistors, the answer is no because there just isn’t room
for both a two-fin P-transistor and a two-fin N-transistor. A single-fin
FinFET would fit, but it doesn’t have enough current drive, so the resulting
performance would be inferior to the previous process generation, which is
going in the wrong direction.

However, there is a lot of promise in
a gate-all-around (GAA) transistor structure. The approach that most
manufacturers seem to be exploring is a triple-stacked nanosheet. A
triple-stacked nanosheet has three channels running through the gate that are
not circular but flattened into ovals (the name “sheet” makes it sound more
like a sheet of paper than a slightly flattened wire). This
approach offers more drive than the previous
generation in just the area of a single-fin FinFET. This has the potential
to let a designer eliminate one more track from the cell, but that requires
more tweaks to the process, in the form of another scaling booster.

As with the four-MINT-track standard cells,
there is a challenge in the center part of the cell when using a GAA transistor
structure. The designer needs to extend M0A to the middle of the cell so that
the P-transistor can be connected to the N-transistor. But the only way
to do this is to create some very tight staggered cuts, which require special
processing because both cuts can’t be made at the same time.

Gate-all-around (GAA) horizontal nanosheet.

The other problem with tiny cells like
this is that there may not be enough routing resources. With a few scaling
boosters, the cells are manufacturable. If a designer just looks at a single
cell, it appears that there are enough routing resources for the router to pick
up the signals. In reality, there aren’t, and routing
experiments show congestion. Another scaling booster comes to the
rescue: supervias. A supervia extends all the way from the poly layer up to the
first metal layer, skipping the MINT layers. With supervias and a little cell
redesign, the routing issue goes away.

One further possibility
doesn’t require any help from scaling boosters in the process: making
some cells double or even triple in height. This is especially important
for the more complex cells such as D flip-flops (DFFs), because with only
three MINT tracks the routing resources are so limited that these become very
long standard cells if implemented in a single row.

If the designer can implement DFFs and
the other big standard cells in double or even triple rows, the area used is
much smaller. Of course, the placer needs to step up to the challenge and have
the ability to place multi-height standard cells in the same area. The router
needs to be able to hook them up, but that is only a minor adjustment.

To get to 5 nm (although some companies
call this 7 nm, which can be confusing), the designer gets a reduction of
35% from traditional pitch scaling and 15% from placing the contact over the
active gate with the SAGC.

Then, to get down to 3 nm, the
designer again gets pitch scaling of 35%. But the designer needs to get the
other 15% reduction by cutting a track out of the cells and using supervias and
the spacer-defined cut. With FinFETs, that is probably the best that we can do.
But if the designer goes to a new device architecture (or can live with a
single fin), then another track can be eliminated to get a further 15%
reduction. The designer then ends up with a CPP of 42 nm, a metal pitch of 21
nm, and a 4.5-T standard cell library.
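
The end points can be compared with a short sketch. It assumes cell height equals track count times metal pitch and that, with CPP fixed at 42 nm, area per CPP-wide slice scales with cell height. It also compounds the reductions multiplicatively, whereas the budget in the text adds percentages, so the figures differ slightly. All names in the code are illustrative.

```python
CPP_NM = 42  # unchanged between the two nodes in the example

def slice_area_nm2(tracks, metal_pitch_nm, cpp_nm=CPP_NM):
    # Assumption: cell height = track count x metal pitch, and the area of a
    # one-CPP-wide slice of the cell is CPP x cell height.
    return cpp_nm * tracks * metal_pitch_nm

baseline_area = slice_area_nm2(tracks=6.5, metal_pitch_nm=32)  # 5-nm library

options_3nm = {
    "6.5-T (pitch scaling only)": 6.5,
    "5.5-T (one track removed)": 5.5,
    "4.5-T (two tracks removed)": 4.5,
}

reductions = {}
for name, tracks in options_3nm.items():
    area = slice_area_nm2(tracks=tracks, metal_pitch_nm=21)
    reductions[name] = 1 - area / baseline_area
    print(f"{name:28s} area reduction vs 5 nm: {reductions[name]:.1%}")
```

With compounding, the 5.5-T option lands somewhat below a 50% reduction and the 4.5-T option somewhat above it. This is only back-of-the-envelope arithmetic on the pitches quoted above, not a prediction for any manufacturer’s process.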

The industry is several years from seeing
3-nm processes become finalized, let alone ramping them to volume. But in the
meantime, experiments, including building test chips, can be done to help make
better-informed decisions. Unlike a decade ago, when the process was finalized
and then handed to the designers, the only way to keep scaling going is to
optimize the design and process together through DTCO because linear scaling
alone is not enough.

At 3 nm, it’s reasonable to guess that
there will be the usual 50% reduction in routed area and a 15% to 20% speed
improvement. Alternatively, the speed improvement could be traded for a large
reduction in power at the same performance.

It will be
interesting to watch advanced-node development continue to unfold.