Testbench Acceleration
Synthesizing e Testbenches Using Verisity's eCelerator
Abstract
Timeliness is one of the most crucial success factors in the electronic
design industry. Taping out a bug-free chip, on schedule, can make
the difference between a huge success and a complete disaster. Get
it right the first time in the designated time-to-market
window, and your product can become a best seller. Miss the time
window by a few months because of a respin, and you may completely
lose the market to your competitors.
The main bottleneck affecting your ability to release quality systems
when desired is functional verification. An efficient simulation-based
functional verification process combines the abilities to:
- Get the most verification from every simulation cycle
- Get as many simulation cycles as possible
Testbench automation tools automate the verification process by
using high-level, intelligent environments. Such environments provide
the tools to automate the verification process and ensure completeness,
but are highly dependent on the software simulator's speed.
Hardware-assisted verification is, in a sense, exactly the
opposite. Pure emulation environments use low-level testbenches
that are mapped into the hardware to achieve a high rate of cycles
per second.
This document introduces the tools and methodologies that integrate
these apparently contradicting approaches into a single verification
environment to provide a powerful, high-performance verification
process.
Simulation Versus Emulation
Simulation-based verification works by exercising a description
of the design on a general-purpose workstation. The idea of using
the current generation of computers to simulate and verify next
generation logic has been around since the 1950s and is prevalent
in ASIC design flows today. Most simulators are event-based; signal
changes potentially trigger chains of events, which are repeatedly
calculated until relaxation. After all the calculations have completed,
the simulator advances the clock to simulate the next cycle. Such
event-by-event frameworks are relatively easy to use. Traditional
simulation exercises the logic design mostly at the register transfer
level (RTL). After the RTL specifications are exercised and verified,
synthesis technologies convert the RTL model into a semantically
equivalent gate-level implementation. Using logic simulation to
exercise the design in RTL form has proved to be very efficient. The
setup time for such environments is minimal, the RTL code can be
built to be modular, and it is easy and cheap to apply a change
to the design at this early stage. Debugging the simulation is also
easy thanks to the many supporting tools, including waveform viewers
and source-level debuggers, which provide full visibility into the
design.
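The event-based evaluation loop described above can be illustrated with a small Python sketch. It is a toy model only, not how any particular simulator is implemented; the netlist layout and the names GATES and settle are assumptions made for this illustration.

```python
from collections import deque

# Hypothetical toy netlist: output signal -> (function, input signals).
GATES = {
    "n1": (lambda a, b: a & b, ("a", "b")),
    "n2": (lambda n1, c: n1 | c, ("n1", "c")),
    "out": (lambda n2: 1 - n2, ("n2",)),
}

def settle(values, changed):
    """Re-evaluate gates fed by changed signals until no signal changes."""
    # Map each signal to the gates that read it (its fanout).
    fanout = {}
    for out, (_, ins) in GATES.items():
        for sig in ins:
            fanout.setdefault(sig, []).append(out)
    queue = deque(changed)
    evaluations = 0
    while queue:                          # loop until relaxation
        sig = queue.popleft()
        for out in fanout.get(sig, []):
            fn, ins = GATES[out]
            new = fn(*(values[i] for i in ins))
            evaluations += 1
            if new != values.get(out):
                values[out] = new         # a value change is a new event
                queue.append(out)
    return evaluations

values = {"a": 0, "b": 0, "c": 0, "n1": 0, "n2": 0, "out": 1}
# One "cycle": apply new stimulus, then settle before advancing the clock.
values.update(a=1, b=1)
n_evals = settle(values, ["a", "b"])
print(values["out"], n_evals)
```

Each stimulus change costs several gate evaluations on the workstation, which is why the number of computations per cycle grows with design size.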
Testbench Automation
As design complexity increases, most verification managers estimate
that the verification effort can consume up to 70 percent of the
entire design cycle, even at a ratio of seven verification engineers
for every three designers. Testbench automation tools such as Specman
Elite address this functional verification bottleneck by automating
the verification process. High-level declarative languages drive
powerful verification engines, which drastically reduce the need
for writing and maintaining the huge amounts of procedural code that
might otherwise be required (if using HDL or even languages such
as C/C++). Specifically, Specman Elite automates test generation
and self-checking (including protocol checks), and uses functional
coverage analysis to ensure completeness. Once a testbench automation
tool is in place, ensuring complete and efficient verification,
the number of simulation cycles per second is the next major issue
to be addressed. This problem is getting more acute as designs get
larger, requiring more and more computations per cycle. Some companies
use simulation farms, comprising hundreds of CPUs to provide enough
cycles to complete thorough design verification. For some domains,
however, even a simulation farm approach is not adequate; individual
tests often require far more cycles than can be simulated in
a reasonable amount of time.

Figure 1: Specman Elite testbench automation tool.
Hardware-Assisted Verification
In the 1990s, hardware-assisted (HW) verification technologies evolved.
Advanced synthesis techniques were developed to map the RTL description
of the design under test (DUT) to standard FPGA chips or custom-programmable
devices. These hardware-assisted technologies can be further classified
into acceleration and emulation. Accelerators export only some of
the logic into the HW box, thus maintaining the same event-based
simulation. The downside of this approach is its performance, as
the logic that executes on the workstation slows down the entire
process. Emulators, in contrast, map the entire design into FPGAs
or other custom-programmable devices. With the entire DUT in the
HW box, the result is substantially faster execution. Emulation
also enables in-circuit operation, where real external hardware and
software can be connected to the emulator, verifying the DUT in
a real-life environment. However, some of the disadvantages
of emulation include:
- Having to dedicate valuable engineering time to adjust the design
to an emulated environment
- Requiring the HW environment that surrounds the design to be
up and running, including associated software
- Reduced debugging capabilities
- Lack of four-state logic support
These factors have pushed emulation to be mainly a back-end technology,
one that is not utilized during large parts of the verification
cycle.
As we've seen, simulation and emulation complement each other
in many ways. But is there a usage model that would incorporate
both into a single process?
Usage Today
Traditionally, simulation is used at the front end to clean up
the design. Emulation is only used later, after a significant part
of the testing is done, placing most of the burden upon simulation.
Thus, emulation alone does not necessarily shorten the time it takes
to do verification. The methodologies that have been used to run
the simulation and emulation processes in parallel can be categorized as:
- Having separate teams for each process
- Unifying both approaches into a single process
The most common way to use emulators today is to employ two different
teams that maintain one environment for simulation and yet another
for emulation. Often, the first environment is an intelligent testbench
that uses testbench automation technologies such as Specman Elite.
The emulation framework is designed to use either:
- Recorded stimuli captured during simulation
- Simple random test vectors created especially for it
- Real-life traffic generated by dedicated test equipment or a real
environment (e.g., a real network producing a stream of packets)
Exercising stimuli captured in simulation requires first running, in
software, each vector sent to the emulation testbench. Brute-force
vector creation, the second option above, can only supplement an
intelligent testbench. For example, the chances of exercising a
corner case like divide-by-zero are practically zero
if you randomly generate an operator and two real operands. The
third option above, the real-life approach, suffers
from a similar problem: the stimulus might test the design in a
typical environment, but it very seldom reaches rare corner cases
or stress conditions.
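The divide-by-zero argument can be checked with a short Python experiment. This is only an illustration: the operator set, the operand range, and the weighting in directed_op are assumptions, and the weighted choice is a rough stand-in for constraint-driven generation, not e syntax.

```python
import random

OPS = ["add", "sub", "mul", "div"]

def random_op(rng):
    """Unconstrained stimulus: any operator, two real-valued operands."""
    return rng.choice(OPS), rng.uniform(-1e3, 1e3), rng.uniform(-1e3, 1e3)

def directed_op(rng, zero_weight=0.1):
    """Constrained stimulus: bias the divisor toward the corner case."""
    op, a, b = random_op(rng)
    if op == "div" and rng.random() < zero_weight:
        b = 0.0
    return op, a, b

def count_div_by_zero(gen, n, seed=1):
    """Count how many of n generated operations divide by exactly zero."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        op, _a, b = gen(rng)
        if op == "div" and b == 0.0:
            hits += 1
    return hits

print("brute force:", count_div_by_zero(random_op, 100_000))
print("directed:   ", count_div_by_zero(directed_op, 100_000))
```

With purely random real operands the corner case is essentially never generated, while even a modest bias toward it produces the case many times in the same number of vectors.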
Methodologies that try to unify the two processes into one work
by either:
- Connecting a workstation to an emulator using an API-like
communication that most emulators provide, or
- Creating synthesizable testbenches
Using a software testbench in a hardware-assisted environment is
likely to create a major bottleneck. Engineers using a testbench
specifically designed for performance are likely to find that even
though their testbench consumes as little as 10% of the total simulation
time, they are still limited to, at most, 10x improvement in the
emulated environment.
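The 10x ceiling follows from Amdahl's law, as the following Python sketch shows. The function name and parameters are illustrative assumptions, and the model deliberately ignores communication overhead, which would lower the bound further.

```python
def max_speedup(testbench_fraction, dut_speedup=float("inf")):
    """Upper bound on overall speedup when only the DUT share is accelerated.

    testbench_fraction: share of the original run time spent in the software
    testbench, which stays on the workstation and is not accelerated.
    dut_speedup: factor by which the hardware box speeds up the DUT share.
    """
    dut_fraction = 1.0 - testbench_fraction
    return 1.0 / (testbench_fraction + dut_fraction / dut_speedup)

# Testbench takes 10% of simulation time; vary the hardware speedup.
for s in (10, 100, 1000, float("inf")):
    print(s, round(max_speedup(0.10, s), 2))
```

However fast the hardware becomes, the result approaches but never exceeds 10, because the 10% spent in the software testbench remains.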
The industry is striving to automate the verification process using
high-level testbenches and advanced verification engines. This is
in direct contradiction to an emulator's need for synthesizable
RTL testbenches. Writing testbenches under the restriction of using
only synthesizable HDL is likely to prolong the verification time
for most designs.
Can testbench automation tools, such as Specman Elite, truly be
part of a single methodology including acceleration/emulation?
The Verisity Solution
When analyzing the co-simulation of a workstation and a HW box,
we see that the overall processing time depends on all of
the following:
- Time spent in the software testbench running on the workstation
- Frequency of interaction between workstation and the HW box
- Time spent doing the interaction
- Time spent in the HW box
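The four contributions listed above can be combined into a simple additive model. The Python sketch below assumes the workstation and the HW box strictly alternate (no overlap), and every number in it is a made-up placeholder rather than a measurement.

```python
def cosim_seconds(t_testbench, n_interactions, t_per_interaction, t_hw_box):
    """Sum the four contributions (all in seconds, except n_interactions,
    which is a count of workstation <-> HW box interactions)."""
    return t_testbench + n_interactions * t_per_interaction + t_hw_box

# Made-up placeholder numbers, not measurements from any real system.
baseline = cosim_seconds(t_testbench=600.0, n_interactions=1_000_000,
                         t_per_interaction=1e-4, t_hw_box=5.0)
fewer_switches = cosim_seconds(600.0, 10_000, 1e-4, 5.0)
less_testbench = cosim_seconds(60.0, 1_000_000, 1e-4, 5.0)
print(baseline, fewer_switches, less_testbench)
```

In a model like this, the term that dominates the sum decides which optimization pays off first.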
By far the biggest bottleneck is caused by the testbench running
on the workstation. How can we reduce the time spent in the workstation?
Let's take another look at the way an advanced testbench behaves
and where the bottlenecks lie.

Figure 2: Activity level with respect to the DUT
The testbench in Figure 2 could represent a typical high-level
testbench. It contains both high-level activities (like end-to-end
checkers) as well as low-level bus functional models (BFMs). The
automation engines, such as constraint-driven generation, are not
frequently active throughout the simulation. The low-level BFMs,
protocol checkers and synchronization facilities, in contrast, are
constantly feeding the design and monitoring its behavior. This
accounts for most of the processing time consumed by the testbench.
The second largest bottleneck is the frequency of interaction.
How can we minimize the frequency of interactions between the HW
box and the workstation?
Switching back and forth between the hardware and the software
platforms hurts performance. In order to minimize such switching,
the HW box needs to execute as long as possible, uninterrupted,
passing control back to the workstation only when necessary. In
addition, the communication between the platforms must be buffered
so that large amounts of data can be passed in each context switch,
reducing further the number of required switches.
What about the time it takes to do an interaction?
To move big chunks of memory contents between the workstation and
the hardware box, we need to have good communication channels between
the HW box and the workstation.
Verisity's solution is a testbench acceleration methodology that
includes:
- Buffered transaction-based interface
  - Buffered port communication between the testbench and the HW box
  - Fast "physical" communication to the HW box
- Methodology
  - Modeling the testbench to minimize the interaction between the
    HW box and the workstation
- eCelerator, Verisity's e synthesis product
  - Synthesizes the frequent, time-consuming tasks of the e testbench
    to run on the HW
The transaction-based communication in Specman Elite works using
a new interface feature called a port. Buffering is
part of the port semantics, which means that no deliberate planning
is necessary on the user's side. The size of the port's
buffer is dynamic and set on a test-by-test basis. Ports also utilize
the fast physical communication to the HW box provided by the leading
acceleration and emulation vendors.
The right methodology starts by partitioning the testbench into
two major layers: high-level functions and low-level drivers/monitors.
The high-level layer consists of units containing asynchronous high-level
functions such as transaction generation, scoreboards and test orchestration.
The second, lower-level layer consists of units that implement the
drivers, protocol checkers, error handling and monitors. Communication
between the units in both layers is mediated by the same ports used
to communicate with the HW box.
The third part of the solution is eCelerator. The e testbench is fed
into eCelerator, which synthesizes the units that are designated by
the user for hardware simulation.
Figure 3 shows an example of a partitioned testbench for
a datacomm device before e synthesis. High-level
(and infrequent) methods generate the input transactions and maintain
a scoreboard for end-to-end checks. The transactions are then transferred
to low-level BFMs, which then process and inject the stimuli into
the DUT, following the required communication protocol. Temporal
checkers expressed in declarative temporal constructs in e
ensure protocol correctness. The BFMs that drive stimuli and monitor
the outputs of the DUT and the protocol checkers are the most active
components of the testbench.

Figure 3: Partitioned testbench before e synthesis

Figure 4: The same testbench in a testbench acceleration methodology.
Figure 4 shows the same testbench with a testbench acceleration
methodology. The single driver unit is split into two units, one
synthesized, and the other non-synthesized. The non-synthesized
unit contains the high-level functions, called time-consuming methods
(TCMs) in e. It avoids direct access to the
DUT and is decoupled from the DUT clock (to avoid a context switch
every clock cycle). The synthesized unit contains the low-level
BFMs and all protocol checkers, which will be accelerated within
the hardware box.
Buffered ports are used to communicate between these two units.
The high-level unit (executing on the workstation) will generate
a stream of transactions until the buffer is full, at which point
control will return to the HW box to resume execution. The HW box
will trigger a context switch back to the workstation only when
the buffer is empty again, or when a condition that requires interrupting
the execution is encountered. For example, when the low-level BFM
detects an interesting DUT state, the high-level testbench can be
invoked to generate a new input to test a corner case. This, in
essence, facilitates a powerful on-the-fly generation methodology,
allowing the testbench to efficiently drive the design to interesting
corner cases from multiple random paths, which is typical of a Specman
Elite methodology.
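The hand-off between the two units can be modeled in a few lines of Python. This is a behavioral sketch only: BufferedPort and run are hypothetical names and do not represent the Specman Elite port API, and transactions are reduced to random bytes.

```python
from collections import deque
import random

class BufferedPort:
    """Hypothetical stand-in for a buffered port; not the Specman Elite API."""
    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    def full(self):
        return len(self.items) >= self.depth

    def empty(self):
        return not self.items

def run(n_transactions, depth, is_interesting=lambda t: False, seed=0):
    """Return the number of workstation -> HW box hand-offs for one test."""
    rng = random.Random(seed)
    port = BufferedPort(depth)
    produced = consumed = switches = 0
    while consumed < n_transactions:
        # Workstation side: generate until the buffer is full.
        while not port.full() and produced < n_transactions:
            port.items.append(rng.randrange(256))
            produced += 1
        switches += 1                     # hand control to the HW box
        # HW side: run uninterrupted until empty or an interrupt condition.
        while not port.empty():
            t = port.items.popleft()
            consumed += 1
            if is_interesting(t):
                break                     # early return to the workstation
    return switches

print(run(1000, depth=1), run(1000, depth=100))
print(run(1000, depth=100, is_interesting=lambda t: t == 0))
```

A deeper buffer cuts the number of hand-offs, while an interrupt condition that fires on an interesting state adds hand-offs only when the high-level testbench actually needs to react.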
Buffered ports behave the same in both HW-assisted and pure software
execution. Thus, the verification engineer needs to maintain a single
environment for both simulation and emulation. A bug that was identified
during emulation can easily be reproduced by the same environment
using only a software simulator, where rich debugging capabilities
are available.
Synthesizable e is rich and includes such
things as data struct initialization, protocol checking, BFMs, and
basic list manipulation. Figure 5 demonstrates a synthesizable
checker written in e that will be accelerated
within the hardware box. It includes one expect definition
(temporal checker) and several event definitions. The first event
(TX_start) detects the beginning of a transmission by detecting
a rise of the appropriate HDL signal. The direct access to HDL signals
here is the same as in any e testbench. The
checker ensures that ten cycles after a packet transmission has
started, an acknowledge is received. If this rule is violated, the
co-simulation stops and the error message is printed. Such error
messages can include relevant information from the HW box to be
as descriptive as desired, thus saving substantial debug time. The
event intr_during_TX captures a simple yet important scenario,
where an interrupt occurs during transmission (before an acknowledge
was received). If this event is utilized in a generation constraint,
the high-level testbench will be invoked to steer the co-simulation
based on the new DUT state.
unit driver_checker_u {
    keep synthesized() == TRUE; // mark this unit to be synthesized

    event TX_start is rise('TXenable') @sim; // signal access

    expect @TX_start => {[10] * cycle; @ack} @clk
        else dut_error("A packet transmitted into the DUT should be ",
                       "acknowledged after exactly 10 clock cycles");

    // flag an interrupt during packet transmission
    event intr_during_TX is {@TX_start; [..] * not @ack; @interrupt} @clk;
};
Figure 5: Sample synthesizable e code
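For readers unfamiliar with e temporal expressions, the intent of the checker in Figure 5 can be approximated by a software reference model over a recorded trace. The Python sketch below follows the prose description (an acknowledge ten cycles after the rise of TXenable, and an interrupt before the first acknowledge); it is a simplification that does not reproduce the exact sampling semantics of e, and the trace format and function names are assumptions.

```python
ACK_DELAY = 10  # cycles, taken from the prose description of Figure 5

def check_trace(trace):
    """Return (error_cycles, intr_during_tx_cycles) for a per-clock trace.

    trace: list of dicts with 0/1 values for 'TXenable', 'ack', 'interrupt'.
    """
    errors, intr_cycles = [], []
    deadline = None      # cycle at which the acknowledge is expected
    in_tx = False        # True between TX_start and the first ack/interrupt
    prev_tx = 0
    for cycle, s in enumerate(trace):
        if s["TXenable"] and not prev_tx:         # rise of TXenable
            deadline, in_tx = cycle + ACK_DELAY, True
        else:
            if in_tx and s["ack"]:
                in_tx = False
            elif in_tx and s["interrupt"]:
                intr_cycles.append(cycle)         # interrupt before any ack
                in_tx = False
            if cycle == deadline:
                if not s["ack"]:
                    errors.append(cycle)          # ack missing on time
                deadline = None
        prev_tx = s["TXenable"]
    return errors, intr_cycles

def make_trace(length, tx_at, ack_at, intr_at=()):
    """Build a trace with single-cycle pulses at the given cycle numbers."""
    return [{"TXenable": int(c in tx_at), "ack": int(c in ack_at),
             "interrupt": int(c in intr_at)} for c in range(length)]

good = make_trace(20, tx_at={2}, ack_at={12})
late = make_trace(20, tx_at={2}, ack_at={13})
intr = make_trace(20, tx_at={2}, ack_at={12}, intr_at={5})
print(check_trace(good), check_trace(late), check_trace(intr))
```

A model of this kind runs only in software; the point of the synthesizable e version is that the same check executes inside the HW box at emulation speed.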
Figure 6 describes the testbench acceleration flow. The
testbench is partitioned and units are marked for synthesis using
simple constraints. The full testbench is then loaded onto eCelerator,
which synthesizes e (along with the RTL DUT)
into a Verilog module. After successful synthesis, the generated
Verilog files are then compiled, along with the RTL DUT, and loaded
into the emulator, following the HW box's specific guidelines.
Specman Elite then co-simulates efficiently with the HW box
via the buffered transaction-based interface.

Figure 6: Testbench Acceleration Flow.
Note that this flow preserves the ability to steer the environment
by loading specific test files on top of the compiled testbench.
This controls not only the tests that are generated, but also the
size of the buffered ports, directly impacting the overall performance.
Summary
Verisity's testbench acceleration methodology, which utilizes
eCelerator's synthesis technology, is
a breakthrough in hardware-assisted verification. A synthesized
e testbench provides a single environment,
single language and single methodology whether you use pure software
simulation or hardware-assisted verification. While preserving the
same methodology and advanced verification capabilities that made
Specman Elite the industry leader in testbench automation tools,
this solution increases the utilization and power of your emulation
technology investment and can be a significant advantage for you
over your competition.
Verisity Design, Inc.
331 E. Evelyn Ave., Mountain View, CA 94041
Phone: (650) 934-6800 Fax: (650) 934-6801
www.verisity.com
eCelerator, Specman Elite and the Verisity logo are
trademarks of Verisity Design, Inc. All other trademarks are the
exclusive property of their respective holders. ©2002 Verisity
Design, Inc.