Testbench Acceleration
Synthesizing e Testbenches Using Verisity's eCelerator


Abstract
Timeliness is one of the most crucial success factors in the electronic design industry. Taping out a bug-free chip, on schedule, can make the difference between a huge success and a complete disaster. Get it right “the first time” in the designated time-to-market window, and your product can become a best seller. Miss the time window by a few months because of a respin, and you may completely lose the market to your competitors.

The main bottleneck affecting your ability to release quality systems on schedule is functional verification. An efficient simulation-based functional verification process combines the abilities to:

  • Get the most verification from every simulation cycle
  • Get as many simulation cycles as possible

Testbench automation tools automate the verification process by using high-level, intelligent environments. Such environments help ensure completeness, but their throughput is highly dependent on the speed of the software simulator.

Hardware-assisted verification is, in a sense, exactly the opposite. Pure emulation environments use low-level testbenches that are mapped into the hardware to achieve a high rate of cycles per second.

This document introduces the tools and methodologies that integrate these apparently contradictory approaches into a single verification environment to provide a powerful, high-performance verification process.


Simulation Versus Emulation
Simulation-based verification works by exercising a description of the design on a general-purpose workstation. The idea of using the current generation of computers to simulate and verify next-generation logic has been around since the 1950s and is prevalent in ASIC design flows today. Most simulators are event-based: signal changes potentially trigger chains of events, which are repeatedly calculated until the design settles (relaxation). After all the calculations have completed, the simulator advances the clock to simulate the next cycle. Such event-by-event frameworks are relatively easy to use.

Traditional simulation exercises the logic design mostly at the register transfer level (RTL). After the RTL specifications are exercised and verified, synthesis technologies convert the RTL model into a semantically equivalent gate-level implementation. Using logic simulation to exercise the design in RTL form has proved to be very efficient. The setup time for such environments is minimal, the RTL code can be built to be modular, and it is easy and cheap to apply a change to the design at this early stage. Debugging the simulation is also easy thanks to a range of supporting tools, including waveform viewers and source-level debuggers, which provide full visibility into the design.


Testbench Automation
As design complexity increases, most verification managers estimate that the verification effort can consume up to 70 percent of the entire design cycle, even at a ratio of seven verification engineers for every three designers. Testbench automation tools such as Specman Elite address this functional verification bottleneck by automating the verification process. High-level declarative languages drive powerful verification engines, which drastically reduce the need to write and maintain the huge amounts of procedural code that might otherwise be required (when using HDL or even languages such as C/C++). Specifically, Specman Elite automates test generation and self-checking (including protocol checks), and uses functional coverage analysis to ensure completeness.

Once a testbench automation tool is in place, ensuring complete and efficient verification, the number of simulation cycles per second is the next major issue to be addressed. This problem is becoming more acute as designs get larger, requiring more and more computation per cycle. Some companies use simulation farms comprising hundreds of CPUs to provide enough cycles to complete thorough design verification. For some domains, however, even a simulation farm is not adequate: individual tests often require far more cycles than can be simulated in a reasonable amount of time.


Figure 1: Specman Elite testbench automation tool.


Hardware-Assisted Verification
In the 1990s, hardware-assisted (HW) verification technologies evolved. Advanced synthesis techniques were developed to map the RTL description of the design under test (DUT) to standard FPGA chips or custom-programmable devices. These hardware-assisted technologies can be further classified into acceleration and emulation. Accelerators export only some of the logic into the HW box, thus maintaining the same event-based simulation. The downside of this approach is its performance, because the logic that still executes on the workstation slows down the entire process. Emulators, in contrast, map the entire design into FPGAs or other custom-programmable devices. With the entire DUT in the HW box, the result is substantially faster execution. Emulation also enables in-circuit operation, where real external hardware and software can be connected to the emulator, verifying the DUT in a “real-life” environment. However, the disadvantages of emulation include:

  • Having to dedicate valuable engineering time to adjust the design to an emulated environment
  • Requiring the HW environment that surrounds the design to be up and running, including associated software
  • Reduced debugging capabilities
  • Lack of four-state logic support

These factors have made emulation mainly a back-end technology that is not used during large parts of the verification cycle.

As we’ve seen, simulation and emulation complement each other in many ways. But is there a usage model that would incorporate both into a single process?


Usage Today
Traditionally, simulation is used at the front end to “clean” the design. Emulation is used only later, after a significant part of the testing is done, which places most of the burden on simulation. Thus, emulation alone does not necessarily shorten verification time. The methodologies that have been used to run the simulation and emulation processes in parallel can be categorized as:

  • Having separate teams for each process
  • Unifying both approaches into a single process

The most common way to use emulators today is to employ two different teams that maintain one environment for simulation and yet another for emulation. Often, the first environment is an intelligent testbench that uses testbench automation technologies such as Specman Elite. The emulation framework is designed to use either:

  • Recorded stimuli captured during simulation
  • Simple “random” test vectors created especially for it
  • “Real life” traffic generated by dedicated test equipment or a real environment (e.g., a real network producing a stream of packets)

Exercising stimuli captured in simulation requires first running, in software, each vector that is sent to the emulation testbench. Brute-force vector creation, the second option above, can only supplement an intelligent testbench. For example, the chances of exercising a corner case like “divide by zero” are practically zero if you randomly generate an operator and two real operands. The third option above, the “real life” approach, suffers from a similar problem: the stimulus may exercise the design in a typical environment, but it very seldom reaches rare corner cases or stress conditions.
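To put a rough number on the "divide by zero" example, the following Python sketch estimates how rarely unconstrained random generation would hit that corner case. The operand width (32-bit integers rather than real values) and the set of four equally likely operators are illustrative assumptions, not figures from this paper; with real-valued operands the odds would be smaller still.

```python
from fractions import Fraction


def corner_case_probability(operand_bits: int = 32, num_operators: int = 4) -> Fraction:
    """Chance that one uniformly random vector is a divide with a zero divisor.

    Assumes (hypothetically) a uniform choice among `num_operators` operators,
    exactly one of which is divide, and a uniformly random divisor of
    `operand_bits` bits.
    """
    p_divide = Fraction(1, num_operators)       # operator happens to be divide
    p_zero = Fraction(1, 2 ** operand_bits)     # divisor happens to be zero
    return p_divide * p_zero


def expected_vectors(probability: Fraction) -> Fraction:
    """Mean number of random vectors needed before the corner case appears."""
    return 1 / probability


if __name__ == "__main__":
    p = corner_case_probability()
    print("probability per vector:", float(p))
    print("expected vectors needed:", int(expected_vectors(p)))
```

Under these assumptions the expected vector count is in the billions, which is why a constraint that directly targets the corner case is far more effective than brute force.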

Methodologies that try to unify the two processes into one work by either:

  • Connecting a workstation to an emulator using an API-like communication that most emulators provide, or
  • Creating synthesizable testbenches

Using a software testbench in a hardware-assisted environment is likely to create a major bottleneck. Even engineers using a testbench specifically designed for performance are likely to find that, although the testbench consumes as little as 10% of the total simulation time, they are limited to at most a 10x improvement in the emulated environment: the testbench still runs at workstation speed, so its share of the run time cannot shrink no matter how fast the DUT executes.
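The 10x ceiling is an instance of Amdahl's law. The Python sketch below shows the arithmetic; the function names and the example DUT speedup of 1000x are illustrative assumptions, and the model ignores communication overhead between the workstation and the HW box.

```python
def overall_speedup(testbench_fraction: float, dut_speedup: float) -> float:
    """Amdahl's-law estimate of the end-to-end speedup.

    testbench_fraction: share of the original simulation time spent in the
        software testbench, which is NOT accelerated (0.0 to 1.0).
    dut_speedup: factor by which the DUT portion runs faster in hardware.
    """
    if not 0.0 <= testbench_fraction <= 1.0:
        raise ValueError("testbench_fraction must be between 0 and 1")
    if dut_speedup <= 0:
        raise ValueError("dut_speedup must be positive")
    dut_fraction = 1.0 - testbench_fraction
    return 1.0 / (testbench_fraction + dut_fraction / dut_speedup)


def speedup_ceiling(testbench_fraction: float) -> float:
    """Upper bound on speedup as the DUT speedup grows without limit."""
    if testbench_fraction == 0.0:
        return float("inf")
    return 1.0 / testbench_fraction


if __name__ == "__main__":
    print(speedup_ceiling(0.10))            # ceiling for a 10% testbench
    print(overall_speedup(0.10, 1000.0))    # stays below that ceiling
```

With a 10% testbench share, even a thousandfold DUT speedup yields slightly less than a 10x overall gain, which is why the testbench itself has to be accelerated.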

The industry is striving to automate the verification process using high-level testbenches and advanced verification engines. This is in direct contradiction to an emulator’s need for synthesizable RTL testbenches. Writing testbenches under the restriction of using only synthesizable HDL is likely to prolong the verification time for most designs.

Can testbench automation tools, such as Specman Elite, truly be part of a single methodology including acceleration/emulation?


The Verisity Solution
When we analyze the co-simulation of a workstation and a HW box, we see that the overall processing time depends on all of the following:

  • Time spent in the software testbench running on the workstation
  • Frequency of interaction between workstation and the HW box
  • Time spent doing the interaction
  • Time spent in the HW box

By far the biggest bottleneck is caused by the testbench running on the workstation. How can we reduce the time spent in the workstation?

Let’s take another look at the way an advanced testbench behaves and where the bottlenecks lie.


Figure 2: Activity level with respect to the DUT

The testbench in Figure 2 could represent a typical high-level testbench. It contains both high-level activities (like end-to-end checkers) and low-level bus functional models (BFMs). The automation engines, such as constraint-driven generation, are active only infrequently during the simulation. The low-level BFMs, protocol checkers and synchronization facilities, in contrast, are constantly feeding the design and monitoring its behavior. Their activity accounts for most of the processing time consumed by the testbench.

The second largest bottleneck is the frequency of interaction. How can we minimize the frequency of interactions between the HW box and the workstation?

Switching back and forth between the hardware and the software platforms hurts performance. In order to minimize such switching, the HW box needs to execute as long as possible, uninterrupted, passing control back to the workstation only when necessary. In addition, the communication between the platforms must be buffered so that large amounts of data can be passed in each context switch, reducing further the number of required switches.

What about the time it takes to do an interaction?

To move big chunks of memory contents between the workstation and the hardware box, we need to have good communication channels between the HW box and the workstation.
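The four factors discussed above can be combined into a rough additive cost model. The Python sketch below is only an illustration: it assumes the workstation and the HW box never run concurrently, that every interaction costs the same, and all parameter names and numbers are hypothetical.

```python
def cosim_time(
    workstation_seconds: float,
    hw_seconds: float,
    num_interactions: int,
    seconds_per_interaction: float,
) -> float:
    """Estimate total co-simulation time as a simple sum of its parts.

    workstation_seconds: time spent in the software testbench.
    hw_seconds: time spent executing in the HW box.
    num_interactions: number of workstation/HW-box context switches.
    seconds_per_interaction: cost of a single switch, including data transfer.
    """
    if num_interactions < 0:
        raise ValueError("num_interactions must be non-negative")
    return workstation_seconds + hw_seconds + num_interactions * seconds_per_interaction


if __name__ == "__main__":
    # Hypothetical run: switching every clock cycle versus once per 1000 cycles.
    per_cycle = cosim_time(5.0, 1.0, 1_000_000, 0.0001)
    buffered = cosim_time(5.0, 1.0, 1_000, 0.0001)
    print(per_cycle, buffered)
```

Even in this crude model, the number of interactions multiplies the per-interaction cost, so reducing either term attacks the same product.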

Verisity’s solution is a testbench acceleration methodology that includes:

  • Buffered transaction-based interface
    - Buffered port communication between the testbench and the HW box
    - Fast "physical" communication to the HW box
  • Methodology
    - Modeling the testbench to minimize the interaction between the HW box and the workstation
  • eCelerator™ – Verisity’s e synthesis product
    - Synthesizes the frequent, time-consuming tasks of the e testbench to run on the HW

The transaction-based communication in Specman Elite works using a new interface feature called a “port”. Buffering is part of the port semantics, which means that no deliberate planning is necessary on the user’s side. The size of the port’s buffer is dynamic and set on a test-by-test basis. Ports also utilize the fast physical communication to the HW box provided by the leading acceleration and emulation vendors.

The right methodology starts by partitioning the testbench into two major layers: high-level functions and low-level drivers/monitors. The high-level layer consists of units containing asynchronous high-level functions such as transaction generation, scoreboards and test orchestration. The low-level layer consists of units that implement the drivers, protocol checkers, error handling and monitors. Communication between the units in the two layers is mediated by the same ports used to communicate with the HW box.

The third part of the solution is eCelerator™. The e testbench is fed into eCelerator, which synthesizes the units that are designated by the user for hardware simulation.

Figure 3 shows an example of a partitioned testbench for a datacomm device before e synthesis. High-level (and infrequent) methods generate the input transactions and maintain a scoreboard for end-to-end checks. The transactions are then transferred to low-level BFMs, which process them and inject the stimuli into the DUT, following the required communication protocol. Temporal checkers, expressed in declarative temporal constructs in e, ensure protocol correctness. The BFMs that drive stimuli and monitor the outputs of the DUT, together with the protocol checkers, are the most active components of the testbench.


Figure 3: Partitioned testbench before e synthesis


Figure 4: The same testbench in a testbench acceleration methodology.


Figure 4 shows the same testbench with a testbench acceleration methodology. The single driver unit is split into two units, one synthesized and the other non-synthesized. The non-synthesized unit contains the high-level functions, called time-consuming methods (TCMs) in e. It avoids direct access to the DUT and is decoupled from the DUT clock (to avoid a context switch every clock cycle). The synthesized unit contains the low-level BFMs and all protocol checkers, which will be accelerated within the hardware box.

Buffered ports are used to communicate between these two units. The high-level unit (executing on the workstation) will generate a stream of transactions until the buffer is full, at which point control will return to the HW box to resume execution. The HW box will trigger a context switch back to the workstation only when the buffer is empty again, or when a condition that requires interrupting the execution is encountered. For example, when the low-level BFM detects an interesting DUT state, the high-level testbench can be invoked to generate a new input to test a corner case. This, in essence, facilitates a powerful on-the-fly generation methodology, allowing the testbench to efficiently drive the design to interesting corner cases from multiple random paths, which is typical of a Specman Elite methodology.
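The effect of buffering on the number of context switches can be seen in a small behavioral model. The Python sketch below is not the Specman Elite port API; it is a hypothetical model in which the workstation fills a bounded buffer, hands control to the HW box, and regains control only when the buffer is empty. The interrupt-on-interesting-state path described above is deliberately left out.

```python
from collections import deque


def run_cosim(num_transactions: int, buffer_size: int) -> tuple[int, int]:
    """Simulate the buffered handoff and count workstation-to-HW handoffs.

    Returns (transactions_consumed, handoffs).
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")

    buffer: deque[int] = deque()
    produced = consumed = handoffs = 0

    while consumed < num_transactions:
        # Workstation side: generate transactions until the buffer is full
        # or there is nothing left to generate.
        while len(buffer) < buffer_size and produced < num_transactions:
            buffer.append(produced)
            produced += 1

        # Control passes to the HW box.
        handoffs += 1

        # HW box side: run uninterrupted until the buffer is empty,
        # at which point control returns to the workstation.
        while buffer:
            buffer.popleft()
            consumed += 1

    return consumed, handoffs


if __name__ == "__main__":
    for size in (1, 64):
        print(size, run_cosim(1000, size))
```

In this model the number of handoffs falls roughly in proportion to the buffer size, which is the reason the port buffer size is worth tuning per test.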

Buffered ports behave the same in both HW-assisted and pure software execution. Thus, the verification engineer needs to maintain a single environment for both simulation and emulation. A bug that was identified during emulation can easily be reproduced by the same environment using only a software simulator, where rich debugging capabilities are available.

Synthesizable e is rich and includes such things as struct data initialization, protocol checking, BFMs, and basic list manipulation. Figure 5 demonstrates a synthesizable checker written in e that will be accelerated within the hardware box. It includes one expect definition (temporal checker) and several event definitions. The first event (TX_start) detects the beginning of a transmission by detecting a rise of the appropriate HDL signal. The direct access to HDL signals here is the same as in any e testbench. The checker ensures that ten cycles after a packet transmission has started, an acknowledge is received. If this rule is violated, the co-simulation stops and the error message is printed. Such error messages can include relevant information from the HW box to be as descriptive as desired, thus saving substantial debug time. The event intr_during_TX captures a simple yet important scenario, where an interrupt occurs during transmission (before an acknowledge is received). If this event is used in a generation constraint, the high-level testbench will be invoked to steer the co-simulation based on the new DUT state.

unit driver_checker_u {
      keep synthesized() == TRUE; // mark this unit to be synthesized

      event TX_start is rise ('TXenable') @sim; // signal access

      // the events ack, interrupt and clk are assumed to be defined elsewhere
      expect @TX_start => {[10] * cycle; @ack} @clk
      else dut_error("A packet transmitted into the DUT should be ",
                     "acknowledged after exactly 10 clock cycles");

      // flag an interrupt during packet transmission
      event intr_during_TX is
                  {@TX_start; [..] * not @ack; @interrupt} @clk;
};


Figure 5: Sample synthesizable e code


Figure 6 describes the testbench acceleration flow. The testbench is partitioned and units are marked for synthesis using simple constraints. The full testbench is then loaded into eCelerator, which synthesizes e (along with the RTL DUT) into a Verilog module. After successful synthesis, the generated Verilog files are compiled, along with the RTL DUT, and loaded into the emulator, following the HW box’s specific guidelines. Specman Elite then co-simulates efficiently with the HW box via the buffered transaction-based interface.


Figure 6: Testbench Acceleration Flow.

Note that this flow preserves the ability to steer the environment by loading specific test files on top of the compiled testbench. These test files control not only the tests that are generated but also the size of the buffered ports, which directly affects overall performance.


Summary
Verisity’s testbench acceleration methodology, which utilizes eCelerator’s synthesis technology, is a breakthrough in hardware-assisted verification. A synthesized e testbench provides a single environment, single language and single methodology whether you use pure software simulation or hardware-assisted verification. While preserving the same methodology and advanced verification capabilities that made Specman Elite the industry leader in testbench automation tools, this solution increases the utilization and power of your emulation technology investment and can be a significant advantage for you over your competition.

 

Verisity Design, Inc.  •  331 E. Evelyn Ave  •  Mountain View, CA  94041  •  phone: (650) 934-6800  •  Fax: (650) 934-6801  •  www.verisity.com

eCelerator, Specman Elite and the Verisity logo are trademarks of Verisity Design, Inc. All other trademarks are the exclusive property of their respective holders. ©2002 Verisity Design, Inc.
