ReConfigurable Computing (RCC) For Logic Simulation
The enabling technology for high performance simulation with advanced debugging capabilities

The Design Challenge

The electronics industry is undergoing a series of transformations which will shape how new products are designed, manufactured and used. Fueled by advancements in semiconductor manufacturing capabilities, System-On-Chip (SoC) has become a reality, and it brings ever increasing demands for products with high performance, capacity and reliability.

While SoC implementation is extremely appealing, the design process is exceedingly challenging. Traditional design tools have not kept pace with designs at this level. In particular, the biggest design bottleneck is design verification, which may consume 50 to 70 percent of total design time.

Varying Verification Methodologies

The traditional verification process combines various point tools, each specific to a fixed task; no single tool has yet covered the entire verification process by providing a complete solution from behavioral simulation to hardware/software co-verification. In addition, most product development teams are forced to separate into design and verification groups, both because the front-end and back-end verification tools are conceptually different and because of the extensive learning needed to run all of the tools.

All of this is about to change with the advent of Axis Systems' proprietary ReConfigurable Computing (RCC) technology applied to functional verification.


Current Verification Technologies

In a traditional event simulation algorithm, each event is processed by a simulation kernel running on a central microprocessor. Each output event has to be propagated to all of its fanout connections until the circuit stabilizes with no additional changes. This is mainly a sequential process, and its performance is limited by the tremendous number of event evaluations the microprocessor must process. As a result, when designs migrate into the one million gate range, functional verification by event simulation is extremely time consuming. Software simulation offers by far the highest level of flexibility but sacrifices simulation performance. Hardware accelerators try to resolve the serial execution problem with custom ASICs dedicated to event processing. However, hardware accelerators are designed to simulate exclusively at the gate level with timing. The design process for a hardware accelerator ASIC usually lags behind advances in microprocessor technology, resulting in a poor speedup factor in comparison with software simulation on workstations.
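The event propagation loop described at the start of the paragraph above can be sketched in a few lines of Python. This is only an illustration of the general event-driven idea, not the kernel of any particular simulator; the three-gate netlist, the net names and the `simulate` function are all hypothetical.

```python
from collections import deque

# Hypothetical netlist: gate name -> (function, input nets, output net).
GATES = {
    "g1": (lambda a, b: a & b, ("a", "b"), "n1"),
    "g2": (lambda x: 1 - x, ("n1",), "n2"),
    "g3": (lambda x, c: x | c, ("n2", "c"), "out"),
}

def build_fanout(gates):
    """Map each net to the gates that read it (its fanout)."""
    fanout = {}
    for name, (_, inputs, _) in gates.items():
        for net in inputs:
            fanout.setdefault(net, []).append(name)
    return fanout

def simulate(gates, values, changed_nets):
    """Propagate events one at a time until no net changes.

    Returns the number of gate evaluations performed, which is the
    sequential work a single microprocessor has to do.
    """
    fanout = build_fanout(gates)
    queue = deque(changed_nets)
    evaluations = 0
    while queue:
        net = queue.popleft()
        for gate_name in fanout.get(net, []):
            func, inputs, output = gates[gate_name]
            new_value = func(*(values[i] for i in inputs))
            evaluations += 1
            if new_value != values[output]:
                values[output] = new_value
                queue.append(output)  # new event for the next fanout level
    return evaluations
```

Starting from a stable state with `a=0, b=1, c=0`, changing `a` to 1 and calling `simulate(GATES, values, ["a"])` evaluates the three gates one after another. Every event passes through the same queue, which is why the approach scales poorly as gate counts grow.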

For hardware prototyping, emulation technology provides a way to plug designs under test directly into hardware systems. Emulation uses an array of FPGA chips interconnected by crossbar, partial crossbar or other hardware interconnect switches to duplicate design behavior. System speed can reach up to 1 MHz. The emulation step usually occurs at the end of the design cycle and supports designs only at the gate level, with loose integration to RTL and behavioral models. In addition, debugging a circuit in emulation mode requires logic analyzers, which are difficult to set up and provide an inefficient debugging environment.

Both hardware accelerator and emulation technology lack support for RTL and behavioral constructs. For RTL designs, both technologies rely on a logic synthesis process that translates RTL code into gates. However, the correlation between the RTL and the gate level is lost during synthesis, and the designer is forced to debug at the gate level, which is extremely ineffective. It is comparable to software developers writing programs in C and then having to debug them at the assembly level.

ReConfigurable Computing (RCC), the most advanced of these technologies, accelerates simulation by orders of magnitude with the same flexibility as software simulators. For each design, a custom RTL co-processor is constructed from a combination of RCC computing elements and an RCC microsequencer. The RCC computing elements are custom single-instruction processors, each dedicated to a single function, and their types closely follow the constructs of the RTL design language. Control of the computing elements follows the Single Instruction Multiple Data (SIMD) parallel processing paradigm, with high speed communication to the workstation microprocessor. Thus RCC accelerates simulation at all language levels without design modifications while preserving the original simulation debugging environment.

Table A summarizes the comparison of these simulation technologies.

Table A: Comparison Summary of Different Verification Technologies

| | RCC Simulator | Hardware Emulation | Hardware Accelerator |
| --- | --- | --- | --- |
| Target Application | Functional simulation and hardware/software co-verification | In-circuit emulation | Gate level simulation with timing capability |
| Usage Model | Behavior/RTL/gate. Transparent compile into RCC. | Gate level emulation/prototype. One-to-two month setup. | Gate level simulation. Tedious timing library conversion. |
| Speed | 10K-100K cycles/sec | 200K-1000K cycles/sec | 0.5K-1K cycles/sec |
| Underlying Technology | ReConfigurable Computing with tight integration of software/hardware. Accelerates simulation using hundreds of thousands of RCC computing elements. Transparently maps RTL/gate into computing elements. | Hardware prototyping. Wire for wire and gate for gate. Resolution of setup and hold timing issues is very complex. | Custom ASIC processor designed for event simulation processing. Number of processors is in the hundreds. |
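To put the speed figures from Table A in perspective, the short calculation below converts each cycles-per-second range into wall-clock time for a fixed workload. The speed ranges are taken from the table; the workload of 100 million cycles and the function name `runtime_seconds` are assumptions chosen purely for illustration.

```python
# Speed ranges from Table A, in simulated cycles per second (low, high).
SPEED_RANGES = {
    "RCC simulator": (10_000, 100_000),
    "Hardware emulation": (200_000, 1_000_000),
    "Hardware accelerator": (500, 1_000),
}

def runtime_seconds(cycles, speed_range):
    """Return (best_case, worst_case) run time in seconds for a workload."""
    low, high = speed_range
    return cycles / high, cycles / low

if __name__ == "__main__":
    workload = 100_000_000  # assumed test length, not a figure from the paper
    for name, speeds in SPEED_RANGES.items():
        best, worst = runtime_seconds(workload, speeds)
        print(f"{name}: {best / 3600:.2f} to {worst / 3600:.2f} hours")
```

Under this assumed workload, the RCC range works out to between 1,000 and 10,000 seconds, against 100,000 to 200,000 seconds for the hardware accelerator range.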


ReConfigurable Computing (RCC) Technology

ReConfigurable computing has been in the research arena for the last ten years, with varying applications. Early adopters of reconfigurable technology have been in military and US government operations, for encryption and decryption.

In a traditional microprocessor based computing model, the user exploits the microprocessor's static resources to solve a particular problem. If the available instructions are not ideal for the problem, efficiency is lost through longer execution time. In addition, programs written for microprocessor based systems are usually executed sequentially, with minimal parallelism.

In contrast to a general purpose microprocessor, RCC configures the hardware structure to match the algorithm and selects the best resources for a particular task with maximum parallelism. For example, if a particular algorithm can take advantage of six arithmetic logic units with addition and subtraction as their only instructions, RCC will select the hardware resource structure with those attributes for maximum efficiency.

RCC comes in two flavors: static and dynamic. Static RCC has predetermined resources that remain fixed during execution. The resource allocation is performed at compile time, when the algorithm is analyzed. Dynamic RCC, on the other hand, covers the case where different parts of an algorithm require different resources during execution. Depending on the current point of execution, different resources are swapped in as needed. For example, if a different arithmetic logic unit is needed partway through an RCC program to run the algorithm efficiently, dynamic RCC will swap in the needed resource at run time, while static RCC will have loaded the predetermined resources before execution.

Applications of RCC technology have not been widespread until now, mainly due to the previously low capacity of programmable logic devices. Today, with the introduction of high capacity programmable logic devices such as the Altera 10K-250 and new algorithms to map RCC elements onto multiple programmable logic devices, Axis Systems has applied this technology to accelerate functional verification by orders of magnitude while preserving the original debugging environment.
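The difference between the two flavors can be illustrated with a toy resource planner. This is a minimal sketch under simple assumptions: a program is modeled as a list of phases, each needing a set of named hardware units, and the names `static_plan` and `dynamic_plan` are hypothetical. It does not describe how any real RCC compiler allocates resources.

```python
def static_plan(phases):
    """Static RCC: one fixed configuration, decided at compile time,
    that covers the needs of every phase of the program."""
    return set().union(*phases)

def dynamic_plan(phases):
    """Dynamic RCC: for each phase, list the units that must be swapped in
    because they are not already loaded from the previous phase."""
    loaded = set()
    swaps = []
    for needed in phases:
        swaps.append(sorted(needed - loaded))
        loaded = set(needed)  # only the units for the current phase stay loaded
    return swaps

if __name__ == "__main__":
    program = [{"adder"}, {"adder", "multiplier"}, {"shifter"}]
    print(sorted(static_plan(program)))   # everything loaded before execution
    print(dynamic_plan(program))          # units swapped in phase by phase
```

The static plan pays for all three units up front, while the dynamic plan holds at most two units at a time at the cost of reconfiguring during execution.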


Functional Simulation using RCC Technology on an RTL Co-Processor

To take full advantage of RCC technology, the best suited algorithms are those that can be massively parallelized through construction of a specialized co-processor. Functional verification naturally falls into this category, since evaluation of RTL and gate level constructs can be accelerated with a massively parallel RCC co-processor.

Current System-On-Chip (SoC) design methodology involves describing the system in a hardware description language (e.g. Verilog or VHDL). Language constructs fall into three categories: behavioral, RTL and gate. Behavioral constructs usually describe the system testbench and are most efficiently simulated on a microprocessor because they execute serially and access network and hard drive resources. RTL and gate level constructs describe the design and can be compiled into the RCC architecture.
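The split described above amounts to a partitioning step: behavioral constructs stay on the microprocessor, while RTL and gate level constructs go to the RCC hardware. The sketch below shows that decision only; the `partition` function and the construct names in the example are hypothetical, and a real compiler would classify constructs by analyzing the source rather than reading a label.

```python
def partition(constructs):
    """Split (name, level) pairs into software and hardware targets.

    Behavioral constructs are kept for the software simulator;
    RTL and gate level constructs are assigned to the RCC hardware.
    """
    software, hardware = [], []
    for name, level in constructs:
        if level == "behavioral":
            software.append(name)
        elif level in ("rtl", "gate"):
            hardware.append(name)
        else:
            raise ValueError(f"unknown construct level: {level!r}")
    return software, hardware

if __name__ == "__main__":
    design = [
        ("testbench_file_io", "behavioral"),
        ("alu_always_block", "rtl"),
        ("pad_buffer_cell", "gate"),
    ]
    print(partition(design))
```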

Behavioral constructs are usually written in a sequential execution format with calls to system resources such as the network or hard disk. They are extremely difficult to parallelize, and the microprocessor is the best resource for simulating their diverse instructions. In contrast, RTL and gate level constructs are written for parallel execution. Each RTL or gate level statement can be mapped onto a computing element specifically designed to execute that statement efficiently. The RCC architecture for functional verification achieves its high speed through a co-processor containing a massively parallel structure of computing elements specially configured for each design. A computing element is a small, compact processor dedicated to performing one function. For example, Axis Systems has designed a custom computing element to simulate Verilog RTL "case" and "if" statements.

When executing, the RCC co-processor obtains instructions and data from the microprocessor and sends the execution command to the SIMD controller, which sequences the evaluation and communication of all RCC computing elements. The controller's next step is to collect the evaluation results from the computing elements, pack them into a data stream and send the resulting data back to the microprocessor to continue simulation.

By mapping the design's RTL constructs onto custom interconnected computing elements, the RCC hardware is programmed for maximum execution performance for each design being verified. With its proprietary systolic array interconnection architecture, communication between computing elements and between multiple devices is fast and efficient. Figure 1 shows an architecture block diagram of an RTL co-processor using ReConfigurable Computing. The diagram illustrates the co-processor controller as well as the RCC computing elements.
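As a rough software analogy for a computing element dedicated to one function, the class below models a Verilog "case" statement as a selector-driven lookup. The internal design of the actual Axis Systems computing elements is not described here, so the class name `CaseElement` and its interface are assumptions made purely for illustration.

```python
class CaseElement:
    """Toy model of a computing element dedicated to one 'case' statement.

    Corresponds to Verilog of the form:
        case (sel)
          2'b00:   y = a;
          2'b01:   y = b;
          default: y = c;
        endcase
    """

    def __init__(self, branches, default):
        self.branches = dict(branches)  # selector value -> source signal name
        self.default = default          # source signal for the default branch

    def evaluate(self, selector, signals):
        """Return the value of the signal chosen by the selector."""
        source = self.branches.get(selector, self.default)
        return signals[source]

if __name__ == "__main__":
    element = CaseElement({0: "a", 1: "b"}, default="c")
    signals = {"a": 5, "b": 7, "c": 9}
    print(element.evaluate(1, signals))  # selects b
    print(element.evaluate(3, signals))  # falls through to the default, c
```

Because the element does exactly one thing, many such elements can be evaluated side by side, one per RTL statement, instead of sharing a single general purpose processor.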
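The evaluate, collect and pack sequence described above can be mimicked in software as follows. This is a sketch under stated assumptions: each computing element is modeled as a plain Python callable, the results are assumed to fit in unsigned 32-bit words, and the function names `run_cycle` and `unpack_stream` are hypothetical. On real hardware the elements evaluate in parallel; the list comprehension here only stands in for that step.

```python
import struct

def run_cycle(elements, inputs):
    """One co-processor cycle: issue the same 'evaluate' command to every
    element, collect the results, and pack them into a byte stream."""
    results = [element(inputs) for element in elements]
    # Little-endian unsigned 32-bit words, one per computing element.
    return struct.pack(f"<{len(results)}I", *results)

def unpack_stream(stream):
    """Host side: recover the per-element results from the byte stream."""
    count = len(stream) // 4
    return list(struct.unpack(f"<{count}I", stream))

if __name__ == "__main__":
    elements = [
        lambda s: s["a"] & s["b"],
        lambda s: s["a"] | s["b"],
        lambda s: s["a"] ^ s["b"],
    ]
    stream = run_cycle(elements, {"a": 6, "b": 3})
    print(unpack_stream(stream))
```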


Figure 1: Architecture of an RTL co-processor using RCC


Axis Systems Xcite Simulator
First commercial RCC application for Electronic Design Automation

Axis Systems has built the first commercial functional verification system using RCC architecture. Designed for compactness and high performance, the Axis Systems Xcite® engine uses an array of the highest density Altera programmable chips. The Xcite RCC engine connects directly to a Sun Microsystems workstation, with high bandwidth communication via the PCI bus. Despite its small form factor, Xcite does not sacrifice design capacity. With current capacity of up to 10 million gates, Xcite offers the best price/performance advantage in simulation.

To accelerate simulation throughout the design process, the Xcite product family includes a software simulator, an RCC compiler and an RCC hardware engine. Whether the design is described using behavioral, RTL or gate level constructs, the RCC compiler compiles RTL constructs directly into RCC computing elements to be loaded onto the RCC hardware engine, while behavioral constructs are natively compiled into the Xcite software simulator. Thus, designers can use Xcite products throughout the complete design process, from architecture design using the Xcite software simulator to software/hardware co-verification using the Xcite RCC engine.


Figure 2: Xcite RTL Language Compiler Architecture

Unlike traditional RTL support, Xcite compiles RTL designs into a custom built RTL co-processor with RCC computing elements. This approach is illustrated in Figure 2. By avoiding the costly synthesis procedure, Xcite preserves RTL debugging, so designers do not have to diagnose problems at the gate level.

Since its inception, Xcite has been designed to maximize debuggability. With its proprietary technique of swapping simulation state between the software simulator and RCC, the user can accelerate as fast as possible on the RCC engine up to the point of error and then swap the RCC simulation state back into the software simulator. Once the state is in the software simulator, all software controls and internal states are available for inspection. As a result, Xcite combines the best of two worlds: RCC level hardware acceleration along with software simulation debuggability.
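The workflow of running fast to the point of error and then handing the state over for inspection can be sketched as below. The actual swap technique is proprietary and is not described in this paper, so everything here is a stand-in: the `Engine` class, its toy counter design, and the `run_until_error` helper are hypothetical, and both engines are ordinary Python objects.

```python
class Engine:
    """Toy simulation engine whose whole state is a small dictionary."""

    def __init__(self, name):
        self.name = name
        self.state = {"cycle": 0, "counter": 0}

    def step(self):
        """Advance the toy design by one cycle (a 4-bit counter stepping by 3)."""
        self.state["cycle"] += 1
        self.state["counter"] = (self.state["counter"] + 3) % 16

    def export_state(self):
        return dict(self.state)

    def import_state(self, snapshot):
        self.state = dict(snapshot)

def run_until_error(fast, slow, is_error, max_cycles):
    """Run on the fast engine; at the first error, copy its state into the
    slow (fully inspectable) engine and return that engine."""
    for _ in range(max_cycles):
        fast.step()
        if is_error(fast.state):
            slow.import_state(fast.export_state())
            return slow
    return None  # no error found within max_cycles

if __name__ == "__main__":
    rcc, software = Engine("rcc"), Engine("software")
    found = run_until_error(rcc, software, lambda s: s["counter"] == 15, 100)
    print(found.name, found.state)
```

The key property the sketch tries to capture is that the inspecting engine resumes from exactly the state where the fast engine stopped, so no cycles need to be replayed in slow mode.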

Figure 3: Xcite Instantaneous Swap Between Software Simulator and RCC


The Ultimate Verification System

ReConfigurable Computing is the newest technology to be applied to design verification. Despite more than ten years of investment in research and development, its applications did not flourish until the advent of large capacity programmable logic chips.

The ReConfigurable Computing architecture for functional simulation is vastly different from traditional verification methods. RCC follows the Single Instruction Multiple Data (SIMD) parallel processing paradigm by mapping every design into its own unique interconnection of hundreds of thousands of RCC computing elements. Instead of performing logic synthesis from higher level constructs into logic gates, the RCC compiler maps each RTL statement into RCC computing elements, which preserves debuggability and keeps compile time short. Further, RCC uses a proprietary systolic interconnect technology which improves the efficiency of communication between computing elements.

Axis Systems Xcite is the first commercial functional verification system using RCC technology. With its high capacity and compact form, Xcite connects directly to the workstation. Xcite fits directly into the existing design methodology by combining a software simulator with the RCC hardware engine. To improve debugging effectiveness, Xcite offers instantaneous simulation state swap between the software simulator and the RCC hardware engine. With this capability, it offers the best of two worlds: software simulation flexibility along with high speed RCC simulation.

Make the winning choice with RCC technology.

© Copyright 2005 Verisity Design, Inc. All rights reserved.