Speeding FPGA Prototype Debug Process with Active Debug and Full Visibility

By Joe Gianelli and Tom Huang

System integration continues to drive the semiconductor design market. This is most obvious when looking at the increased system integration associated with System on Chip (SoC) design over the last few years. Integrating complex hardware features with complex software applications onto one silicon device makes the validation process for today’s SoC designs a tricky one to say the least.

FPGA prototype systems have become increasingly popular as an aid in this complex validation process. They run extremely fast, almost as fast as the production SoC, and have doubled in capacity every 18 months for the last 5 years. They also enable real-world system interfaces such as DDRAM, PCI, and Ethernet, and they support high-speed serial interfaces running at over 10 Gb/s.

Despite these strides in speed, capacity, and real-world high-speed interfaces, using FPGA devices to help verify and validate SoC designs is difficult at best because of the many long FPGA place-and-route (P&R) compiles and poor visibility. InPA Systems proposes to address these issues with its active debug and full visibility technology.

Issues Plaguing FPGA Prototyping
FPGA prototyping has become a standard SoC validation vehicle. However, system faults are very difficult to track down because today’s tedious and complicated SoC debug process lacks supporting methodology and EDA technology.

Figure: FPGA Prototype Debug Process (InPA Systems)

As the diagram above shows, the current SoC prototyping flow produces system faults that arise when hardware and software are integrated. Problems can start with the mapping itself: fitting the SoC design onto a multi-FPGA prototype board requires somewhat arbitrary design partitions. It may take a few weeks to map the SoC into the FPGAs, and the chance that the result works the first time is very slim. Why is this? The failure may have nothing to do with the original SoC RTL. Typical causes include:

  • Bad soldering or defective components in the prototype
  • Clock generators and clock distribution are not configured correctly
  • Memory model and PHY are not properly substituted

It is very time consuming and counterproductive to detect and solve such problems. Using off-the-shelf fixed boards can reduce the board issues but not the mapping issues; often it is even more difficult to map the SoC RTL onto these off-the-shelf prototype boards because of their fixed interconnect scheme. Eliminating problems that are not associated with the original SoC RTL design, an effort collectively called “bring-up,” should be the first challenge tackled in a prototyping project.

The next challenge in the FPGA prototype methodology is addressing system issues, where software engineers are needed to help identify whether a problem is a software issue or an RTL design issue. Software engineers can control the flow and read the design’s registers and memory map to analyze hardware behavior, but they have little insight into why a problem occurs. The cause is usually a misunderstanding of the specification, a wrong control sequence, or a true hardware bug. A typical hardware/software integration process may take a few months of debug before the hardware is stable enough for application software development and a system demo. Today, software debugging tools (in-circuit emulators, ICE) and hardware debugging tools (embedded logic analyzers, ELA) are generally disjointed. The remedy is to control and synchronize them so that a true system-debug solution can be realized.

Overall, system bugs are difficult to find and require the attention of software, system, and RTL engineers. The investigation usually starts at the system level and continues until the problem can be narrowed to either software or RTL. Today’s process is a “trial-and-error” debugging method; what is needed is a more active debug approach that can perform dynamic “cuts and jumps” to find the issue faster. For instance, hardware can force (control) an interrupt at run time to cause the software (synchronization) to enter a debugging state and examine the current status; then hardware can disable (control) IP blocks or functional blocks interactively at run time to better isolate the problem area; then software (control) sends a message to a pseudo port to cause the hardware (synchronization) to capture intermediate information. We call this method Active Debug.
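The control and synchronization steps described above can be pictured with a small software-only model. This is a minimal sketch, not InPA Systems’ implementation: the class names, the block names ("cpu", "dma", "video"), and the "frame_start" message are all hypothetical, chosen only to show the order in which each side controls the other.

```python
from dataclasses import dataclass, field


@dataclass
class Prototype:
    """Toy stand-in for the FPGA prototype (hardware side). Hypothetical."""
    enabled_blocks: set = field(default_factory=lambda: {"cpu", "dma", "video"})
    captured: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def force_interrupt(self, software):
        # Hardware control: raise an interrupt at run time.
        self.log.append("hw: force interrupt")
        software.on_interrupt()

    def disable_block(self, name):
        # Hardware control: switch off one block to narrow the search.
        self.enabled_blocks.discard(name)
        self.log.append(f"hw: disable {name}")

    def on_pseudo_port_write(self, message):
        # Hardware synchronization: a software write triggers a capture.
        self.captured.append(message)
        self.log.append(f"hw: capture on '{message}'")


@dataclass
class Software:
    """Toy stand-in for software running under a debugger. Hypothetical."""
    prototype: Prototype
    state: str = "running"

    def on_interrupt(self):
        # Software synchronization: stop so the current status can be examined.
        self.state = "debug"
        self.prototype.log.append("sw: enter debug state")

    def write_pseudo_port(self, message):
        # Software control: ask the hardware to snapshot intermediate data.
        self.prototype.log.append(f"sw: write pseudo port '{message}'")
        self.prototype.on_pseudo_port_write(message)


def active_debug_session():
    """Run the three steps in the order the text describes them."""
    hw = Prototype()
    sw = Software(prototype=hw)
    hw.force_interrupt(sw)               # hardware controls, software synchronizes
    hw.disable_block("video")            # hardware controls
    sw.write_pseudo_port("frame_start")  # software controls, hardware synchronizes
    return hw, sw
```

Calling active_debug_session() and printing hw.log lists the events in order, which makes it easy to see which side initiated each step.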

The last and most difficult challenge in this debug process is the limited visibility and slow P&R iterations of FPGA-based prototypes. Limited visibility in FPGA-based prototyping is a well-known problem, and it compounds when the prototype uses multiple FPGAs. Limited system visibility (which today means one FPGA at a time) causes a lot of pain when debugging issues that span multiple FPGAs. Tools with limited visibility guarantee more trial-and-error iterations, each with a long FPGA P&R compile. For example, if an exception condition appears in one FPGA while the control processor and bus matrix are in another, you could be stuck in one FPGA for a lengthy trial-and-error process. Once you identify the problematic functional block, still more P&R iterations are needed. As another example, a video encoder may have more than a few hundred thousand nets, while only a couple thousand nets can be probed at a time in one FPGA. One could be stuck in “one-a-day” P&R iterations, passively searching for the problem, for a month.
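The cost of limited probe capacity is easy to estimate. The sketch below assumes illustrative figures of 300,000 nets and 2,000 probes per build, which are stand-ins for “a few hundred thousand” and “a couple thousand” rather than measured values; the function names are hypothetical. It computes the worst case, in which every net has to be observed once.

```python
def builds_to_cover(total_nets, probes_per_build):
    """Worst-case number of P&R builds needed to observe every net once."""
    if total_nets < 0 or probes_per_build <= 0:
        raise ValueError("total_nets must be >= 0 and probes_per_build > 0")
    # Integer ceiling division: a partly filled last build still costs a build.
    return (total_nets + probes_per_build - 1) // probes_per_build


def days_to_cover(total_nets, probes_per_build, builds_per_day=1):
    """Calendar days for the same sweep at a given build rate."""
    if builds_per_day <= 0:
        raise ValueError("builds_per_day must be > 0")
    builds = builds_to_cover(total_nets, probes_per_build)
    return (builds + builds_per_day - 1) // builds_per_day


if __name__ == "__main__":
    # Assumed, illustrative numbers; not taken from a real design.
    print(builds_to_cover(300_000, 2_000))  # 150
    print(days_to_cover(300_000, 2_000))    # 150
```

A real search is rarely exhaustive, since each build is used to narrow the next one, but the arithmetic shows why every P&R iteration saved matters.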

Today, the bring-up, system-debug, and RTL-debug challenges are worked through iteratively in the ECO design process, generally as follows:

  • Bring-up
    • Verify prototype board and components are working properly
    • Verify RTL design is properly implemented in prototype
  • System-debug
    • Detect problematic area with software debugger and register/memory map probing
    • Determine if it’s a software or hardware problem
    • Fix software problem using software debug process
    • If it is an RTL problem, send the problematic area to the RTL debug engineer
  • RTL-debug
    • Inherit system level instrumentation from software engineer
    • Refine hardware debug search
    • Capture faulty hardware scenario in simulation environment
    • ECO RTL

Technology Required
The prototyping methodology and EDA technology needed to meet the challenges of bring-up, hardware/software system-debug, and RTL-debug are listed below:

  • Bring-up
    • Integrate today’s RTL testbench and simulation environments with the FPGA prototype
    • Apply existing RTL testbench integrity test on FPGA prototype to ensure a correct mapping process and working hardware
  • System-debug
    • Control and synchronize the existing software debugger (ICE) with the hardware and embedded instruments (ELA) at run time to identify whether a problem is in software or hardware
    • Support system-level active debugging in a multi-FPGA environment
  • RTL-debug
    • Capture the faulty hardware scenario and debug it in a simulation-like environment, cross-linking the RTL and the FPGA for easier debug
    • Provide full visibility within the faulty block to significantly reduce FPGA P&R iterations
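One way to picture the bring-up items above is to compare a reference trace from RTL simulation against the same signals captured on the prototype and report the first cycle where they diverge. This is a minimal sketch under assumptions: each trace is a list with one dictionary of signal values per cycle, and the function name and trace format are hypothetical rather than part of any particular tool.

```python
def first_mismatch(reference, captured):
    """Return (cycle, signal, expected, observed) for the first divergence.

    Both arguments are lists with one {signal_name: value} dict per cycle.
    Returns None when the captured trace matches the reference.
    """
    for cycle, (ref, cap) in enumerate(zip(reference, captured)):
        # Check signals in a fixed order so the report is deterministic.
        for signal in sorted(ref):
            observed = cap.get(signal)
            if observed != ref[signal]:
                return cycle, signal, ref[signal], observed
    if len(reference) != len(captured):
        # One trace ended early; report the first cycle that is missing.
        shorter = min(len(reference), len(captured))
        return shorter, "<trace length>", len(reference), len(captured)
    return None
```

A result of None suggests the mapping and the board behave like the simulated RTL for that test; any other result points to the first cycle and signal worth probing.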

FPGA prototyping methodologies must evolve with reduced bring-up, system debugging, and RTL debugging time as their primary goals. Integrating today’s tried-and-true RTL simulation environment with the FPGA prototype hardware to create a co-simulation flow is an ideal way to help automate and shorten design bring-up. Once the design is brought up on the FPGA prototype, controlling and synchronizing the software and hardware debug environments provides a true system-debug environment for today’s SoC design projects. Finally, once the bugs have been isolated in the FPGA prototype, full visibility and active debug capabilities are the fundamental technologies required to reduce iterations and, ultimately, overall SoC validation time. InPA Systems is developing these patent-pending technologies to streamline the SoC debug process.

This article was written by Joe Gianelli and Tom Huang of InPA Systems