Is the FPGA difficult to understand? Compare it with the GPU and you will understand

An FPGA is a bunch of transistors that you can wire up to make any circuit you want. It is like a nano-scale breadboard. Using an FPGA is like taping out a chip, except you only need to buy this one chip to build many different designs. In exchange, you pay some efficiency costs.


Taken literally, this statement is wrong: you do not actually rewire an FPGA. It is a 2D grid of lookup tables connected through a routing network, along with some arithmetic units and memory. FPGAs can emulate any circuit, but they are only emulating it, just as a software circuit simulator emulates a circuit. What makes this answer unsatisfying is that it oversimplifies the way people actually use FPGAs. The next two definitions describe FPGAs better.
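To make the "grid of lookup tables" concrete: a k-input LUT is just a 2^k-entry truth table, and "configuring" the FPGA means filling in those table bits, not rewiring transistors. Here is a minimal Python sketch of that idea (the function names and bit ordering are illustrative assumptions, not any vendor's format):

```python
# A k-input LUT stores a 2^k-bit truth table; configuring the FPGA
# means choosing these bits, not physically rewiring anything.
def make_lut(truth_table_bits, k=4):
    """Return a function that looks up the output bit for k input bits."""
    def lut(*inputs):
        assert len(inputs) == k
        index = 0
        for bit in inputs:            # pack the input bits into a table index
            index = (index << 1) | (bit & 1)
        return (truth_table_bits >> index) & 1
    return lut

# The same LUT hardware "becomes" different gates by changing its bits:
# bit positions correspond to inputs (1,1), (1,0), (0,1), (0,0).
and2 = make_lut(0b1000, k=2)   # output 1 only when both inputs are 1
xor2 = make_lut(0b0110, k=2)   # output 1 when exactly one input is 1
```

The point of the sketch is that the same physical structure implements AND, XOR, or any other 2-input function purely by changing stored configuration bits, which is why an FPGA can "become" an arbitrary circuit without being rewired.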

Circuit emulation is the classic, mainstream use case for FPGAs, and the reason FPGAs first appeared. The point is that the hardware design is written in an HDL, and by buying some cheap off-the-shelf hardware you can get roughly the same effect as an ASIC. Of course, you cannot run exactly the same Verilog code on an FPGA and on a real chip, but at least the two sit at the same level of abstraction.

Computational acceleration is a different use case from ASIC prototyping, and unlike circuit emulation it is an emerging one. It is why Microsoft has recently succeeded in using FPGAs to accelerate search and deep neural networks. And the key point is that the acceleration use case does not depend on any relationship between the FPGA and a real ASIC: the Verilog a developer writes for FPGA-based acceleration need not bear any resemblance to the Verilog used to tape out a chip.


There are huge differences between these two use cases in terms of programming, compilers, and abstraction. I care more about the latter, which I call "computational FPGA programming". My argument is that current programming methods for computational FPGAs all borrow the traditional circuit-emulation programming model, and that this is wrong. If you want to develop an ASIC prototype, Verilog and VHDL are both the right choice. But if the goal is computation, we can and should rethink the entire stack.

Let us get straight to the point. An FPGA is a very special kind of hardware for efficiently executing a special kind of software that is described in terms of circuits. An FPGA configuration is low-level software: a program written for an exotic ISA.

Here you can use the GPU as an analogy.

Before deep learning and blockchains became popular, there was a period when GPUs were simply for processing graphics. At the beginning of the 21st century, people realized that GPUs could also serve as accelerators for compute-intensive tasks that have nothing to do with graphics data: GPU designers had built a more general-purpose machine, and 3D rendering was just one application of it.


Definition of FPGA and analogy with GPU

Computational FPGAs are following the same trajectory. The idea is to take this trendy hardware and put it to heavy use, not for circuit emulation, but for computational patterns that happen to run well as circuits, viewing the FPGA by analogy with the GPU.

For the GPU to develop into today's data-parallel accelerator, people had to redefine the concept of what a GPU takes as input. We used to think of a GPU as accepting a peculiar, strongly domain-specific description of a visual effect. Realizing that GPUs execute programs is what unlocked their true potential. That realization allowed the GPU's target to evolve from a single application domain to an entire computational domain.

I think computational FPGAs are undergoing a similar transition. There is not yet a concise description of the basic computational pattern that FPGAs are good at, but it has something to do with potential irregular parallelism, data reuse, and mostly static dataflow.
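One way to picture "mostly static dataflow with data reuse" is a fixed pipeline of small operators that a data stream flows through, with each stage laid out in space. The following Python sketch is only an illustration of the pattern (the stage functions and names are made up for this example, not part of any FPGA toolchain):

```python
# A static dataflow pipeline: the structure of stages is fixed before
# any data arrives, and values stream through it. An FPGA executes this
# pattern well by laying each stage out as its own piece of hardware.
def pipeline(stages, stream):
    for stage in stages:
        stream = map(stage, stream)   # each stage processes the stream
    return stream

# Data reuse: in a sliding-window sum, each input element is read by
# several overlapping windows instead of being fetched once and discarded.
def window_sums(xs, w=3):
    return [sum(xs[i : i + w]) for i in range(len(xs) - w + 1)]

stages = [lambda x: x * 2, lambda x: x + 1]
out = list(pipeline(stages, [1, 2, 3]))   # -> [3, 5, 7]
```

On a CPU these stages run one after another; the FPGA's appeal is that a fixed pipeline like this can run all stages simultaneously on different elements of the stream.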

Like the GPU, the FPGA needs a hardware abstraction that reflects this computational model. The problem with Verilog for computational FPGAs is that it works badly both as a low-level hardware abstraction and as a high-level programming abstraction. By way of contradiction, let us imagine what it would be like for RTL (register-transfer level) design to fill each of these roles.

Even RTL experts probably do not believe that Verilog is an efficient way to develop mainstream FPGA applications, and it will not push programmable logic into the mainstream. To an experienced hardware hacker, RTL design feels friendly and familiar, but the productivity gap between it and software languages is immeasurable.

In fact, for today's computational FPGAs, Verilog effectively is the ISA. The major FPGA vendors' toolchains take Verilog as input, and compilers for higher-level languages emit Verilog as output. Vendors generally keep the bitstream format secret, so Verilog is as low as you can get in the abstraction hierarchy.

The problem with treating Verilog as an ISA is that it is too far from the hardware. The abstraction gap between RTL and FPGA hardware is huge. Traditionally, crossing it must at least involve synthesis, technology mapping, and place-and-route, each of which is a complex and slow process. As a result, the compile/edit/run cycle of RTL programming on an FPGA takes hours or days. Worse, it is an unpredictable process: the toolchain's deep stack can obscure how a change in the RTL will affect the design's performance and energy characteristics.
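To get a feel for why even one stage of that stack is nontrivial, here is a toy version of technology mapping: computing the truth-table bits that would configure a single k-input LUT to implement a given boolean function. This is a deliberately simplified sketch under strong assumptions (real technology mappers must also cut a multi-level netlist into LUT-sized pieces and optimize the cuts; the bit ordering here is my own convention):

```python
from itertools import product

def map_to_lut(fn, k):
    """Enumerate all 2^k input patterns and pack fn's outputs into the
    configuration bitmask for a single k-input LUT."""
    bits = 0
    for inputs in product((0, 1), repeat=k):
        index = 0
        for b in inputs:              # same packing order as the LUT reads
            index = (index << 1) | b
        bits |= (fn(*inputs) & 1) << index
    return bits

# "Map" a 3-input majority gate onto one 3-LUT: majority is 1 for the
# input patterns 011, 101, 110, 111, i.e. table bits 3, 5, 6 and 7.
majority = lambda a, b, c: 1 if a + b + c >= 2 else 0
mask = map_to_lut(majority, 3)        # -> 0b11101000
```

Exhaustive enumeration is fine for one small LUT, but a real design contains hundreds of thousands of them plus routing between them, which is part of why the full synthesis/mapping/place-and-route flow is so slow.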

A good ISA should faithfully expose the underlying hardware, without embellishment. Like assembly language, it does not actually need to be convenient to program in. But also like assembly language, it needs to compile very quickly, with predictable results. If you want to build higher-level abstractions and compilers on top of it, you need a low-level target that holds no surprises. RTL is not such a target.

If the computational FPGA is an accelerator for a particular class of algorithmic patterns, then today's FPGAs do not achieve that goal ideally. Under these rules of the game, a new kind of hardware that beats the FPGA may emerge, bringing a new level of abstraction with it. The new software stack should abandon the FPGA's legacy of circuit emulation and the RTL abstraction.

