As data centers are called upon to handle an explosion of unstructured data fed into a variety of cutting-edge applications, the future for FPGAs looks bright.
That’s because FPGAs, or field programmable gate arrays, are essentially chips that can be programmed, after manufacturing, to act as custom accelerators for workloads including machine-learning, complex data analysis, video encoding, and genomics – applications that have far-reaching consequences for communications, networking, health care, the entertainment industry and many other businesses.
Such applications lend themselves to parallel processing, an important feature of FPGAs, which can also be reconfigured on the fly to handle new features as the nature of these workloads evolves.
Now Xilinx, which for decades has vied with rival Altera (now part of Intel) for technical leadership in FPGAs, is unveiling what it calls a new product category – the Adaptive Compute Acceleration Platform (ACAP) – that, it says, goes far beyond the capabilities of current FPGAs.
What is an ACAP?
The first product range in the category is code-named Everest, due to tape out (have its design finished) this year and ship to customers next year, Xilinx announced Monday. Whether it’s an incremental evolution of current FPGAs or something more radical is tough to say since the company is unveiling an architectural model that leaves out many technical details, like precisely what sort of application and real-time processors the chips will use.
The features that we do know about are consequential, though. Everest will incorporate a NOC (network-on-a-chip) as a standard feature, and use the CCIX (Cache Coherent Interconnect for Accelerators) interconnect fabric, neither of which appears in current FPGAs.
Everest will offer hardware and software programmability, and stands to be one of the first integrated circuits on the market to use 7nm manufacturing process technology (in this case, TSMC’s). The smaller the manufacturing process technology, the greater the transistor density on processors, which leads to cost and performance efficiency.
Though there is controversy over manufacturing process nomenclature and the relative merits of Intel’s and TSMC’s manufacturing processes, the basic idea is that 7nm is about half the geometry size of the current generation of FPGAs, with four times the performance per square millimeter. Everest devices will have up to 50 billion transistors, compared to, for example, Intel’s (formerly Altera’s) current Stratix 10 FPGAs, which use a 14nm manufacturing process and sport 30 billion transistors.
“We really feel like this is a different product category,” said recently named Xilinx CEO Victor Peng. Xilinx has spent about a billion dollars over the past four years and committed 1,500 engineers to the project.
Xilinx currently claims that its FPGAs, due to their ability to be customized for different workloads, accelerate processing by 40 times for machine-learning inferencing, 10 times for video and image processing, and 100 times for genomics, relative to CPU- or GPU-based systems. ACAPs, it says, will further accelerate AI inferencing by 20 times, and 5G communications by 4 times over its current FPGA architecture.
FPGAs traditionally have offered an array of configurable logical blocks linked via programmable interconnects. Reconfiguration of FPGAs for years was done via Hardware Description Languages (HDLs), but chip developers have started tweaking the architecture of the devices to enable the use of higher level software programming languages.
Xilinx’s recently released Zynq All Programmable SoC (system on a chip) joins the software programmability of the ARM-based processor that has been integrated into the product, with the hardware programmability of an FPGA.
“We’ve been transforming, but ACAP is the inflection point if you will,” Peng said. “Even though before, FPGAs were flexible and adaptable, the level of that is much higher and yes, we started more recently enabling people to develop more at the software level, but the extent to which we’re gonna do it with this class of product is much higher. That makes this a quantum step over what we’ve seen before.”
Using high-level software programming languages
Xilinx says that software developers will be able to work with Everest using tools like C/C++, OpenCL, and Python. Everest can also be programmed at the hardware register-transfer level (RTL) using HDLs such as Verilog and VHDL.
Karl Freund, an analyst with Moor Insights & Strategy, sees Everest as more an evolution of Xilinx’s strategy than a radical step, but emphasizes that the advances in the hardware and software elements of Everest are significant.
“It’s true it’s a new category but it’s not just the chip that makes it a new category – it’s the chip, the software, the libraries and even the web development models,” Freund said.
“They’ve invested a lot in software stacks, the so-called acceleration stacks that enable you to more quickly deploy FPGA solutions because they’re basically providing some standardized libraries and tools and algorithms and IP blocks that you can just pick up and deploy on your FPGA,” he said.
In addition to an as-yet-unspecified, multicore SoC (system on a chip), Xilinx says that Everest will offer PCIe as well as CCIX interconnect support, multimode Ethernet controllers, on-chip control blocks for security and power management, and programmable I/O interfaces. It also includes different types of SerDes – transceivers that convert parallel data to serial data and vice-versa. Specifically, it will offer 33Gbps NRZ (non-return to zero), 58Gbps PAM-4 (pulse amplitude modulation) and 112Gbps PAM-4 SerDes. Generally, PAM-4 offers more bandwidth than NRZ because each PAM-4 symbol carries two bits, using four signal levels, where an NRZ symbol carries one.
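To make the NRZ versus PAM-4 difference concrete, here is a minimal Python sketch of the two line codes. It is only an illustration: the signal levels (-3, -1, +1, +3) and the Gray-coded bit mapping are common textbook conventions assumed here, not details Xilinx has published for Everest, and the function names are hypothetical.

```python
# Gray-coded PAM-4 mapping (an assumed, commonly used convention):
# neighbouring signal levels differ by exactly one bit.
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def encode_nrz(bits):
    """NRZ: one bit per symbol, two signal levels."""
    return [1 if b else -1 for b in bits]


def encode_pam4(bits):
    """PAM-4: two bits per symbol, four signal levels."""
    if len(bits) % 2:
        raise ValueError("PAM-4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


bits = [1, 0, 0, 1, 1, 1, 0, 0]
nrz_symbols = encode_nrz(bits)
pam4_symbols = encode_pam4(bits)

# The same 8 bits need 8 NRZ symbols but only 4 PAM-4 symbols.
print(len(nrz_symbols), len(pam4_symbols))
```

At the same symbol rate, the PAM-4 link therefore moves twice as many bits, at the cost of signal levels that sit closer together and are harder to tell apart at the receiver.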
Certain Everest chips will also offer high bandwidth memory (HBM), or programmable ADC (analog-to-digital) and DAC (digital-to-analog) converters, Xilinx says.
Network on a Chip, coherent cache are key differentiators
Xilinx says that a key difference between FPGAs and ACAPs is the NOC, which connects the device’s various subsystems such as multiple processors and I/O elements. Up to now FPGAs have not had system-level NOCs, and developers have had to essentially create the connection infrastructure through the chips’ programmable logic. “You can still program the subsystems through programmable logic but in general, you wouldn’t get the same performance characteristics,” Peng said.
Another key element is CCIX. “What is revolutionary is, it’s cache coherent,” Freund said. “For the first time you’ll be able to build a system with a cache-coherent accelerator using a standard network protocol, and that doesn’t exist anywhere in the industry right now.”
CCIX is a set of specifications being developed by the CCIX Consortium to solve the problem of cache coherence, or how to ensure that different CPUs don’t clash when trying to modify the same memory space or work on stale copies of data.
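The stale-copy problem can be shown with a toy Python model of two caches in front of one shared memory. This is a sketch of a generic write-invalidate scheme, assumed here purely for illustration; it is not the CCIX protocol, and the class and function names are hypothetical.

```python
class SharedMemory:
    """Backing store plus a list of every cache attached to it."""

    def __init__(self):
        self.data = {}
        self.caches = []


class Cache:
    """A tiny write-through cache; coherence is optional so both cases can be compared."""

    def __init__(self, memory, coherent):
        self.memory = memory
        self.coherent = coherent
        self.lines = {}
        memory.caches.append(self)

    def read(self, addr):
        # On a miss, fetch from memory; on a hit, return the local copy.
        if addr not in self.lines:
            self.lines[addr] = self.memory.data.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.memory.data[addr] = value  # write-through to shared memory
        if self.coherent:
            # Write-invalidate: drop every other cache's copy of this address.
            for other in self.memory.caches:
                if other is not self:
                    other.lines.pop(addr, None)


def demo(coherent):
    mem = SharedMemory()
    cpu = Cache(mem, coherent)
    accelerator = Cache(mem, coherent)
    accelerator.read(0x10)       # accelerator caches the initial value 0
    cpu.write(0x10, 42)          # CPU updates the same address
    return accelerator.read(0x10)


print(demo(coherent=False))  # accelerator still sees its stale copy
print(demo(coherent=True))   # accelerator re-fetches the new value
```

Without coherence the accelerator keeps computing on the old value; with invalidation its next read misses and picks up the CPU's update.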
A big target for Everest is AI. No one expects Everest to compete with the pure horsepower behind Intel CPUs and Nvidia GPUs, which are being used to “train” mammoth, 100-layer-deep machine-learning neural networks on terabyte-size data sets.
But the adaptability of Everest, like traditional FPGAs, makes it ideal for “inferencing,” or actually putting the neural networks to use in real-life situations, Freund notes. That’s because each layer of a neural network should be processed with the least amount of precision possible, to save time and energy. Unlike CPUs, which have fixed precision, FPGAs can be programmed to process each layer of a neural network, once it’s built, with the least precision suitable for that layer.
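As a rough software analogy for per-layer precision, the Python sketch below quantizes the weights of two layers to different bit widths and reports the worst-case rounding error. The symmetric uniform quantizer, the layer names, the weights and the bit widths are all made up for illustration; nothing here reflects how Xilinx tools actually choose or implement precision.

```python
def quantize(values, bits):
    """Round values to a signed `bits`-bit uniform grid, then map back to floats."""
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed grid")
    qmax = 2 ** (bits - 1) - 1                    # e.g. 127 for 8 bits
    peak = max(abs(v) for v in values) or 1.0     # avoid a zero scale
    scale = peak / qmax
    return [round(v / scale) * scale for v in values]


def max_error(values, bits):
    """Largest absolute difference between original and quantized values."""
    return max(abs(v - q) for v, q in zip(values, quantize(values, bits)))


# Hypothetical layers, each assigned its own precision.
layers = {
    "conv1": ([0.82, -0.31, 0.05, -0.67], 8),
    "fc": ([0.5, -0.25, 0.125, -1.0], 4),
}

for name, (weights, bits) in layers.items():
    print(name, bits, max_error(weights, bits))
```

Fewer bits mean a coarser grid and a larger error, so the practical question for each layer is how few bits it can tolerate before accuracy suffers. On an FPGA, that choice translates into narrower arithmetic units and less power per operation.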
Edge devices are a primary target
And while Xilinx says its primary target is the data center, edge devices and IoT may ultimately be where Everest shines. Machine-learning applications will increasingly be incorporated into edge devices, which are extremely power-constrained relative to big servers, making them ideal candidates for FPGAs.
Microsoft, which was the first major cloud provider to announce the deployment of FPGAs for its public cloud infrastructure, last year gave the chips a big vote of confidence for their use in AI, announcing it would use FPGAs for its Project Brainwave deep-learning platform. It’s using Stratix 10 FPGAs from Xilinx’s nemesis, Intel/Altera, but this nevertheless helps cement the idea of using FPGAs for AI inferencing.
The continuing rivalry between Xilinx and Intel will play out as the companies move to smaller manufacturing process technology. Intel has already announced FPGAs code-named Falcon Mesa, to be built with Intel 10nm manufacturing technology, which some industry insiders say will provide equivalent transistor density to TSMC’s 7nm process.
With Everest and possibly Falcon Mesa due out in 2019, it looks like FPGAs – or in Xilinx’s case, ACAPs – will play a more important role in computing trends than they ever have.