In part 1 (Megh’s platform strategy), we talked about the accelerating growth of data (from the Web, sensors, IoT devices, etc.), the need for efficient processing, and the value proposition of our real-time streaming analytics platform from the perspective of CIOs, data scientists, and developers. In this post, we discuss how the vendor-agnostic architecture of the Megh platform supports a full range of deployment options.
FPGA-based analytics today are being deployed from the edge to the cloud, with each platform having its own development environment and interfaces, making development and deployment fragmented and inefficient.
Intel (through its FPGA Programmable Acceleration Card) and Xilinx (with its Xilinx Alveo Card) are bringing production-ready cards to edge market segments. Major cloud service providers (CSPs), like AWS (through F1 instances) and Microsoft, are offering FPGA instances in the cloud. Unfortunately, a problem underlying all these initiatives is that each vendor provides a different software stack and development flow, making it hard to reuse development work across vendors and platforms.
Another problem is that the streaming pipelines that need to be accelerated keep growing in size and complexity, to the point that a complete design may not fit into a single FPGA. Developers and users need an elastic platform that can shrink or grow by ganging multiple FPGAs together on demand.
Megh’s platform strategy addresses these two problems with a vendor-agnostic real-time streaming analytics platform that abstracts the differences among vendors and platforms, and that scales by networking multiple FPGAs together. This is illustrated in the diagram below.
In this example, the Megh platform runs on three nodes, each with one or more FPGAs. Nodes can mix FPGA vendors and types to satisfy end deployment needs, and nodes can be added or removed dynamically.
True heterogeneity is achieved by abstracting vendor differences with a vendor-agnostic shell on the FPGA and a hardware abstraction layer (HAL) on the host CPU.
On the FPGA, users need the ability to read/write to registers, stream data from the host/NIC, and use FPGA-attached memory as a scratchpad. Our Sira Shell does exactly this, abstracting vendor particularities so FPGA/algorithm developers can concentrate on IP rather than worrying about interface differences.
On the CPU, users need the ability to discover installed devices and then initialize, manage, and push/pull data to and from one or more FPGAs. Our Arka Runtime includes a HAL that abstracts vendor-specific platform APIs to present a generic interface that can be used by the runtime.
Let’s look at the Sira Shell components in more detail:
Host controller: Shell interface signals and features vary widely between vendors. The host controller in the Sira Shell normalizes these variations to provide MMIO access to registers and a streaming interface to/from the host.
Memory controller: Each platform has different memory types (DDR3, HBM), bank counts (2, 4, 6), and interface widths (256 or 512 bits). The memory controller unifies these variations to present local memory as a single, contiguous address space that can be used as a scratchpad.
NIC controller: Platforms may provide one or more QSFP connectors that support 10/25/40G interfaces. The NIC controller implements an Ethernet/MAC layer that attaches to the QSFP connections to provide FPGA-to-FPGA communication.
Packet processing engine (PPE): The PPE acts as the communication hub for the Megh platform’s scale-out architecture. It implements a routing table to allow FPGA-to-FPGA and FPGA-to-host pathways. The routing table is set up and managed in the Arka Runtime.
Stream processing engine (SPE): User algorithms/IP are implemented with the SPE. As part of the SDK framework, Megh provides RTL and HLS templates to instantiate and build pipelines within the SPE. User pipelines might fit into a single FPGA or span multiple FPGAs. The RTL and HLS templates provide the required infrastructure to forward data to the next algorithm/IP in the pipeline.
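To make the PPE's role more concrete, here is a minimal Python sketch of the kind of routing table described above. It is only an illustration under stated assumptions: the `RoutingTable` class, the integer stream IDs, and the `("fpga", n)` / `("host", n)` destination tuples are all hypothetical names invented here, not Megh's implementation or API.

```python
from typing import Dict, Tuple

# Hypothetical destination encoding: ("fpga", node_id) or ("host", node_id).
Destination = Tuple[str, int]


class RoutingTable:
    """Toy model of a PPE routing table (illustrative only, not Megh's API)."""

    def __init__(self) -> None:
        self._routes: Dict[int, Destination] = {}

    def add_route(self, stream_id: int, dest: Destination) -> None:
        kind, _ = dest
        if kind not in ("fpga", "host"):
            raise ValueError(f"unknown destination kind: {kind}")
        self._routes[stream_id] = dest

    def remove_route(self, stream_id: int) -> None:
        # Removing a missing route is a no-op.
        self._routes.pop(stream_id, None)

    def next_hop(self, stream_id: int) -> Destination:
        try:
            return self._routes[stream_id]
        except KeyError:
            raise LookupError(f"no route for stream {stream_id}") from None


# In this sketch the host-side runtime populates the table.
table = RoutingTable()
table.add_route(1, ("fpga", 2))  # intermediate data continues on another FPGA
table.add_route(2, ("host", 0))  # final results return to a host
```

Keeping route management on the host side, as the text describes for the Arka Runtime, means a pipeline can be re-split across FPGAs by updating table entries rather than rebuilding the user IP.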
The Arka Runtime also merits a closer look.
The HAL is a common interface that abstracts the vendor-specific implementation of each FPGA platform. For each supported platform there exists a platform adapter that implements the generic HAL interface using vendor-specific platform APIs.
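The adapter arrangement described above can be sketched in a few lines of Python. This is a sketch under assumptions, not the Arka HAL itself: `FpgaDevice`, `VendorAAdapter`, `VendorBAdapter`, and `discover` are hypothetical names, and a dictionary stands in for the vendor-specific API calls a real adapter would make.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class FpgaDevice(ABC):
    """Generic HAL interface (hypothetical, for illustration)."""

    vendor = "unknown"

    @abstractmethod
    def open(self) -> None: ...

    @abstractmethod
    def write_register(self, offset: int, value: int) -> None: ...

    @abstractmethod
    def read_register(self, offset: int) -> int: ...


class _InMemoryDevice(FpgaDevice):
    """Stand-in adapter: a dict replaces the vendor-specific API calls."""

    def __init__(self) -> None:
        self._regs: Dict[int, int] = {}
        self._opened = False

    def open(self) -> None:
        self._opened = True

    def write_register(self, offset: int, value: int) -> None:
        if not self._opened:
            raise RuntimeError("device not opened")
        self._regs[offset] = value & 0xFFFFFFFF  # model 32-bit registers

    def read_register(self, offset: int) -> int:
        if not self._opened:
            raise RuntimeError("device not opened")
        return self._regs.get(offset, 0)


class VendorAAdapter(_InMemoryDevice):
    vendor = "vendor_a"


class VendorBAdapter(_InMemoryDevice):
    vendor = "vendor_b"


def discover() -> List[FpgaDevice]:
    """Pretend device discovery: returns one device per adapter."""
    return [VendorAAdapter(), VendorBAdapter()]


# Calling code sees only the generic interface, whatever the vendor.
devices = discover()
for dev in devices:
    dev.open()
    dev.write_register(0x10, 0xABCD)
readback = [dev.read_register(0x10) for dev in devices]
```

The point of the pattern is that supporting a new card means writing one more adapter class, with no changes to the code that calls the generic interface.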
While the HAL abstracts the vendor-specific platform APIs, the Sira service takes it a step further by abstracting the lower-level I/O APIs and presenting a higher-level interface that exposes the functions of the Sira Shell.
With the Sira Shell exposed as an Arka service, higher-level workflow services (composed of one or more Sira services) implement full pipelines, such as video analytics, and present an easy-to-use, workload-specific interface to the higher-level application.
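As a rough illustration of this layering, the Python sketch below composes several service objects into one workflow. All names here (`SiraService`, `WorkflowService`, the `decode` and `detect` stages) are hypothetical, and plain Python functions stand in for accelerated stages; the real Arka service interfaces are not shown in this post.

```python
from typing import Callable, List


class SiraService:
    """Hypothetical wrapper exposing one shell function as a callable stage."""

    def __init__(self, name: str, fn: Callable[[bytes], bytes]) -> None:
        self.name = name
        self._fn = fn

    def process(self, data: bytes) -> bytes:
        return self._fn(data)


class WorkflowService:
    """Hypothetical workflow: runs its stages in order on the input."""

    def __init__(self, stages: List[SiraService]) -> None:
        self._stages = list(stages)

    def run(self, data: bytes) -> bytes:
        for stage in self._stages:
            data = stage.process(data)
        return data


# Placeholder stages; real ones would drive FPGA pipelines.
decode = SiraService("decode", lambda d: d.upper())
detect = SiraService("detect", lambda d: d + b"!")

video_analytics = WorkflowService([decode, detect])
result = video_analytics.run(b"frame")  # b"FRAME!"
```

The application calls only `run`, which is the sense in which a workflow service hides the individual stages behind a workload-specific interface.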
We’ve detailed how the Megh platform provides a single development platform across multiple vendors from the edge to the cloud. Stay tuned for more about how application software developers can leverage Megh’s platform with minimal or no changes to current CPU/GPU implementations, and how IP/RTL designers can develop algorithms for the Megh platform using our SDK.