FPGA Technology at Crossroads

Field Programmable Gate Arrays (FPGAs) have been undergoing rapid and dramatic changes fueled by their expanding use in datacenter computing. Rather than serving as a compromise or alternative to ASICs, FPGA ‘programmable logic’ is emerging as a third paradigm of compute that stands apart from traditional hardware vs. software archetypes. A multi-university, multi-disciplinary team has been formed behind the question:

What should be the future role of FPGAs as a central function in datacenter servers?

Guided by both the demands of modern networked, data-centric computing and the new capabilities from 3D integration, the Intel/VMware Crossroads 3D-FPGA Academic Research Center will investigate a new programmable hardware data-nexus lying at the heart of the server and operating over data ‘on the move’ between network, traditional compute, and storage elements.

The Intel/VMware Crossroads 3D-FPGA Academic Research Center is jointly supported by Intel and VMware. The center is committed to free and public dissemination of its research outcomes.

You can find an overview presentation on the center’s YouTube channel. If you have questions or are interested in the center’s work, please contact any of the Crossroads PIs in your research area.


Latest News

November 4th, 2021 | The Pigasus Developers Meeting was held for the first time on November 4, 2021. The meeting is a forum where all users and developers of Pigasus jointly coordinate the continuing development of the open-source Pigasus FPGA-accelerated network intrusion detection and prevention system. The Pigasus 2.0 release was announced at this meeting. You are welcome to sign up for the Pigasus mailing list at http://crossroadsfpga.org/mailman/listinfo/pigasus_crossroadsfpga.org.


November 4th, 2021 | A new release of the Pigasus FPGA-accelerated network intrusion detection and prevention system is available at https://github.com/crossroadsfpga/pigasus. Pigasus 2.0 is a refactoring of the original Pigasus into a disaggregated design in which parameterized IPs are connected by standardized abstract connections. In conjunction with FLUID, a high-level end user can generate specifically tuned Pigasus instances (scaling performance, tuning to new bottlenecks, multi-FPGA mapping, etc.). More advanced users can modify or introduce custom IPs to derive new designs.


October 2021 | “VTR 8: High Performance CAD and Customizable FPGA Architecture Modelling” has received the 2021 best paper award from the ACM Transactions on Reconfigurable Technology and Systems (TRETS) journal; Crossroads researchers Vaughn Betz (University of Toronto) and Jason Luu (Intel) both contributed to this paper. The Crossroads project continues to enhance VTR to improve result quality and to enable new architecture investigations such as embedded NoCs and more-than-2D FPGA systems.


[Find all News here]


Recent Publications

  • End-to-End FPGA-based Object Detection Using Pipelined CNN and Non-Maximum Suppression

    A. Na, M. Ibrahim, M. Hall, A. Boutros, A. Mohanty, E. Nurvitadhi, V. Betz, Y. Cao & J. Seo. (2021). End-to-End FPGA-based Object Detection Using Pipelined CNN and Non-Maximum Suppression. In Proceedings of the Int. Conf. on Field Programmable Logic and Applications (FPL).

    Abstract:
    Object detection is an important computer vision task, with many applications in autonomous driving, smart surveillance, robotics, and other domains. Single-shot detectors (SSD) coupled with a convolutional neural network (CNN) for feature extraction can efficiently detect, classify and localize various objects in an input image with very high accuracy. In such systems, the convolution layers extract features and predict the bounding box locations for the detected objects as well as their confidence scores. Then, a non-maximum suppression (NMS) algorithm eliminates partially overlapping boxes and selects the bounding box with the highest score per class. However, these two components are strictly sequential; a conventional NMS algorithm needs to wait for all box predictions to be produced before processing them. This prohibits any overlap between the execution of the convolutional layers and NMS, resulting in significant latency overhead and throughput degradation. In this paper, we present a novel NMS algorithm that alleviates this bottleneck and enables a fully-pipelined hardware implementation. We also implement an end-to-end system for low-latency SSD-MobileNet-V1 object detection, which combines a state-of-the-art deeply-pipelined CNN accelerator with a custom hardware implementation of our novel NMS algorithm. As a result of our new algorithm, the NMS module adds a minimal latency overhead of only 0.13 microseconds to the SSD-MobileNet-V1 convolution layers. Our end-to-end object detection system implemented on an Intel Stratix 10 FPGA runs at a maximum operating frequency of 350 MHz, with a throughput of 609 frames-per-second and an end-to-end batch-1 latency of 2.4 ms. Our system achieves 1.5x higher throughput and 4.4x lower latency compared to the current state-of-the-art SSD-based object detection systems on FPGAs.
    BibTeX:
    @inproceedings{hpipe-nms-fpl21,
    author = {Na, A. and Ibrahim, M. and Hall, M. and Boutros, A. and Mohanty, A. and Nurvitadhi, E. and Betz, V. and Cao, Y. and Seo, J.},
    title = {End-to-End FPGA-based Object Detection Using Pipelined CNN and Non-Maximum Suppression},
    year = {2021},
    isbn = {},
    booktitle = {Proceedings of the International Conference on Field-Programmable Logic and Applications},
    pages = {1--8},
    numpages = {8},
    month = aug
    }
    
  • DO-GPU: Domain Optimizable Soft GPUs

    R. Ma, J. Hsu, T. Tan, E. Nurvitadhi, R. Vivekanandham, A. Dasu, M. Langhammer & D. Chiou. (2021). DO-GPU: Domain Optimizable Soft GPUs. In Proceedings of the International Conference on Field-Programmable Logic and Applications (FPL).

  • Pigasus: Efficient Handling of Input-Dependent Streaming on FPGAs [paper]

    Z. Zhao. (2021). Pigasus: Efficient Handling of Input-Dependent Streaming on FPGAs. PhD Thesis, ECE, Carnegie Mellon University.
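
The strictly sequential behaviour that the object-detection abstract above attributes to conventional NMS can be illustrated with a minimal Python sketch of greedy, single-class NMS. This is a textbook-style illustration only, not the paper's pipelined algorithm or its hardware implementation; the names `iou` and `greedy_nms`, the corner-format boxes, and the 0.5 overlap threshold are assumptions chosen for the example.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept for one class, highest score first."""
    # Sorting needs every score, so all predictions must exist before this line.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    # The second box overlaps the first heavily and is suppressed.
    print(greedy_nms(boxes, scores))
```

Because the candidates are sorted by score before any box is accepted, the loop cannot start until the convolution layers have produced every prediction. That dependency is what prevents overlap between the CNN and a conventional NMS stage.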

[Find all Publications and Downloads here]