Crossroads Seminars

The Crossroads seminar series is offered regularly on Fridays, 2~3pm (US Eastern). On the 4th Friday of each month, we will have a featured talk on the latest research results by the center's PIs and students. In the remaining weeks, we will host a diverse range of talks, including work-in-progress and outside visitors.

Upcoming Seminars

(Featured) Friday, October 22, 2021 | 2pm~3pm ET

FPGA Placement: Recent Progress and Road Ahead
David Z. Pan, The University of Texas at Austin

Abstract: In the FPGA implementation flow, placement plays a crucial role in determining the overall quality of results and runtime. After synthesis and logic mapping, placement determines the physical locations of heterogeneous instances to optimize wirelength, timing, power, routability, etc., while meeting various constraints in modern FPGAs. This talk will give an overview of recent progress in FPGA placement targeting large-scale heterogeneous FPGAs, including UTPlaceF, which previously won ISPD FPGA Placement Contests, and the current academic state of the art, elfPlace. Since placement may need to be called many times to achieve design closure, it is very important to ensure high scalability with increasing design complexity, e.g., on future 3D FPGAs. We will discuss how to scale up and accelerate FPGA placement algorithms. We will also discuss some future directions, e.g., open-sourcing to enable cross-team collaborations, and leveraging machine learning hardware/software for FPGA placement.
Bio: David Z. Pan is a Professor and Silicon Laboratories Endowed Chair at the Department of Electrical and Computer Engineering, The University of Texas at Austin. His research interests include bidirectional AI and IC interactions, electronic design automation, design for manufacturing, hardware security, and CAD for analog/mixed-signal ICs and emerging technologies. He has published over 400 refereed journal/conference papers and holds 8 US patents. He has served on many journal editorial boards and conference committees, including various leadership roles such as ICCAD 2019 General Chair, ASP-DAC 2017 TPC Chair, and ISPD 2008 General Chair. He has received many awards, including SRC Technical Excellence Award, 19 Best Paper Awards (at DAC, ICCAD, DATE, ASP-DAC, ISPD, HOST, etc.), DAC Top 10 Author Award in Fifth Decade, ASP-DAC Frequently Cited Author Award, Communications of ACM Research Highlights, ACM/SIGDA Outstanding New Faculty Award, NSF CAREER Award, IBM Faculty Award (4 times), and many international CAD contest awards. He has graduated 40 PhD students and postdocs who have won many awards, including the First Place of ACM Student Research Competition Grand Finals (twice, in 2018 and 2021), ACM/SIGDA Student Research Competition Gold Medal (three times), ACM Outstanding PhD Dissertation in EDA Award (twice), EDAA Outstanding Dissertation Award (twice), etc. He is a Fellow of IEEE and SPIE.

Past Seminars

Friday, September 24, 2021 | 2pm~3pm ET

Crossroads RV1: Exploring Data on the Move Applications
Justine Sherry, Carnegie Mellon University

Abstract: The Crossroads FPGA is uniquely positioned to support applications which operate with high throughput over data "on the move" between endpoints such as CPUs, GPUs, Storage, Network, and other platforms. In this talk, we will highlight two applications under development in the Crossroads center. First, the Pigasus IDS is a 100Gbps hybrid FPGA + CPU platform for network security. Next, Norman is a new network dataplane for Linux that offloads the OS networking stack onto a Crossroads FPGA. We will discuss the high level goals of Pigasus and Norman, some of their design details, and finally contrast their two different approaches to using Crossroads: Pigasus, as an "FPGA-centric" design with the CPU working in support of the FPGA, and Norman as a "CPU-centric" design, with the FPGA working in support of the CPU.
Bio: Justine Sherry is an assistant professor at Carnegie Mellon University. Her interests are in computer networking; her work includes middleboxes, networked systems, measurement, cloud computing, and congestion control. Her recent research focuses on new opportunities and challenges arising from the deployment of middleboxes -- such as firewalls and proxies -- as services offered by clouds and ISPs. Dr. Sherry received her PhD (2016) and MS (2012) from UC Berkeley, and her BS and BA (2010) from the University of Washington. She is a recipient of the SIGCOMM doctoral dissertation award, the David J. Sakrison prize, paper awards at USENIX NSDI and ACM SIGCOMM, and an NSF Graduate Research Fellowship. Most importantly, she is always on the lookout for a great cappuccino.

Friday, July 23, 2021 | 2pm~3pm ET

Verilog to Routing (VTR): A Flexible Open-Source CAD Flow to Explore and Target Diverse FPGA Architectures
Vaughn Betz, University of Toronto

Abstract: With the need for improvements in compute performance and efficiency beyond what process scaling can provide, FPGAs and FPGA-like programmable accelerators that can target a range of compute tasks efficiently are of interest in many application areas. However, creating a new CAD flow that can evaluate and map circuits to a new programmable architecture remains a daunting task, making flexible CAD flows that can be quickly retargeted to new architectures highly desirable.

This talk will give an overview of the Verilog-to-Routing (VTR) open-source tool flow that addresses this need. We'll discuss recent enhancements to VTR that have broadened the range of architectures it can target, and allow it to not only evaluate new FPGA architectures, but also program the chosen architectures that are committed to silicon.

Architecture flexibility can have a cost, however, and a common conception in the FPGA Computer Aided Design (CAD) community is that architecture-specific algorithms and tools will significantly outperform more general approaches which target a variety of FPGA architectures. In this talk we'll show how, through careful algorithm design and code architecture, VTR has improved result quality without architecture-specific code, challenging the idea that result quality and architecture flexibility are mutually exclusive. We will detail the key packing and routing enhancements that led to large improvements in wirelength and timing, while simultaneously reducing run time by over 6x.

Finally, we'll present efforts to use Reinforcement Learning to create more adaptable and efficient CAD algorithms. Taking placement as an example, we'll show how an RL-enhanced move generator can improve the quality/run-time trade-off of VTR's placement algorithm.
Bio: Vaughn Betz is a Professor and the NSERC/Intel Industrial Research Chair in Programmable Silicon at the University of Toronto. He is the original developer of the widely used VPR FPGA placement, routing and architecture evaluation CAD flow, and a lead developer in the VTR project that has built upon VPR. He co-founded Right Track CAD to commercialize VPR, and joined Altera upon its acquisition of Right Track CAD. Dr. Betz spent 11 years at Altera, ultimately as Senior Director of software engineering, and is one of the architects of the Quartus CAD system and the first five generations of the Stratix and Cyclone FPGA families. He holds 101 US patents and has published over 100 technical articles in the FPGA area, thirteen of which have won best or most significant paper awards. Dr. Betz is a Fellow of the IEEE and the National Academy of Inventors, and a Faculty Affiliate of the Vector Institute for Artificial Intelligence.

Friday, July 16, 2021 | 2pm~3pm ET

From “Field Programmable” to “Programmable”
James C. Hoe, Carnegie Mellon University

Abstract: This talk is an overview of RV5.

To elevate FPGAs from logic to computing roles, we need to address the greater requirement for programmability beyond being a “field programmable” ASIC. A computing FPGA will be asked to do more tasks than could fit on the fabric at once and to do new tasks that are unknown before deployment. Moreover, dynamically managing the logic resource utilization is a presently under-tapped source of performance optimization, whether by devoting available resources to only active tasks or by supporting tasks with design variants optimized differently for changing conditions.

To maximally exploit the benefits of FPGAs’ programmability, the Intel/VMware Crossroads 3D FPGA Academic Research Center aims to make runtime reprogramming a regular mode of operation for Crossroads 3D-FPGA in future datacenter servers. This talk will motivate the need for a new, expanded design mindset by FPGA users and designers to fully pursue FPGAs’ programmability and dynamism. The talk next presents the Crossroads Center’s research toward realizing this new usage and programming on the Crossroads 3D-FPGA for datacenter applications. The talk will present a re-design of the Pigasus network intrusion detection/prevention system (IDS/IPS) following a design methodology to leverage the flexible and dynamic capabilities of FPGA targets.
Bio: James C. Hoe is a Professor of Electrical and Computer Engineering at Carnegie Mellon University. He received his Ph.D. in EECS from Massachusetts Institute of Technology in 2000 (S.M., 1994). He received his B.S. in EECS from UC Berkeley in 1992. He is interested in many aspects of computer architecture and digital hardware design, including the specific areas of FPGA architecture for computing; digital signal processing hardware; and high-level hardware design and synthesis. He is a Fellow of IEEE. For more information, please visit

Friday, July 9, 2021 | 2pm~3pm ET

Pigasus: Efficient Handling of Input-Dependent Streaming on FPGAs
Zhipeng Zhao, Carnegie Mellon University

Abstract: FPGAs have well-demonstrated success in many networking applications but have failed to accelerate Intrusion Detection and Prevention Systems (IDS/IPS). The root cause is the mismatch between traditional static, fixed-performance FPGA design and the input-dependent behavior of IDS/IPS. As a result, the design is provisioned to handle the worst case, losing the opportunity to use the resources allocated for the worst case to improve common-case performance.

In this talk, I will present an FPGA-based IDS/IPS called Pigasus, which is tailored to the common case, thus using minimal resources to extract maximum performance. Pigasus can achieve 100Gbps using 1 FPGA and on average 5 CPU cores, 100x faster than a CPU-only baseline and 50x faster than existing FPGA designs. A natural objection to this design is that it will suffer from shifting workloads. In the second part, I will show how to use a disaggregated architecture and spillover mechanism to scale subcomponents of the system on demand to address changes in the traffic profile at both compile time and runtime.
Bio: Zhipeng Zhao is a Ph.D. candidate in Electrical and Computer Engineering at Carnegie Mellon University, advised by Prof. James C. Hoe. His research interests broadly lie at the intersection of FPGA and networking. Prior to CMU, he received a BS and an MS in Electrical Engineering, both from Beihang University, China.

Friday, June 25, 2021 | 2pm~3pm ET

Soft Processor Overlays to Improve Time-to-Solution
Derek Chiou, The University of Texas at Austin

Abstract: Soft Processor Overlays are application-specific processors implemented in FPGA logic. Overlays can be more efficient than standard processors because they can be highly specialized and can get to a working implementation faster than dedicated circuits in FPGAs because they have software compile times and are more debuggable. In this talk, I will discuss prior work in overlays, how we plan to experiment with, develop, and use overlays in Research Vector 2 (RV2) of the Intel/VMware Crossroads 3D-FPGA Academic Research Center, and how those overlays will influence and interact with other research vectors in the center, such as investigations on 3D FPGA base-die architecture (RV3) and partial reconfiguration (RV5).
Bio: Derek Chiou is a Research Scientist in the Electrical and Computer Engineering Department at The University of Texas at Austin and a Partner Architect at Microsoft responsible for future infrastructure hardware architecture. He is a co-founder of the Microsoft Azure SmartNIC effort and led the Bing FPGA team to first deployment of Bing ranking on FPGAs. Until 2016, he was an associate professor at UT. Before UT, Dr. Chiou was a system architect and led the performance modeling team at Avici Systems, a manufacturer of terabit core routers. Dr. Chiou received his Ph.D., S.M. and S.B. degrees in Electrical Engineering and Computer Science from MIT.

Friday, June 11, 2021 | 2pm~3pm ET

High-Performance Code Generation for Graph Applications
Sanil Rao, Carnegie Mellon University

Abstract: Software libraries have been a staple in computing, providing users with a maintained interface of functions for their applications. One such library, GraphBLAS, is used in the graph processing community because of its foundation in linear algebra, and its clear description of the overarching computation through its library calls. One issue that arises, however, is that these library calls can leave performance on the table, especially across multiple library calls. Simply writing additional merged library calls is impractical given the importance of library clarity, and writing a general-purpose compiler that understands library call semantics would be infeasible. Therefore, we propose an approach from a higher level of abstraction, treating the GraphBLAS library as a specification, and generating code that understands the library’s semantics. We transform library calls to their linear algebraic descriptions, and use pattern matching techniques to look for optimizations. Preliminary results show that our code generation system, SPIRAL, achieves performance matching that of hand-optimized codes, while keeping the clarity of both the original library and user application.
Bio: Sanil Rao is a second-year PhD student in Electrical and Computer Engineering at CMU advised by Prof. Franz Franchetti and part of the SPIRAL group. His research focus is in the area of programming languages and compilers, specifically code generation. Prior to CMU, he received a BS in Computer Science from the University of Virginia.

Friday, May 28, 2021 | 2pm~3pm ET

We Need Kernel Interposition over the Network Dataplane
Hugo Sadok, Carnegie Mellon University

Abstract: Kernel-bypass networking, which allows applications to circumvent the kernel and interface directly with NIC hardware, is one of the main tools for improving application network performance. However, allowing applications to circumvent the kernel makes it impossible to use tools (e.g., tcpdump) or impose policies (e.g., QoS and filters) that need to interpose on traffic sent by different applications running on a host. This makes maintainability and manageability a challenge for kernel-bypass applications. In response, we propose Kernel On-Path Interposition (KOPI), in which traditional kernel dataplane functionality is retained but implemented in a fully programmable SmartNIC. We hypothesize that KOPI can support the same tools and policies as the kernel stack while retaining the performance benefits of kernel bypass.
Bio: Hugo Sadok is a second-year PhD student in Computer Science at CMU advised by Prof. Justine Sherry and part of the SNAP Lab. His research interests are broadly in computer networks and computer systems. Prior to CMU, he received a BS in Electronic and Computer Engineering and an MS in Electrical Engineering, both from UFRJ.

Friday, May 14, 2021 | 2pm~3pm ET

The Role for Programmable Logic in Future Datacenter Servers
(An Overview of the Crossroads Center)

James C. Hoe, Carnegie Mellon University

Abstract: This talk is a rerun for those affiliated with the center and is intended to introduce the Center to an outside audience.

Field Programmable Gate Arrays (FPGAs) have been undergoing rapid and dramatic changes fueled by their expanding use in datacenter computing. Rather than serving as a compromise or alternative to ASICs, FPGA 'programmable logic' is emerging as a third paradigm of compute that stands apart from traditional hardware vs. software archetypes. The Crossroads 3D-FPGA Research Center has been formed with the goal of defining a new role for programmable logic in future datacenter servers. Guided by both the demands of modern network-driven, data-centric computing and the new capabilities from 3D integration, this center is developing the Crossroads 3D-FPGA as a new central fixture component on future server motherboards, serving to connect all server endpoints (network, storage, memory, CPU) intelligently. As a literal crossroads of data, a Crossroads 3D-FPGA can apply application-specific functions over data-on-the-move between any pair of server endpoints, intelligently steer data to the right core or accelerator, and reduce and compress the volume of data that needs to be moved between servers. This talk will give an overview of the Crossroads 3D-FPGA concepts, as well as the associated set of research thrusts to pursue a full-stack solution spanning application, programming support, dynamic runtime, design automation, and architecture.
Bio: James C. Hoe is a Professor of Electrical and Computer Engineering at Carnegie Mellon University. He received his Ph.D. in EECS from Massachusetts Institute of Technology in 2000 (S.M., 1994). He received his B.S. in EECS from UC Berkeley in 1992. He is interested in many aspects of computer architecture and digital hardware design, including the specific areas of FPGA architecture for computing; digital signal processing hardware; and high-level hardware design and synthesis. He is a Fellow of IEEE. For more information, please visit

Friday, April 23, 2021 | 2pm~3pm ET

Raising the Level of Abstraction for FPGA System Design
Joseph Melber, Carnegie Mellon University

Abstract: Current Field Programmable Gate Array (FPGA) programming abstractions give disproportionately more emphasis to reducing the design effort for processing kernels than to the memory access side of the design task. Designers are asked to build all the datapaths for on-chip buffering and data movements, as well as the state machines to coordinate these datapath activities. These datapaths are often ad-hoc efforts that are not generally reusable.

Software programmers leverage abstraction to simplify their design efforts; hardware designers should be supported by similar abstractions in order to increase FPGA programmability in modern computing systems. In this talk, I will focus on (1) re-imagining what memory should look like for FPGA hardware designers, and (2) virtualizing functionalities, devices and platforms for FPGA computing. I have been investigating a service-oriented abstraction and framework to simplify hardware design efforts for FPGA accelerators’ memory systems. The goal is to enable FPGA accelerator designers to configure a specialized memory system that presents abstract semantic-rich memory operations, across diverse memory devices, without performance overhead. Current efforts also extend this abstraction to virtualize these functionalities across devices and architectures. I will conclude by discussing the future potential of my research and vision for FPGA computing.
Bio: Joseph Melber is a Ph.D. candidate in Electrical and Computer Engineering at Carnegie Mellon University. He is advised by Dr. James C. Hoe. His research interests are reconfigurable computing and computer architecture. His research focuses on memory systems and programming abstractions for heterogeneous FPGA computing systems. He received his M.S. in Electrical and Computer Engineering from Carnegie Mellon University in 2016, and his B.S. in EE from the University at Buffalo in 2014.