
Day 2: AI Infrastructure for Training and Inference


April 21, 2026
Location: Computing and Data Science Building, Simonyi Conference Center 

AI infrastructure is evolving into a tightly integrated computing substrate spanning GPU clusters, accelerators, memory systems, and distributed software for training and inference. Achieving high Model FLOP Utilization (MFU) and reliability requires co-design across hardware and software layers, along with software-driven innovations in interconnect fabrics.
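As a rough illustration of the MFU metric mentioned above (not drawn from any talk below), MFU compares achieved training throughput against aggregate hardware peak. A minimal sketch in Python, assuming the common ~6 FLOPs-per-parameter-per-token estimate for a transformer forward+backward pass; all numbers in the example are hypothetical:

```python
def mfu(n_params, tokens_per_sec, n_gpus, peak_flops_per_gpu):
    """Model FLOP Utilization: achieved FLOP/s over aggregate peak FLOP/s.

    Uses the common ~6 * params FLOPs-per-token estimate for a
    transformer forward+backward pass (attention terms omitted).
    """
    achieved_flops_per_sec = 6 * n_params * tokens_per_sec
    peak_flops_per_sec = n_gpus * peak_flops_per_gpu
    return achieved_flops_per_sec / peak_flops_per_sec

# Hypothetical scenario: a 70B-parameter model training at 1M tokens/s
# on 1,024 GPUs, each rated at 989 peak TFLOP/s.
print(f"MFU: {mfu(70e9, 1.0e6, 1024, 989e12):.1%}")  # → MFU: 41.5%
```

Production systems report MFU well below 100% because of communication stalls, stragglers, and restarts, which is why the co-design theme above matters.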

This conference explores advances in ML accelerators, compilers, data representations, low-latency inference systems, and large-scale AI platforms. A central theme is reconciling performance with robustness in heterogeneous, failure-prone environments, while leveraging workload and fleet automation to sustain efficiency at scale. 

Time    Agenda
8:00am  Breakfast & Registration
8:50am  Welcome and Opening Remarks
Balaji Prabhakar | VMware Founders Professor of Computer Science, Stanford University
9:00am  Keynote 1: The Evolution of ML Accelerators from General Purpose to Task-optimized
Nafea Bshara | Vice President and Distinguished Engineer, Amazon
9:40am  Voyager: A Compiler and Design-Space Exploration System for AI Accelerators
Priyanka Raina | Associate Professor of Electrical Engineering, Stanford University
10:10am  Heterogeneous Data Representations for Efficient AI
Thierry Tambe | Assistant Professor of Electrical Engineering, Stanford University
10:40am  Break
11:00am  SYMI: Efficient Mixture-of-Experts Training via Model and Optimizer State Decoupling
Athinagoras Skiadopoulos | Research Scientist, NVIDIA Research
11:30am  Michelangelo: Uber’s AI/ML Platform
Viv Keswani | Senior Director of Engineering, Uber
12:00pm  Lunch Break
1:00pm  Keynote 2: Connecting GPUs Worldwide into an AI Platform for the World
Deepak Bansal | General Manager and Corporate Vice President, Microsoft Azure
1:40pm  Are AI Fabrics and Infrastructure Really That Different?
Joseph L. White | ISG-CTO Fellow, Dell
2:10pm  AI Building AI: How AI is Accelerating Model Experimentation and Enabling The Flywheel
Animesh Singh | Senior Director, AI Platform and Infrastructure, LinkedIn
2:40pm  Déjà Vu: Reconciling Fabric Perfection with Network Reality
Murari Sridharan | Senior Vice President of Networking, Oracle Cloud
3:10pm  Break
3:30pm  Software-Driven Fabrics Using Clocks and Shims
Balaji Prabhakar | VMware Founders Professor of Computer Science, Stanford University
4:10pm  Fireside Panel: From Packets to Parameters
This panel explores how decades of innovation in networking and distributed systems underpin modern AI infrastructure, examining evolving system abstractions, scalability challenges, and recurring design patterns. The discussion highlights interconnect design, job scheduling, low-latency real-time response generation, and fault tolerance in AI systems, and reimagines data center architectures for next-generation large-scale training and inference workloads.
Albert Greenberg | Chief Architect Officer, Uber 
Sachin Katti | Head of Compute Infrastructure, OpenAI
Ion Stoica | Professor of EECS, UC Berkeley
5:15pm  Close