Over the past 15 years, cloud computing has motivated the careful co-design of software and hardware systems, bringing substantial efficiency gains for both application developers and cloud operators. This talk will focus on how emerging AI workloads are introducing new challenges and motivating the next generation of cloud systems, beyond the deployment of AI accelerators. We will review recent Stanford projects to illustrate challenges such as making inference resource-efficient, optimizing workloads with multiple AI calls, feeding AI workloads with data, and building scalable AI infrastructure. Our early results suggest that the most significant change is the need to broaden the co-design approach to encompass the applications themselves.
Christos Kozyrakis is a Professor of Electrical Engineering and Computer Science at Stanford University. He is also the faculty director of the Stanford Platform Lab. Christos specializes in computer architecture and systems software design. His current research focuses on cloud computing, systems for machine learning, and machine learning for systems. He has also held full-time or part-time positions with Google, Intel, Microsoft, and Nvidia. Christos holds a BS degree from the University of Crete and a PhD degree from the University of California at Berkeley. He is a fellow of the ACM and the IEEE. He has received the ACM SIGARCH Maurice Wilkes Award, the ISCA Influential Paper Award, the NSF CAREER Award, the Okawa Foundation Research Grant, and faculty awards from IBM, Microsoft, and Google.