Luo Mai

Assistant Professor

University of Edinburgh

About Me

I am an Assistant Professor in the School of Informatics at the University of Edinburgh. Starting August 2025, I’ll be promoted to Associate Professor (UK Reader).

At Edinburgh, I lead the Large-Scale Machine Learning Systems Group. I also co-lead the UK EPSRC Centre for Doctoral Training in Machine Learning Systems and an ARIA Project on Scaling AI Compute by 1000X.

My research interests lie at the intersection of computer systems, machine learning, and data management. My work has resulted in award-winning computer systems, recognized at top venues such as OSDI, SOSP, NSDI, VLDB, JMLR, ICML, NeurIPS, and ECCV. I have received awards from Google, Microsoft, Alibaba, and Tencent. I am the author of the open-source textbook Machine Learning Systems: Design and Implementation and a co-founder of several popular open-source projects, such as TensorLayer, TorchOpt, and ServerlessLLM.

Before joining Edinburgh, I was a research associate at Imperial College London, working with Peter Pietzuch, and a visiting researcher at Microsoft Research. My PhD, supervised by Paolo Costa and Alexander L. Wolf, was supported by a Google Fellowship in Cloud Computing.

Interests

Computer Systems
Machine Learning
Data Management

Education

PhD in Computer Science, 2018

Imperial College London, UK

MRes in Advanced Computing, 2012

Imperial College London, UK

News

[03/25, Achievement] Starting August 2025, I’ll be Reader (Associate Professor).

[03/25, Paper] WaferLLM, the world's fastest LLM inference system, has been accepted to OSDI 2025.

[10/24, Grant] Secured a prestigious ARIA grant with Imperial College & Cambridge University.

[10/24, Paper] Tenplex, the first elastic LLM system, accepted to SOSP 2024.

[07/24, Student Achievement] Congrats to Yao Fu on winning the 2024 Rising Star in ML & Systems.

[07/24, Paper] ServerlessLLM, the first serverless LLM system, accepted to OSDI 2024.

[05/24, Award] Won a Microsoft Research Startrack Scholar Award.

[03/24, Grant] Secured funds from EPSRC and industry partners to build a CDT for ML Systems.

[12/23, Paper & Award] TorchOpt was accepted by JMLR and became a PyTorch Ecosystem Project.

[10/23, Award] Finalist for the Chancellor's Rising Star in Research.


Publications

Congjie He, Yeqi Huang, Pei Mu, Ziming Miao, Jilong Xue, Lingxiao Ma, Fan Yang, Luo Mai (2025). WaferLLM: Large Language Model Inference at Wafer Scale. In OSDI.


Marcel Wagenländer, Guo Li, Bo Zhao, Luo Mai, Peter Pietzuch (2024). Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections. In SOSP.


Chuanhao Sun, Zhihang Yuan, Kai Xu, Luo Mai, N Siddharth, Shuo Chen, Mahesh K Marina (2024). Learning high-frequency functions made easy with sinusoidal positional encoding. In ICML.


Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai (2024). ServerlessLLM: Low-Latency Serverless Inference for Large Language Models. In OSDI.


Jie Ren*, Xidong Feng*, Bo Liu*, Xuehai Pan*, Yao Fu, Luo Mai, Yaodong Yang (2023). TorchOpt: An Efficient Library for Differentiable Optimization. In JMLR.


Hanjing Wang*, Man-Kit Sit*, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai (2023). GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models. In ICML.


Chijun Sima*, Yao Fu*, Man-Kit Sit, Liyi Guo, Xuri Gong, Feng Lin, Junyu Wu, Yongsheng Li, Haidong Rong, Pierre-Louis Aublin, Luo Mai (2022). Ekko: A Large-Scale Deep Learning Recommender System with Low-Latency Model Update. In OSDI.


Bo Liu, Xidong Feng, Jie Ren, Luo Mai, Rui Zhu, Haifeng Zhang, Jun Wang, Yaodong Yang (2022). A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning. In NeurIPS.


Jie Ren*, Wenteng Liang*, Ran Yan, Luo Mai, Shiwen Liu, Xiao Liu (2022). MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment. In ECCV.
