ServerlessLLM
ServerlessLLM is an open-source framework dedicated to making custom LLM deployment easy, fast, and affordable. As models grow in size and complexity, deploying them on distributed GPUs has become increasingly costly and technically challenging, limiting the benefits of custom LLM deployment to a select few. ServerlessLLM tackles these challenges through a full-stack, LLM-centric serverless system design, integrating multiple LLM-optimized layers—from checkpoint formats and inference runtimes to the storage layer and cluster scheduler.