The State of On-Device LLMs

sota
01:39:41
Xuan-Son Nguyen, an engineer at Hugging Face, specializes in on-device large language models (LLMs) and runtime optimization. He works extensively with llama.cpp and ONNX, and maintains the GGUF / llama.cpp integration on the Hugging Face Hub.
He will host a 1-hour deep dive into the current state of on-device LLMs — exploring their advantages, performance trade-offs, and current limitations. The session will conclude with an open Q&A during which you can ask him anything about on-device inference, model formats, and deployment challenges.
Engineering Spiciness level: 🌶️🌶️🌶️
Speaker
Xuan-Son Nguyen
Engineer at Hugging Face
