SOTA

The State of On-Device LLMs

01:39:41

Xuan-Son Nguyen, an engineer at Hugging Face, specializes in on-device large language models (LLMs) and runtime optimization. He works extensively with llama.cpp and ONNX, and maintains the GGUF / llama.cpp integration on the Hugging Face Hub.

He will host a 1-hour deep dive into the current state of on-device LLMs, exploring their advantages, performance trade-offs, and current limitations. The session will conclude with an open Q&A in which you can ask him anything about on-device inference, model formats, and deployment challenges.

Engineering Spiciness level: 🌶️🌶️🌶️

Speaker

Xuan-Son Nguyen

Engineer at Hugging Face
