Client-side Gen AI: Performance demo with a worker ⏱️
This demo showcases web performance considerations and tips for client-side (in-browser) Gen AI. The page downloads the Gemma 2B model and runs it through the MediaPipe LLM Inference API.
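Below is a minimal sketch of how the model can be loaded and queried inside a dedicated worker, assuming the published `@mediapipe/tasks-genai` package; the model URL, WASM CDN path, and message shapes are placeholders for illustration, not the demo's actual code.

```ts
// worker.ts — a minimal sketch, assuming the @mediapipe/tasks-genai package.
// The model URL and WASM CDN path below are placeholders.
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

const MODEL_URL = '/models/gemma-2b-it-gpu-int4.bin'; // placeholder path

let llm: LlmInference;

async function init() {
  // Fetch the WASM runtime, then download and prepare the model.
  // Both steps happen inside the worker, off the main thread.
  const fileset = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
  );
  llm = await LlmInference.createFromOptions(fileset, {
    baseOptions: { modelAssetPath: MODEL_URL },
  });
  self.postMessage({ type: 'ready' });
}

// Run inference for each prompt the page sends and post the answer back.
self.onmessage = async (event: MessageEvent<{ prompt: string }>) => {
  const text = await llm.generateResponse(event.data.prompt);
  self.postMessage({ type: 'answer', text });
};

init();
```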
The animation below should keep running smoothly, without jitter or freezes, while you use the LLM. That's because both the model preparation steps and the inference work have been moved off the main thread, into a dedicated web worker.
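On the page side, the main thread only exchanges small messages with the worker, as in this sketch (the worker file name and `#output` element are assumptions matching the worker sketch above):

```ts
// main.ts — a minimal sketch of the page side, assuming the worker above.
const worker = new Worker(new URL('./worker.ts', import.meta.url), {
  type: 'module',
});

worker.onmessage = (event: MessageEvent) => {
  if (event.data.type === 'answer') {
    document.querySelector('#output')!.textContent = event.data.text;
  }
};

// The prompt round-trips through postMessage; the model download and the
// inference never run on this thread, so requestAnimationFrame callbacks
// keep firing and the animation stays smooth.
function ask(prompt: string) {
  worker.postMessage({ prompt });
}
```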