Client-side Gen AI: Performance demo with a worker ⏱️
This demo showcases web performance considerations and tips for client-side (in-browser) Gen AI. The page downloads the Gemma 2B model and runs it through the MediaPipe LLM Inference API.
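Below is a minimal sketch of how the model can be loaded and queried inside a dedicated worker, assuming the published `@mediapipe/tasks-genai` package; the model URL, WASM CDN path, and message shapes are placeholders for illustration, not the demo's actual code.

```ts
// worker.ts — a minimal sketch, assuming the @mediapipe/tasks-genai package.
// The model URL and WASM CDN path below are placeholders.
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

const MODEL_URL = '/models/gemma-2b-it-gpu-int4.bin'; // placeholder path

let llm: LlmInference;

async function init() {
  // Fetch the WASM runtime, then download and prepare the model.
  // Both steps happen inside the worker, off the main thread.
  const fileset = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
  );
  llm = await LlmInference.createFromOptions(fileset, {
    baseOptions: { modelAssetPath: MODEL_URL },
  });
  self.postMessage({ type: 'ready' });
}

// Run inference for each prompt the page sends and post the answer back.
self.onmessage = async (event: MessageEvent<{ prompt: string }>) => {
  const text = await llm.generateResponse(event.data.prompt);
  self.postMessage({ type: 'answer', text });
};

init();
```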
The animation below should keep running smoothly, without jitter or freezes, while you use the LLM. That's because both the model preparation steps and the inference work have been moved off the main thread, into a dedicated web worker.
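On the page side, the main thread only exchanges small messages with the worker, as in this sketch (the worker file name and `#output` element are assumptions matching the worker sketch above):

```ts
// main.ts — a minimal sketch of the page side, assuming the worker above.
const worker = new Worker(new URL('./worker.ts', import.meta.url), {
  type: 'module',
});

worker.onmessage = (event: MessageEvent) => {
  if (event.data.type === 'answer') {
    document.querySelector('#output')!.textContent = event.data.text;
  }
};

// The prompt round-trips through postMessage; the model download and the
// inference never run on this thread, so requestAnimationFrame callbacks
// keep firing and the animation stays smooth.
function ask(prompt: string) {
  worker.postMessage({ prompt });
}
```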