jax-js model chat

Chat Gemma 3 270M

Running locally with jax-js + WebGPU

Options

WebGPU uses fp16 weights. Wasm casts weights to fp32 on load.
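
As a rough illustration of what casting fp16 weights to fp32 on load involves, here is a minimal TypeScript sketch that decodes IEEE 754 half-precision bit patterns into a Float32Array. The names halfToFloat and castFp16ToFp32 are hypothetical and are not part of jax-js. The sketch assumes the checkpoint's fp16 values are available as raw 16-bit words in a Uint16Array; the actual loader may do this differently.

```typescript
// Hypothetical helper (not a jax-js API): decode one IEEE 754 binary16
// bit pattern into a JavaScript number.
function halfToFloat(h: number): number {
  const sign = (h & 0x8000) !== 0 ? -1 : 1;
  const exponent = (h >> 10) & 0x1f; // 5 exponent bits
  const fraction = h & 0x03ff;       // 10 fraction bits

  if (exponent === 0) {
    // Zero and subnormals: no implicit leading 1, fixed scale of 2^-24.
    return sign * fraction * 2 ** -24;
  }
  if (exponent === 0x1f) {
    // All-ones exponent encodes infinity (fraction 0) or NaN.
    return fraction === 0 ? sign * Infinity : NaN;
  }
  // Normal numbers: implicit leading 1, exponent bias of 15.
  return sign * (1 + fraction / 1024) * 2 ** (exponent - 15);
}

// Hypothetical helper: widen a buffer of fp16 bit patterns to fp32.
// Assumes `src` holds the raw 16-bit words of the fp16 weights.
function castFp16ToFp32(src: Uint16Array): Float32Array {
  const out = new Float32Array(src.length);
  for (let i = 0; i < src.length; i++) {
    out[i] = halfToFloat(src[i]);
  }
  return out;
}
```

Every fp16 value is exactly representable in fp32, so the cast loses no precision; it only doubles the memory used per weight.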


KV cache is allocated dynamically for the current chat.

Talk to an LLM

The first message downloads and caches a 536 MB fp16 checkpoint. Everything runs locally in your browser.
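
The 536 MB figure can be sanity-checked with a little arithmetic. The sketch below assumes the checkpoint consists almost entirely of fp16 weights (2 bytes each) and that MB means decimal megabytes; both are assumptions, not statements from the page. Under them, the download corresponds to roughly 268 million weights, and the same weights cast to fp32 for Wasm would take about twice the memory. The function names are hypothetical.

```typescript
// Bytes per stored value for each precision.
const BYTES_PER_VALUE = { fp16: 2, fp32: 4 } as const;
type Dtype = keyof typeof BYTES_PER_VALUE;

// Hypothetical helper: estimate how many weights a checkpoint holds,
// assuming the file is all weights with negligible metadata.
function weightCountFromCheckpoint(sizeBytes: number, dtype: Dtype): number {
  return sizeBytes / BYTES_PER_VALUE[dtype];
}

// Hypothetical helper: memory needed to hold that many weights.
function footprintBytes(weightCount: number, dtype: Dtype): number {
  return weightCount * BYTES_PER_VALUE[dtype];
}

const MB = 1e6; // assumption: decimal megabytes
const checkpointBytes = 536 * MB;

const weightCount = weightCountFromCheckpoint(checkpointBytes, "fp16");
const fp32Bytes = footprintBytes(weightCount, "fp32");

console.log(`estimated weights: ${weightCount / 1e6} million`);
console.log(`estimated fp32 footprint: ${fp32Bytes / MB} MB`);
```

This estimate covers weights only; the KV cache and activations add to it.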