Offline by design
Download once, run forever — the model caches in the browser and works on a plane. Private by physics: inference never touches a server.
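As a rough illustration of "download once, run forever", the sketch below shows a cache-first loader. It is not the actual TinyLM loader. The `ModelStore` interface, the `Downloader` type, `loadModel`, and `MemoryStore` are hypothetical names introduced here. In a real browser the store would presumably be backed by the Cache API or IndexedDB, and the downloader by `fetch`.

```typescript
// Hypothetical storage interface: in a browser this could wrap the Cache API
// or IndexedDB. It is kept abstract so the sketch stays self-contained.
interface ModelStore {
  get(key: string): Promise<Uint8Array | undefined>;
  put(key: string, bytes: Uint8Array): Promise<void>;
}

// Hypothetical downloader signature; in a browser this would typically use fetch().
type Downloader = (url: string) => Promise<Uint8Array>;

// Cache-first load: return stored weights if present; otherwise download once,
// store the bytes, and return them.
async function loadModel(
  url: string,
  store: ModelStore,
  download: Downloader
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return cached; // offline path: no network access needed
  }
  const bytes = await download(url);
  await store.put(url, bytes);
  return bytes;
}

// In-memory store, used only to exercise the sketch outside a browser.
class MemoryStore implements ModelStore {
  private readonly entries = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.entries.get(key);
  }

  async put(key: string, bytes: Uint8Array): Promise<void> {
    this.entries.set(key, bytes);
  }
}
```

After the first successful call, later calls with the same URL are served from the store, so they succeed even if the downloader would fail because the device is offline.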
Tiny instruct models that run 100% offline in any phone browser — on our own Rust-to-WASM engine. No GPU, no cloud, no data leaving the device.
A from-scratch Rust-to-WASM inference engine with SIMD int8 and ternary paths. No llama.cpp, no transformers.js — every byte of the runtime is ours to optimize.
The eeny / meeny / miny family scales from a 999K-parameter 1.76 MB storyteller to multi-million-parameter ternary models — pick the size your device and latency budget allow.
Models this small fine-tune with LoRA in minutes — on your own device or in the cloud — so every industry gets its own specialist, not a generalist.
Built for the worst-case target: a mid-range phone CPU in a browser tab. Everything stronger — laptops, desktops, WebGPU — is upside.
Use TinyLM for instant, private, on-device answers; escalate to the SPRAPP Panel when a decision needs frontier models debating it.
Free for personal, research, education, and evaluation use. Commercial use is also free while product revenue stays under US$10K per year, so you only pay once your product makes money.
999K params · 1.76 MB · BPB 0.625 · ~417 tok/s in-browser
6.2M params · 6.99 MB · BPB 0.519 · ~343 tok/s in-browser
One license covers commercial use in ONE product or domain. Annual includes all minor updates; lifetime includes all updates within the major version. Purchasers in World Bank low- and lower-middle-income countries: use code PPP60 at checkout for 60% off. Scale deployments (over US$1M revenue or 100K MAU) need a custom license: contact sprappcom@gmail.com.
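The license thresholds above can be read as a small decision rule. The sketch below is one possible reading, not the official terms. The names `Use`, `Tier`, and `licenseTier` are hypothetical. It assumes the US$1M figure is annual revenue, that exactly US$10K counts as paid, and that the scale check takes priority over the free allowance.

```typescript
type Use = "personal" | "research" | "education" | "evaluation" | "commercial";
type Tier = "free" | "paid" | "custom";

// Thresholds taken from the text; how the boundaries are handled is an assumption.
const FREE_REVENUE_USD = 10_000;
const SCALE_REVENUE_USD = 1_000_000;
const SCALE_MAU = 100_000;

function licenseTier(
  use: Use,
  annualRevenueUsd: number,
  monthlyActiveUsers: number
): Tier {
  // Non-commercial uses are free regardless of size.
  if (use !== "commercial") return "free";
  // Scale deployments need a custom license (assumed to override the free allowance).
  if (annualRevenueUsd > SCALE_REVENUE_USD || monthlyActiveUsers > SCALE_MAU) {
    return "custom";
  }
  // Small commercial products stay free under the revenue threshold.
  if (annualRevenueUsd < FREE_REVENUE_USD) return "free";
  // Everything in between needs a standard annual or lifetime license.
  return "paid";
}
```

For instance, under these assumptions a commercial product earning US$50K per year with 2,000 monthly active users would need a standard license, while a research project of any size would stay free.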
Early access is open for teams that need offline, private, fine-tuned models. Tell us your use case.