On-device LLMs are no longer niche. The Local-LLMs leap centers on Gemma Web, a fully offline, in-browser AI workspace that runs Gemma models entirely in the browser via WebAssembly, using MediaPipe's LLM Inference API. It is serverless and private, and it supports local RAG: PDFs and documents are processed in a Web Worker and stored in IndexedDB for offline use [1].
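For context, here is a minimal sketch of what in-browser inference with the MediaPipe LLM Inference API looks like on the web. The model path, CDN URL, and generation parameters are illustrative assumptions, not Gemma Web's actual configuration.

```typescript
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

async function runLocalGemma(prompt: string): Promise<string> {
  // Load the WASM runtime for MediaPipe GenAI tasks (CDN URL is one
  // common choice; an app can also self-host these files).
  const genai = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
  );

  // Create the LLM task from a locally served Gemma model file
  // (hypothetical path; once fetched, the model can be cached
  // client-side so later sessions work fully offline).
  const llm = await LlmInference.createFromOptions(genai, {
    baseOptions: { modelAssetPath: '/models/gemma-2b-it-gpu-int4.bin' },
    maxTokens: 512,
    topK: 40,
    temperature: 0.8,
  });

  // Single-shot generation; a streaming variant takes a callback
  // that receives partial results as they are produced.
  return llm.generateResponse(prompt);
}

runLocalGemma('Summarize the uploaded PDF in two sentences.')
  .then(console.log);
```

Because everything above executes in the page (or a Web Worker), prompts and documents never leave the device, which is the privacy property the project advertises.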
Pocket LLM is another angle: an offline, on-device chat app for Apple Silicon with no login and no data collection, powered by Apple MLX [2]. It is marketed as fast and private, with support for multiple models.
References
I built "Gemma Web": A fully private, in-browser AI workspace that runs 100% offline via WASM. Would love your feedback!
Describes Gemma Web, an on-device, offline, private LLM workspace; the author seeks feedback, and commenters raise privacy concerns about potential data siphoning.
[2] Pocket LLM: Chat offline on device all private
Discusses on-device LLMs such as Llama, Gemma, and Qwen; privacy and offline use; criticism of the app being paid; and requests for tool or API integration and alternatives.