On-device LLMs are no longer niche. The Local-LLMs leap centers on Gemma Web, a fully offline, in-browser AI workspace that runs Gemma models entirely in the browser via WebAssembly, using MediaPipe's LLM Inference API. It is serverless and private, and it supports local RAG: PDFs and documents are processed in a Web Worker and stored in IndexedDB for offline use [1].
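For context, here is a minimal sketch of what in-browser inference with the MediaPipe LLM Inference API looks like on the web. The model path, CDN URL, and generation parameters are illustrative assumptions, not Gemma Web's actual configuration.

```typescript
import { FilesetResolver, LlmInference } from '@mediapipe/tasks-genai';

async function runLocalGemma(prompt: string): Promise<string> {
  // Load the WASM runtime for MediaPipe GenAI tasks (CDN URL is one
  // common choice; an app can also self-host these files).
  const genai = await FilesetResolver.forGenAiTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-genai/wasm'
  );

  // Create the LLM task from a locally served Gemma model file
  // (hypothetical path; once fetched, the model can be cached
  // client-side so later sessions work fully offline).
  const llm = await LlmInference.createFromOptions(genai, {
    baseOptions: { modelAssetPath: '/models/gemma-2b-it-gpu-int4.bin' },
    maxTokens: 512,
    topK: 40,
    temperature: 0.8,
  });

  // Single-shot generation; a streaming variant takes a callback
  // that receives partial results as they are produced.
  return llm.generateResponse(prompt);
}

runLocalGemma('Summarize the uploaded PDF in two sentences.')
  .then(console.log);
```

Because everything above executes in the page (or a Web Worker), prompts and documents never leave the device, which is the privacy property the project advertises.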
Pocket LLM is another angle: an offline, on-device chat app for Apple Silicon with no login and no data collection, powered by Apple MLX [2]. It is marketed as fast and private, with support for multiple models.
References
I built "Gemma Web": A fully private, in-browser AI workspace that runs 100% offline via WASM. Would love your feedback!
Describes Gemma Web, an on-device, offline, private LLM workspace; the author seeks feedback, and commenters raise privacy concerns about potential data siphoning.
[2] Pocket LLM: Chat offline on device all private
Discusses on-device LLMs such as Llama, Gemma, and Qwen; privacy and offline use; criticism of the app being paid; and requests for tool or API integration and alternatives.