On-device AI is moving from novelty to normal. Recent posts spotlight privacy-first, locally run workflows, from a private CLI translator to a WearOS assistant and a local RAG server, and the discussion centers on privacy, performance, and UX.
• please — a private, on-device CLI that translates plain English into shell commands (tar, for example) and runs entirely in your terminal with no telemetry [1]; a minimal sketch of the pattern appears after this list.
• Local server for local RAG — discussion of deploying a relatively large (70B) LLM locally, testing it from an apartment, and weighing self-hosting against pay-as-you-go cloud options [2]; the retrieve-then-generate loop such a server runs is sketched after this list.
• Hopper on WearOS — a privacy-focused AI assistant that supports locally hosted LLMs. It collects nothing beyond anonymized crash logs, talks to models through OpenAI-compatible endpoints (see the client sketch after this list), and ships with a companion phone app and webhook-driven tools [3].
• [MEGATHREAD] Local AI Hardware — November 2025 — users share builds ranging from CPU-only desktops to NVLink-bridged NVIDIA GPU racks, running models from Qwen3-VL-2B up to gpt-oss-120b; power draw, latency, and cost vary widely [4].
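The post doesn't include please's implementation [1], but the underlying pattern is small. A minimal sketch in Python, assuming a local OpenAI-compatible server at localhost:11434 (Ollama's default port) and a hypothetical model name, neither of which comes from the post; the sketch prints the suggested command rather than executing it:

    #!/usr/bin/env python3
    # Sketch of an English -> shell-command CLI in the spirit of "please" [1].
    # Assumptions (not from the post): a local OpenAI-compatible server at
    # localhost:11434 and a model named "llama3".
    import json
    import sys
    import urllib.request

    ENDPOINT = "http://localhost:11434/v1/chat/completions"  # assumed local server
    MODEL = "llama3"  # hypothetical model name

    SYSTEM = ("You translate English requests into a single POSIX shell command. "
              "Reply with the command only, no explanation.")

    def translate(request: str) -> str:
        payload = json.dumps({
            "model": MODEL,
            "messages": [
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": request},
            ],
        }).encode()
        req = urllib.request.Request(
            ENDPOINT, data=payload, headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"].strip()

    if __name__ == "__main__":
        # e.g. python please_sketch.py "extract archive.tar.gz into ./out"
        print(translate(" ".join(sys.argv[1:])))

Printing instead of executing is deliberate: a real tool should show the generated command and ask before running anything.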
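The RAG post [2] is about hardware and cost rather than code, but the retrieve-then-generate loop a local RAG server runs is compact. A minimal sketch under the same local-server assumption, with a hypothetical embedding model; a real deployment would cache embeddings in a vector store instead of re-embedding the corpus on every query:

    # Sketch of the retrieve-then-generate loop behind a local RAG server [2].
    # Assumptions (not from the post): the local server also exposes
    # /v1/embeddings, and the document corpus fits in memory.
    import json
    import math
    import urllib.request

    BASE = "http://localhost:11434/v1"  # assumed local server
    CHAT_MODEL = "llama3"               # hypothetical; the post discusses 70B models
    EMBED_MODEL = "nomic-embed-text"    # hypothetical embedding model

    def _post(path: str, payload: dict) -> dict:
        req = urllib.request.Request(
            BASE + path, data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

    def embed(text: str) -> list[float]:
        out = _post("/embeddings", {"model": EMBED_MODEL, "input": text})
        return out["data"][0]["embedding"]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(y * y for y in b))
        return dot / (na * nb)

    def answer(question: str, docs: list[str], k: int = 2) -> str:
        # Rank documents by embedding similarity, then stuff the top-k
        # into the prompt as context.
        q = embed(question)
        top = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]
        prompt = ("Answer using only this context:\n" + "\n---\n".join(top)
                  + "\n\nQuestion: " + question)
        out = _post("/chat/completions", {
            "model": CHAT_MODEL,
            "messages": [{"role": "user", "content": prompt}]})
        return out["choices"][0]["message"]["content"]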
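Hopper's "OpenAI-compatible endpoints" [3] means the assistant can talk to any server that speaks OpenAI's HTTP API. With the official openai Python client that is a one-argument redirect; the URL and model name below are placeholders, not values from the post:

    from openai import OpenAI

    # Point a stock OpenAI client at a local server instead of api.openai.com.
    # Local servers typically ignore the API key, but the client requires one.
    client = OpenAI(base_url="http://192.168.1.20:8080/v1",  # placeholder address
                    api_key="unused-locally")

    reply = client.chat.completions.create(
        model="local-model",  # placeholder; use whatever the server advertises
        messages=[{"role": "user", "content": "Summarize my unread messages."}],
    )
    print(reply.choices[0].message.content)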
Bottom line: cloud-free AI is gaining momentum, but trade-offs in setup effort, cost, and ease of use remain. Watch how hardware density and privacy policies shape the next wave; as consumer devices gain more local compute, the privacy-first pitch will push more vendors to offer on-device options.
References
[1] Show HN: Please – local CLI that translates English –> tar. Local on-device LLM CLI that translates English into tar invocations; no telemetry; complements Codex/Crush for quick syntax lookups.
[2] Local server for local RAG. Discusses deploying a 70B LLM on a home server for testing and demos versus cloud pay-as-you-go.
[3] I built a privacy focused AI assistant for WearOS that supports locally hosted LLMs. A WearOS assistant supporting locally hosted LLMs and OpenAI-compatible endpoints; emphasizes privacy, web search, tools, and integrations.
[4] [MEGATHREAD] Local AI Hardware - November 2025. Megathread on local LLM hardware: setups, model choices, throughput, and power usage across diverse configurations.