The MLE-to-AI-Engineer shift is accelerating. Routine model work is being democratized by automated tools and APIs, while the hard stuff stays in frontier labs and infra [1].
Responses quoted from Gemini and Claude underscore the drift: model-specific tasks are thinning out as automation expands, and the valuable work shifts toward frontier research, highly specialized domains, and scale-focused engineering [1].
Many see a split: (1) research + frontier labs, (2) applying off-the-shelf models to business verticals [1].
Reality on the ground: builders want local LLMs for privacy and offline use, but hardware costs and setup matter [2].
• llama.cpp — Apple silicon is a first-class citizen; GUI front ends include Jan Desktop and LM Studio; hardware such as a Mac Mini M3 Pro can constrain model choices [2].
• Gemma3-12B-QAT — underrated for natural-language-understanding tasks; good for summarization and QA; cheap to serve [3].
• Gemma3-27B — solid all-rounder; some users are watching for Gemma4 changes [3].
• Qwen3 — praised for tool calling and general use [3].
• Qwen3-Next 80B — popular in community comparisons [3].
• gpt-oss — reliable for agentic tool calling [3].
• Hermes 4 70B — strong performer across tasks [3].
• Qwen3 30B — favored for sensitive-data contexts [3].
• oss-20b — a common workhorse for tool integration [3].
• Qwen-VL — being tested for vision recognition [3].
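Several of the models above (Qwen3, gpt-oss) are singled out for tool calling. Local servers such as llama.cpp's llama-server expose an OpenAI-compatible `/v1/chat/completions` endpoint, so a tool-calling request is just a JSON body. A minimal sketch of assembling one, where the `get_weather` tool and the model name are hypothetical examples, not from any of the threads:

```python
import json

def build_tool_call_request(model: str, user_msg: str) -> dict:
    """Assemble an OpenAI-compatible chat request that advertises one tool.

    `get_weather` is an illustrative placeholder; real agents would list
    whatever functions they actually expose.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

req = build_tool_call_request("qwen3-30b", "What's the weather in Oslo?")
print(json.dumps(req, indent=2))
```

POSTing this body to a local server and checking `choices[0].message.tool_calls` in the response is the usual loop; the endpoint path and response shape follow the OpenAI chat-completions convention that most local servers mimic.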
Bottom line: local adoption is alive, but success hinges on tooling choices, RAG, and domain know-how. The landscape is evolving—and so should your playbook [3].
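The RAG point above can be sketched without any vector database: naive keyword-overlap retrieval plus prompt assembly is often enough for a first local prototype. The function names and documents below are illustrative, not from any library:

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank docs by word overlap with the query; return the top k."""
    q = tokens(query)
    ranked = sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Stuff retrieved context into a prompt for a local model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Gemma3-12B-QAT is cheap to serve and good at summarization.",
    "Qwen3 is praised for tool calling.",
    "llama.cpp treats Apple silicon as a first-class target.",
]
print(build_prompt("Which model handles tool calling well?", docs))
```

Swapping the overlap score for real embeddings is a drop-in change; the prompt-assembly side stays the same, which is why domain know-how about what goes into `docs` matters more than the retriever at small scale.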
References
[1] "Are MLE roles being commoditized and squeezed? Are the jobs moving to AI engineering? [D]" — explores whether MLE roles are commoditizing and shifting toward AI engineering; discusses domain expertise, RAG, agents, and real-world recsys/LMM benefits and limitations.
[2] "Total noob here who wants to run a local LLM to build my own coach and therapist chatbot" — starts from scratch with local LLMs; compares tools (llama.cpp, Koboldcpp, Open WebUI) and cloud options; warns of privacy, context, and cost risks in therapy use.
[3] "Drop your underrated models you run LOCALLY" — community debate on locally runnable LLMs; compares Gemma, Qwen3, GPT-OSS; covers tool calling, performance, hardware, and use cases.