
Open vs Closed: The 2025 LLM Community Chooses OSS Weights Over Proprietary Giants


Open-source weights and OSS tooling are winning over the 2025 LLM crowd. The shift is landing in real-world workflows, not just hype.

Dev workflows go OSS
- Qwen3, GLM 4.6, and Kimi K2 running inside VS Code via Hugging Face Copilot Chat prove surprisingly solid for coding, cheaper, and free from token anxiety; you can swap models on the fly for different tasks (a minimal sketch follows this list) [2].
- GPT-5 Mini remains the cheapest option in the mix, underscoring the cost angle critics have long cited in the open-weights debate [2].
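A minimal sketch of that swap-on-the-fly workflow, assuming Hugging Face's OpenAI-compatible router endpoint and illustrative model IDs (the endpoint, token, and model names are assumptions, not details from the post):

```python
# Minimal sketch: routing tasks to different open models through an
# OpenAI-compatible endpoint. URL, token, and model IDs are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",  # assumed HF router endpoint
    api_key="hf_...",                             # your Hugging Face token
)

# Pick a model per task instead of committing to a single provider.
MODELS = {
    "coding": "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    "review": "zai-org/GLM-4.6",
    "chat":   "moonshotai/Kimi-K2-Instruct",
}

def ask(task: str, prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODELS[task],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(ask("coding", "Write a binary search in Python."))
```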

Under the hood, progress is concrete
- The Qwen3-Next PR in llama.cpp has been validated with a small test model, with a roadmap of performance ramps and new kernels to boost speed (a local smoke-test sketch follows this list) [3].
- Updates include fixes around multi-batch convolution and planned CUDA kernels, showing real engineering momentum behind open stacks [3].
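For readers who want to poke at a small test model themselves, here is a minimal local smoke test using the llama-cpp-python bindings; the GGUF path and settings are placeholders, and the PR itself is C++ work inside llama.cpp:

```python
# Minimal smoke test of a small GGUF model via llama-cpp-python.
# The model path is a placeholder; none of this comes from the PR itself.
from llama_cpp import Llama

llm = Llama(
    model_path="./qwen3-next-test.gguf",  # hypothetical small test model
    n_ctx=2048,                           # modest context for a quick check
)

out = llm("Q: What is 2 + 2? A:", max_tokens=8)
print(out["choices"][0]["text"])
```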

Open data, governance, and tooling
- bagofwords showcases an open data layer that connects any LLM to any data source, with agentic workflows, governance, RBAC/SSO, and on-prem deployment options (a generic governance sketch follows) [4].
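To make the governance angle concrete, here is a generic sketch of an RBAC check gating LLM-generated SQL before it reaches a data source; the names are illustrative and are not bagofwords' actual API, which the post doesn't detail:

```python
# Generic sketch of RBAC-gated data access for an LLM agent.
# All names here are illustrative, not bagofwords' API.
from dataclasses import dataclass

@dataclass
class User:
    name: str
    roles: set[str]

# Role -> tables that role may query.
ACCESS_POLICY = {
    "analyst": {"sales", "products"},
    "admin":   {"sales", "products", "employees"},
}

def allowed_tables(user: User) -> set[str]:
    tables: set[str] = set()
    for role in user.roles:
        tables |= ACCESS_POLICY.get(role, set())
    return tables

def run_agent_query(user: User, table: str, sql: str) -> str:
    # Enforce the policy before any LLM-generated SQL is executed.
    if table not in allowed_tables(user):
        raise PermissionError(f"{user.name} may not query {table}")
    return f"executing on {table}: {sql}"  # stand-in for a real connector

print(run_agent_query(User("dana", {"analyst"}), "sales",
                      "SELECT SUM(amount) FROM sales"))
```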

Benchmarks that matter, openly shared
- FamilyBench results put Gemini 2.5 Pro at ~81.5% accuracy, with Claude Sonnet 4.5 close behind, Qwen 3 Next 80B breaking 70%, and GLM 4.6 rising in the mix [5].

The thread running through these posts is clear: openness is delivering accessibility, governance, and vibrant developer ecosystems in 2025’s LLM space. Open weights aren’t just a trend; they’re becoming the default for builders who care about control and community.


References

[2] Reddit: "My experience coding with open models (Qwen3, GLM 4.6, Kimi K2) inside VS Code." Discusses open-source coding LLMs (Qwen3, GLM 4.6, Kimi K2) versus GPT-5/Claude: affordability, local use, flexibility, and benchmarks across tasks.

[3] Reddit: "The Qwen3-Next PR in llama.cpp has been validated with a small test model." Discusses Qwen3-Next in llama.cpp, benchmark debates, CUDA plans, RAM needs, and opinions on coding performance versus OSS 120B models.

[4] HackerNews: "Show HN: I built an open-source AI data layer that connects any LLM to any data." Open-source project linking any LLM to data sources with centralized context, governance, observability, and agent-driven workflows for chatting with dashboards.

[5] Reddit: "[Update] FamilyBench: New models tested - Claude Sonnet 4.5 takes 2nd place, Qwen 3 Next breaks 70%, new Kimi weirdly below the old version, same for GLM 4.6." Tests LLMs on FamilyBench, a family-tree reasoning task; reports Claude Sonnet 4.5 gains, Qwen 3 Next 80B progress, and strong GLM 4.6 results.
