The open-vs-closed LLM debate is running hot in 2025: open-source leaderboards, calls for fully open models, and questions about where world knowledge should live relative to proprietary weights. Open benchmarks are shaping the conversation, with leaderboards and comparisons that span both open and closed teams [1] [3].
Open leaderboards and open weights
Open-source teams chase visibility on livebench.ai and lmarena.ai, with broader context from openrouter.ai's rankings [1]. These scores matter because they feed real-world decisions about which open weights look competitive today.
Fully open models and open labs
Is there any truly open LLM? The community points to labs like AllenAI, whose OLMo and OLMo-2 releases publish data and training code, and to LLM360's K2-65B, all available on HuggingFace and GitHub [2]. The consensus: plenty of fully open efforts exist, though the resources needed to train from scratch remain a barrier for many [2].
World knowledge and how to handle it
Where world knowledge should live is debated. Many argue for combining open weights with retrieval tooling, whether web search or local knowledge bases, via RAG to shore up factual accuracy. Offline Wikipedia copies loaded into vector databases are part of the playbook [3].
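The RAG pattern described above can be sketched in a few lines. This is a toy stand-in: real deployments use a neural embedding model and a vector database over offline Wikipedia snippets, whereas here bag-of-words cosine similarity plays both roles, and the corpus strings are purely illustrative.

```python
# Toy RAG retrieval sketch: bag-of-words vectors stand in for the neural
# embeddings and vector database a real pipeline would use.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Real systems use a learned embedding model; this is a toy proxy.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Return the k corpus snippets most similar to the query."""
    q = embed(query)
    return sorted(corpus, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

# Illustrative snippets; a real corpus would be chunked Wikipedia articles.
corpus = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Quantization reduces model weight precision to save memory.",
]
context = retrieve("when was the eiffel tower completed", corpus)
print(context[0])
```

A real pipeline would prepend the retrieved snippet to the model's prompt, so factual grounding comes from the corpus rather than from the weights alone.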
Open-source weaknesses exposed
One post calls out a key flaw: open models lack the feedback loops from usage data that closed systems enjoy, which slows refinement, and synthetic data is not a perfect substitute [4].
Quantization and perceived quality
Quantization results can be messy: users questioned the quantization sensitivity of models like GLM-4.5-Air-FP8 and GLM-4.6-REAP-266B-A32B-awq-sym, but a setup error initially skewed one user's results. After fixing a Docker issue, their vLLM-powered runs matched expectations [5].
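Why lower bit widths can hurt quality is easy to see with symmetric integer quantization, the scheme behind sym formats like the AWQ variant named above. This sketch is illustrative only: the weight values are random, and real quantizers add per-group scales and calibration that this omits.

```python
# Sketch of symmetric integer quantization: map floats to a signed integer
# grid, then back, and measure the reconstruction error at two bit widths.
import random

def quantize_dequantize(weights: list[float], bits: int) -> list[float]:
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for int8, 7 for int4
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) * scale for w in weights]

random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(1000)]

for bits in (8, 4):
    recon = quantize_dequantize(weights, bits)
    err = max(abs(w - r) for w, r in zip(weights, recon))
    print(f"int{bits} max abs error: {err:.4f}")
```

The coarser 4-bit grid produces a visibly larger rounding error than 8-bit, which is why careful benchmarking (and ruling out serving-stack misconfiguration first) matters before blaming a model for quantization sensitivity.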
Bottom line: openness, tooling, and honest benchmarking will shape the 2025 open-vs-closed story.
References
[1] Is there a leaderboard of current open source models? — Asks for a current leaderboard of open-source LLMs; cites benchmarks, live ranking sites, and locally runnable models for comparison.
[2] Is there any truly and fully open source LLM? — Discusses fully open-source LLMs, data and training-code availability, resource costs, LoRA training, and open labs such as AllenAI projects.
[3] Which model has the best world knowledge? Open weights and proprietary. — Evaluates open vs. closed weights, RAG usage, and world knowledge; compares Kimi, Llama, Ring, Gemini, and GPT-4.5, plus architecture trade-offs and deployment considerations.
[4] An inherent weakness in open source models — Discusses feedback gaps in open-source LLMs vs. closed-source, data use, RLHF, synthetic data, and opt-in feedback-sharing ideas for improvement.
[5] Is GLM 4.5 / 4.6 really sensitive to quantisation? Or is vLLM stupifying the models? — Discusses GLM quantization effects, vLLM usage, Air vs. full models; compares quality, precision, and pruning impacts in practice.