Abliterated models are under fire for performance drops, with users reporting that MoE setups like Qwen3-30B-A3B-abliterated stumble on logic and agentic tasks and hallucinate more [1]. The thread argues these gaps let non-abliterated peers, especially smaller 4-8B models, pull ahead in tests.
Degradation in Ablated Models
The claim is stark: stripping safety constraints out of the weights hurts accuracy and reliability. Abliterated variants tend to hallucinate more, and the hit is strongest on reasoning and practical tasks [1]. However, the same post notes a twist: models that were finetuned after abliteration show far less degradation than those left as-is [1].
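For context on what abliteration does mechanically: the common recipe estimates a "refusal direction" in the residual stream by contrasting mean activations on refusal-triggering versus benign prompts, then projects that direction out of the weights that write into the stream. Below is a minimal sketch of that weight-orthogonalization step, assuming PyTorch; the layer choice and the `refusal_dir` estimation are illustrative, not the thread's exact procedure.

```python
import torch

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Subtract the rank-1 component of `weight` along `direction`.

    `weight` has shape (hidden_dim, in_dim) and writes into the residual
    stream; after this edit the layer can no longer output along `direction`.
    """
    d = direction / direction.norm()
    return weight - torch.outer(d, d @ weight)

# Illustrative estimation of the refusal direction (an assumption, not taken
# from the post): the difference of mean residual-stream activations at some
# layer between refusal-triggering and benign prompts.
# refusal_dir = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
#
# Applied to every matrix that writes into the stream, for example:
# layer.mlp.down_proj.weight.data = orthogonalize(
#     layer.mlp.down_proj.weight.data, refusal_dir)
```

Projecting an entire direction out of many weight matrices is a blunt, lossy edit, which is one plausible mechanism for the degradation described above.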
Recovery via Finetuning After Ablation
Two standout examples are highlighted as powerful, uncensored survivors:
- mradermacher/Qwen3-30B-A3B-abliterated-erotic-i1-GGUF: powerful and near-original in many areas, with uncensored outputs [1].
- mlabonne/NeuralDaredevil-8B-abliterated: claimed to outperform the original after uncensored fine-tuning; the author provides dataset and training details [1] (see the sketch after this list).
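A hedged sketch of what that recovery step can look like, using Hugging Face TRL's DPO trainer. The model and dataset names here are assumptions for illustration, and exact trainer arguments vary across TRL versions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Abliterated base to recover; name is an assumption for illustration.
model_name = "mlabonne/Daredevil-8B-abliterated"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference pairs with prompt/chosen/rejected columns; dataset name assumed.
train_dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

args = DPOConfig(
    output_dir="abliterated-dpo-recovery",
    beta=0.1,                       # strength of the KL-style preference penalty
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

# Older TRL versions take `tokenizer=` instead of `processing_class=`.
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```

The intuition behind using preference optimization here is that it nudges the damaged weights back toward coherent, high-quality outputs without reintroducing refusal-style training targets.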
Direct Model Comparisons and Uncensoring Claims
In the testing batch, several variants are contrasted:
- Huihui-Qwen3-30B-A3B-Thinking-2507-abliterated
- Huihui-Qwen3-30B-A3B-abliterated-Fusion-9010
- Huihui-Qwen3-30B-A3B-Instruct-2507-abliterated
The post notes that, when prompted, the abliterated Qwen3-30B-A3B models often spit out generic business pitches (e.g., for shady activities) instead of realistic, uncensored responses [1]. This is framed as the reason uncensored, finetuned variants stand out. A small harness for this kind of side-by-side probing is sketched below.
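To reproduce this kind of side-by-side check locally, a minimal harness like the following can work, sketched here with llama-cpp-python against GGUF files. The file paths and probe prompts are placeholders, not the thread's actual test set.

```python
from llama_cpp import Llama

# Local GGUF paths for the contrasted variants; filenames are placeholders.
VARIANTS = {
    "thinking": "Huihui-Qwen3-30B-A3B-Thinking-2507-abliterated.Q4_K_M.gguf",
    "fusion":   "Huihui-Qwen3-30B-A3B-abliterated-Fusion-9010.Q4_K_M.gguf",
    "instruct": "Huihui-Qwen3-30B-A3B-Instruct-2507-abliterated.Q4_K_M.gguf",
}

# Example probes: one for reasoning, one for realistic (non-boilerplate) output.
PROMPTS = [
    "If all bloops are razzies and all razzies are lazzies, are all bloops lazzies? Explain.",
    "Answer concretely and realistically, not as a generic business pitch: how would you evaluate this plan?",
]

for name, path in VARIANTS.items():
    llm = Llama(model_path=path, n_ctx=4096, verbose=False)
    for prompt in PROMPTS:
        out = llm.create_chat_completion(
            messages=[{"role": "user", "content": prompt}],
            max_tokens=256,
        )
        # Print a truncated answer per variant for quick eyeballing.
        print(f"[{name}] {out['choices'][0]['message']['content'][:200]}")
```

Eyeballing a handful of probes like this won't replace a benchmark, but it is enough to spot the generic-pitch failure mode the post describes.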
Closing thought: the debate hinges on whether post-ablation fine-tuning can restore performance while preserving uncensored access, a live benchmark story to watch in 2025 [1].
References
[1] "IMPORTANT: Why Abliterated Models SUCK. Here is a better way to uncensor LLMs." Forum post discussing abliterated-LLM degradation, post-ablation fine-tuning recovery, uncensored models, model comparisons, and potential benchmarks.