Memory-based reasoning is letting tiny LLMs tackle harder math without bigger hardware. An experiment with Qwen3-1.7B-Instruct shows math accuracy jumping from 40% to 48% on MATH Level 3-4 after adding a memory bank [1].
Memory-augmentation with a reasoning bank: Researchers built a memory system, implemented as reasoning-bank-slm, that extracts generalizable strategies from successful solutions and retrieves them for similar problems. The experiment ran on a Ryzen 9 7950X with 128GB RAM, an RTX 4090, and an RTX 3090, building a memory bank of 223 strategies from 100 training problems and evaluating on 100 held-out test problems. Phase 1 showed a clear lift in performance [1]. A sketch of the retrieval step follows.
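The post does not publish its retrieval code, so the following is a minimal sketch of how embedding-based strategy retrieval can work, assuming the bank stores one embedding per extracted strategy. The function names and top-k choice are illustrative; Qwen3-Embedding-0.6B is the embedder named in the post, loaded here via sentence-transformers.

```python
# Minimal sketch of embedding-based strategy retrieval. The bank layout,
# helper names, and k value are assumptions, not the author's exact code.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")  # embedder from the post

def build_bank(strategies: list[str]) -> np.ndarray:
    # Embed every extracted strategy once, up front.
    return model.encode(strategies, normalize_embeddings=True)

def retrieve(problem: str, strategies: list[str], bank: np.ndarray, k: int = 3) -> list[str]:
    # With normalized vectors, cosine similarity reduces to a dot product.
    query = model.encode([problem], normalize_embeddings=True)[0]
    scores = bank @ query
    top = np.argsort(-scores)[:k]
    return [strategies[i] for i in top]
```

The retrieved strategy texts would then be prepended to the prompt for the small instruct model, steering it toward a known-good approach.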
What improved: Three examples where memory turned a failure into a correct answer [1]:
• Complex plane geometry: baseline wrong; retrieved strategy “Vector Magnitude Method”; result correct (25π).
• Polynomial analysis: baseline produced no answer; retrieved “Equate Target Value to Function”; result correct (5).
• Fibonacci series summation: baseline produced no answer; retrieved “Coefficient Multiplication and Summation”; result correct (1).
Limitations and regressions: Eight problems regressed, and the dominant failure pattern was producing no answer rather than a wrong one. The author's hypothesis is that 223 memories may be too many, making retrieval noisy [1].
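If retrieval noise is indeed the culprit, one plausible mitigation (my assumption, not something the post tested) is to gate retrieved strategies on a similarity floor, so weakly related memories never reach the prompt:

```python
import numpy as np

def retrieve_filtered(query_emb: np.ndarray, bank: np.ndarray,
                      strategies: list[str], k: int = 3,
                      min_sim: float = 0.5) -> list[str]:
    # Same dot-product scoring as the sketch above, but matches below a
    # similarity floor are dropped; min_sim is a hypothetical knob, not a
    # value from the experiment.
    scores = bank @ query_emb
    top = np.argsort(-scores)[:k]
    return [strategies[i] for i in top if scores[i] >= min_sim]
```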
Next steps and setup: Phase 2 aims to close the loop by fine-tuning the model on its own successful traces [1]. Phase 1 results: accuracy rose from 40.0% baseline to 48.0% with memory, with 16 problems improved against 8 regressed, a 2:1 ratio and a net gain of +8. The memory bank holds 223 strategies [1].
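A minimal sketch of what that Phase 2 data preparation could look like, assuming traces are stored as simple records; the schema and file name below are hypothetical, since the post does not specify them:

```python
import json

# Hypothetical trace records: the post does not define a schema, so this
# structure is an assumption for illustration.
traces = [
    {"problem": "Find ...", "solution": "Step 1 ... Answer: 5", "correct": True},
]

# Keep only successful traces and write them in chat-style JSONL, a common
# input format for supervised fine-tuning pipelines.
with open("sft_data.jsonl", "w") as f:
    for t in traces:
        if not t["correct"]:
            continue  # Phase 2 fine-tunes only on successful traces
        f.write(json.dumps({
            "messages": [
                {"role": "user", "content": t["problem"]},
                {"role": "assistant", "content": t["solution"]},
            ]
        }) + "\n")
```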
Models tested: Qwen3-1.7B-Instruct, Qwen3-4B-Instruct, and Qwen3-Embedding-0.6B were the primary models examined, running on the hardware described above [1].
Memory-based reasoning could keep small LLMs competitive without top-tier hardware, but the evidence is still preliminary. Expect Phase 2 updates soon.
References
[1] “I tested if tiny LLMs can self-improve through memory: Qwen3-1.7B gained +8% accuracy on MATH problems.” Examines memory-based reasoning for small LLMs; shows a 1.7B model gaining 8% on MATH, compares with 4B, and explores Phase 2 potential and risks.