Memory-based reasoning is letting tiny LLMs tackle harder math without bigger hardware. An experiment with Qwen3-1.7B-Instruct shows math accuracy jumping from 40% to 48% on MATH Level 3-4 after adding a memory bank [1].
Memory-augmentation with a reasoning bank: Researchers built a memory system, implemented as reasoning-bank-slm, that extracts generalizable strategies from successful solutions and retrieves them for similar problems. The experiment ran on a Ryzen 9 7950X with 128GB RAM, an RTX 4090, and an RTX 3090, building a memory bank of 223 strategies from 100 training problems and evaluating on 100 held-out test problems. Phase 1 showed a clear lift in performance [1]. A sketch of the retrieval step follows.
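The post does not publish its retrieval code, so the following is a minimal sketch of how embedding-based strategy retrieval can work, assuming the bank stores one embedding per extracted strategy. The function names and top-k choice are illustrative; Qwen3-Embedding-0.6B is the embedder named in the post, loaded here via sentence-transformers.

```python
# Minimal sketch of embedding-based strategy retrieval. The bank layout,
# helper names, and k value are assumptions, not the author's exact code.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")  # embedder from the post

def build_bank(strategies: list[str]) -> np.ndarray:
    # Embed every extracted strategy once, up front.
    return model.encode(strategies, normalize_embeddings=True)

def retrieve(problem: str, strategies: list[str], bank: np.ndarray, k: int = 3) -> list[str]:
    # With normalized vectors, cosine similarity reduces to a dot product.
    query = model.encode([problem], normalize_embeddings=True)[0]
    scores = bank @ query
    top = np.argsort(-scores)[:k]
    return [strategies[i] for i in top]
```

The retrieved strategy texts would then be prepended to the prompt for the small instruct model, steering it toward a known-good approach.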
What improved: Three examples where memory turned a failure into a correct answer [1]:
• Complex plane geometry: baseline wrong; retrieved strategy “Vector Magnitude Method”; result correct (25π).
• Polynomial analysis: baseline produced no answer; retrieved “Equate Target Value to Function”; result correct (5).
• Fibonacci series summation: baseline produced no answer; retrieved “Coefficient Multiplication and Summation”; result correct (1).
Limitations and regressions: Eight problems regressed, and the dominant failure pattern was producing no answer rather than a wrong one. The author's hypothesis is that 223 memories may be too many, making retrieval noisy [1].
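If retrieval noise is indeed the culprit, one plausible mitigation (my assumption, not something the post tested) is to gate retrieved strategies on a similarity floor, so weakly related memories never reach the prompt:

```python
import numpy as np

def retrieve_filtered(query_emb: np.ndarray, bank: np.ndarray,
                      strategies: list[str], k: int = 3,
                      min_sim: float = 0.5) -> list[str]:
    # Same dot-product scoring as the sketch above, but matches below a
    # similarity floor are dropped; min_sim is a hypothetical knob, not a
    # value from the experiment.
    scores = bank @ query_emb
    top = np.argsort(-scores)[:k]
    return [strategies[i] for i in top if scores[i] >= min_sim]
```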
Next steps and setup: Phase 2 aims to close the loop by fine-tuning the model on its own successful traces [1]. Phase 1 results: accuracy rose from 40.0% baseline to 48.0% with memory, with 16 problems improved against 8 regressed, a 2:1 ratio and a net gain of +8. The memory bank holds 223 strategies [1].
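A minimal sketch of what that Phase 2 data preparation could look like, assuming traces are stored as simple records; the schema and file name below are hypothetical, since the post does not specify them:

```python
import json

# Hypothetical trace records: the post does not define a schema, so this
# structure is an assumption for illustration.
traces = [
    {"problem": "Find ...", "solution": "Step 1 ... Answer: 5", "correct": True},
]

# Keep only successful traces and write them in chat-style JSONL, a common
# input format for supervised fine-tuning pipelines.
with open("sft_data.jsonl", "w") as f:
    for t in traces:
        if not t["correct"]:
            continue  # Phase 2 fine-tunes only on successful traces
        f.write(json.dumps({
            "messages": [
                {"role": "user", "content": t["problem"]},
                {"role": "assistant", "content": t["solution"]},
            ]
        }) + "\n")
```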
Models tested: Qwen3-1.7B-Instruct, Qwen3-4B-Instruct, and Qwen3-Embedding-0.6B were the primary models examined, running on the hardware described above [1].
Memory-based reasoning could keep small LLMs competitive without top-tier hardware, but the evidence is still preliminary. Expect Phase 2 updates soon.
References
[1] “I tested if tiny LLMs can self-improve through memory: Qwen3-1.7B gained +8% accuracy on MATH problems.” Examines memory-based reasoning for small LLMs; shows a 1.7B model gaining 8% on MATH, compares with 4B, and explores Phase 2 potential and risks.