DDR6 could be the unlock for on-device LLMs by 2028. If DDR6 hits 10,000+ MT/s and scales across dual- and quad-channel setups, memory bandwidth may finally stop bottlenecking larger local models [1].
DDR6 and Local LLMs
In the LocalLLaMA discussions, rising RAM speeds combined with smarter quantization could push 8B- to 27B-parameter models onto everyday devices at chat-ready speeds. Proponents point to benchmarks of models like Gemma 3 27B, DeepSeek V3, and GLM 4.5, and remind us that even modest GPUs, such as an Nvidia GTX 650, still matter for keeping UI and display work from becoming a bottleneck [1].
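To see why bandwidth is the lever, note that token generation is dominated by streaming the model's weights from RAM once per token, so tokens per second is roughly memory bandwidth divided by the model's footprint per token. The sketch below runs that back-of-envelope math; the DDR6 speeds, channel counts, and 4.5 bits-per-weight figure are illustrative assumptions, not measurements from the thread.

```python
def tokens_per_second(bandwidth_gb_s: float, params_billions: float, bits_per_weight: float) -> float:
    """Upper-bound decode speed, assuming every token streams all weights from RAM once."""
    bytes_per_token = params_billions * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token

# Hypothetical configurations: peak bandwidth = MT/s * 8 bytes per 64-bit channel * channels.
configs = {
    "DDR5-6400 dual-channel":  6400 * 8 * 2 / 1000,   # ~102 GB/s
    "DDR6-10000 dual-channel": 10000 * 8 * 2 / 1000,  # ~160 GB/s
    "DDR6-10000 quad-channel": 10000 * 8 * 4 / 1000,  # ~320 GB/s
}

for name, bw in configs.items():
    # A dense 27B-parameter model at ~4.5 bits/weight (a typical 4-bit quant plus overhead).
    tps = tokens_per_second(bw, params_billions=27, bits_per_weight=4.5)
    print(f"{name:<26} ~{bw:4.0f} GB/s -> ~{tps:4.1f} tok/s (27B @ 4.5 bpw)")
```

Even these optimistic ceilings show why quad-channel DDR6 and aggressive quantization both matter: halving bits per weight buys roughly as much decode speed as doubling the channel count.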
Sigma-Delta Quantization (SDQ-LLM)
SDQ-LLM enables extremely low-bit LLMs by upsampling weights and passing them through a Sigma-Delta Quantizer, encoding high-precision parameters into 1-bit or roughly 1.58-bit representations and replacing multiplications with additions [2]. An adjustable Over-Sampling Ratio (OSR) trades model size against accuracy, and MultiOSR distributes the OSR budget across layers, and within each layer, according to weight variance [2]. Hadamard-based weight smoothing helps stabilize quantization, and tests on the OPT and LLaMA families show the approach can stay robust even under aggressive low-OSR settings [2].
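As a way to picture the OSR trade-off, here is a minimal first-order sigma-delta sketch in Python: each weight is upsampled by OSR, a running error accumulator drives a 1-bit decision, and averaging the codes back down reconstructs the weight with error roughly proportional to 1/OSR. This is an assumption-laden illustration, not the SDQ-LLM implementation; Hadamard smoothing, MultiOSR allocation, and the ~1.58-bit encoding from the paper are omitted.

```python
import numpy as np

def sigma_delta_quantize(weights: np.ndarray, osr: int = 4) -> np.ndarray:
    """Encode weights (expected in [-1, 1]) as groups of `osr` 1-bit codes in {-1, +1}.

    A first-order sigma-delta loop: integrate the input, emit the sign,
    and feed the quantization error back so it averages out over the group.
    Illustrative sketch only, not the SDQ-LLM reference code.
    """
    upsampled = np.repeat(weights, osr)      # hold each weight constant for `osr` samples
    codes = np.empty_like(upsampled)
    acc = 0.0                                # integrator state (accumulated error)
    for i, x in enumerate(upsampled):
        acc += x
        codes[i] = 1.0 if acc >= 0 else -1.0
        acc -= codes[i]                      # subtract what was emitted, keeping the residual
    return codes.reshape(len(weights), osr)

def sigma_delta_dequantize(codes: np.ndarray) -> np.ndarray:
    """Reconstruct approximate weights by averaging each group of 1-bit codes."""
    return codes.mean(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.uniform(-0.8, 0.8, size=8)       # toy "weights", already scaled into range
    for osr in (2, 4, 16):
        w_hat = sigma_delta_dequantize(sigma_delta_quantize(w, osr))
        print(f"OSR={osr:>2}  mean abs error = {np.abs(w - w_hat).mean():.4f}")
```

Raising OSR stores more 1-bit codes per weight, so the encoded model grows, in exchange for a finer reconstruction; that is the size-versus-accuracy dial the paper exposes, with MultiOSR deciding where the extra codes are spent.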
Together, these bets sketch a path to on-device LLMs you can actually run by 2028. Keep an eye on real-world benchmarks, tooling, and the hardware-software handoff as DDR6 memory and SDQ-LLM push local inference from niche to normal.
References
[1] Will DDR6 be the answer to LLM?
Discusses DDR6 memory for local LLMs, parameter sizes, MoE architectures, bandwidth, and cost vs cloud, predicting local use by 2028.
[2] SDQ-LLM: Sigma-Delta Quantization for 1-bit LLMs of any size
Presents Sigma-Delta 1-bit quantization for LLMs with an adjustable OSR and weight smoothing; evaluates the OPT and LLaMA families and claims modest superiority over existing low-bit methods.