
Token Economics in the LLM Era: Can Images Replace Text Tokens and Why Memory Matters

Opinions on LLM Token Economics

Images as prompts? The idea is gaining traction. Two posts from pagewatch.ai ask whether you can save on LLM tokens by sending images instead of text [1][2]. A counterpoint from a LocalLLaMA discussion argues that the real token savings come from a memory layer, not from trimming individual chats [3].

Image-as-token concept

Two posts ask whether images can cut text tokens in LLM workflows [1][2]. The central question is whether visuals could substitute for typed prompts, reducing textual token usage.
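Where the break-even point sits depends on how a given provider bills image inputs. As a rough illustration (not taken from the cited posts), the Python sketch below compares a text prompt's token count against an estimated image cost, assuming tiktoken's cl100k_base encoding for the text side and the commonly cited GPT-4o-style high-detail accounting of 85 base tokens plus 170 per 512x512 tile; both are assumptions that vary by model and provider.

```python
# Back-of-the-envelope comparison: text tokens vs. image tokens.
# Assumptions (not from the cited posts): tiktoken's cl100k_base encoding
# for text, and the commonly cited GPT-4o-style high-detail image rule of
# 85 base tokens + 170 tokens per 512x512 tile.
import math

import tiktoken  # pip install tiktoken

def text_token_count(text: str) -> int:
    """Count tokens the way a cl100k-based model would."""
    return len(tiktoken.get_encoding("cl100k_base").encode(text))

def image_token_count(width: int, height: int) -> int:
    """Estimate image tokens under the assumed 85 + 170-per-tile rule.

    The image is first scaled to fit within 2048x2048, then its short
    side is scaled down to 768 px, then billed per 512x512 tile.
    """
    scale = min(1.0, 2048 / max(width, height))
    w, h = width * scale, height * scale
    scale = min(1.0, 768 / min(w, h))
    w, h = w * scale, h * scale
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return 85 + 170 * tiles

prompt = "Summarize the attached quarterly report. " * 50  # stand-in text
print("text tokens :", text_token_count(prompt))
print("image tokens:", image_token_count(1024, 1024))  # 85 + 170*4 = 765
```

On these assumptions a 1024x1024 image costs a flat 765 tokens regardless of how much text it contains, so rendering text into an image only pays off once the equivalent typed prompt would exceed that count.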

Memory-layer argument

In a separate thread, the claim is that the real token optimization happens not per chat but in the memory layer: “The real token optimization isn't per chat it's in the memory layer” [3]. The poster argues for centralized, persistent memory over bloated chat threads (“Centralized, persistent memory > bloated chat threads”) as the foundation for context-aware AI workflows [3].
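To make the claim concrete, here is a minimal sketch of what such a layer could look like. The MemoryStore class, its JSONL file format, and its keyword scoring are illustrative assumptions, not an API from the cited thread; real systems would use embeddings for retrieval.

```python
# Minimal sketch of a centralized, persistent memory layer (illustrative
# only). Instead of replaying an entire chat transcript, each request
# retrieves only the few stored facts relevant to the current query.
import json
from pathlib import Path

class MemoryStore:
    def __init__(self, path: str = "memory.jsonl"):
        self.path = Path(path)

    def remember(self, fact: str) -> None:
        """Persist one fact across sessions (append-only JSONL)."""
        with self.path.open("a") as f:
            f.write(json.dumps({"fact": fact}) + "\n")

    def recall(self, query: str, k: int = 3) -> list[str]:
        """Naive keyword-overlap retrieval; real systems use embeddings."""
        if not self.path.exists():
            return []
        facts = [json.loads(line)["fact"] for line in self.path.open()]
        q = set(query.lower().split())
        scored = sorted(facts, key=lambda f: -len(q & set(f.lower().split())))
        return scored[:k]

store = MemoryStore()
store.remember("User prefers concise answers with code samples.")
store.remember("Project targets Python 3.11 on AWS Lambda.")
# Only relevant facts enter the prompt, not the whole chat history:
print("\n".join(store.recall("write python code for lambda")))
```

The design point is that each request pulls a few relevant facts from a persistent store instead of replaying the full transcript, so context size stays roughly constant as history grows.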

• Image-based prompts - Posts raise token-savings questions by replacing text with visuals [1][2].
• Centralized memory layer - Emphasizes persistent memory across sessions, favoring a central store over bloated chat threads [3].
• Architectural implications - The memory view treats token economics as a system-level design problem, not just a per-chat tweak (see the sketch after this list) [3].
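Treating token economics as a system-level design problem suggests an explicit context budget. The sketch below packs a prompt from memory snippets and recent chat turns under a fixed budget; the half-budget cap on memory, the rough_tokens heuristic, and the helper names are assumptions for illustration only, not from the cited discussion.

```python
# Sketch of system-level context packing under a fixed token budget
# (illustrative assumptions throughout). Persistent memory snippets are
# packed first, then as many recent turns as still fit.
def rough_tokens(text: str) -> int:
    """Crude estimate (~4 chars per token) to keep the sketch self-contained."""
    return max(1, len(text) // 4)

def pack_context(memory: list[str], turns: list[str], budget: int = 1000) -> list[str]:
    packed, used = [], 0
    for snippet in memory:               # persistent memory gets priority
        cost = rough_tokens(snippet)
        if used + cost <= budget // 2:   # cap memory at half the budget
            packed.append(snippet)
            used += cost
    for turn in reversed(turns):         # then newest chat turns first
        cost = rough_tokens(turn)
        if used + cost > budget:
            break
        packed.append(turn)
        used += cost
    return packed

memory = ["User is migrating a Flask app to FastAPI."]
turns = [
    "How do I define a route?",
    "Use the @app.get decorator.",
    "And path params?",
]
print(pack_context(memory, turns, budget=60))
```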

Takeaway: In 2025, token efficiency may hinge as much on memory layers and adaptive UX as on shorter chats or image prompts [3].

References

[1] HackerNews: “Can you save on LLM tokens using images instead of text?” Discussion of using images as LLM input tokens to reduce text token usage, referencing the pagewatch.ai blog.

[2] HackerNews: “Can you save on LLM tokens using images instead of text?” Discussion of reducing LLM token usage by turning text into images; evaluates image tokens as an alternative to text prompts.

[3] Reddit: “The real token optimization isn't per chat it's in the memory layer.” Argues token savings come from centralized, persistent memory rather than per-chat prompts; emphasizes scalable AI architecture and memory-enabled workflows.
