Plain English Tool Calls: The Practical Impact on LLM Tooling

Plain English outperforms JSON for LLM tool calling: in a study of 6,400 trials across 10 models, it raised accuracy by 18 percentage points and cut variance by 70% [1]. The result is not purely theoretical; hands-on demos such as Agentic RAG built with LangGraph reflect the same preference for lightweight interfaces in practice [2].

Core idea: the three-stage Natural Language Tools (NLT) framework replaces JSON with natural language. Stage 1 is tool selection, Stage 2 is tool execution, and Stage 3 is the final response [1]. In the tests, accuracy rose from 69.1% to 87.5% (an 18.4-percentage-point gain), variance dropped, and token overhead shrank by about one-third [1].
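The three stages above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the tool names, the "<tool name> - YES/NO" reply format, the prompt wording, and the `llm` callable (any function that maps a prompt string to a reply string) are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry; names and canned results are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "check order status": lambda message: "Order 123 shipped yesterday.",
    "open refund request": lambda message: "Refund request opened.",
}


def build_selection_prompt(user_message: str) -> str:
    """Stage 1 prompt: ask for one plain-English YES/NO line per tool."""
    tool_lines = "\n".join(f"- {name}" for name in TOOLS)
    return (
        "Decide which tools are needed for the user's message.\n"
        "Answer with one line per tool: '<tool name> - YES' or '<tool name> - NO'.\n"
        f"Tools:\n{tool_lines}\n\nUser message: {user_message}"
    )


def parse_selection(reply: str) -> List[str]:
    """Turn the model's YES/NO lines into a list of known tool names."""
    selected = []
    for line in reply.splitlines():
        name, sep, verdict = line.rpartition(" - ")
        name = name.strip().lstrip("-").strip().lower()
        # Ignore lines without the separator, NO verdicts, and unknown tools.
        if sep and verdict.strip().upper() == "YES" and name in TOOLS:
            selected.append(name)
    return selected


def run_nlt(user_message: str, llm: Callable[[str], str]) -> str:
    # Stage 1: tool selection, expressed in natural language instead of JSON.
    selected = parse_selection(llm(build_selection_prompt(user_message)))
    # Stage 2: tool execution, done by ordinary code outside the model.
    results = [f"{name}: {TOOLS[name](user_message)}" for name in selected]
    # Stage 3: final response, generated in a separate model call.
    final_prompt = (
        "Write a reply to the user using these tool results.\n"
        + "\n".join(results)
        + f"\n\nUser message: {user_message}"
    )
    return llm(final_prompt)
```

Because selection and response generation happen in separate calls, a malformed selection line only costs one skipped tool rather than an unparseable payload, which is one plausible reading of why variance drops.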

Real-world impact: in the Agentic RAG demo, LangGraph handles the orchestration, letting an agent search document summaries first and only then fetch full documents, which it hands to a long-context LLM such as Gemini 2.0 Flash for better accuracy [2]. The post emphasizes minimal code: a sensible, production-friendly flow delivered in just a few lines [2].
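The summary-first retrieval flow described above might look roughly like the sketch below. It deliberately uses plain Python rather than LangGraph's API, and it swaps in simple word overlap where the demo would presumably use a real search index; the corpus, function names, and the `llm` callable are hypothetical.

```python
from typing import Callable, Dict, List

# Hypothetical corpus: each document has a short summary and a full text.
DOCS: Dict[str, Dict[str, str]] = {
    "billing": {
        "summary": "How invoices and refunds work",
        "full": "Full billing policy text.",
    },
    "setup": {
        "summary": "Installing and configuring the client",
        "full": "Full setup guide text.",
    },
}


def search_summaries(question: str, top_k: int = 1) -> List[str]:
    """Step 1: rank documents by word overlap between question and summary."""
    question_words = set(question.lower().split())
    scored = []
    for doc_id, doc in DOCS.items():
        overlap = len(question_words & set(doc["summary"].lower().split()))
        scored.append((overlap, doc_id))
    scored.sort(reverse=True)
    # Keep only the best matches that share at least one word.
    return [doc_id for overlap, doc_id in scored[:top_k] if overlap > 0]


def answer(question: str, llm: Callable[[str], str]) -> str:
    """Step 2: fetch full text only for matched documents, then ask the model."""
    doc_ids = search_summaries(question)
    if not doc_ids:
        return "No relevant document found."
    context = "\n\n".join(DOCS[doc_id]["full"] for doc_id in doc_ids)
    return llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```

Searching the short summaries keeps the first step cheap, and the full documents are loaded into the long-context model only for the few candidates that matched.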

Why interface choices matter for reliability and adoption

- Lighter interfaces reduce token overhead and avoid brittle schemas [1].
- Decoupling tool selection from response generation cuts task interference [1].
- Demos like Agentic RAG with LangGraph show a path from demo to production [2].

Closing thought: as NL-first tool calls mature, more teams will ship real-world RAG apps with cleaner tooling and fewer format bottlenecks.

References

[1] Reddit. "[R] Plain English outperforms JSON for LLM tool calling: +18pp accuracy, -70% variance." Discussion of Natural Language Tools outperforming JSON tool calls in LLMs; reports improved accuracy, reduced variance, and broader model compatibility.

[2] Reddit. "Agentic RAG for Dummies - A minimal Agentic RAG demo built with LangGraph — learn Retrieval-Augmented Agents in minutes." Introduces Agentic RAG with LangGraph; argues for smart indexing and long-context LLMs, and invites reader feedback amid mixed opinions online.
