PageIndex MCP is an LLM-internal index that lets models reason over documents inside their context window, sidestepping vector databases entirely. It runs as an MCP server that exposes a document's structure to Claude or Cursor, letting agents navigate and reason through content rather than chase embedding matches. [1]
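To make that concrete, here is a minimal sketch of what exposing a document's structure over MCP could look like, using the official MCP Python SDK's FastMCP helper. The tool names (get_toc, get_section) and the toy flat-dict document model are illustrative assumptions, not PageIndex's actual interface.

```python
# Sketch of an MCP server that exposes a document's TOC and sections,
# so an agent can browse structure instead of querying an embedding store.
# Tool names and the document model are illustrative assumptions.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("doc-index")

# Toy document: section id -> (title, text). A real index would hold a
# hierarchical tree parsed from the source document.
SECTIONS = {
    "1": ("Introduction", "Overview of the filing..."),
    "1.1": ("Risk Factors", "The company faces risks..."),
    "2": ("Financial Statements", "Consolidated balance sheets..."),
}

@mcp.tool()
def get_toc() -> str:
    """Return the table of contents as indented text for the model to read."""
    lines = []
    for sec_id, (title, _) in SECTIONS.items():
        indent = "  " * sec_id.count(".")
        lines.append(f"{indent}{sec_id} {title}")
    return "\n".join(lines)

@mcp.tool()
def get_section(sec_id: str) -> str:
    """Return one section's text, so the model reads only the parts of the
    document it has reasoned its way to."""
    title, text = SECTIONS.get(sec_id, ("", "(no such section)"))
    return f"{title}\n\n{text}"

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```

A client such as Claude Desktop or Cursor would call get_toc first, decide which section is relevant, then call get_section, keeping retrieval a reasoning step rather than a similarity lookup.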
How it works - It keeps a hierarchical table-of-contents tree inside the LLM's context. When the TOC is too long to load at once, it performs a hierarchy search, descending from parent nodes to children, to keep latency reasonable, and it attaches descriptions to nodes to disambiguate near-identical titles. [1]
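A rough sketch of that parent-to-children traversal, assuming a node type with a title, a short description, and children; pick_children is a hypothetical stand-in for the LLM's judgment call, stubbed here with naive keyword overlap.

```python
# Sketch of a top-down hierarchy search over a TOC tree. Instead of
# loading the whole TOC at once, the search narrows level by level
# from parents to children. `pick_children` is a hypothetical stand-in
# for an LLM call that reads each node's title and description.
from dataclasses import dataclass, field

@dataclass
class Node:
    title: str
    description: str = ""          # disambiguates near-identical titles
    children: list["Node"] = field(default_factory=list)

def pick_children(query: str, nodes: list[Node]) -> list[Node]:
    """Hypothetical LLM step: given the query and one level of the tree,
    keep only the branches worth descending into. Stubbed with naive
    keyword overlap for illustration."""
    q = set(query.lower().split())
    return [n for n in nodes
            if q & set((n.title + " " + n.description).lower().split())]

def hierarchy_search(query: str, root: Node) -> list[Node]:
    """Descend level by level, holding one layer of the TOC in context
    at a time, which keeps latency reasonable for long TOCs."""
    frontier, leaves = [root], []
    while frontier:
        node = frontier.pop()
        if not node.children:
            leaves.append(node)    # reached a readable section
        else:
            frontier.extend(pick_children(query, node.children))
    return leaves
```

The cost of this traversal grows with tree depth rather than total TOC length, which is why the hierarchy search matters once the TOC no longer fits comfortably in context.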
Where it fits vs Vector DB - Practitioners find PageIndex MCP shines on financial, legal, textbook, and research-paper documents, where structured reasoning over an outline helps. For recommendation systems, you still need semantic similarity and a Vector DB, so this approach isn't recommended there. [1]
Open questions practitioners are asking
- What happens when the TOC is too long? [1]
- How does it handle near misses and disambiguation between close titles? [1]
- What about documents that aren't arranged in a strict hierarchy? [1]
Real-world take and next steps - The post suggests combining the index with a reasoning process and comparing the result against a Vector DB; examples are at pageindex.ai/mcp. [1]
Bottom line: LLM-native indexing is situational, not a universal replacement for vector stores.
References
[1] Show HN: A Vectorless LLM-Native Document Index Method. Proposes PageIndex MCP, an LLM-internal index for reasoning over documents; contrasts it with vector databases; limited applicability acknowledged.