Document-Centric Databases: From SQLite-backed Search to Lightweight Document Stores

Document-centric databases are no longer niche. Here are three practical takes: production-ready SQLite-backed search with embeddings, a lean Python doc store, and real-world doc-server indexing decisions.

Production-ready SQLite-backed search

Flamehaven FileSearch is built for quick deployment: a 5-minute setup, fully self-hosted, with a REST API served through FastAPI and Swagger UI. It uses SQLite as the store and Gemini embeddings for Q&A [1].

  • 5-minute setup — pip install flamehaven-filesearch[api] [1]
  • Self-hosted — data stays in-house [1]
  • SQLite store for portability [1]
  • Gemini embeddings for natural-language Q&A [1]
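
To make the "SQLite store plus embeddings" idea concrete, here is a minimal sketch of the general pattern, not Flamehaven's actual code or API. It assumes a hypothetical toy_embed function (a hashed bag of words standing in for a real embedding model such as Gemini) and hypothetical helper names add_doc and search; vectors are stored as JSON text in an in-memory SQLite table and ranked by cosine similarity with a brute-force scan.

```python
import json
import math
import sqlite3


def toy_embed(text, dims=16):
    """Hypothetical stand-in for an embedding model: a hashed bag of words."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]  # unit length, so dot product == cosine


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)")


def add_doc(body):
    # Store the vector as JSON text next to the document it describes.
    conn.execute(
        "INSERT INTO docs (body, embedding) VALUES (?, ?)",
        (body, json.dumps(toy_embed(body))),
    )


def search(query, k=3):
    # Brute-force scan: score every stored vector against the query vector.
    q = toy_embed(query)
    rows = conn.execute("SELECT body, embedding FROM docs").fetchall()
    scored = [
        (sum(x * y for x, y in zip(q, json.loads(emb))), body)
        for body, emb in rows
    ]
    return sorted(scored, reverse=True)[:k]


for text in (
    "sqlite stores data in a single file",
    "embeddings map text to vectors",
    "fastapi serves rest endpoints",
):
    add_doc(text)

print(search("embeddings map text to vectors", k=1))
```

A full scan like this is only reasonable for small collections; swapping toy_embed for a real model changes the quality of the ranking but not the storage pattern.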

A lean Python doc store

YaraDB is a lightweight open-source document database built with FastAPI and Pydantic. It offers a core engine, a write-ahead log (WAL), in-memory-first lookups, JSON storage, optimistic concurrency control (OCC), data-integrity hashing, soft deletes, and batch operations [2].

  • Core Engine [2]
  • WAL — crash safety [2]
  • In-Memory First [2]
  • JSON Storage — yaradb_storage.json [2]
  • OCC [2]
  • Python Client — yaradb-client on PyPI [2]
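
How a WAL, in-memory-first reads, OCC, integrity hashing, and soft deletes fit together can be sketched in a few dozen lines. This is an illustration of the concepts only, not YaraDB's implementation or client API: the class TinyDocStore, the VersionConflict exception, the record layout, and the wal.jsonl file name are all hypothetical.

```python
import hashlib
import json
import os
import tempfile
import uuid


class VersionConflict(Exception):
    """Raised when a writer's expected version is stale (the OCC check)."""


class TinyDocStore:
    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.docs = {}  # in-memory first: every read is served from this dict
        if os.path.exists(wal_path):
            with open(wal_path, encoding="utf-8") as f:
                for line in f:  # crash recovery: replay the log in order
                    rec = json.loads(line)
                    self.docs[rec["id"]] = rec

    def _log(self, rec):
        # Append to the log and fsync before touching in-memory state.
        with open(self.wal_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(rec) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.docs[rec["id"]] = rec
        return rec

    @staticmethod
    def _checksum(body):
        canonical = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(canonical).hexdigest()

    def create(self, body):
        return self._log({"id": str(uuid.uuid4()), "version": 1, "body": body,
                          "deleted": False, "hash": self._checksum(body)})

    def update(self, doc_id, expected_version, body):
        cur = self.docs[doc_id]
        if cur["version"] != expected_version:
            raise VersionConflict(
                f"expected v{expected_version}, found v{cur['version']}")
        return self._log({**cur, "version": cur["version"] + 1,
                          "body": body, "hash": self._checksum(body)})

    def soft_delete(self, doc_id):
        cur = self.docs[doc_id]
        # The record stays in the log and in memory; it is only flagged.
        return self._log({**cur, "version": cur["version"] + 1, "deleted": True})

    def get(self, doc_id):
        rec = self.docs.get(doc_id)
        return None if rec is None or rec["deleted"] else rec


wal = os.path.join(tempfile.mkdtemp(), "wal.jsonl")
store = TinyDocStore(wal)
doc = store.create({"title": "draft"})
store.update(doc["id"], expected_version=1, body={"title": "final"})
try:
    store.update(doc["id"], expected_version=1, body={"title": "stale write"})
except VersionConflict as exc:
    print("rejected:", exc)
```

The second update fails because it still claims version 1 after the first update moved the document to version 2; the caller is expected to re-read and retry rather than hold a lock.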

Real-world indexing: tsvector vs Tantivy

The personal doc-server thread flags the indexing choice between tsvector (PostgreSQL) and standalone Tantivy for search [3].

  • tsvector in PostgreSQL [3]
  • Tantivy standalone [3]
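
Both options rest on the same core structure: an inverted index (in PostgreSQL, typically a GIN index over the tsvector column) fed by a language-aware analysis step. The toy sketch below shows that structure in plain Python so the trade-off is easier to reason about. It uses neither PostgreSQL nor Tantivy; the InvertedIndex class, the tokenize helper, and the tiny stopword lists are hypothetical simplifications, and real engines add stemming, ranking, and on-disk storage.

```python
import re
from collections import defaultdict

# Toy per-language stopword lists; real engines ship full dictionaries and stemmers.
STOPWORDS = {
    "en": {"the", "a", "of", "and", "to"},
    "de": {"der", "die", "das", "und", "zu"},
}


def tokenize(text, lang):
    # Lowercase, split into word tokens, drop stopwords for the given language.
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if w not in STOPWORDS.get(lang, set())]


class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it

    def add(self, doc_id, text, lang):
        for term in tokenize(text, lang):
            self.postings[term].add(doc_id)

    def search(self, query, lang):
        # AND semantics: a document must contain every remaining query term.
        terms = tokenize(query, lang)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = InvertedIndex()
index.add(1, "The design of a personal document server", "en")
index.add(2, "Der Entwurf und die Suche", "de")
index.add(3, "Document search and indexing", "en")

print(index.search("the document", "en"))
print(index.search("die Suche", "de"))
```

The lang argument is where a multi-language doc server has to make a decision: each document and query must be analyzed with the right language rules, which is why language detection comes up alongside the indexing choice.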

Closing thought: for tiny, embedded setups, Flamehaven shines; for rapid prototyping, YaraDB helps; for production-grade indexing, weigh tsvector vs Tantivy by scale and language needs.

References

[1]
HackerNews

Production-ready, self-hosted document search with SQLite, Python SDK, REST API; Gemini embeddings; 5-minute setup; Docker-ready; vendor lock-in-free.

[2]
HackerNews

Show HN: YaraDB – Lightweight open-source document database built with FastAPI

Open-source lightweight document database in Python using FastAPI; features WAL, OCC, JSON storage, REST API, with indexing, replication planned.

[3]
HackerNews

Ask HN: Seeking advice on designing a personal document server

Explores local DB stack choices (PostgreSQL tsvector vs Tantivy), indexing, language detection, and grouping for a multi-language doc server.
