Document-Centric Databases: From SQLite-backed Search to Lightweight Document Stores

Document-centric databases are no longer niche. Here are three practical takes: production-ready SQLite-backed search with embeddings, a lean Python doc store, and real-world doc-server indexing decisions.

Production-ready SQLite-backed search

Flamehaven FileSearch is built to deploy quickly: a 5-minute setup, fully self-hosted, with a REST API built on FastAPI and documented through Swagger UI. It uses SQLite as the store and Gemini embeddings for natural-language Q&A ^[1].

5-minute setup — pip install flamehaven-filesearch[api] ^[1]
Self-hosted — data stays in-house ^[1]
SQLite store for portability ^[1]
Gemini embeddings for natural-language Q&A ^[1]
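
To make the "SQLite store plus embeddings" idea concrete, here is a minimal sketch in Python. It is not Flamehaven FileSearch's API or schema: the `docs` table, the function names, and especially `toy_embed` (a hashed bag-of-words stand-in for a real embedding model such as Gemini) are assumptions made for illustration only.

```python
import json
import math
import sqlite3
import zlib


def toy_embed(text: str, dims: int = 256) -> list:
    """Hypothetical stand-in for an embedding model: hashed bag-of-words,
    scaled to unit length so a dot product equals cosine similarity."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a single-file store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL, embedding TEXT NOT NULL)"
    )
    return conn


def add_doc(conn: sqlite3.Connection, body: str) -> int:
    """Store the text together with its embedding, serialised as JSON."""
    cur = conn.execute(
        "INSERT INTO docs (body, embedding) VALUES (?, ?)",
        (body, json.dumps(toy_embed(body))),
    )
    conn.commit()
    return cur.lastrowid


def search(conn: sqlite3.Connection, query: str, k: int = 3) -> list:
    """Brute-force scan: score every row against the query, best first."""
    q = toy_embed(query)
    scored = []
    for body, emb_json in conn.execute("SELECT body, embedding FROM docs"):
        emb = json.loads(emb_json)
        scored.append((sum(a * b for a, b in zip(q, emb)), body))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    store = open_store()
    add_doc(store, "sqlite keeps the whole database in a single file")
    add_doc(store, "tantivy is a full-text search library written in rust")
    for score, body in search(store, "single file database"):
        print(round(score, 3), body)
```

A linear scan like this is only reasonable for small collections; a real deployment would swap `toy_embed` for calls to an actual embedding model and would likely need a smarter index as the corpus grows.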

A lean Python doc store

YaraDB is a lightweight open-source document database built with FastAPI and Pydantic. It offers a core engine, a write-ahead log (WAL), in-memory lookups, JSON storage, optimistic concurrency control (OCC), data integrity hashing, soft deletes, and batch operations ^[2].

Core Engine ^[2]
WAL — crash safety ^[2]
In-Memory First ^[2]
JSON Storage — yaradb_storage.json ^[2]
OCC ^[2]
Python Client — yaradb-client on PyPI ^[2]
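
The features listed above fit together in a fairly small amount of code. The sketch below is a toy illustration of the general techniques (write-ahead logging, version-based OCC, integrity hashing, soft deletes), not YaraDB's implementation or its client API: `TinyDocStore`, `VersionConflict`, the JSON-lines log format, and all method names are hypothetical.

```python
import hashlib
import json
import os
import uuid
from typing import Optional


class VersionConflict(Exception):
    """Raised when a writer's expected version is stale (OCC check failed)."""


def body_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the document body."""
    canonical = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class TinyDocStore:
    def __init__(self, wal_path: str) -> None:
        self.wal_path = wal_path
        self.docs = {}  # in-memory index: id -> record
        if os.path.exists(wal_path):  # crash recovery: replay the log
            with open(wal_path, encoding="utf-8") as wal:
                for line in wal:
                    if line.strip():
                        self._apply(json.loads(line))

    def _apply(self, rec: dict) -> None:
        if rec["op"] == "put":
            self.docs[rec["id"]] = {"body": rec["body"], "version": rec["version"],
                                    "hash": rec["hash"], "archived": False}
        elif rec["op"] == "archive":
            self.docs[rec["id"]].update(version=rec["version"], archived=True)

    def _log(self, rec: dict) -> None:
        # Write-ahead: the record is flushed to disk before memory changes.
        with open(self.wal_path, "a", encoding="utf-8") as wal:
            wal.write(json.dumps(rec) + "\n")
            wal.flush()
            os.fsync(wal.fileno())
        self._apply(rec)

    def _check(self, doc_id: str, expected_version: int) -> None:
        doc = self.docs.get(doc_id)
        if doc is None or doc["archived"]:
            raise KeyError(doc_id)
        if doc["version"] != expected_version:
            raise VersionConflict(
                f"expected v{expected_version}, found v{doc['version']}")

    def create(self, body: dict) -> str:
        doc_id = str(uuid.uuid4())
        self._log({"op": "put", "id": doc_id, "body": body,
                   "version": 1, "hash": body_hash(body)})
        return doc_id

    def get(self, doc_id: str) -> Optional[dict]:
        doc = self.docs.get(doc_id)
        if doc is None or doc["archived"]:
            return None  # soft-deleted documents are hidden, not erased
        if body_hash(doc["body"]) != doc["hash"]:
            raise ValueError(f"integrity check failed for {doc_id}")
        return dict(doc)

    def update(self, doc_id: str, body: dict, expected_version: int) -> int:
        self._check(doc_id, expected_version)
        self._log({"op": "put", "id": doc_id, "body": body,
                   "version": expected_version + 1, "hash": body_hash(body)})
        return expected_version + 1

    def archive(self, doc_id: str, expected_version: int) -> None:
        self._check(doc_id, expected_version)
        self._log({"op": "archive", "id": doc_id, "version": expected_version + 1})
```

In this sketch a writer reads a document, remembers its `version`, and passes that back as `expected_version`; if another writer got there first, the update fails with `VersionConflict` instead of silently overwriting. Reopening `TinyDocStore` on the same log path rebuilds the in-memory state by replay.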

Real-world indexing: tsvector vs Tantivy

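An Ask HN thread on designing a personal document server weighs two options for the search index: tsvector inside PostgreSQL or a standalone Tantivy index. Because the server is meant to hold documents in several languages, language detection and grouping are part of the same decision ^[3].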

tsvector in PostgreSQL ^[3]
Tantivy standalone ^[3]
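
For the PostgreSQL route, the language question shows up directly in the queries, because `to_tsvector` and `plainto_tsquery` take a text search configuration. The sketch below only builds parameterized SQL and does not connect to a database. The `documents` table, its `lang`, `body`, `title`, and `search_vector` columns, and the language mapping are all assumptions for illustration, not details from the thread.

```python
from typing import Tuple

# Hypothetical mapping from detected language codes to PostgreSQL text
# search configurations. Unknown languages fall back to "simple", which
# does no stemming and no stop-word removal.
PG_CONFIGS = {"en": "english", "de": "german", "fr": "french", "es": "spanish"}


def pg_config_for(lang_code: str) -> str:
    """Pick a text search configuration for a detected language code."""
    return PG_CONFIGS.get(lang_code.lower(), "simple")


def build_index_update(doc_id: int, lang_code: str) -> Tuple[str, tuple]:
    """SQL that (re)computes the stored tsvector for one document."""
    sql = (
        "UPDATE documents "
        "SET lang = %s, search_vector = to_tsvector(%s::regconfig, body) "
        "WHERE id = %s"
    )
    return sql, (lang_code, pg_config_for(lang_code), doc_id)


def build_search(query: str, lang_code: str, limit: int = 10) -> Tuple[str, tuple]:
    """SQL that ranks documents of one language against a plain-text query."""
    sql = (
        "SELECT id, title, ts_rank(search_vector, q) AS rank "
        "FROM documents, plainto_tsquery(%s::regconfig, %s) AS q "
        "WHERE lang = %s AND search_vector @@ q "
        "ORDER BY rank DESC LIMIT %s"
    )
    return sql, (pg_config_for(lang_code), query, lang_code, limit)


if __name__ == "__main__":
    statement, params = build_search("Rechnung Strom", "de", limit=5)
    print(statement)
    print(params)
```

Each `(sql, params)` pair is meant to be passed to the `execute` method of a driver that uses `%s` placeholders, such as psycopg. With Tantivy the same per-language decision would be made when configuring tokenizers, with the index living outside the database.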

Closing thought: for small self-hosted search, Flamehaven FileSearch is quick to stand up; for rapid prototyping, YaraDB is a lightweight option; for a document server's search index, weigh tsvector against Tantivy by scale and language needs.

References

[1] HackerNews. Production-ready, self-hosted document search with SQLite, Python SDK, REST API; Gemini embeddings; 5-minute setup; Docker-ready; vendor lock-in-free.

[2] HackerNews. Show HN: YaraDB – Lightweight open-source document database built with FastAPI. Open-source lightweight document database in Python using FastAPI; features WAL, OCC, JSON storage, REST API, with indexing and replication planned.

[3] HackerNews. Ask HN: Seeking advice on designing a personal document server. Explores local DB stack choices (PostgreSQL tsvector vs Tantivy), indexing, language detection, and grouping for a multi-language doc server.
