pgvector over a dedicated vector DB
Adding a second data system for one use case was operational overhead we'd regret. pgvector inside the Postgres we already had is good enough for the next three orders of magnitude of growth.
Context
Celer needs vector search — embeddings over reference material so the practice engine can retrieve context for question generation. The decision was where the vectors live.
The fashionable answers in 2026 are still Pinecone, Weaviate, Qdrant. They're genuinely good products. The unfashionable answer is "in your existing Postgres, via pgvector."
Decision
pgvector. The vectors live in the Postgres we already had.
Consequences
One DB to back up. One DB to restore. One DB to monitor. The whole operational story is half as much work.
One transaction boundary. When a deck is updated, the vectors for that deck are updated in the same transaction. No drift. No reconciliation jobs. No "eventually consistent" surprises showing up in user-visible features.
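A minimal sketch of what that single transaction could look like. It assumes a DB-API 2.0 connection that uses `%s` placeholders and is not in autocommit mode (for example, one from psycopg). The tables `decks` and `deck_embeddings`, their columns, and the function names are hypothetical, chosen for illustration rather than taken from Celer's schema.

```python
from typing import Sequence


def to_vector_literal(values: Sequence[float]) -> str:
    """Format floats in pgvector's text input form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


def update_deck(conn, deck_id: int, title: str,
                chunks: Sequence[tuple[str, Sequence[float]]]) -> None:
    """Rewrite a deck row and its embeddings in one transaction.

    Table and column names are hypothetical. Either every statement
    commits together, or the rollback leaves the old deck and the old
    vectors in place.
    """
    cur = conn.cursor()
    try:
        cur.execute("UPDATE decks SET title = %s WHERE id = %s",
                    (title, deck_id))
        cur.execute("DELETE FROM deck_embeddings WHERE deck_id = %s",
                    (deck_id,))
        for text, embedding in chunks:
            cur.execute(
                "INSERT INTO deck_embeddings (deck_id, chunk, embedding) "
                "VALUES (%s, %s, %s::vector)",
                (deck_id, text, to_vector_literal(embedding)),
            )
        conn.commit()
    except Exception:
        # Any failure undoes the deck update and the vector changes together.
        conn.rollback()
        raise
    finally:
        cur.close()
```

With a separate vector service, the failure path would need a compensating write or a reconciliation job. Here it is a rollback.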
One auth model. Row-level security policies in Postgres already protect deck data. Extending them to embeddings is one column reference, not a second service to harden.
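As a rough illustration of the "one column reference" idea, the statements below put a policy on a hypothetical `deck_embeddings` table that defers to whatever policies already protect a hypothetical `decks` table. The table, column, and policy names are assumptions, and the sketch assumes the application connects as a role that is subject to row-level security. Table owners and superusers bypass it unless `FORCE ROW LEVEL SECURITY` is set.

```python
# DDL kept as plain strings; run once through a migration tool or cursor.
# All identifiers here are hypothetical.
EMBEDDING_RLS_STATEMENTS = (
    "ALTER TABLE deck_embeddings ENABLE ROW LEVEL SECURITY",
    # An embedding row is visible only if its deck is visible: the
    # subquery on decks is itself filtered by the policies on decks
    # for the querying role.
    "CREATE POLICY deck_embeddings_via_deck ON deck_embeddings "
    "USING (deck_id IN (SELECT id FROM decks))",
)


def apply_embedding_rls(conn) -> None:
    """Apply the statements above on a DB-API 2.0 connection."""
    cur = conn.cursor()
    try:
        for statement in EMBEDDING_RLS_STATEMENTS:
            cur.execute(statement)
        conn.commit()
    finally:
        cur.close()
```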
Costs and ceilings
- Throughput. Pinecone is faster at scale. We are not at that scale and won't be until we reach roughly 5M vectors.
- Index complexity. HNSW indexes in pgvector are good. They are not yet best-in-class. If we hit a recall ceiling, we revisit.
- Migration risk. If we ever do migrate, the data model survives — what changes is the index. The application code stays roughly the same.
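The migration point above can be sketched as a narrow retrieval interface with pgvector behind it. This is an illustration under assumptions: the `Retriever` protocol, the `PgvectorRetriever` class, and the `deck_embeddings` table are hypothetical names, and cosine distance is an assumed choice of metric. The `hnsw` index method, the `vector_cosine_ops` operator class, and the `<=>` cosine-distance operator are pgvector's own.

```python
from typing import Protocol, Sequence

# Index definition: the part that would be replaced in a migration.
HNSW_INDEX_DDL = (
    "CREATE INDEX IF NOT EXISTS deck_embeddings_hnsw "
    "ON deck_embeddings USING hnsw (embedding vector_cosine_ops)"
)


class Retriever(Protocol):
    """What the practice engine depends on (hypothetical interface)."""

    def search(self, query: Sequence[float], k: int) -> list[str]: ...


class PgvectorRetriever:
    """Nearest-neighbour lookup via pgvector on a DB-API 2.0 connection."""

    def __init__(self, conn):
        self._conn = conn

    def search(self, query: Sequence[float], k: int) -> list[str]:
        # pgvector accepts vectors as text in the form '[x,y,z]'.
        literal = "[" + ",".join(repr(float(v)) for v in query) + "]"
        cur = self._conn.cursor()
        try:
            cur.execute(
                "SELECT chunk FROM deck_embeddings "
                "ORDER BY embedding <=> %s::vector LIMIT %s",
                (literal, k),
            )
            return [row[0] for row in cur.fetchall()]
        finally:
            cur.close()
```

A different backend would supply another class with the same `search` method, and callers written against `Retriever` would not need to change.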
When this becomes the wrong answer
When the embedding store is materially larger than the relational data, or when any of the following holds:
- Recall demands are extreme (we'd notice, because users would complain).
- The relational DB starts being a vector-search bottleneck for unrelated queries.
- Postgres-side vector ops exceed 10% of CPU during normal operation.
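The triggers above are easy to encode as a periodic check. The sketch below is one way to do that. The `StoreMetrics` fields and the function name are hypothetical, and the 2x size ratio is an assumed stand-in for "materially larger", not a threshold stated in this record. The 10% CPU figure is the one given above.

```python
from dataclasses import dataclass


@dataclass
class StoreMetrics:
    """Inputs gathered from monitoring; all field names are hypothetical."""
    embedding_bytes: int
    relational_bytes: int
    users_report_poor_recall: bool
    vector_search_slows_other_queries: bool
    vector_cpu_fraction: float  # share of Postgres CPU spent on vector ops


def reasons_to_revisit(m: StoreMetrics,
                       size_ratio_threshold: float = 2.0) -> list[str]:
    """Return the triggers that currently fire; empty means stay put."""
    reasons = []
    # "Materially larger" is approximated by an assumed ratio.
    if m.embedding_bytes > size_ratio_threshold * m.relational_bytes:
        reasons.append("embedding store materially larger than relational data")
    if m.users_report_poor_recall:
        reasons.append("recall complaints from users")
    if m.vector_search_slows_other_queries:
        reasons.append("vector search is a bottleneck for unrelated queries")
    if m.vector_cpu_fraction > 0.10:
        reasons.append("vector ops exceed 10% of CPU")
    return reasons
```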
None of these are true today. Until they are, "one boring database" beats "two interesting databases" for a one-person shop.
Alternatives considered
- Pinecone. Operationally heavier. Better at a scale we don't have.
- Qdrant self-hosted. Adds an entire service. Cost and complexity not justified.
- An in-memory index loaded from JSON. Considered seriously for ~1 day. Doesn't survive a restart of any sort. Dropped.