LangChain integration for CockroachDB with native vector support
Quick Start • Features • Documentation • Examples • Contributing
Build LLM applications with CockroachDB's distributed SQL database and native vector search capabilities. This integration provides:
- Native Vector Support - CockroachDB's VECTOR type
- C-SPANN Indexes - Distributed vector indexes optimized for scale
- Automatic Retries - Handles serialization errors transparently
- Async & Sync APIs - Choose based on your use case
- Distributed by Design - Built for CockroachDB's architecture
pip install langchain-cockroachdb

import asyncio

from langchain_cockroachdb import AsyncCockroachDBVectorStore, CockroachDBEngine
from langchain_openai import OpenAIEmbeddings


async def main():
    # Initialize
    engine = CockroachDBEngine.from_connection_string(
        "cockroachdb://user:pass@host:26257/db"
    )
    await engine.ainit_vectorstore_table(
        table_name="documents",
        vector_dimension=1536,
    )
    vectorstore = AsyncCockroachDBVectorStore(
        engine=engine,
        embeddings=OpenAIEmbeddings(),
        collection_name="documents",
    )

    # Add documents
    await vectorstore.aadd_texts([
        "CockroachDB is a distributed SQL database",
        "LangChain makes building LLM apps easy",
    ])

    # Search
    results = await vectorstore.asimilarity_search(
        "Tell me about databases",
        k=2,
    )
    for doc in results:
        print(doc.page_content)

    await engine.aclose()


asyncio.run(main())

- Native VECTOR type support with C-SPANN indexes
- Advanced metadata filtering ($and, $or, $gt, $in, etc.)
- Hybrid search (full-text + vector similarity)
- Multi-tenancy with namespace-based isolation and C-SPANN prefix columns
- Persistent conversation storage in CockroachDB
- Session management by thread ID
- Drop-in replacement for other LangChain chat history implementations
- Short-term memory for multi-turn LangGraph agents
- Human-in-the-loop with interrupt/resume support
- Both CockroachDBSaver (sync) and AsyncCockroachDBSaver
- Compatible with LangGraph's compile(checkpointer=...) interface
- Automatic retry logic with exponential backoff
- Connection pooling with health checks
- Configurable for different workloads
- Works with both SERIALIZABLE (default, recommended) and READ COMMITTED isolation
- Async-first design for high concurrency
- Sync wrapper for simple scripts
- Type-safe with full type hints
- Comprehensive test suite (177 tests)
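
The "automatic retry logic with exponential backoff" item above can be illustrated with a small, generic sketch. This is not the package's implementation: `SerializationError`, `run_with_retries`, and all parameter names are hypothetical, chosen only to show the pattern. CockroachDB reports a transaction that must be retried with SQLSTATE 40001, which the stand-in exception represents here.

```python
import random
import time


class SerializationError(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""

    sqlstate = "40001"


def run_with_retries(fn, max_retries=5, base_delay=0.1, max_delay=2.0,
                     sleep=time.sleep):
    """Call fn(); on a serialization failure, back off and try again.

    All names and defaults are illustrative assumptions, not the
    package's actual configuration options.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except SerializationError:
            if attempt == max_retries:
                # Retry budget exhausted: surface the error to the caller.
                raise
            # Delay doubles with each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter spreads out clients that failed at the same moment.
            sleep(delay * random.uniform(0.5, 1.0))
```

The `sleep` argument is injectable so the backoff schedule can be inspected in tests without actually waiting. The jitter factor is a common choice to avoid synchronized retries; whether the package applies jitter is not stated in this README.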
LangChain Official Integration Docs:
Getting Started:
Guides:
- Vector Store
- Vector Indexes
- Hybrid Search
- Chat History
- Multi-Tenancy
- LangGraph Checkpointer
- Async vs Sync
Working Examples
- quickstart.py - Get started in 5 minutes
- sync_usage.py - Synchronous API
- vector_indexes.py - Index optimization
- hybrid_search.py - FTS + vector search
- metadata_filtering.py - Advanced queries
- chat_history.py - Persistent conversations
- checkpointer.py - LangGraph checkpointer
- multi_tenancy.py - Namespace-based multi-tenancy
- retry_configuration.py - Configuration patterns
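
To make the metadata filter operators named in the feature list ($and, $or, $gt, $in) more concrete, here is a pure-Python sketch of how such Mongo-style filters are commonly interpreted. It is an illustration of the semantics only: the `matches` helper is hypothetical, the package presumably translates filters to SQL instead, and `$eq` and `$lt` are assumed to be among the operators covered by "etc.".

```python
from typing import Any, Mapping

# Comparison operators applied to a single metadata field.
# A missing field (None) never satisfies an ordering comparison.
_FIELD_OPS = {
    "$eq": lambda value, arg: value == arg,
    "$gt": lambda value, arg: value is not None and value > arg,
    "$lt": lambda value, arg: value is not None and value < arg,
    "$in": lambda value, arg: value in arg,
}


def matches(metadata: Mapping[str, Any], flt: Mapping[str, Any]) -> bool:
    """Return True if metadata satisfies the filter (hypothetical helper)."""
    for key, cond in flt.items():
        if key == "$and":
            # Every sub-filter must match.
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            # At least one sub-filter must match.
            if not any(matches(metadata, sub) for sub in cond):
                return False
        else:
            value = metadata.get(key)
            if isinstance(cond, Mapping):
                # Operator form, e.g. {"year": {"$gt": 2020}}.
                if not all(_FIELD_OPS[op](value, arg)
                           for op, arg in cond.items()):
                    return False
            elif value != cond:
                # Bare value form means equality.
                return False
    return True
```

For example, `matches({"source": "docs", "year": 2024}, {"$or": [{"source": "blog"}, {"year": {"$in": [2023, 2024]}}]})` evaluates to `True` because the second branch of the `$or` matches. See metadata_filtering.py in the examples for the filter syntax the package actually accepts.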
# Clone repository
git clone https://github.com/cockroachdb/langchain-cockroachdb.git
cd langchain-cockroachdb
# Install dependencies
pip install -e ".[dev]"
# Start CockroachDB
docker-compose up -d
# Run tests
make test

# Install docs dependencies
pip install -e ".[docs]"
# Serve documentation locally
mkdocs serve
# Open http://127.0.0.1:8000

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Distributed SQL - Scale horizontally across regions
- Native Vector Support - First-class VECTOR type and C-SPANN indexes
- Strong Consistency - SERIALIZABLE isolation by default, READ COMMITTED also supported
- Cloud Native - Deploy anywhere (IBM, AWS, GCP, Azure, on-prem)
- PostgreSQL Compatible - Familiar SQL with distributed superpowers
Apache License 2.0 - see LICENSE for details.
Built for the CockroachDB and LangChain communities.
- CockroachDB - Distributed SQL database
- LangChain - LLM application framework