Python SDK

Official Python client for GVDB. Full CRUD, hybrid search, streaming inserts, per-vector TTL, and bulk import from Parquet, NumPy, pandas, CSV, and AnnData.

Install

pip install gvdb

# With bulk import extras (Parquet, NumPy, pandas, progress bar)
# Quoting keeps shells such as zsh from treating the brackets as a glob pattern
pip install "gvdb[import]"

# Everything, including AnnData for single-cell workflows
pip install "gvdb[import-all]"

See client API for the full method reference and bulk import for loading large datasets.

Optional dependency extras

Extra              Dependencies               Used for
gvdb[parquet]      pyarrow                    import_parquet
gvdb[numpy]        numpy                      import_numpy
gvdb[pandas]       pandas, pyarrow            import_dataframe, import_csv
gvdb[h5ad]         anndata, numpy             import_h5ad
gvdb[progress]     tqdm                       Progress bars during bulk imports
gvdb[import]       All above except anndata   Common ML workflows
gvdb[import-all]   Everything + polars        All formats
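
Before calling one of the import helpers, it can be useful to check whether the packages behind an extra are actually installed. The following is a minimal sketch, not part of the gvdb package: the mapping copies the Dependencies column of the table above, and the names EXTRA_MODULES, missing_modules, and available_extras are hypothetical helpers introduced here for illustration.

```python
from importlib.util import find_spec

# Hypothetical mapping, copied from the "Dependencies" column of the table.
# It assumes each package's import name matches its name in the table.
EXTRA_MODULES = {
    "parquet": ["pyarrow"],
    "numpy": ["numpy"],
    "pandas": ["pandas", "pyarrow"],
    "h5ad": ["anndata", "numpy"],
    "progress": ["tqdm"],
}


def missing_modules(extra):
    """Return the modules listed for `extra` that cannot be found."""
    # find_spec locates a top-level module without importing it.
    return [name for name in EXTRA_MODULES[extra] if find_spec(name) is None]


def available_extras():
    """Return the extras whose listed modules are all installed."""
    return sorted(extra for extra in EXTRA_MODULES if not missing_modules(extra))


if __name__ == "__main__":
    for extra in EXTRA_MODULES:
        missing = missing_modules(extra)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"gvdb[{extra}]: {status}")
```

Running the script prints one line per extra, which makes it easy to see which pip install command from the Install section is still needed.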

Quick start

from gvdb import GVDBClient

client = GVDBClient("localhost:50051", api_key="your-key")  # api_key optional

# Create a collection
client.create_collection("my_vectors", dimension=768)

# Insert vectors with metadata (so hybrid search has a BM25 field)
client.insert(
    "my_vectors",
    ids=[1, 2],
    vectors=[[0.1]*768, [0.3]*768],
    metadata=[{"description": "running shoes"}, {"description": "kitchen knives"}],
)

# Search
results = client.search("my_vectors", query_vector=[0.1]*768, top_k=10)
for r in results:
    print(f"ID: {r.id}, distance: {r.distance}")

# Hybrid search (BM25 + vector)
results = client.hybrid_search(
    "my_vectors",
    query_vector=[0.1]*768,
    text_query="running shoes",
    text_field="description",
    top_k=10,
    return_metadata=True,
)

# Clean up
client.drop_collection("my_vectors")
client.close()
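
The quick start calls client.close() as its last step, so an exception raised earlier would skip it. One way to guard against that is contextlib.closing from the standard library, which calls close() on exit no matter how the block ends. This is a sketch under the assumption that the client exposes only the close() method shown above; run_with_client is a hypothetical helper, not part of the SDK.

```python
from contextlib import closing


def run_with_client(make_client, work):
    """Build a client, pass it to work(), and always close it afterwards.

    make_client: zero-argument callable returning an object with close().
    work: callable that takes the client and returns a result.
    """
    with closing(make_client()) as client:
        return work(client)


# Usage with the quick-start client (requires gvdb and a running server):
#
# from gvdb import GVDBClient
# results = run_with_client(
#     lambda: GVDBClient("localhost:50051"),
#     lambda c: c.search("my_vectors", query_vector=[0.1] * 768, top_k=10),
# )
```

Passing a factory rather than a ready-made client keeps construction inside the guarded block, so the connection is released even when work() raises.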
