Python SDK

The official Python client for GVDB. It supports full CRUD, hybrid search, streaming inserts, per-vector TTL, and bulk import from Parquet, NumPy, pandas, CSV, and AnnData.

Install

pip install gvdb

# With bulk import extras (Parquet, NumPy, pandas, progress bar)
# Quoting keeps shells such as zsh from treating the brackets as a glob pattern
pip install "gvdb[import]"

# Everything, including AnnData for single-cell workflows
pip install "gvdb[import-all]"

See client API for the full method reference and bulk import for loading large datasets.

Optional dependency extras

| Extra | Dependencies | Use for |
| --- | --- | --- |
| `gvdb[parquet]` | pyarrow | `import_parquet` |
| `gvdb[numpy]` | numpy | `import_numpy` |
| `gvdb[pandas]` | pandas, pyarrow | `import_dataframe`, `import_csv` |
| `gvdb[h5ad]` | anndata, numpy | `import_h5ad` |
| `gvdb[progress]` | tqdm | Progress bars during bulk imports |
| `gvdb[import]` | All above except anndata | Common ML workflows |
| `gvdb[import-all]` | Everything + polars | All formats |
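
To see which extras are usable in the current environment before calling an import method, a small check like the one below may help. It is a sketch, not part of the SDK: `EXTRA_MODULES`, `missing_modules`, and `available_extras` are hypothetical names, and the mapping is copied from the table above on the assumption that each dependency's import name matches its package name.

```python
from importlib.util import find_spec

# Extra name -> modules it depends on, copied from the table above.
# Assumption: each package is imported under the same name it is installed as.
EXTRA_MODULES = {
    "parquet": ["pyarrow"],
    "numpy": ["numpy"],
    "pandas": ["pandas", "pyarrow"],
    "h5ad": ["anndata", "numpy"],
    "progress": ["tqdm"],
}


def missing_modules(extra):
    """Return the modules required by `extra` that cannot be found."""
    return [name for name in EXTRA_MODULES[extra] if find_spec(name) is None]


def available_extras():
    """Return the extras whose dependencies are all installed, sorted by name."""
    return sorted(extra for extra in EXTRA_MODULES if not missing_modules(extra))


if __name__ == "__main__":
    # Print one status line per extra.
    for extra in EXTRA_MODULES:
        missing = missing_modules(extra)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"gvdb[{extra}]: {status}")
```

The check uses `find_spec` instead of importing each package, so it stays fast even when heavy libraries such as pandas are installed.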

Quick start

from gvdb import GVDBClient

client = GVDBClient("localhost:50051", api_key="your-key")  # api_key optional

# Create a collection
client.create_collection("my_vectors", dimension=768)

# Insert vectors with metadata (so hybrid search has a BM25 field)
client.insert(
    "my_vectors",
    ids=[1, 2],
    vectors=[[0.1]*768, [0.3]*768],
    metadata=[{"description": "running shoes"}, {"description": "kitchen knives"}],
)

# Search
results = client.search("my_vectors", query_vector=[0.1]*768, top_k=10)
for r in results:
    print(f"ID: {r.id}, distance: {r.distance}")

# Hybrid search (BM25 + vector)
results = client.hybrid_search(
    "my_vectors",
    query_vector=[0.1]*768,
    text_query="running shoes",
    text_field="description",
    top_k=10,
    return_metadata=True,
)

# Clean up
client.drop_collection("my_vectors")
client.close()
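
In the quick start, `client.close()` runs only if every earlier call succeeds. One way to guarantee cleanup is `contextlib.closing`, which relies on nothing beyond the `close()` method shown above. The sketch below is an illustration under that assumption: `search_and_close` is a hypothetical helper, not an SDK function, and whether `GVDBClient` can be used directly in a `with` statement is not stated here.

```python
from contextlib import closing


def search_and_close(client, collection, query_vector, top_k=10):
    """Run one search and close the client afterwards, even if the search raises.

    `client` is any object with the `search` and `close` methods used in the
    quick start; the helper itself is illustrative, not part of the SDK.
    """
    with closing(client):
        results = client.search(collection, query_vector=query_vector, top_k=top_k)
        # Copy out plain tuples so the results stay usable after the client closes.
        return [(r.id, r.distance) for r in results]
```

With the quick-start collection in place, a call might look like `search_and_close(GVDBClient("localhost:50051"), "my_vectors", [0.1] * 768)`.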

Next

Client API — every method and its parameters
Bulk import — Parquet, NumPy, pandas, CSV, h5ad
Examples — runnable scripts
