Pinecone Vector Configuration Guide

Comprehensive configuration reference for HeliosDB's Pinecone vector protocol support.

Connection Configuration

Basic Connection

from pinecone import Pinecone

# Connect to HeliosDB (Pinecone-compatible)
pc = Pinecone(
    api_key="your-api-key",
    host="http://localhost:8080"
)

# Access index
index = pc.Index("my-vectors")

Connection Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | string | Required | API authentication key |
| host | string | Required | HeliosDB vector endpoint |
| index | string | Required | Index name |
| namespace | string | "" | Optional namespace |

REST API Configuration

Base URL

http://localhost:8080/vectors/v1

Authentication Headers

curl -X POST "http://localhost:8080/vectors/v1/upsert" \
  -H "Api-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{...}'

Request Headers

| Header | Required | Description |
|---|---|---|
| Api-Key | Yes | API authentication key |
| Content-Type | Yes | application/json |
| Accept | No | Response format |
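
For clients that do not use the Pinecone SDK, the curl call above can be mirrored with the Python standard library. The following is a minimal sketch, not a definitive client: the helper name `build_upsert_request` is hypothetical, and the JSON body shape (`vectors` plus `namespace`) and the `Accept: application/json` value are assumptions modeled on the SDK examples in this guide. The request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/vectors/v1"  # base URL from this guide


def build_upsert_request(api_key, vectors, namespace=""):
    """Build (but do not send) a POST request for the upsert endpoint."""
    # Assumed body shape: {"vectors": [...], "namespace": "..."}
    body = json.dumps({"vectors": vectors, "namespace": namespace}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/upsert",
        data=body,
        method="POST",
        headers={
            "Api-Key": api_key,
            "Content-Type": "application/json",
            "Accept": "application/json",  # assumed response format
        },
    )


req = build_upsert_request("your-api-key", [{"id": "vec1", "values": [0.1, 0.2]}])
# To send it against a running server: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before pointing the code at a live server.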

Index Configuration

Creating an Index

from pinecone import ServerlessSpec

# Create index with configuration
pc.create_index(
    name="my-vectors",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(
        cloud="heliosdb",
        region="default"
    )
)

Index Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| name | string | Required | Index name |
| dimension | int | Required | Vector dimension |
| metric | string | cosine | Similarity metric |
| pods | int | 1 | Number of pods |
| replicas | int | 1 | Number of replicas |
| pod_type | string | p1.x1 | Pod type |

Similarity Metrics

| Metric | Description | Use Case |
|---|---|---|
| cosine | Cosine similarity | Text embeddings |
| euclidean | Euclidean distance | Dense vectors |
| dotproduct | Dot product | Normalized vectors |
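
To make the difference between the three metrics concrete, here is a small pure-Python sketch of the underlying formulas. It illustrates the math only and says nothing about how HeliosDB computes scores internally; the function names are illustrative.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Dot product of the two vectors after scaling each to unit length."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)
```

For vectors that are already normalized to unit length, cosine similarity and dot product give the same value, which is why dotproduct is listed for normalized vectors.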

Vector Operations Configuration

Upsert Configuration

# Upsert with options
index.upsert(
    vectors=[
        {
            "id": "vec1",
            "values": [0.1, 0.2, ...],
            "metadata": {"category": "A"}
        }
    ],
    namespace="production",
    show_progress=True
)

Query Configuration

# Query with options
results = index.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    filter={"category": {"$eq": "A"}},
    include_values=True,
    include_metadata=True,
    namespace="production"
)

Query Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| vector | list | Required | Query vector |
| top_k | int | 10 | Number of results |
| filter | dict | None | Metadata filter |
| include_values | bool | False | Include vectors |
| include_metadata | bool | False | Include metadata |
| namespace | string | "" | Namespace |

Filtering Configuration

Filter Operators

| Operator | Description | Example |
|---|---|---|
| $eq | Equal | {"field": {"$eq": "value"}} |
| $ne | Not equal | {"field": {"$ne": "value"}} |
| $gt | Greater than | {"field": {"$gt": 10}} |
| $gte | Greater or equal | {"field": {"$gte": 10}} |
| $lt | Less than | {"field": {"$lt": 10}} |
| $lte | Less or equal | {"field": {"$lte": 10}} |
| $in | In list | {"field": {"$in": ["a", "b"]}} |
| $nin | Not in list | {"field": {"$nin": ["a", "b"]}} |

Compound Filters

# AND filter
filter = {
    "$and": [
        {"category": {"$eq": "A"}},
        {"price": {"$lt": 100}}
    ]
}

# OR filter
filter = {
    "$or": [
        {"category": {"$eq": "A"}},
        {"category": {"$eq": "B"}}
    ]
}
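
The operator semantics can be checked locally before sending a filter to the server. The sketch below evaluates a filter dict against a single metadata dict on the client side. It is an illustration of the operators listed in this section, not HeliosDB's implementation; the function name `matches` is hypothetical, and treating a missing field as a non-match is an assumption.

```python
import operator

# Comparison operators from the table above, mapped to Python equivalents.
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
    "$nin": lambda value, options: value not in options,
}


def matches(metadata, flt):
    """Return True if a metadata dict satisfies the filter dict."""
    for key, condition in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            # Assumption: a missing field never matches, even for $ne/$nin.
            if key not in metadata:
                return False
            for op, operand in condition.items():
                if not COMPARATORS[op](metadata[key], operand):
                    return False
    return True
```

A helper like this is handy in unit tests: run the same filter over a few sample metadata dicts and confirm it selects what you expect before using it in `index.query`.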

Batch Configuration

Batch Upsert

# Batch configuration
index.upsert(
    vectors=large_vector_list,
    batch_size=100,  # Vectors per batch
    show_progress=True
)

Batch Parameters

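| Parameter | Default | Description |
|---|---|---|
| batch_size | 100 | Vectors per batch |
| pool_threads | 1 | Parallel threads (in the Pinecone Python client this is passed to pc.Index(...), not to upsert()) |
| show_progress | False | Progress bar |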

Namespace Configuration

Working with Namespaces

# Upsert to specific namespace
index.upsert(
    vectors=[...],
    namespace="production"
)

# Query specific namespace
results = index.query(
    vector=[...],
    namespace="production"
)

# Delete from namespace
index.delete(
    ids=["vec1", "vec2"],
    namespace="production"
)

# List namespaces
stats = index.describe_index_stats()
namespaces = stats.namespaces

Hybrid Search Configuration

Dense + Sparse Vectors

# Upsert with sparse vectors
index.upsert(
    vectors=[
        {
            "id": "vec1",
            "values": [0.1, 0.2, ...],  # Dense vector
            "sparse_values": {
                "indices": [1, 5, 100],
                "values": [0.5, 0.3, 0.1]
            },
            "metadata": {"category": "A"}
        }
    ]
)

# Hybrid query
results = index.query(
    vector=[0.1, 0.2, ...],
    sparse_vector={
        "indices": [1, 5],
        "values": [0.5, 0.3]
    },
    top_k=10
)
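
A common way to control the balance between the dense and sparse parts of a hybrid query is to scale both by a weight before sending them. The sketch below shows that convex-combination technique under stated assumptions: the helper `weight_hybrid` and the `alpha` parameter are hypothetical names, and the approach only makes sense if the server scores the dense and sparse parts additively (as a dot product does). Whether HeliosDB does so is not stated in this guide.

```python
def weight_hybrid(dense, sparse, alpha):
    """Scale a dense/sparse query pair by a convex combination.

    alpha=1.0 keeps only the dense signal, alpha=0.0 only the sparse one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


# Usage with an existing index handle (not executed here):
# dense, sparse = weight_hybrid(query_dense, query_sparse, alpha=0.7)
# results = index.query(vector=dense, sparse_vector=sparse, top_k=10)
```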

HeliosDB-Specific Settings

Server Configuration

# heliosdb.toml
[pinecone]
enabled = true
port = 8080
bind = "0.0.0.0"
max_connections = 10000

[pinecone.auth]
api_key = "your-secure-api-key"

[pinecone.index]
default_dimension = 1536
default_metric = "cosine"
max_vectors = 10000000
max_metadata_size = 40960

[pinecone.performance]
batch_size = 1000
query_threads = 8
index_threads = 4

Environment Variables

| Variable | Description |
|---|---|
| HELIOSDB_PINECONE_PORT | Vector API port |
| HELIOSDB_PINECONE_API_KEY | API authentication key |
| HELIOSDB_PINECONE_MAX_DIMENSION | Max vector dimension |
| HELIOSDB_PINECONE_MAX_VECTORS | Max vectors per index |
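
A deployment script may want to read and sanity-check these variables before starting the server. The following is a minimal sketch: the function `load_settings` is a hypothetical helper, and parsing the port and limits as integers is an assumption about their format. Unset variables are returned as `None` rather than filled with guessed defaults.

```python
import os


def load_settings(env=None):
    """Read the vector API settings from environment variables."""
    env = os.environ if env is None else env

    def optional_int(name):
        # Assumption: numeric settings are plain base-10 integers.
        raw = env.get(name)
        return int(raw) if raw not in (None, "") else None

    return {
        "port": optional_int("HELIOSDB_PINECONE_PORT"),
        "api_key": env.get("HELIOSDB_PINECONE_API_KEY"),
        "max_dimension": optional_int("HELIOSDB_PINECONE_MAX_DIMENSION"),
        "max_vectors": optional_int("HELIOSDB_PINECONE_MAX_VECTORS"),
    }
```

Passing a plain dict as `env` makes the helper easy to test without touching the real process environment.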

Performance Tuning

Query Optimization

# Optimize for speed
results = index.query(
    vector=[...],
    top_k=10,
    include_values=False,  # Don't return vectors
    include_metadata=False  # Don't return metadata
)

# Optimize for accuracy
results = index.query(
    vector=[...],
    top_k=100,  # Fetch more, filter client-side
    include_metadata=True
)

Batch Insert Optimization

# Optimal batch size
BATCH_SIZE = 100  # Recommended for most cases

# Insert in batches
for i in range(0, len(vectors), BATCH_SIZE):
    batch = vectors[i:i+BATCH_SIZE]
    index.upsert(vectors=batch)
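
The slicing loop above needs the full list in memory. When vectors come from a generator or a file, a small chunking helper does the same job lazily. This is a sketch using only the standard library; `chunked` is an illustrative name, not part of the Pinecone client.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Usage with an existing index handle (not executed here):
# for batch in chunked(vector_stream, 100):
#     index.upsert(vectors=batch)
```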

Rate Limiting

Default Limits

| Operation | Limit | Notes |
|---|---|---|
| Upsert | 100 vectors/request | Configurable |
| Query | 10 top_k | Default |
| Fetch | 100 IDs/request | Maximum |
| Delete | 1000 IDs/request | Maximum |
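
When planning a bulk job, the per-request limits translate directly into a request count. The sketch below hard-codes the default limits from the table above; the names `REQUEST_LIMITS` and `requests_needed` are illustrative, and the upsert value should be adjusted if the server has been configured differently.

```python
import math

# Per-request limits from the table above (upsert is configurable).
REQUEST_LIMITS = {"upsert": 100, "fetch": 100, "delete": 1000}


def requests_needed(operation, item_count):
    """Return how many requests are needed to stay within the limit."""
    limit = REQUEST_LIMITS[operation]
    return math.ceil(item_count / limit)
```

For example, deleting 2500 IDs at 1000 IDs per request takes three requests.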

Related: README.md | COMPATIBILITY.md | EXAMPLES.md

Last Updated: December 2025