HeliosDB Data Types Reference¶

Last Updated: November 1, 2025 Version: v6.0 Phase 2 M1

Complete reference for all data types supported by HeliosDB.

Table of Contents¶

Numeric Types
String Types
Date/Time Types
Boolean Type
Binary Types
JSON Types
Array Types
UUID Type
Vector Types
Spatial Types
Network Address Types
Custom Types

Numeric Types¶

Integer Types¶

Type	Storage	Range	Use Case
SMALLINT	2 bytes	-32,768 to 32,767	Small counters
INTEGER / INT	4 bytes	-2,147,483,648 to 2,147,483,647	Default integer
BIGINT	8 bytes	-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807	Large numbers
SERIAL	4 bytes	Auto-increment integer	Primary keys
BIGSERIAL	8 bytes	Auto-increment bigint	Large table PKs

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    category_id INT,
    stock_count SMALLINT,
    views BIGINT
);

Decimal Types¶

Type	Storage	Precision	Use Case
DECIMAL(p,s) / NUMERIC(p,s)	Variable	User-defined	Exact decimal (money)
REAL / FLOAT4	4 bytes	6 decimal digits	Approximate
DOUBLE PRECISION / FLOAT8	8 bytes	15 decimal digits	Scientific

CREATE TABLE financial (
    price DECIMAL(10, 2),      -- $99,999,999.99
    exchange_rate DECIMAL(18, 8), -- Precise rates
    temperature REAL,           -- Approximate
    scientific_value DOUBLE PRECISION
);

Important: Use DECIMAL for money to avoid floating-point errors!

--  CORRECT
SELECT 0.1::DECIMAL + 0.2::DECIMAL;  -- Exactly 0.3

-- ❌ WRONG
SELECT 0.1::REAL + 0.2::REAL;        -- 0.30000001 (floating-point error)

String Types¶

Character Types¶

Type	Description	Max Size	Use Case
VARCHAR(n)	Variable-length	n chars	General text
CHAR(n)	Fixed-length (padded)	n chars	Fixed codes
TEXT	Unlimited variable	1 GB	Large text

CREATE TABLE users (
    username VARCHAR(50),     -- Max 50 chars
    country_code CHAR(2),     -- Always 2 chars (US, UK)
    bio TEXT                  -- Unlimited
);

Best Practices: - Use VARCHAR(n) when you have a length limit - Use TEXT when length is unknown or unlimited - Avoid CHAR(n) unless truly fixed-length (codes, hashes)

Text Search Types¶

CREATE TABLE documents (
    content TEXT,
    content_tsv TSVECTOR  -- Full-text search vector
);

-- Create index for fast search
CREATE INDEX idx_content_fts ON documents USING GIN(content_tsv);

Date/Time Types¶

Type	Storage	Resolution	Range	Use Case
DATE	4 bytes	1 day	4713 BC - 5874897 AD	Birthdates
TIME	8 bytes	1 microsecond	00:00:00 - 24:00:00	Time of day
TIMESTAMP	8 bytes	1 microsecond	4713 BC - 294276 AD	Event time (no TZ)
TIMESTAMPTZ	8 bytes	1 microsecond	4713 BC - 294276 AD	Event time (with TZ)
INTERVAL	16 bytes	1 microsecond	+/- 178,000,000 years	Durations

CREATE TABLE events (
    event_date DATE,                  -- 2025-11-01
    event_time TIME,                  -- 14:30:00
    created_at TIMESTAMP,             -- 2025-11-01 14:30:00 (no timezone)
    updated_at TIMESTAMPTZ            -- 2025-11-01 14:30:00+00 (with timezone)
);

Best Practice: Always use TIMESTAMPTZ instead of TIMESTAMP for timestamps!

Date/Time Functions¶

-- Current date/time
SELECT NOW();                    -- 2025-11-01 14:30:00+00
SELECT CURRENT_DATE;             -- 2025-11-01
SELECT CURRENT_TIME;             -- 14:30:00+00
SELECT CURRENT_TIMESTAMP;        -- 2025-11-01 14:30:00+00

-- Date arithmetic
SELECT NOW() + INTERVAL '7 days';
SELECT NOW() - INTERVAL '1 month';
SELECT NOW() + INTERVAL '2 hours 30 minutes';

-- Extract parts
SELECT EXTRACT(YEAR FROM NOW());
SELECT EXTRACT(MONTH FROM NOW());
SELECT EXTRACT(DAY FROM NOW());
SELECT EXTRACT(HOUR FROM NOW());

-- Date truncation
SELECT DATE_TRUNC('day', NOW());      -- Start of today
SELECT DATE_TRUNC('month', NOW());    -- Start of month
SELECT DATE_TRUNC('hour', NOW());     -- Start of hour

-- Age calculation
SELECT AGE(NOW(), '1990-01-15'::DATE); -- 35 years 9 mons 17 days

Boolean Type¶

Type	Storage	Values	Use Case
BOOLEAN / BOOL	1 byte	TRUE, FALSE, NULL	Flags

CREATE TABLE users (
    is_active BOOLEAN DEFAULT TRUE,
    is_verified BOOLEAN DEFAULT FALSE,
    is_admin BOOLEAN
);

-- Query
SELECT * FROM users WHERE is_active = TRUE;
SELECT * FROM users WHERE is_admin;  -- Same as = TRUE
SELECT * FROM users WHERE NOT is_banned;

Boolean literals: - TRUE: TRUE, 't', 'true', 'y', 'yes', 'on', '1' - FALSE: FALSE, 'f', 'false', 'n', 'no', 'off', '0'

Binary Types¶

Type	Max Size	Use Case
BYTEA	1 GB	Binary data, images, files

CREATE TABLE files (
    filename VARCHAR(255),
    content BYTEA,
    content_hash BYTEA  -- SHA-256 hash
);

-- Insert binary data
INSERT INTO files (filename, content) VALUES
('logo.png', '\xDEADBEEF'::BYTEA);

-- Or from file
INSERT INTO files (filename, content) VALUES
('logo.png', pg_read_binary_file('/path/to/logo.png'));

Better Practice: Store large files in object storage (S3, Azure Blob) and store URL in database:

CREATE TABLE files (
    filename VARCHAR(255),
    s3_url TEXT,
    size_bytes BIGINT
);

JSON Types¶

Type	Storage	Indexed	Use Case
JSON	Variable	No	Slow, exact storage
JSONB	Variable	Yes	Fast, recommended

Always use JSONB unless you need to preserve exact JSON formatting.

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    profile JSONB
);

-- Insert JSON
INSERT INTO users (profile) VALUES
('{"name": "Alice", "age": 30, "tags": ["developer", "rust"]}');

-- Query JSON
SELECT profile->>'name' AS name FROM users;
SELECT * FROM users WHERE profile->>'role' = 'admin';
SELECT * FROM users WHERE profile @> '{"verified": true}';

-- Create index for fast queries
CREATE INDEX idx_profile ON users USING GIN(profile);

-- Update JSON
UPDATE users
SET profile = profile || '{"verified": true}'
WHERE id = 1;

JSON Operators¶

Operator	Description	Example
`->`	Get JSON field (as JSON)	`profile->'address'`
`->>`	Get JSON field (as text)	`profile->>'name'`
`@>`	Contains	`profile @> '{"verified": true}'`
`<@`	Contained by	`'{"a":1}' <@ profile`
`?`	Key exists	`profile ? 'premium'`
`?\|`	Any key exists	`profile ?\| array['a','b']`
`?&`	All keys exist	`profile ?& array['a','b']`

Array Types¶

Any type can be an array by adding []:

CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    tags TEXT[],                    -- Array of strings
    ratings INTEGER[],              -- Array of integers
    locations POINT[]               -- Array of points
);

-- Insert arrays
INSERT INTO posts (tags, ratings) VALUES
(ARRAY['rust', 'database'], ARRAY[5, 4, 5]);

-- Alternative syntax
INSERT INTO posts (tags) VALUES
('{rust,database,tutorial}');

-- Query arrays
SELECT * FROM posts WHERE 'rust' = ANY(tags);
SELECT * FROM posts WHERE tags && ARRAY['rust', 'python'];
SELECT * FROM posts WHERE tags @> ARRAY['rust'];

-- Array functions
SELECT array_length(tags, 1) FROM posts;
SELECT unnest(tags) FROM posts;  -- Expand to rows

UUID Type¶

Type	Storage	Format	Use Case
UUID	16 bytes	xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx	Distributed PKs

CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Insert with auto-generated UUID
INSERT INTO orders (user_id) VALUES ('550e8400-e29b-41d4-a716-446655440000');

-- Query
SELECT * FROM orders WHERE id = '550e8400-e29b-41d4-a716-446655440000'::UUID;

When to use UUID vs SERIAL: - UUID: Distributed systems, public APIs, privacy - SERIAL: Single database, internal PKs, better performance

Vector Types¶

Type	Description	Dimensions	Use Case
VECTOR(n)	Dense embedding	1-4096	Semantic search
SPARSEVEC(n)	Sparse vector	Up to 65535	Keyword search

-- Enable vector extension
CREATE EXTENSION IF NOT EXISTS helios_vector;

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT,
    embedding VECTOR(768),      -- 768-dimensional dense vector
    keywords SPARSEVEC(10000)   -- Sparse keyword vector
);

-- Insert vectors
INSERT INTO products (name, embedding) VALUES
('Laptop', '[0.1, 0.2, 0.3, ...]'::VECTOR(768));

-- Find similar products (cosine similarity)
SELECT
    name,
    embedding <=> query_vector AS distance
FROM products
ORDER BY distance
LIMIT 10;

-- Create HNSW index for fast search
CREATE INDEX idx_products_embedding ON products
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

See Vector Search Guide.

Spatial Types¶

Type	Description	Use Case
POINT	2D point (x, y)	Locations
LINE	Infinite line	Geometry
LSEG	Line segment	Routes
BOX	Rectangle	Bounding boxes
PATH	Open/closed path	Polygons
POLYGON	Closed shape	Regions
CIRCLE	Circle	Radius search

CREATE TABLE locations (
    name VARCHAR(100),
    location POINT,
    region POLYGON
);

-- Insert spatial data
INSERT INTO locations (name, location) VALUES
('Office', '(37.7749, -122.4194)');  -- San Francisco

-- Distance queries
SELECT
    name,
    location <-> '(37.7749, -122.4194)'::POINT AS distance
FROM locations
ORDER BY distance
LIMIT 10;

For advanced GIS: Use PostGIS extension (fully compatible with HeliosDB):

CREATE EXTENSION postgis;

CREATE TABLE places (
    name VARCHAR(100),
    geom GEOMETRY(Point, 4326)  -- WGS84 coordinate system
);

Network Address Types¶

Type	Description	Example
INET	IPv4 or IPv6 address	`192.168.1.1`
CIDR	Network address	`192.168.1.0/24`
MACADDR	MAC address	`08:00:2b:01:02:03`

CREATE TABLE connections (
    ip_address INET,
    network CIDR,
    mac_address MACADDR
);

-- Insert network data
INSERT INTO connections VALUES
('192.168.1.100', '192.168.1.0/24', '08:00:2b:01:02:03');

-- Query by network
SELECT * FROM connections WHERE ip_address << '192.168.1.0/24';

Custom Types¶

Enum Types¶

-- Create enum type
CREATE TYPE user_status AS ENUM ('active', 'inactive', 'banned', 'pending');

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    status user_status DEFAULT 'pending'
);

-- Insert
INSERT INTO users (status) VALUES ('active');

-- Query
SELECT * FROM users WHERE status = 'active';

Composite Types¶

-- Create composite type
CREATE TYPE address AS (
    street VARCHAR(100),
    city VARCHAR(50),
    state CHAR(2),
    zip_code VARCHAR(10)
);

CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    shipping_address address
);

-- Insert
INSERT INTO customers (name, shipping_address) VALUES
('Alice', ROW('123 Main St', 'San Francisco', 'CA', '94102'));

-- Query
SELECT
    name,
    (shipping_address).city AS city
FROM customers;

Domain Types (Constrained Types)¶

-- Create domain with constraints
CREATE DOMAIN email_address AS VARCHAR(255)
CHECK (VALUE ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}$');

CREATE DOMAIN positive_integer AS INTEGER
CHECK (VALUE > 0);

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email email_address,
    age positive_integer
);

-- This will fail validation
INSERT INTO users (email, age) VALUES ('invalid', -5);

Type Conversion¶

Explicit Casting¶

-- Using ::
SELECT '123'::INTEGER;
SELECT 123::TEXT;
SELECT '2025-11-01'::DATE;

-- Using CAST()
SELECT CAST('123' AS INTEGER);
SELECT CAST(NOW() AS DATE);

Implicit Casting¶

-- Integer to BIGINT (safe)
SELECT 123::INTEGER + 456::BIGINT;

-- String to timestamp
SELECT NOW() > '2025-01-01';

Size Comparison¶

Type	Storage	Max Value/Size
BOOLEAN	1 byte	TRUE/FALSE
SMALLINT	2 bytes	32,767
INTEGER	4 bytes	2.1 billion
BIGINT	8 bytes	9.2 quintillion
DECIMAL(10,2)	Variable	User-defined
REAL	4 bytes	6 decimal places
DOUBLE PRECISION	8 bytes	15 decimal places
CHAR(10)	10 bytes	10 chars
VARCHAR(255)	Variable	255 chars
TEXT	Variable	1 GB
DATE	4 bytes	~10,000 years
TIMESTAMP	8 bytes	Microsecond precision
UUID	16 bytes	128-bit unique ID
JSONB	Variable	~1 GB
VECTOR(768)	3,072 bytes	768 dimensions

Best Practices¶

Use appropriate numeric types:
SMALLINT for small counters (< 32K)
INTEGER for general purpose (< 2B)
BIGINT for large numbers (> 2B)
DECIMAL for money (exact)
DOUBLE PRECISION for scientific (approximate)

Use TEXT instead of VARCHAR (unless you have a specific length limit):

--  GOOD
CREATE TABLE posts (content TEXT);

-- ⚠ OKAY if you have a limit
CREATE TABLE users (username VARCHAR(50));

Always use TIMESTAMPTZ (not TIMESTAMP):

--  GOOD
created_at TIMESTAMPTZ DEFAULT NOW()

-- ❌ BAD (loses timezone info)
created_at TIMESTAMP DEFAULT NOW()

Use JSONB instead of JSON:

--  GOOD (indexable, faster)
profile JSONB

-- ❌ BAD (slower, not indexable)
profile JSON

Index large columns appropriately:

-- B-tree for exact matches
CREATE INDEX idx_email ON users(email);

-- GIN for JSONB
CREATE INDEX idx_profile ON users USING GIN(profile);

-- HNSW for vectors
CREATE INDEX idx_embedding ON products USING hnsw(embedding);

Last Updated: November 1, 2025 Version: v6.0 Phase 2 M1 Maintained by: HeliosDB Documentation Team

Related: Query Guide | FAQ | Getting Started