Skip to content

HeliosDB Data Types Reference

Last Updated: November 1, 2025 Version: v6.0 Phase 2 M1

Complete reference for all data types supported by HeliosDB.


Table of Contents

  1. Numeric Types
  2. String Types
  3. Date/Time Types
  4. Boolean Type
  5. Binary Types
  6. JSON Types
  7. Array Types
  8. UUID Type
  9. Vector Types
  10. Spatial Types
  11. Network Address Types
  12. Custom Types

Numeric Types

Integer Types

Type Storage Range Use Case
SMALLINT 2 bytes -32,768 to 32,767 Small counters
INTEGER / INT 4 bytes -2,147,483,648 to 2,147,483,647 Default integer
BIGINT 8 bytes -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 Large numbers
SERIAL 4 bytes Auto-increment integer Primary keys
BIGSERIAL 8 bytes Auto-increment bigint Large table PKs
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    category_id INT,
    stock_count SMALLINT,
    views BIGINT
);

Decimal Types

Type Storage Precision Use Case
DECIMAL(p,s) / NUMERIC(p,s) Variable User-defined Exact decimal (money)
REAL / FLOAT4 4 bytes 6 decimal digits Approximate
DOUBLE PRECISION / FLOAT8 8 bytes 15 decimal digits Scientific
CREATE TABLE financial (
    price DECIMAL(10, 2),      -- $99,999,999.99
    exchange_rate DECIMAL(18, 8), -- Precise rates
    temperature REAL,           -- Approximate
    scientific_value DOUBLE PRECISION
);

Important: Use DECIMAL for money to avoid floating-point errors!

--  CORRECT
SELECT 0.1::DECIMAL + 0.2::DECIMAL;  -- Exactly 0.3

-- ❌ WRONG
SELECT 0.1::REAL + 0.2::REAL;        -- 0.30000001 (floating-point error)

String Types

Character Types

Type Description Max Size Use Case
VARCHAR(n) Variable-length n chars General text
CHAR(n) Fixed-length (padded) n chars Fixed codes
TEXT Unlimited variable 1 GB Large text
CREATE TABLE users (
    username VARCHAR(50),     -- Max 50 chars
    country_code CHAR(2),     -- Always 2 chars (US, UK)
    bio TEXT                  -- Unlimited
);

Best Practices: - Use VARCHAR(n) when you have a length limit - Use TEXT when length is unknown or unlimited - Avoid CHAR(n) unless truly fixed-length (codes, hashes)

Text Search Types

CREATE TABLE documents (
    content TEXT,
    content_tsv TSVECTOR  -- Full-text search vector
);

-- Create index for fast search
CREATE INDEX idx_content_fts ON documents USING GIN(content_tsv);

Date/Time Types

Type Storage Resolution Range Use Case
DATE 4 bytes 1 day 4713 BC - 5874897 AD Birthdates
TIME 8 bytes 1 microsecond 00:00:00 - 24:00:00 Time of day
TIMESTAMP 8 bytes 1 microsecond 4713 BC - 294276 AD Event time (no TZ)
TIMESTAMPTZ 8 bytes 1 microsecond 4713 BC - 294276 AD Event time (with TZ)
INTERVAL 16 bytes 1 microsecond +/- 178,000,000 years Durations
CREATE TABLE events (
    event_date DATE,                  -- 2025-11-01
    event_time TIME,                  -- 14:30:00
    created_at TIMESTAMP,             -- 2025-11-01 14:30:00 (no timezone)
    updated_at TIMESTAMPTZ            -- 2025-11-01 14:30:00+00 (with timezone)
);

Best Practice: Always use TIMESTAMPTZ instead of TIMESTAMP for timestamps!

Date/Time Functions

-- Current date/time
SELECT NOW();                    -- 2025-11-01 14:30:00+00
SELECT CURRENT_DATE;             -- 2025-11-01
SELECT CURRENT_TIME;             -- 14:30:00+00
SELECT CURRENT_TIMESTAMP;        -- 2025-11-01 14:30:00+00

-- Date arithmetic
SELECT NOW() + INTERVAL '7 days';
SELECT NOW() - INTERVAL '1 month';
SELECT NOW() + INTERVAL '2 hours 30 minutes';

-- Extract parts
SELECT EXTRACT(YEAR FROM NOW());
SELECT EXTRACT(MONTH FROM NOW());
SELECT EXTRACT(DAY FROM NOW());
SELECT EXTRACT(HOUR FROM NOW());

-- Date truncation
SELECT DATE_TRUNC('day', NOW());      -- Start of today
SELECT DATE_TRUNC('month', NOW());    -- Start of month
SELECT DATE_TRUNC('hour', NOW());     -- Start of hour

-- Age calculation
SELECT AGE(NOW(), '1990-01-15'::DATE); -- 35 years 9 mons 17 days

Boolean Type

Type Storage Values Use Case
BOOLEAN / BOOL 1 byte TRUE, FALSE, NULL Flags
CREATE TABLE users (
    is_active BOOLEAN DEFAULT TRUE,
    is_verified BOOLEAN DEFAULT FALSE,
    is_admin BOOLEAN
);

-- Query
SELECT * FROM users WHERE is_active = TRUE;
SELECT * FROM users WHERE is_admin;  -- Same as = TRUE
SELECT * FROM users WHERE NOT is_banned;

Boolean literals: - TRUE: TRUE, 't', 'true', 'y', 'yes', 'on', '1' - FALSE: FALSE, 'f', 'false', 'n', 'no', 'off', '0'


Binary Types

Type Max Size Use Case
BYTEA 1 GB Binary data, images, files
CREATE TABLE files (
    filename VARCHAR(255),
    content BYTEA,
    content_hash BYTEA  -- SHA-256 hash
);

-- Insert binary data
INSERT INTO files (filename, content) VALUES
('logo.png', '\xDEADBEEF'::BYTEA);

-- Or from file
INSERT INTO files (filename, content) VALUES
('logo.png', pg_read_binary_file('/path/to/logo.png'));

Better Practice: Store large files in object storage (S3, Azure Blob) and store URL in database:

CREATE TABLE files (
    filename VARCHAR(255),
    s3_url TEXT,
    size_bytes BIGINT
);

JSON Types

Type Storage Indexed Use Case
JSON Variable No Slow, exact storage
JSONB Variable Yes Fast, recommended

Always use JSONB unless you need to preserve exact JSON formatting.

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    profile JSONB
);

-- Insert JSON
INSERT INTO users (profile) VALUES
('{"name": "Alice", "age": 30, "tags": ["developer", "rust"]}');

-- Query JSON
SELECT profile->>'name' AS name FROM users;
SELECT * FROM users WHERE profile->>'role' = 'admin';
SELECT * FROM users WHERE profile @> '{"verified": true}';

-- Create index for fast queries
CREATE INDEX idx_profile ON users USING GIN(profile);

-- Update JSON
UPDATE users
SET profile = profile || '{"verified": true}'
WHERE id = 1;

JSON Operators

Operator Description Example
-> Get JSON field (as JSON) profile->'address'
->> Get JSON field (as text) profile->>'name'
@> Contains profile @> '{"verified": true}'
<@ Contained by '{"a":1}' <@ profile
? Key exists profile ? 'premium'
?| Any key exists profile ?| array['a','b']
?& All keys exist profile ?& array['a','b']

Array Types

Any type can be an array by adding []:

CREATE TABLE posts (
    id SERIAL PRIMARY KEY,
    tags TEXT[],                    -- Array of strings
    ratings INTEGER[],              -- Array of integers
    locations POINT[]               -- Array of points
);

-- Insert arrays
INSERT INTO posts (tags, ratings) VALUES
(ARRAY['rust', 'database'], ARRAY[5, 4, 5]);

-- Alternative syntax
INSERT INTO posts (tags) VALUES
('{rust,database,tutorial}');

-- Query arrays
SELECT * FROM posts WHERE 'rust' = ANY(tags);
SELECT * FROM posts WHERE tags && ARRAY['rust', 'python'];
SELECT * FROM posts WHERE tags @> ARRAY['rust'];

-- Array functions
SELECT array_length(tags, 1) FROM posts;
SELECT unnest(tags) FROM posts;  -- Expand to rows

UUID Type

Type Storage Format Use Case
UUID 16 bytes xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx Distributed PKs
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

-- Insert with auto-generated UUID
INSERT INTO orders (user_id) VALUES ('550e8400-e29b-41d4-a716-446655440000');

-- Query
SELECT * FROM orders WHERE id = '550e8400-e29b-41d4-a716-446655440000'::UUID;

When to use UUID vs SERIAL: - UUID: Distributed systems, public APIs, privacy - SERIAL: Single database, internal PKs, better performance


Vector Types

Type Description Dimensions Use Case
VECTOR(n) Dense embedding 1-4096 Semantic search
SPARSEVEC(n) Sparse vector Up to 65535 Keyword search
-- Enable vector extension
CREATE EXTENSION IF NOT EXISTS helios_vector;

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT,
    embedding VECTOR(768),      -- 768-dimensional dense vector
    keywords SPARSEVEC(10000)   -- Sparse keyword vector
);

-- Insert vectors
INSERT INTO products (name, embedding) VALUES
('Laptop', '[0.1, 0.2, 0.3, ...]'::VECTOR(768));

-- Find similar products (cosine similarity)
SELECT
    name,
    embedding <=> query_vector AS distance
FROM products
ORDER BY distance
LIMIT 10;

-- Create HNSW index for fast search
CREATE INDEX idx_products_embedding ON products
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

See Vector Search Guide.


Spatial Types

Type Description Use Case
POINT 2D point (x, y) Locations
LINE Infinite line Geometry
LSEG Line segment Routes
BOX Rectangle Bounding boxes
PATH Open/closed path Polygons
POLYGON Closed shape Regions
CIRCLE Circle Radius search
CREATE TABLE locations (
    name VARCHAR(100),
    location POINT,
    region POLYGON
);

-- Insert spatial data
INSERT INTO locations (name, location) VALUES
('Office', '(37.7749, -122.4194)');  -- San Francisco

-- Distance queries
SELECT
    name,
    location <-> '(37.7749, -122.4194)'::POINT AS distance
FROM locations
ORDER BY distance
LIMIT 10;

For advanced GIS: Use PostGIS extension (fully compatible with HeliosDB):

CREATE EXTENSION postgis;

CREATE TABLE places (
    name VARCHAR(100),
    geom GEOMETRY(Point, 4326)  -- WGS84 coordinate system
);

Network Address Types

Type Description Example
INET IPv4 or IPv6 address 192.168.1.1
CIDR Network address 192.168.1.0/24
MACADDR MAC address 08:00:2b:01:02:03
CREATE TABLE connections (
    ip_address INET,
    network CIDR,
    mac_address MACADDR
);

-- Insert network data
INSERT INTO connections VALUES
('192.168.1.100', '192.168.1.0/24', '08:00:2b:01:02:03');

-- Query by network
SELECT * FROM connections WHERE ip_address << '192.168.1.0/24';

Custom Types

Enum Types

-- Create enum type
CREATE TYPE user_status AS ENUM ('active', 'inactive', 'banned', 'pending');

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    status user_status DEFAULT 'pending'
);

-- Insert
INSERT INTO users (status) VALUES ('active');

-- Query
SELECT * FROM users WHERE status = 'active';

Composite Types

-- Create composite type
CREATE TYPE address AS (
    street VARCHAR(100),
    city VARCHAR(50),
    state CHAR(2),
    zip_code VARCHAR(10)
);

CREATE TABLE customers (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    shipping_address address
);

-- Insert
INSERT INTO customers (name, shipping_address) VALUES
('Alice', ROW('123 Main St', 'San Francisco', 'CA', '94102'));

-- Query
SELECT
    name,
    (shipping_address).city AS city
FROM customers;

Domain Types (Constrained Types)

-- Create domain with constraints
CREATE DOMAIN email_address AS VARCHAR(255)
CHECK (VALUE ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}$');

CREATE DOMAIN positive_integer AS INTEGER
CHECK (VALUE > 0);

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email email_address,
    age positive_integer
);

-- This will fail validation
INSERT INTO users (email, age) VALUES ('invalid', -5);

Type Conversion

Explicit Casting

-- Using ::
SELECT '123'::INTEGER;
SELECT 123::TEXT;
SELECT '2025-11-01'::DATE;

-- Using CAST()
SELECT CAST('123' AS INTEGER);
SELECT CAST(NOW() AS DATE);

Implicit Casting

-- Integer to BIGINT (safe)
SELECT 123::INTEGER + 456::BIGINT;

-- String to timestamp
SELECT NOW() > '2025-01-01';

Size Comparison

Type Storage Max Value/Size
BOOLEAN 1 byte TRUE/FALSE
SMALLINT 2 bytes 32,767
INTEGER 4 bytes 2.1 billion
BIGINT 8 bytes 9.2 quintillion
DECIMAL(10,2) Variable User-defined
REAL 4 bytes 6 decimal places
DOUBLE PRECISION 8 bytes 15 decimal places
CHAR(10) 10 bytes 10 chars
VARCHAR(255) Variable 255 chars
TEXT Variable 1 GB
DATE 4 bytes ~10,000 years
TIMESTAMP 8 bytes Microsecond precision
UUID 16 bytes 128-bit unique ID
JSONB Variable ~1 GB
VECTOR(768) 3,072 bytes 768 dimensions

Best Practices

  1. Use appropriate numeric types:
  2. SMALLINT for small counters (< 32K)
  3. INTEGER for general purpose (< 2B)
  4. BIGINT for large numbers (> 2B)
  5. DECIMAL for money (exact)
  6. DOUBLE PRECISION for scientific (approximate)

  7. Use TEXT instead of VARCHAR (unless you have a specific length limit):

    --  GOOD
    CREATE TABLE posts (content TEXT);
    
    -- ⚠ OKAY if you have a limit
    CREATE TABLE users (username VARCHAR(50));
    

  8. Always use TIMESTAMPTZ (not TIMESTAMP):

    --  GOOD
    created_at TIMESTAMPTZ DEFAULT NOW()
    
    -- ❌ BAD (loses timezone info)
    created_at TIMESTAMP DEFAULT NOW()
    

  9. Use JSONB instead of JSON:

    --  GOOD (indexable, faster)
    profile JSONB
    
    -- ❌ BAD (slower, not indexable)
    profile JSON
    

  10. Index large columns appropriately:

    -- B-tree for exact matches
    CREATE INDEX idx_email ON users(email);
    
    -- GIN for JSONB
    CREATE INDEX idx_profile ON users USING GIN(profile);
    
    -- HNSW for vectors
    CREATE INDEX idx_embedding ON products USING hnsw(embedding);
    


Last Updated: November 1, 2025 Version: v6.0 Phase 2 M1 Maintained by: HeliosDB Documentation Team

Related: Query Guide | FAQ | Getting Started