HeliosDB Data Types Reference¶
Last Updated: November 1, 2025 Version: v6.0 Phase 2 M1
Complete reference for all data types supported by HeliosDB.
Table of Contents¶
- Numeric Types
- String Types
- Date/Time Types
- Boolean Type
- Binary Types
- JSON Types
- Array Types
- UUID Type
- Vector Types
- Spatial Types
- Network Address Types
- Custom Types
Numeric Types¶
Integer Types¶
| Type | Storage | Range | Use Case |
|---|---|---|---|
| SMALLINT | 2 bytes | -32,768 to 32,767 | Small counters |
| INTEGER / INT | 4 bytes | -2,147,483,648 to 2,147,483,647 | Default integer |
| BIGINT | 8 bytes | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 | Large numbers |
| SERIAL | 4 bytes | Auto-increment integer | Primary keys |
| BIGSERIAL | 8 bytes | Auto-increment bigint | Large table PKs |
CREATE TABLE products (
id SERIAL PRIMARY KEY,
category_id INT,
stock_count SMALLINT,
views BIGINT
);
Decimal Types¶
| Type | Storage | Precision | Use Case |
|---|---|---|---|
| DECIMAL(p,s) / NUMERIC(p,s) | Variable | User-defined | Exact decimal (money) |
| REAL / FLOAT4 | 4 bytes | 6 decimal digits | Approximate |
| DOUBLE PRECISION / FLOAT8 | 8 bytes | 15 decimal digits | Scientific |
CREATE TABLE financial (
price DECIMAL(10, 2), -- $99,999,999.99
exchange_rate DECIMAL(18, 8), -- Precise rates
temperature REAL, -- Approximate
scientific_value DOUBLE PRECISION
);
Important: Use DECIMAL for money to avoid floating-point errors!
-- CORRECT
SELECT 0.1::DECIMAL + 0.2::DECIMAL; -- Exactly 0.3
-- ❌ WRONG
SELECT 0.1::REAL + 0.2::REAL; -- 0.30000001 (floating-point error)
String Types¶
Character Types¶
| Type | Description | Max Size | Use Case |
|---|---|---|---|
| VARCHAR(n) | Variable-length | n chars | General text |
| CHAR(n) | Fixed-length (padded) | n chars | Fixed codes |
| TEXT | Unlimited variable | 1 GB | Large text |
CREATE TABLE users (
username VARCHAR(50), -- Max 50 chars
country_code CHAR(2), -- Always 2 chars (US, UK)
bio TEXT -- Unlimited
);
Best Practices:
- Use VARCHAR(n) when you have a length limit
- Use TEXT when length is unknown or unlimited
- Avoid CHAR(n) unless truly fixed-length (codes, hashes)
Text Search Types¶
CREATE TABLE documents (
content TEXT,
content_tsv TSVECTOR -- Full-text search vector
);
-- Create index for fast search
CREATE INDEX idx_content_fts ON documents USING GIN(content_tsv);
Date/Time Types¶
| Type | Storage | Resolution | Range | Use Case |
|---|---|---|---|---|
| DATE | 4 bytes | 1 day | 4713 BC - 5874897 AD | Birthdates |
| TIME | 8 bytes | 1 microsecond | 00:00:00 - 24:00:00 | Time of day |
| TIMESTAMP | 8 bytes | 1 microsecond | 4713 BC - 294276 AD | Event time (no TZ) |
| TIMESTAMPTZ | 8 bytes | 1 microsecond | 4713 BC - 294276 AD | Event time (with TZ) |
| INTERVAL | 16 bytes | 1 microsecond | +/- 178,000,000 years | Durations |
CREATE TABLE events (
event_date DATE, -- 2025-11-01
event_time TIME, -- 14:30:00
created_at TIMESTAMP, -- 2025-11-01 14:30:00 (no timezone)
updated_at TIMESTAMPTZ -- 2025-11-01 14:30:00+00 (with timezone)
);
Best Practice: Always use TIMESTAMPTZ instead of TIMESTAMP for timestamps!
Date/Time Functions¶
-- Current date/time
SELECT NOW(); -- 2025-11-01 14:30:00+00
SELECT CURRENT_DATE; -- 2025-11-01
SELECT CURRENT_TIME; -- 14:30:00+00
SELECT CURRENT_TIMESTAMP; -- 2025-11-01 14:30:00+00
-- Date arithmetic
SELECT NOW() + INTERVAL '7 days';
SELECT NOW() - INTERVAL '1 month';
SELECT NOW() + INTERVAL '2 hours 30 minutes';
-- Extract parts
SELECT EXTRACT(YEAR FROM NOW());
SELECT EXTRACT(MONTH FROM NOW());
SELECT EXTRACT(DAY FROM NOW());
SELECT EXTRACT(HOUR FROM NOW());
-- Date truncation
SELECT DATE_TRUNC('day', NOW()); -- Start of today
SELECT DATE_TRUNC('month', NOW()); -- Start of month
SELECT DATE_TRUNC('hour', NOW()); -- Start of hour
-- Age calculation
SELECT AGE(NOW(), '1990-01-15'::DATE); -- 35 years 9 mons 17 days
Boolean Type¶
| Type | Storage | Values | Use Case |
|---|---|---|---|
| BOOLEAN / BOOL | 1 byte | TRUE, FALSE, NULL | Flags |
CREATE TABLE users (
is_active BOOLEAN DEFAULT TRUE,
is_verified BOOLEAN DEFAULT FALSE,
is_admin BOOLEAN
);
-- Query
SELECT * FROM users WHERE is_active = TRUE;
SELECT * FROM users WHERE is_admin; -- Same as = TRUE
SELECT * FROM users WHERE NOT is_banned;
Boolean literals:
- TRUE: TRUE, 't', 'true', 'y', 'yes', 'on', '1'
- FALSE: FALSE, 'f', 'false', 'n', 'no', 'off', '0'
Binary Types¶
| Type | Max Size | Use Case |
|---|---|---|
| BYTEA | 1 GB | Binary data, images, files |
CREATE TABLE files (
filename VARCHAR(255),
content BYTEA,
content_hash BYTEA -- SHA-256 hash
);
-- Insert binary data
INSERT INTO files (filename, content) VALUES
('logo.png', '\xDEADBEEF'::BYTEA);
-- Or from file
INSERT INTO files (filename, content) VALUES
('logo.png', pg_read_binary_file('/path/to/logo.png'));
Better Practice: Store large files in object storage (S3, Azure Blob) and store URL in database:
JSON Types¶
| Type | Storage | Indexed | Use Case |
|---|---|---|---|
| JSON | Variable | No | Slow, exact storage |
| JSONB | Variable | Yes | Fast, recommended |
Always use JSONB unless you need to preserve exact JSON formatting.
CREATE TABLE users (
id SERIAL PRIMARY KEY,
profile JSONB
);
-- Insert JSON
INSERT INTO users (profile) VALUES
('{"name": "Alice", "age": 30, "tags": ["developer", "rust"]}');
-- Query JSON
SELECT profile->>'name' AS name FROM users;
SELECT * FROM users WHERE profile->>'role' = 'admin';
SELECT * FROM users WHERE profile @> '{"verified": true}';
-- Create index for fast queries
CREATE INDEX idx_profile ON users USING GIN(profile);
-- Update JSON
UPDATE users
SET profile = profile || '{"verified": true}'
WHERE id = 1;
JSON Operators¶
| Operator | Description | Example |
|---|---|---|
-> |
Get JSON field (as JSON) | profile->'address' |
->> |
Get JSON field (as text) | profile->>'name' |
@> |
Contains | profile @> '{"verified": true}' |
<@ |
Contained by | '{"a":1}' <@ profile |
? |
Key exists | profile ? 'premium' |
?| |
Any key exists | profile ?| array['a','b'] |
?& |
All keys exist | profile ?& array['a','b'] |
Array Types¶
Any type can be an array by adding []:
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
tags TEXT[], -- Array of strings
ratings INTEGER[], -- Array of integers
locations POINT[] -- Array of points
);
-- Insert arrays
INSERT INTO posts (tags, ratings) VALUES
(ARRAY['rust', 'database'], ARRAY[5, 4, 5]);
-- Alternative syntax
INSERT INTO posts (tags) VALUES
('{rust,database,tutorial}');
-- Query arrays
SELECT * FROM posts WHERE 'rust' = ANY(tags);
SELECT * FROM posts WHERE tags && ARRAY['rust', 'python'];
SELECT * FROM posts WHERE tags @> ARRAY['rust'];
-- Array functions
SELECT array_length(tags, 1) FROM posts;
SELECT unnest(tags) FROM posts; -- Expand to rows
UUID Type¶
| Type | Storage | Format | Use Case |
|---|---|---|---|
| UUID | 16 bytes | xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx | Distributed PKs |
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID,
created_at TIMESTAMPTZ DEFAULT NOW()
);
-- Insert with auto-generated UUID
INSERT INTO orders (user_id) VALUES ('550e8400-e29b-41d4-a716-446655440000');
-- Query
SELECT * FROM orders WHERE id = '550e8400-e29b-41d4-a716-446655440000'::UUID;
When to use UUID vs SERIAL: - UUID: Distributed systems, public APIs, privacy - SERIAL: Single database, internal PKs, better performance
Vector Types¶
| Type | Description | Dimensions | Use Case |
|---|---|---|---|
| VECTOR(n) | Dense embedding | 1-4096 | Semantic search |
| SPARSEVEC(n) | Sparse vector | Up to 65535 | Keyword search |
-- Enable vector extension
CREATE EXTENSION IF NOT EXISTS helios_vector;
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name TEXT,
embedding VECTOR(768), -- 768-dimensional dense vector
keywords SPARSEVEC(10000) -- Sparse keyword vector
);
-- Insert vectors
INSERT INTO products (name, embedding) VALUES
('Laptop', '[0.1, 0.2, 0.3, ...]'::VECTOR(768));
-- Find similar products (cosine similarity)
SELECT
name,
embedding <=> query_vector AS distance
FROM products
ORDER BY distance
LIMIT 10;
-- Create HNSW index for fast search
CREATE INDEX idx_products_embedding ON products
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
See Vector Search Guide.
Spatial Types¶
| Type | Description | Use Case |
|---|---|---|
| POINT | 2D point (x, y) | Locations |
| LINE | Infinite line | Geometry |
| LSEG | Line segment | Routes |
| BOX | Rectangle | Bounding boxes |
| PATH | Open/closed path | Polygons |
| POLYGON | Closed shape | Regions |
| CIRCLE | Circle | Radius search |
CREATE TABLE locations (
name VARCHAR(100),
location POINT,
region POLYGON
);
-- Insert spatial data
INSERT INTO locations (name, location) VALUES
('Office', '(37.7749, -122.4194)'); -- San Francisco
-- Distance queries
SELECT
name,
location <-> '(37.7749, -122.4194)'::POINT AS distance
FROM locations
ORDER BY distance
LIMIT 10;
For advanced GIS: Use PostGIS extension (fully compatible with HeliosDB):
CREATE EXTENSION postgis;
CREATE TABLE places (
name VARCHAR(100),
geom GEOMETRY(Point, 4326) -- WGS84 coordinate system
);
Network Address Types¶
| Type | Description | Example |
|---|---|---|
| INET | IPv4 or IPv6 address | 192.168.1.1 |
| CIDR | Network address | 192.168.1.0/24 |
| MACADDR | MAC address | 08:00:2b:01:02:03 |
CREATE TABLE connections (
ip_address INET,
network CIDR,
mac_address MACADDR
);
-- Insert network data
INSERT INTO connections VALUES
('192.168.1.100', '192.168.1.0/24', '08:00:2b:01:02:03');
-- Query by network
SELECT * FROM connections WHERE ip_address << '192.168.1.0/24';
Custom Types¶
Enum Types¶
-- Create enum type
CREATE TYPE user_status AS ENUM ('active', 'inactive', 'banned', 'pending');
CREATE TABLE users (
id SERIAL PRIMARY KEY,
status user_status DEFAULT 'pending'
);
-- Insert
INSERT INTO users (status) VALUES ('active');
-- Query
SELECT * FROM users WHERE status = 'active';
Composite Types¶
-- Create composite type
CREATE TYPE address AS (
street VARCHAR(100),
city VARCHAR(50),
state CHAR(2),
zip_code VARCHAR(10)
);
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
shipping_address address
);
-- Insert
INSERT INTO customers (name, shipping_address) VALUES
('Alice', ROW('123 Main St', 'San Francisco', 'CA', '94102'));
-- Query
SELECT
name,
(shipping_address).city AS city
FROM customers;
Domain Types (Constrained Types)¶
-- Create domain with constraints
CREATE DOMAIN email_address AS VARCHAR(255)
CHECK (VALUE ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}$');
CREATE DOMAIN positive_integer AS INTEGER
CHECK (VALUE > 0);
CREATE TABLE users (
id SERIAL PRIMARY KEY,
email email_address,
age positive_integer
);
-- This will fail validation
INSERT INTO users (email, age) VALUES ('invalid', -5);
Type Conversion¶
Explicit Casting¶
-- Using ::
SELECT '123'::INTEGER;
SELECT 123::TEXT;
SELECT '2025-11-01'::DATE;
-- Using CAST()
SELECT CAST('123' AS INTEGER);
SELECT CAST(NOW() AS DATE);
Implicit Casting¶
-- Integer to BIGINT (safe)
SELECT 123::INTEGER + 456::BIGINT;
-- String to timestamp
SELECT NOW() > '2025-01-01';
Size Comparison¶
| Type | Storage | Max Value/Size |
|---|---|---|
| BOOLEAN | 1 byte | TRUE/FALSE |
| SMALLINT | 2 bytes | 32,767 |
| INTEGER | 4 bytes | 2.1 billion |
| BIGINT | 8 bytes | 9.2 quintillion |
| DECIMAL(10,2) | Variable | User-defined |
| REAL | 4 bytes | 6 decimal places |
| DOUBLE PRECISION | 8 bytes | 15 decimal places |
| CHAR(10) | 10 bytes | 10 chars |
| VARCHAR(255) | Variable | 255 chars |
| TEXT | Variable | 1 GB |
| DATE | 4 bytes | ~10,000 years |
| TIMESTAMP | 8 bytes | Microsecond precision |
| UUID | 16 bytes | 128-bit unique ID |
| JSONB | Variable | ~1 GB |
| VECTOR(768) | 3,072 bytes | 768 dimensions |
Best Practices¶
- Use appropriate numeric types:
SMALLINTfor small counters (< 32K)INTEGERfor general purpose (< 2B)BIGINTfor large numbers (> 2B)DECIMALfor money (exact)-
DOUBLE PRECISIONfor scientific (approximate) -
Use TEXT instead of VARCHAR (unless you have a specific length limit):
-
Always use TIMESTAMPTZ (not TIMESTAMP):
-
Use JSONB instead of JSON:
-
Index large columns appropriately:
Last Updated: November 1, 2025 Version: v6.0 Phase 2 M1 Maintained by: HeliosDB Documentation Team
Related: Query Guide | FAQ | Getting Started