Chapter 13: CVM Architecture
13.1 Introduction to the Cognica Virtual Machine
The Cognica Virtual Machine (CVM) is a specialized bytecode interpreter designed for high-performance query execution. Unlike general-purpose virtual machines such as the JVM or CLR, CVM is purpose-built for database operations, with native support for document manipulation, aggregation, joins, sorting, and other query primitives.
CVM occupies a unique position in the database execution landscape. Traditional database engines use either interpreted expression evaluation (slow but flexible) or compile queries directly to native code (fast but complex). CVM provides a middle ground: a bytecode representation that is faster than interpretation, simpler than native compilation, and provides a clean target for optimization passes.
13.1.1 Design Philosophy
CVM's architecture reflects several key design principles:
-
Query-Centric Instruction Set: Rather than general-purpose operations, CVM provides opcodes specifically designed for database operations—field access, aggregation, sorting, and joins are first-class operations.
-
Register-Based Architecture: Like modern CPUs and high-performance VMs (Lua, Dalvik), CVM uses registers rather than a stack machine. This reduces instruction count and enables better optimization.
-
Computed Goto Dispatch: The interpreter uses computed goto (threaded code) for 10-20% faster dispatch compared to switch-based interpretation.
-
Spillable Data Structures: Memory-intensive operations (sort, hash join, aggregation) can automatically spill to disk when memory is exhausted.
-
Zero-Copy Optimizations: Composite rows enable multi-table joins without deep copying documents.
13.1.2 CVM in the Query Pipeline
CVM integrates into the query execution pipeline as follows:
13.2 Architectural Overview
13.2.1 Core Components
The CVM system consists of several interconnected components:
| Component | Location | Purpose |
|---|---|---|
| Interpreter | interpreter/interpreter.hpp | Bytecode execution with computed-goto dispatch |
| VMContext | interpreter/context.hpp | Runtime state (registers, stacks, buffers) |
| Opcode System | bytecode/opcode.hpp | 512+ opcode definitions |
| Instruction Format | bytecode/instruction.hpp | 32-bit instruction encoding |
| BytecodeModule | bytecode/bytecode_module.hpp | Compiled bytecode unit |
| Value System | types/value.hpp | 24-byte tagged union |
| Type System | types/type_system.hpp | 16 CVM types |
13.2.2 Key Dimensions
| Dimension | Value |
|---|---|
| Primary Opcodes | 256 (0x00-0xFF) |
| Extended Opcodes | 256 (via 0xFE prefix) |
| General Registers | 16 (R0-R15) |
| Float Registers | 8 (F0-F7) |
| CVM Types | 16 |
| VMValue Size | 24 bytes |
| Max Call Stack | 256 frames |
| Max Operand Stack | 1024 values |
| Max Cursors | 16 |
13.3 Type System
CVM implements a rich type system supporting SQL data types, JSON values, and internal execution types.
13.3.1 Type Enumeration
enum class CVMType : uint8_t {
kNull = 0x00, // SQL NULL / JSON null
kBool = 0x01, // Boolean true/false
kInt64 = 0x02, // 64-bit signed integer
kDouble = 0x03, // IEEE 754 double precision
kString = 0x04, // UTF-8 string
kArray = 0x05, // Ordered collection
kDocument = 0x06, // Key-value object
kBinary = 0x07, // Raw byte array (BYTEA)
kTimestamp = 0x08, // TIMESTAMP WITHOUT TIME ZONE
kTimestampTZ = 0x09, // TIMESTAMP WITH TIME ZONE
kDate = 0x0A, // Date (year-month-day)
kTime = 0x0B, // Time of day
kInterval = 0x0C, // Time interval
kDecimal = 0x0D, // Arbitrary precision decimal
kUUID = 0x0E, // 128-bit UUID
kCompositeRow = 0x0F, // Zero-copy JOIN composite
kAggState = 0x10, // Aggregation state (internal)
kVoid = 0xFE, // No value (side-effect only)
kUnknown = 0xFF // Type unknown at compile time
};
13.3.2 Type Categories
Types are classified into categories for operation dispatch:
Numeric Types (support arithmetic):
kInt64: 64-bit signed integerkDouble: IEEE 754 double-precision floating pointkDecimal: Arbitrary precision decimal (128-bit coefficient)
Temporal Types:
kTimestamp: Microseconds since Unix epoch (no timezone)kTimestampTZ: UTC microseconds with timezone awarenesskDate: Days since Unix epochkTime: Microseconds since midnightkInterval: PostgreSQL-style (months + days + microseconds)
Container Types:
kArray: Ordered collection of heterogeneous valueskDocument: Key-value map (JSON object)
Special Types:
kCompositeRow: Zero-copy reference to multiple documents (for joins)kAggState: Internal aggregation state
13.3.3 VMValue Structure
All values in CVM are represented using a 24-byte tagged union:
struct VMValue {
CVMType type; // 1 byte: type discriminant
uint8_t flags; // 1 byte: kFlagOwned, kFlagConst
uint16_t reserved; // 2 bytes: alignment padding
uint32_t padding; // 4 bytes: alignment padding
union { // 16 bytes: value storage
bool bool_val;
int64_t int64_val;
double double_val;
StringRef string_val; // (ptr, len)
ArrayRef array_val; // (ptr, len, cap)
Document* doc_val;
CompositeRow* composite_row_val;
TimestampVal timestamp_val;
DateVal date_val;
TimeVal time_val;
IntervalVal interval_val;
DecimalVal decimal_val;
UUIDVal uuid_val;
BinaryRef binary_val;
void* ptr_val;
uint8_t raw_bytes[16];
};
};
static_assert(sizeof(VMValue) == 24, "Cache-efficient size");
The 24-byte size is carefully chosen for cache efficiency—three VMValues fit exactly in a single 64-byte cache line on most modern processors.
13.3.4 Temporal Value Representations
Timestamp:
Where epoch is 1970-01-01 00:00:00 UTC.
Date:
Time:
Interval (PostgreSQL-compatible):
struct IntervalVal {
int32_t months; // Total months
int32_t days; // Total days (separate for DST)
int64_t microseconds; // Remaining microseconds
};
The three-component interval representation follows PostgreSQL's design, allowing correct handling of month boundaries and daylight saving time transitions.
Decimal (128-bit):
struct DecimalVal {
int64_t coefficient_high; // Upper 64 bits
int64_t coefficient_low; // Lower 64 bits
};
13.4 Instruction Encoding
CVM uses fixed-width 32-bit instructions with six primary formats, enabling efficient decoding and cache-friendly code layout.
13.4.1 Instruction Formats
Format A: 3-Register (32 bits)
Format B: 2-Register + Imm16 (32 bits)
Format C: Jump (32 bits)
Format D: Extended Imm64 (96 bits)
Format A (3-operand register operations):
R[dst] = R[src1] OP R[src2]
Used for arithmetic, comparison, and logical operations.
Format B (2-operand with 16-bit immediate):
R[dst] = R[src] OP sign_extend(imm16)
Used for immediate operations and pool index references.
Format C (conditional jump):
if R[cond]: PC += offset * 4
Used for branches and loops. The 16-bit signed offset is measured in words (4 bytes), providing a range of approximately +/- 128KB.
Format D (64-bit immediate):
R[dst] = imm64
Used for loading large constants (8-byte instruction total).
Format E (unary operations):
R[dst] = OP R[src]
Used for type conversions, negation, and other single-operand operations.
Format F (no operands):
HALT, NOP, YIELD
Used for control operations with no data dependencies.
13.4.2 Extended Instruction Formats
For complex operations like composite rows, CVM uses extended formats:
Format G (register + pool index):
[Opcode:8][Dst:4][Reserved:4][PoolIdx:8][Reserved:8]
Used for batch constant instructions and pool references.
Format H (two-word extended):
Word 1: [0xFE:8][ExtOpcode:8][Dst:8][Src:8]
Word 2: [Operand1:16][Operand2:16]
Used for composite row operations requiring 16-bit operands.
13.4.3 Encoding/Decoding
Instructions are encoded and decoded using helper functions:
// Encoding
auto encode_format_a(Opcode op, uint8_t dst, uint8_t src1,
uint8_t src2, uint8_t flags) -> uint32_t {
return (static_cast<uint32_t>(op) << 24) |
(static_cast<uint32_t>(dst) << 20) |
(static_cast<uint32_t>(src1) << 16) |
(static_cast<uint32_t>(src2) << 12) |
(static_cast<uint32_t>(flags) << 8);
}
// Decoding
auto decode_opcode(uint32_t word) -> Opcode {
return static_cast<Opcode>(word >> 24);
}
auto decode_dst(uint32_t word) -> uint8_t {
return (word >> 20) & 0xF;
}
13.5 Opcode System
CVM defines over 512 opcodes organized into functional categories.
13.5.1 Opcode Organization
0x00-0x0F: Data Movement
0x10-0x1F: Integer Arithmetic
0x20-0x2F: Floating Point Arithmetic
0x30-0x3F: Bitwise Operations
0x40-0x4F: Integer Comparison
0x50-0x57: Float Comparison
0x58-0x5F: String Comparison
0x60-0x6F: Logical Operations & Debug
0x70-0x7F: Control Flow
0x80-0x8F: Type Operations
0x90-0x9F: Field Access
0xA0-0xAF: Array Operations
0xB0-0xBF: String Operations
0xC0-0xCF: Aggregation
0xD0-0xDF: Query Buffers (Window, Hash, Sort, Set)
0xE0-0xEF: Working Tables & Functions
0xF0-0xF9: Cursor Operations
0xFA-0xFD: External Calls
0xFE: Extended Opcode Prefix
0xFF: Undefined
13.5.2 Data Movement Operations (0x00-0x0F)
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0x00 | NOP | No operation |
| 0x01 | MOVE | R[dst] = R[src] |
| 0x02 | MOVE_INT64 | R[dst] = imm64 |
| 0x03 | MOVE_FLOAT64 | R[dst] = imm64 (as double) |
| 0x04 | MOVE_NULL | R[dst] = NULL |
| 0x05 | MOVE_TRUE | R[dst] = true |
| 0x06 | MOVE_FALSE | R[dst] = false |
| 0x07 | LOAD_CONST | R[dst] = constant_pool[imm16] |
| 0x08 | LOAD_STRING | R[dst] = string_pool[imm16] |
| 0x09 | LOAD_ARRAY | R[dst] = array_pool[imm16] |
| 0x0A | LOAD_DOC | R[dst] = doc_pool[imm16] |
| 0x0B | STORE_REG | temp_stack[imm16] = R[src] |
| 0x0C | LOAD_REG | R[dst] = temp_stack[imm16] |
| 0x0D | LOAD_PARAM | R[dst] = parameters[imm16] |
| 0x0E | LOAD_INPUT | R[dst] = input_document |
13.5.3 Arithmetic Operations (0x10-0x2F)
Integer Arithmetic (0x10-0x1F):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0x10 | ADD_I64 | R[dst] = R[src1] + R[src2] |
| 0x11 | SUB_I64 | R[dst] = R[src1] - R[src2] |
| 0x12 | MUL_I64 | R[dst] = R[src1] * R[src2] |
| 0x13 | DIV_I64 | R[dst] = R[src1] / R[src2] |
| 0x14 | MOD_I64 | R[dst] = R[src1] % R[src2] |
| 0x15 | NEG_I64 | R[dst] = -R[src] |
| 0x16 | ABS_I64 | R[dst] = abs(R[src]) |
| 0x17 | INC_I64 | R[dst] = R[src] + 1 |
| 0x18 | DEC_I64 | R[dst] = R[src] - 1 |
Floating Point Arithmetic (0x20-0x2F):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0x20 | ADD_F64 | R[dst] = R[src1] + R[src2] |
| 0x21 | SUB_F64 | R[dst] = R[src1] - R[src2] |
| 0x22 | MUL_F64 | R[dst] = R[src1] * R[src2] |
| 0x23 | DIV_F64 | R[dst] = R[src1] / R[src2] |
| 0x24 | NEG_F64 | R[dst] = -R[src] |
| 0x25 | ABS_F64 | R[dst] = abs(R[src]) |
| 0x26 | SQRT | R[dst] = sqrt(R[src]) |
| 0x27 | POW | R[dst] = pow(R[src1], R[src2]) |
| 0x28 | FLOOR | R[dst] = floor(R[src]) |
| 0x29 | CEIL | R[dst] = ceil(R[src]) |
| 0x2A | ROUND | R[dst] = round(R[src]) |
13.5.4 Comparison Operations (0x40-0x5F)
Integer Comparison (0x40-0x4F):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0x40 | CMP_EQ_I64 | R[dst] = (R[src1] == R[src2]) |
| 0x41 | CMP_NE_I64 | R[dst] = (R[src1] != R[src2]) |
| 0x42 | CMP_LT_I64 | R[dst] = (R[src1] < R[src2]) |
| 0x43 | CMP_LE_I64 | R[dst] = (R[src1] <= R[src2]) |
| 0x44 | CMP_GT_I64 | R[dst] = (R[src1] > R[src2]) |
| 0x45 | CMP_GE_I64 | R[dst] = (R[src1] >= R[src2]) |
String Comparison (0x58-0x5F):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0x58 | CMP_EQ_STR | R[dst] = (R[src1] == R[src2]) |
| 0x59 | CMP_NE_STR | R[dst] = (R[src1] != R[src2]) |
| 0x5A | CMP_LT_STR | R[dst] = (R[src1] < R[src2]) |
| 0x5B | CMP_LE_STR | R[dst] = (R[src1] <= R[src2]) |
13.5.5 Control Flow Operations (0x70-0x7F)
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0x70 | JUMP | PC += offset * 4 |
| 0x71 | JUMP_TRUE | if R[cond]: PC += offset * 4 |
| 0x72 | JUMP_FALSE | if !R[cond]: PC += offset * 4 |
| 0x73 | JUMP_NULL | if R[cond] is NULL: PC += offset * 4 |
| 0x74 | JUMP_NOT_NULL | if R[cond] is not NULL: PC += offset * 4 |
| 0x75 | CALL | Push frame, jump to function |
| 0x76 | RETURN | Pop frame, return value |
| 0x77 | HALT | Stop execution |
| 0x78 | YIELD | Yield for streaming |
13.5.6 Field Access Operations (0x90-0x9F)
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0x90 | GET_FIELD | R[dst] = R[doc].field[pool_idx] |
| 0x91 | SET_FIELD | R[doc].field[pool_idx] = R[src] |
| 0x92 | HAS_FIELD | R[dst] = R[doc].has(field[pool_idx]) |
| 0x93 | GET_NESTED | R[dst] = R[doc].path[pool_idx] |
| 0x94 | DELETE_FIELD | R[doc].delete(field[pool_idx]) |
In addition to the primary field access opcodes above, the extended opcode GET_OUTER_FIELD (ExtOp 0x85) reads a field from the outer query's current row rather than the current document. This opcode is emitted by the plan lowering pass when compiling correlated subqueries — column references whose table alias does not belong to the inner query's local alias set are lowered to GET_OUTER_FIELD instead of GET_FIELD. The interpreter resolves the field from the outer row context, supporting both Document and CompositeRow outer rows. This enables the CVM to execute correlated subqueries natively without falling back to the Volcano executor.
13.5.7 Aggregation Operations (0xC0-0xCF)
Single Aggregation (0xC0-0xC7):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0xC0 | AGG_INIT | Initialize aggregation state |
| 0xC1 | AGG_ACCUMULATE | Accumulate value into state |
| 0xC2 | AGG_FINALIZE | Compute final aggregate result |
| 0xC3 | AGG_MERGE | Merge two aggregate states |
Aggregation Table (0xC8-0xCF):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0xC8 | AGG_TABLE_NEW | Create aggregation table |
| 0xC9 | AGG_GET_OR_CREATE | Get/create group state |
| 0xCA | AGG_ITER_INIT | Initialize table iterator |
| 0xCB | AGG_ITER_NEXT | Get next group |
| 0xCC | AGG_TABLE_NEW_MULTI | Multi-function agg table |
| 0xCD | AGG_STATE_AT | Access state at index |
| 0xCE | AGG_ITER_HAS_NEXT | Check for more groups |
13.5.8 Query Buffer Operations (0xD0-0xDF)
Window Buffer (0xD0-0xD3):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0xD0 | WIN_BUF_NEW | Create window buffer |
| 0xD1 | WIN_BUF_ADD | Add row to window |
| 0xD2 | WIN_COMPUTE | Compute window function |
| 0xD3 | WIN_BUF_NEXT | Get next windowed row |
Hash Table (0xD4-0xD7):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0xD4 | HASH_TABLE_NEW | Create hash table |
| 0xD5 | HASH_TABLE_INSERT | Insert key-value pair |
| 0xD6 | HASH_TABLE_PROBE | Probe for key |
| 0xD7 | HASH_TABLE_DESTROY | Destroy hash table |
Sort Buffer (0xD8-0xDB):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0xD8 | SORT_BUF_NEW | Create sort buffer |
| 0xD9 | SORT_BUF_ADD | Add row to buffer |
| 0xDA | SORT_BUF_NEXT | Get next sorted row |
| 0xDB | SORT_BUF_DESTROY | Destroy sort buffer |
Set Operations (0xDC-0xDF):
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0xDC | SET_OP_NEW | Create set operation buffer |
| 0xDD | SET_OP_ADD | Add row from source |
| 0xDE | SET_OP_NEXT | Get next result row |
| 0xDF | SET_OP_DESTROY | Destroy set buffer |
13.5.9 Cursor Operations (0xF0-0xF7)
| Opcode | Mnemonic | Operation |
|---|---|---|
| 0xF0 | CURSOR_OPEN | Open collection cursor |
| 0xF1 | CURSOR_NEXT | Get next document |
| 0xF2 | CURSOR_CLOSE | Close cursor |
| 0xF3 | CURSOR_RESET | Reset to beginning |
| 0xF4 | CURSOR_IS_VALID | Check if more rows |
| 0xF5 | EMIT_ROW | Emit result row |
| 0xF6 | CURSOR_TAKE_DOC | Move document (no copy) |
13.5.10 Extended Opcodes (0xFE prefix)
Extended opcodes use a two-byte opcode sequence (0xFE + extended opcode):
Array/Object Iteration (0x01-0x0A):
ITER_ARRAY_BEGIN,ITER_ARRAY_NEXT,ITER_ARRAY_ENDITER_OBJECT_BEGIN,ITER_OBJECT_NEXT_KEY,ITER_OBJECT_NEXT_VAL,ITER_OBJECT_ENDITER_RANGE_BEGIN,ITER_RANGE_NEXT,ITER_RANGE_END
Document Construction (0x0B-0x12):
DOC_NEW,DOC_FROM_JSON,DOC_TO_JSONDOC_MERGE,DOC_DEEP_COPY
Composite Row Operations (0x13-0x1C):
COMPOSITE_NEW,COMPOSITE_ADD_SLOTCOMPOSITE_GET_FIELD,COMPOSITE_GET_SLOTCOMPOSITE_MATERIALIZE,COMPOSITE_EMIT
Batch/Vectorized Operations (0x20-0x5F):
BATCH_SCAN_OPEN,BATCH_SCAN_NEXT,BATCH_EMITBATCH_ADD_I64,BATCH_CMP_LT_I64, etc.BATCH_AND,BATCH_OR,BATCH_SELECTBATCH_HASH,BATCH_HASH_BUILD,BATCH_HASH_PROBE
Parallel Execution (0x5C-0x67):
PARALLEL_SCAN_OPEN,PARALLEL_FILTER_EVALPARALLEL_PARTITION,PARALLEL_MERGE,PARALLEL_BARRIER
Outer Context Operations (0x85):
GET_OUTER_FIELD: Read a field from the outer query's current row for correlated subquery support
13.6 Register Architecture
CVM employs a register-based architecture with 24 registers organized into two files.
13.6.1 Register Organization
General Purpose Registers (R0-R15):
- 16 registers for scalar values
- Can hold any CVMType
- Used for integers, booleans, pointers
Float Registers (F0-F7):
- 8 registers for floating-point values
- Optimized for double-precision arithmetic
- Separate to avoid type checks in hot loops
13.6.2 Register Storage
class VMContext {
private:
std::array<VMValue, 16> registers_; // R0-R15
std::array<VMValue, 8> float_registers_; // F0-F7
uint32_t pc_; // Program counter
};
13.6.3 Register Access
auto VMContext::get_register(uint8_t reg) const -> const VMValue& {
assert(reg < 16);
return registers_[reg];
}
void VMContext::set_register(uint8_t reg, VMValue value) {
assert(reg < 16);
registers_[reg] = std::move(value);
}
// Convenience accessors for common types
auto VMContext::get_int64(uint8_t reg) const -> int64_t {
assert(registers_[reg].type == CVMType::kInt64);
return registers_[reg].int64_val;
}
void VMContext::set_int64(uint8_t reg, int64_t value) {
registers_[reg].type = CVMType::kInt64;
registers_[reg].int64_val = value;
}
13.6.4 Register vs Stack Architecture
CVM's register-based design offers several advantages over stack machines:
| Property | Register Machine | Stack Machine |
|---|---|---|
| Instruction count | Lower | Higher |
| Operand encoding | Explicit registers | Implicit stack |
| Code size | Larger instructions | Smaller instructions |
| Optimization | Easier | Harder |
| Dispatch overhead | Lower | Higher |
For database workloads, the reduced instruction count and better optimization potential outweigh the slightly larger code size.
13.7 Execution Context
The VMContext maintains all runtime state during bytecode execution.
13.7.1 State Categories
13.7.2 Context Structure
class VMContext {
private:
// === Registers ===
std::array<VMValue, 16> registers_;
std::array<VMValue, 8> float_registers_;
uint32_t pc_;
// === Control Flow ===
std::vector<VMValue> operand_stack_;
std::vector<CallFrame> call_stack_;
// === Status ===
VMStatus status_;
VMFlags flags_;
std::optional<VMError> error_;
// === Module ===
const BytecodeModule* module_;
// === I/O ===
VMValue input_document_;
RowEmitCallback row_emit_callback_;
uint64_t rows_emitted_;
uint64_t rows_scanned_;
// === Cursors ===
std::array<std::unique_ptr<Cursor>, 16> cursors_;
std::array<VMValue, 16> current_documents_;
std::array<bool, 16> cursor_valid_;
// === Query Buffers (spillable) ===
std::array<SortBuffer, 4> sort_buffers_;
std::array<HashTable, 4> hash_tables_;
std::array<AggregationTable, 4> agg_tables_;
std::array<WindowBuffer, 4> window_buffers_;
std::array<SetOpBuffer, 4> set_op_buffers_;
std::array<WorkingTableSlot, 8> working_tables_;
// === Memory Management ===
std::vector<std::unique_ptr<Document>> owned_documents_;
std::vector<std::unique_ptr<std::string>> owned_strings_;
std::vector<std::unique_ptr<std::vector<VMValue>>> owned_arrays_;
std::vector<std::unique_ptr<CompositeRow>> owned_composite_rows_;
// === Parameters ===
std::array<VMValue, 256> parameters_;
std::bitset<256> parameters_bound_;
// === Statistics ===
uint64_t instructions_executed_;
};
13.7.3 Status and Flags
Execution Status:
enum class VMStatus : uint8_t {
kRunning = 0, // Normal execution
kHalted = 1, // Execution completed
kError = 2, // Runtime error
kBreakpoint = 3, // Hit debugger breakpoint
kYielded = 4 // Yielded for streaming
};
CPU Flags (set by comparisons):
struct VMFlags {
bool zero = false; // Result was zero/equal
bool negative = false; // Result was negative
bool overflow = false; // Arithmetic overflow
bool carry = false; // Carry/borrow
};
13.7.4 Resource Limits
| Resource | Limit | Purpose |
|---|---|---|
| Call Stack | 256 frames | Prevent stack overflow |
| Operand Stack | 1024 values | Function argument passing |
| Cursors | 16 | Concurrent table scans |
| Sort Buffers | 4 | Concurrent ORDER BY |
| Hash Tables | 4 | Concurrent joins |
| Agg Tables | 4 | Concurrent GROUP BY |
| Window Buffers | 4 | Window functions |
| Set Op Buffers | 4 | UNION/INTERSECT/EXCEPT |
| Working Tables | 8 | Recursive CTEs |
| Parameters | 256 | Prepared statement bindings |
13.8 Interpreter Implementation
The CVM interpreter uses computed goto (threaded code) for efficient dispatch.
13.8.1 Dispatch Table
class Interpreter {
private:
// Dispatch table: 256 void* pointers to instruction handlers
static void* dispatch_table_[256];
public:
auto execute(VMContext& ctx) -> ExecutionResult;
};
13.8.2 Computed Goto Dispatch
The interpreter main loop uses computed goto for minimal dispatch overhead:
auto Interpreter::execute(VMContext& ctx) -> ExecutionResult {
// Initialize dispatch table (once)
static void* dispatch_table[] = {
&&op_nop, // 0x00
&&op_move, // 0x01
&&op_move_int64, // 0x02
// ... 256 entries
};
// Fetch first instruction
auto word = ctx.fetch_instruction();
auto opcode = word >> 24;
goto *dispatch_table[opcode];
op_nop:
ctx.advance_pc(4);
word = ctx.fetch_instruction();
opcode = word >> 24;
goto *dispatch_table[opcode];
op_move:
{
auto dst = (word >> 20) & 0xF;
auto src = (word >> 16) & 0xF;
ctx.set_register(dst, ctx.get_register(src));
}
ctx.advance_pc(4);
word = ctx.fetch_instruction();
opcode = word >> 24;
goto *dispatch_table[opcode];
op_add_int64:
{
auto dst = (word >> 20) & 0xF;
auto src1 = (word >> 16) & 0xF;
auto src2 = (word >> 12) & 0xF;
auto a = ctx.get_int64(src1);
auto b = ctx.get_int64(src2);
ctx.set_int64(dst, a + b);
}
ctx.advance_pc(4);
word = ctx.fetch_instruction();
opcode = word >> 24;
goto *dispatch_table[opcode];
// ... handlers for all 256 opcodes
op_halt:
ctx.set_status(VMStatus::kHalted);
return ExecutionResult::kSuccess;
}
13.8.3 Why Computed Goto?
Computed goto provides significant performance benefits:
- No Bounds Check: Direct table lookup, no range validation
- Direct Jumps: Each handler jumps directly to the next
- Better Branch Prediction: Dedicated indirect branch per handler
- 10-20% Faster: Measured improvement over switch-case
Comparison with Switch:
// Switch-based dispatch (slower)
while (running) {
auto opcode = fetch_opcode();
switch (opcode) {
case OP_ADD: /* handler */ break;
case OP_SUB: /* handler */ break;
// ...
}
}
// Computed goto (faster)
goto *dispatch_table[opcode];
op_add:
/* handler */
goto *dispatch_table[next_opcode];
The switch version has a single indirect branch that must predict across all opcodes. Computed goto gives each handler its own branch, allowing the branch predictor to learn patterns specific to that instruction.
13.8.4 Execution Configuration
struct InterpreterConfig {
uint64_t max_instructions = 100'000'000'000; // Anti-DoS limit
bool enable_tracing = false; // Debug tracing
bool enable_profiling = false; // Performance profiling
};
13.8.5 Execution Result
enum class ExecutionResult : uint8_t {
kSuccess, // Completed normally
kError, // Runtime error occurred
kBreakpoint, // Hit debugger breakpoint
kMaxInstructions, // Exceeded instruction limit
kYield // Yielded for streaming
};
13.9 Specialized Execution Buffers
CVM provides specialized data structures for memory-intensive query operations, all capable of spilling to disk when memory is exhausted.
13.9.1 Sort Buffer
The sort buffer implements ORDER BY with automatic external sort fallback:
class SortBuffer {
public:
// Field-based sorting
void configure(const std::vector<SortComparator>& comparators);
// Value-based sorting (for expression keys)
void configure_with_values(size_t num_keys,
const std::vector<bool>& ascending,
const std::vector<bool>& nulls_first);
void add(Document* row);
void add_with_values(Document* row, std::vector<SortableValue> keys);
auto next() -> Document*; // Sorts on first call
void destroy();
private:
std::vector<Document*> rows_;
std::vector<SortComparator> comparators_;
size_t memory_limit_;
bool sorted_ = false;
};
Spill Strategy: When memory is exceeded, the buffer:
- Sorts current in-memory data
- Writes sorted run to temporary file
- Clears memory
- Continues accepting rows
- On iteration, performs merge sort across runs
13.9.2 Hash Table
The hash table supports hash joins and hash aggregation:
class HashTable {
public:
void insert(const VMValue& key, Document* value);
auto probe_first(const VMValue& key) -> Document*;
auto probe_next() -> Document*; // For multi-match
void destroy();
private:
std::unordered_multimap<int64_t, Document*> entries_;
size_t memory_limit_;
};
Spill Strategy (Grace Hash Join): When memory is exceeded:
- Partition data by hash(key) into N partitions
- Spill largest partition to disk
- Continue with remaining partitions in memory
- Process spilled partitions recursively
13.9.3 Aggregation Table
The aggregation table implements GROUP BY with multiple aggregation functions:
class AggregationTable {
public:
void configure(const std::vector<AggregateFunction>& functions);
auto get_or_create(const VMValue& key)
-> std::vector<std::unique_ptr<AggregateState>>*;
void iter_init();
auto iter_next() -> std::pair<VMValue*, std::vector<AggregateState*>*>;
auto iter_has_next() const -> bool;
private:
std::unordered_map<int64_t, AggGroupEntry> groups_;
std::vector<AggregateFunction> functions_;
Iterator current_iter_;
};
Supported Aggregate Functions:
enum class AggregateFunction : uint8_t {
kCount = 0, // COUNT(expr)
kSum = 1, // SUM(expr)
kAvg = 2, // AVG(expr)
kMin = 3, // MIN(expr)
kMax = 4, // MAX(expr)
kCountStar = 5, // COUNT(*)
kStddevPop = 6, // STDDEV_POP(expr)
kStddevSamp = 7, // STDDEV_SAMP(expr)
kVarPop = 8, // VAR_POP(expr)
kVarSamp = 9, // VAR_SAMP(expr)
kFirst = 10, // FIRST(expr)
kLast = 11, // LAST(expr)
kStringAgg = 12, // STRING_AGG(expr, delimiter)
kArrayAgg = 13 // ARRAY_AGG(expr)
};
13.9.4 Window Buffer
The window buffer implements window functions:
class WindowBuffer {
public:
void add(Document* row);
void compute(uint16_t window_spec_index);
auto next() -> Document*;
auto get_results() const -> const WindowResultMap*;
private:
std::vector<Document*> rows_;
std::vector<WindowSpec> window_specs_;
WindowResultMap results_;
};
13.9.5 Set Operation Buffer
The set operation buffer implements UNION, INTERSECT, and EXCEPT:
class SetOpBuffer {
public:
enum class Type { kUnion, kIntersect, kExcept };
void configure(Type type, bool all);
void add(const Document* doc, uint8_t source); // source: 0=left, 1=right
auto next() -> Document*;
private:
Type type_;
bool all_; // ALL variant (no dedup)
std::unordered_set<int64_t> seen_;
std::vector<Document> results_;
};
13.9.6 Working Table (Recursive CTE)
The working table implements recursive common table expressions:
class WorkingTable {
public:
void add(const Document* doc);
void add(Document&& doc); // Move semantics
void swap(WorkingTable& other);
auto scan_next() -> Document*;
auto is_empty() const -> bool;
void scan_reset();
private:
std::vector<Document> rows_;
size_t scan_position_ = 0;
};
Recursive CTE Execution Pattern:
1. Execute anchor query, add to working_table_0
2. While working_table_0 is not empty:
a. Swap working_table_0 and working_table_1
b. Clear working_table_0
c. For each row in working_table_1:
- Execute recursive query
- Add results to working_table_0
d. Emit rows from working_table_1
13.10 Composite Rows (Zero-Copy Joins)
Composite rows enable efficient multi-table joins without deep copying documents.
13.10.1 Motivation
In a traditional join implementation:
// Deep copy approach (expensive)
auto result = Document {};
for (const auto& field : left_doc) {
result.set(field.name, field.value.deep_copy());
}
for (const auto& field : right_doc) {
result.set(field.name, field.value.deep_copy());
}
This involves:
- Memory allocation for new document
- Deep copying all field values
- String duplication
- Array/nested document copying
13.10.2 Composite Row Structure
class CompositeRow {
public:
void add_slot(const std::string& alias, Document* doc);
auto get_field(const std::string& qualified_name) const -> const Value*;
auto get_slot(uint8_t slot_index, uint16_t field_index) const -> const Value*;
auto materialize() const -> Document; // Create actual document
private:
struct Slot {
std::string alias;
Document* doc; // Non-owning pointer
};
std::vector<Slot> slots_;
};
13.10.3 Composite Row Operations
| Opcode | Operation |
|---|---|
| COMPOSITE_NEW | Create empty CompositeRow |
| COMPOSITE_ADD_SLOT | Add aliased document reference |
| COMPOSITE_GET_FIELD | Get field by qualified name |
| COMPOSITE_GET_SLOT | Get field by slot + index |
| COMPOSITE_MATERIALIZE | Create concrete Document |
| COMPOSITE_EMIT | Emit with callback |
| COMPOSITE_CLEAR | Clear slots for reuse |
13.10.4 Performance Impact
For a 3-way join producing 100,000 result rows:
| Approach | Memory | Time |
|---|---|---|
| Deep Copy | ~800 MB | ~500 ms |
| Composite Row | ~8 MB | ~50 ms |
Composite rows reduce both memory usage and execution time by approximately 10x for join-heavy queries.
13.11 Built-in Functions
CVM provides 200+ built-in functions accessible via the CALL_BUILTIN opcode.
13.11.1 Function Categories
| Category | ID Range | Examples |
|---|---|---|
| Mathematical | 0x0000-0x00FF | abs, floor, ceil, sqrt, pow, log |
| String | 0x0100-0x01FF | length, upper, lower, substring, replace |
| Type Conversion | 0x0200-0x02FF | to_int, to_float, to_string, typeof |
| Conditional | 0x0300-0x03FF | min, max, coalesce, nullif, greatest |
| Array | 0x0400-0x04FF | length, element, append, reverse, sort |
| JSON/Document | 0x0500-0x05FF | json_extract, json_set, json_merge |
| Date/Time | 0x0600-0x06FF | now, date_add, date_diff, extract |
13.11.2 Native Operations
Some operations are implemented as dedicated opcodes for maximum performance:
Arithmetic: ADD, SUB, MUL, DIV, MOD, NEG, ABS Comparison: EQ, NE, LT, LE, GT, GE Bitwise: AND, OR, XOR, NOT, SHL, SHR Logical: AND, OR, NOT (with short-circuit) Type Casting: int64 <-> float64, string <-> numeric Field Access: GET_FIELD, SET_FIELD, HAS_FIELD Array Operations: LENGTH, GET, SET, PUSH, POP String Operations: LENGTH, CONCAT, SUBSTRING, LIKE
13.11.3 Aggregate State Machine
Aggregate functions use a three-phase state machine:
Aggregate State Structure:
class AggregateState {
public:
void reset();
void accumulate(const VMValue& value);
auto finalize() -> VMValue;
void merge(const AggregateState& other);
private:
AggregateFunction function_;
int64_t count_ = 0;
int64_t sum_int64_ = 0;
double sum_double_ = 0.0;
double sum_squared_ = 0.0; // For STDDEV/VARIANCE
double mean_ = 0.0; // Welford's algorithm
VMValue min_value_, max_value_;
VMValue first_value_, last_value_;
std::string string_result_; // STRING_AGG
std::vector<VMValue> array_values_; // ARRAY_AGG
};
13.12 Cursor System
Cursors provide the interface between CVM and the storage layer.
13.12.1 Cursor Operations
// Open cursor on collection
auto open_cursor(uint8_t slot, const std::string& name,
bool is_cte = false) -> bool;
// Advance to next document
auto cursor_next(uint8_t slot) -> VMValue;
// Check if cursor has more rows
auto cursor_is_valid(uint8_t slot) const -> bool;
// Reset cursor to beginning (for nested loops)
void cursor_reset(uint8_t slot);
// Close cursor and release resources
void cursor_close(uint8_t slot);
// Move document without copying
auto cursor_take_doc(uint8_t slot) -> VMValue;
13.12.2 Cursor Lifecycle
13.12.3 Iterator Operations
For UNNEST and lateral joins, CVM provides iterators:
Array Iterator:
ITER_ARRAY_BEGIN R1, R0 // Create iterator from array in R0
ITER_ARRAY_NEXT R2, R1, done // Get next element or jump to done
// Process R2
JUMP loop
done:
ITER_ARRAY_END R1 // Clean up
Object Iterator:
ITER_OBJECT_BEGIN R1, R0 // Create iterator from document
ITER_OBJECT_NEXT_KEY R2, R1, done // Get next key
ITER_OBJECT_NEXT_VAL R3, R1 // Get value for current key
// Process key R2 and value R3
JUMP loop
done:
ITER_OBJECT_END R1
Range Iterator:
ITER_RANGE_BEGIN R1, start, end, step
ITER_RANGE_NEXT R2, R1, done // Get next value
// Process R2
JUMP loop
done:
ITER_RANGE_END R1
13.13 Error Handling
CVM provides structured error handling for runtime errors.
13.13.1 Error Structure
struct VMError {
std::string message; // Error description
uint32_t pc; // Program counter at error
uint32_t source_line; // Source line if debug info
Opcode opcode; // Instruction that failed
};
13.13.2 Exception Handling Opcodes
Extended opcodes provide try-catch semantics:
| Opcode | Operation |
|---|---|
| EXCEPTION_PUSH | Push exception handler |
| EXCEPTION_POP | Pop exception handler |
| RAISE_EXCEPTION | Raise exception |
| RERAISE | Re-raise current exception |
| GET_DIAGNOSTICS | Get exception details |
13.13.3 Error Recovery
// Check for error after each instruction
if (ctx.status() == VMStatus::kError) {
auto& error = ctx.error();
log_error("CVM Error at PC {}: {} (opcode 0x{:02X})",
error.pc, error.message,
static_cast<uint8_t>(error.opcode));
return ExecutionResult::kError;
}
13.14 Debugging and Profiling
CVM includes comprehensive debugging support.
13.14.1 Debug Operations
| Opcode | Operation |
|---|---|
| DEBUG_PRINT | Print register value |
| DEBUG_BREAK | Trigger breakpoint |
| DEBUG_TRACE | Emit trace event |
| DEBUG_DUMP | Dump VM state |
| DEBUG_ASSERT | Assert condition |
| DEBUG_PROFILE | Profile section marker |
13.14.2 Debug Callbacks
using DebugPrintCallback = std::function<void(const VMValue&)>;
using DebugTraceCallback = std::function<void(
uint16_t label_index, const VMValue& value, uint32_t pc)>;
using DebugDumpCallback = std::function<void(
uint32_t pc, std::span<const VMValue> registers)>;
using BreakpointCallback = std::function<bool(uint32_t pc)>;
13.14.3 Execution Statistics
auto instructions_executed() const -> uint64_t;
auto rows_emitted() const -> uint64_t;
auto rows_scanned() const -> uint64_t;
13.15 Summary
The Cognica Virtual Machine provides a high-performance bytecode execution environment optimized for database query processing:
-
Register-Based Architecture: 16 general-purpose + 8 floating-point registers minimize memory traffic and enable efficient optimization.
-
Rich Type System: 16 types covering SQL data types, JSON values, temporal types, and internal execution types.
-
Query-Centric Instructions: 512+ opcodes providing first-class support for aggregation, sorting, joins, window functions, and set operations.
-
Computed Goto Dispatch: Threaded interpretation delivers 10-20% better performance than switch-based dispatch.
-
Spillable Buffers: Sort, hash, and aggregation buffers automatically spill to disk when memory is exhausted.
-
Zero-Copy Joins: Composite rows eliminate deep copying during multi-table joins.
-
Comprehensive Cursor System: Efficient iteration over collections with support for nested loops and resets.
The CVM architecture strikes a balance between the flexibility of interpretation and the performance of native code, providing a clean compilation target for query optimization while maintaining reasonable execution speed.