Chapter 14: CVM Compilation Pipeline
14.1 Introduction to Query Compilation
The CVM compilation pipeline transforms SQL expressions into efficient bytecode through a series of well-defined stages. This chapter examines each stage in detail, from the initial lowering of SQL Abstract Syntax Trees to the final bytecode generation.
Query compilation occupies a critical position in database performance. While interpretation offers flexibility, compilation enables optimizations impossible in an interpreted setting—constant folding, common subexpression elimination, and register allocation all occur at compile time, reducing runtime overhead.
14.1.1 Compilation Pipeline Overview
The compilation process follows a classic three-stage compiler architecture: a lowering frontend that translates the SQL AST into IR, an optimizing middle end that rewrites the IR, and a code-generating backend that emits bytecode.
14.1.2 Design Philosophy
The compilation pipeline embodies several key principles:
- Separation of Concerns: Lowering, optimization, and code generation are independent phases with clean interfaces.
- Progressive Refinement: Each stage transforms the program into a form closer to executable bytecode.
- Optimization Composability: Passes can be added, removed, or reordered without affecting correctness.
- Debug Transparency: Source location information flows through all stages for error reporting.
14.2 Intermediate Representation
The IR (Intermediate Representation) serves as the central data structure connecting frontend and backend. CVM uses a directed-acyclic-graph (DAG) IR that naturally represents expression trees with sharing.
14.2.1 IR Node Types
All IR nodes inherit from the IRNode base class:
enum class IRNodeKind : uint8_t {
kConstant, // Literal constant value
kParameter, // Input parameter (column, bound value)
kBinaryOperation, // Binary operation (+, -, ==, <, &&)
kUnaryOperation, // Unary operation (-, NOT, ABS)
kCall, // Function call (builtin, scalar, aggregate)
kConditional, // IF-THEN-ELSE / CASE expression
kCast, // Type cast
kFieldAccess, // Document field access
kArrayAccess, // Array element access
kArrayConstruct, // Array literal construction
kCoalesce, // COALESCE expression
kNullTest // IS NULL / IS NOT NULL
};
14.2.2 Node Properties
Each IR node carries metadata essential for optimization and code generation:
class IRNode {
protected:
IRNodeKind kind_; // Node type discriminant
CVMType result_type_; // Result type (kInt64, kDouble, etc.)
bool is_nullable_; // Can result be NULL?
uint32_t node_id_; // Unique ID for CSE
std::vector<IRNodePtr> children_; // Child nodes
};
Type Information: The result_type_ field enables type-specific code generation. When the type is known at compile time, the generator emits type-specialized opcodes (e.g., kAddInt64 vs kAddFloat64).
Nullability Tracking: The is_nullable_ flag propagates through operations, enabling NULL-check elimination when values are guaranteed non-NULL.
14.2.3 Constant Nodes
Constants represent literal values known at compile time:
class IRConstant : public IRNode {
std::variant<
std::monostate, // NULL
bool, // Boolean
int64_t, // Integer
double, // Floating point
std::string // String
> value_;
};
The IRBuilder deduplicates constants—multiple references to the same constant value share a single IRConstant node.
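A minimal sketch of this interning scheme (class and member names here are hypothetical, and only int64 constants are shown):

```cpp
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical mini-IR: only int64 constants, to illustrate interning.
struct IRConstant {
    int64_t value;
};
using IRConstantPtr = std::shared_ptr<IRConstant>;

class IRBuilderSketch {
public:
    // Returns the same node for repeated values (deduplication).
    IRConstantPtr make_constant(int64_t value) {
        auto it = constants_.find(value);
        if (it != constants_.end()) return it->second;
        auto node = std::make_shared<IRConstant>(IRConstant{value});
        constants_.emplace(value, node);
        return node;
    }

private:
    std::map<int64_t, IRConstantPtr> constants_;
};
```

Repeated calls with the same value return the same shared node, so later passes can compare constants by pointer identity.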
14.2.4 Parameter Nodes
Parameters represent runtime inputs:
class IRParameter : public IRNode {
ParameterSource source_; // Column, bound value, input doc
uint32_t index_; // Parameter index
std::string name_; // Optional name for debugging
};
enum class ParameterSource {
kColumnReference, // Column from input tuple
kBoundValue, // Prepared statement parameter
kInputDocument, // Root input document
kCurrentRow // Current row in iteration
};
14.2.5 Binary Operation Nodes
Binary operations cover arithmetic, comparison, logical, and string operations:
class IRBinaryOperation : public IRNode {
BinaryOperationType op_; // Operation type
IRNodePtr left_; // Left operand
IRNodePtr right_; // Right operand
};
enum class BinaryOperationType {
// Arithmetic
kAdd, kSubtract, kMultiply, kDivide, kModulo,
// Comparison
kEqual, kNotEqual, kLessThan, kLessEqual,
kGreaterThan, kGreaterEqual,
// Logical
kLogicalAnd, kLogicalOr,
// String
kStringConcat, kLike, kILike, kRegexMatch,
// Bitwise
kBitwiseAnd, kBitwiseOr, kBitwiseXor,
kShiftLeft, kShiftRight,
// Array
kArrayContains, kArrayConcat
};
14.2.6 Function Call Nodes
Function calls support multiple calling conventions:
class IRCall : public IRNode {
CallType call_type_; // Builtin, scalar, aggregate, etc.
std::string function_name_; // Function identifier
std::vector<IRNodePtr> arguments_;
bool is_distinct_; // DISTINCT modifier
IRNodePtr filter_; // FILTER clause
bool is_volatile_; // Prevents CSE
};
enum class CallType {
kBuiltin, // Built-in function (UPPER, LENGTH)
kScalar, // User-defined scalar function
kAggregate, // Aggregate function (SUM, COUNT)
kWindow, // Window function (ROW_NUMBER, LAG)
kSubquery, // Scalar subquery
kExternal // External function call
};
The is_volatile_ flag marks non-deterministic functions (like RANDOM() or NOW()) to prevent common subexpression elimination.
14.2.7 IR Module Structure
An IR module packages the compiled expression with metadata:
class IRModule {
IRNodePtr root_; // Root expression node
std::vector<IRNodePtr> nodes_; // All nodes (for traversal)
std::string name_; // Module name
size_t parameter_count_; // Input parameter count
};
14.3 SQL Lowering
The lowering phase transforms SQL AST nodes into CVM IR nodes. This translation handles SQL-specific constructs and prepares the expression for optimization.
14.3.1 Lowering Configuration
struct SQLLoweringOptions {
bool fold_constants = true; // Fold during lowering
bool enable_cse = true; // Enable CSE later
bool preserve_names = true; // Keep column names
FunctionVolatilityChecker volatility_checker;
};
14.3.2 Column Binding
Column bindings map SQL column references to their runtime positions:
struct ColumnBinding {
uint32_t index; // Position in input tuple
std::string name; // Column name
CVMType type; // CVM type
bool is_nullable = true; // Nullability
};
For a query like SELECT * FROM users WHERE age > 21, the column bindings might be:
| Index | Name | Type | Nullable |
|---|---|---|---|
| 0 | id | kInt64 | false |
| 1 | name | kString | true |
| 2 | age | kInt64 | true |
| 3 |  | kString | true |
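Lowering resolves each column reference against these bindings. A hedged sketch of that lookup (the free function `find_binding` is illustrative; in the real code `find_binding_` is a private member of `SQLLowering`, and `CVMType` is reduced to a stub here):

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Stub of the CVM type enum, just enough for the example.
enum class CVMType { kInt64, kString };

struct ColumnBinding {
    uint32_t index;            // Position in input tuple
    std::string name;          // Column name
    CVMType type;              // CVM type
    bool is_nullable = true;   // Nullability
};

// Linear scan is fine because column lists are short; a hash map would
// only pay off for very wide tuples.
std::optional<ColumnBinding> find_binding(
        const std::vector<ColumnBinding>& bindings, const std::string& name) {
    for (const auto& b : bindings) {
        if (b.name == name) return b;
    }
    return std::nullopt;  // Unknown column: caller reports a binding error.
}
```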
14.3.3 Expression Lowering
The SQLLowering class implements a visitor pattern over SQL AST nodes:
class SQLLowering {
public:
auto lower(const Expr& expr) -> std::unique_ptr<IRModule>;
auto lower(const Expr& expr, const std::vector<ColumnBinding>& columns)
-> std::unique_ptr<IRModule>;
private:
auto lower_expr_(const Expr& expr) -> IRNodePtr;
auto lower_column_ref_(const ColumnRef& ref) -> IRNodePtr;
auto lower_constant_(const Constant& c) -> IRNodePtr;
auto lower_binary_expr_(const BinaryExpr& expr) -> IRNodePtr;
auto lower_unary_expr_(const UnaryExpr& expr) -> IRNodePtr;
auto lower_function_call_(const FunctionCall& call) -> IRNodePtr;
auto lower_case_expr_(const CaseExpr& expr) -> IRNodePtr;
auto lower_cast_expr_(const CastExpr& expr) -> IRNodePtr;
auto lower_subquery_expr_(const SubqueryExpr& expr) -> IRNodePtr;
};
14.3.4 Lowering Examples
Column Reference:
SQL: age
auto lower_column_ref_(const ColumnRef& ref) -> IRNodePtr {
auto binding = find_binding_(ref.column_name);
return builder_.make_parameter(
ParameterSource::kColumnReference,
binding.index,
binding.type,
binding.name);
}
Binary Expression:
SQL: age >= 18
auto lower_binary_expr_(const BinaryExpr& expr) -> IRNodePtr {
auto left = lower_expr_(expr.left);
auto right = lower_expr_(expr.right);
auto op = map_binary_op_(expr.op); // SQL op -> IR op
return builder_.make_binary(op, left, right);
}
CASE Expression:
SQL: CASE WHEN age >= 18 THEN 'adult' ELSE 'minor' END
auto lower_case_expr_(const CaseExpr& expr) -> IRNodePtr {
// CASE expressions become nested conditionals
auto condition = lower_expr_(expr.when_clause);
auto then_branch = lower_expr_(expr.then_result);
auto else_branch = lower_expr_(expr.else_result);
return builder_.make_conditional(condition, then_branch, else_branch);
}
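The snippet above lowers a single WHEN clause; a CASE with several WHEN arms is typically folded right to left into nested conditionals, with the ELSE result as the innermost fallback. A sketch of that folding, using strings in place of IR nodes purely to make the resulting shape visible:

```cpp
#include <string>
#include <utility>
#include <vector>

// Fold CASE WHEN c1 THEN r1 WHEN c2 THEN r2 ... ELSE e END into nested
// conditionals, starting from the ELSE result and wrapping right to left.
std::string lower_case_sketch(
        const std::vector<std::pair<std::string, std::string>>& whens,
        const std::string& else_result) {
    std::string acc = else_result;
    for (auto it = whens.rbegin(); it != whens.rend(); ++it) {
        acc = "If(" + it->first + ", " + it->second + ", " + acc + ")";
    }
    return acc;
}
```

For two WHEN arms the result has the shape `If(c1, r1, If(c2, r2, e))`, which is exactly the nested-conditional structure the real lowering builds out of IR nodes.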
14.3.5 Operator Mapping
SQL operators map to IR operations:
| SQL Operator | IR Operation |
|---|---|
| + | kAdd |
| - | kSubtract |
| * | kMultiply |
| / | kDivide |
| % | kModulo |
| = | kEqual |
| <>, != | kNotEqual |
| < | kLessThan |
| <= | kLessEqual |
| > | kGreaterThan |
| >= | kGreaterEqual |
| AND | kLogicalAnd |
| OR | kLogicalOr |
| \|\| | kStringConcat |
| LIKE | kLike |
| ILIKE | kILike |
14.3.6 Subquery Registration
Scalar subqueries are registered for runtime execution:
auto lower_subquery_expr_(const SubqueryExpr& expr) -> IRNodePtr {
// Register subquery for runtime execution
auto subquery_index = register_subquery_(expr);
// Create call node referencing the registered subquery
return builder_.make_call(
CallType::kSubquery,
"subquery_" + std::to_string(subquery_index),
{} // Arguments bound at runtime
);
}
14.4 IR Optimization
The optimization phase transforms the IR to improve execution efficiency. Optimizations are implemented as composable passes.
14.4.1 Optimization Configuration
struct OptimizerConfig {
uint8_t optimization_level = 1; // 0-3
bool enable_constant_folding = true;
bool enable_dce = true;
bool enable_cse = true;
bool enable_strength_reduction = true;
size_t max_iterations = 10;
};
enum class OptimizationLevel : uint8_t {
kNone = 0, // O0: No optimization
kBasic = 1, // O1: Constant folding + DCE
kStandard = 2, // O2: O1 + CSE + strength reduction
kAggressive = 3 // O3: O2 + advanced transforms
};
14.4.2 Pass Framework
All optimization passes inherit from a common base:
class OptimizationPass {
public:
virtual auto run(std::shared_ptr<IRModule> module)
-> OptimizationResult = 0;
virtual auto name() const -> std::string = 0;
virtual auto enabled_at_level(uint8_t level) const -> bool = 0;
};
struct OptimizationResult {
bool changed; // IR was modified
size_t nodes_removed; // Nodes eliminated
size_t nodes_added; // New nodes created
std::string pass_name; // Pass identifier
};
14.4.3 Constant Folding
Purpose: Evaluate constant expressions at compile time.
Algorithm:
- Post-order traversal of the IR DAG
- Recursively fold children first
- When all operands are constants, evaluate the operation
- Replace the subtree with an IRConstant result
Example:
Before:
BinaryOp(kMultiply)
BinaryOp(kAdd)
Constant(3)
Constant(4)
Constant(2)
After:
Constant(14)
Implementation:
auto ConstantFoldingPass::fold_(IRNodePtr node) -> IRNodePtr {
// Fold children first
for (auto& child : node->children()) {
child = fold_(child);
}
// Check if all children are constants
if (!all_constants_(node->children())) {
return node;
}
// Skip volatile functions
if (is_volatile_(node)) {
return node;
}
// Evaluate at compile time
auto result = evaluate_(node);
return builder_.make_constant(result);
}
Folded Operations:
- Arithmetic: 3 + 4 -> 7
- Comparison: 5 > 3 -> true
- String: 'Hello' || ' World' -> 'Hello World'
- Boolean: true AND false -> false
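The fold_ method above operates on the full IR; the same post-order strategy can be seen end to end in this self-contained sketch over a toy two-operator tree (all names hypothetical):

```cpp
#include <cstdint>
#include <memory>

// Minimal expression tree to demonstrate post-order folding; the real
// pass works over the IR DAG with many more node kinds.
struct Node;
using NodePtr = std::shared_ptr<Node>;

enum class Kind { kConstant, kAdd, kMultiply };

struct Node {
    Kind kind;
    int64_t value = 0;    // Valid when kind == kConstant
    NodePtr left, right;  // Valid for binary kinds
};

NodePtr make_const(int64_t v) {
    return std::make_shared<Node>(Node{Kind::kConstant, v, nullptr, nullptr});
}
NodePtr make_bin(Kind k, NodePtr l, NodePtr r) {
    return std::make_shared<Node>(Node{k, 0, std::move(l), std::move(r)});
}

// Fold children first; if both operands are constants, evaluate now.
NodePtr fold(NodePtr node) {
    if (node->kind == Kind::kConstant) return node;
    node->left = fold(node->left);
    node->right = fold(node->right);
    if (node->left->kind == Kind::kConstant &&
        node->right->kind == Kind::kConstant) {
        int64_t l = node->left->value, r = node->right->value;
        return make_const(node->kind == Kind::kAdd ? l + r : l * r);
    }
    return node;
}
```

Folding `(3 + 4) * 2` with this sketch collapses the whole tree into a single constant node, mirroring the Before/After example above.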
14.4.4 Dead Code Elimination
Purpose: Remove unreachable or unused code.
Algorithm:
- Mark live nodes reachable from root
- Count uses of each node
- Eliminate branches with dead successors
- Simplify conditionals with constant conditions
Example:
Before:
Conditional
Constant(false)
Call(expensive_function) // then branch
Parameter(x) // else branch
After:
Parameter(x)
Implementation:
auto DeadCodeEliminationPass::eliminate_(IRNodePtr node) -> IRNodePtr {
if (node->kind() == IRNodeKind::kConditional) {
auto* cond = static_cast<IRConditional*>(node.get());
// Check if condition is constant
if (auto* const_cond = as_constant_(cond->condition())) {
if (const_cond->as_bool()) {
return cond->then_branch(); // Condition always true
} else {
return cond->else_branch(); // Condition always false
}
}
}
return node;
}
14.4.5 Common Subexpression Elimination
Purpose: Deduplicate equivalent expressions.
Algorithm:
- Hash each node based on operation and operand IDs
- Find nodes with identical hash
- Perform structural equality check
- Replace duplicates with first occurrence
Example:
Before:
BinaryOp(kAdd)
BinaryOp(kMultiply)
Parameter(a)
Parameter(b)
BinaryOp(kMultiply) // Duplicate!
Parameter(a)
Parameter(b)
After:
BinaryOp(kAdd)
t1 = BinaryOp(kMultiply)
Parameter(a)
Parameter(b)
t1 // Reuse!
Safety Considerations:
CSE must respect function volatility:
auto CSEPass::can_eliminate_(const IRNode* node) const -> bool {
if (node->kind() == IRNodeKind::kCall) {
auto* call = static_cast<const IRCall*>(node);
// Don't eliminate volatile functions
return !call->is_volatile();
}
return true;
}
Volatile functions include:
- RANDOM() - non-deterministic
- NOW() - time-dependent
- NEXTVAL() - sequence-dependent
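One common way to implement the hash-and-compare steps is value numbering: each distinct expression gets a small integer, and structurally identical expressions map to the same number. A sketch with the volatility guard built in (all names hypothetical):

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Value-numbering sketch of CSE: every expression receives a number, and
// an expression whose (op, operand-number) key was seen before reuses the
// earlier number instead of getting a fresh one.
class ValueNumbering {
public:
    uint32_t constant(int64_t v) { return intern_({"const", v, 0}); }
    uint32_t parameter(int64_t index) { return intern_({"param", index, 0}); }
    uint32_t binary(const std::string& op, uint32_t l, uint32_t r) {
        return intern_({op, l, r});
    }
    // Volatile calls (RANDOM(), NOW(), ...) never enter the table, so two
    // occurrences always get distinct numbers and are never merged.
    uint32_t call(const std::string& fn, uint32_t arg, bool is_volatile) {
        if (is_volatile) return next_++;
        return intern_({fn, arg, 0});
    }

private:
    using Key = std::tuple<std::string, int64_t, int64_t>;
    uint32_t intern_(const Key& key) {
        auto it = table_.find(key);
        if (it != table_.end()) return it->second;
        return table_[key] = next_++;
    }
    std::map<Key, uint32_t> table_;
    uint32_t next_ = 0;
};
```

Two occurrences of `a + b` receive the same number (and can share one node), while two calls to a volatile function never do.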
14.4.6 Strength Reduction
Purpose: Replace expensive operations with cheaper equivalents.
Transformations:
| Original | Optimized | Condition |
|---|---|---|
| x * 2 | x + x | Always |
| x * 2^n | x << n | n is constant |
| x / 2^n | x >> n | n is constant, x unsigned |
| x % 2 | x & 1 | x non-negative |
| x * 0 | 0 | x known non-NULL |
| x * 1 | x | Always |
| x + 0 | x | Always |
| x - 0 | x | Always |
Implementation:
auto StrengthReductionPass::reduce_(IRNodePtr node) -> IRNodePtr {
if (node->kind() != IRNodeKind::kBinaryOperation) {
return node;
}
auto* binop = static_cast<IRBinaryOperation*>(node.get());
// x * 2 -> x + x
if (binop->op() == BinaryOperationType::kMultiply) {
if (is_constant_int_(binop->right(), 2)) {
return builder_.make_binary(
BinaryOperationType::kAdd,
binop->left(),
binop->left());
}
}
// x * 2^n -> x << n
if (binop->op() == BinaryOperationType::kMultiply) {
if (auto power = get_power_of_two_(binop->right())) {
return builder_.make_binary(
BinaryOperationType::kShiftLeft,
binop->left(),
builder_.make_constant(*power));
}
}
return node;
}
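The get_power_of_two_ helper referenced above is not shown in the listing; a plausible implementation uses the classic single-bit test `(v & (v - 1)) == 0` (the name and signature here are assumptions):

```cpp
#include <cstdint>
#include <optional>

// A positive value v is a power of two exactly when it has a single bit
// set, i.e. (v & (v - 1)) == 0. The returned shift amount n satisfies
// v == (1 << n), so x * v can be rewritten as x << n.
std::optional<uint8_t> get_power_of_two(int64_t v) {
    if (v <= 0 || (v & (v - 1)) != 0) return std::nullopt;
    uint8_t n = 0;
    while ((int64_t{1} << n) != v) ++n;
    return n;
}
```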
14.4.7 Register Allocation
Purpose: Assign virtual registers to IR values.
Algorithm: Linear scan with graph coloring hints
- Liveness Analysis: Compute live ranges for each node
- Interference Graph: Build graph where edges indicate simultaneous liveness
- Graph Coloring: Assign registers (colors) minimizing conflicts
- Coalescing: Merge nodes with no interference
- Spilling: Push excess values to stack
Configuration:
struct RegisterAllocationConfig {
uint32_t num_general_registers = 16;
uint32_t num_float_registers = 8;
bool enable_coalescing = true;
};
Output:
// Maps IR nodes to allocated registers
std::unordered_map<const IRNode*, uint8_t> allocation_map_;
14.4.8 Optimization Pipeline
The optimizer runs passes iteratively until fixed point:
auto Optimizer::optimize(std::shared_ptr<IRModule> module)
-> std::shared_ptr<IRModule> {
for (size_t iter = 0; iter < config_.max_iterations; ++iter) {
auto any_changed = false;
for (auto& pass : passes_) {
if (!pass->enabled_at_level(config_.optimization_level)) {
continue;
}
auto result = pass->run(module);
if (result.changed) {
any_changed = true;
stats_.nodes_removed += result.nodes_removed;
}
}
if (!any_changed) {
break; // Fixed point reached
}
}
return module;
}
14.5 Code Generation
The code generation phase transforms optimized IR into CVM bytecode.
14.5.1 Generator Configuration
struct BytecodeGenOptions {
bool optimize_constants = true; // Deduplicate constant loads
bool optimize_registers = true; // Minimize register usage
bool emit_debug_info = true; // Source mapping
uint8_t max_registers = 16; // Available registers
};
14.5.2 Code Generation Algorithm
The generator uses a depth-first visitor pattern:
class BytecodeGenerator {
public:
auto generate(const IRModule& module) -> std::unique_ptr<BytecodeModule>;
private:
// Returns register containing result
auto visit_(const IRNode& node) -> uint8_t;
auto visit_constant_(const IRConstant& node) -> uint8_t;
auto visit_parameter_(const IRParameter& node) -> uint8_t;
auto visit_binary_operation_(const IRBinaryOperation& node) -> uint8_t;
auto visit_unary_operation_(const IRUnaryOperation& node) -> uint8_t;
auto visit_call_(const IRCall& node) -> uint8_t;
auto visit_conditional_(const IRConditional& node) -> uint8_t;
auto visit_cast_(const IRCast& node) -> uint8_t;
auto visit_field_access_(const IRFieldAccess& node) -> uint8_t;
};
14.5.3 Register Management
The generator maintains a register allocator:
class BytecodeGenerator {
private:
// Register allocation state
std::bitset<16> registers_in_use_;
std::deque<uint8_t> allocation_order_; // FIFO for eviction
auto allocate_register_() -> uint8_t;
auto allocate_temporary_() -> uint8_t;
void free_register_(uint8_t reg);
auto evict_register_() -> uint8_t; // Spill to stack
};
FIFO Eviction: When all registers are in use, the oldest allocated register is evicted to the stack:
auto BytecodeGenerator::evict_register_() -> uint8_t {
auto reg = allocation_order_.front();
allocation_order_.pop_front();
// Spill to stack
emit_(Opcode::kStoreReg, reg, next_stack_slot_++);
return reg;
}
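Putting allocation and eviction together, a stripped-down FIFO allocator might look like this sketch (4 registers instead of 16, and the spill is reported to the caller rather than emitted as a store instruction):

```cpp
#include <bitset>
#include <cstdint>
#include <deque>

// Sketch of the FIFO discipline above, minus the actual spill emission.
class FifoRegisterAllocator {
public:
    // Returns the allocated register; sets `spilled` to the register that
    // had to be evicted, or -1 if a free register was available.
    uint8_t allocate(int& spilled) {
        spilled = -1;
        for (uint8_t r = 0; r < kNumRegisters; ++r) {
            if (!in_use_.test(r)) {
                in_use_.set(r);
                order_.push_back(r);
                return r;
            }
        }
        // All busy: evict the oldest allocation (front of the FIFO).
        uint8_t victim = order_.front();
        order_.pop_front();
        order_.push_back(victim);  // The victim is immediately reused.
        spilled = victim;
        return victim;
    }
    void free(uint8_t r) {
        in_use_.reset(r);
        for (auto it = order_.begin(); it != order_.end(); ++it) {
            if (*it == r) { order_.erase(it); break; }
        }
    }

private:
    static constexpr uint8_t kNumRegisters = 4;  // 16 in the real generator
    std::bitset<kNumRegisters> in_use_;
    std::deque<uint8_t> order_;
};
```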
14.5.4 Type-Aware Opcode Selection
The generator selects type-specific opcodes when types are known:
auto BytecodeGenerator::select_comparison_opcode_(
BinaryOperationType op, CVMType type) -> Opcode {
if (type == CVMType::kInt64) {
switch (op) {
case BinaryOperationType::kEqual:
return Opcode::kCompareEqualInt64;
case BinaryOperationType::kLessThan:
return Opcode::kCompareLessThanInt64;
// ...
}
} else if (type == CVMType::kDouble) {
switch (op) {
case BinaryOperationType::kEqual:
return Opcode::kCompareEqualFloat64;
// ...
}
} else if (type == CVMType::kString) {
switch (op) {
case BinaryOperationType::kEqual:
return Opcode::kCompareEqualString;
// ...
}
}
// Fallback to polymorphic
return Opcode::kCompareEqualPolymorphic;
}
14.5.5 Constant Pool Management
Constants are deduplicated in the constant pool:
class BytecodeGenerator {
private:
ConstantPool constant_pool_;
// Deduplication maps
std::unordered_map<std::string, uint16_t> string_map_;
std::unordered_map<int64_t, uint16_t> int64_map_;
std::unordered_map<uint64_t, uint16_t> double_map_; // Bit pattern
auto add_string_constant_(const std::string& str) -> uint16_t {
auto it = string_map_.find(str);
if (it != string_map_.end()) {
return it->second;
}
auto index = constant_pool_.add_string(str);
string_map_[str] = index;
return index;
}
};
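The comment on double_map_ deserves emphasis: keying doubles by their bit pattern sidesteps two floating-point equality pitfalls. A small demonstration (the helper name is illustrative):

```cpp
#include <cstdint>
#include <cstring>

// Why key doubles by bit pattern: operator== treats +0.0 and -0.0 as
// equal (they would wrongly share a pool slot even though they are
// distinct values) and treats NaN as unequal to itself (a NaN constant
// could never be found again). Copying the raw bits avoids both problems;
// std::bit_cast does the same thing in C++20.
uint64_t double_key(double d) {
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof(bits));
    return bits;
}
```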
14.5.6 Jump Target Resolution
Jumps use PC-relative offsets that are patched after code generation:
struct PendingJump {
uint32_t instruction_offset; // Where to patch
std::string target_label; // Target label
};
void BytecodeGenerator::emit_jump_(const std::string& label) {
auto offset = current_offset_();
// Emit placeholder
emit_instruction_(encode_format_c(
Opcode::kJump, 0, 0)); // offset = 0 (placeholder)
// Record for later patching
pending_jumps_.push_back({offset, label});
}
void BytecodeGenerator::resolve_jumps_() {
    for (const auto& jump : pending_jumps_) {
        auto target = label_offsets_[jump.target_label];
        // Signed arithmetic: backward jumps produce negative offsets,
        // which unsigned subtraction would silently wrap.
        auto relative = (static_cast<int32_t>(target) -
                         static_cast<int32_t>(jump.instruction_offset)) / 4;
        module_->patch_jump_offset(jump.instruction_offset, relative);
    }
}
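The patching arithmetic can be exercised in isolation. This sketch mirrors the structures above but returns the computed instruction-unit offsets instead of writing them into a module (assuming 4-byte instructions, as in the listing):

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Byte offset of the jump instruction plus the label it targets.
struct PendingJump {
    uint32_t instruction_offset;
    std::string target_label;
};

// Instructions are 4 bytes wide, so relative offsets are stored in
// instruction units: (target byte - jump byte) / 4. Signed arithmetic
// keeps backward jumps negative.
std::vector<int32_t> resolve_jumps(
        const std::vector<PendingJump>& jumps,
        const std::map<std::string, uint32_t>& label_offsets) {
    std::vector<int32_t> relatives;
    for (const auto& j : jumps) {
        int64_t target = label_offsets.at(j.target_label);
        relatives.push_back(
            static_cast<int32_t>((target - j.instruction_offset) / 4));
    }
    return relatives;
}
```

A forward jump from byte 8 to a label at byte 20 patches to +3 instructions; a backward jump from byte 16 to byte 4 patches to -3.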
14.6 Bytecode Module Structure
The compiled bytecode is packaged in a BytecodeModule with metadata.
14.6.1 Module Header
struct BytecodeHeader {
uint32_t magic; // 0x304D5643 ("CVM0")
uint16_t version_major; // Major version
uint16_t version_minor; // Minor version
uint32_t flags; // Module flags
uint32_t constant_pool_offset; // Constant pool location
uint32_t constant_pool_size; // Constant pool size
uint32_t code_offset; // Code section location
uint32_t code_size; // Code section size
uint32_t debug_offset; // Debug info location
uint32_t debug_size; // Debug info size
uint32_t entry_point; // Entry point offset
uint64_t expression_hash; // For caching
uint8_t max_registers; // GPRs used
uint8_t max_float_registers; // FPRs used
uint8_t stack_depth; // Stack depth required
uint8_t reserved[13]; // Future expansion
};
static_assert(sizeof(BytecodeHeader) == 64);
14.6.2 Module Flags
enum class ModuleFlags : uint32_t {
kNone = 0,
kDebugInfo = 1 << 0, // Contains debug info
kOptimized = 1 << 1, // Has been optimized
kJitReady = 1 << 2, // Prepared for JIT
kSourceMapped = 1 << 3 // Contains source mapping
};
14.6.3 Constant Pool
The constant pool stores all constants referenced by bytecode:
enum class ConstantType : uint8_t {
kNull = 0, // NULL value
kBool = 1, // Boolean
kInt64 = 2, // 64-bit integer
kDouble = 3, // 64-bit float
kString = 4, // String (string table index)
kFieldRef = 5, // Field name reference
kFuncRef = 6, // Function reference
kTypeId = 7, // Type identifier
kArray = 8 // Array of values
};
14.6.4 Debug Information
Debug info maps bytecode offsets to source locations:
struct DebugLocation {
uint32_t code_offset; // Bytecode offset
uint32_t source_line; // Source line (1-based)
uint32_t source_column; // Source column (1-based)
uint32_t source_length; // Source span length
};
struct DebugInfo {
std::string source_name; // Source identifier
std::string source_text; // Original source
std::vector<DebugLocation> locations; // Offset mapping
};
14.6.5 Module Layout
On disk, the sections appear in the order given by the header offsets: the 64-byte header first, followed by the constant pool, the code section, and the optional debug information.
14.7 Compilation Examples
14.7.1 Simple Comparison: age >= 18
SQL AST:
BinaryExpr(GreaterEqual,
ColumnRef("age"),
Constant(18))
Lowered IR:
IRBinaryOperation(kGreaterEqual,
IRParameter(index=0, name="age", type=kInt64),
IRConstant(18))
Generated Bytecode:
LOAD_PARAM R0, 0 ; R0 = age
LOAD_CONST R1, pool[0] ; R1 = 18
CMP_GE_I64 R2, R0, R1 ; R2 = (R0 >= R1)
RET_VALUE R2 ; return R2
Bytecode Encoding:
0x0D 0x00 0x00 0x00 ; LOAD_PARAM R0, idx=0
0x07 0x10 0x00 0x00 ; LOAD_CONST R1, pool[0]
0x45 0x20 0x01 0x00 ; CMP_GE_I64 R2, R0, R1
0x78 0x20 0x00 0x00 ; RET_VALUE R2
14.7.2 Conditional: CASE WHEN age >= 18 THEN 'adult' ELSE 'minor' END
Lowered IR:
IRConditional(
condition=IRBinaryOperation(kGreaterEqual, ...),
then_branch=IRConstant("adult"),
else_branch=IRConstant("minor"))
Generated Bytecode:
LOAD_PARAM R0, 0 ; R0 = age
LOAD_CONST R1, pool[0] ; R1 = 18
CMP_GE_I64 R2, R0, R1 ; R2 = condition
JUMP_FALSE R2, else_label ; if false, goto else
LOAD_CONST R3, pool[1] ; R3 = "adult"
JUMP end_label
else_label:
LOAD_CONST R3, pool[2] ; R3 = "minor"
end_label:
RET_VALUE R3 ; return result
14.7.3 Constant Folding: (100 + 50) * 2
Before Optimization:
IRBinaryOperation(kMultiply,
IRBinaryOperation(kAdd,
IRConstant(100),
IRConstant(50)),
IRConstant(2))
After Constant Folding:
IRConstant(300)
Generated Bytecode (Optimized):
MOVE_I64 R0, 300 ; R0 = 300 (folded constant)
RET_VALUE R0
14.7.4 CSE: (a + b) + (a + b)
Before CSE:
IRBinaryOperation(kAdd,
IRBinaryOperation(kAdd, ; First (a + b)
IRParameter(a),
IRParameter(b)),
IRBinaryOperation(kAdd, ; Second (a + b) - duplicate!
IRParameter(a),
IRParameter(b)))
After CSE (DAG with sharing):
t1 = IRBinaryOperation(kAdd,
IRParameter(a),
IRParameter(b))
IRBinaryOperation(kAdd, t1, t1) ; Reuse t1
Generated Bytecode:
LOAD_PARAM R0, 0 ; R0 = a
LOAD_PARAM R1, 1 ; R1 = b
ADD_I64 R2, R0, R1 ; R2 = a + b (computed once)
ADD_I64 R3, R2, R2 ; R3 = R2 + R2 (reused)
RET_VALUE R3
14.7.5 Function Call: UPPER(name)
Lowered IR:
IRCall(
call_type=kBuiltin,
function_name="upper",
arguments=[IRParameter(index=1, name="name")])
Generated Bytecode:
LOAD_PARAM R0, 1 ; R0 = name
CALL_BUILTIN R1, funcid=42 ; R1 = upper(R0)
RET_VALUE R1
14.8 Compiler Integration
14.8.1 Compiler Class
The Compiler class provides the main compilation interface:
class Compiler {
public:
// Main entry points
auto compile_sql_expression(const sql::ast::Expr& expr)
-> CompilationResult;
auto compile_sql_expression(
const sql::ast::Expr& expr,
const std::vector<ColumnBinding>& bindings) -> CompilationResult;
// Phase access for testing
auto lower_sql_expression(const sql::ast::Expr& expr)
-> std::shared_ptr<IRModule>;
auto optimize_ir(std::shared_ptr<IRModule> module)
-> std::shared_ptr<IRModule>;
auto generate_bytecode(const IRModule& module)
-> std::unique_ptr<BytecodeModule>;
private:
CompilerOptions options_;
std::unique_ptr<CompilationCache> cache_;
};
14.8.2 Compiler Options
struct CompilerOptions {
// Optimization control
OptimizationLevel optimization_level = OptimizationLevel::kBasic;
bool fold_constants = true;
bool eliminate_dead_code = true;
bool enable_cse = true;
bool strength_reduction = true;
bool register_coalescing = true;
// Debug output
bool generate_debug_info = false;
bool preserve_names = false;
bool trace_compilation = false;
// Code generation
bool use_extended_opcodes = true;
uint32_t max_registers = 16;
// Cache
bool enable_cache = true;
size_t cache_max_entries = 1024;
};
14.8.3 Compilation Result
struct CompilationResult {
bool success;
std::unique_ptr<BytecodeModule> module;
std::string error_message;
uint32_t error_line;
// Registered subqueries
std::vector<const sql::ast::SubqueryExpr*> subqueries;
// Statistics
size_t ir_nodes_before_opt;
size_t ir_nodes_after_opt;
size_t bytecode_size;
double compilation_time_ms;
};
14.8.4 Compilation Cache
The compiler caches compiled modules to avoid recompilation:
struct CompilationCacheKey {
uint64_t hash; // Expression hash
std::string source_type; // "sql", "filter", etc.
};
class CompilationCache {
public:
auto lookup(const CompilationCacheKey& key) const
-> const BytecodeModule*;
void insert(const CompilationCacheKey& key,
std::unique_ptr<BytecodeModule> module);
void clear();
auto hit_count() const -> uint64_t;
auto miss_count() const -> uint64_t;
auto hit_rate() const -> double;
private:
std::unordered_map<uint64_t, std::unique_ptr<BytecodeModule>> cache_;
mutable uint64_t hits_ = 0;
mutable uint64_t misses_ = 0;
};
Cache Effectiveness:
- Same expression: 100% hit rate
- Similar expressions (different literals): Cache miss
- Typical mixed workload: 60-85% hit rate
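A stripped-down version of the cache with the same hit/miss accounting (std::string stands in for the compiled BytecodeModule, and eviction is omitted):

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Minimal sketch of CompilationCache: lookup counts hits and misses so
// hit_rate() can report cache effectiveness.
class MiniCompilationCache {
public:
    const std::string* lookup(uint64_t hash) {
        auto it = cache_.find(hash);
        if (it == cache_.end()) { ++misses_; return nullptr; }
        ++hits_;
        return &it->second;
    }
    void insert(uint64_t hash, std::string module) {
        cache_.emplace(hash, std::move(module));
    }
    double hit_rate() const {
        uint64_t total = hits_ + misses_;
        return total == 0 ? 0.0 : static_cast<double>(hits_) / total;
    }

private:
    std::unordered_map<uint64_t, std::string> cache_;
    uint64_t hits_ = 0, misses_ = 0;
};
```

Because the key is an expression hash, two textually different expressions that hash identically would collide; the real cache also compares a source-type tag (and the full key) before declaring a hit.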
14.9 Performance Characteristics
14.9.1 Compilation Time
| Expression Complexity | Typical Time |
|---|---|
| Simple comparison (a > 5) | 50-100 us |
| Medium (a > 5 AND b < 10) | 100-200 us |
| Complex (10+ operations) | 200-500 us |
| Very complex (subqueries) | 500-1000 us |
14.9.2 Bytecode Size
| Expression | Instructions | Module Size |
|---|---|---|
| age >= 18 | 4 | ~100 bytes |
| a AND b AND c | 8-10 | ~150 bytes |
| Complex CASE | 15-20 | ~250 bytes |
14.9.3 Optimization Impact
| Optimization | Typical Reduction |
|---|---|
| Constant Folding | 10-30% fewer nodes |
| Dead Code Elimination | 5-15% fewer nodes |
| CSE | 5-20% fewer nodes |
| Strength Reduction | 0-5% faster execution |
14.10 Summary
The CVM compilation pipeline transforms SQL expressions into efficient bytecode through a well-structured sequence of phases:
- Lowering: SQL AST nodes are translated to IR nodes, establishing type information and column bindings.
- Optimization: Multiple passes improve the IR:
  - Constant folding evaluates compile-time expressions
  - Dead code elimination removes unreachable code
  - CSE deduplicates equivalent subexpressions
  - Strength reduction replaces expensive operations
  - Register allocation assigns virtual registers
- Code Generation: The optimized IR is translated to bytecode with type-aware opcode selection and constant pool management.
- Module Packaging: The bytecode, constant pool, and metadata are packaged into a BytecodeModule for execution.
The compilation pipeline balances compilation speed against execution performance, with configurable optimization levels allowing users to choose the appropriate trade-off for their workload.