Chapter 14: CVM Compilation Pipeline

14.1 Introduction to Query Compilation

The CVM compilation pipeline transforms SQL expressions into efficient bytecode through a series of well-defined stages. This chapter examines each stage in detail, from the initial lowering of SQL Abstract Syntax Trees to the final bytecode generation.

Query compilation plays a critical role in database performance. While interpretation offers flexibility, compilation enables optimizations that are impossible in an interpreted setting: constant folding, common subexpression elimination, and register allocation all occur at compile time, reducing runtime overhead.

14.1.1 Compilation Pipeline Overview

The compilation process follows a classic three-stage compiler architecture:

SQL AST --(lowering)--> IR --(optimization)--> optimized IR --(code generation)--> bytecode

14.1.2 Design Philosophy

The compilation pipeline embodies several key principles:

  1. Separation of Concerns: Lowering, optimization, and code generation are independent phases with clean interfaces.

  2. Progressive Refinement: Each stage transforms the program into a form closer to executable bytecode.

  3. Optimization Composability: Passes can be added, removed, or reordered without affecting correctness.

  4. Debug Transparency: Source location information flows through all stages for error reporting.

14.2 Intermediate Representation

The IR (Intermediate Representation) serves as the central data structure connecting frontend and backend. CVM uses a DAG-based (Directed Acyclic Graph) IR that naturally represents expression trees with sharing.

14.2.1 IR Node Types

All IR nodes inherit from the IRNode base class:

enum class IRNodeKind : uint8_t {
  kConstant,         // Literal constant value
  kParameter,        // Input parameter (column, bound value)
  kBinaryOperation,  // Binary operation (+, -, ==, <, &&)
  kUnaryOperation,   // Unary operation (-, NOT, ABS)
  kCall,             // Function call (builtin, scalar, aggregate)
  kConditional,      // IF-THEN-ELSE / CASE expression
  kCast,             // Type cast
  kFieldAccess,      // Document field access
  kArrayAccess,      // Array element access
  kArrayConstruct,   // Array literal construction
  kCoalesce,         // COALESCE expression
  kNullTest          // IS NULL / IS NOT NULL
};

14.2.2 Node Properties

Each IR node carries metadata essential for optimization and code generation:

class IRNode {
protected:
  IRNodeKind kind_;           // Node type discriminant
  CVMType result_type_;       // Result type (kInt64, kDouble, etc.)
  bool is_nullable_;          // Can result be NULL?
  uint32_t node_id_;          // Unique ID for CSE
  std::vector<IRNodePtr> children_;  // Child nodes
};

Type Information: The result_type_ field enables type-specific code generation. When the type is known at compile time, the generator emits type-specialized opcodes (e.g., kAddInt64 vs kAddFloat64).

Nullability Tracking: The is_nullable_ flag propagates through operations, enabling NULL-check elimination when values are guaranteed non-NULL.
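The propagation rule can be sketched as follows. This is a minimal illustration (the function name and signature are not the actual IRNode API): most operations can produce NULL only if some operand can, while IS NULL / IS NOT NULL always yields a non-NULL boolean.

```cpp
#include <algorithm>
#include <vector>

// Hedged sketch of nullability propagation. For an ordinary operation the
// result can be NULL only if at least one child can be NULL; a kNullTest
// node always produces a non-NULL boolean regardless of its operand.
bool propagate_nullability(bool is_null_test,
                           const std::vector<bool>& child_nullable) {
  if (is_null_test) {
    return false;  // IS NULL / IS NOT NULL never returns NULL
  }
  return std::any_of(child_nullable.begin(), child_nullable.end(),
                     [](bool n) { return n; });
}
```

With this rule, an expression such as `id + 1` over the non-nullable `id` column is marked non-nullable, letting the generator skip the runtime NULL check.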

14.2.3 Constant Nodes

Constants represent literal values known at compile time:

class IRConstant : public IRNode {
  std::variant<
    std::monostate,    // NULL
    bool,              // Boolean
    int64_t,           // Integer
    double,            // Floating point
    std::string        // String
  > value_;
};

The IRBuilder deduplicates constants—multiple references to the same constant value share a single IRConstant node.
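One way the builder can achieve this sharing is to key a cache on the constant's value; the names below are illustrative, not the actual IRBuilder interface:

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <variant>

// Value type mirroring the IRConstant alternatives shown above.
using ConstValue =
    std::variant<std::monostate, bool, int64_t, double, std::string>;

struct ConstNode {
  ConstValue value;
};

// Minimal sketch of constant deduplication: repeated requests for the same
// value return the same shared node.
class ConstCache {
 public:
  std::shared_ptr<ConstNode> make_constant(const ConstValue& v) {
    auto it = cache_.find(v);
    if (it != cache_.end()) return it->second;  // reuse existing node
    auto node = std::make_shared<ConstNode>(ConstNode{v});
    cache_.emplace(v, node);
    return node;
  }

 private:
  // std::variant provides ordering when all alternatives do, so it can
  // serve directly as a std::map key.
  std::map<ConstValue, std::shared_ptr<ConstNode>> cache_;
};
```

Deduplication keeps the IR a true DAG even before CSE runs, and shrinks the constant pool emitted during code generation.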

14.2.4 Parameter Nodes

Parameters represent runtime inputs:

class IRParameter : public IRNode {
  ParameterSource source_;    // Column, bound value, input doc
  uint32_t index_;            // Parameter index
  std::string name_;          // Optional name for debugging
};

enum class ParameterSource {
  kColumnReference,   // Column from input tuple
  kBoundValue,        // Prepared statement parameter
  kInputDocument,     // Root input document
  kCurrentRow         // Current row in iteration
};

14.2.5 Binary Operation Nodes

Binary operations cover arithmetic, comparison, logical, and string operations:

class IRBinaryOperation : public IRNode {
  BinaryOperationType op_;    // Operation type
  IRNodePtr left_;            // Left operand
  IRNodePtr right_;           // Right operand
};

enum class BinaryOperationType {
  // Arithmetic
  kAdd, kSubtract, kMultiply, kDivide, kModulo,

  // Comparison
  kEqual, kNotEqual, kLessThan, kLessEqual,
  kGreaterThan, kGreaterEqual,

  // Logical
  kLogicalAnd, kLogicalOr,

  // String
  kStringConcat, kLike, kILike, kRegexMatch,

  // Bitwise
  kBitwiseAnd, kBitwiseOr, kBitwiseXor,
  kShiftLeft, kShiftRight,

  // Array
  kArrayContains, kArrayConcat
};

14.2.6 Function Call Nodes

Function calls support multiple calling conventions:

class IRCall : public IRNode {
  CallType call_type_;              // Builtin, scalar, aggregate, etc.
  std::string function_name_;       // Function identifier
  std::vector<IRNodePtr> arguments_;
  bool is_distinct_;                // DISTINCT modifier
  IRNodePtr filter_;                // FILTER clause
  bool is_volatile_;                // Prevents CSE
};

enum class CallType {
  kBuiltin,     // Built-in function (UPPER, LENGTH)
  kScalar,      // User-defined scalar function
  kAggregate,   // Aggregate function (SUM, COUNT)
  kWindow,      // Window function (ROW_NUMBER, LAG)
  kSubquery,    // Scalar subquery
  kExternal     // External function call
};

The is_volatile_ flag marks non-deterministic functions (like RANDOM() or NOW()) to prevent common subexpression elimination.

14.2.7 IR Module Structure

An IR module packages the compiled expression with metadata:

class IRModule {
  IRNodePtr root_;                  // Root expression node
  std::vector<IRNodePtr> nodes_;    // All nodes (for traversal)
  std::string name_;                // Module name
  size_t parameter_count_;          // Input parameter count
};

14.3 SQL Lowering

The lowering phase transforms SQL AST nodes into CVM IR nodes. This translation handles SQL-specific constructs and prepares the expression for optimization.

14.3.1 Lowering Configuration

struct SQLLoweringOptions {
  bool fold_constants = true;       // Fold during lowering
  bool enable_cse = true;           // Enable CSE later
  bool preserve_names = true;       // Keep column names
  FunctionVolatilityChecker volatility_checker;
};

14.3.2 Column Binding

Column bindings map SQL column references to their runtime positions:

struct ColumnBinding {
  uint32_t index;          // Position in input tuple
  std::string name;        // Column name
  CVMType type;            // CVM type
  bool is_nullable = true; // Nullability
};

For a query like SELECT * FROM users WHERE age > 21, the column bindings might be:

  Index  Name   Type     Nullable
  0      id     kInt64   false
  1      name   kString  true
  2      age    kInt64   true
  3      email  kString  true
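The users-table bindings above can be built and searched as a lowering pass might do it; the types here are illustrative stand-ins for the chapter's definitions:

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Illustrative stand-ins for the chapter's types.
enum class CVMType { kInt64, kString };

struct ColumnBinding {
  uint32_t index;
  std::string name;
  CVMType type;
  bool is_nullable = true;
};

// Bindings for the users table from the example above.
std::vector<ColumnBinding> users_bindings() {
  return {{0, "id", CVMType::kInt64, false},
          {1, "name", CVMType::kString, true},
          {2, "age", CVMType::kInt64, true},
          {3, "email", CVMType::kString, true}};
}

// Name lookup as the lowering phase might perform it; a linear scan is
// adequate for the handful of columns in a tuple.
std::optional<ColumnBinding> find_binding(
    const std::vector<ColumnBinding>& bindings, const std::string& name) {
  for (const auto& b : bindings) {
    if (b.name == name) return b;
  }
  return std::nullopt;
}
```

Lowering `age > 21` would resolve `age` to index 2 with type kInt64, which is exactly what the IRParameter node records.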

14.3.3 Expression Lowering

The SQLLowering class implements a visitor pattern over SQL AST nodes:

class SQLLowering {
public:
  auto lower(const Expr& expr) -> std::unique_ptr<IRModule>;
  auto lower(const Expr& expr, const std::vector<ColumnBinding>& columns)
      -> std::unique_ptr<IRModule>;

private:
  auto lower_expr_(const Expr& expr) -> IRNodePtr;
  auto lower_column_ref_(const ColumnRef& ref) -> IRNodePtr;
  auto lower_constant_(const Constant& c) -> IRNodePtr;
  auto lower_binary_expr_(const BinaryExpr& expr) -> IRNodePtr;
  auto lower_unary_expr_(const UnaryExpr& expr) -> IRNodePtr;
  auto lower_function_call_(const FunctionCall& call) -> IRNodePtr;
  auto lower_case_expr_(const CaseExpr& expr) -> IRNodePtr;
  auto lower_cast_expr_(const CastExpr& expr) -> IRNodePtr;
  auto lower_subquery_expr_(const SubqueryExpr& expr) -> IRNodePtr;
};

14.3.4 Lowering Examples

Column Reference:

SQL: age

auto lower_column_ref_(const ColumnRef& ref) -> IRNodePtr {
  auto binding = find_binding_(ref.column_name);
  return builder_.make_parameter(
      ParameterSource::kColumnReference,
      binding.index,
      binding.type,
      binding.name);
}

Binary Expression:

SQL: age >= 18

auto lower_binary_expr_(const BinaryExpr& expr) -> IRNodePtr {
  auto left = lower_expr_(expr.left);
  auto right = lower_expr_(expr.right);
  auto op = map_binary_op_(expr.op);  // SQL op -> IR op
  return builder_.make_binary(op, left, right);
}

CASE Expression:

SQL: CASE WHEN age >= 18 THEN 'adult' ELSE 'minor' END

auto lower_case_expr_(const CaseExpr& expr) -> IRNodePtr {
  // CASE expressions become nested conditionals; a single WHEN arm is
  // shown here for brevity
  auto condition = lower_expr_(expr.when_clause);
  auto then_branch = lower_expr_(expr.then_result);
  auto else_branch = lower_expr_(expr.else_result);
  return builder_.make_conditional(condition, then_branch, else_branch);
}
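A CASE with several WHEN arms lowers to the same conditional node by folding the arms right to left. The sketch below uses an illustrative node type rather than the real IR classes:

```cpp
#include <memory>
#include <utility>
#include <vector>

// Illustrative conditional node: (cond ? then_branch : else_branch).
struct Node;
using NodePtr = std::shared_ptr<Node>;
struct Node {
  NodePtr cond, then_branch, else_branch;  // all null for leaves
};

// Fold WHEN arms right-to-left:
//   CASE WHEN c1 THEN t1 WHEN c2 THEN t2 ELSE e END
// becomes (c1 ? t1 : (c2 ? t2 : e)).
NodePtr lower_case(std::vector<std::pair<NodePtr, NodePtr>> arms,
                   NodePtr else_branch) {
  NodePtr result = std::move(else_branch);
  for (auto it = arms.rbegin(); it != arms.rend(); ++it) {
    result = std::make_shared<Node>(
        Node{it->first, it->second, std::move(result)});
  }
  return result;
}
```

The innermost else-branch is the CASE's ELSE expression (or a NULL constant when ELSE is omitted, per SQL semantics).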

14.3.5 Operator Mapping

SQL operators map to IR operations:

SQL Operator    IR Operation
+               kAdd
-               kSubtract
*               kMultiply
/               kDivide
%               kModulo
=               kEqual
<>, !=          kNotEqual
<               kLessThan
<=              kLessEqual
>               kGreaterThan
>=              kGreaterEqual
AND             kLogicalAnd
OR              kLogicalOr
||              kStringConcat
LIKE            kLike
ILIKE           kILike

14.3.6 Subquery Registration

Scalar subqueries are registered for runtime execution:

auto lower_subquery_expr_(const SubqueryExpr& expr) -> IRNodePtr {
  // Register subquery for runtime execution
  auto subquery_index = register_subquery_(expr);

  // Create call node referencing the registered subquery
  return builder_.make_call(
      CallType::kSubquery,
      "subquery_" + std::to_string(subquery_index),
      {}  // Arguments bound at runtime
  );
}

14.4 IR Optimization

The optimization phase transforms the IR to improve execution efficiency. Optimizations are implemented as composable passes.

14.4.1 Optimization Configuration

struct OptimizerConfig {
  uint8_t optimization_level = 1;      // 0-3
  bool enable_constant_folding = true;
  bool enable_dce = true;
  bool enable_cse = true;
  bool enable_strength_reduction = true;
  size_t max_iterations = 10;
};

enum class OptimizationLevel : uint8_t {
  kNone = 0,       // O0: No optimization
  kBasic = 1,      // O1: Constant folding + DCE
  kStandard = 2,   // O2: O1 + CSE + strength reduction
  kAggressive = 3  // O3: O2 + advanced transforms
};

14.4.2 Pass Framework

All optimization passes inherit from a common base:

class OptimizationPass {
public:
  virtual auto run(std::shared_ptr<IRModule> module)
      -> OptimizationResult = 0;
  virtual auto name() const -> std::string = 0;
  virtual auto enabled_at_level(uint8_t level) const -> bool = 0;
};

struct OptimizationResult {
  bool changed;              // IR was modified
  size_t nodes_removed;      // Nodes eliminated
  size_t nodes_added;        // New nodes created
  std::string pass_name;     // Pass identifier
};

14.4.3 Constant Folding

Purpose: Evaluate constant expressions at compile time.

Algorithm:

  1. Post-order traversal of IR DAG
  2. Recursively fold children first
  3. When all operands are constants, evaluate operation
  4. Replace subtree with IRConstant result

Example:

Before:

BinaryOp(kMultiply)
  BinaryOp(kAdd)
    Constant(3)
    Constant(4)
  Constant(2)

After:

Constant(14)

Implementation:

auto ConstantFoldingPass::fold_(IRNodePtr node) -> IRNodePtr {
  // Fold children first
  for (auto& child : node->children()) {
    child = fold_(child);
  }

  // Check if all children are constants
  if (!all_constants_(node->children())) {
    return node;
  }

  // Skip volatile functions
  if (is_volatile_(node)) {
    return node;
  }

  // Evaluate at compile time
  auto result = evaluate_(node);
  return builder_.make_constant(result);
}

Folded Operations:

  • Arithmetic: 3 + 4 -> 7
  • Comparison: 5 > 3 -> true
  • String: 'Hello' || ' World' -> 'Hello World'
  • Boolean: true AND false -> false

14.4.4 Dead Code Elimination

Purpose: Remove unreachable or unused code.

Algorithm:

  1. Mark live nodes reachable from root
  2. Count uses of each node
  3. Eliminate branches with dead successors
  4. Simplify conditionals with constant conditions

Example:

Before:

Conditional
  Constant(false)
  Call(expensive_function)  // then branch
  Parameter(x)              // else branch

After:

Parameter(x)

Implementation:

auto DeadCodeEliminationPass::eliminate_(IRNodePtr node) -> IRNodePtr {
  if (node->kind() == IRNodeKind::kConditional) {
    auto* cond = static_cast<IRConditional*>(node.get());

    // Check if condition is constant
    if (auto* const_cond = as_constant_(cond->condition())) {
      if (const_cond->as_bool()) {
        return cond->then_branch();  // Condition always true
      } else {
        return cond->else_branch();  // Condition always false
      }
    }
  }

  return node;
}

14.4.5 Common Subexpression Elimination

Purpose: Deduplicate equivalent expressions.

Algorithm:

  1. Hash each node based on operation and operand IDs
  2. Find nodes with identical hash
  3. Perform structural equality check
  4. Replace duplicates with first occurrence
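Step 1 can be sketched as a structural hash built from a node's operation tag and the IDs of its already-numbered children. The mixing constant follows the familiar boost-style hash_combine; the function names are illustrative:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Boost-style hash mixing: fold a new value into an accumulated seed.
inline void hash_combine(std::size_t& seed, std::size_t v) {
  seed ^= v + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2);
}

// Hash a node from its kind, operation, and child node IDs. Child order
// matters, so non-commutative operations with swapped operands hash
// differently, as they must.
std::size_t structural_hash(uint8_t kind, uint8_t op,
                            const std::vector<uint32_t>& child_ids) {
  std::size_t seed = kind;
  hash_combine(seed, op);
  for (auto id : child_ids) hash_combine(seed, id);
  return seed;
}
```

Because hashing can collide, step 3's structural equality check remains mandatory before any replacement.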

Example:

Before:

BinaryOp(kAdd)
  BinaryOp(kMultiply)
    Parameter(a)
    Parameter(b)
  BinaryOp(kMultiply)   // Duplicate!
    Parameter(a)
    Parameter(b)

After:

BinaryOp(kAdd)
  t1 = BinaryOp(kMultiply)
    Parameter(a)
    Parameter(b)
  t1                     // Reuse!

Safety Considerations:

CSE must respect function volatility:

auto CSEPass::can_eliminate_(const IRNode* node) const -> bool {
  if (node->kind() == IRNodeKind::kCall) {
    auto* call = static_cast<const IRCall*>(node);
    // Don't eliminate volatile functions
    return !call->is_volatile();
  }
  return true;
}

Volatile functions include:

  • RANDOM() - Non-deterministic
  • NOW() - Time-dependent
  • NEXTVAL() - Sequence-dependent

14.4.6 Strength Reduction

Purpose: Replace expensive operations with cheaper equivalents.

Transformations:

Original    Optimized   Condition
x * 2       x + x       Always
x * 2^n     x << n      n is constant
x / 2^n     x >> n      n is constant, x unsigned
x % 2       x & 1       Always
x * 0       0           Always
x * 1       x           Always
x + 0       x           Always
x - 0       x           Always

Implementation:

auto StrengthReductionPass::reduce_(IRNodePtr node) -> IRNodePtr {
  if (node->kind() != IRNodeKind::kBinaryOperation) {
    return node;
  }

  auto* binop = static_cast<IRBinaryOperation*>(node.get());

  // x * 2 -> x + x
  if (binop->op() == BinaryOperationType::kMultiply) {
    if (is_constant_int_(binop->right(), 2)) {
      return builder_.make_binary(
          BinaryOperationType::kAdd,
          binop->left(),
          binop->left());
    }
  }

  // x * 2^n -> x << n
  if (binop->op() == BinaryOperationType::kMultiply) {
    if (auto power = get_power_of_two_(binop->right())) {
      return builder_.make_binary(
          BinaryOperationType::kShiftLeft,
          binop->left(),
          builder_.make_constant(*power));
    }
  }

  return node;
}
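The pass above relies on a get_power_of_two_ helper that is not shown. One possible implementation returns the shift amount when the operand is an exact power of two:

```cpp
#include <cstdint>
#include <optional>

// Possible implementation of the power-of-two test used by the strength
// reduction pass: returns n when v == 2^n for positive v, nullopt otherwise.
std::optional<int64_t> get_power_of_two(int64_t v) {
  if (v <= 0) return std::nullopt;
  uint64_t u = static_cast<uint64_t>(v);
  if ((u & (u - 1)) != 0) return std::nullopt;  // more than one bit set
  int64_t n = 0;
  while (u > 1) {
    u >>= 1;
    ++n;
  }
  return n;
}
```

The `u & (u - 1)` trick clears the lowest set bit, so the result is zero exactly when a single bit was set.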

14.4.7 Register Allocation

Purpose: Assign virtual registers to IR values.

Algorithm: Linear scan with graph coloring hints

  1. Liveness Analysis: Compute live ranges for each node
  2. Interference Graph: Build graph where edges indicate simultaneous liveness
  3. Graph Coloring: Assign registers (colors) minimizing conflicts
  4. Coalescing: Merge nodes with no interference
  5. Spilling: Push excess values to stack

Configuration:

struct RegisterAllocationConfig {
  uint32_t num_general_registers = 16;
  uint32_t num_float_registers = 8;
  bool enable_coalescing = true;
};

Output:

// Maps IR nodes to allocated registers
std::unordered_map<const IRNode*, uint8_t> allocation_map_;
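The linear-scan core of steps 1-5 can be sketched over precomputed live intervals. This is a minimal illustration with no coalescing and no spill-victim selection (an interval that finds no free register is simply flagged):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// A value's live range [start, end] in instruction positions. Intervals
// that overlap must receive distinct registers.
struct Interval {
  uint32_t start, end;
  uint8_t reg = 0;
  bool spilled = false;
};

void linear_scan(std::vector<Interval>& intervals, uint8_t num_regs) {
  std::sort(intervals.begin(), intervals.end(),
            [](const Interval& a, const Interval& b) {
              return a.start < b.start;
            });
  std::vector<Interval*> active;  // intervals currently holding a register
  std::vector<uint8_t> free_regs;
  for (uint8_t r = 0; r < num_regs; ++r) free_regs.push_back(r);

  for (auto& iv : intervals) {
    // Expire intervals that ended before this one starts, returning
    // their registers to the free pool.
    for (auto it = active.begin(); it != active.end();) {
      if ((*it)->end < iv.start) {
        free_regs.push_back((*it)->reg);
        it = active.erase(it);
      } else {
        ++it;
      }
    }
    if (free_regs.empty()) {
      iv.spilled = true;  // a real allocator would pick a spill victim
      continue;
    }
    iv.reg = free_regs.back();
    free_regs.pop_back();
    active.push_back(&iv);
  }
}
```

A production allocator would additionally spill the interval with the furthest end point and apply the coalescing hints mentioned above.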

14.4.8 Optimization Pipeline

The optimizer runs passes iteratively until fixed point:

auto Optimizer::optimize(std::shared_ptr<IRModule> module)
    -> std::shared_ptr<IRModule> {

  for (size_t iter = 0; iter < config_.max_iterations; ++iter) {
    auto any_changed = false;

    for (auto& pass : passes_) {
      if (!pass->enabled_at_level(config_.optimization_level)) {
        continue;
      }

      auto result = pass->run(module);
      if (result.changed) {
        any_changed = true;
        stats_.nodes_removed += result.nodes_removed;
      }
    }

    if (!any_changed) {
      break;  // Fixed point reached
    }
  }

  return module;
}

14.5 Code Generation

The code generation phase transforms optimized IR into CVM bytecode.

14.5.1 Generator Configuration

struct BytecodeGenOptions {
  bool optimize_constants = true;   // Deduplicate constant loads
  bool optimize_registers = true;   // Minimize register usage
  bool emit_debug_info = true;      // Source mapping
  uint8_t max_registers = 16;       // Available registers
};

14.5.2 Code Generation Algorithm

The generator uses a depth-first visitor pattern:

class BytecodeGenerator {
public:
  auto generate(const IRModule& module) -> std::unique_ptr<BytecodeModule>;

private:
  // Returns register containing result
  auto visit_(const IRNode& node) -> uint8_t;

  auto visit_constant_(const IRConstant& node) -> uint8_t;
  auto visit_parameter_(const IRParameter& node) -> uint8_t;
  auto visit_binary_operation_(const IRBinaryOperation& node) -> uint8_t;
  auto visit_unary_operation_(const IRUnaryOperation& node) -> uint8_t;
  auto visit_call_(const IRCall& node) -> uint8_t;
  auto visit_conditional_(const IRConditional& node) -> uint8_t;
  auto visit_cast_(const IRCast& node) -> uint8_t;
  auto visit_field_access_(const IRFieldAccess& node) -> uint8_t;
};

14.5.3 Register Management

The generator maintains a register allocator:

class BytecodeGenerator {
private:
  // Register allocation state
  std::bitset<16> registers_in_use_;
  std::deque<uint8_t> allocation_order_;  // FIFO for eviction

  auto allocate_register_() -> uint8_t;
  auto allocate_temporary_() -> uint8_t;
  void free_register_(uint8_t reg);
  auto evict_register_() -> uint8_t;  // Spill to stack
};

FIFO Eviction: When all registers are in use, the oldest allocated register is evicted to the stack:

auto BytecodeGenerator::evict_register_() -> uint8_t {
  auto reg = allocation_order_.front();
  allocation_order_.pop_front();

  // Spill to stack
  emit_(Opcode::kStoreReg, reg, next_stack_slot_++);

  return reg;
}

14.5.4 Type-Aware Opcode Selection

The generator selects type-specific opcodes when types are known:

auto BytecodeGenerator::select_comparison_opcode_(
    BinaryOperationType op, CVMType type) -> Opcode {

  if (type == CVMType::kInt64) {
    switch (op) {
      case BinaryOperationType::kEqual:
        return Opcode::kCompareEqualInt64;
      case BinaryOperationType::kLessThan:
        return Opcode::kCompareLessThanInt64;
      // ...
    }
  } else if (type == CVMType::kDouble) {
    switch (op) {
      case BinaryOperationType::kEqual:
        return Opcode::kCompareEqualFloat64;
      // ...
    }
  } else if (type == CVMType::kString) {
    switch (op) {
      case BinaryOperationType::kEqual:
        return Opcode::kCompareEqualString;
      // ...
    }
  }

  // Fallback to polymorphic
  return Opcode::kCompareEqualPolymorphic;
}

14.5.5 Constant Pool Management

Constants are deduplicated in the constant pool:

class BytecodeGenerator {
private:
  ConstantPool constant_pool_;

  // Deduplication maps
  std::unordered_map<std::string, uint16_t> string_map_;
  std::unordered_map<int64_t, uint16_t> int64_map_;
  std::unordered_map<uint64_t, uint16_t> double_map_;  // Bit pattern

  auto add_string_constant_(const std::string& str) -> uint16_t {
    auto it = string_map_.find(str);
    if (it != string_map_.end()) {
      return it->second;
    }
    auto index = constant_pool_.add_string(str);
    string_map_[str] = index;
    return index;
  }
};

14.5.6 Jump Target Resolution

Jumps use PC-relative offsets that are patched after code generation:

struct PendingJump {
  uint32_t instruction_offset;  // Where to patch
  std::string target_label;     // Target label
};

void BytecodeGenerator::emit_jump_(const std::string& label) {
  auto offset = current_offset_();

  // Emit placeholder
  emit_instruction_(encode_format_c(
      Opcode::kJump, 0, 0));  // offset = 0 (placeholder)

  // Record for later patching
  pending_jumps_.push_back({offset, label});
}

void BytecodeGenerator::resolve_jumps_() {
  for (const auto& jump : pending_jumps_) {
    auto target = label_offsets_[jump.target_label];
    auto relative = (target - jump.instruction_offset) / 4;
    module_->patch_jump_offset(jump.instruction_offset, relative);
  }
}

14.6 Bytecode Module Structure

The compiled bytecode is packaged in a BytecodeModule with metadata.

14.6.1 Module Header

struct BytecodeHeader {
  uint32_t magic;                  // 0x304D5643 ("CVM0")
  uint16_t version_major;          // Major version
  uint16_t version_minor;          // Minor version
  uint32_t flags;                  // Module flags
  uint32_t constant_pool_offset;   // Constant pool location
  uint32_t constant_pool_size;     // Constant pool size
  uint32_t code_offset;            // Code section location
  uint32_t code_size;              // Code section size
  uint32_t debug_offset;           // Debug info location
  uint32_t debug_size;             // Debug info size
  uint32_t entry_point;            // Entry point offset
  uint64_t expression_hash;        // For caching
  uint8_t max_registers;           // GPRs used
  uint8_t max_float_registers;     // FPRs used
  uint8_t stack_depth;             // Stack depth required
  uint8_t reserved[13];            // Future expansion
};
static_assert(sizeof(BytecodeHeader) == 64);
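A loader would typically validate the header before executing a module. The check below is a sketch (the function is not part of the documented API; the constants follow the header definition above):

```cpp
#include <cstdint>

// "CVM0" when the four bytes are read in little-endian order.
constexpr uint32_t kBytecodeMagic = 0x304D5643;

// Reject foreign files and modules built for an incompatible major
// version; minor-version differences are assumed backward compatible.
bool validate_header(uint32_t magic, uint16_t version_major,
                     uint16_t supported_major) {
  return magic == kBytecodeMagic && version_major == supported_major;
}
```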

14.6.2 Module Flags

enum class ModuleFlags : uint32_t {
  kNone = 0,
  kDebugInfo = 1 << 0,      // Contains debug info
  kOptimized = 1 << 1,      // Has been optimized
  kJitReady = 1 << 2,       // Prepared for JIT
  kSourceMapped = 1 << 3    // Contains source mapping
};

14.6.3 Constant Pool

The constant pool stores all constants referenced by bytecode:

enum class ConstantType : uint8_t {
  kNull = 0,       // NULL value
  kBool = 1,       // Boolean
  kInt64 = 2,      // 64-bit integer
  kDouble = 3,     // 64-bit float
  kString = 4,     // String (string table index)
  kFieldRef = 5,   // Field name reference
  kFuncRef = 6,    // Function reference
  kTypeId = 7,     // Type identifier
  kArray = 8       // Array of values
};

14.6.4 Debug Information

Debug info maps bytecode offsets to source locations:

struct DebugLocation {
  uint32_t code_offset;     // Bytecode offset
  uint32_t source_line;     // Source line (1-based)
  uint32_t source_column;   // Source column (1-based)
  uint32_t source_length;   // Source span length
};

struct DebugInfo {
  std::string source_name;                // Source identifier
  std::string source_text;                // Original source
  std::vector<DebugLocation> locations;   // Offset mapping
};
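Since locations are naturally ordered by code_offset, a runtime error reporter can map a bytecode offset back to its source span with a binary search. A sketch (find_location is illustrative, not the documented API):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct DebugLocation {
  uint32_t code_offset;   // Bytecode offset
  uint32_t source_line;   // Source line (1-based)
  uint32_t source_column; // Source column (1-based)
  uint32_t source_length; // Source span length
};

// Given locations sorted by code_offset, return the entry covering pc:
// the last location whose offset is <= pc, or nullptr if pc precedes
// the first mapped offset.
const DebugLocation* find_location(const std::vector<DebugLocation>& locs,
                                   uint32_t pc) {
  auto it = std::upper_bound(
      locs.begin(), locs.end(), pc,
      [](uint32_t v, const DebugLocation& d) { return v < d.code_offset; });
  if (it == locs.begin()) return nullptr;
  return &*std::prev(it);
}
```

This is the standard line-table lookup scheme: O(log n) per query with no per-instruction storage beyond the table itself.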

14.6.5 Module Layout

+------------------------------+
| BytecodeHeader (64 bytes)    |
+------------------------------+
| Constant pool                |
+------------------------------+
| Code section                 |
+------------------------------+
| Debug info (if kDebugInfo)   |
+------------------------------+

14.7 Compilation Examples

14.7.1 Simple Comparison: age >= 18

SQL AST:

BinaryExpr(GreaterEqual,
  ColumnRef("age"),
  Constant(18))

Lowered IR:

IRBinaryOperation(kGreaterEqual,
  IRParameter(index=0, name="age", type=kInt64),
  IRConstant(18))

Generated Bytecode:

LOAD_PARAM R0, 0              ; R0 = age
LOAD_CONST R1, pool[0]        ; R1 = 18
CMP_GE_I64 R2, R0, R1         ; R2 = (R0 >= R1)
RET_VALUE R2                   ; return R2

Bytecode Encoding:

0x0D 0x00 0x00 0x00    ; LOAD_PARAM R0, idx=0
0x07 0x10 0x00 0x00    ; LOAD_CONST R1, pool[0]
0x45 0x20 0x01 0x00    ; CMP_GE_I64 R2, R0, R1
0x78 0x20 0x00 0x00    ; RET_VALUE R2

14.7.2 Conditional: CASE WHEN age >= 18 THEN 'adult' ELSE 'minor' END

Lowered IR:

IRConditional(
  condition=IRBinaryOperation(kGreaterEqual, ...),
  then_branch=IRConstant("adult"),
  else_branch=IRConstant("minor"))

Generated Bytecode:

LOAD_PARAM R0, 0              ; R0 = age
LOAD_CONST R1, pool[0]        ; R1 = 18
CMP_GE_I64 R2, R0, R1         ; R2 = condition
JUMP_FALSE R2, else_label     ; if false, goto else
LOAD_CONST R3, pool[1]        ; R3 = "adult"
JUMP end_label
else_label:
LOAD_CONST R3, pool[2]        ; R3 = "minor"
end_label:
RET_VALUE R3                   ; return result

14.7.3 Constant Folding: (100 + 50) * 2

Before Optimization:

IRBinaryOperation(kMultiply,
  IRBinaryOperation(kAdd,
    IRConstant(100),
    IRConstant(50)),
  IRConstant(2))

After Constant Folding:

IRConstant(300)

Generated Bytecode (Optimized):

MOVE_I64 R0, 300              ; R0 = 300 (folded constant)
RET_VALUE R0

14.7.4 CSE: (a + b) + (a + b)

Before CSE:

IRBinaryOperation(kAdd,
  IRBinaryOperation(kAdd,     ; First (a + b)
    IRParameter(a),
    IRParameter(b)),
  IRBinaryOperation(kAdd,     ; Second (a + b) - duplicate!
    IRParameter(a),
    IRParameter(b)))

After CSE (DAG with sharing):

t1 = IRBinaryOperation(kAdd,
       IRParameter(a),
       IRParameter(b))
IRBinaryOperation(kAdd, t1, t1)  ; Reuse t1

Generated Bytecode:

LOAD_PARAM R0, 0              ; R0 = a
LOAD_PARAM R1, 1              ; R1 = b
ADD_I64 R2, R0, R1            ; R2 = a + b (computed once)
ADD_I64 R3, R2, R2            ; R3 = R2 + R2 (reused)
RET_VALUE R3

14.7.5 Function Call: UPPER(name)

Lowered IR:

IRCall(
  call_type=kBuiltin,
  function_name="upper",
  arguments=[IRParameter(index=1, name="name")])

Generated Bytecode:

LOAD_PARAM R0, 1              ; R0 = name
CALL_BUILTIN R1, funcid=42    ; R1 = upper(R0)
RET_VALUE R1

14.8 Compiler Integration

14.8.1 Compiler Class

The Compiler class provides the main compilation interface:

class Compiler {
public:
  // Main entry points
  auto compile_sql_expression(const sql::ast::Expr& expr)
      -> CompilationResult;

  auto compile_sql_expression(
      const sql::ast::Expr& expr,
      const std::vector<ColumnBinding>& bindings) -> CompilationResult;

  // Phase access for testing
  auto lower_sql_expression(const sql::ast::Expr& expr)
      -> std::shared_ptr<IRModule>;

  auto optimize_ir(std::shared_ptr<IRModule> module)
      -> std::shared_ptr<IRModule>;

  auto generate_bytecode(const IRModule& module)
      -> std::unique_ptr<BytecodeModule>;

private:
  CompilerOptions options_;
  std::unique_ptr<CompilationCache> cache_;
};

14.8.2 Compiler Options

struct CompilerOptions {
  // Optimization control
  OptimizationLevel optimization_level = OptimizationLevel::kBasic;
  bool fold_constants = true;
  bool eliminate_dead_code = true;
  bool enable_cse = true;
  bool strength_reduction = true;
  bool register_coalescing = true;

  // Debug output
  bool generate_debug_info = false;
  bool preserve_names = false;
  bool trace_compilation = false;

  // Code generation
  bool use_extended_opcodes = true;
  uint32_t max_registers = 16;

  // Cache
  bool enable_cache = true;
  size_t cache_max_entries = 1024;
};

14.8.3 Compilation Result

struct CompilationResult {
  bool success;
  std::unique_ptr<BytecodeModule> module;
  std::string error_message;
  uint32_t error_line;

  // Registered subqueries
  std::vector<const sql::ast::SubqueryExpr*> subqueries;

  // Statistics
  size_t ir_nodes_before_opt;
  size_t ir_nodes_after_opt;
  size_t bytecode_size;
  double compilation_time_ms;
};

14.8.4 Compilation Cache

The compiler caches compiled modules to avoid recompilation:

struct CompilationCacheKey {
  uint64_t hash;           // Expression hash
  std::string source_type; // "sql", "filter", etc.
};

class CompilationCache {
public:
  auto lookup(const CompilationCacheKey& key) const
      -> const BytecodeModule*;
  void insert(const CompilationCacheKey& key,
              std::unique_ptr<BytecodeModule> module);
  void clear();

  auto hit_count() const -> uint64_t;
  auto miss_count() const -> uint64_t;
  auto hit_rate() const -> double;

private:
  std::unordered_map<uint64_t, std::unique_ptr<BytecodeModule>> cache_;
  mutable uint64_t hits_ = 0;
  mutable uint64_t misses_ = 0;
};

Cache Effectiveness:

  • Same expression: 100% hit rate
  • Similar expressions (different literals): Cache miss
  • Typical mixed workload: 60-85% hit rate
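The lookup-or-compile flow around the cache can be sketched with a self-contained mock; MiniCache, Module, and the use of std::hash are illustrative (the real cache keys on the 64-bit expression hash from the header):

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>

// Illustrative stand-in for a compiled BytecodeModule.
struct Module {
  std::string bytecode;
};

// Minimal sketch of the lookup-or-compile pattern with hit/miss counters.
class MiniCache {
 public:
  const Module& get_or_compile(
      const std::string& expr,
      const std::function<Module(const std::string&)>& compile) {
    uint64_t key = std::hash<std::string>{}(expr);
    auto it = cache_.find(key);
    if (it != cache_.end()) {
      ++hits_;
      return it->second;  // reuse previously compiled module
    }
    ++misses_;
    return cache_.emplace(key, compile(expr)).first->second;
  }
  uint64_t hits() const { return hits_; }
  uint64_t misses() const { return misses_; }

 private:
  std::unordered_map<uint64_t, Module> cache_;
  uint64_t hits_ = 0, misses_ = 0;
};
```

Because the key is a hash of the expression text, queries differing only in literal values produce distinct keys, which is why parameterized (prepared) statements cache far better than inlined literals.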

14.9 Performance Characteristics

14.9.1 Compilation Time

Expression Complexity          Typical Time
Simple comparison (a > 5)      50-100 µs
Medium (a > 5 AND b < 10)      100-200 µs
Complex (10+ operations)       200-500 µs
Very complex (subqueries)      500-1000 µs

14.9.2 Bytecode Size

Expression       Instructions   Module Size
age >= 18        4              ~100 bytes
a AND b AND c    8-10           ~150 bytes
Complex CASE     15-20          ~250 bytes

14.9.3 Optimization Impact

Optimization            Typical Reduction
Constant Folding        10-30% fewer nodes
Dead Code Elimination   5-15% fewer nodes
CSE                     5-20% fewer nodes
Strength Reduction      0-5% faster execution

14.10 Summary

The CVM compilation pipeline transforms SQL expressions into efficient bytecode through a well-structured sequence of phases:

  1. Lowering: SQL AST nodes are translated to IR nodes, establishing type information and column bindings.

  2. Optimization: Multiple passes improve the IR:

    • Constant folding evaluates compile-time expressions
    • Dead code elimination removes unreachable code
    • CSE deduplicates equivalent subexpressions
    • Strength reduction replaces expensive operations
    • Register allocation assigns virtual registers
  3. Code Generation: The optimized IR is translated to bytecode with type-aware opcode selection and constant pool management.

  4. Module Packaging: The bytecode, constant pool, and metadata are packaged into a BytecodeModule for execution.

The compilation pipeline balances compilation speed against execution performance, with configurable optimization levels allowing users to choose the appropriate trade-off for their workload.

Copyright (c) 2023-2026 Cognica, Inc.