Chapter 32: Observability and Debugging

Observability forms the foundation for understanding, diagnosing, and optimizing database behavior in production environments. A database engine operates as a complex system with numerous interacting components—query processing, storage, replication, memory management—each generating telemetry that operators and developers need to diagnose issues and tune performance. This chapter examines Cognica's comprehensive observability infrastructure, from structured logging to execution tracing, providing the instrumentation necessary for operating a production database system.

32.1 Observability Architecture Overview

Cognica's observability architecture spans multiple dimensions, combining structured logging, metrics collection, and execution tracing.

The observability stack addresses three fundamental questions:

  1. What happened? Logging captures discrete events with contextual information
  2. What is the current state? Metrics provide point-in-time measurements and aggregations
  3. How did execution proceed? Tracing reconstructs the causal flow through the system

The relationship between these dimensions follows the telemetry hierarchy:

\text{Observability} = \text{Logs} \cup \text{Metrics} \cup \text{Traces}

where each provides complementary visibility into system behavior.

32.2 Structured Logging Framework

The logging framework provides categorized, level-filtered logging with source location tracking, built on the high-performance spdlog library.

32.2.1 Log Categories

Cognica partitions logs into semantic categories, each targeting specific operational concerns:

enum Category : int32_t {
  kGeneral,   // General application logs
  kError,     // Error conditions and exceptions
  kAccess,    // Access patterns and authentication
  kQueryLog,  // Query execution logs
  kSlowLog,   // Slow query detection
  kSystem,    // System-level events
};

Each category maps to a dedicated logger with independent configuration:

const std::shared_ptr<Logger>& get(Category category);

The category design enables:

  1. Selective Filtering: Enable verbose logging for specific concerns
  2. Separate Rotation: Different retention policies per category
  3. Targeted Analysis: Query-specific logs for optimization
  4. Security Auditing: Access logs for compliance

32.2.2 Log Level Hierarchy

The logging levels follow the standard severity hierarchy:

\text{TRACE} < \text{DEBUG} < \text{INFO} < \text{WARN} < \text{ERROR} < \text{CRITICAL}

Compile-time filtering eliminates overhead for disabled levels:

#if LOGGER_ACTIVE_LEVEL <= LOGGER_LEVEL_DEBUG
#define LOGGER_DEBUG(category, ...) \
  LOGGER_CALL(category, spdlog::level::debug, __VA_ARGS__)
#else
#define LOGGER_DEBUG(category, ...) (void)0
#endif

This macro approach achieves zero overhead when a log level is disabled at compile time, which is critical for performance-sensitive paths like the query execution inner loop.

32.2.3 Source Location Tracking

Every log entry captures precise source location:

#define LOGGER_CALL(category, level, ...)                              \
  ::cognica::logger::get(category)->log(                               \
      spdlog::source_loc {__FILE__, __LINE__, LOGGER_FUNCTION}, level, \
      __VA_ARGS__)

The source location includes:

  • File path: Source file generating the log
  • Line number: Exact line within the file
  • Function name: Enclosing function via __PRETTY_FUNCTION__

This context proves invaluable for debugging, enabling developers to locate the exact code path that generated a particular log entry without manual searching.

32.2.4 Slow Query Log

The slow query log captures queries exceeding a configured threshold:

LOGGER_INFO(kSlowLog, "Slow query: {} ms - {}",
            duration_ms, query_text);

Slow query detection involves:

  1. Threshold Configuration: Configurable cutoff (e.g., 100ms)
  2. Query Text Capture: Full SQL or document query
  3. Timing Breakdown: Total time with phase attribution
  4. Execution Context: Connection ID, user, database

The slow log enables systematic performance optimization by identifying queries that consume disproportionate resources.
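As a sketch (the helper name and signature are illustrative, not Cognica's API), the threshold check reduces to formatting a log line only when the measured duration crosses the configured cutoff:

```cpp
#include <chrono>
#include <optional>
#include <string>

// Hypothetical helper: returns a slow-log line when the duration exceeds
// the threshold, std::nullopt otherwise.
std::optional<std::string> slow_log_line(std::chrono::milliseconds duration,
                                         std::chrono::milliseconds threshold,
                                         const std::string& query) {
  if (duration < threshold) {
    return std::nullopt;  // fast query: nothing to log
  }
  return "Slow query: " + std::to_string(duration.count()) + " ms - " + query;
}
```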

32.3 System Profiler

The system profiler provides hierarchical timing and memory tracking with per-thread visibility, enabling detailed performance analysis during development and debugging.

32.3.1 Profiler Architecture

The profiler maintains a tree structure of profiling nodes:

class ProfilerNode {
public:
  using Clock = std::chrono::high_resolution_clock;
  using TimePoint = Clock::time_point;
  using Duration = std::chrono::nanoseconds;

  explicit ProfilerNode(const std::string_view& name,
                        ProfilerNode* parent = nullptr);

  ProfilerNode* add_child(const std::string_view& name);
  void start();
  void stop();

  void track_allocation(size_t size);
  void track_deallocation(size_t size);

  void generate_report(std::ostream& out, int32_t depth = 0) const;

private:
  std::string name_;
  ProfilerNode* parent_;
  TimePoint start_time_;
  Duration total_time_;
  uint64_t call_count_;
  bool is_active_;
  Map<std::string, std::unique_ptr<ProfilerNode>> children_;

  size_t current_memory_;
  size_t peak_memory_;
  uint64_t total_allocations_;
  uint64_t total_deallocations_;
};

Each node tracks:

  • Timing: Total accumulated time and call count
  • Memory: Current usage, peak usage, allocation counts
  • Hierarchy: Parent-child relationships for call tree construction

32.3.2 Thread-Local Profiling

The profiler maintains separate state per thread:

class Profiler {
public:
  static Profiler& instance();

  void begin_scope(const std::string_view& name,
                   const std::source_location location
                   = std::source_location::current());
  void end_scope();

  void track_allocation(size_t size);
  void track_deallocation(size_t size);

  void generate_report(std::ostream& out = std::cout) const;

private:
  struct ThreadData {
    ProfilerNode* current_node;
    std::string thread_name;
    std::stack<ProfilerNode*> scope_stack;
  };

  ProfilerNode root_;
  mutable std::mutex lock_;
  std::unordered_map<std::thread::id, ThreadData> thread_data_;
};

Thread-local tracking enables accurate attribution in multi-threaded query execution where multiple queries execute concurrently.

32.3.3 Scoped Profiling

RAII wrappers automate scope entry/exit:

class ScopedProfiler final {
public:
  explicit ScopedProfiler(const std::string_view& name,
                          const std::source_location location
                          = std::source_location::current())
      : name_(name), location_(location) {
    Profiler::instance().begin_scope(name_, location_);
  }

  ~ScopedProfiler() {
    Profiler::instance().end_scope();
  }
};

Convenience macros simplify instrumentation:

#define PROFILER_CONCAT_IMPL(a, b) a##b
#define PROFILER_CONCAT(a, b) PROFILER_CONCAT_IMPL(a, b)

// Two-level concatenation expands __LINE__ before pasting, giving each
// scope variable a unique name.
#define PROFILE_SCOPE(name)                           \
  auto PROFILER_CONCAT(_profile_scope_, __LINE__)     \
      = ::cognica::system::profiler::ScopedProfiler { \
    name                                              \
  }

#define PROFILE_FUNCTION()                            \
  auto PROFILER_CONCAT(_profile_function_, __LINE__)  \
      = ::cognica::system::profiler::ScopedProfiler { \
    __func__                                          \
  }

Usage is trivial:

void process_query(const Query& query) {
  PROFILE_FUNCTION();

  {
    PROFILE_SCOPE("Parse");
    parse(query);
  }

  {
    PROFILE_SCOPE("Plan");
    plan(query);
  }

  {
    PROFILE_SCOPE("Execute");
    execute(query);
  }
}

32.3.4 Memory Tracking

The profiler integrates memory tracking alongside timing:

namespace memory {

inline void track_allocation(size_t size) {
  Profiler::instance().track_allocation(size);
}

inline void track_deallocation(size_t size) {
  Profiler::instance().track_deallocation(size);
}

class ScopedMemoryTracker final {
public:
  explicit ScopedMemoryTracker(size_t size,
                               const std::string_view& description = "")
      : size_(size), description_(description) {
    if (!description_.empty()) {
      auto scope_name = fmt::format("Memory: {}", description_);
      Profiler::instance().begin_scope(scope_name);
    }
    Profiler::instance().track_allocation(size_);
  }

  ~ScopedMemoryTracker() {
    Profiler::instance().track_deallocation(size_);
    if (!description_.empty()) {
      Profiler::instance().end_scope();
    }
  }
};

}  // namespace memory

Memory tracking propagates up the hierarchy, so parent nodes accumulate the memory usage of their children, providing both granular and aggregate visibility.

32.3.5 Profiler Output

The profiler generates hierarchical reports:

Performance and Memory Profile Report
=====================================

Overall Memory Statistics:
  Current memory usage: 1024.00 KB
  Peak memory usage: 4096.00 KB

Thread 140735340765312 (main)
-----------------------------
Root:
  process_query:123 [process_query]:
    Calls: 1000, Total: 5234.567 ms, Avg: 5234.567 us
    Memory: Current: 0.00 KB, Peak: 2048.00 KB, Allocs: 50000, Deallocs: 50000
    Parse:
      Calls: 1000, Total: 234.567 ms, Avg: 234.567 us
      Memory: Current: 0.00 KB, Peak: 256.00 KB, Allocs: 5000, Deallocs: 5000
    Plan:
      Calls: 1000, Total: 1000.000 ms, Avg: 1000.000 us
      Memory: Current: 0.00 KB, Peak: 512.00 KB, Allocs: 10000, Deallocs: 10000
    Execute:
      Calls: 1000, Total: 4000.000 ms, Avg: 4000.000 us
      Memory: Current: 0.00 KB, Peak: 1280.00 KB, Allocs: 35000, Deallocs: 35000

The report shows:

  1. Hierarchical Structure: Nested scopes with indentation
  2. Call Statistics: Count, total time, average time
  3. Memory Statistics: Current, peak, allocation counts
  4. Sorted by Time: Children sorted by total time descending

32.4 CVM Execution Tracer

The CVM execution tracer provides detailed visibility into bytecode execution, enabling debugging of query compilation and runtime behavior.

32.4.1 Trace Entry Types

The tracer captures diverse event types:

enum class TraceEntryType : uint8_t {
  kInstruction,     // Instruction execution
  kRegisterWrite,   // Register value change
  kMemoryAccess,    // Memory/field access
  kFunctionCall,    // Function call
  kFunctionReturn,  // Function return
  kBranchTaken,     // Branch instruction taken
  kBranchNotTaken,  // Branch instruction not taken
  kError,           // Runtime error
};

Each entry captures comprehensive context:

struct TraceEntry {
  TraceEntryType type;
  uint64_t sequence;         // Sequence number
  uint64_t timestamp_ns;     // Nanoseconds since trace start
  uint32_t address;          // Program counter
  Opcode opcode;             // Instruction opcode
  uint32_t raw_instruction;  // Raw instruction word

  uint8_t reg_index;         // Register info (for writes)
  VMValue reg_value;

  bool branch_taken;         // Branch info
  uint32_t branch_target;

  std::string comment;       // Additional context
};

32.4.2 Trace Filtering

Selective tracing reduces overhead:

struct TraceFilter {
  bool trace_instructions = true;
  bool trace_register_writes = true;
  bool trace_memory_access = true;
  bool trace_function_calls = true;
  bool trace_branches = true;
  bool trace_errors = true;

  std::optional<uint8_t> opcode_min;
  std::optional<uint8_t> opcode_max;

  std::optional<uint32_t> address_min;
  std::optional<uint32_t> address_max;

  size_t max_entries = 0;  // 0 = unlimited
};

Filtering enables targeted analysis:

  1. By Event Type: Focus on branches, function calls, or errors
  2. By Opcode Range: Trace specific instruction categories
  3. By Address Range: Trace specific code regions
  4. By Count: Limit trace size for long executions

32.4.3 Trace Statistics

The tracer computes aggregate statistics:

struct TraceStatistics {
  uint64_t total_instructions = 0;
  uint64_t total_branches = 0;
  uint64_t branches_taken = 0;
  uint64_t function_calls = 0;
  uint64_t function_returns = 0;
  uint64_t errors = 0;

  uint64_t start_time_ns = 0;
  uint64_t end_time_ns = 0;

  std::array<uint64_t, 256> opcode_counts = {};

  std::vector<std::pair<uint32_t, uint64_t>> hot_addresses;
};

Statistics enable performance analysis:

  • Branch Ratio: branches_taken / total_branches indicates branch predictability
  • Opcode Distribution: Per-opcode counts reveal workload characteristics
  • Hot Spots: Most-executed addresses guide optimization
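
The hot-address list can be derived from the per-address counts the tracer accumulates. A minimal sketch (function name assumed, not the tracer's API):

```cpp
#include <algorithm>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Sort per-address execution counts descending and keep the top k entries,
// yielding the hot_addresses list in TraceStatistics.
std::vector<std::pair<uint32_t, uint64_t>>
hot_addresses(const std::unordered_map<uint32_t, uint64_t>& counts,
              std::size_t k) {
  std::vector<std::pair<uint32_t, uint64_t>> v(counts.begin(), counts.end());
  std::sort(v.begin(), v.end(),
            [](const auto& a, const auto& b) { return a.second > b.second; });
  if (v.size() > k) v.resize(k);
  return v;
}
```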

32.4.4 Tracer Implementation

The tracer integrates with the CVM interpreter:

class Tracer {
public:
  Tracer();
  explicit Tracer(TraceFilter filter);

  void start();
  void stop();
  void clear();

  void trace_instruction(uint32_t address, Opcode opcode, uint32_t instr);
  void trace_register_write(uint8_t reg, const VMValue& value);
  void trace_memory_access(uint32_t address, bool is_write);
  void trace_function_call(uint32_t address, uint32_t target);
  void trace_function_return(uint32_t address, uint32_t return_to);
  void trace_branch(uint32_t address, bool taken, uint32_t target);
  void trace_error(uint32_t address, Opcode opcode, const std::string& message);

  auto entries() const -> const std::vector<TraceEntry>&;
  auto statistics() const -> TraceStatistics;

  void write(std::ostream& out, TraceFormat format = TraceFormat::kText) const;

  auto make_trace_hook() -> std::function<void(const VMContext&, uint32_t)>;

private:
  bool tracing_ = false;
  TraceFilter filter_;
  std::vector<TraceEntry> entries_;
  uint64_t sequence_ = 0;
  std::chrono::steady_clock::time_point start_time_;

  mutable TraceStatistics stats_;
  std::unordered_map<uint32_t, uint64_t> address_counts_;
};

The trace hook integrates non-invasively with the interpreter loop, enabling tracing without modifying execution logic.

32.4.5 Trace Output Formats

The tracer supports multiple output formats:

enum class TraceFormat {
  kText,    // Human-readable text
  kJson,    // JSON format
  kBinary,  // Compact binary format
};

Text format provides human-readable output:

[    1] 0x0000: MOV R0, #42                   ; Initialize counter
[    2] 0x0004: ADD R1, R0, #1                ; R1 = 43
[    3] 0x0008: CMP R1, #100                  ; Compare with limit
[    4] 0x000C: BLT 0x0004        [TAKEN]     ; Loop if R1 < 100
...

JSON format enables programmatic analysis:

{
  "entries": [
    {
      "sequence": 1,
      "timestamp_ns": 0,
      "address": 0,
      "opcode": "MOV",
      "type": "instruction"
    }
  ],
  "statistics": {
    "total_instructions": 1000000,
    "total_branches": 100000,
    "branches_taken": 99000
  }
}

32.5 Index Statistics

Accurate statistics drive query optimization decisions. Cognica maintains comprehensive index statistics including cardinality estimates, value distributions, and multi-column correlations.

32.5.1 Statistics Structure

The index statistics collector maintains detailed metadata:

struct Statistics {
  // Basic cardinality
  int64_t total_keys = 0;
  int64_t distinct_values = 0;
  std::optional<HyperLogLog> hll;

  // Range information (numeric)
  std::unordered_map<std::string, double> min_values;
  std::unordered_map<std::string, double> max_values;

  // Range information (string)
  std::unordered_map<std::string, std::string> min_string_values;
  std::unordered_map<std::string, std::string> max_string_values;

  // Null counts
  std::unordered_map<std::string, int64_t> null_counts;

  // Value distribution histograms
  std::unordered_map<std::string, Histogram> histograms;

  // Multi-column N-Distinct (PostgreSQL-style)
  std::unordered_map<std::string, int64_t> multi_column_ndistinct;

  // Functional dependencies
  std::unordered_map<std::string, double> functional_dependencies;

  // Most Common Values
  struct MCVEntry {
    std::string value_key;
    int64_t count = 0;
    double frequency = 0.0;
  };
  std::vector<MCVEntry> multi_column_mcv;

  // 2D Histogram for correlated columns
  struct Histogram2D {
    std::string col_x;
    std::string col_y;
    double x_min, x_max, y_min, y_max;
    int32_t num_buckets_x, num_buckets_y;
    std::vector<int64_t> counts;
    int64_t total_count = 0;
  };
  std::vector<Histogram2D> histograms_2d;

  chrono::TimePoint last_updated;
  bool is_stale = false;
};

32.5.2 Equi-Depth Histograms

Histograms model value distributions for selectivity estimation:

struct HistogramBucket {
  double lower_bound;  // Inclusive
  double upper_bound;  // Inclusive
  int64_t count;       // Values in bucket
  int64_t distinct;    // Distinct values
};

class Histogram {
public:
  static constexpr int32_t kDefaultNumBuckets = 100;

  static auto build(const std::vector<double>& sorted_values,
                    int32_t num_buckets = kDefaultNumBuckets) -> Histogram;

  auto estimate_equality_selectivity(double value) const -> double;
  auto estimate_range_selectivity(double lower, double upper) const -> double;
  auto estimate_less_than_selectivity(double upper) const -> double;
  auto estimate_greater_than_selectivity(double lower) const -> double;

private:
  std::vector<HistogramBucket> buckets_;
  int64_t total_count_ = 0;
  int64_t total_distinct_ = 0;
};

The equi-depth design ensures each bucket contains approximately equal row counts:

|B_i| \approx \frac{N}{k}

where N is the total row count and k is the number of buckets. This provides better estimation accuracy for skewed distributions compared to equi-width histograms.
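
The construction can be sketched as follows (names are illustrative, not the engine's Histogram::build): sort the sample, then cut it into k slices of approximately equal row count, taking bucket bounds from the slice ends.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Bucket {
  double lo, hi;  // inclusive bounds
  long count;     // rows in bucket
};

// Equi-depth construction over a value sample.
std::vector<Bucket> build_equi_depth(std::vector<double> values, int k) {
  std::sort(values.begin(), values.end());
  std::vector<Bucket> buckets;
  const std::size_t n = values.size();
  for (int i = 0; i < k; ++i) {
    std::size_t begin = n * static_cast<std::size_t>(i) / k;
    std::size_t end = n * static_cast<std::size_t>(i + 1) / k;
    if (begin >= end) continue;  // more buckets than values
    buckets.push_back({values[begin], values[end - 1],
                       static_cast<long>(end - begin)});
  }
  return buckets;
}
```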

32.5.3 Selectivity Estimation

For equality predicates, selectivity estimation uses:

S_{eq}(v) = \frac{1}{\text{distinct}(B_v)}

where B_v is the bucket containing value v.

For range predicates [l, u], selectivity interpolates across buckets:

S_{range}(l, u) = \sum_{i} f_i \cdot \frac{|B_i|}{N}

where f_i is the fraction of bucket i overlapping the range:

f_i = \frac{\min(u, B_i^{max}) - \max(l, B_i^{min})}{B_i^{max} - B_i^{min}}
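
The interpolation above can be sketched directly (Bucket layout assumed for illustration, not the engine's type):

```cpp
#include <algorithm>
#include <vector>

struct Bucket {
  double lo, hi;  // inclusive bounds
  long count;     // rows in bucket
};

// Interpolated selectivity of the range predicate [l, u]: each overlapping
// bucket contributes the fraction of its width covered by the range.
double range_selectivity(const std::vector<Bucket>& buckets,
                         double l, double u, long total_rows) {
  double selected = 0.0;
  for (const auto& b : buckets) {
    if (b.hi <= b.lo) continue;  // degenerate bucket
    double overlap = std::min(u, b.hi) - std::max(l, b.lo);
    if (overlap <= 0.0) continue;  // no intersection with [l, u]
    double fraction = std::min(1.0, overlap / (b.hi - b.lo));
    selected += fraction * static_cast<double>(b.count);
  }
  return selected / static_cast<double>(total_rows);
}
```

For example, a range [5, 15] covering half of each of two 100-row buckets yields a selectivity of 0.5.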

32.5.4 Multi-Column Statistics

Cognica supports PostgreSQL-style extended statistics for multi-column correlation:

N-Distinct: Tracks distinct value counts for column combinations:

// Key: sorted comma-separated column names
std::unordered_map<std::string, int64_t> multi_column_ndistinct;
// Example: {"a,b" -> 10000, "a,b,c" -> 50000}

Functional Dependencies: Captures column determination relationships:

A \to B: \text{degree} = \frac{|\{(a, b): \text{unique } b \text{ for each } a\}|}{N}

// Key: "A->B" format
std::unordered_map<std::string, double> functional_dependencies;
// Example: {"country->currency" -> 0.95}

2D Histograms: Model joint distributions of correlated numeric columns:

struct Histogram2D {
  std::string col_x, col_y;
  double x_min, x_max, y_min, y_max;
  int32_t num_buckets_x, num_buckets_y;
  std::vector<int64_t> counts;  // Row-major grid
};

The 2D histogram enables accurate selectivity estimation for conjunctive predicates on correlated columns:

S(p_x \land p_y) \neq S(p_x) \cdot S(p_y) \quad \text{(when correlated)}

32.5.5 Statistics Collection

Statistics collection operates in two modes:

// Lightweight: metadata only (fast)
static auto collect(const Index* index, cognica::rdb::DB* db) -> Statistics;

// Heavyweight: full scan (accurate)
static auto collect_detailed(const Index* index, cognica::rdb::DB* db)
    -> std::expected<Statistics, Status>;

Incremental updates maintain freshness without full scans:

static auto update_on_write(const Statistics& stats, const Document& doc)
    -> Statistics;

The staleness detection triggers re-collection:

\text{is\_stale} = (\text{writes\_since\_collection} > \text{threshold})

32.6 Full-Text Search Statistics

Full-text search requires specialized statistics for BM25 scoring and query optimization.

32.6.1 Index Statistics

struct IndexStatsSnapshot {
  std::string field;
  int64_t total_doc_count;   // Total documents
  int64_t total_doc_size;    // Total size
  int64_t doc_count;         // Documents with field
  int64_t doc_size;          // Size of documents with field
  int64_t sum_term_freq;     // Total tokens
  int64_t sum_doc_freq;      // Sum of unique terms per document
};

These statistics feed into BM25 scoring:

\text{avgdl} = \frac{\text{sum\_term\_freq}}{\text{doc\_count}}

32.6.2 Term Statistics

Per-term statistics drive IDF calculation:

struct TermStatsSnapshot {
  Term term;
  int64_t doc_freq;        // Documents containing term
  int64_t total_term_freq; // Total occurrences across all documents
};

The IDF component uses document frequency:

\text{IDF}(t) = \log\left(\frac{N - \text{df}(t) + 0.5}{\text{df}(t) + 0.5} + 1\right)
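
Putting both statistics together, a single term's BM25 contribution can be sketched as follows (the function is illustrative, not Cognica's scorer; k1 and b are the standard free parameters):

```cpp
#include <cmath>

// BM25 contribution of one term in one document, built from the index and
// term statistics above.
double bm25_term_score(long N, long doc_freq, double term_freq,
                       double doc_len, double avgdl,
                       double k1 = 1.2, double b = 0.75) {
  // IDF from document frequency, matching the formula above.
  double idf = std::log((static_cast<double>(N) - doc_freq + 0.5)
                        / (doc_freq + 0.5) + 1.0);
  // Term-frequency saturation with document-length normalization.
  double norm = term_freq * (k1 + 1.0)
              / (term_freq + k1 * (1.0 - b + b * doc_len / avgdl));
  return idf * norm;
}
```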

32.7 Replication Metrics

The replication subsystem exposes comprehensive metrics for monitoring cluster health and performance.

32.7.1 Metrics Categories

class ReplicationMetrics {
public:
  // Transaction metrics
  void increment_transactions_replicated();
  void increment_transactions_failed();
  void add_bytes_replicated(uint64_t bytes);

  // Network metrics
  void increment_network_errors();
  void increment_messages_sent();
  void increment_messages_received();

  // Consistency metrics
  void increment_gaps_detected();
  void increment_gaps_recovered();
  void increment_out_of_order_detected();

  // Election metrics
  void increment_elections_started();
  void increment_elections_completed();
  void increment_role_changes();

  // Heartbeat metrics
  void increment_heartbeats_sent();
  void increment_heartbeats_received();
  void increment_heartbeat_failures();

  // Gauges
  void set_replication_lag(int64_t lag_ms);
  void set_connected_nodes(int32_t count);
  void set_current_role(NodeRole role);
};

32.7.2 Latency Histograms

Latency distributions use percentile tracking:

class LatencyHistogram {
public:
  void record(std::chrono::microseconds latency);

  struct Percentiles {
    std::chrono::microseconds p50;
    std::chrono::microseconds p95;
    std::chrono::microseconds p99;
    std::chrono::microseconds p999;
    std::chrono::microseconds max;
  };

  Percentiles get_percentiles() const;
};

Tracked latencies include:

  • Commit Latency: Time from commit request to acknowledgment
  • Replication Latency: Time to replicate to followers
  • Election Duration: Time to complete leader election
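
A nearest-rank percentile over recorded samples is enough to sketch how such a histogram answers get_percentiles() (a production implementation would typically bucket samples rather than store them all):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Nearest-rank percentile over latency samples (in microseconds).
// p is in [0, 1]; samples must be non-empty. The vector is copied so the
// caller's recording order is preserved.
long percentile(std::vector<long> samples, double p) {
  std::sort(samples.begin(), samples.end());
  std::size_t rank = static_cast<std::size_t>(p * (samples.size() - 1));
  return samples[rank];
}
```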

32.7.3 Per-Node Metrics

Individual node tracking enables targeted diagnosis:

struct NodeMetrics {
  NodeId node_id;
  bool is_connected;
  std::chrono::milliseconds last_heartbeat_ago;
  uint64_t bytes_sent;
  uint64_t bytes_received;
  uint64_t messages_sent;
  uint64_t messages_received;
  int64_t replication_lag_ms;
};

32.7.4 Metrics Snapshot

The complete metrics snapshot aggregates all telemetry:

struct MetricsSnapshot {
  // Transactions
  uint64_t total_transactions_replicated;
  uint64_t total_transactions_failed;
  double transactions_per_second;

  // Network
  uint64_t total_messages_sent;
  uint64_t total_messages_received;
  double messages_per_second;

  // Consistency
  uint32_t gaps_detected;
  uint32_t gaps_recovered;

  // Current state
  int64_t replication_lag_ms;
  int32_t connected_nodes;
  NodeRole current_role;
  SequenceNumber current_sequence_number;
  SequenceNumber applied_sequence_number;

  // Latencies
  LatencyHistogram::Percentiles commit_latency;
  LatencyHistogram::Percentiles replication_latency;

  std::chrono::system_clock::time_point timestamp;
};

32.7.5 Convenience Macros

Zero-overhead macros for metrics recording:

#define REPLICATION_METRICS_INCREMENT(counter)          \
  do {                                                  \
    if (ReplicationMetrics::instance().is_enabled()) {  \
      ReplicationMetrics::instance().counter();         \
    }                                                   \
  } while (0)

#define REPLICATION_METRICS_RECORD(histogram, value)    \
  do {                                                  \
    if (ReplicationMetrics::instance().is_enabled()) {  \
      ReplicationMetrics::instance().histogram(value);  \
    }                                                   \
  } while (0)

The enable check allows disabling metrics collection entirely in performance-critical scenarios.

32.8 JIT Execution Profiling

The JIT compiler uses execution profiling to make tiered compilation decisions.

32.8.1 Branch Profiling

Branch execution profiles guide branch prediction optimization:

struct BranchProfile {
  std::atomic<uint32_t> taken_count {0};
  std::atomic<uint32_t> not_taken_count {0};

  void record(bool taken);

  auto taken_ratio() const -> float {
    auto total = taken_count + not_taken_count;
    if (total == 0) return 0.5f;  // Unknown
    return static_cast<float>(taken_count) / total;
  }

  auto is_biased() const -> bool {
    auto ratio = taken_ratio();
    return ratio > 0.8f || ratio < 0.2f;
  }
};

Biased branches (taken more than 80% or less than 20% of the time) indicate optimization opportunities for branch elimination or code layout.

32.8.2 Type Profiling

Type profiles track value types at operation sites:

struct TypeProfile {
  static constexpr size_t kMaxTypes = 8;
  static constexpr uint32_t kMonomorphicThreshold = 95;

  std::array<std::atomic<uint32_t>, kMaxTypes> type_counts {};

  void record(JITType type);

  auto dominant_type() const -> std::optional<JITType> {
    // Return type if >= 95% of observations
  }

  auto is_monomorphic() const -> bool {
    return dominant_type().has_value();
  }

  auto is_polymorphic() const -> bool {
    return distinct_type_count() >= 2 && distinct_type_count() <= 3;
  }

  auto is_megamorphic() const -> bool {
    return distinct_type_count() > 3;
  }
};

Type stability determines specialization strategy:

  • Monomorphic (≥ 95% same type): Generate specialized code with type guard
  • Polymorphic (2-3 types): Inline cache with dispatch
  • Megamorphic (> 3 types): Generic handling
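
The monomorphic check reduces to finding a type whose count reaches the 95% threshold. A sketch over raw counts (non-atomic here for brevity):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Returns the index of the dominant type when it covers >= 95% of all
// observations at this site, mirroring kMonomorphicThreshold.
std::optional<std::size_t> dominant_type(
    const std::array<uint32_t, 8>& counts) {
  uint64_t total = 0;
  for (auto c : counts) total += c;
  if (total == 0) return std::nullopt;  // no observations yet
  for (std::size_t i = 0; i < counts.size(); ++i) {
    if (static_cast<uint64_t>(counts[i]) * 100 >= total * 95) return i;
  }
  return std::nullopt;
}
```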

32.8.3 Execution Statistics

Per-module execution statistics drive tiering:

struct ExecutionProfile {
  std::atomic<uint64_t> execution_count {0};
  std::atomic<uint64_t> total_time_ns {0};
  std::atomic<uint64_t> instructions_executed {0};

  std::chrono::steady_clock::time_point first_execution;
  std::chrono::steady_clock::time_point last_execution;

  std::atomic<bool> is_jit_compiled {false};
  std::atomic<uint8_t> current_tier {0};

  std::unordered_map<uint32_t, BranchProfile> branch_profiles;
  std::unordered_map<uint32_t, TypeProfile> type_profiles;

  void record_execution(uint64_t time_ns, uint64_t instr_count);

  auto average_time_ns() const -> uint64_t;
  auto execution_frequency() const -> double;
};

32.8.4 Tiered Compilation Thresholds

Compilation decisions use configurable thresholds:

struct TierThresholds {
  // Tier 1 (baseline JIT)
  uint64_t tier1_min_executions = 100;
  uint64_t tier1_min_time_ns = 1'000'000;  // 1ms total

  // Tier 2 (optimized JIT)
  uint64_t tier2_min_executions = 10'000;
  uint64_t tier2_min_time_ns = 100'000'000;  // 100ms total

  // Simple expressions stay interpreted
  uint64_t simple_expr_threshold_ns = 500;
};

The tiering decision follows:

\text{tier} = \begin{cases} 0 & \text{if } n < 100 \lor t < 1\text{ms} \\ 1 & \text{if } n < 10000 \lor t < 100\text{ms} \\ 2 & \text{otherwise} \end{cases}

where n is the execution count and t is the total accumulated time.
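
As a sketch, the decision compiles to two guard checks (default thresholds from TierThresholds hard-coded for illustration):

```cpp
#include <cstdint>

// Tier selection from execution count n and total accumulated time t_ns.
int select_tier(uint64_t n, uint64_t t_ns) {
  if (n < 100 || t_ns < 1'000'000) return 0;       // keep interpreting
  if (n < 10'000 || t_ns < 100'000'000) return 1;  // baseline JIT
  return 2;                                        // optimized JIT
}
```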

32.8.5 Execution Timer

RAII timing helper:

class ExecutionTimer {
public:
  ExecutionTimer(ExecutionProfiler& profiler, uint64_t module_hash)
      : profiler_(profiler), module_hash_(module_hash),
        start_(std::chrono::high_resolution_clock::now()) {}

  ~ExecutionTimer() {
    auto duration = std::chrono::high_resolution_clock::now() - start_;
    profiler_.record_execution(
        module_hash_,
        std::chrono::duration_cast<std::chrono::nanoseconds>(duration).count(),
        instr_count_);
  }

  void add_instructions(uint64_t count) {
    instr_count_ += count;
  }

private:
  ExecutionProfiler& profiler_;
  uint64_t module_hash_;
  std::chrono::high_resolution_clock::time_point start_;
  uint64_t instr_count_ = 0;
};

32.9 Query Plan Explanation

The EXPLAIN facility provides visibility into query planning decisions.

32.9.1 Explain Formats

enum class ExplainFormat : uint8_t {
  kText,  // Human-readable
  kJSON,  // Programmatic
};

enum class ExplainVerbosity : uint8_t {
  kBasic,    // Plan structure only
  kAnalyze,  // Include statistics
  kVerbose,  // All details
};

32.9.2 Explain Output

struct ExplainOutput {
  std::string formatted_output;
  int64_t estimated_rows = -1;
  int64_t estimated_cost = -1;
  bool uses_index = false;
  std::string index_name;
};

32.9.3 Text Format

Text format provides hierarchical plan visualization:

Limit(10)
+-- Sort(name ASC)
    +-- Filter(age > 18)
        +-- Scan(users)

32.9.4 JSON Format

JSON format enables programmatic analysis:

{
  "type": "Limit",
  "count": 10,
  "estimated_rows": 10,
  "input": {
    "type": "Sort",
    "keys": [{"field": "name", "order": "ASC"}],
    "estimated_rows": 1000,
    "input": {
      "type": "Filter",
      "condition": "age > 18",
      "selectivity": 0.3,
      "input": {
        "type": "Scan",
        "table": "users",
        "index": "users_age_idx"
      }
    }
  }
}

32.9.5 ExplainFormatter Implementation

class ExplainFormatter final {
public:
  auto format(const LogicalPlan& plan,
              ExplainFormat format = ExplainFormat::kText,
              ExplainVerbosity verbosity = ExplainVerbosity::kBasic)
      -> ExplainOutput;

  auto format_with_analysis(const LogicalPlan& plan,
                            const QueryAnalysis& analysis,
                            ExplainFormat format,
                            ExplainVerbosity verbosity) -> ExplainOutput;

  auto format_ast(const ast::QueryAST& ast,
                  ExplainFormat format) -> std::string;

private:
  auto format_operator_(const LogicalOperator* op, int depth,
                        ExplainFormat format, ExplainVerbosity verbosity)
      -> std::string;

  auto format_branch_(int depth, bool is_last) -> std::string;
};

32.10 Telemetry Integration

The telemetry subsystem provides event tracking for analytics and monitoring.

32.10.1 Event Tracking API

namespace telemetry {

bool initialize(const std::string_view& device_id,
                const std::string_view& login_name);

bool uninitialize();

bool track(const std::string_view& event_type,
           const ordered_json& event_properties);

}  // namespace telemetry

32.10.2 Event Categories

Tracked events include:

  • Query Events: Execution time, row counts, error conditions
  • System Events: Startup, shutdown, configuration changes
  • Performance Events: Slow queries, resource exhaustion
  • Error Events: Exceptions, failures, recovery actions

32.11 Summary

Cognica's observability infrastructure provides comprehensive visibility into system behavior:

  1. Structured Logging: Categorized, level-filtered logging with source location tracking enables targeted diagnostics

  2. System Profiler: Hierarchical timing and memory tracking with per-thread visibility reveals performance bottlenecks

  3. CVM Tracer: Detailed bytecode execution tracing enables debugging of query compilation and runtime behavior

  4. Index Statistics: Histograms, multi-column statistics, and functional dependencies drive accurate selectivity estimation

  5. Replication Metrics: Comprehensive transaction, network, consistency, and latency metrics monitor cluster health

  6. JIT Profiling: Branch and type profiling guide tiered compilation decisions for optimal code generation

  7. Query Explanation: EXPLAIN output in text and JSON formats provides visibility into query planning decisions

The observability architecture follows the principle that production systems require comprehensive instrumentation. The overhead of observability is justified by the operational benefits—faster diagnosis, better optimization, and confident operation. A database without observability is a black box; a database with comprehensive observability becomes a transparent system that operators can understand, tune, and trust.

The layered design—from low-level execution tracing to high-level metrics aggregation—provides appropriate visibility at each level of abstraction. Developers debugging CVM bytecode need instruction-level traces; operators monitoring cluster health need aggregate metrics and percentile latencies. Cognica's observability stack serves both audiences through a unified architecture.

Copyright (c) 2023-2026 Cognica, Inc.