Appendix C: Configuration Reference
This appendix provides a comprehensive reference for all Cognica configuration options. The configuration system uses YAML format with support for size suffixes, nested structures, and sensible defaults. Understanding these options enables operators to tune Cognica for specific workloads and deployment environments.
C.1 Configuration System Overview
C.1.1 Configuration File Format
Cognica uses YAML as the primary configuration format, supporting:
- Nested structures: Hierarchical organization of related options
- Size suffixes: Human-readable size specifications (KB, MB, GB)
- Comments: Documentation within configuration files
- Defaults: Sensible defaults for all options
Configuration File Locations:
| File | Purpose |
|---|---|
| conf/default.yaml | Default configuration template |
| conf/server.yaml | Production server configuration |
Size Suffix Support:
| Suffix | Multiplier | Example |
|---|---|---|
| B | 1 | 1024B = 1024 bytes |
| K, KB | 1024 | 256KB = 262,144 bytes |
| M, MB | 1024^2 | 256MB = 268,435,456 bytes |
| G, GB | 1024^3 | 16GB = 17,179,869,184 bytes |
| T, TB | 1024^4 | 1TB = 1,099,511,627,776 bytes |
Underscores are optional for readability: 256_MB equals 256MB.
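The suffix rules above can be illustrated with a small parser. This is a minimal sketch, not Cognica's actual implementation: the function name parse_size is hypothetical, and two behaviors are assumptions not stated in the table (suffixes are matched case-insensitively, and a bare number is treated as a byte count).

```python
import re

# Multipliers taken from the suffix table above (powers of 1024).
_MULTIPLIERS = {
    "B": 1,
    "K": 1024, "KB": 1024,
    "M": 1024**2, "MB": 1024**2,
    "G": 1024**3, "GB": 1024**3,
    "T": 1024**4, "TB": 1024**4,
}

_SIZE_RE = re.compile(r"(\d+)([A-Za-z]*)")


def parse_size(text: str) -> int:
    """Convert a size string such as '256_MB' into a byte count."""
    # Underscores are purely cosmetic, so drop them before matching.
    cleaned = str(text).replace("_", "").strip()
    match = _SIZE_RE.fullmatch(cleaned)
    if match is None:
        raise ValueError(f"not a size value: {text!r}")
    number, suffix = match.groups()
    if suffix == "":
        return int(number)  # assumption: a bare number means bytes
    try:
        # assumption: suffixes are matched case-insensitively
        return int(number) * _MULTIPLIERS[suffix.upper()]
    except KeyError:
        raise ValueError(f"unknown size suffix: {suffix!r}") from None
```

For example, parse_size("256_MB") and parse_size("256MB") both yield 268,435,456 bytes, matching the table.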
C.1.2 Configuration Structure
# Root configuration structure
thread_pool: # Thread pool configuration
  ...
db: # Database storage configuration
  ...
sql: # SQL execution configuration
  ...
logger: # Logging configuration
  ...
model_serving: # ML model serving configuration
  ...
scheduler: # Task scheduler configuration
  ...
network: # Network services configuration
  ...
replication: # Cluster replication configuration
  ...
C.2 Thread Pool Configuration
Thread pools control parallelism for different operation categories. Setting num_threads: 0 enables automatic detection based on available CPU cores.
C.2.1 Thread Pool Options
thread_pool:
  generic:
    num_threads: 8 # General-purpose operations
  batch:
    num_threads: 16 # Batch processing operations
  query:
    num_threads: 64 # SQL query execution
  schema:
    num_threads: 4 # Schema operations (DDL)
  disk_io:
    num_threads: 8 # Disk I/O operations
| Option | Type | Default | Description |
|---|---|---|---|
| generic.num_threads | int32 | 0 (auto) | Generic operation threads |
| batch.num_threads | int32 | 0 (auto) | Batch operation threads |
| query.num_threads | int32 | 0 (auto) | Query execution threads |
| schema.num_threads | int32 | 0 (auto) | Schema operation threads |
| disk_io.num_threads | int32 | 0 (auto) | Disk I/O threads |
Tuning Guidelines:
- Query threads: Set to 2-4x CPU cores for OLTP workloads
- Batch threads: Set equal to CPU cores for bulk operations
- Disk I/O threads: Set based on storage device parallelism (NVMe: 8-16, SSD: 4-8, HDD: 2-4)
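The guidelines above reduce to simple arithmetic on the core count. The following sketch turns them into starting ranges; the helper name suggest_thread_pool and the storage labels used as dictionary keys are hypothetical, and the results are starting points to validate under load rather than fixed recommendations.

```python
# Disk I/O thread ranges from the tuning guidelines above.
DISK_IO_THREADS = {"nvme": (8, 16), "ssd": (4, 8), "hdd": (2, 4)}


def suggest_thread_pool(cpu_cores: int, storage: str) -> dict:
    """Return (low, high) starting ranges for an OLTP-style deployment."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be positive")
    return {
        "query": (2 * cpu_cores, 4 * cpu_cores),  # 2-4x CPU cores
        "batch": (cpu_cores, cpu_cores),          # equal to CPU cores
        "disk_io": DISK_IO_THREADS[storage.lower()],
    }
```

On a 16-core host with NVMe storage this gives 32-64 query threads, 16 batch threads, and 8-16 disk I/O threads.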
C.3 Storage Configuration
Storage configuration controls the LSM-tree storage engine, caching layers, and compression.
C.3.1 Block-Based Table Options
Controls SST file format and block organization:
db:
  table:
    format_version: 6
    block_size: 32KB
    index_shortening: kShortenSeparators
    enable_index_compression: true
    block_restart_interval: 16
    index_block_restart_interval: 1
    data_block_hash_table_util_ratio: 0.75
    optimize_filters_for_memory: true
    partition_filters: true
    decouple_partitioned_filters: true
    metadata_block_size: 4KB
    cache_index_and_filter_blocks: true
    cache_index_and_filter_blocks_with_high_priority: true
    whole_key_filtering: true
    filter_policy: ribbon
| Option | Type | Default | Description |
|---|---|---|---|
| format_version | uint32 | 6 | Block format version (4-6) |
| block_size | size | 32KB | Data block size |
| index_shortening | string | kShortenSeparators | Index key shortening strategy |
| enable_index_compression | bool | true | Compress block indices |
| block_restart_interval | int32 | 16 | Keys between restart points |
| data_block_hash_table_util_ratio | double | 0.75 | Hash table utilization |
| optimize_filters_for_memory | bool | true | Memory-optimize filters |
| partition_filters | bool | true | Partition bloom filters |
| filter_policy | string | ribbon | Filter type (bloom, ribbon) |
Index Shortening Strategies:
| Strategy | Description |
|---|---|
| kNoShortening | No key shortening |
| kShortenSeparators | Shorten separator keys |
| kShortenSeparatorsAndSuccessor | Shorten separators and successors |
C.3.2 Block Cache Configuration
L1 cache for frequently accessed SST blocks:
db:
  block_cache:
    enabled: true
    cache_capacity: 16GB
    cache_shard_bits: 4
    strict_capacity_limit: true
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable block cache |
| cache_capacity | size | 2GB | Total cache capacity |
| cache_shard_bits | int32 | 3 | Shards = 2^N (concurrency) |
| strict_capacity_limit | bool | false | Enforce strict capacity |
Capacity Guidelines:
| Workload | Recommended Cache Size |
|---|---|
| Small (< 100GB data) | 25-50% of data size |
| Medium (100GB-1TB) | 10-25% of data size |
| Large (> 1TB) | 8-16GB minimum |
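The capacity guidelines and the shard formula can be expressed as a short calculation. This is a sketch under stated assumptions: the helper names are hypothetical, a data size of exactly 1TB is treated as "medium", and the "8-16GB minimum" row is read as a lower range rather than a cap.

```python
GB = 1024**3
TB = 1024**4


def recommend_block_cache(data_size: int) -> tuple[int, int]:
    """Return a (low, high) cache_capacity range in bytes for a data size."""
    if data_size < 100 * GB:   # small: 25-50% of data size
        return data_size * 25 // 100, data_size * 50 // 100
    if data_size <= 1 * TB:    # medium: 10-25% of data size
        return data_size * 10 // 100, data_size * 25 // 100
    return 8 * GB, 16 * GB     # large: 8-16GB as a floor


def shard_count(cache_shard_bits: int) -> int:
    """Shards = 2^N, as stated in the options table."""
    return 1 << cache_shard_bits
```

A 40GB data set maps to a 10-20GB cache, and cache_shard_bits: 4 yields 16 shards.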
C.3.3 Secondary Cache Configuration
L2 cache for block cache overflow:
db:
  secondary_cache:
    enabled: false
    cache_capacity: 8GB
    cache_shard_bits: 3
    strict_capacity_limit: false
    enable_custom_split_merge: false
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable secondary cache |
| cache_capacity | size | 2GB | Secondary cache capacity |
| cache_shard_bits | int32 | 3 | Number of shards |
| enable_custom_split_merge | bool | false | Custom split/merge logic |
C.3.4 Row Cache Configuration
In-memory cache for complete rows:
db:
  row_cache:
    enabled: false
    cache_capacity: 4GB
    cache_shard_bits: 3
    strict_capacity_limit: false
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable row cache |
| cache_capacity | size | 2GB | Row cache capacity |
| cache_shard_bits | int32 | 3 | Number of shards |
When to Enable Row Cache:
- Point lookups dominate workload
- Hot rows fit in memory
- Block cache has high miss rate
C.3.5 Compression Configuration
Data compression for SST files:
db:
  compression:
    enabled: true
    algorithm: zstd
    max_dict_bytes: 32KB
    zstd_max_train_bytes: 3MB
    parallel_threads: 2
  bottommost_compression:
    enabled: true
    algorithm: zstd
    max_dict_bytes: 65536
    zstd_max_train_bytes: 10MB
  compression_per_level:
    - lz4
    - lz4
    - lz4
    - lz4
    - zstd
    - zstd
    - zstd
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable compression |
| algorithm | string | zstd | Compression algorithm |
| max_dict_bytes | size | 32KB | Dictionary size limit |
| zstd_max_train_bytes | size | 3MB | ZSTD training data |
| parallel_threads | size | 2 | Parallel compression threads |
Compression Algorithms:
| Algorithm | Speed | Ratio | CPU | Use Case |
|---|---|---|---|---|
| none | Fastest | 1:1 | None | NVMe, hot data |
| lz4 | Fast | 2-3:1 | Low | Upper LSM levels |
| snappy | Fast | 2-3:1 | Low | General purpose |
| zstd | Medium | 3-5:1 | Medium | Lower LSM levels |
C.3.6 Compaction Configuration
LSM-tree compaction triggers and behavior:
db:
  auto_compaction:
    sliding_window_size: 256000
    deletion_trigger: 64000
    deletion_ratio: 0.0
  manual_compaction:
    exclusive_manual_compaction: false
    change_level: true
    bottommost_level_compaction: kIfHaveCompactionFilter
    allow_write_stall: false
    max_subcompactions: 2
Auto Compaction Options:
| Option | Type | Default | Description |
|---|---|---|---|
| sliding_window_size | size | 256K | Consecutive entries in the sliding window |
| deletion_trigger | size | 64K | Deletions within the window that trigger compaction |
| deletion_ratio | double | 0.0 | Deletion ratio threshold |
Manual Compaction Options:
| Option | Type | Default | Description |
|---|---|---|---|
| exclusive_manual_compaction | bool | false | Exclusive compaction lock |
| change_level | bool | true | Allow level changes |
| bottommost_level_compaction | string | kIfHaveCompactionFilter | Bottommost strategy |
| max_subcompactions | uint32 | 2 | Parallel subcompactions |
C.3.7 Core Storage Options
Fundamental storage engine parameters:
db:
  storage:
    db_path: ./data/cognica.db
    info_log_level: info
    # Write buffer configuration
    write_buffer_size: 256MB
    min_write_buffer_number_to_merge: 1
    max_write_buffer_number: 16
    arena_block_size: 16MB
    # WAL configuration
    max_total_wal_size: 4GB
    wal_bytes_per_sync: 128MB
    wal_compression: zstd
    wal_ttl_seconds: 3600
    wal_size_limit_mb: 0
    # Background operations
    max_background_jobs: 8
    max_subcompactions: 2
    max_open_files: -1
    max_file_opening_threads: 32
    # Read/Write optimization
    readahead_size: 256KB
    compaction_readahead_size: 8MB
    bulk_scan_fill_cache: false
    advise_random_on_open: true
    # Write optimization
    enable_pipelined_write: true
    enable_unordered_write: false
    avoid_unnecessary_blocking_io: true
    # Direct I/O
    use_direct_reads: false
    use_direct_io_for_flush_and_compaction: false
    writable_file_max_buffer_size: 16MB
    # Transaction
    transaction_lock_timeout: 5000
    # Maintenance
    keep_log_file_num: 10
    recycle_log_file_num: 16
    periodic_compaction_seconds: 0xfffffffffffffffe
    # Debugging
    paranoid_checks: true
    force_consistency_checks: false
    dump_malloc_stats: true
    report_bg_io_stats: true
    dump_storage_stats: true
Key Storage Options:
| Option | Type | Default | Description |
|---|---|---|---|
| db_path | string | ./data/cognica.db | Database directory |
| write_buffer_size | size | 256MB | Memtable size |
| max_write_buffer_number | int32 | 16 | Max pending memtables |
| max_background_jobs | int32 | 8 | Background job threads |
| max_open_files | int32 | -1 | Max file descriptors (-1=unlimited) |
| transaction_lock_timeout | int64 | 5000 | Lock timeout (ms) |
| enable_pipelined_write | bool | true | Pipelined writes |
Write Buffer Tuning:
| Workload | write_buffer_size | max_write_buffer_number |
|---|---|---|
| OLTP | 64-128MB | 4-8 |
| Mixed | 256MB | 16 |
| Bulk Load | 512MB-1GB | 32-64 |
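When choosing a row from the table above, it helps to estimate how much memory the memtables can occupy. The sketch below assumes that each memtable may grow to write_buffer_size and that up to max_write_buffer_number of them can exist at once, which is a reading of the option descriptions rather than a documented guarantee; the helper name is hypothetical.

```python
MB = 1024**2
GB = 1024**3


def max_memtable_bytes(write_buffer_size: int, max_write_buffer_number: int) -> int:
    """Rough upper bound on memory held by memtables (assumed model)."""
    return write_buffer_size * max_write_buffer_number
```

Under this assumption the "Mixed" row (256MB x 16) can hold up to 4GB in memtables, and the upper end of "Bulk Load" (1GB x 64) up to 64GB, so the bulk-load settings only suit hosts with memory to spare.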
C.3.8 Statistics Configuration
Performance statistics collection:
db:
  statistics:
    enabled: true
    level: kExceptDetailedTimers
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable statistics |
| level | string | kExceptDetailedTimers | Statistics detail level |
Statistics Levels:
| Level | Description |
|---|---|
| kExceptDetailedTimers | All except detailed timing |
| kExceptTimersAndLocking | Exclude timers and lock stats |
| kAll | Full statistics (performance impact) |
C.3.9 Rate Limiter Configuration
I/O rate limiting for background operations:
db:
  rate_limiter:
    enabled: false
    rate_bytes_per_sec: 512MB
    auto_tuned: true
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable rate limiting |
| rate_bytes_per_sec | int64 | 512MB | Rate limit |
| auto_tuned | bool | true | Auto-tune rate |
C.3.10 Encryption Configuration
At-rest encryption for database files:
db:
  encryption:
    enabled: false
    algorithm: aes-256-ctr
    key_file: /etc/cognica/encryption.key
    key_format: hex
    prefix_length: 4096
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable encryption |
| algorithm | string | aes-256-ctr | Encryption algorithm |
| key_file | string | "" | Encryption key file path |
| key_format | string | hex | Key encoding (raw, hex, base64) |
| prefix_length | size | 4096 | Prefix for direct I/O alignment |
C.4 Document Database Configuration
Configuration for document database operations including sorting, joining, and query optimization.
C.4.1 Sort Configuration
db:
  document:
    sort:
      memory_limit: 256MB
      spill_batch_size: 10000
      max_merge_width: 16
      enable_parallel_sort: true
      temp_directory: ""
| Option | Type | Default | Description |
|---|---|---|---|
| memory_limit | size | 256MB | Memory for sort operations |
| spill_batch_size | uint64 | 10000 | Rows per spill batch |
| max_merge_width | int32 | 16 | Max merge streams |
| enable_parallel_sort | bool | true | Multi-threaded sorting |
| temp_directory | string | "" | Temp directory for spill |
C.4.2 Join Configuration
db:
  document:
    join:
      memory_limit: 256MB
      num_partitions: 64
      temp_directory: ""
| Option | Type | Default | Description |
|---|---|---|---|
| memory_limit | size | 256MB | Memory for hash joins |
| num_partitions | int32 | 64 | Hash partitions |
| temp_directory | string | "" | Temp directory for spill |
C.4.3 CVM Configuration (Document Pipeline)
db:
  document:
    cvm:
      enabled: true
      memory_limit: 256MB
      spill_threshold: 0.8
      temp_directory: ""
      cache_max_entries: 1024
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable CVM compilation |
| memory_limit | size | 256MB | Execution memory limit |
| spill_threshold | double | 0.8 | Fraction of memory_limit that triggers spill |
| cache_max_entries | size | 1024 | Bytecode cache entries |
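The interaction between memory_limit and spill_threshold is easiest to see as a byte count. The sketch below assumes the threshold is a fraction of memory_limit in the range (0, 1]; the function name is hypothetical and the exact rounding Cognica applies is not documented here.

```python
def spill_point_bytes(memory_limit: int, spill_threshold: float) -> int:
    """Bytes of memory in use at which spilling would begin (assumed model)."""
    if not 0.0 < spill_threshold <= 1.0:
        raise ValueError("spill_threshold must be in (0, 1]")
    # Truncate toward zero; the real rounding rule is an assumption.
    return int(memory_limit * spill_threshold)
```

With the defaults (256MB and 0.8), spilling would start at roughly 205MB of usage, leaving about 51MB of headroom below the hard limit.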
C.4.4 Optimizer Configuration
db:
  document:
    optimizer:
      enabled: true
      memory_budget: 256MB
      max_optimization_passes: 10
      enable_filter_pushdown: true
      enable_topk_optimization: true
      enable_index_selection: true
      enable_cost_based_join: true
      spill_directory: ""
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable cost-based optimizer |
| memory_budget | size | 256MB | Optimizer memory budget |
| max_optimization_passes | uint32 | 10 | Max optimization iterations |
| enable_filter_pushdown | bool | true | Push filters to scans |
| enable_topk_optimization | bool | true | Top-K optimization |
| enable_index_selection | bool | true | Automatic index selection |
| enable_cost_based_join | bool | true | Cost-based join ordering |
C.5 Full-Text Search Configuration
Configuration for full-text search and vector similarity search.
C.5.1 FTS Query Cache
db:
  fts:
    query_cache:
      enabled: true
      capacity: 16MB
      num_shards: 8
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable FTS query cache |
| capacity | size | 16MB | Cache capacity |
| num_shards | size | 8 | Cache shards |
C.5.2 HNSW Index Configuration
db:
  fts:
    hnsw_index:
      flush_before_search: false
| Option | Type | Default | Description |
|---|---|---|---|
| flush_before_search | bool | false | Flush index before search |
C.5.3 FTS Options
db:
  fts:
    enable_match_all_query: false
    default_top_k: 100
    max_query_terms: 1024
    max_search_limit: 100000
| Option | Type | Default | Description |
|---|---|---|---|
| enable_match_all_query | bool | false | Allow match-all queries |
| default_top_k | int64 | 100 | Default result limit |
| max_query_terms | int64 | 1024 | Max terms per query |
| max_search_limit | int64 | 100000 | Max results returned |
C.6 SQL Execution Configuration
Configuration for SQL query execution, caching, and compilation.
C.6.1 SQL Query Cache
sql:
  query_cache:
    enabled: true
    max_capacity_bytes: 64MB
    num_shards: 8
    ttl_seconds: 60
    max_result_size_bytes: 1MB
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable query result cache |
| max_capacity_bytes | size | 64MB | Maximum cache size |
| num_shards | size | 8 | Cache shards for concurrency |
| ttl_seconds | uint64 | 60 | Cache entry TTL |
| max_result_size_bytes | size | 1MB | Max cacheable result size |
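The two limits max_result_size_bytes and ttl_seconds together decide which results enter the cache and how long they stay. The following sketch models that behavior with hypothetical names (CacheEntry, is_cacheable, is_expired); treating an entry as expired once its age reaches the TTL exactly is an assumption, as is the admission rule itself.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    size_bytes: int
    stored_at: float  # seconds, from any monotonic clock


def is_cacheable(result_size: int, max_result_size_bytes: int = 1024**2) -> bool:
    """Results larger than the limit bypass the cache (assumed behavior)."""
    return result_size <= max_result_size_bytes


def is_expired(entry: CacheEntry, now: float, ttl_seconds: int = 60) -> bool:
    """An entry is stale once its age reaches the TTL (assumed boundary)."""
    return now - entry.stored_at >= ttl_seconds
```

With the defaults, a 512KB result is admitted and served for up to 60 seconds, while a 2MB result is never cached regardless of how often the query repeats.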
C.6.2 JIT Compilation Configuration
sql:
  jit:
    enabled: true
    cache_max_entries: 4096
    cache_max_bytes: 32MB
    min_ops_threshold: 3
    min_rows_simple: 1000
    min_rows_complex: 500
    always_jit_ops: 10
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable JIT compilation |
| cache_max_entries | size | 4096 | Compiled code cache entries |
| cache_max_bytes | size | 32MB | Compiled code cache size |
| min_ops_threshold | int32 | 3 | Min operations to JIT |
| min_rows_simple | int64 | 1000 | Min rows for simple JIT |
| min_rows_complex | int64 | 500 | Min rows for complex JIT |
| always_jit_ops | int32 | 10 | Always JIT with N+ ops |
C.6.3 CVM Configuration (SQL)
sql:
  cvm:
    enabled: true
    memory_limit: 256MB
    spill_threshold: 0.8
    temp_directory: ""
    cache_max_entries: 4096
    min_rows_threshold: 100
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable CVM bytecode |
| memory_limit | size | 256MB | Execution memory limit |
| spill_threshold | double | 0.8 | Spill trigger threshold |
| cache_max_entries | size | 4096 | Bytecode module cache |
| min_rows_threshold | int64 | 100 | Min rows to use CVM |
C.7 Logging Configuration
Configuration for application logging with multiple categories.
C.7.1 Logger Configuration
logger:
  general:
    level: info
    sinks:
      - stdout
      - type: file
        max_files: 5
  error:
    level: error
    sinks:
      - stderr
      - type: file
        max_files: 10
  access:
    level: info
    sinks:
      - type: file
        max_files: 5
  querylog:
    level: info
    sinks:
      - stdout
  slowlog:
    level: info
    sinks:
      - type: file
        max_files: 5
  system:
    level: info
    sinks:
      - stdout
Logger Categories:
| Category | Purpose |
|---|---|
| general | General application logs |
| error | Error and exception logs |
| access | Access and authentication logs |
| querylog | Query execution logs |
| slowlog | Slow query logs |
| system | System operation logs |
Log Levels:
| Level | Description |
|---|---|
| trace | Detailed tracing information |
| debug | Debug information |
| info | Informational messages |
| warn | Warning messages |
| error | Error messages |
| critical | Critical errors |
Sink Types:
| Sink | Description |
|---|---|
| stdout | Standard output |
| stderr | Standard error |
| file | Rotating file sink |
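A logger's level option is conventionally a minimum severity: messages below it are dropped. The sketch below assumes Cognica follows that convention and that the levels are ordered as listed in the Log Levels table; the names LEVELS and is_enabled are hypothetical.

```python
# Severity order as listed in the Log Levels table, lowest first.
LEVELS = ["trace", "debug", "info", "warn", "error", "critical"]


def is_enabled(logger_level: str, message_level: str) -> bool:
    """True if a message at message_level passes a logger set to logger_level."""
    return LEVELS.index(message_level) >= LEVELS.index(logger_level)
```

Under this model the error category in the example above (level: error) would record error and critical messages only, while general (level: info) would also record info and warn.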
C.8 Network Configuration
Configuration for network services including HTTP, Flight SQL, and PostgreSQL wire protocol.
C.8.1 Network Bindings
network:
  bindings:
    - host: 0.0.0.0
      port: 10080
    - host: 0.0.0.0
      port: 10443
      ssl:
        enabled: true
        private_key: ./etc/cert/server.key
        cert_chain: ./etc/cert/server.crt
| Option | Type | Default | Description |
|---|---|---|---|
| host | string | "" | Bind address |
| port | uint16 | 0 | Port number |
| ssl.enabled | bool | false | Enable TLS |
| ssl.private_key | string | "" | Private key file (PEM) |
| ssl.cert_chain | string | "" | Certificate chain (PEM) |
| ssl.root_certs | string | "" | CA certificates (PEM) |
C.8.2 HTTP Configuration
network:
  http:
    enabled: false
    host: 0.0.0.0
    port: 8080
    num_threads: 4
    request_timeout_ms: 30000
    max_body_size: 16MB
    ssl:
      enabled: false
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable HTTP server |
| host | string | 0.0.0.0 | Bind address |
| port | uint16 | 8080 | HTTP port |
| num_threads | int32 | 4 | Server threads |
| request_timeout_ms | int64 | 30000 | Request timeout |
| max_body_size | int64 | 16MB | Max request body |
C.8.3 Flight SQL Configuration
network:
  flight_sql:
    enabled: false
    host: 0.0.0.0
    port: 31337
    max_batch_size: 65536
    statement_timeout_ms: 0
    statement_cache_ttl_s: 300
    ssl:
      enabled: false
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable Flight SQL |
| host | string | 0.0.0.0 | Bind address |
| port | uint16 | 31337 | Flight SQL port |
| max_batch_size | int64 | 65536 | Max rows per batch |
| statement_timeout_ms | int64 | 0 | Statement timeout (0=unlimited) |
| statement_cache_ttl_s | int64 | 300 | Statement cache TTL |
C.8.4 PostgreSQL Wire Protocol Configuration
network:
  pgsql:
    enabled: true
    host: 0.0.0.0
    port: 5432
    num_threads: 16
    statement_timeout_ms: 0
    idle_session_timeout_ms: 0
    max_connections: 100
    max_message_size: 1GB
    default_fetch_size: 10000
    auth:
      method: scram-sha-256
    ssl:
      enabled: false
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable PostgreSQL protocol |
| host | string | 0.0.0.0 | Bind address |
| port | uint16 | 5432 | PostgreSQL port |
| num_threads | int32 | 4 | Server threads |
| statement_timeout_ms | int64 | 0 | Statement timeout |
| idle_session_timeout_ms | int64 | 0 | Idle connection timeout |
| max_connections | int32 | 100 | Max concurrent connections |
| max_message_size | int64 | 1GB | Max message size |
| default_fetch_size | int64 | 10000 | Default cursor fetch size |
| auth.method | string | scram-sha-256 | Authentication method |
Authentication Methods:
| Method | Description |
|---|---|
| trust | No authentication |
| scram-sha-256 | SCRAM-SHA-256 authentication |
C.9 Replication Configuration
Configuration for cluster replication using Raft consensus.
C.9.1 Basic Replication Configuration
replication:
  enabled: true
  current_node:
    node_id: primary-1
    host: 192.168.1.10
    port: 20080
    role: primary
  nodes:
    - node_id: primary-1
      host: 192.168.1.10
      port: 20080
      role: primary
    - node_id: secondary-1
      host: 192.168.1.11
      port: 20080
      role: secondary
    - node_id: secondary-2
      host: 192.168.1.12
      port: 20080
      role: secondary
| Option | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable replication |
| current_node.node_id | string | "" | This node's identifier |
| current_node.host | string | 127.0.0.1 | This node's address |
| current_node.port | uint16 | 9090 | Replication port |
| current_node.role | string | secondary | Node role (primary/secondary) |
C.9.2 Replication Timing Configuration
replication:
  max_batch_size: 1MB
  max_batch_delay_ms: 100
  heartbeat_interval_ms: 1000
  election_timeout_ms: 5000
  connection_timeout_ms: 5000
  sync_retry_interval_ms: 1000
  max_sync_retries: 3
  strict_consistency: false
| Option | Type | Default | Description |
|---|---|---|---|
| max_batch_size | size | 1MB | Max replication batch |
| max_batch_delay_ms | uint64 | 100 | Max batch delay |
| heartbeat_interval_ms | uint64 | 1000 | Heartbeat interval |
| election_timeout_ms | uint64 | 5000 | Raft election timeout |
| connection_timeout_ms | uint64 | 5000 | Connection timeout |
| sync_retry_interval_ms | uint64 | 1000 | Sync retry interval |
| max_sync_retries | int32 | 3 | Max sync retries |
| strict_consistency | bool | false | Require strong consistency |
Raft Timing Guidelines:
- election_timeout_ms should be 5-10x heartbeat_interval_ms
- Network latency affects optimal timeout values
- Higher timeouts improve stability but increase failover time
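The 5-10x guideline is easy to check mechanically before deploying a configuration. The sketch below is a hypothetical pre-flight check, not part of Cognica; it uses integer arithmetic so that boundary values such as exactly 5x are accepted.

```python
def check_raft_timing(heartbeat_interval_ms: int, election_timeout_ms: int) -> bool:
    """True if election_timeout_ms is within 5-10x of heartbeat_interval_ms."""
    if heartbeat_interval_ms <= 0:
        raise ValueError("heartbeat_interval_ms must be positive")
    low = 5 * heartbeat_interval_ms
    high = 10 * heartbeat_interval_ms
    return low <= election_timeout_ms <= high
```

The defaults (1000 ms heartbeat, 5000 ms election timeout) sit at the lower bound of the range. A 2000 ms election timeout with the same heartbeat would fail the check and risks spurious elections when a few heartbeats are delayed.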
C.9.3 Replication TLS Configuration
replication:
  tls:
    enable_ssl: true
    server_cert_file: /etc/cognica/certs/server.crt
    server_key_file: /etc/cognica/certs/server.key
    ca_cert_file: /etc/cognica/certs/ca.crt
    verify_peer: true
    verify_hostname: true
    min_tls_version: 0
    cipher_list: "HIGH:!aNULL:!MD5:!RC4"
| Option | Type | Default | Description |
|---|---|---|---|
| enable_ssl | bool | false | Enable TLS |
| server_cert_file | string | "" | Server certificate |
| server_key_file | string | "" | Server private key |
| ca_cert_file | string | "" | CA certificate |
| verify_peer | bool | true | Verify peer certificate |
| verify_hostname | bool | true | Verify hostname match |
| min_tls_version | int | 0 | Min TLS version (0=1.2, 1=1.3) |
C.10 Scheduler Configuration
Configuration for scheduled maintenance tasks.
C.10.1 Task Group Configuration
scheduler:
  task_groups:
    - name: "Daily Maintenance"
      enabled: true
      schedule:
        invoke_at_startup: false
        invocation_policy:
          type: ScheduledInvocationPolicy
          context:
            timezone: UTC
            day_of_week: kSunday
            0-6: 3600 # Every hour on Sunday
            7-23: 7200 # Every 2 hours other days
      tasks:
        - type: DatabaseCompactionTask
          context:
            target_level: -1
    - name: "Backup"
      enabled: true
      schedule:
        invoke_at_startup: false
        invocation_policy:
          type: ScheduledInvocationPolicy
          context:
            timezone: UTC
            0-23: 86400 # Once per day
      tasks:
        - type: DatabaseBackupTask
          context:
            backup_path: /backup/cognica
Task Types:
| Type | Description |
|---|---|
| DatabaseCompactionTask | LSM-tree compaction |
| DatabaseBackupTask | Database backup |
C.11 Model Serving Configuration
Configuration for ML model serving (embeddings, LLMs).
C.11.1 Sentence Transformers
model_serving:
  sentence_transformers:
    sentence_encoders:
      - all-MiniLM-L6-v2
      - paraphrase-multilingual-MiniLM-L12-v2
    cross_encoders:
      - cross-encoder/ms-marco-MiniLM-L-6-v2
    clip_encoders: []
    qa_encoders: []
| Option | Type | Default | Description |
|---|---|---|---|
| sentence_encoders | list | [] | Sentence embedding models |
| cross_encoders | list | [] | Cross-encoder reranking models |
| clip_encoders | list | [] | CLIP multimodal models |
| qa_encoders | list | [] | Question-answering models |
C.11.2 Large Language Models
model_serving:
  large_language_models:
    - name: gpt-4
      context:
        api_key_env: OPENAI_API_KEY
        endpoint: https://api.openai.com/v1
| Option | Type | Default | Description |
|---|---|---|---|
| name | string | "" | Model identifier |
| context | object | {} | Model-specific configuration |
C.12 Configuration Examples
C.12.1 Minimal Development Configuration
db:
  storage:
    db_path: ./data/dev.db
    info_log_level: debug
network:
  bindings:
    - host: 127.0.0.1
      port: 10080
  pgsql:
    enabled: true
    port: 5432
    auth:
      method: trust
logger:
  general:
    level: debug
    sinks:
      - stdout
C.12.2 Production Single-Node Configuration
thread_pool:
  query:
    num_threads: 64
db:
  block_cache:
    cache_capacity: 16GB
    strict_capacity_limit: true
  storage:
    db_path: /var/lib/cognica/data
    write_buffer_size: 512MB
    max_write_buffer_number: 32
    max_background_jobs: 16
sql:
  query_cache:
    enabled: true
    max_capacity_bytes: 256MB
  cvm:
    memory_limit: 1GB
network:
  bindings:
    - host: 0.0.0.0
      port: 10080
  pgsql:
    enabled: true
    port: 5432
    max_connections: 500
    num_threads: 32
logger:
  general:
    level: info
    sinks:
      - type: file
        max_files: 10
  slowlog:
    level: info
    sinks:
      - type: file
        max_files: 5
C.12.3 High-Availability Cluster Configuration
thread_pool:
  query:
    num_threads: 64
db:
  block_cache:
    cache_capacity: 32GB
  storage:
    db_path: /var/lib/cognica/data
    write_buffer_size: 512MB
    max_background_jobs: 16
network:
  bindings:
    - host: 0.0.0.0
      port: 10080
      ssl:
        enabled: true
        private_key: /etc/cognica/certs/server.key
        cert_chain: /etc/cognica/certs/server.crt
  pgsql:
    enabled: true
    port: 5432
    max_connections: 1000
    ssl:
      enabled: true
replication:
  enabled: true
  current_node:
    node_id: node-1
    host: 192.168.1.10
    port: 20080
    role: primary
  nodes:
    - node_id: node-1
      host: 192.168.1.10
      port: 20080
      role: primary
    - node_id: node-2
      host: 192.168.1.11
      port: 20080
      role: secondary
    - node_id: node-3
      host: 192.168.1.12
      port: 20080
      role: secondary
  heartbeat_interval_ms: 500
  election_timeout_ms: 3000
  tls:
    enable_ssl: true
    server_cert_file: /etc/cognica/certs/repl-server.crt
    server_key_file: /etc/cognica/certs/repl-server.key
    ca_cert_file: /etc/cognica/certs/ca.crt
C.12.4 Analytics Workload Configuration
thread_pool:
  query:
    num_threads: 128
  batch:
    num_threads: 32
db:
  block_cache:
    cache_capacity: 64GB
  storage:
    db_path: /data/cognica
    write_buffer_size: 1GB
    max_write_buffer_number: 64
    readahead_size: 2MB
    compaction_readahead_size: 64MB
  document:
    sort:
      memory_limit: 4GB
    join:
      memory_limit: 4GB
    cvm:
      memory_limit: 8GB
sql:
  cvm:
    memory_limit: 8GB
    spill_threshold: 0.7
network:
  flight_sql:
    enabled: true
    port: 31337
    max_batch_size: 131072
  pgsql:
    enabled: true
    port: 5432
    max_connections: 100
Summary
This appendix provides a complete reference for Cognica configuration options organized by subsystem:
- Thread pools: Control parallelism for different operation types
- Storage: LSM-tree settings, caching, compression, compaction
- Document DB: Sort, join, CVM, and optimizer configuration
- FTS: Full-text search and vector index settings
- SQL: Query cache, JIT compilation, CVM execution
- Logging: Multi-category logging with configurable sinks
- Network: HTTP, Flight SQL, PostgreSQL protocol
- Replication: Raft consensus and cluster configuration
- Scheduler: Automated maintenance tasks
Understanding these options enables operators to optimize Cognica for specific workloads, from low-latency OLTP to high-throughput analytics, while maintaining data durability and availability requirements.