Research
Cognica is built on published mathematical foundations. Every feature is derived from first principles, not engineering heuristics.
Mathematical Framework
A progression from algebraic foundations through probabilistic calibration to neural computation theory.
Industry Adoption
Bayesian BM25 has been adopted by leading search and retrieval frameworks.
Apache Lucene
The most widely used open-source search library. Bayesian BM25 integrated as a core scoring option.
MTEB Baseline
The Massive Text Embedding Benchmark adopted Bayesian BM25 as the official baseline for retrieval evaluation.
Vespa.ai
The large-scale serving engine developed at Yahoo. Bayesian BM25 included as an official rank profile example.
txtai
All-in-one embeddings database. Added Bayesian BM25 for probabilistic hybrid search pipelines.
Publications
A Unified Mathematical Framework for Query Algebras Across Heterogeneous Data Paradigms
Establishes posting lists as a universal mathematical abstraction that unifies relational, full-text, and vector search paradigms. Proves compositional completeness via Boolean algebra.
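A minimal sketch of the core abstraction, assuming sorted integer doc-ids; the identifiers and example data are illustrative, not the paper's formalism:

```python
def intersect(a, b):
    """AND over sorted doc-id posting lists (two-pointer merge)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def union(a, b):
    """OR over sorted doc-id posting lists."""
    return sorted(set(a) | set(b))

# A relational filter, a text term match, and a vector k-NN result all
# materialize as posting lists, so one Boolean algebra composes all three:
sql_filter = [2, 3, 5, 8]        # e.g. WHERE price < 10
term_match = [1, 3, 5, 7, 8]     # e.g. documents containing "coffee"
knn_result = [3, 4, 5, 8, 9]     # e.g. top-k by embedding similarity

print(intersect(intersect(sql_filter, term_match), knn_result))  # [3, 5, 8]
```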
Extending the Unified Mathematical Framework to Support Graph Data Structures
Extends the unified algebra to graph operations, proving that graph traversal and pattern matching preserve algebraic completeness across all paradigms.
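A toy illustration of the extension, assuming adjacency is stored as per-node posting lists so that a traversal hop stays inside the same algebra; the graph data is invented for the example:

```python
# Adjacency as per-node posting lists: one traversal hop is a union of the
# neighbors' posting lists, so graph operations remain closed under the algebra.
adj = {1: [2, 3], 2: [3, 4], 3: [5], 4: [], 5: [1]}

def hop(frontier):
    """One traversal step: union of the neighbors' posting lists."""
    out = set()
    for node in frontier:
        out.update(adj.get(node, []))
    return sorted(out)

frontier = [1]
for _ in range(2):          # two-hop pattern match from node 1
    frontier = hop(frontier)
print(frontier)             # nodes reachable in exactly two hops: [3, 4, 5]
```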
Bayesian BM25: A Probabilistic Framework for Hybrid Text and Vector Search
Transforms BM25 scores into calibrated probabilities using Bayesian inference with a sigmoid likelihood. Enables principled hybrid search without reciprocal rank fusion (RRF) or ad-hoc score normalization.
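A rough sketch of the shape of the calibration, with hypothetical parameters a and s0 standing in for the paper's derived values; the final fusion step is a simple product-of-probabilities combination, one possibility once both scores are calibrated:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bm25_posterior(score, prior=0.1, a=1.2, s0=8.0):
    """Posterior P(relevant | BM25 score) via Bayes with a sigmoid likelihood.

    With likelihoods sigmoid(a*(s - s0)) and sigmoid(-a*(s - s0)), the
    likelihood ratio simplifies to exp(a*(s - s0)). a and s0 are
    illustrative; the paper derives its own calibration.
    """
    log_odds = math.log(prior / (1 - prior)) + a * (score - s0)
    return sigmoid(log_odds)

p_text = bm25_posterior(12.4)     # calibrated probability from BM25
p_vec = 0.73                      # calibrated probability from vector search
# Hybrid fusion becomes probability arithmetic instead of RRF:
p_joint = p_text * p_vec / (p_text * p_vec + (1 - p_text) * (1 - p_vec))
print(p_text, p_joint)
```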
From Bayesian Inference to Neural Computation
Derives feedforward neural network structure from first-principles Bayesian inference. Shows sigmoid, ReLU, Swish, and GELU emerge from different probabilistic questions.
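The four activations, annotated with the probabilistic question each one answers in this framework; the glosses are our paraphrase of the paper's derivations:

```python
import math

def sigmoid(x):                # "is the feature present?" -> posterior probability
    return 1 / (1 + math.exp(-x))

def relu(x):                   # "what is the signal, given it is non-negative?"
    return max(0.0, x)

def gelu(x):                   # x weighted by a Gaussian keep-probability: x * Phi(x)
    return x * 0.5 * (1 + math.erf(x / math.sqrt(2)))

def swish(x, beta=1.0):        # x weighted by a logistic keep-probability
    return x * sigmoid(beta * x)

for f in (sigmoid, relu, gelu, swish):
    print(f.__name__, round(f(1.5), 4))
```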
Vector Scores as Likelihood Ratios: Index-Derived Bayesian Calibration for Hybrid Search
Develops Bayesian calibration for vector similarity scores using distributional statistics from ANN index construction. Completes probabilistic unification of sparse and dense retrieval.
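An illustrative sketch, assuming the index-derived statistics reduce to Gaussian fits over relevant and non-relevant score distributions; the numbers are invented:

```python
import math

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def calibrate(score, stats, prior=0.05):
    """Posterior P(relevant | similarity score) from a likelihood ratio.

    `stats` holds Gaussian fits for the two score distributions; the paper
    derives such statistics from ANN index construction, here they are
    hypothetical numbers.
    """
    lr = gauss_pdf(score, *stats["rel"]) / gauss_pdf(score, *stats["nonrel"])
    odds = (prior / (1 - prior)) * lr
    return odds / (1 + odds)

stats = {"rel": (0.82, 0.07), "nonrel": (0.41, 0.12)}  # (mean, std), illustrative
print(calibrate(0.78, stats))
```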
Answer Bandwidth: Why Sigmoid Fails in Hidden Layers
Identifies answer bandwidth compression as the fundamental mechanism behind sigmoid failure in hidden layers: sigmoid's bounded output destroys representational capacity independently of gradient flow. Provides a new explanation beyond vanishing gradients.
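A deliberately weightless toy that illustrates the mechanism: repeated sigmoids squash the representable range toward a fixed point while ReLU preserves it, with no gradients involved anywhere. The range spread is our crude proxy for bandwidth, not the paper's measure:

```python
import math, random

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def spread(xs):  # crude proxy for representational bandwidth
    return max(xs) - min(xs)

xs = [random.uniform(-4, 4) for _ in range(1000)]
sig, rel = xs[:], xs[:]
for depth in range(1, 6):
    sig = [sigmoid(x) for x in sig]        # bounded to (0, 1), re-squashed each layer
    rel = [max(0.0, x) for x in rel]       # unbounded above; range preserved
    print(depth, round(spread(sig), 4), round(spread(rel), 4))
```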
Bridges information retrieval and neural inference: applies Block-Max WAND to skip negligible neurons during forward passes. Derives per-neuron activation bounds with zero hyperparameters using exponential priors for ReLU activations.
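Our simplified sketch of the idea: the bound here is computed exactly for clarity, whereas the paper amortizes it with precomputed block maxima, and the fixed threshold theta is a stand-in for the paper's hyperparameter-free bounds derived from exponential priors:

```python
def forward_with_skipping(x, W, b, theta=0.0):
    """ReLU layer that skips neurons whose pre-activation upper bound <= theta."""
    outputs, skipped = [], 0
    for w_row, bias in zip(W, b):
        # Best-case pre-activation: bias plus only the positive contributions.
        ub = bias + sum(w * xi for w, xi in zip(w_row, x) if w * xi > 0)
        if ub <= theta:               # ReLU output provably negligible: skip the dot product
            outputs.append(0.0)
            skipped += 1
            continue
        outputs.append(max(0.0, sum(w * xi for w, xi in zip(w_row, x)) + bias))
    return outputs, skipped

x = [0.5, 1.0, 2.0]
W = [[1.0, 0.5, -0.3], [-2.0, -1.0, -0.5]]
b = [0.1, -0.2]
print(forward_with_skipping(x, W, b))    # ([0.5, 0.0], 1): second neuron skipped
```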
Product of Experts as Scalable Local Learning: Modular Construction at 1.3B Parameters
Validates Product of Experts as a principled local learning framework at 1.3B-parameter production scale, eliminating backpropagation's global state dependency. Clustered PoE on a 1.3B GPT trained from scratch shows a 6.52% bits-per-byte (BPB) gap at r=20 versus a matched backpropagation baseline, while delivering six modularity pillars unattainable under backpropagation: training-time modularity via detached per-stage losses, inference-time composability across compute tiers, post-hoc specialist attachment without retraining, base preservation under SFT, composable quality gains via log-space parallel composition, and heterogeneous specialist ecosystems. Stage 1 alone (25% of the compute) achieves 87.5% of full-model factual accuracy; WAND adaptive pruning delivers a 1.82x wall-clock speedup with 100% top-1 agreement.
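The log-space parallel composition pillar in miniature; the numbers are illustrative:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Experts vote by summing logits. Since softmax(a + b) is proportional to
# exp(a) * exp(b), this is exactly a normalized Product of Experts.
base_logits = [2.1, 0.3, -1.0]        # base model over 3 candidates
specialist = [0.5, 1.8, -0.2]         # post-hoc specialist, attached without retraining

composed = [a + b for a, b in zip(base_logits, specialist)]
print(softmax(composed))              # product of the expert distributions, renormalized
```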
Stage-Partitioned Learning: A Configuration Space Between Backpropagation and Local Learning
Shows that backpropagation and fully local per-layer learning are two extreme points of a single parameterized family C(L) of stage-partitioned training configurations. Each element (π, H) specifies how the L layers group into K contiguous stages with shared or per-stage output heads, inducing a stage-local loss with gradient detachment between stages. Proves C(L) forms a bounded lattice under partition refinement, with BP at the minimum and fully local at the maximum, and that the intermediate region inherits memory locality and modularity from the local side while preserving end-to-end coherence from the BP side. Identifies the root bottleneck principle: in a chain partition the first-stage representation is a sufficient statistic for all later stages, so any BPB-dominated objective collapses onto the BP endpoint, and dynamic configuration selection must be formulated as Pareto multi-objective optimization over BPB, locality, and capability metrics rather than BPB minimization.
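A minimal PyTorch sketch of one element (π, H) of C(L), with illustrative dimensions: four layers grouped into two stages, each with its own head and a detached boundary:

```python
import torch
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(16, 16) for _ in range(4)])
heads = nn.ModuleList([nn.Linear(16, 10) for _ in range(2)])
stages = [(0, 2), (2, 4)]            # pi: contiguous layer groups

x = torch.randn(8, 16)
target = torch.randint(0, 10, (8,))
losses, h = [], x
for (lo, hi), head in zip(stages, heads):
    h = h.detach()                   # gradient detachment at the stage boundary
    for layer in layers[lo:hi]:
        h = torch.relu(layer(h))
    losses.append(nn.functional.cross_entropy(head(h), target))

# One stage containing all layers recovers BP; one stage per layer recovers
# fully local learning; everything in between is the lattice interior.
sum(losses).backward()               # detachment keeps each loss stage-local
```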
Logit Virtual Machines: Deterministic Database Execution over Neural Natural Parameters
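A miniature of the final equivalence, with invented values: adding database logit deltas to model logits and applying softmax yields a normalized Product of Experts, and a delta of negative infinity recovers Boolean filtering as support projection:

```python
import math

NEG_INF = float("-inf")

def softmax(logits):
    m = max(l for l in logits if l != NEG_INF)
    exps = [math.exp(l - m) if l != NEG_INF else 0.0 for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

model_logits = [1.2, 0.4, -0.7, 0.0]        # model's beliefs over 4 candidates
db_delta = [0.9, NEG_INF, 0.3, NEG_INF]     # logit deltas returned by the database

# Logit addition is multiplication in the log semiring, so after softmax the
# fused distribution is a normalized product of the two experts; -inf deltas
# zero out candidates, recovering classical Boolean query semantics.
fused = [m + d for m, d in zip(model_logits, db_delta)]
print(softmax(fused))                        # the second and fourth candidates are excluded
```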
Reframes the database as a deterministic virtual machine over logit-valued relations. A neural model emits logits over instructions, schema, fields, and values; a trainable typed compiler lowers them into VM programs over relational, text, vector, and graph indexes; the database returns logit deltas through a typed ACCUM projection. Logits become executable natural parameters rather than decoding scores, replacing untyped application glue with a deterministic, durable, index-aware memory model for neural computation. Proves closure over the log semiring, recovers classical Boolean query algebra as support projection, and shows that adding database-produced logit deltas to model logits is equivalent, after softmax, to normalized Product-of-Experts aggregation.