Research
Cognica is built on published mathematical foundations. Every feature is derived from first principles, not engineering heuristics.
Mathematical Framework
A progression from algebraic foundations through probabilistic calibration to neural computation theory.
Industry Adoption
Bayesian BM25 has been adopted by leading search and retrieval frameworks.
Apache Lucene
The most widely used open-source search library. Bayesian BM25 integrated as a core scoring option.
MTEB Baseline
The Massive Text Embedding Benchmark adopted Bayesian BM25 as the official baseline for retrieval evaluation.
Vespa.ai
Yahoo's large-scale serving engine. Bayesian BM25 included as an official rank profile example.
txtai
All-in-one embeddings database. Added Bayesian BM25 for probabilistic hybrid search pipelines.
Publications
A Unified Mathematical Framework for Query Algebras Across Heterogeneous Data Paradigms
Establishes posting lists as a universal mathematical abstraction that unifies relational, full-text, and vector search paradigms. Proves compositional completeness via Boolean algebra.
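To make the abstraction concrete, here is a minimal sketch (not Cognica's implementation; all names and data are illustrative) of posting lists as sorted doc-id sequences that compose under Boolean algebra, regardless of whether they came from a relational filter, an inverted index, or an ANN candidate set:

```python
# Illustrative sketch: posting lists as sorted doc-id lists
# composable under Boolean AND / OR / AND NOT.

def intersect(a, b):
    """AND: documents present in both posting lists (merge join)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def union(a, b):
    """OR: documents present in either posting list."""
    return sorted(set(a) | set(b))

def difference(a, b):
    """AND NOT: documents in a but not in b."""
    return sorted(set(a) - set(b))

# A relational predicate, a full-text term, and a vector k-NN result
# each materialize as a posting list, then compose uniformly:
price_lt_100 = [1, 3, 4, 7, 9]    # relational filter
term_laptop  = [2, 3, 7, 8]       # inverted-index lookup
knn_top5     = [3, 5, 7, 9, 11]   # ANN candidate set

hits = intersect(intersect(price_lt_100, term_laptop), knn_top5)
```

Because every operator consumes and produces the same representation, arbitrary nesting of filters, terms, and vector candidates stays closed under composition, which is the sense of "compositional completeness" above.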
Extending the Unified Mathematical Framework to Support Graph Data Structures
Extends the unified algebra to graph operations, proving that graph traversal and pattern matching preserve algebraic completeness across all paradigms.
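The key point, sketched below under the same illustrative conventions as above (not the paper's algorithm), is that a traversal step consumes and produces posting lists, so its output composes with the Boolean operators unchanged:

```python
# Illustrative sketch: one-hop graph traversal as a posting-list
# operation. The frontier and the result are both sorted node-id
# lists, so traversal output feeds directly into Boolean composition.

def expand(frontier, adjacency):
    """Union of the neighbor posting lists of the current frontier."""
    out = set()
    for node in frontier:
        out |= set(adjacency.get(node, []))
    return sorted(out)

adjacency = {1: [2, 3], 2: [4], 3: [4, 5]}
reachable = expand([1, 2], adjacency)   # nodes one hop from {1, 2}
```

The result can then be intersected with, say, a relational-filter posting list, which is why traversal preserves algebraic completeness rather than requiring a separate graph query language.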
Bayesian BM25: A Probabilistic Framework for Hybrid Text and Vector Search
Transforms BM25 scores into calibrated probabilities using Bayesian inference with sigmoid likelihood. Enables principled hybrid search without RRF or ad-hoc normalization.
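The idea can be sketched in a few lines. This is a simplified illustration, not the paper's derivation: the slope and midpoint parameters here are arbitrary placeholders, whereas the paper derives calibration from corpus statistics.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bm25_probability(score, a=0.1, b=5.0):
    """Map a raw BM25 score to P(relevant | score) via a sigmoid
    likelihood. Parameters a (slope) and b (midpoint) are
    illustrative placeholders, not the paper's fitted values."""
    return sigmoid(a * (score - b))

def hybrid_score(p_sparse, p_dense):
    """Fuse two calibrated probabilities on the probability scale
    (a product-of-experts-style combination), instead of RRF or
    min-max normalization over incomparable raw scores."""
    num = p_sparse * p_dense
    den = num + (1.0 - p_sparse) * (1.0 - p_dense)
    return num / den
```

Once both retrieval paths emit probabilities, fusion becomes an exercise in probability theory rather than rank heuristics; that is the "principled hybrid search" claim above.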
From Bayesian Inference to Neural Computation
Derives feedforward neural network structure from first-principles Bayesian inference. Shows sigmoid, ReLU, Swish, and GELU emerge from different probabilistic questions.
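A rough illustration of that framing (simplified; the paper's derivations are more careful): each activation answers a different probabilistic question about whether and how strongly a unit should pass its input through.

```python
import math

def sigmoid(x):
    """Posterior P(hypothesis 1 | evidence) when x is a
    log-likelihood ratio: Bayes' rule in one line."""
    return 1.0 / (1.0 + math.exp(-x))

def gelu(x):
    """x gated by Phi(x) = P(N(0,1) <= x): the expected output of a
    unit that fires with probability given by the Gaussian CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def swish(x, beta=1.0):
    """x gated by a sigmoid posterior instead of a Gaussian CDF."""
    return x * sigmoid(beta * x)

def relu(x):
    """The hard-threshold limit of the same gate: fire iff x > 0."""
    return max(0.0, x)
```

Sigmoid answers "what is the posterior probability?"; GELU and Swish answer "what is the expected output under a stochastic gate?"; ReLU is that gate's deterministic limit.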
Vector Scores as Likelihood Ratios: Index-Derived Bayesian Calibration for Hybrid Search
Develops Bayesian calibration for vector similarity scores using distributional statistics from ANN index construction. Completes probabilistic unification of sparse and dense retrieval.
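A minimal sketch of the likelihood-ratio view (illustrative only: the distribution parameters and prior below are made up, whereas the paper derives them from statistics gathered during ANN index construction):

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def calibrated_probability(sim, mu_pos, sigma_pos, mu_neg, sigma_neg, prior=0.01):
    """Turn a similarity score into P(relevant | sim) via the
    likelihood ratio of a 'relevant-pair' score distribution to a
    'background-pair' one, then Bayes' rule with a relevance prior.
    All parameters here are illustrative placeholders."""
    lr = gaussian_pdf(sim, mu_pos, sigma_pos) / gaussian_pdf(sim, mu_neg, sigma_neg)
    odds = lr * prior / (1.0 - prior)
    return odds / (1.0 + odds)
```

With both BM25 scores and vector similarities mapped to probabilities this way, sparse and dense evidence live on one scale, which is the "probabilistic unification" claimed above.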
Identifies answer bandwidth compression as the fundamental mechanism behind sigmoid failure in hidden layers: sigmoid's bounded output destroys representational capacity independently of gradient flow. Provides a new explanation beyond vanishing gradients.
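The compression effect is easy to observe in isolation. The toy experiment below (a deliberately stripped-down illustration with identity weights, so it shows only the bounded-output mechanism, not a trained network) pushes a wide range of pre-activations through stacked sigmoids and measures the surviving output range:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bandwidth_after_layers(xs, depth):
    """Span (max - min) of the values after each successive sigmoid
    application. No weights, no gradients: only the bounded output
    range is at work, yet the representational spread collapses."""
    spans = []
    for _ in range(depth):
        xs = [sigmoid(x) for x in xs]
        spans.append(max(xs) - min(xs))
    return spans

spans = bandwidth_after_layers([-4.0, -2.0, 0.0, 2.0, 4.0], 3)
```

The span shrinks sharply with each layer even though no gradient is ever computed, which is the sense in which the failure mode is independent of vanishing gradients.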
Shows that Product of Experts provides a principled foundation for local learning that matches backpropagation within 2% across MLPs, CNNs, ResNets, and Transformers. Establishes structural advantages for continual learning.
Bridges information retrieval and neural inference: applies Block-Max WAND to skip negligible neurons during forward passes. Derives per-neuron activation bounds with zero hyperparameters using exponential priors for ReLU activations.
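The skipping idea can be sketched as follows. This is an illustrative simplification, not the paper's algorithm: it uses a crude precomputable bound (positive-weight row sums, valid for non-negative inputs) rather than the paper's exponential-prior activation bounds, but it shows the WAND-style rule of skipping work whenever a cheap upper bound already proves the outcome.

```python
def relu_layer_with_skipping(W, b, x):
    """ReLU layer that skips a neuron's dot product when an upper
    bound proves its output is zero. For inputs x >= 0 (e.g. the
    output of a previous ReLU layer):
        pre_i = b_i + sum_j W[i][j] * x[j]
              <= b_i + (sum of positive weights in row i) * max(x),
    so if that bound is <= 0, ReLU(pre_i) == 0 and the neuron is
    skipped. The positive-weight sums are precomputable per neuron;
    they are recomputed inline here only for brevity."""
    x_max = max(x)
    out, skipped = [], 0
    for Wi, bi in zip(W, b):
        pos_sum = sum(w for w in Wi if w > 0.0)
        if bi + pos_sum * x_max <= 0.0:
            out.append(0.0)          # bound proves the neuron is dead
            skipped += 1
            continue
        pre = bi + sum(w * xj for w, xj in zip(Wi, x))
        out.append(max(0.0, pre))
    return out, skipped

# Neuron 0's bound passes (so it is evaluated exactly); neuron 1's
# all-negative row fails the bound and is skipped outright.
out, skipped = relu_layer_with_skipping(
    [[1.0, -2.0], [-1.0, -1.0]], [0.0, -0.5], [1.0, 2.0])
```

As in Block-Max WAND, the bound is conservative: skipped neurons are exactly those a full forward pass would have zeroed anyway, so the output is unchanged while the work is not done.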