Vector Database · v0.9-alpha · MIT · Inherits NoVectDB

CookixDB

A post-vector database where data lives in crumbs, queries are nibbles, and every embedding cluster is a chip. Scientifically rigorous. Shamelessly delicious.

Crumb-based storage · O(log n) chip search · Contextual indexing · NoVectDB kernel · Zero sugar added
Chip Lookup: O(log n)
Crumb Dimensions
Deliciousness π: 3.14
Avg Cosine Sim: 0.97
Baking Temperature: 42°C
System Architecture

The Cookie Layer Model

CookixDB inherits NoVectDB's contextual indexing kernel and wraps it in a four-layer architecture inspired by the physical structure of a chocolate-chip cookie. Each layer solves a distinct computer science problem — and tastes great.

🔥
Layer 1
Crust — The Inverted Index
Outer shell. A hash-mapped inverted index with jump-consistent sharding. All queries enter here. O(1) bucket lookup on 64-bit key digests.
index
🍪
Layer 2
Dough — The Context Tensor
Inherited from NoVectDB: a multi-dimensional context tensor field. Crumbs (data nodes) exist as sparse coordinate tensors in this field. Supports up to 65 536 embedding dimensions before crying.
context
🍫
Layer 3
Chips — The Cluster Nodes
HNSW-adjacent graph where each chip is a centroid of a Voronoi cell. k-nearest-chip search navigates the graph in O(log n) expected time. Chips self-organise via online Lloyd's algorithm during ingestion.
cluster
Layer 4
Sugar Glaze — The Query Rewriter
Syntactic sugar on top of everything. Rewrites natural language, SQL-like, and bring-format queries into CookixDB's internal crumb-vector ops. The most unnecessary and most beloved layer.
query
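The Crust layer's jump-consistent sharding can be sketched with the published jump consistent hash algorithm (Lamping and Veach). This is an illustration under assumptions, not CookixDB's actual code: the helper names `key_digest` and `shard_for`, and the choice of BLAKE2b to produce the 64-bit key digest, are hypothetical.

```python
import hashlib

MASK64 = 0xFFFFFFFFFFFFFFFF


def jump_consistent_hash(key: int, num_buckets: int) -> int:
    """Map a 64-bit key to a bucket in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    key &= MASK64
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step, as in the published algorithm
        key = (key * 2862933555777941757 + 1) & MASK64
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b


def key_digest(key: str) -> int:
    """Hypothetical 64-bit digest of a crumb id (BLAKE2b is an assumption)."""
    raw = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(raw, "big")


def shard_for(key: str, num_shards: int) -> int:
    """Pick the shard that would own this key."""
    return jump_consistent_hash(key_digest(key), num_shards)
```

The useful property for sharding is that growing from N to N+1 shards only moves keys onto the new shard; every other key keeps its old home, so adding a shard does not reshuffle the whole jar.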
CookixDB API Reference

Bake. Nibble. Search.

Every database operation has a culinary alias. The serious names work too, but using nibble() instead of query() will make your tests 40% more enjoyable. Peer-reviewed.

python
from cookixdb import CookieJar

# Open (or bake) a database
jar = CookieJar.open("my_db")          # alias: .bake()

# Insert a vector (crumbs)
jar.bake(id="doc_001", vec=[0.82, -0.31, 0.54],
         meta={"text": "hello world"})

# Query nearest neighbours (nibble)
results = jar.nibble(vec=[0.79, -0.28, 0.60], k=5)

# Delete a document (crumble)
jar.crumble(id="doc_001")

# Flush to disk (cool_down)
jar.cool_down()              # alias: .flush()
INSERT insert crumb
bake(id, vec, meta) → CrumbID
Inserts a vector embedding into the dough layer. Triggers chip re-clustering if the nearest chip's variance exceeds σ²_max.
alias: insert()
QUERY k-nearest
nibble(vec, k, ef) → List[Crumb]
Approximate k-NN search across all chips. ef controls exploration factor (default 64). Bigger ef = slower + tastier results.
alias: query() · search()
DELETE remove
crumble(id) → bool
Soft-deletes a crumb. The crumb is marked as stale in the chip graph and excluded from future nibbles. Actual GC happens at cool_down().
alias: delete() · drop()
UPDATE re-embed
re_glaze(id, vec) → CrumbID
Updates the vector embedding of an existing crumb. Automatically migrates the crumb to its new nearest chip. No full re-index needed.
alias: update()
SCAN full table
lick(filter, limit) → Iterator
Sequential scan over all crumbs with optional metadata filter. Slow. Beautiful. Like eating a cookie one molecule at a time.
alias: scan() · all()
FLUSH persist
cool_down(fsync) → None
Commits in-memory write buffer to disk. Runs GC on crumbled entries. Rebuilds chip centroids if drift > 0.03 radians.
alias: flush() · commit()
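The soft-delete behaviour described for crumble() and cool_down() can be sketched with a toy in-memory jar. Everything here is hypothetical: `ToyJar` is not the real `CookieJar`, it uses a brute-force scan instead of the chip graph, and it only mirrors the documented contract (stale crumbs are hidden from nibbles immediately and physically removed at cool_down).

```python
import math


def _cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class ToyJar:
    """Hypothetical stand-in for illustration only."""

    def __init__(self):
        self.crumbs = {}    # id -> vector
        self.stale = set()  # ids that were crumbled but not yet collected

    def bake(self, id, vec):
        self.crumbs[id] = list(vec)
        self.stale.discard(id)
        return id

    def crumble(self, id):
        # Soft delete: mark as stale, keep the data until cool_down()
        if id not in self.crumbs or id in self.stale:
            return False
        self.stale.add(id)
        return True

    def nibble(self, vec, k):
        # Brute-force ranking over live crumbs only
        live = [(cid, v) for cid, v in self.crumbs.items() if cid not in self.stale]
        live.sort(key=lambda item: _cosine(vec, item[1]), reverse=True)
        return [cid for cid, _ in live[:k]]

    def cool_down(self):
        # Garbage-collect everything that was crumbled since the last flush
        for cid in self.stale:
            del self.crumbs[cid]
        self.stale.clear()
```

Deferring the physical removal keeps crumble() cheap, at the cost of stale crumbs occupying memory until the next cool_down().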
Interactive 3D Simulation

Vector Space Playground — ℝ³

A live 3D projection of CookixDB's chip-clustering algorithm based on the NoVectDB composite distance kernel. Insert crumbs, query k-NN, watch Lloyd's algorithm converge in real time.
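For readers without the simulator, here is a minimal sketch of what an online Lloyd-style update could look like in ℝ³. It assumes the sequential (MacQueen) k-means rule, where each new crumb pulls its nearest chip toward itself by 1/count; the function names and the per-chip counters are hypothetical, and plain Euclidean distance stands in for the NoVectDB composite kernel.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_chip(chips, crumb):
    """Index of the chip centroid closest to the crumb."""
    return min(range(len(chips)), key=lambda c: sq_dist(chips[c], crumb))


def ingest(chips, counts, crumb):
    """Assign the crumb to its nearest chip and nudge that centroid.

    After the update the centroid equals the running mean of all crumbs
    assigned to it so far (counts[c] includes the new crumb).
    """
    c = nearest_chip(chips, crumb)
    counts[c] += 1
    rate = 1.0 / counts[c]
    chips[c] = [m + rate * (x - m) for m, x in zip(chips[c], crumb)]
    return c


# Two chips, each seeded from a single crumb
chips = [[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]]
counts = [1, 1]
ingest(chips, counts, [2.0, 0.0, 0.0])  # moves chip 0 halfway toward the crumb
```

Because each step only touches one centroid, the update costs O(K) per crumb for the nearest-chip search plus O(d) for the nudge, which is what makes it usable during ingestion.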

Mathematical Foundation

The Science of Chips

CookixDB's search complexity derives from a combination of Voronoi partitioning and navigable small-world graphs. The cookie metaphor is aesthetic; the math is real.

Chip Assignment (Voronoi)
chip(q) = argmin_c ‖q − μ_c‖₂ for all chips c ∈ C
A query vector q is assigned to the chip whose centroid μ_c minimises Euclidean distance. This gives the Voronoi cell partitioning — the "chocolate chip territory" metaphor is scientifically accurate.
Cosine Similarity (Crumb Ranking)
sim(a, b) = (a · b) / (‖a‖ · ‖b‖) ∈ [−1, 1]
Within a chip's Voronoi cell, crumbs are ranked by cosine similarity to the query. Cosine similarity depends only on direction, not magnitude, which means your crumbs don't need to be unit-normalised before baking (though it's faster if they are).
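The two formulas above combine into a two-stage lookup: pick the chip by Euclidean distance, then rank that chip's crumbs by cosine similarity. The sketch below is a brute-force illustration under assumptions; `nibble_in_cell` and the dictionary layout are hypothetical, and a real chip search would walk the graph rather than scan every centroid.

```python
import math


def l2(a, b):
    """Euclidean distance, used for chip assignment."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Cosine similarity, used for ranking crumbs inside a cell."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def nibble_in_cell(query, centroids, cells, k):
    """Return (chip id, top-k crumb ids) for the query."""
    # Stage 1: chip(q) = argmin over chips of the distance to the centroid
    chip = min(centroids, key=lambda c: l2(query, centroids[c]))
    # Stage 2: rank the crumbs in that cell by cosine similarity
    ranked = sorted(cells[chip], key=lambda item: cosine(query, item[1]), reverse=True)
    return chip, [cid for cid, _ in ranked[:k]]


centroids = {"chip_a": [1.0, 0.0], "chip_b": [0.0, 1.0]}
cells = {
    "chip_a": [("doc_1", [0.9, 0.1]), ("doc_2", [4.0, 0.0]), ("doc_3", [0.6, 0.4])],
    "chip_b": [("doc_4", [0.1, 0.9])],
}
chip, top = nibble_in_cell([0.8, 0.05], centroids, cells, k=2)
```

Note that `doc_2` is four times longer than the query yet still ranks second, because only its direction counts; scaling any crumb by a positive constant leaves its score unchanged.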
Chip Variance Threshold (Auto-Split)
split_chip(c) ⟺ Var(c) = (1/|c|) Σᵢ ‖xᵢ − μ_c‖² > σ²_max
When a chip's intra-cluster variance exceeds the threshold σ²_max (default: 0.15), the chip is automatically split into two via bisecting k-means. This is why CookixDB never has one giant chip that knows everything — because a cookie with one giant chocolate chip is just a chocolate bar and that's not what we're making.
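A minimal sketch of the split rule, assuming the variance formula above and a plain 2-means bisection. The seeding strategy (first crumb plus the crumb farthest from it) and the function names are hypothetical choices made to keep the example deterministic; the source only states that bisecting k-means is used.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def variance(points):
    """Var(c) = (1/|c|) * sum of squared distances to the centroid."""
    mu = mean(points)
    return sum(sq_dist(x, mu) for x in points) / len(points)


def should_split(points, sigma2_max=0.15):
    return variance(points) > sigma2_max


def bisect_chip(points, iters=10):
    """Split one chip into two with a few rounds of 2-means."""
    a = points[0]
    b = max(points, key=lambda x: sq_dist(x, a))  # farthest crumb from the seed
    left, right = list(points), []
    for _ in range(iters):
        left = [x for x in points if sq_dist(x, a) <= sq_dist(x, b)]
        right = [x for x in points if sq_dist(x, a) > sq_dist(x, b)]
        if not right:  # all crumbs coincide; nothing to split
            break
        a, b = mean(left), mean(right)
    return left, right


# One oversized chip that is really two clumps of crumbs
chip = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
if should_split(chip):
    left, right = bisect_chip(chip)
```

After the split, each half has a variance far below the default σ²_max of 0.15, so neither would be split again.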
Expected Search Complexity
T(n) = O(log n) [chip navigation] + O(|c|) [linear scan in cell]
T(n) is the total expected query time, where n is the total crumb count and |c| is the average chip size. With auto-split keeping chips balanced, |c| ≈ n / K, where K is the number of chips, giving O(log n + n/K) overall. At K = √n, the n/K term dominates and the bound is O(√n).
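To make the trade-off concrete, the bound can be evaluated for a sample size. This is a back-of-the-envelope cost model, not a benchmark: `expected_cost` is a hypothetical helper that counts abstract steps as log₂ n + n/K and ignores all constant factors.

```python
import math


def expected_cost(n, k):
    """Abstract step count: chip navigation plus linear scan of one cell."""
    return math.log2(n) + n / k


n = 1_000_000
k_sqrt = math.isqrt(n)  # K = sqrt(n) = 1000 chips

few_chips = expected_cost(n, 10)      # cells of 100 000 crumbs each
balanced = expected_cost(n, k_sqrt)   # cells of 1000 crumbs each
```

With a million crumbs, the navigation term is about 20 steps in both cases, so the cell scan decides the total: roughly 100 000 steps with 10 chips versus roughly 1000 with √n chips.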
Competitive Analysis

CookixDB vs Everything Else

Honest comparison. We win on naming. We're competitive on everything else.

Feature | Pinecone | Weaviate | Chroma | CookixDB 🍪
Fun method names | | | | ✓ nibble()
Built-in context tensor | | | | ✓ NoVectDB
Cosine similarity | | | |
HNSW index | | | | ✓ chip-HNSW
Bring-format queries | | | |
Open source | | | | ✓ MIT
Delicious metaphors | | | | ✓✓✓
Written in Python | | | |
Auto chip splitting | | | | ✓ σ² threshold
Smells like cookies | | | | scientifically unclear
Design Philosophy

Serious Science, Silly Names

🧪
The Crumb Hypothesis
Every piece of data, no matter how large, can be decomposed into atomic vector units called crumbs. A crumb has a position in ℝⁿ and zero opinions about being called a crumb.
🍫
Why "Chips"?
Cluster centroids are called chips because they are the best part of the cookie. They are also the most mathematically interesting part: a chip is a Voronoi seed in disguise wearing a chocolate hat.
🔥
Baking = Indexing
When you bake() a vector, you are not just inserting data — you are applying heat (computation) to transform raw ingredients (floats) into a structured, queryable crumb. The metaphor is basically a free PhD thesis.
🧊
cool_down() = fsync
Cookies need to cool before you eat them. Databases need to flush before you trust them. cool_down() does both metaphorically and literally. It also runs the garbage collector on your crumbled data.
Infinite Dimensions
CookixDB supports up to 65 536 embedding dimensions. Nobody has actually used more than 4096. But the option is there, sitting quietly like the last cookie in the jar that everyone is too polite to take.
🎓
NoVectDB Heritage
CookixDB inherits its context tensor architecture from NoVectDB, which proved that contextual indexing outperforms pure vector similarity for semantic search tasks. CookixDB adds the crumb model on top. Standing on the shoulders of giants — and cookies.