MatyOS
|
# Download the matyos binary from the Releases page, then: $ matyos check stdlib/arith.elk $ matyos new my_theory $ matyos check my_theory $ matyos build my_theory # seal a complete theory → .matyos archive
Introduction
MatyOS is a proof assistant in the tradition of Lean, Coq and Agda: you state theorems as types and prove them by writing terms that a small, trusted kernel checks. Its long-term goal is to be a system that large language models can use to do mathematics — analysis, conjecture, and proof — with first-class support for uncertainty (the realistic truth value).
Proof files use the .elk extension. matyos check exits 0 when every proof in the file holds and non-zero when any proof fails, so it drops straight into CI.
Prop, a text front-end (.elk), a tactic engine (by ... qed), and a three-valued "realistic" logic — all covered by 233 tests.
Why another proof assistant?
Lean / Coq / Agda were designed for humans, decades before LLMs. MatyOS is designed from day one around two bets:
- Soundness is sacred. Every proof — however it is produced, including by an LLM — reduces to a term checked by a tiny trusted kernel. Nothing is ever "assumed proven".
- Uncertainty is first-class. Real mathematical work (especially an LLM's) is full of plausible-but-unproven steps. MatyOS treats this rigorously with a three-valued logic (
true/false/realistic) that lives in an epistemic layer above the kernel, so conjecture and certainty never get confused.
Installation
Download the matyos binary for your platform from the Releases page, then put it on your PATH — or let the installer scripts do it for you (per-user location, no admin required):
:: Windows (run from the folder containing matyos.exe / the repo) > powershell -ExecutionPolicy Bypass -File install.ps1 # Linux / macOS $ bash install-matyos.sh
Open a new terminal, and the matyos command is available everywhere:
$ matyos version
$ matyos check demo.elk # type-check a proof file (exit 0 = all proofs hold)
$ matyos help
Want to build from source or run the 233-test suite? See the repo's CONTRIBUTING.md.
A first proof
Here is arithmetic from scratch — declaring the natural numbers, defining addition, and proving n + 0 = n by induction. This is stdlib/arith.elk in the repo:
inductive Nat : Type :=
| zero : Nat
| succ : Nat -> Nat
def add (m : Nat) (n : Nat) : Nat :=
Nat.rec (fun (_ : Nat) => Nat) n (fun (k : Nat) (ih : Nat) => succ ih) m
def cong (A : Type) (B : Type) (f : A -> B) (a : A) (b : A) (e : Eq A a b)
: Eq B (f a) (f b) :=
Eq.J A a (fun (x : A) (_ : Eq A a x) => Eq B (f a) (f x)) (refl B (f a)) b e
def add_zero_right (n : Nat) : Eq Nat (add n zero) n :=
Nat.rec (fun (m : Nat) => Eq Nat (add m zero) m)
(refl Nat zero)
(fun (k : Nat) (ih : Eq Nat (add k zero) k) =>
cong Nat Nat succ (add k zero) k ih)
n
example : forall (n : Nat), Eq Nat (add n zero) n := add_zero_right
Run it:
$ matyos check stdlib/arith.elk
inductive Nat : Type0 (2 constructors)
def add : (Nat -> (Nat -> Nat))
eval ... = (succ (succ (succ (succ (succ zero))))) -- 2 + 3 = 5
def cong : (Pi (x0 : Type0), ...)
def add_zero_right : (Pi (x0 : Nat), (((Eq Nat) ((add x0) zero)) x0))
example : (Pi (x0 : Nat), (((Eq Nat) ((add x0) zero)) x0)) [QED]
Dependent Types
MatyOS's kernel is a predicative dependent type theory (Martin-Löf / λΠ-with-universes) plus an impredicative Prop. The core terms (in de Bruijn form) are:
Univ i— universe levels:Univ 0 : Univ 1, etc.Prop— the impredicative proposition sort:Prop : Type0Pi (x : A) -> B— dependent function typefun (x : A) => b— lambdaf a— applicationConst(name)— references to defined inductive types, constructors, recursors, definitions
The imax rule keeps a Pi into Prop in Prop (impredicative); otherwise it lands in the larger of the two universes (predicative max).
Definitional equality = equality of normal forms (α-equivalence is free thanks to de Bruijn indices). Normalization includes β, δ-unfolding of definitions, and ι-reduction (recursors).
Inductive Types & Recursors
Declaring an inductive type registers, as kernel terms, the type former, every constructor, and the recursor (with a dependent motive and induction hypotheses). The kernel also generates an ι-reduction rule for the recursor.
A strict-positivity check rejects declarations that would make the logic unsound:
-- accepted: Nat is strictly positive inductive Nat : Type := | zero : Nat | succ : Nat -> Nat -- REJECTED: Bad would let us encode non-termination inductive Bad : Type := | mk : (Bad -> Bad) -> Bad
Termination holds by construction: the only recursion is via recursors (no general fix). Every program a user writes in MatyOS is total.
Propositional Equality (Eq, J)
Equality lives in Prop via the inductive family Eq : (A : Type) -> A -> A -> Prop with constructor refl and the based J eliminator:
-- Eq.J A a P d a (refl A a) ⟹ d (the ι-rule for J)
def cong (A : Type) (B : Type) (f : A -> B)
(a : A) (b : A) (e : Eq A a b)
: Eq B (f a) (f b) :=
Eq.J A a (fun (x : A) (_ : Eq A a x) => Eq B (f a) (f x))
(refl B (f a)) b e
The standard library at stdlib/eq.elk derives the rest of the equality toolkit — symm, trans, subst, cong — all from this single primitive. m + n = n + m (commutativity of addition) is then a real, kernel-certified theorem built on this toolkit.
Curry–Howard: logic is just types
Propositions are types. Proofs are programs. The four "logical" theorems are exactly the standard combinators (examples/proofs/curry_howard.elk):
-- A implies A def id (A : Type) (x : A) : A := x -- A implies (B implies A) def const (A : Type) (B : Type) (a : A) (b : B) : A := a -- modus ponens: (A -> B) -> A -> B def modus_ponens (A : Type) (B : Type) (f : A -> B) (a : A) : B := f a -- composition: (B -> C) -> (A -> B) -> A -> C def compose (A : Type) (B : Type) (C : Type) (g : B -> C) (f : A -> B) (x : A) : C := g (f x) example : forall (A : Type), A -> A := id
The kernel rejects non-proofs: an ill-typed term like fun (A:Type)(x:A) => x x (self-application) does not type-check, so it cannot masquerade as a proof.
Terms vs Tactics
MatyOS supports two writing styles for the same kernel terms:
- Direct terms. Write the proof as a
defwhose body is the term that inhabits the type. The kernel re-checks it. - Tactic scripts. Open a
by ... qedblock and apply tactics (intro,exact,assumption, …) that build a kernel term you don't have to write by hand. The kernel re-checks the result.
Either way, the final artefact is a kernel term — there is no "trust me, the tactic worked" mode.
Tactic Engine (by … qed)
Tactic scripts let you build proofs step-by-step instead of writing the raw term (examples/proofs/tactics.elk):
-- identity: intro the binders, close with a hypothesis theorem id_thm : forall (A : Type), A -> A proof id_thm := by intro A x assumption qed -- modus ponens: intro, then apply the function to the argument theorem mp_thm : forall (A : Type), forall (B : Type), (A -> B) -> A -> B proof mp_thm := by intro A B f a exact f a qed -- composition theorem comp_thm : forall (A : Type), forall (B : Type), forall (C : Type), (B -> C) -> (A -> B) -> A -> C proof comp_thm := by intro A B C g f x exact g (f x) qed
The realistic idea
Classical logic forces every proposition to be true or false. Real reasoning — and LLM reasoning especially — also needs "not (yet) known". MatyOS gives this a precise, textbook semantics (Kleene K3 and Łukasiewicz Ł3 three-valued logics):
| formula | classical | with a realistic atom |
|---|---|---|
P ∨ ¬P (excluded middle) | valid | not valid — undetermined when P is realistic |
¬(P ∧ ¬P) (non-contradiction) | valid | not valid |
P → P (self-implication) | valid | not valid in Kleene · valid in Łukasiewicz |
The failure of excluded middle for a realistic atom is the point: an uncertain proposition is neither affirmed nor denied. Crucially, realistic lives above the trusted kernel — conjectures carry evidence and confidence and a status (conjectured → certified / refuted), and the kernel only ever certifies real proofs.
Full treatment in the Realistic Type paper →.
MatyOS Projects
Real work is bigger than one proof. A MatyOS project is a directory that mirrors the scientific method — assume (hypothesis) → state (theorem) → experiment (test) → certify (proof) — and matyos check runs the whole thing and reports honestly per claim:
┌─ hypothesise ──┐ │ .hyp │ assume / conjecture (epistemically REALISTIC) ▼ │ state ─ .thm ─────▶ a precise proposition (an open obligation) ▼ │ experiment ─ .test ▶ computational checks the kernel runs ▼ │ prove ─ .prf ─────▶ a kernel-checked derivation → CERTIFIED ▼ │ (or CONDITIONAL if it rests on a conjecture) bundle ─ theory ───▶ a verified body of knowledge ▼ │ seal ─ .matyos ──▶ a compressed, self-describing archive └─ distribute / build on ─┘
Scaffold a project and check it:
$ matyos new my_theory
Created project 'my_theory'. Try: matyos check my_theory
$ matyos check my_theory
==========================================================
MatyOS project: my_theory
==========================================================
theory: theories/arithmetic
definitions: Nat, add, cong
hypotheses / conjectures (realistic):
[CONJ] add_comm : ...commutativity of addition...
theorems:
[PROVEN] add_zero_right (certified)
tests:
[PASS] add_2_3
[PASS] add_0_4
----------------------------------------------------------
Summary
theorems : 1 proven (1 certified, 0 conditional), 0 open
conjectures: 1 (realistic)
tests : 2 passed, 0 failed, 0 ran
status : OK (exit 0)
----------------------------------------------------------
The report draws a hard line — tracked transitively — between certified theorems (depend on nothing unproven) and conditional / realistic ones (rest on an open conjecture). An assumption buried deep in a chain of lemmas can never launder itself into a certified result.
Sealed .matyos archives
When a theory is complete (no open theorems, no failed checks), seal it into a single compressed, self-describing .matyos archive — a sigma of files bundling every theory, theorem, proof and test, plus a machine-readable MANIFEST.json:
$ matyos build my_theory # refuses to seal an incomplete project sealed -> my_theory.matyos (1 certified, 0 conditional, 0 open) $ matyos info my_theory.matyos # read the manifest without re-checking
Each file type has its own black-and-white icon:
.matyosarchive — Σ.thmtheorem — ∀.prfproof — ∎.hyphypothesis — ∃.testtest — ✓.elkdefinitions — λ
The matyos CLI
$ matyos version $ matyos help $ matyos check <file.elk> # type-check a single .elk file $ matyos check <project_dir/> # run the whole scientific-method report $ matyos new <project_name> # scaffold a project (.hyp / .thm / .test / .prf layout) $ matyos build <project_dir/> # seal a complete project → .matyos archive $ matyos info <file.matyos> # read the manifest without re-checking
matyos check exits 0 when every proof in the file or project holds and non-zero when any proof fails — so it drops straight into CI.
Trust Boundary
MatyOS follows the de Bruijn criterion: correctness rests on a small, auditable kernel, and everything else is untrusted machinery that must produce terms the kernel re-checks.
untrusted TRUSTED ┌───────────────────────────────────┐ ┌──────────────────────┐ .elk file ──▶ frontend/surface.py ──▶│ │ matyos/kernel/core │ LLM output ─▶ (parser → kernel terms)│ ──▶ │ infer / normalize │ ──▶ ✓ / ✗ tactics ─▶ (future) │ │ (inductive, equality)│ realistic ─▶ logic/realistic.py │ └──────────────────────┘ └───────────────────────────────────┘
matyos/ # the proof assistant ├── kernel/ # the TRUSTED core — small and auditable │ ├── core # terms, normalization, definitional equality, type inference │ ├── inductive # inductive types, recursors, iota-reduction, strict positivity │ └── equality # propositional equality (Eq) + the J eliminator ├── frontend/ # tokenizer + parser for the .elk proof language ├── logic/ # three-valued ("realistic") logic └── cli # the `matyos` command stdlib/ # standard library, written in MatyOS itself (.elk) examples/ # example proofs (.elk) docs/ # design + language reference ROADMAP.md # the honest, phased plan toward a Lean-class system
realistic layer, and any future tactics or LLM output all ultimately produce terms that the kernel re-checks. If the kernel is sound, a term that type-checks against a proposition is a proof of it.
Roadmap & honest status
"Competitive with Lean for all of mathematics" is a decade-scale, community effort — Lean's Mathlib is ~1.7M lines, ~400 contributors, ~6 years. The realistic way to be competitive in a niche soon is to win on the axis Lean was never designed for: LLM-native authoring + uncertainty-aware reasoning, while bootstrapping the library by autoformalization.
Two parallel tracks run throughout the roadmap:
- Core track (C): the trusted kernel growing to Lean-class expressiveness.
- Realistic track (R): the
realistic/ uncertainty layer — kept above the trusted core so it never threatens soundness.
Current state (v0.7.0): kernel + inductives + equality + tactic engine + scientific-method projects + sealed archives + commutativity of + proven. Next on deck: richer evidence-bearing tests, a discharge queue surfacing the highest-leverage open conjectures, theory import across .matyos archives, and an LLM conjecture↔proof loop.
Full phased plan: ROADMAP.md.