Theseus Codex

Guide 2 — Knowledge and Principles

Knowledge, Evidence Claims, and Principles

How recorded material becomes durable, reusable belief

This documentation describes Theseus Codex's infrastructure and methodology. It does not expose private firm materials, uploaded source documents, or unreleased internal records.

For

Readers who want to understand the corpus side of the system: how a sentence in a source becomes a principle the firm can act on.

Summary

Everything the firm reads or records is broken into atomic evidence claims, embedded, clustered by meaning, and distilled into principles. When the firm is willing to commit, principles are promoted into vetted positions that carry a falsifiability layer. The corpus is built in two passes: one builds the principle library, and the other builds the principle-shaped conclusions that cite those principles.

What a claim is

A claim is one sentence, one speaker, one assertion, plus a fingerprint of its meaning. The workshop refuses to roll claims into paragraphs, because the whole point is to be able to reason about them individually. Each claim records:

  • The text itself.
  • Who said it (speaker label, founder identity when known).
  • Where it came from (source upload, span start and end).
  • Its disciplines, from a fixed vocabulary.
  • Its fingerprint (an embedding vector).
  • Its type — factual, methodological, normative, predictive, definitional, or interpretive.
  • Its origin — founder (the speaker's own assertion) or external (a quoted external position, stripped before inference).
  • Hedges and evidence pointers.
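The fields above can be pictured as a single record. The following is a minimal sketch, not the workshop's actual schema: the class name, attribute names, and the helper claims_for_inference are hypothetical, and only the two enumerations' values come from the list above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ClaimType(Enum):
    # The six claim types named in this guide.
    FACTUAL = "factual"
    METHODOLOGICAL = "methodological"
    NORMATIVE = "normative"
    PREDICTIVE = "predictive"
    DEFINITIONAL = "definitional"
    INTERPRETIVE = "interpretive"


class ClaimOrigin(Enum):
    FOUNDER = "founder"    # the speaker's own assertion
    EXTERNAL = "external"  # a quoted external position


@dataclass(frozen=True)
class Claim:
    # Attribute names are illustrative assumptions, not the real column names.
    text: str
    speaker_label: str
    source_upload_id: str
    span_start: int
    span_end: int
    disciplines: tuple          # drawn from a fixed vocabulary
    embedding: tuple            # the "fingerprint" vector
    claim_type: ClaimType
    origin: ClaimOrigin
    founder_id: Optional[str] = None  # set only when the founder is known
    hedges: tuple = ()
    evidence_pointers: tuple = ()


def claims_for_inference(claims):
    """Strip quoted external positions, keeping the speaker's own assertions."""
    return [c for c in claims if c.origin is ClaimOrigin.FOUNDER]
```

Keeping the record immutable (frozen=True) is a design choice of this sketch; it fits the idea that a claim is an atomic unit that is reasoned about rather than edited.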

A note on the word "conclusion"

Inside the workshop and in the early guides, some rows are called "conclusions." In current usage this is a legacy term for a principle-shaped evidence claim — a structured assertion the firm is willing to commit to internally — not the finished public answer a reader of this site might expect.

The firm's finished public answers live as reviewed methodology pages, articles, and Currents opinions. A "conclusion" inside the corpus is a building block on the way there.

The principle-shape contract

Every principle-grade evidence claim has to declare what kind of thing it is and where it applies. The firm requires five fields before the row is allowed to graduate to the highest confidence tier.

  • principleKind — one of seven shapes: RULE (a normative "do X"), CRITERION ("X is the test for Y"), MECHANISM ("X causes Y by Z"), HEURISTIC ("X usually works in case Y"), DEFINITION, FORMULA, or ALGORITHM.
  • domainOfApplicability — a short description of where the claim is meant to hold and where it stops working. "Always" is not acceptable.
  • quantifiableProxies — up to five measurable proxies that would let an outside reviewer falsify the principle.
  • decisionExamples — up to three concrete decisions the principle would direct.
  • sourceSpan — a verbatim substring of a source chunk that the principle is anchored to. If the quoted span does not literally appear in the source, the workshop refuses to write the row.
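A validator for this contract might look like the sketch below. The five field names and the seven kinds are taken from the list above; the function name shape_errors, the requirement of at least one proxy and one example, and the exact treatment of "Always" are assumptions made for illustration.

```python
from dataclasses import dataclass, field

PRINCIPLE_KINDS = frozenset({
    "RULE", "CRITERION", "MECHANISM", "HEURISTIC",
    "DEFINITION", "FORMULA", "ALGORITHM",
})
MAX_PROXIES = 5    # "up to five measurable proxies"
MAX_EXAMPLES = 3   # "up to three concrete decisions"


@dataclass
class PrincipleShape:
    principleKind: str
    domainOfApplicability: str
    sourceSpan: str
    quantifiableProxies: list = field(default_factory=list)
    decisionExamples: list = field(default_factory=list)


def shape_errors(shape, source_chunk):
    """Return the names of the fields that fail the contract (empty if none)."""
    errors = []
    if shape.principleKind not in PRINCIPLE_KINDS:
        errors.append("principleKind")
    # Reject an empty domain and the bare answer "Always".
    domain = shape.domainOfApplicability.strip().rstrip(".").lower()
    if not domain or domain == "always":
        errors.append("domainOfApplicability")
    # Assumption: at least one entry is required, up to the stated maximum.
    if not 1 <= len(shape.quantifiableProxies) <= MAX_PROXIES:
        errors.append("quantifiableProxies")
    if not 1 <= len(shape.decisionExamples) <= MAX_EXAMPLES:
        errors.append("decisionExamples")
    # The span must appear verbatim in the source chunk; a paraphrase fails.
    if not shape.sourceSpan or shape.sourceSpan not in source_chunk:
        errors.append("sourceSpan")
    return errors
```

The substring check is deliberately literal: no normalisation or fuzzy matching is applied, so a span that has been reworded even slightly is rejected.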

The falsifiability layer

Anything being prepared for publication must also carry a quantitative formalisation: a null hypothesis, one or more metrics, one or more statistical tests, and one or more data sources. Without all four, the workshop refuses to mark the formalisation approved.
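The all-four-or-nothing rule can be sketched as a simple completeness check. The class and function names here are hypothetical; only the four required parts come from the paragraph above.

```python
from dataclasses import dataclass, field


@dataclass
class Formalisation:
    null_hypothesis: str = ""
    metrics: list = field(default_factory=list)
    statistical_tests: list = field(default_factory=list)
    data_sources: list = field(default_factory=list)


def missing_parts(f):
    """List which of the four required parts are absent."""
    present = {
        "null_hypothesis": bool(f.null_hypothesis.strip()),
        "metrics": bool(f.metrics),
        "statistical_tests": bool(f.statistical_tests),
        "data_sources": bool(f.data_sources),
    }
    return [name for name, ok in present.items() if not ok]


def can_approve(f):
    """A formalisation may be marked approved only when nothing is missing."""
    return not missing_parts(f)
```

Returning the list of missing parts, rather than a bare boolean, lets a reviewer see exactly what still has to be supplied before approval.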

This is the operational form of a working criterion: a test should be severe enough that the claim would have failed it if the claim were wrong. If the firm cannot state what would falsify a principle, the principle does not get to be public.

Provenance, contradictions, and decay

Every uploaded source carries one of four provenance labels, chosen at upload time: proprietary, endorsed external, studied external, or opposing external. The contradiction engine uses these labels to skip cross-provenance pairs where the firm already expects disagreement.
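As an illustration of label-based skipping, the sketch below encodes the four labels and a lookup of pairs to skip. Which pairs count as expected disagreement is not specified in this guide; the EXPECTED_DISAGREEMENT set and the name should_compare are assumptions chosen only to show the mechanism.

```python
from enum import Enum


class Provenance(Enum):
    PROPRIETARY = "proprietary"
    ENDORSED_EXTERNAL = "endorsed external"
    STUDIED_EXTERNAL = "studied external"
    OPPOSING_EXTERNAL = "opposing external"


# Assumption for illustration: the firm's own and endorsed material is
# expected to disagree with opposing external material.
EXPECTED_DISAGREEMENT = {
    frozenset({Provenance.PROPRIETARY, Provenance.OPPOSING_EXTERNAL}),
    frozenset({Provenance.ENDORSED_EXTERNAL, Provenance.OPPOSING_EXTERNAL}),
}


def should_compare(a, b):
    """Return False for cross-provenance pairs where disagreement is expected."""
    return frozenset({a, b}) not in EXPECTED_DISAGREEMENT
```

Using unordered pairs (frozenset) means the check gives the same answer regardless of which claim is listed first, and same-provenance pairs are always compared.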

A single calibrated contradiction detector replaces an earlier family of heuristics. It returns a score, a confidence band, and a human-readable explanation. Contradictions no longer resolve by operator click; they resolve when new source material weights one side decisively over the other.
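The detector's three outputs, and evidence-driven resolution, can be sketched as follows. The guide does not say how "decisively" is measured; the weight inputs, the decisive_ratio parameter, and its default of 3.0 are hypothetical placeholders, not the firm's actual rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContradictionResult:
    score: float
    confidence_band: tuple  # (lower, upper) bounds around the score
    explanation: str        # human-readable account of the conflict


def resolved_side(weight_a, weight_b, decisive_ratio=3.0):
    """Return "a" or "b" when one side's evidence weight dominates, else None.

    Weights are assumed to be non-negative totals of supporting source
    material; the ratio threshold is an illustrative assumption.
    """
    if weight_a <= 0 and weight_b <= 0:
        return None  # no evidence on either side: stays unresolved
    if weight_a >= decisive_ratio * weight_b:
        return "a"
    if weight_b >= decisive_ratio * weight_a:
        return "b"
    return None
```

Nothing in this sketch accepts an operator decision as input, which mirrors the point that resolution comes from new source material rather than from a click.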

Knowledge ages. A conclusion can decay on a fixed schedule, on evidence change, on method version bumps, on embedding drift, on outcome observation, or on a calibration regression. Each row in the decay surface carries revalidate, retire, and update actions.
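The decay surface can be pictured as rows pairing a trigger with the three available actions. This is a sketch under stated assumptions: the class names, the DecayRow fields, and the due_on_schedule helper (including its interval_days parameter) are hypothetical; only the six triggers and three actions come from the paragraph above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class DecayTrigger(Enum):
    FIXED_SCHEDULE = "fixed schedule"
    EVIDENCE_CHANGE = "evidence change"
    METHOD_VERSION_BUMP = "method version bump"
    EMBEDDING_DRIFT = "embedding drift"
    OUTCOME_OBSERVATION = "outcome observation"
    CALIBRATION_REGRESSION = "calibration regression"


class DecayAction(Enum):
    REVALIDATE = "revalidate"
    RETIRE = "retire"
    UPDATE = "update"


@dataclass(frozen=True)
class DecayRow:
    conclusion_id: str
    trigger: DecayTrigger
    flagged_on: date
    # Every row offers all three actions.
    actions: tuple = tuple(DecayAction)


def due_on_schedule(last_validated, today, interval_days):
    """Fixed-schedule trigger: due once the interval has fully elapsed."""
    return today - last_validated >= timedelta(days=interval_days)
```

Only the fixed-schedule trigger is implemented here because it is the one that can be computed from dates alone; the other five would depend on signals this guide does not describe.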