corpus semantic registry

documents define.
the registry identifies.

a registry layer over a corpus of documents: one symbol, one definition home, one content-hashed identity. edit a pinned source and every dependent claim is flagged in ci until it is re-verified and re-pinned.

three live registries

compiled from source and rendered. each is a browsable index: symbols, definition homes, dependency edges, verification states, and the collisions and aliases the registry resolves.

fold (real paper)

19 symbols · 8 proved · 1 conditional · 4 open

the claim-status table of a mathematics preprint (v67), maintained as a registry with dependency edges and sha256 pins against the pdf. change a byte of the paper and every affected claim is flagged until re-verified.

relay (worked example)

an auth design doc and an api spec. two teams used session for different things; the registry records the collision and its resolution. api_key was renamed access_token; the alias keeps old references resolving.
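a minimal sketch of how an alias can keep old references resolving. only the api_key to access_token rename comes from the example above; the `ALIASES` and `SYMBOLS` tables and the `resolve` helper are hypothetical and are not csr's actual data model or api.

```python
# Hypothetical tables: only the api_key -> access_token rename is from the relay example.
ALIASES = {"api_key": "access_token"}
SYMBOLS = {"access_token", "session"}


def resolve(name, aliases=ALIASES, symbols=SYMBOLS):
    """Follow alias links until a registered symbol is reached."""
    seen = set()
    while name in aliases:
        if name in seen:
            # Guard against aliases that point back at each other.
            raise ValueError(f"alias cycle at {name!r}")
        seen.add(name)
        name = aliases[name]
    if name not in symbols:
        raise KeyError(f"unknown symbol {name!r}")
    return name
```

with these tables, `resolve("api_key")` returns `"access_token"`, so a document that still says api_key resolves to the renamed symbol instead of failing.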

gambit (worked example)

a card-game rulebook vs. its engine: the same machinery applied to prose-vs-code drift. capture means different things in the two; the registry records the collision and resolves it in favor of the rulebook sense.

how drift detection works

for corpora where documents, code, and ai agents all touch the same vocabulary: research frameworks, spec-driven codebases, long-lived design docs.

1. register

documents own the prose and the definitions. the registry records each symbol, its one definition home, and its dependency edges.
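a minimal sketch of the "one symbol, one definition home" rule, assuming an in-memory registry. the `Entry` and `Registry` classes and their field names are illustrative, not csr's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    symbol: str                       # e.g. "csr.Auth.session"
    home: str                         # the one document that defines the symbol
    depends_on: tuple[str, ...] = ()  # symbols this definition relies on


class Registry:
    def __init__(self) -> None:
        self.entries: dict[str, Entry] = {}

    def register(self, entry: Entry) -> None:
        # A second home for the same symbol is a collision, not an update.
        existing = self.entries.get(entry.symbol)
        if existing is not None:
            raise ValueError(
                f"collision: {entry.symbol} already defined in {existing.home}"
            )
        self.entries[entry.symbol] = entry
```

registering the same symbol from two documents raises instead of silently overwriting; that is the point at which a collision would be recorded and resolved.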

2. pin

every definition home gets a sha256. the registry compiles to a lockfile, a browsable wiki, a dependency graph, and a validation report with 20+ typed error codes.
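a minimal sketch of pinning, assuming a pin is the sha256 of the definition home's raw bytes and the lockfile is a path-to-pin mapping. the `pin` and `build_lockfile` helpers are hypothetical; the real lockfile format is not shown here.

```python
import hashlib


def pin(content: bytes) -> str:
    """Return a content-hash identity for one definition home."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def build_lockfile(homes: dict) -> dict:
    """Map each definition-home path to its pin, in sorted path order."""
    return {path: pin(data) for path, data in sorted(homes.items())}
```

because the pin covers raw bytes, even a trailing newline appended to a home produces a different hash.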

3. detect

any change to a pinned source flags every dependent symbol with a CSR004 drift diagnostic until it is re-verified and re-pinned.

$ echo >> examples/relay/docs/auth_design_v1.md
$ python3 tools/csr.py --root examples/relay/csr build
CSR004 hash_drift: symbol csr.Auth.session: source changed after hash
pinning (pinned sha256:6dfd2f.. != current sha256:e42753..)

this exact check runs in ci on every commit (the demo is real). these pages rebuild from source weekly and on every push.
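a minimal sketch of the detection step, assuming drift is found by re-hashing each pinned home and that flags propagate transitively along dependency edges. the `find_drift` function and its argument shapes are hypothetical, and transitive propagation is an assumption, not confirmed csr behavior.

```python
import hashlib
from collections import deque


def find_drift(pins, current, symbol_home, depends_on):
    """Return symbols whose home changed, plus every symbol depending on them.

    pins:        home path -> pinned "sha256:..." string
    current:     home path -> current file bytes
    symbol_home: symbol -> home path
    depends_on:  symbol -> iterable of symbols it depends on
    """
    def digest(data):
        return "sha256:" + hashlib.sha256(data).hexdigest()

    # Symbols whose definition home no longer matches its pin.
    drifted = {
        sym for sym, home in symbol_home.items()
        if digest(current[home]) != pins[home]
    }

    # Reverse the edges so dependents can be walked from each drifted symbol.
    dependents = {}
    for sym, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(sym)

    # Breadth-first walk: flag everything downstream of a drifted symbol.
    flagged = set(drifted)
    queue = deque(drifted)
    while queue:
        sym = queue.popleft()
        for child in dependents.get(sym, ()):
            if child not in flagged:
                flagged.add(child)
                queue.append(child)
    return flagged
```

a ci job built on this shape would fail while the returned set is non-empty and pass again once the changed homes are re-verified and re-pinned.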

part of a verification stack

csr handles identity. its siblings handle provenance and proof-term evidence. each is usable on its own.

verification-events: content-hashed provenance events for every verification act (schema + stdlib-only python)
lean-introspect: lean 4 #introspect, a proof-term dag plus a leakage report (sorry, mvars, dependency surface) as json
fold-registry: the fold preprint's claim registry, the source behind the fold pages here