What is sourcebook?
AI agents don't just write wrong code. They write correct code and stop too early — leaving test files, sibling components, and co-dependent configs unchanged.
sourcebook checks your diff for completeness.
It flags the files your agent probably needs to change but didn't.
Under the hood, sourcebook maps your repo — import graphs, git co-change history, test file mappings, hub file detection. When you run sourcebook check, it compares your changes against that structural map and flags what's missing.
No LLM needed. Under a second. Runs as a CLI, a Claude Code hook, or an MCP server. For deeper semantic analysis, --ai sends the diff to Claude Sonnet for about a penny per run.
Import Graph PageRank
Ranks every file by structural importance. Find the hubs that break everything when you touch them.
Hub: fixtures.ts (86 importers)
Hub: http.ts (72 importers)
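The ranking idea can be sketched with a minimal power-iteration PageRank over an import graph (illustrative only; the file names and damping factor are made up, and this is not sourcebook's actual implementation):

```python
# Sketch: rank files in an import graph with PageRank. Edges point from
# importer to imported file, so heavily imported "hub" files accumulate rank.

def pagerank(edges, damping=0.85, iters=50):
    nodes = sorted({n for e in edges for n in e})
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out = {n: [d for s, d in edges if s == n] for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or nodes          # dangling nodes spread evenly
            share = damping * rank[n] / len(targets)
            for t in targets:
                new[t] += share
        rank = new
    return rank

# Toy import graph: three modules import fixtures.ts, one imports http.ts.
edges = [
    ("api.ts", "fixtures.ts"),
    ("ui.ts", "fixtures.ts"),
    ("tests.ts", "fixtures.ts"),
    ("api.ts", "http.ts"),
]
ranks = pagerank(edges)
hub = max(ranks, key=ranks.get)   # the most-imported file wins
```

Touching `hub` here means touching every file that depends on it, which is why a high rank translates directly into blast radius.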
Git Forensics
Reverted commits are "don't do this" signals. Co-change coupling reveals invisible dependencies.
CORRELATION: 88%
REVERTS: 2 found (anti-patterns)
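The coupling signal can be mined from commit history alone. A minimal sketch, assuming each commit is just the set of files it touched (the scoring formula and file names are illustrative, not sourcebook's actual algorithm):

```python
# Sketch: co-change coupling mined from git history. If two files usually
# appear in the same commits, editing one without the other is probably an
# incomplete change.

from collections import Counter
from itertools import combinations

def co_change(commits):
    """commits: list of sets of file paths changed together."""
    file_count = Counter()
    pair_count = Counter()
    for files in commits:
        file_count.update(files)
        pair_count.update(frozenset(p) for p in combinations(sorted(files), 2))
    # coupling = how often the pair changes together, relative to the
    # less frequently changed file of the two
    return {
        pair: n / min(file_count[f] for f in pair)
        for pair, n in pair_count.items()
    }

commits = [
    {"http.ts", "http.test.ts"},
    {"http.ts", "http.test.ts"},
    {"http.ts", "http.test.ts", "fixtures.ts"},
    {"http.ts", "ui.ts"},
    {"ui.ts"},
]
coupling = co_change(commits)
# http.ts and http.test.ts always change together; ui.ts only sometimes.
assert coupling[frozenset({"http.ts", "http.test.ts"})] == 1.0
assert coupling[frozenset({"http.ts", "ui.ts"})] == 0.5
```

A high coupling score between two files that share no import edge is exactly the kind of invisible dependency an agent has no other way to see.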
Convention Detection
Naming patterns, export style, barrel exports, path aliases — the tribal knowledge no README captures.
IMPORTS: path alias @/ preferred
BARREL: 40 index.ts files
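Detection like this reduces to counting. A sketch under assumed heuristics (the regex, the `@/` alias, and the repo contents are illustrative; sourcebook's real detectors are not shown here):

```python
# Sketch: tally relative vs. path-alias imports and count barrel (index.ts)
# files, then report the dominant style.

import re

def detect_conventions(files):
    """files: dict of path -> source text."""
    alias = relative = 0
    for src in files.values():
        for target in re.findall(r"""from\s+['"]([^'"]+)['"]""", src):
            if target.startswith("@/"):
                alias += 1
            elif target.startswith("."):
                relative += 1
    barrels = sum(1 for path in files if path.endswith("index.ts"))
    preferred = "@/ path alias" if alias > relative else "relative"
    return {"preferred_imports": preferred, "barrel_files": barrels}

repo = {
    "src/index.ts": "export * from './http'",
    "src/http.ts": "import { get } from '@/client'",
    "src/ui.ts": "import { get } from '@/client'",
}
conventions = detect_conventions(repo)
```

The point is that the dominant pattern, not any single file, defines the convention an agent should follow.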
FOUR_SURFACES
sourcebook check
Run it on any diff and get told what's missing: co-change coupling, test files, import graph siblings, hub blast radius. No LLM, under a second.
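The check itself is a comparison between the diff and the structural map. A minimal sketch (the threshold, map shapes, and file names are assumptions for illustration, not sourcebook's actual logic):

```python
# Sketch: compare the changed files in a diff against the repo's structural
# map and flag likely-missing companion files.

def check_diff(changed, co_change, test_map, threshold=0.7):
    missing = set()
    for f in changed:
        # co-change partners above the coupling threshold
        for partner, score in co_change.get(f, {}).items():
            if score >= threshold and partner not in changed:
                missing.add(partner)
        # the file's test should usually change alongside it
        test = test_map.get(f)
        if test and test not in changed:
            missing.add(test)
    return sorted(missing)

co_change = {"http.ts": {"fixtures.ts": 0.88, "ui.ts": 0.3}}
test_map = {"http.ts": "http.test.ts"}
flags = check_diff({"http.ts"}, co_change, test_map)
# → ['fixtures.ts', 'http.test.ts']
```

Because the map is precomputed, the check at diff time is a handful of dictionary lookups, which is what keeps it under a second with no LLM in the loop.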
Claude Code hooks
sourcebook init wires up pre-commit hooks. The agent edits a file, sourcebook checks the diff, and the agent fixes any gaps before committing.
MCP server
Published on the official MCP registry. Agents can query repo structure, blast radius, conventions, and co-change data on demand.
PR completeness checks
Automated completeness review on every pull request. Comments with specific files that likely need to change. Coming soon for teams.
RESEARCH_FOUNDATION
Auto-generated obvious context makes agents worse
LLM-generated context files reduced task success by 2-3% and increased inference costs by 20%+. Only non-discoverable information improves performance.
program.md is the #1 lever for agent effectiveness
Autoresearch ran 700 experiments in 2 days because the curated context file contained only what the agent couldn't figure out alone.
PageRank on import graphs for structural importance
The repo-map technique ranks files by how many other files depend on them, directly and transitively. sourcebook uses this to identify architectural hubs.
LLMs lose 30%+ accuracy in the middle of long contexts
sourcebook places critical constraints at the top and bottom of output files — where LLMs pay the most attention.
Agents that don't wander finish first
19 tasks, 10 repos, 4 languages. Repeat runs showed sourcebook eliminates the slow, wandering runs — and prevents failures that handwritten context can't. Full methodology and results →
CHECK_YOUR_DIFF
Stop shipping incomplete changes. One command to find what's missing.
Free and open source. No LLM needed. BSL-1.1 licensed.