ENGINEERING · APRIL 2026

We scanned 30+ repos. Here's what broke (and what we fixed).

esbuild, vLLM, Polars, Biome, Convex, Fern, and 25 more. 12 scanner bugs found, 69 QA assertions added, and an engine that stopped hallucinating about codebases.

sourcebook generates project knowledge files for coding agents. It reads your repo's structure, git history, and conventions, then outputs an AGENTS.md that tells the agent what actually matters before it starts editing.

We'd already scanned 15 major repos and written about what we found. But 15 repos is a controlled environment. We wanted to know what happens when you push the scanner into hostile territory — repos that look nothing like the typical Next.js or Django app.

So we scanned 30+ more. And the scanner broke. Repeatedly. Each break taught us something.

The false positive problem

The first thing that broke was framework detection. We scanned Schemathesis — an API testing tool written in Python — and the output said it was a FastAPI application. It's not. Schemathesis tests FastAPI apps. It imports FastAPI in its test fixtures, and our scanner saw the import and made an assumption.

Same thing with SQLModel. It's an ORM library that integrates with FastAPI — so it imports FastAPI, has FastAPI in its docs, and uses FastAPI in examples. But it's not a FastAPI project. The scanner said it was.

This is the worst kind of bug in a context file: a confident lie. The agent reads "this is a FastAPI project" and starts generating FastAPI-style code in a library that has nothing to do with web serving. The fix was adding stack-vs-dependency distinction — checking whether the framework appears in source code or only in test/doc files.
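The stack-vs-dependency check can be sketched roughly like this. This is a hedged illustration, not sourcebook's actual code: the `classify_framework` function, the path heuristic, and the directory names are assumptions made for the example.

```python
import re
from pathlib import PurePosixPath

# Directories that indicate a test/doc/example import rather than real usage.
# The list is illustrative; a real scanner would be more thorough.
TEST_OR_DOC = re.compile(r"(^|/)(tests?|docs?|examples?|fixtures?)(/|$)")

def classify_framework(framework: str, importing_files: list[str]) -> str:
    """Return 'stack' if the framework is imported from application source,
    'dependency' if it only appears in tests, docs, examples, or fixtures."""
    source_hits = [
        f for f in importing_files
        if not TEST_OR_DOC.search(PurePosixPath(f).as_posix())
    ]
    return "stack" if source_hits else "dependency"

# Schemathesis-style case: FastAPI imported only in test fixtures and docs
print(classify_framework("fastapi", ["tests/fixtures/app.py", "docs/examples/api.py"]))
# -> dependency

# A real FastAPI app: imported from application source
print(classify_framework("fastapi", ["src/app/main.py", "tests/test_api.py"]))
# -> stack
```

The point is that a single import is never the signal; where the import lives is.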

Ghost frameworks

Scanning Hono (a TypeScript web framework) produced an AGENTS.md that mentioned Zod validation. Hono doesn't use Zod. But one test file — a single test — imported a Zod adapter to test compatibility. That was enough for the scanner to declare Zod as a project convention.

We found the same pattern with DRF (Django REST Framework) appearing in Go codebases. The scanner was matching regex patterns too broadly — a Go file with permission and auth keywords was enough to trigger the DRF detector. We had to scope auth detection to Python files only and require multiple signals before flagging a framework.
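The "right language, multiple signals" rule might look something like the sketch below. The signal names, thresholds, and `detect_drf` function are hypothetical, invented for this example; the real detector works on the scanner's internal representation.

```python
def detect_drf(files: dict[str, str], min_signals: int = 2) -> bool:
    """Flag Django REST Framework only when at least `min_signals` distinct
    signals appear, and only when they appear in Python files."""
    signals = set()
    for path, text in files.items():
        if not path.endswith(".py"):
            continue  # scope auth/permission keywords to Python files only
        if "rest_framework" in text:
            signals.add("import")
        if "APIView" in text or "ViewSet" in text:
            signals.add("view-class")
        if "permission_classes" in text:
            signals.add("permissions")
    return len(signals) >= min_signals

# A Go file with auth/permission keywords no longer triggers the detector
go_repo = {"auth/permissions.go": "func CheckPermission(auth string) {}"}
print(detect_drf(go_repo))  # False

# A Python repo with an import plus a view class clears the bar
py_repo = {"api/views.py": "from rest_framework.views import APIView\nclass UserView(APIView): ..."}
print(detect_drf(py_repo))  # True
```

Requiring two independent signals is what kills the single-keyword false positives.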

The adversarial batch

After fixing the obvious bugs, we deliberately chose repos designed to confuse a scanner — among them esbuild (a Go bundler with mountains of JavaScript test fixtures), Biome (a Rust linter), and vLLM (a Python ML engine that happens to ship a FastAPI server).

Each of these repos exposed a different category of assumption in the scanner. esbuild taught us that the primary language isn't always the one with the most files. Biome taught us that test fixtures aren't dependencies. vLLM taught us the difference between "uses FastAPI" and "is a FastAPI project."
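The esbuild lesson — primary language is not a raw file count — can be sketched as a weighted tally. The directory names, weights, and `primary_language` function here are assumptions for illustration only.

```python
from collections import Counter

# Directories whose files get heavily discounted when tallying languages.
# Illustrative list; a real scanner would also honor .gitattributes etc.
DISCOUNTED = ("test", "tests", "fixtures", "vendor", "snapshots", "bench")

def primary_language(files: list[str]) -> str:
    """Pick the dominant language, discounting test fixtures and vendored code."""
    ext_to_lang = {".go": "Go", ".js": "JavaScript", ".ts": "TypeScript", ".rs": "Rust"}
    weights: Counter[str] = Counter()
    for path in files:
        lang = next((l for e, l in ext_to_lang.items() if path.endswith(e)), None)
        if lang is None:
            continue
        parts = path.split("/")
        weight = 0.1 if any(p in DISCOUNTED for p in parts) else 1.0
        weights[lang] += weight
    return weights.most_common(1)[0][0]

# esbuild-like shape: 100 .js snapshot fixtures vs 20 .go source files,
# but Go still wins once the fixtures are discounted.
files = [f"internal/bundler/bundler_{i}.go" for i in range(20)]
files += [f"internal/bundler/snapshots/case_{i}.js" for i in range(100)]
print(primary_language(files))  # Go
```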

12 bugs, 69 assertions

Every bug became a regression test. We now have a QA suite that clones 15 repos and runs 69 assertions against the scanner output. Each assertion checks for a specific known issue:

$ bash test/qa-repos.sh

honojs/hono (TS framework, library):
  PASS Should detect: Hono routes (found)
  PASS Should detect: Vitest (found)
  PASS Should NOT detect: Zod (correctly absent)
  PASS Should NOT detect: FastAPI (correctly absent)

vllm-project/vllm (ML engine, NOT FastAPI app):
  PASS Should detect: library (found)
  PASS Should NOT detect: 'FastAPI project' (correctly absent)

biomejs/biome (Rust linter, NOT a web framework):
  PASS Should NOT detect: React routes (correctly absent)
  PASS Should NOT detect: Vue (correctly absent)
  PASS Should NOT detect: Svelte (correctly absent)

========================================
RESULTS: 69 passed, 0 failed, 0 skipped
========================================

The pattern that emerged: scanning more repos doesn't just surface more findings — it surfaces new categories of error. Repos 1-15 found the happy-path bugs. Repos 16-30 found the adversarial bugs. After repo 25, we stopped finding new bug classes entirely. The engine converged.

What the data actually shows

Across 30+ repos, the same failure modes kept recurring: frameworks inferred from test and doc imports rather than application source, project conventions declared from a single file, and primary languages miscounted because of test fixtures.

Why this matters for context files

A context file that lies is worse than no context file at all. If AGENTS.md says "this is a FastAPI project" and the agent follows that instruction, it will generate FastAPI-style code in a library that doesn't serve HTTP requests. The agent trusted the file and produced confidently wrong output.

That's why we scan adversarially. Not just "does it detect the right things?" but "does it avoid saying wrong things?" The 69 assertions in our QA suite are split roughly 50/50 between should-detect and should-NOT-detect checks. Precision matters as much as recall.
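The two assertion kinds can be sketched in a few lines. The real QA suite is the bash script shown above; the function names and output handling here are illustrative assumptions, checking the generated AGENTS.md as plain text.

```python
def should_detect(output: str, needle: str) -> str:
    """Recall check: the scanner must mention `needle`."""
    if needle in output:
        return f"PASS Should detect: {needle} (found)"
    return f"FAIL Should detect: {needle} (missing)"

def should_not_detect(output: str, needle: str) -> str:
    """Precision check: the scanner must NOT mention `needle`."""
    if needle not in output:
        return f"PASS Should NOT detect: {needle} (correctly absent)"
    return f"FAIL Should NOT detect: {needle} (false positive)"

agents_md = "## Stack\n- Hono routes\n- Vitest\n"
print(should_detect(agents_md, "Hono routes"))  # PASS
print(should_not_detect(agents_md, "Zod"))      # PASS
```

The should-NOT-detect half is the one that catches confident lies before an agent ever reads them.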

You can explore all 32 repos at sourcebook.run/for, or run it on your own repo in about 3 seconds:

npx sourcebook init

No API keys. No LLM. Everything runs locally.

More from sourcebook:

- What we found scanning 15 open-source repos
- Why auto-generated context makes agents worse
- Explore: How 32 open-source repos actually work