▶ Live demo & how it works · PyPI · `uvx plainmarker check .`
A fail-closed second opinion on the code your AI agent just wrote — deterministic, in-loop checks that need no model and no account and never send your code off your machine.
plainmarker is an in-loop verifier for developers running AI coding agents (Claude Code, Cursor, Codex).
The moment the agent edits a file, plainmarker reads what the code, tests, and scans actually do, not
what the agent claims, and tells you in plain language what's dangerous right now: a hardcoded secret,
a curl | sh install hook. These are deterministic checks — detect-secrets, a
vendored offline Semgrep ruleset, and a shell-command auditor — not the writer agent's say-so, so they
can't just agree with it, and they run entirely on your machine. (plainmarker check additionally screens
your dependencies against the OSV advisory database, sending package names — never your code. plainmarker's
deeper, opt-in plainmarker audit adds an independent reviewer on a different AI model family and discloses
what it sends; the in-loop seatbelt runs no model.) It never calls your code
"safe" — only which checks passed, and that this is not a guarantee — and when a check can't finish, it
says so (fail-closed) instead of pretending all-clear. It also remembers why each decision was made
and re-checks it every session, so trust compounds instead of resetting.
You don't have to read the diff to know what your agent just did to your project.
Status: in-loop wedge, v0.49.0. The in-loop `plainmarker seatbelt` + an actionable verdict + a one-line install are the current focus. Running the target's tests in a sandbox and the agent-facing API are deferred to later milestones.
Your agent writes a hardcoded key, then a curl | sh. plainmarker reads the change (not the agent's claim) and tells you:
# (!) plainmarker found dangers in code your agent just wrote
2 open danger type(s) — secrets + code-vulnerabilities + risky shell only; NOT a guarantee of safety.
## What looks dangerous now
- 2 potential secret(s) found in the code (e.g. config.py:1).
- Dangerous shell command(s) found: 1 (download-and-run or secret-to-network exfiltration).
## Paste this to your coding agent to fix it
> Remove any hardcoded secret and load it from an environment variable, then rotate the exposed key.
> Replace any curl|sh with a pinned, checksum-verified download.
It exits non-zero so your Git hook or CI can gate on it — and it never calls your code "safe."
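A Git hook or CI step can consume that exit code directly. The following is a minimal sketch, not something plainmarker ships: `is_clear`, `run_gate`, and `main` are hypothetical names, the 120-second timeout is an arbitrary choice, and it assumes `plainmarker` is on `PATH`. The wrapper stays fail-closed itself by treating a missing executable or a timeout as "not clear".

```python
import subprocess
import sys
from typing import Sequence


def is_clear(exit_code: int) -> bool:
    """Only exit code 0 counts as clear; every other value blocks."""
    return exit_code == 0


def run_gate(command: Sequence[str], timeout_s: float = 120.0) -> bool:
    """Run a gate command and return True only if it exited 0.

    A missing executable or a timeout is treated as "not clear", so the
    wrapper never turns a failure to run into a pass.
    """
    try:
        completed = subprocess.run(list(command), timeout=timeout_s)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return is_clear(completed.returncode)


def main() -> int:
    """Hypothetical hook entry point: gate the current repository."""
    if run_gate(["plainmarker", "seatbelt", "."]):
        return 0
    print("plainmarker gate: not clear, blocking", file=sys.stderr)
    return 1
```

To use it as a pre-commit hook, a script at `.git/hooks/pre-commit` would end with `sys.exit(main())`, so any non-zero result aborts the commit.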
Ground truth only. plainmarker never reports what the AI says it did. It reports what the code, the tests, the scans, and the live system actually are. It never claims a project is "safe," only which specific checks passed, and that this is not a guarantee.
Local-capable, and honest about egress. The independent Auditor uses a cloud model by
default, so whenever it runs — and when onboarding writes its plain-English summary — plainmarker tells
you, in plain language, that your code or project facts left your machine. Turn on local_only with
a local Auditor model and nothing leaves your computer. No telemetry, ever.
Three ready-to-use Auditor configs ship under keeper_core/templates/keeper/ — point at one with
plainmarker audit <path> --config <file>:
- Default: NVIDIA-served DeepSeek (no config file needed). A capable cloud model; set `NVIDIA_API_KEY`.
- Free: OpenRouter (`config.openrouter-free.yaml`). Run the Auditor at no cost on a free, different-family model; set `OPENROUTER_API_KEY`. Egress is disclosed like any cloud model. ⚠️ Free models may log/train on inputs, so if you are not 100% sure your project is free of private/customer data, use the local option below instead. They are also rate-limited: a throttle shows up honestly as "could not determine" (never a fake pass), so a first run that says "the reviewer did not run" is usually a rate limit. Run `plainmarker doctor` for the exact reason, then retry / switch model / go local.
- Local / private: Ollama (`config.local-ollama.yaml`). Zero egress, unlimited; `local_only: true` refuses any remote endpoint. Best for private code, and the only option with a benchmarked model (`qwen2.5-coder:7b`).
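As an illustration of how the three options map onto the `plainmarker audit <path> --config <file>` invocation, here is a small sketch that builds the argument list from whichever API key is present. The helper name `audit_command`, the `private` switch, and the selection order (fall back to the local config when no key is set) are assumptions made for this example, not plainmarker behavior. Where the templates directory lives on disk depends on your install, so it is passed in.

```python
from pathlib import Path
from typing import List, Mapping


def audit_command(
    project: str,
    templates_dir: Path,
    env: Mapping[str, str],
    private: bool = False,
) -> List[str]:
    """Build an argv for `plainmarker audit` (hypothetical selection policy)."""
    base = ["plainmarker", "audit", project]
    local = ["--config", str(templates_dir / "config.local-ollama.yaml")]
    if private:
        # Private code: always use the zero-egress local config.
        return base + local
    if env.get("NVIDIA_API_KEY"):
        # The default Auditor needs no config file.
        return base
    if env.get("OPENROUTER_API_KEY"):
        return base + ["--config", str(templates_dir / "config.openrouter-free.yaml")]
    # No key available: fall back to the local option rather than fail.
    return base + local
```

Passing `os.environ` as `env` and handing the result to `subprocess.run` would launch the audit; keeping the builder pure makes the choice easy to inspect before anything runs.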
- `keeper_core/`: the standalone engine. All real logic. Knows nothing about Claude Code and runs on its own.
- `adapters/`: thin wrappers that let a host use Core. The Claude Code plugin is the first; a standalone CLI/service and others follow. Adapters contain no logic of their own.
See docs/ARCHITECTURE.md for the full plain-language map of the parts.
uvx plainmarker check .
That's the whole install: uvx fetches a compatible Python and runs
plainmarker — no clone, no global setup. For a command that stays around, uv tool install plainmarker
(or pipx install plainmarker), then just plainmarker <command>.
For the in-loop catch, run `plainmarker seatbelt .` right after an agent edits: it tells you
what's dangerous now and gives you a paste-ready fix to hand back to the agent. To fire it automatically the
moment your agent writes a file, add a Claude Code PostToolUse hook that calls `plainmarker seatbelt` yourself. That wiring is manual today (a packaged hook is on the roadmap), and plainmarker
never edits your ~/.claude config for you. The hook only fires
in repos you have opted into the watch, which you do with `mkdir -p .keeper && touch .keeper/.watch`.
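The manual wiring could look roughly like the script below, which a PostToolUse hook entry would call. It is a sketch under several assumptions: the function names are hypothetical; it assumes Claude Code passes the hook a JSON payload on stdin that includes a `cwd` field, and that a hook exit code of 2 surfaces stderr back to the agent (check the current Claude Code hook documentation before relying on either). It also looks for the `.keeper/.watch` marker itself, searching parent directories, so it stays silent in repos that have not opted in.

```python
import json
import subprocess
import sys
from pathlib import Path
from typing import Optional


def watched_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` holding .keeper/.watch."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".keeper" / ".watch").is_file():
            return candidate
    return None


def handle_event(payload_text: str) -> int:
    """Process one hook payload and return the exit code for the hook."""
    try:
        payload = json.loads(payload_text)
    except json.JSONDecodeError:
        payload = {}
    cwd = payload.get("cwd") if isinstance(payload, dict) else None
    start = Path(cwd) if isinstance(cwd, str) and cwd else Path.cwd()
    root = watched_root(start)
    if root is None:
        return 0  # repo has not opted in: do nothing
    try:
        result = subprocess.run(
            ["plainmarker", "seatbelt", str(root)],
            capture_output=True,
            text=True,
            timeout=120,  # arbitrary limit chosen for this sketch
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        print(f"plainmarker seatbelt could not run: {exc}", file=sys.stderr)
        return 2
    if result.returncode != 0:
        # Pass the verdict and paste-ready fix back through stderr.
        print(result.stdout + result.stderr, file=sys.stderr)
        return 2
    return 0
```

A hook script would finish with `sys.exit(handle_event(sys.stdin.read()))`. Doing the opt-in check before launching anything keeps the hook cheap in projects that never created the marker file.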
For the opt-in plainmarker audit (the independent, different-family model reviewer) add one API key
for the Auditor model — read from an environment variable, never stored in the repo (see Choosing the
Auditor model above).
Point plainmarker at any project (each writes its receipts under that project's .keeper/ folder):
| Command | What it does |
|---|---|
| `plainmarker doctor` | check your setup and that the two models are reachable |
| `plainmarker onboard <path>` | map a project (files, stack, dependencies, risk areas) + a plain summary |
| `plainmarker seatbelt <path>` | in-loop: after your agent edits a file, what's dangerous NOW + a paste-ready fix to hand back to the agent |
| `plainmarker check <path>` | what changed since plainmarker last looked, what's dangerous now, and what it did NOT check |
| `plainmarker baseline <path>` | read-only hard checks: exposed secrets, vulnerable dependencies, code-vulnerability patterns |
| `plainmarker audit <path>` | the independent different-family reviewer over ground truth (facts vs concerns) |
| `plainmarker interrogate <path>` | a few plain questions; flags where the code disagrees with your intent |
| `plainmarker report <path>` | one plain-language report: what's true, broken, unsure, and to decide |
| `plainmarker calibrate` | prove it catches planted flaws and passes clean projects (and see its blind spots) |
plainmarker never claims a project is "safe" — only which specific checks passed, and that this is not a guarantee.
A script can gate on plainmarker's exit code. There are two shapes, both fail-closed in the way that matters:
a real danger never exits 0.
| Code | Meaning |
|---|---|
| `0` | Nothing to act on (see per-verb note below). |
| `1` | STOP: something needs a human. |
| `2` | Usage error (bad arguments; standard argparse). Not a security signal. |
- `plainmarker seatbelt` and `plainmarker baseline` are fail-closed gates: exit `0` only when nothing dangerous was found and every check actually ran. A danger, or a check that could not finish (timeout / offline / missing tool), or a crash → `1`. There is deliberately no exit code meaning "could not determine": doubt collapses to `1`, never `0`, so a hook can treat any non-zero as "don't trust this yet." (`baseline` is strict: with no network for the dependency check it returns `1`.)
- `plainmarker check` is the change narrator: it exits `1` only when a real danger is present right now, and `0` otherwise. A gracefully-undetermined (unknown) check, such as a scanner that timed out or is offline, is disclosed in the report text and in `--json` as `fail_closed` / `could_not_check`, not in the exit code (a check that outright crashes still exits `1`). If you want fail-closed-on-unknown gating from `check`, read `--json` `fail_closed`, not the exit code (or gate with `seatbelt`).
- `plainmarker audit` additionally needs the Auditor model reachable.
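For fail-closed-on-unknown gating from `check`, a wrapper has to combine the exit code with the `--json` output. The sketch below shows one way to do that. The JSON layout is an assumption: it treats `fail_closed` and `could_not_check` as top-level fields and counts any truthy value in either as "something was not determined". The real report may be shaped differently, so inspect actual `--json` output first. The function names are hypothetical. A missing `fail_closed` field, or output that is not valid JSON, also blocks, so a wrong guess about the schema fails closed rather than open.

```python
import json
from typing import Any


def has_unknown_checks(report: Any) -> bool:
    """True if the parsed report signals, or might hide, an undetermined check."""
    if not isinstance(report, dict):
        return True  # unexpected shape: treat as unknown
    if "fail_closed" not in report:
        return True  # assumed field is absent: do not guess, block
    return bool(report["fail_closed"]) or bool(report.get("could_not_check"))


def gate_check(exit_code: int, json_text: str) -> bool:
    """Return True only if `check` exited 0 and nothing was left undetermined."""
    if exit_code != 0:
        return False
    try:
        report = json.loads(json_text)
    except json.JSONDecodeError:
        return False
    return not has_unknown_checks(report)
```

A caller would capture the stdout of `plainmarker check <path> --json` together with its return code and pass both to `gate_check`.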
MIT.

