Distilling Frontier Model Security Reviews into a Cheap Deterministic Scan

We're releasing skills that pair an LLM agent with OpenTaint's taint engine. The agent maps your application's attack surface, models the library methods the engine can't see, and writes rules specific to your code — and it can confirm a finding by actually exploiting it. After that, the engine re-scans every future commit on its own, for the cost of CPU.

Jun 10, 2026

Frontier models can genuinely review code for security now. An LLM agent reads code the way a human reviewer does — it follows intent, catches the dangerous pattern nobody wrote a rule for, and reasons about whether a given flow is actually exploitable in context. The catch is that you’re only renting that judgment — the agent burns tokens on every file, it gives a different answer each time you run it, and it can’t promise it ever looked everywhere. A taint engine is the opposite kind of tool: it traces untrusted data through your code and invents nothing, but whatever you teach it, the engine applies everywhere, the same way every time, for the cost of CPU. It is a mistake to treat these two as a choice — pick one and you’re left living with its blind spots.

We’re releasing a set of skills that run the two together as a single workflow. The skills plus OpenTaint, our taint-analysis engine, form a harness for the model: the agent does the discovering, and the engine applies what it finds on every scan, deterministically and for the cost of CPU alone. The slow, probabilistic work happens once and gets distilled into a few artifacts the engine reuses on every future commit. From there, a frontier-model security review becomes a plain deterministic scan.

Division of labor

The taint analysis engine runs on a single idea. You describe code with patterns, and it matches them along traces — the actual route data takes through a running program, from where it enters to where it gets used. A security rule is just one specific pattern: untrusted data (an HTTP parameter, say, or a request header) reaching a dangerous operation (a SQL query, a shell command) with nothing in between to make it safe. When a trace matches the pattern, you’ve got a finding.
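As a minimal sketch of that idea, suppose a trace is just an ordered list of operation names. The `Rule` class, the `matches` function, and the operation names below are hypothetical illustrations, not OpenTaint's rule format or API:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    sources: frozenset     # operations that introduce untrusted data
    sinks: frozenset       # dangerous operations
    sanitizers: frozenset  # operations that make the data safe again


def matches(rule: Rule, trace: list) -> bool:
    """Return True if untrusted data reaches a sink with nothing making it safe."""
    tainted = False
    for step in trace:
        if step in rule.sources:
            tainted = True
        elif step in rule.sanitizers:
            tainted = False
        elif step in rule.sinks and tainted:
            return True
    return False


sqli = Rule(
    sources=frozenset({"http_param"}),
    sinks=frozenset({"sql_query"}),
    sanitizers=frozenset({"escape_sql"}),
)

finding = matches(sqli, ["http_param", "concat", "sql_query"])
safe = matches(sqli, ["http_param", "escape_sql", "sql_query"])
```

Here `finding` is a match and `safe` is not, because the sanitizer sits between the source and the sink.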

Classical SAST tools miss real vulnerabilities for three main reasons, and each one has its own fix:

  1. Insensitivity. To stay fast on big codebases, conventional engines give up precision on purpose. Field-insensitivity treats all of an object’s fields as one. Context-insensitivity cannot distinguish different call sites for the same function. Alias-insensitivity loses track of which references point at the same object. Each of these shortcuts collapses a distinction that real code actually relies on, and you pay for it both ways — false alarms on safe code, and missed flows on the dangerous kind.
  2. Opaque externals. To trace data, the engine has to know what every method along the path does with it, and library code is no exception. The engine could analyze your dependencies the same way it analyzes your code, but they usually outweigh it many times over, so doing that would blow up the cost of every scan. Instead it works from models of what those library methods do. When a model is missing, the engine has to guess, and either guess hurts. If it guesses that the method passes nothing through, real flows die quietly at the call and the scan never reports them. If it guesses the opposite, that everything passes through, you get a flood of phantom flows that bury the findings that matter. Without a model, those are the only two options it has.
  3. Generic rules. A built-in ruleset has no idea that your application treats some particular header as untrusted, or that some helper of yours already sanitizes the data. Code-specific vulnerabilities need code-specific rules, and nobody has written yours yet. Studies of what SAST tools miss keep finding the same thing: most misses come from missing rules, not from a weak engine.
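To make the second failure concrete, here is a toy sketch of the two guesses available at an unmodeled call, next to what a model provides. Everything in it is an assumption for illustration: the method names are made up, and a real engine tracks far more than one boolean per flow.

```python
def propagate(trace, models, unknown_policy):
    """Walk a linear call chain and report whether taint survives to the end.

    trace: method names the data passes through, in order.
    models: dict of method name -> True (passes taint) or False (stops it).
    unknown_policy: the guess for an unmodeled method, "drop" or "pass".
    """
    tainted = True  # the data starts out untrusted
    for method in trace:
        if method in models:
            passes = models[method]
        else:
            passes = unknown_policy == "pass"
        tainted = tainted and passes
    return tainted


# Hypothetical library methods: lib.wrap really carries the data along,
# lib.isBlank really returns only a boolean about it.
real_flow = ["readHeader", "lib.wrap", "runQuery"]
phantom_flow = ["readHeader", "lib.isBlank", "runQuery"]

known = {"readHeader": True, "runQuery": True}
modeled = {**known, "lib.wrap": True, "lib.isBlank": False}

missed_real_flow = not propagate(real_flow, known, "drop")
reported_phantom = propagate(phantom_flow, known, "pass")
```

With `modeled` in place of `known`, the real flow is reported and the phantom one is not, whichever guess the policy would have made.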

The OpenTaint engine is our answer to that first failure, the insensitivity. Where other engines approximate, it stays sensitive — it tracks each field of an object separately, keeps call sites apart, and resolves aliases back to the object they actually name — and it follows each flow all the way to its end, with no depth cutoff. Within the surface it can see, it is exhaustive: nothing sampled, nothing skipped.
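A small sketch of what field sensitivity buys, under the assumption that taint is recorded per write to an object field. The two functions and the `user` object are hypothetical and are not how OpenTaint represents state:

```python
def taint_field_insensitive(writes):
    """Track taint per object: one untrusted field taints the whole object."""
    tainted = set()
    for obj, field, is_untrusted in writes:
        if is_untrusted:
            tainted.add(obj)
    return tainted


def taint_field_sensitive(writes):
    """Track taint per (object, field) pair."""
    tainted = set()
    for obj, field, is_untrusted in writes:
        if is_untrusted:
            tainted.add((obj, field))
        else:
            tainted.discard((obj, field))
    return tainted


# user.name comes from the request, user.id from the session.
writes = [("user", "name", True), ("user", "id", False)]

# A sink later reads user.id, which only ever held trusted data.
insensitive_alarm = "user" in taint_field_insensitive(writes)
sensitive_alarm = ("user", "id") in taint_field_sensitive(writes)
```

The insensitive version raises a false alarm on `user.id`; the sensitive one stays quiet there while still flagging `user.name`.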

The AI agent takes on the other two failures, and it does it by writing two kinds of artifact. Approximations are small models that teach the engine how a method moves data, so an opaque call turns into a step the trace can pass through. Project rules are the patterns your code needs — where untrusted data gets in, which calls are dangerous, which helpers make data safe again. Anyone can read them and check them in review — the rules are AST patterns, the approximations are a simple YAML file. From then on, the engine applies them on every commit, the same way each time. The model’s judgment is captured the first time and reused on every scan after at no extra cost.

Here’s one such approximation. Spring’s ResponseEntity.ok(body) stashes the body inside the entity it returns, and getBody() hands it back — so data that goes in one side comes back out the other:

passThrough:
  - function: org.springframework.http.ResponseEntity#ok
    copy:
      - from: arg(0)
        to:
          - this
          - .org.springframework.http.HttpEntity#Body#java.lang.Object
      - from: this
        to: result
  - function: org.springframework.http.HttpEntity#getBody
    copy:
      - from:
          - this
          - .org.springframework.http.HttpEntity#Body#java.lang.Object
        to: result
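One way to read that file, as a toy sketch rather than the engine's actual semantics: treat taint facts as access paths, and each copy entry as re-rooting facts from one path to another. The tuple encoding, the `copy` helper, and the shortened `Body` field name are assumptions made for illustration.

```python
BODY = "Body"  # stands in for the full HttpEntity#Body field name


def copy(facts, src, dst):
    """Return facts plus every fact under prefix `src`, re-rooted at `dst`."""
    n = len(src)
    moved = {dst + f[n:] for f in facts if f[:n] == src}
    return facts | moved


def model_ok(facts):
    # ResponseEntity#ok: arg(0) -> this.Body, then this -> result
    facts = copy(facts, ("arg0",), ("this", BODY))
    return copy(facts, ("this",), ("result",))


def model_get_body(facts):
    # HttpEntity#getBody: this.Body -> result
    return copy(facts, ("this", BODY), ("result",))


# Tainted data goes in as the argument of ok(...).
after_ok = model_ok({("arg0",)})

# The returned entity becomes the receiver ("this") of getBody().
entity_facts = {("this",) + f[1:] for f in after_ok if f[0] == "result"}
after_get = model_get_body(entity_facts)
```

After both steps, `("result",)` is among the facts, so the trace can pass through the pair of calls instead of dying at the first one.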

And here’s part of a project rule — the OkHttp calls this particular codebase uses to reach the network, written up as the dangerous operations in its SSRF pattern:

patterns:
  - pattern-either:
      - pattern: (okhttp3.Request.Builder $B).url($UNTRUSTED)
      - pattern: okhttp3.HttpUrl.get($UNTRUSTED)
      - pattern: okhttp3.HttpUrl.parse($UNTRUSTED)
      - pattern: (okhttp3.HttpUrl $U).newBuilder($UNTRUSTED)
      - pattern: (okhttp3.OkHttpClient $C).newCall($UNTRUSTED)
      - pattern: (okhttp3.OkHttpClient $C).newWebSocket($UNTRUSTED, $LISTENER)
  - focus-metavariable: $UNTRUSTED

That’s all an artifact really is — a small, checkable claim about how data moves through your code, reviewed in a pull request like anything else.

Distilling the security review

A security review is really a chain of judgments about trust. Where’s the attack surface — the places outside data gets in, and the operations it could damage? Where do the trust boundaries run — the checks that make data safe, and the library code it vanishes into along the way? And of the flows running between those points, which are actually exploitable? Point the AppSec Agent skill at a project and it works through exactly those questions, writing down each answer as it goes. The attack-surface map becomes a coverage record. Checks and dangerous operations turn into rules. The library behavior gets written up as approximations, and every flow ends up with a recorded verdict and the reasoning behind it. Of all of that, the rules and approximations are the part the engine replays on every later scan — the distilled review. Here’s the workflow:

Stage 1 — Scan. The agent does the discovering and turns what it finds into artifacts. The engine applies them, deterministically. The cycle is approximation saturation — re-scan until no new opaque methods turn up.

  • Maps the attack surface — the places untrusted data enters, and the operations where it can do real damage.
  • Authors rules, test-first — the patterns your code is missing, each accepted only once it’s tested.
  • Scans — the engine traces every flow and hands back each finding as the full trace behind it.
  • Saturates approximations — models the opaque methods, over and over, until no new ones show up.

Mapping is what makes coverage defensible. The agent works through every dependency the project pulls in and sorts them: which ones handle outside data, which expose dangerous operations, and which are just inert. The packages it flags get drilled into, down to the exact methods the project calls. The ones it dismisses get logged with a reason. What’s left is a coverage record showing what got examined, what got set aside, and why. You also set the scope up front — the whole application, one component you’re worried about, or a single bug you already suspect.

Rules get written the way code does, tests first. Where a built-in rule already fits, the agent just wires it in. It only writes a new pattern when nothing existing covers the case. Each new rule comes with two examples — a vulnerable snippet it should flag and a safe one it shouldn’t — and it isn’t accepted until it gets both right. A rule that misfires manufactures noise at scale, across the whole codebase at once. Writing the tests first heads that off.
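The acceptance gate can be sketched in a few lines. This is a toy: a regular expression stands in for an AST pattern, and `accept_rule` plus both code samples are hypothetical, not the create-rule skill's real interface.

```python
import re


def accept_rule(pattern, should_fire, should_not_fire):
    """Accept a rule only if it flags every vulnerable sample and no safe one."""
    rule = re.compile(pattern)
    fires_on_bad = all(rule.search(s) for s in should_fire)
    quiet_on_good = not any(rule.search(s) for s in should_not_fire)
    return fires_on_bad and quiet_on_good


vulnerable = ["client.newCall(request(userUrl))"]
safe = ['client.newCall(request("https://example.com/health"))']

# Flags every newCall, including the one with a constant URL.
too_broad = accept_rule(r"newCall\(", vulnerable, safe)

# Flags newCall only when the URL is not a string literal.
precise = accept_rule(r'newCall\(request\((?!")', vulnerable, safe)
```

The broad pattern is rejected because it also fires on the safe sample; the narrower one gets both samples right and is accepted.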

The deep findings only surface after saturation. The agent models the methods where traces were dying, then re-scans — and that scan usually turns up a fresh layer of unmodeled methods the new approximations just exposed. So it models those too and scans again. This keeps going until a scan finds nothing new. Only at that point does the engine start seeing flows that thread through many layers of library and framework code at once. In one real case, this workflow found and confirmed an unauthenticated remote-code-execution bug in a widely used open-source server. The path ran from an HTTP request, through several layers of framework plumbing, into a scripting evaluator — and most of those layers were the unmodeled methods where the trace had been quietly dying. Only once the approximations were saturated did it connect the request all the way to the evaluator.
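The loop itself is a plain fixed-point iteration. In the sketch below, `scan` and `model` are placeholders for the engine scan and the agent's modeling step, and the three-method chain is invented to show one layer exposing the next:

```python
def saturate(scan, model, approximations):
    """Re-scan until a scan reports no unmodeled methods.

    scan(approximations) -> set of opaque methods where traces died.
    model(method) -> an approximation for that method.
    """
    rounds = 0
    while True:
        opaque = {m for m in scan(approximations) if m not in approximations}
        if not opaque:
            return approximations, rounds
        for method in opaque:
            approximations[method] = model(method)
        rounds += 1


# Hypothetical layering: each method hides the one after it.
CHAIN = ["Framework#dispatch", "Binder#bind", "Script#prepare"]


def toy_scan(approximations):
    for method in CHAIN:
        if method not in approximations:
            return {method}  # the trace dies at the first unmodeled call
    return set()


approx, rounds = saturate(toy_scan, lambda m: "pass-through", {})
```

In this toy it takes three modeling rounds before a scan comes back with nothing new, and only then does a trace reach the end of the chain.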

You choose how much of this workflow to actually run, and each level includes everything in the one below it:

  • Lite — the scan with whatever rules you already have (the built-ins on day one, plus anything earlier runs distilled), followed by triage.
  • Normal — adds approximation saturation.
  • Deep — adds attack-surface mapping and test-first rules on top of that.

Exploit confirmation is a part of the triage stage, available at any depth — you can turn it on even with a Lite scan.

Findings you can trust

When the scan finishes, every finding goes back to the agent for triage. It reads the full trace and calls the finding real or false. With exploit confirmation switched on, it goes past just reading the trace — for each candidate it writes an actual proof-of-concept exploit and runs it against a throwaway local instance of the app. If the exploit lands, the finding is real, by definition. If the agent tries everything and still can’t get a PoC to work, it marks the finding as a probable false positive — though a failed exploit only suggests the code is safe and can’t prove it, so the verdict stays in the report for a human to overrule.
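The verdict logic with exploit confirmation on is asymmetric, which a short sketch makes visible. The `Verdict` enum and `triage` function are hypothetical names, and each PoC attempt is modeled as a callable that returns True when the exploit lands:

```python
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "confirmed"
    PROBABLE_FALSE_POSITIVE = "probable false positive"


def triage(finding, poc_attempts):
    """Confirm on the first working exploit; otherwise stay tentative."""
    for attempt in poc_attempts:
        if attempt(finding):
            return Verdict.CONFIRMED
    # No PoC worked: this suggests the code is safe but does not prove it,
    # so the verdict is left for a human to overrule.
    return Verdict.PROBABLE_FALSE_POSITIVE


verdict = triage("ssrf-in-thumbnailer", [lambda f: False, lambda f: True])
```

One success is enough to confirm, while any number of failures only ever yields the tentative verdict.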

Stage 2 — Triage. A working exploit confirms the finding. When none can be found, the agent turns to diagnosing what misled the engine and writes the cause down.

A false positive doesn’t just get thrown out. The agent works out why the engine was fooled and records the cause next to the verdict, sorting it into one of three kinds. A rule issue: the pattern is too broad — it treats data as untrusted when it never was, or it misses the check that already neutralizes it. An approximation issue: a model is propagating data the real method wouldn’t. An engine issue: the trace itself is wrong and no artifact is to blame — and the agent writes that one up as a reproducible report for us to fix on the engine side.

Those first two — rule and approximation diagnoses — can be picked up later, and each can be turned into a small failing test: the false positive boiled down to an example the artifact must not match. The same skills that author rules and approximations also tune them. You add that example as a should-not-fire test, then adjust the rule or correct the approximation until it passes and the should-fire tests still do. And because the artifact applies to the whole codebase on every scan, fixing it clears that entire class of false positives in one go — anywhere the same mistake would have fired. Stage 1’s saturation covered the case where the engine wasn’t propagating enough. This covers the opposite one — rules and approximations that propagate too much.
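A minimal sketch of that tuning loop, assuming a rule is just three sets of operation names and a trace is a list of them. The dictionary layout and the `allowlist_host` helper are hypothetical, not OpenTaint's rule format:

```python
def fires(rule, trace):
    """True if a source reaches a sink with no sanitizer in between."""
    tainted = False
    for step in trace:
        if step in rule["sources"]:
            tainted = True
        elif step in rule["sanitizers"]:
            tainted = False
        elif step in rule["sinks"] and tainted:
            return True
    return False


def passes_tests(rule, should_fire, should_not_fire):
    return all(fires(rule, t) for t in should_fire) and not any(
        fires(rule, t) for t in should_not_fire
    )


rule = {"sources": {"header"}, "sinks": {"http_call"}, "sanitizers": set()}
should_fire = [["header", "http_call"]]

# The false positive, boiled down: the rule missed the project's own check.
false_positive = ["header", "allowlist_host", "http_call"]

before = passes_tests(rule, should_fire, [false_positive])
rule["sanitizers"].add("allowlist_host")  # the fix to the artifact
after = passes_tests(rule, should_fire, [false_positive])
```

The new should-not-fire test fails before the fix and passes after it, and the original should-fire test keeps passing throughout.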

The payoff

You pay for the model’s discovery once per artifact. The result is saved, and from then on the engine spreads that one-time cost across the whole codebase and every commit that comes after. Scans are deterministic — run the same scan over the same code and the same findings come back — and they cost only CPU time. The only thing that’s ever uncertain is the discovery up front. Everything after it is fixed.

Here’s how that played out on Komga — an open-source media server, around 137,000 lines of Kotlin — run completely cold, with the agent set to Deep:

|      | Cold Deep agent run (once) | Scan (every commit) |
| ---- | -------------------------- | ------------------- |
| Cost | $112 in tokens             | $0 (CPU only)       |
| Time | 2 hours 51 minutes         | 1 minute 37 seconds |

Token costs are measured with Claude Opus 4.8 ($5 / $25 per million input / output tokens).

Across the projects we’ve tried so far, that one-time spend has landed in the $100–$200 range. After that, a commit is just a scan again — a minute or two of CPU, and the dollar cost drops to nothing.
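The amortization is simple arithmetic on the Komga figures above. The sketch assumes no further agent runs after the cold one, which is a simplification; the 1,000-commit horizon is an arbitrary example.

```python
ONE_TIME_COST_USD = 112.0              # cold Deep run on Komga
COLD_RUN_SECONDS = 2 * 3600 + 51 * 60  # 2 hours 51 minutes
SCAN_SECONDS = 1 * 60 + 37             # 1 minute 37 seconds


def amortized_cost(commits):
    """Token cost per commit if the one-time spend is spread evenly."""
    return ONE_TIME_COST_USD / commits


per_commit = amortized_cost(1000)
speedup = COLD_RUN_SECONDS / SCAN_SECONDS

print(f"${per_commit:.3f} per commit over 1000 commits")
print(f"scan is about {speedup:.0f}x faster than the cold run")
```

Commits that cross a new boundary add their own, smaller one-time cost on top of this.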

Only that first run is cold. Those three hours went into closing the gap between what the engine knows out of the box and this one specific stack — the libraries it had no models for, the patterns particular to the project. And that knowledge doesn’t expire once it’s written down. From there on, a commit comes down to one of two cases:

  • Nothing new crossed a boundary — the common case. The engine re-scans with the artifacts already sitting in the repo, and a fresh finding in an existing pattern costs nothing extra.
  • The code crossed a boundary the artifacts don’t cover yet — a new library, a new kind of entry point. The agent can come back to model just that new piece, and nothing more.

Either way, the only tokens you spend go toward whatever the artifacts don’t already cover — everything else just runs as a scan.
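That decision can be pictured as a set difference. The sketch below is an assumption about how one might frame it, not how the skills decide: it treats a boundary as a library package and compares the packages a commit touches against those the checked-in artifacts already cover.

```python
def uncovered_boundaries(commit_packages, covered_packages):
    """Packages the commit reaches that no rule or approximation covers yet."""
    return sorted(set(commit_packages) - set(covered_packages))


def plan(commit_packages, covered_packages):
    """Always scan; bring the agent back only for what is new."""
    return {
        "scan": True,
        "agent_models": uncovered_boundaries(commit_packages, covered_packages),
    }


covered = ["okhttp3", "org.springframework.http"]

common_case = plan(["okhttp3"], covered)
new_library = plan(["okhttp3", "org.jsoup"], covered)
```

In the common case the agent's list is empty and the commit is only a scan; in the other, the agent is asked about `org.jsoup` and nothing else.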

Get started

OpenTaint is open source, Apache 2.0 licensed. It analyzes Java and Kotlin today, with Go and Python next on the roadmap. Add the skills to your coding agent — Claude Code, OpenCode, Codex, or anything that supports the skills format:

npx skills add https://github.com/seqra/opentaint

That installs fifteen skills. One is the workflow itself, and the rest are the steps it drives — each usable on its own:

  • Pipeline: appsec-agent runs the whole workflow end to end, from build to confirmed findings.
  • Build and scan: build-project builds the project into the model the engine analyzes. run-scan runs the deterministic scan and produces the report.
  • Attack surface: triage-dependencies marks which dependencies could introduce sources or sinks. discover-attack-surface drills into a flagged package for the exact methods the project uses.
  • Rules: create-rule authors a detection rule test-first, and fixes one that misfires or misses. assemble-lib-rules merges the per-package rules with the built-ins into the project-level patterns.
  • Approximations: analyze-external-methods sorts the methods where traces died and decides what to approximate. create-pass-through-approximation models a method whose propagation is plain copying. create-dataflow-approximation models one a copy cannot express.
  • Artifact tests: create-test-project builds the annotated should-fire / should-not-fire samples that rules and approximations are verified against. debug-rule traces where taint is dropped when one misbehaves.
  • Triage: analyze-findings splits a rule’s findings into distinct vulnerabilities and rules each true or false. generate-poc reproduces a finding against the running application.
  • Engine feedback: report-analyzer-issue turns a confirmed engine-side diagnosis into a reproducible report, optionally a GitHub issue.

appsec-agent will offer to install the engine for you if it isn’t already there. To start, open your coding agent in the project and ask it to find vulnerabilities. It asks two questions up front, then works the rest of the pipeline on its own:

  • Scan depth — the Lite / Normal / Deep ladder from above.
  • Exploit confirmation — whether to confirm findings with PoCs. Dynamic triage launches throwaway local instances of the application under test and tears them down at the end.

Everything it produces lands in one .opentaint/ directory at the project root:

  • the rules and approximations it wrote — the distilled review itself
  • the attack-surface coverage record — what was examined, what was dismissed, and why
  • a verdict with reasoning per finding
  • a PoC script per confirmed vulnerability
  • a vulnerabilities.md report on top

Keep that directory in the repo. From then on, every later scan can run straight from those artifacts — deterministically, even on CI.

The more of your code is AI-written, the more you need a formal layer underneath it — one that turns every discovery into coverage that lasts.