Why Static Analysis Fails Gracefully

Static analysis has an old reputation problem: it cannot be complete. The moment code becomes dynamic enough, the analyzer begins to lose precision. It misses behavior hidden behind runtime values. It over-approximates control flow. It flags things that are not actually reachable. It gives up on cases that humans can understand in a few seconds.

All of that is true.

But it is also the wrong standard.

The useful property of static analysis is not perfection. It is that the failure mode is structured. When a static analyzer fails, it usually fails for a reason: dynamic dispatch, string construction, reflection, external state, native code, runtime configuration, data-dependent branches, or missing library models. Those reasons are not noise. They are a map of where the program becomes hard to reason about before execution.

This matters more in agent systems than in ordinary software review.

An agent skill may be small, but it is exposed to a model that can call it under many different task contexts. The skill may look harmless in isolation and still become sensitive when paired with user data, credentials, shell access, or a browser session. Runtime monitoring is important, but waiting until execution is not always enough. Before a skill is installed, shared, or enabled, we want to know what kind of capability it offers.

Static analysis can answer part of that question cheaply.

It can identify direct capability surfaces:

filesystem reads and writes
subprocess execution
network clients
browser automation
environment variables
credential stores
persistent databases
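
As a rough illustration of how cheap this first pass can be, here is a minimal sketch that maps a skill's imports to capability surfaces using Python's standard `ast` module. The `CAPABILITY_BY_MODULE` table and the `imported_capabilities` function are hypothetical names of mine, and the table is deliberately tiny. Imports are only a coarse signal, so a real analyzer would also need to look at calls, attribute access, and third-party libraries.

```python
import ast

# Hypothetical, deliberately incomplete table: top-level module -> capability surface.
CAPABILITY_BY_MODULE = {
    "subprocess": "subprocess execution",
    "socket": "network clients",
    "urllib": "network clients",
    "http": "network clients",
    "sqlite3": "persistent databases",
    "shutil": "filesystem reads and writes",
    "pathlib": "filesystem reads and writes",
}


def imported_capabilities(source: str) -> set[str]:
    """Return the capability surfaces suggested by a module's import statements."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x".
            names = [node.module or ""]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in CAPABILITY_BY_MODULE:
                found.add(CAPABILITY_BY_MODULE[top_level])
    return found


skill_source = "import subprocess\nfrom urllib.request import urlopen\n"
print(sorted(imported_capabilities(skill_source)))
```

The code never runs the skill. It only parses it, which is the point: the question "what can this touch?" gets a first answer before anything executes.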

It can also identify uncertainty:

command strings assembled from model input
URLs that cannot be resolved statically
file paths derived from task context
external binaries with unknown behavior
helper libraries with broad side effects
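
The first item on that list can be sketched in a few lines. The example below, again using only the standard `ast` module, labels each `subprocess` call as "known" when its command is a literal and "unknown" when it is built from anything else. The names `classify_subprocess_calls` and `is_static` are mine, and the sketch assumes the skill calls the module as `subprocess.<func>(...)`. Aliased imports or `from subprocess import run` would slip past it, which is itself a small example of the structured failure described above.

```python
import ast

# Functions in the standard subprocess module that launch a process.
SUBPROCESS_FUNCS = {"run", "call", "check_call", "check_output", "Popen"}


def is_static(node: ast.AST) -> bool:
    """True if the expression is a literal, or a list/tuple made only of literals."""
    if isinstance(node, ast.Constant):
        return True
    if isinstance(node, (ast.List, ast.Tuple)):
        return all(is_static(element) for element in node.elts)
    return False


def classify_subprocess_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, "known" | "unknown") for each subprocess.<func>(...) call."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == "subprocess"
            and func.attr in SUBPROCESS_FUNCS
        ):
            # A command passed by keyword or not at all is treated as unknown.
            command_is_static = bool(node.args) and is_static(node.args[0])
            results.append((node.lineno, "known" if command_is_static else "unknown"))
    return sorted(results)


skill_source = (
    "import subprocess\n"
    'subprocess.run(["ls", "-l"])\n'
    'subprocess.run("echo " + user_text, shell=True)\n'
)
print(classify_subprocess_calls(skill_source))
```

The analyzer cannot say what the second command will be. It can say that it does not know, and on which line.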

That second list is as important as the first.

A static analyzer that says “unknown” in the right place is not useless. It is doing triage. It is telling us where a permission prompt, sandbox rule, manual review, or runtime guard should be placed. In practice, most security engineering is not the elimination of uncertainty. It is the routing of uncertainty to the right control.

This is why I like the phrase “fails gracefully” for static analysis.

It does not mean the analyzer always produces a good answer. It means the analyzer can expose the contour of its own weakness. False positives can point to APIs that need better modeling. False negatives, once discovered, can point to dynamic behavior that should have been reported as unknown rather than silently passed. Unknowns can point to review boundaries.

For agent skills, this is a reasonable goal: produce a capability map, attach a confidence level, and preserve the places where the map becomes blurry.
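
One possible shape for such a map, sketched as plain dataclasses. Everything here is an assumption for illustration: the `Finding` and `CapabilityMap` names, the 0.0 to 1.0 confidence scale, and the 0.5 threshold are mine, not an established format.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Finding:
    capability: str    # e.g. "subprocess execution"
    location: str      # e.g. "skill.py:14"
    confidence: float  # analyzer's own estimate, 0.0 (guess) to 1.0 (certain)
    reason: str = ""   # why precision was lost, if it was


@dataclass
class CapabilityMap:
    findings: list[Finding] = field(default_factory=list)

    def add(self, finding: Finding) -> None:
        self.findings.append(finding)

    def blurry(self, threshold: float = 0.5) -> list[Finding]:
        """Findings the analyzer is unsure about: candidates for review or a runtime guard."""
        return [f for f in self.findings if f.confidence < threshold]

    def summary(self) -> dict[str, float]:
        """Lowest confidence seen per capability, so one weak spot is never averaged away."""
        lowest: dict[str, float] = {}
        for f in self.findings:
            lowest[f.capability] = min(lowest.get(f.capability, 1.0), f.confidence)
        return lowest


capability_map = CapabilityMap()
capability_map.add(Finding("network clients", "skill.py:3", 0.9))
capability_map.add(Finding("subprocess execution", "skill.py:8", 0.9))
capability_map.add(
    Finding("subprocess execution", "skill.py:14", 0.3, "command string built from model input")
)

print(capability_map.summary())
for finding in capability_map.blurry():
    print(finding.location, finding.reason)
```

Taking the minimum per capability rather than an average is a deliberate choice in this sketch: the blurry places are the ones the map is supposed to preserve.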

The output does not have to be final. It has to be useful before the model gets to act.