pentest · security · owasp · red-team

Pentest mode: point an offensive agent at your own app

Security mode is surgical. You point it at one flow, it replays a request with a mutated ID, and it writes the result into a regression spec you keep. That's the orange bar.

Pentest mode is the red bar, and it has the opposite posture. You give it a scope ("test the checkout flow") and the agent sweeps it for real vulnerabilities: reflected and stored XSS, injection, broken object-level authorization, the OWASP classes a happy-path suite never touches. It doesn't stop at "this looks suspicious." It fires the payload, reads the response, and confirms the bug in-band before it reports it.
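As a rough illustration of what "confirms the bug in-band" can mean for reflected XSS, the sketch below sends nothing itself; it only judges a response body. The names `buildProbe` and `confirmReflection` are hypothetical and are not Hover's API. It also simplifies: a verbatim reflection is treated as confirmation, whereas a real agent would more likely watch the payload execute in the browser.

```typescript
// Hypothetical helper, not Hover's implementation: decide whether a reflected
// XSS probe came back unescaped in the response body.

interface ProbeResult {
  confirmed: boolean;
  reason: string;
}

// Build a payload that carries a unique marker, so a match cannot be a coincidence.
function buildProbe(marker: string): string {
  return `<script>window.__probe="${marker}"</script>`;
}

function confirmReflection(responseBody: string, marker: string): ProbeResult {
  const payload = buildProbe(marker);
  if (responseBody.includes(payload)) {
    // The markup survived intact, so the browser would parse it as a script tag.
    return { confirmed: true, reason: "payload reflected verbatim, unescaped" };
  }
  if (responseBody.includes(marker)) {
    // The input is echoed, but the surrounding markup was encoded or stripped.
    return { confirmed: false, reason: "marker reflected, but markup was encoded or stripped" };
  }
  return { confirmed: false, reason: "marker not reflected" };
}
```

The three-way result matters for reporting: "echoed but encoded" is evidence the agent tried and could not break the input, which is different from never reaching it.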

A different output

Security mode crystallizes a confirmed finding into __vibe_tests__/<slug>.security.spec.ts, plain Playwright that reruns in CI with no agent. That's the right artifact when you found one hole and want it to stay closed.

Pentest mode hands back a Markdown findings report instead. A sweep produces a list: what the agent confirmed, what it tried and couldn't break, and what it never reached. The "not tested" section matters as much as the findings, because a report that only lists hits reads like full coverage when it isn't. When one finding deserves a permanent guard, you still crystallize it through security mode. Discovery feeds the regression test; it doesn't replace it.
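The three-part shape of that report can be sketched in a few lines. The `Finding` type, the status names, and the section headings here are assumptions made for illustration; the actual report layout Hover emits may differ.

```typescript
// Hypothetical report renderer: the types and headings are illustrative only.

type Status = "confirmed" | "not_vulnerable" | "not_tested";

interface Finding {
  title: string;
  status: Status;
  detail: string;
}

function renderReport(scope: string, findings: Finding[]): string {
  // Fixed section order; every section is emitted even when it is empty.
  const sections: Array<[Status, string]> = [
    ["confirmed", "Confirmed"],
    ["not_vulnerable", "Tried, not broken"],
    ["not_tested", "Not tested"],
  ];
  const lines: string[] = [`# Pentest report: ${scope}`];
  for (const [status, heading] of sections) {
    const items = findings.filter((f) => f.status === status);
    lines.push("", `## ${heading}`);
    if (items.length === 0) {
      lines.push("- (none)");
    } else {
      for (const f of items) lines.push(`- ${f.title}: ${f.detail}`);
    }
  }
  return lines.join("\n");
}
```

Emitting the "Not tested" heading unconditionally is the design point: an empty section says "nothing was skipped", while a missing section says nothing at all.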

Turn it on

Pentest mode ships as its own plugin, @hover-dev/pentest. Add it next to the Hover plugin:

pnpm add -D @hover-dev/pentest

// vite.config.ts (Astro / Nuxt / Next / Webpack mirror the pattern)
import { defineConfig } from 'vite';
import { hover } from 'vite-plugin-hover';
import pentestMode from '@hover-dev/pentest/plugin';

export default defineConfig({
  plugins: [hover({}, pentestMode())],
});

It shares the resident MITM proxy with the security plugin through a refcounted runtime, so loading both costs one proxy, not two. The two modes are mutually exclusive at the widget: switch to red and the orange bar steps aside. Pair it with codeContext and the agent reads your route table to aim before it attacks, which turns a blind fuzz into a white-box assessment.
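For readers unfamiliar with the term, a refcounted runtime can be sketched as below. This is a generic illustration of the pattern, not Hover's code; the `RefCounted` class and its `start`/`stop` callbacks are hypothetical names.

```typescript
// Generic refcounting sketch (hypothetical, not Hover's runtime): the shared
// resource starts on the first acquire and stops on the last release.

class RefCounted<T> {
  private count = 0;
  private instance: T | undefined;
  private readonly start: () => T;
  private readonly stop: (resource: T) => void;

  constructor(start: () => T, stop: (resource: T) => void) {
    this.start = start;
    this.stop = stop;
  }

  acquire(): T {
    // Only the first holder pays the startup cost.
    if (this.count === 0) this.instance = this.start();
    this.count += 1;
    return this.instance as T;
  }

  release(): void {
    if (this.count === 0) throw new Error("release() called without a matching acquire()");
    this.count -= 1;
    // Only the last holder tears the resource down.
    if (this.count === 0) {
      this.stop(this.instance as T);
      this.instance = undefined;
    }
  }
}
```

Under this pattern, two plugins that each call `acquire()` receive the same proxy instance, and the proxy shuts down only after both have called `release()`.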

The guardrails are deliberate

An offensive agent with destructive payloads enabled needs a hard scope, and Hover draws it tight. The agent attacks only the origin the debug Chrome is already on: your own dev server, and nothing else. It works in-band, the way a normal client would. It does not flood the target, and it does not try to evade detection. Those last two constraints are the line between a developer testing their own app and a tool you'd point at someone else's, and Hover stays on the developer side of it.
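A single-origin scope rule of this kind is simple to state in code. The sketch below assumes a hypothetical `isInScope` helper and an illustrative dev-server origin; it shows the idea of the check, not how Hover enforces it.

```typescript
// Hypothetical scope guard: allow a request only if it resolves to the one
// permitted origin (scheme + host + port). Not Hover's implementation.

function isInScope(requestUrl: string, allowedOrigin: string): boolean {
  try {
    const allowed = new URL(allowedOrigin);
    // Relative paths resolve against the allowed origin; absolute URLs keep their own.
    const target = new URL(requestUrl, allowed);
    return target.origin === allowed.origin;
  } catch {
    // Anything unparseable is treated as out of scope.
    return false;
  }
}

// Illustrative origin only; substitute whatever your dev server listens on.
const devServer = "http://localhost:5173";
console.log(isInScope("/checkout?step=2", devServer));
console.log(isInScope("https://example.com/checkout", devServer));
```

Comparing the full origin rather than just the hostname means a different port or scheme on the same machine is also out of scope, which matches the "nothing else" wording above.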

Run it against your staging branch before you ship, read the report, file the real ones, and lock the worst with a security spec. The agent you already use for writing tests can spend an afternoon trying to break the app instead.
