Hover is an open-source VS Code extension that turns plain-English chat into end-to-end tests. AI drives your real Chrome once to explore a flow, then Hover crystallizes the verified run into a standard @playwright/test spec that runs in CI with no AI in the loop.

How is Hover different from other AI testing tools?

Many other AI test tools keep a model in the loop at runtime and re-generate the test on every run, so CI keeps paying for tokens and results can drift. Hover spends the model once, at authoring time, and the artifact it leaves behind is deterministic, human-readable @playwright/test code. Green builds never pay a recurring AI tax.

What does Hover cost to run?

Hover is free and open source. It bundles no model SDK and no API keys — it spawns the coding-agent CLI (Claude Code or OpenAI Codex) already on your PATH, running on your own subscription or API key. There is no per-token resale.

Can Hover do security testing?

Yes. The same chat flips into an API-testing mode (IDOR / authz probing that crystallizes confirmed findings into .api-test.spec.ts CI gates) and a pentest mode (offensive, white-box, own-app-only — SQLi / XSS / SSTI / SSRF — writing a findings report).


Jun 16, 2026 · vibe-coding, security, idor, pentest

Vibe-coded apps ship security holes by default

This is the third post in a series on what vibe-coding leaves behind. The first was about tests that actually prove the feature works. The second was about why those apps keep breaking. This one is about the bug you cannot see by clicking around: the authorization check the AI never wrote.

The check the AI didn't write

Ask an AI to build "an invoice page," and you get a route that loads /invoices/42, fetches the invoice, and renders it. It works. You click through it, the numbers show up, you ship.

What the AI optimized for was rendering invoice 42. It had no reason to ask whether the person logged in is allowed to see invoice 42. So the server hands back whatever row matches that ID. Change the URL to /invoices/43 and you are reading someone else's invoice. That is IDOR, an insecure direct object reference, and it is one of the most common forms of broken access control on the web.

The AI did not write the check because nothing in the prompt demanded it. "Make the invoice page work" is satisfied the moment the page works for you. The missing line is the one that says: this invoice belongs to this account, or return 403. A human reviewer who has been burned before adds it by reflex. An AI building the happy path does not.
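To make the missing line concrete, here is a minimal sketch of an ownership check, written as a plain function so it runs without a web framework. The Invoice shape, the in-memory rows, and the name getInvoiceForAccount are all hypothetical; a real app would do the same comparison in its route handler or in the database query itself.

```typescript
// Hypothetical data shape; the post does not show the app's real schema.
interface Invoice {
  id: number;
  accountId: number; // the account that owns this invoice
  totalCents: number;
}

type InvoiceResponse =
  | { status: 200; body: Invoice }
  | { status: 403 | 404; body: null };

// Look up an invoice and enforce ownership before returning it.
function getInvoiceForAccount(
  invoices: Invoice[],
  invoiceId: number,
  sessionAccountId: number,
): InvoiceResponse {
  const invoice = invoices.find((row) => row.id === invoiceId);
  if (!invoice) {
    return { status: 404, body: null };
  }
  // The line the happy path skips: does this row belong to the caller?
  if (invoice.accountId !== sessionAccountId) {
    return { status: 403, body: null };
  }
  return { status: 200, body: invoice };
}
```

Delete the second if block and the function still "works" for every invoice you own, which is why clicking through your own account never reveals the gap.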

Why your end-to-end tests miss it

Your click-driven tests walk the app the way a user does. They log in, they land on the dashboard, they open their own invoice. The link in the UI points at their own ID, so the test follows it and passes. Everything is green.

The hole lives at a URL the frontend never renders. There is no button that links to invoice 43 when you are logged in as the owner of invoice 42. Your test harness only knows about the URLs the app shows it. It will never type a stranger's ID into the address bar, because no user flow ever does. The bug is invisible to a test that only knows how to click.

Finding IDOR means doing something the UI refuses to: take a request the app already made, change one ID, and send it again as a different user. That is a server-trust question, and it lives below the buttons. (We go deeper on the mechanics in a separate post on testing for IDOR and broken access control.)
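The "change one ID" step can be sketched as a small pure function. This is an illustration only, not Hover's code: the CapturedRequest shape and the mutateObjectId name are assumptions, and the sketch handles only the simple case where the object ID is the last numeric segment of the path.

```typescript
// Hypothetical shape for a captured request; a real capture carries more fields.
interface CapturedRequest {
  method: string;
  path: string; // path plus optional query, e.g. "/invoices/42?expand=lines"
  headers: Record<string, string>;
}

// Matches a trailing all-digit path segment, keeping any trailing slash or query string.
const TRAILING_ID = /\/\d+(\/?(?:\?.*)?)$/;

// Return a copy of the request that targets a different object ID.
// Session headers are copied unchanged, so the replay still runs as the original user.
function mutateObjectId(captured: CapturedRequest, foreignId: number): CapturedRequest {
  if (!TRAILING_ID.test(captured.path)) {
    throw new Error("no trailing numeric ID in path: " + captured.path);
  }
  const path = captured.path.replace(
    TRAILING_ID,
    (_match: string, rest: string) => "/" + String(foreignId) + rest,
  );
  return { method: captured.method, path, headers: { ...captured.headers } };
}
```

Given a captured GET for /invoices/42, mutateObjectId(request, 43) yields the same request aimed at /invoices/43, with the same cookie attached. Whether the server answers it is the whole test.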

Security mode: replay the real calls with mutated IDs

Hover is a free, open-source VS Code extension, and security mode (🟠) is built for exactly this. It runs a local HTTPS man-in-the-middle proxy in front of your dev browser, so the agent sees the real API calls your app makes, headers and bodies and all.

From there the agent does what your test suite cannot. It takes a captured request, mutates the object ID, and replays it. If the server returns invoice 43 to the account that owns invoice 42, that is a confirmed access-control finding, not a guess. The agent saw the real call, changed one value, and watched the server leak.
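What separates a confirmed finding from a guess is the response to the replay. The sketch below shows one way to classify it, under stated assumptions: ReplayResult, classifyReplay, and the idea of a foreignMarker (a string known to appear only in the other account's object) are hypothetical and are not taken from Hover's source.

```typescript
// Hypothetical result of replaying a mutated request.
interface ReplayResult {
  status: number;
  body: string; // raw response body
}

type Verdict = "confirmed-finding" | "rejected" | "inconclusive";

// Classify a replayed request that targeted an object the session does not own.
function classifyReplay(result: ReplayResult, foreignMarker: string): Verdict {
  // The server refused or hid the object: access control held.
  if (result.status === 401 || result.status === 403 || result.status === 404) {
    return "rejected";
  }
  // A success status alone is not proof; the body must contain the foreign data.
  const isSuccess = result.status >= 200 && result.status < 300;
  if (isSuccess && result.body.includes(foreignMarker)) {
    return "confirmed-finding";
  }
  return "inconclusive";
}
```

Requiring the marker in the body is a deliberate choice in this sketch: some servers answer 200 with an empty or generic payload, and treating that as a leak would produce false positives.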

A confirmed finding does not stay a chat message. Hover crystallizes it into a .api-test.spec.ts regression gate, a plain spec that asserts the mutated request gets rejected. That gate runs in CI with no agent in the loop. While the hole is open, the gate fails. Once you patch the missing check, it goes green and stays green. If someone reintroduces the hole later, the gate goes red on the pull request. The agent found it once; the spec guards it forever.

Pentest mode: go offensive on your own origin

Security mode probes access control. Pentest mode (🔴) goes further and runs an offensive sweep. It is origin-locked and own-app-only, so it only ever attacks the dev target you point it at. Within that boundary it probes for SQL injection, XSS, SSTI, SSRF, and IDOR, the way an attacker would, against the running app rather than a static read of the code.
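An origin lock of this kind comes down to comparing scheme, host, and port before any request leaves. The sketch below shows that comparison using the standard URL class. It is a guess at the general technique, not Hover's implementation, and isInScope is a hypothetical name.

```typescript
// Return true only when requestUrl has exactly the same origin
// (scheme + host + port) as the allowed dev target.
function isInScope(requestUrl: string, allowedOrigin: string): boolean {
  let target: URL;
  try {
    target = new URL(requestUrl);
  } catch {
    // Unparseable input is treated as out of scope rather than guessed at.
    return false;
  }
  return target.origin === new URL(allowedOrigin).origin;
}
```

Comparing parsed origins rather than string prefixes matters here: a prefix check on "http://localhost" would also accept a hostname such as localhost.evil.test, which the origin comparison rejects.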

Pentest mode writes a Markdown findings report: what it tried, what responded, and where the app gave something away. You read it like a security review, then fix what is real.

Both of these are modes inside the same Hover chat. There is no separate tool to install, no bundler plugin, no second account. You flip from authoring tests to probing access control to running an offensive sweep in the same panel, against the same dev browser. (A separate post covers how the two modes share one editor chat.)

Where this leaves you

Vibe-coding moved the cost. Writing the feature is cheap now. Knowing it is safe is still work, and the AI that built the happy path is the last thing you should trust to find the hole it skipped. The fix is to let an agent do the part the UI hides: replay the real traffic, change the ID, and turn what it finds into a gate that outlives the session.

Try Hover on your own app.

Install the VS Code extension. Author tests with AI, ship plain Playwright.
