AI-ready Puppeteer QA harness for PR-end browser testing across any app repo.
It can be used by any AI agent, any developer, or any CI/local workflow that can run Node.
ProbeQA gives coding agents a repeatable QA loop: inspect the diff, generate browser edge-case scenarios, run the app like a user, and leave durable tests behind.
If you want AI coding agents to prove browser behavior before pushing code, star/watch the repo and share feedback in Discussions.
This is a local-first open-source npm CLI for any app repo configured in probeqa.config.json.
It is not a hosted AI product and it is not a trained model. What exists now is:
- A Puppeteer runner that opens a real browser and clicks through flows
- A scenario format for PR-specific integration tests
- A crawler that maps reachable public routes and UI affordances
- A generator that reads changed files and creates scenario drafts
- An AI tester role in AI_TESTER.md that tells agents how to generate edge-case scenarios
- A place to keep those generated tests so they become part of integration QA
The "AI" part is the coding agent using the tester role before push: it reads the diff, thinks through edge cases, generates or writes Puppeteer scenarios, runs them headed/headless, and reports the result.
The open-source part is normal Node/Puppeteer code. Nothing external is required besides npm packages.
The next layer should be model-provider adapters that improve qa:generate with real LLM reasoning while keeping the runner vendor-neutral.
- Make AI coding agents prove changed browser flows before a PR is pushed.
- Turn one-off manual QA ideas into reusable integration scenarios.
- Keep QA local-first instead of paying for every experimental CI run.
- Capture screenshots, console errors, and network failures when a scenario breaks.
- Give teams a single lightweight convention that works across repos.
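The capture step in the list above could be wired to Puppeteer's page events roughly like this. It is a minimal sketch, not ProbeQA's actual runner code: `attachFailureCapture`, `runWithEvidence`, and the shape of the `evidence` object are hypothetical names. The only Puppeteer APIs assumed are the `console`, `pageerror`, and `requestfailed` page events and `page.screenshot`.

```javascript
// Hypothetical helper: collects browser-side evidence for one scenario run.
// `page` is expected to be a Puppeteer Page (anything with a compatible .on() works).
function attachFailureCapture(page) {
  const evidence = { consoleErrors: [], pageErrors: [], failedRequests: [] };

  // Console messages of type "error", for example console.error calls.
  page.on('console', (msg) => {
    if (msg.type() === 'error') evidence.consoleErrors.push(msg.text());
  });

  // Uncaught exceptions thrown inside the page.
  page.on('pageerror', (err) => {
    evidence.pageErrors.push(String(err && err.message ? err.message : err));
  });

  // Requests that never completed (refused connection, abort, timeout).
  // HTTP 4xx/5xx responses do not fire this event; they count as completed requests.
  page.on('requestfailed', (req) => {
    const failure = req.failure();
    evidence.failedRequests.push({
      url: req.url(),
      reason: failure ? failure.errorText : 'unknown',
    });
  });

  return evidence;
}

// Runs a scenario body and saves a screenshot only when the body throws.
async function runWithEvidence(page, screenshotPath, body) {
  const evidence = attachFailureCapture(page);
  try {
    await body(page);
    return { ok: true, evidence };
  } catch (error) {
    await page.screenshot({ path: screenshotPath, fullPage: true });
    return { ok: false, error: String(error.message || error), evidence };
  }
}
```

Taking the screenshot only on failure keeps the artifacts directory small. Saving one on every run would be an equally reasonable choice.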
- Launch kit and sharing copy: LAUNCH.md
- AI tester role: AI_TESTER.md
- Examples: examples/
npm install -D probeqa
npx probeqa init
Configure target repos in probeqa.config.json, then start the target app stack separately.
{
  "projects": [
    {
      "name": "App",
      "path": ".",
      "kind": "app",
      "baseUrl": "http://localhost:3000"
    }
  ],
  "scenariosDir": "probeqa/scenarios",
  "artifactsDir": "probeqa/artifacts"
}
Run the QA scenarios:
npx probeqa plan # inspect changed files and proposed QA surface
npx probeqa audit # find routes without ProbeQA scenario coverage
npx probeqa crawl # crawl reachable same-origin pages into a route map
npx probeqa generate # write a generated scenario draft
npx probeqa run # headless
AI_QA_HEADLESS=false npx probeqa run
npx probeqa list
Or add scripts to the consuming app:
{
  "scripts": {
    "qa:plan": "probeqa plan",
    "qa:audit": "probeqa audit",
    "qa:crawl": "probeqa crawl",
    "qa:generate": "probeqa generate",
    "qa": "probeqa run",
    "qa:headed": "AI_QA_HEADLESS=false probeqa run"
  }
}
Install ProbeQA in any app repo:
npm install -D probeqa
npx probeqa init
Use a config like:
{
  "projects": [
    {
      "name": "Web App",
      "path": ".",
      "kind": "app",
      "baseUrl": "http://localhost:3000"
    }
  ],
  "scenariosDir": "probeqa/scenarios",
  "artifactsDir": "probeqa/artifacts"
}
Add scripts:
{
  "scripts": {
    "qa:plan": "probeqa plan",
    "qa:audit": "probeqa audit",
    "qa:crawl": "probeqa crawl",
    "qa:generate": "probeqa generate",
    "qa": "probeqa run",
    "qa:headed": "AI_QA_HEADLESS=false probeqa run"
  }
}
Expected PR habit:
npm run qa:plan
npm run qa:audit
npm run qa:crawl -- --generate
npm run qa:generate
npm run qa:headed
npm run qa
The generated scenario is a draft. The AI or developer should tighten it into real clicks, role states, and assertions before merging.
Run on pull requests and after merges to main:
name: ProbeQA
on:
  pull_request:
  push:
    branches: [main]
jobs:
  browser-qa:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - run: npm run build
      - run: npm run start &
      - run: npx wait-on http://localhost:3000
      - run: npx probeqa run
For local-first teams, keep this workflow optional at first and require agents to run npm run qa:headed before pushing.
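When running locally without the CI workflow, the readiness wait that wait-on performs can also be done from Node before the scenarios start. The following is a sketch under assumptions: `waitForApp` is a hypothetical helper, not part of ProbeQA, and it relies on the global `fetch` available in Node 18 and later.

```javascript
// Hypothetical helper: polls `url` until it answers with a non-5xx status
// or the timeout elapses. Requires Node 18+ for the global fetch.
async function waitForApp(url, { timeoutMs = 30000, intervalMs = 500 } = {}) {
  const deadline = Date.now() + timeoutMs;
  let lastError = new Error('no attempt made');

  while (Date.now() < deadline) {
    try {
      const res = await fetch(url);
      if (res.status < 500) return res.status; // app is up, even if it redirects or 404s
      lastError = new Error(`HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // typically a refused connection while the server boots
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`App at ${url} not ready after ${timeoutMs}ms: ${lastError.message}`);
}
```

Calling `await waitForApp('http://localhost:3000')` before `probeqa run` avoids scenarios failing only because the dev server was still starting.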
ProbeQA is distributed as an npm CLI package.
Recommended public path:
npm install -D probeqa
npx probeqa init
npx probeqa generate
npx probeqa run
It is not a cloud package right now. A cloud layer can come later for hosted artifacts, scheduled QA runs, team dashboards, and shared scenario intelligence. The core should stay local and open source so any AI agent can plug into it.
Publishing:
npm login
npm publish --access public
ProbeQA uses Node's built-in test runner for package-level coverage:
npm test
npm run pack:check
npm run pack:check runs the test suite and syntax checks before producing the npm package dry run. The GitHub CI workflow runs the same check on pushes and pull requests.
Before pushing every PR, the AI must either:
- Add and run 3-7 Puppeteer scenarios for the changed user flows, or
- Add the scenarios and report the exact blocker that prevented running them.
Scenarios live in the configured scenariosDir. Use probeqa/scenarios/_template.mjs after probeqa init, or run npx probeqa generate --repo "Web App" to create a draft from one configured project. Use AI_TESTER.md for the tester mindset.
Prioritize edge cases the normal unit tests miss:
- Auth redirects and role boundaries
- Empty, loading, error, and forbidden states
- Form validation and disabled-submit behavior
- Navigation between changed pages
- Cross-service behavior between frontend and backend
- Browser-only issues: focus, click targets, layout, console errors
Keep scenarios deterministic. Do not use production data or destructive actions unless the test environment is explicitly disposable.
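As an illustration of the form-validation case, a scenario might look like the sketch below. The `{ name, run }` shape, the `/signup` route, and the selectors are all assumptions made for this example. The real shape is whatever probeqa/scenarios/_template.mjs defines after probeqa init. The Puppeteer calls used (`page.goto`, `page.waitForSelector`, `page.type`, `page.$eval`) are standard.

```javascript
// Hypothetical scenario shape; check _template.mjs for the real one.
const signupRejectsInvalidEmail = {
  name: 'signup: submit stays disabled for an invalid email',

  async run({ page, baseUrl }) {
    // Assumed route and selectors; replace with the ones from your app.
    await page.goto(`${baseUrl}/signup`, { waitUntil: 'networkidle0' });
    await page.waitForSelector('#email');

    // Fixed input keeps the scenario deterministic and touches no real data.
    await page.type('#email', 'not-an-email');

    const disabled = await page.$eval(
      'button[type="submit"]',
      (button) => button.disabled,
    );
    if (!disabled) {
      throw new Error('Expected submit to stay disabled for an invalid email');
    }
  },
};
```

In a .mjs scenario file this object would be exported. The export form is omitted here because it depends on the template.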
- Code changes land locally.
- probeqa generate reads the current diff for configured repos.
- It writes a scenario draft under scenarios/.
- The AI tester sharpens that draft into real clicks/assertions.
- The scenario stays in the repo and becomes part of future integration QA.
This is intentionally local-first so teams can run the expensive browser checks before push or merge instead of paying for every CI attempt.
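To make the diff-to-draft step concrete, here is one way changed file paths could be turned into draft stubs. This is a hypothetical heuristic, not a description of how probeqa generate works internally: `draftStubsFromDiff`, the file-extension filter, the slug rules, and the `.draft.mjs` naming are all assumptions. The input is assumed to be the list of paths that `git diff --name-only` prints.

```javascript
// Hypothetical heuristic: turn changed file paths into scenario draft stubs.
function draftStubsFromDiff(changedFiles) {
  const uiFile = /\.(jsx?|tsx?|vue|svelte)$/; // files likely to affect the browser
  const testFile = /\.(test|spec)\.[^.]+$/;   // existing tests need no new draft
  const stubs = new Map();

  for (const file of changedFiles) {
    if (!uiFile.test(file) || testFile.test(file)) continue;

    // "src/pages/billing/Invoice.tsx" -> "billing-invoice"
    const slug = file
      .replace(/^src\//, '')
      .replace(/^(pages|app|components)\//, '')
      .replace(/\.[^.]+$/, '')
      .split('/')
      .map((part) => part.toLowerCase())
      .join('-');

    // One stub per slug, even if several changed files map to it.
    if (!stubs.has(slug)) {
      stubs.set(slug, {
        file: `${slug}.draft.mjs`,
        source: file,
        todo: 'add real clicks and assertions',
      });
    }
  }
  return [...stubs.values()];
}
```

The stubs only name what needs attention. The sharpening step in the list above is still where real clicks and assertions get written.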
After ProbeQA is installed in an existing app, start with a route and edge-case audit:
npx probeqa audit
npx probeqa crawl --generate
- List critical flows: signup, login, onboarding, billing, admin, core app workflow.
- For each flow, create one scenario for happy path, empty state, forbidden state, invalid input, and slow/error backend.
- Run headed first so the reviewer can watch the browser.
- Keep only stable scenarios in probeqa/scenarios.
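The error-backend case from the list above can be simulated without touching the real backend by using Puppeteer's request interception. This is a sketch: `forceBackendError` and the `/api/` path fragment in the usage note are hypothetical, while `page.setRequestInterception`, `request.respond`, and `request.continue` are existing Puppeteer APIs.

```javascript
// Hypothetical helper: makes every request whose URL contains
// `apiPathFragment` come back as HTTP 500; all other requests pass through.
async function forceBackendError(page, apiPathFragment) {
  await page.setRequestInterception(true);

  page.on('request', (request) => {
    if (request.url().includes(apiPathFragment)) {
      request.respond({
        status: 500,
        contentType: 'application/json',
        body: JSON.stringify({ error: 'simulated failure' }),
      });
    } else {
      request.continue();
    }
  });
}
```

A scenario would call `await forceBackendError(page, '/api/')` before `page.goto(...)` and then assert that the UI shows its error state. A slow backend can be approximated the same way by delaying the `request.continue()` call with a timer.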
ProbeQA does not yet crawl the whole app and prove full coverage automatically. That should be a near-term feature.
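Until that exists, a rough coverage gap can be computed by comparing crawled URLs against the routes your scenarios visit. The sketch below is only an illustration: `normalizeRoute`, `uncoveredRoutes`, and both input lists are hypothetical and do not reflect the format of ProbeQA's route map or audit output.

```javascript
// Reduces a URL or path to a comparable route: pathname only, no trailing slash.
function normalizeRoute(urlOrPath, baseUrl) {
  const { pathname } = new URL(urlOrPath, baseUrl);
  const trimmed = pathname.replace(/\/+$/, '');
  return trimmed === '' ? '/' : trimmed;
}

// Returns same-origin crawled routes that no scenario claims to visit.
function uncoveredRoutes(crawledUrls, scenarioRoutes, baseUrl) {
  const origin = new URL(baseUrl).origin;
  const covered = new Set(scenarioRoutes.map((r) => normalizeRoute(r, baseUrl)));
  const seen = new Set();

  for (const url of crawledUrls) {
    if (new URL(url, baseUrl).origin !== origin) continue; // skip external links
    seen.add(normalizeRoute(url, baseUrl));
  }
  return [...seen].filter((route) => !covered.has(route)).sort();
}
```

Query strings are ignored on purpose, so `/admin?tab=1` and `/admin` count as one route. Apps that route on query parameters would need a different normalization.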