6 changes: 5 additions & 1 deletion apps/desktop/scripts/build-sidecar.ts
@@ -45,7 +45,11 @@ await rm(EXECUTOR_OUT_DIR, { recursive: true, force: true });
await mkdir(EXECUTOR_OUT_DIR, { recursive: true });
await cp(sourceBinDir, EXECUTOR_OUT_DIR, { recursive: true });

-if (process.platform !== "win32") {
// Restore the unix executable bit — keyed on the TARGET, not the host. A
// windows-target cross-build (BUN_TARGET=bun-windows-x64 on macOS/linux) stages
// `executor.exe`, which needs no bit; chmod'ing a non-existent `executor` there
// would ENOENT.
Comment on lines +48 to +51

P2 Em-dash character in code comment

AGENTS.md prohibits em-dashes (—) in all contexts: "prose, docs, code comments, commit messages, or PRs. Use commas, colons, parentheses, or separate sentences instead." Line 48 uses one (// Restore the unix executable bit — keyed on the TARGET), as do comments in e2e/AGENTS.md ("the real test, not a transcript of one — develop the flow"), e2e/setup/desktop-linux.globalsetup.ts ("no launchctl — just background processes"), e2e/scripts/cli.ts, and several other new files throughout this PR.

Context Used: AGENTS.md (source)
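
A rule like this is usually easier to enforce mechanically than in review. The sketch below is hypothetical: `findEmDashes` and `EmDashHit` are illustrative names, not part of this PR, and it assumes the file contents have already been read into a string. It reports every line that contains U+2014.

```typescript
// Hypothetical lint helper; not part of this PR or of AGENTS.md tooling.
const EM_DASH = "\u2014";

interface EmDashHit {
  line: number; // 1-based line number
  column: number; // 1-based column of the first em-dash on that line
  text: string; // the offending line, for the report
}

function findEmDashes(source: string): EmDashHit[] {
  const hits: EmDashHit[] = [];
  source.split("\n").forEach((text, index) => {
    const at = text.indexOf(EM_DASH);
    if (at !== -1) hits.push({ line: index + 1, column: at + 1, text });
  });
  return hits;
}

// Usage: one offending comment line followed by a clean line.
const sample = [
  "// Restore the unix executable bit \u2014 keyed on the TARGET",
  "const ok = true;",
].join("\n");
for (const hit of findEmDashes(sample)) {
  console.log(`${hit.line}:${hit.column} ${hit.text}`);
}
```

Wiring this into a pre-commit hook or CI step over the changed files would catch the remaining occurrences listed above without another review pass.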


if (!targetPackage.includes("windows")) {
  await chmod(join(EXECUTOR_OUT_DIR, "executor"), 0o755);
}
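
The target-keyed decision can be isolated into a pure helper, which makes the cross-build case easy to test. This is a minimal sketch under assumptions: `isWindowsTarget` and `stagedBinary` are hypothetical names, and the target string is assumed to look like the `bun-windows-x64` value mentioned in the comment.

```typescript
// Hypothetical helpers; names are illustrative, not from build-sidecar.ts.
function isWindowsTarget(target: string): boolean {
  // Match "windows", not "win": "darwin" contains "win".
  return target.includes("windows");
}

// Which file gets staged for a target, and which mode (if any) to restore.
function stagedBinary(target: string): { name: string; mode: number | null } {
  return isWindowsTarget(target)
    ? { name: "executor.exe", mode: null } // no unix executable bit needed
    : { name: "executor", mode: 0o755 };
}

console.log(stagedBinary("bun-windows-x64"));
console.log(stagedBinary("bun-darwin-arm64"));
```

Keying on the target string means a Windows cross-build on macOS or Linux skips the `chmod` entirely, which is the case the old `process.platform` check got wrong.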

74 changes: 74 additions & 0 deletions e2e/AGENTS.md
@@ -130,6 +130,80 @@ When handing results to the user, follow the evidence contract in the root
[AGENTS.md](../AGENTS.md) (direct run links + a live instance + what to try);
[RUNNING.md](../RUNNING.md) has the current sharing/demo mechanics.

## Authoring from a live browser (`browse` → `promote`)

You don't have to hand-write a browser scenario. Drive a running instance's web
UI one step at a time, then turn the recorded journey into a committed scenario.
The generated test drives the same Browser surface the exploration drove, so it
is the real test, not a transcript of one — develop the flow, then crystallize
it.

```sh
cd e2e
bun run cli up cloud # a live instance to develop against
bun run cli browse cloud goto / # each step REPLAYS the whole flow from a
bun run cli browse cloud click link Policies # clean browser and prints the page's controls
bun run cli browse cloud at-url /policies # (role · name) + a screenshot, so the next
bun run cli browse cloud see "No policies yet" # step is written against what's actually there
bun run cli promote cloud "Policies · a fresh workspace has none"
```

Each `browse` replays every step so far, so what you are building is, at every
moment, exactly what `promote` emits — a step that doesn't reproduce fails here,
not in CI. Steps: `goto <path>`, `click <role> <name>`, `click-text <text>`,
`fill <field> <value>`, `press <key>`, and the assertions `see <text>` /
`at-url <substring>`. `--label "…"` names a step (it becomes the `step(...)`
group); `browse <target> show | undo | reset` manages the journey.
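
One way to picture the step vocabulary is as a tagged union parsed from the CLI arguments. This is a sketch only: the `Step` type and `parseStep` are hypothetical, and how the real CLI splits multi-word arguments is an assumption (here, everything after the fixed positions is joined with spaces).

```typescript
// Hypothetical model of the journey steps listed above; not the CLI's actual types.
type Step =
  | { kind: "goto"; path: string }
  | { kind: "click"; role: string; name: string }
  | { kind: "click-text"; text: string }
  | { kind: "fill"; field: string; value: string }
  | { kind: "press"; key: string }
  | { kind: "see"; text: string }
  | { kind: "at-url"; substring: string };

function parseStep(args: string[]): Step {
  const [verb, first, ...rest] = args;
  if (verb === undefined || first === undefined) {
    throw new Error("a step needs a verb and at least one argument");
  }
  const tail = rest.join(" "); // assumption: trailing words belong to the last argument
  const all = [first, ...rest].join(" ");
  switch (verb) {
    case "goto":
      return { kind: "goto", path: all };
    case "click":
      if (rest.length === 0) throw new Error("click needs <role> <name>");
      return { kind: "click", role: first, name: tail };
    case "click-text":
      return { kind: "click-text", text: all };
    case "fill":
      if (rest.length === 0) throw new Error("fill needs <field> <value>");
      return { kind: "fill", field: first, value: tail };
    case "press":
      return { kind: "press", key: first };
    case "see":
      return { kind: "see", text: all };
    case "at-url":
      return { kind: "at-url", substring: all };
    default:
      throw new Error(`unknown step: ${verb}`);
  }
}

console.log(parseStep(["click", "link", "Policies"]));
```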

`promote` writes `<target>/<slug>.gen.test.ts` and runs it against the live
instance, producing the usual run artifacts (session.mp4, step screenshots,
trace). A journey with no assertion is refused — a scenario must prove
something. From then on the file is an ordinary scenario: edit it, add API/MCP
checks, drop the `.gen` once it's yours. The journey itself lives in
`.dev/<target>.journey.json` (gitignored), not the repo.
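
The refusal rule is simple to state in code. A minimal sketch, assuming a journey is a list of steps tagged by kind; `assertPromotable`, `scenarioPath`, and `JourneyStep` are hypothetical names, and the slug is taken as given rather than derived from the scenario title.

```typescript
// Hypothetical sketch of the promote gate; not the CLI's implementation.
interface JourneyStep {
  kind: string; // e.g. "goto", "click", "see", "at-url"
}

// The two assertion steps named above.
const ASSERTION_KINDS = new Set(["see", "at-url"]);

function assertPromotable(steps: JourneyStep[]): void {
  if (!steps.some((step) => ASSERTION_KINDS.has(step.kind))) {
    throw new Error("refusing to promote: the journey has no assertion (see / at-url)");
  }
}

// Where the generated scenario lands, following `<target>/<slug>.gen.test.ts`.
function scenarioPath(target: string, slug: string): string {
  return `${target}/${slug}.gen.test.ts`;
}

assertPromotable([{ kind: "goto" }, { kind: "see" }]); // passes: one assertion present
console.log(scenarioPath("cloud", "policies"));
```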

## Desktop targets (the app on real OSes, filmed)

The packaged desktop app runs as its own targets, each landing in its own
`runs/<target>/` bucket with a video. One shared scenario (`desktop-vm/`) and the
shared driver (`src/vm/desktop.ts`) + setup plumbing (`setup/desktop-vm.ts`); one
project + globalsetup per guest OS.

- **`desktop-packaged`** — the real electron-builder bundle on THIS machine's
display (the supervised-daemon attach path). Needs a logged-in GUI session.
- **`desktop-macos` / `desktop-linux`** — the same bundle inside a guest VM,
driven over CDP from the host and filmed. The globalsetup boots the guest
(tart), builds + pushes the bundle, brings the app up with
`--remote-debugging-port`, forwards it, and the scenario connects + drives +
records. Provisioned automatically — or attach to a running guest with
`E2E_DESKTOP_VM_IP=<ip>`:

```sh
vitest run --project desktop-macos # or desktop-linux
```

The guests run tart `--no-graphics` (no host window, never steals focus) but
still have a usable display:

- **macOS**: the base image's autologin reaches a real Aqua session
(WindowServer/Dock/Finder). Launch the app INTO it with `sudo launchctl asuser
<uid> …` (a plain SSH spawn lands in a non-GUI session); the unsigned arm64
bundle is ad-hoc `codesign`'d in the guest; `screencapture` films it.
- **linux**: no window server, so the app renders into an `Xvfb` display with a
minimal WM (`openbox` — without it the electron window never maps); the window
maps tiny (10x10) so the globalsetup `xdotool`-resizes it to fill, and ffmpeg
`x11grab` films it. `--no-sandbox` (the chrome-sandbox needs setuid root).
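
For the linux guest, the capture step amounts to an ffmpeg invocation against the Xvfb display. The sketch below only builds the argument list; `x11grabArgs` and its options are hypothetical, and the display number, size, and frame rate are illustrative values, so the globalsetup's actual flags may differ.

```typescript
// Hypothetical helper: builds ffmpeg arguments for an x11grab capture.
interface X11GrabOptions {
  display: string; // X display to film, e.g. ":99" (assumed value)
  width: number;
  height: number;
  seconds: number; // how long to record
  output: string; // path of the resulting mp4
  framerate?: number; // defaults to 15 in this sketch
}

function x11grabArgs(o: X11GrabOptions): string[] {
  return [
    "-y", // overwrite an existing output file
    "-f", "x11grab", // input options must precede -i
    "-video_size", `${o.width}x${o.height}`,
    "-framerate", String(o.framerate ?? 15),
    "-i", o.display,
    "-t", String(o.seconds), // stop after this many seconds
    "-pix_fmt", "yuv420p", // widely playable pixel format
    o.output,
  ];
}

console.log(
  x11grabArgs({ display: ":99", width: 1280, height: 800, seconds: 12, output: "session.mp4" }).join(" "),
);
```

Returning an argument array rather than a shell string keeps quoting out of the picture when the command is passed to a process spawner or sent over SSH.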

Base images (`admin`/`admin`): `executor-macos-base` (cirruslabs sequoia, autologin)
and `executor-linux-base` (cirruslabs ubuntu + Xvfb/ffmpeg/openbox/xdotool +
electron runtime libs). The bundle's `executor` binary is cross-compiled for the
guest (`BUN_TARGET`), and electron-builder's `dir` target assembles the unpacked
app on macOS — so both bundles build on this Mac.

Note: `desktop-packaged`'s `guiAvailable()` probe (`launchctl managername`) reads
"Background" over SSH even when Aqua is up, so it's host-only; the VM targets gate
on a CDP page target instead.
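
Gating on a CDP page target can be done by polling the DevTools HTTP endpoint until a target of type `page` appears. This is a sketch under assumptions: `pickPageTarget` and `waitForPageTarget` are hypothetical names, the port is assumed to be forwarded to `127.0.0.1`, and the repo's own `pageWsUrl` may work differently.

```typescript
// Hypothetical sketch of a CDP readiness gate; not the repo's pageWsUrl.
interface CdpTarget {
  type: string; // "page", "background_page", "service_worker", ...
  url: string;
  webSocketDebuggerUrl?: string;
}

// Pure selection step: the first page target that exposes a debugger socket.
function pickPageTarget(targets: CdpTarget[]): string | undefined {
  return targets.find((t) => t.type === "page" && t.webSocketDebuggerUrl)
    ?.webSocketDebuggerUrl;
}

// Poll /json/list until a page target exists or the timeout elapses.
async function waitForPageTarget(
  port: number,
  timeoutMs = 60_000,
  host = "127.0.0.1",
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetch(`http://${host}:${port}/json/list`);
      const ws = pickPageTarget((await res.json()) as CdpTarget[]);
      if (ws) return ws;
    } catch {
      // The app is not listening yet; retry below.
    }
    await new Promise((resolve) => setTimeout(resolve, 500));
  }
  throw new Error(`no CDP page target on ${host}:${port} after ${timeoutMs}ms`);
}
```

A page target only exists once the app has actually created a window, so this gate checks what the scenario needs directly instead of inferring it from the session type.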

## Discovering endpoints

- The full OpenAPI spec: `curl http://127.0.0.1:<cloud port>/api/openapi.json`
75 changes: 75 additions & 0 deletions e2e/desktop-vm/console-renders.test.ts
@@ -0,0 +1,75 @@
// The PACKAGED desktop app, on camera, inside a GUI guest — driven over CDP from
// the host. ONE scenario shared by every desktop-<os> project (desktop-macos,
// desktop-linux): the same bundle and CDP driver, proving it renders on a guest
// OS and filming the actual console. The desktop-<os> globalsetup boots the
// guest, launches the app, forwards its --remote-debugging-port (E2E_DESKTOP_CDP_PORT)
// and publishes the guest IP; this scenario connects, drives, and records. The
// run lands in runs/<target>/ (its own per-OS bucket). Without a guest it skips
// honestly, like desktop-packaged without a display.
import { writeFileSync } from "node:fs";
import { join } from "node:path";

import { expect, it } from "@effect/vitest";
import { Effect } from "effect";

import { scenario } from "../src/scenario";
import { RunDir } from "../src/services";
import { CdpPage, pageWsUrl, recordGuestScreen } from "../src/vm/desktop";

const NAME = "Desktop (packaged, in a VM) · the bundle renders its console";
const cdpPort = process.env.E2E_DESKTOP_CDP_PORT;
const guestIp = process.env.E2E_DESKTOP_VM_IP;
const recSeconds = Number(process.env.E2E_DESKTOP_REC_SECONDS ?? "12");
const os: "macos" | "linux" | "windows" =
  process.env.E2E_TARGET === "desktop-windows"
    ? "windows"
    : process.env.E2E_TARGET === "desktop-linux"
      ? "linux"
      : "macos";

const run = async (runDir: string) => {
  const cdp = await CdpPage.connect(await pageWsUrl(Number(cdpPort)));
  try {
    await cdp.command("Runtime.enable");
    await cdp.command("Page.enable");

    // Film the console while we drive it (OS-aware capture lands a playable mp4).
    const recording = recordGuestScreen(
      guestIp as string,
      recSeconds,
      join(runDir, "session.mp4"),
      os,
    );

    // Reaching the nav proves the packaged bundle booted and connected to its
    // daemon on this OS.
    await cdp.waitForText("Integrations", 60_000).catch(() => cdp.waitForText("Settings", 60_000));
    writeFileSync(join(runDir, "01-console-rendered.png"), await cdp.screenshot());

    const body = await cdp.command<{ result?: { value?: string } }>("Runtime.evaluate", {
      expression: "document.body.innerText",
      returnByValue: true,
    });
    expect(body.result?.value ?? "", "the packaged console rendered its nav").toContain(
      "Integrations",
    );

    await recording;
  } finally {
    cdp.close();
  }
};

if (!cdpPort || !guestIp) {
  it.skip(`${NAME} (needs a desktop guest — set E2E_DESKTOP_VM_IP or run the desktop-<os> project)`, () => {});
} else {
  // Literal name (not NAME) so the run's test.ts review artifact captures it.
  scenario(
    "Desktop (packaged, in a VM) · the bundle renders its console",
    { timeout: 180_000 },
    Effect.gen(function* () {
      const runDir = yield* RunDir;
      yield* Effect.promise(() => run(runDir));
    }),
  );
}