True end-to-end coverage for the Slack bridge: send real user messages in a real Slack workspace, sample the bot's reply while it's streaming, take screenshots in the middle of long streams, and verify what landed.
Why this exists. Unit tests (under
src/__tests__/) lock in the internal contracts of each module — they don't catch issues that only surface end-to-end: an open code fence leaking through the rest of the Slack message during streaming, a mrkdwn translation that looks right in tests but renders weird in Slack's actual client, a Block Kit limit we forgot about, a Bolt event that doesn't fire under some setting.The catalog at
e2e/cases.tsis the source of truth for what "feature-complete" means.
e2e/
├── README.md this
├── cases.ts catalog of test cases (technical axes; expand liberally)
├── slack-api.ts Slack Web API helpers (history, thread replies, sampling)
├── run.ts harness entrypoint — sends prompts, samples, screenshots
└── results/ per-run output: screenshots + JSON report
# from packages/slack/
# one-time: log into Slack once in the playwright browser profile.
# Subsequent runs reuse that profile.
pnpm exec playwright open --browser=chromium --user-data-dir=./e2e/.chrome-profile \
https://app.slack.com/client/T05QFA4BW9X/C0B49MEJ1HQ
# then:
pnpm e2eThe runner expects .env to already contain SLACK_BOT_TOKEN (used for
polling the channel history while the bot streams). Sending the user
message happens through the playwright-driven Slack UI using Atai's
session cookies from the persistent profile.
For each case the harness:
- Sends the prompt via the Slack UI (or
/agentslash command). - Polls
conversations.replies(or.historyfor DMs / flat replies) everysampleIntervalMsuntilmaxWaitMselapses. - At each sample, records:
- elapsed time
- bot's reply text snapshot
- bracket-balance check (
isBalanced(text))
- At each
screenshots[i]offset (ms after send), takes a screenshot of the Slack thread pane via playwright. - After the run, writes
results/<timestamp>/report.jsonand the screenshots.
- Open code fences leaking through the rest of the Slack message
- Slack's
chat.updaterate limits creating visible "jumps" - mrkdwn rendering differences vs. our translator's expectations
- The bot's actual streaming cadence with the model
- Thread vs DM rendering differences
- Real concurrency from multiple users in the channel
- Mid-stream cancellation / kill behaviour
Edit cases.ts. The bar is low — anything you'd want to see working in
Slack belongs in the catalog. Don't be afraid of duplication with
unit tests; the unit test proves the code is internally correct, the E2E
proves Slack actually renders it that way.
- Sending the user message still relies on UI automation (no user token), so a one-time signin in the persistent profile is required.
- A future enhancement: a long-lived user OAuth token would let us skip the browser entirely for the send step (screenshots still need the browser, but sampling already uses pure API).