Tabby E2E Test Plan

Functional + visual coverage proposal

Chrome extension · 6,362 LOC · 49 features

TL;DR

~62% line coverage

40/49 features asserted

~15 visual artifacts

Playwright + launchPersistentContext + unpacked extension.
Same harness as existing scripts/video-demo.mjs.
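
A minimal sketch of the launch step, assuming the unpacked build sits in a directory passed in by the caller and that tests need the extension's service worker before they start. The names extensionLaunchOptions and launchTabby are illustrative and not taken from scripts/video-demo.mjs; headed mode is an assumption that fits running under Xvfb.

```javascript
// Build the Chromium flags that side-load a single unpacked extension.
// `extensionDir` is a hypothetical path to the built extension.
function extensionLaunchOptions(extensionDir) {
  return {
    headless: false, // assumption: headed Chromium, e.g. under Xvfb in CI
    args: [
      `--disable-extensions-except=${extensionDir}`,
      `--load-extension=${extensionDir}`,
    ],
  };
}

// Launch a persistent context and resolve the extension's service worker.
// Playwright is imported lazily so the options helper stays dependency-free.
async function launchTabby(extensionDir, userDataDir) {
  const { chromium } = await import('playwright');
  const context = await chromium.launchPersistentContext(
    userDataDir,
    extensionLaunchOptions(extensionDir),
  );
  // The worker may already be running; otherwise wait for it to register.
  const sw =
    context.serviceWorkers()[0] ?? (await context.waitForEvent('serviceworker'));
  return { context, sw };
}
```

The returned sw handle is what the sw.evaluate() assertions would run against.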

Two-layer strategy

Functional

Playwright expect() on DOM
sw.evaluate() on SW singletons
Binary pass/fail, drives CI
Assert end-state, not mid-animation
Mock Supabase + Mixpanel
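
One way the Supabase and Mixpanel mocks might look, as a sketch only. The host patterns and the empty JSON body are assumptions about where the extension sends traffic and what it tolerates in return; mockResponseFor and installNetworkMocks are hypothetical names.

```javascript
// Hosts whose traffic the functional layer replaces with canned responses.
// Both patterns are assumptions about the extension's endpoints.
const MOCKED_HOSTS = [/\.supabase\.co$/, /(^|\.)mixpanel\.com$/];

// Return a canned response for a mocked host, or null to let the request through.
function mockResponseFor(url) {
  const { hostname } = new URL(url);
  if (!MOCKED_HOSTS.some((re) => re.test(hostname))) return null;
  return { status: 200, contentType: 'application/json', body: '{}' };
}

// Install the mock on a Playwright BrowserContext.
async function installNetworkMocks(context) {
  await context.route('**/*', (route) => {
    const mocked = mockResponseFor(route.request().url());
    return mocked ? route.fulfill(mocked) : route.continue();
  });
}
```

Requests issued from the extension's service worker may not pass through context.route by default, since Playwright treats service-worker network interception as experimental. If that applies here, stubbing fetch inside the worker via sw.evaluate() is an alternative worth checking.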

Visual

PNG per stable state (layouts, menus, dialogs)
MP4 per animation (zoom, pinch, DnD)
Recorded under Xvfb + ffmpeg
Uploaded via scripts/dbx-upload.sh
For human review, not CI gate

Visuals are not a replacement for assertions — they catch regressions in motion that DOM state alone misses.
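
For the MP4 side, a sketch of the ffmpeg arguments for recording an Xvfb display. The display number, resolution, and frame rate are illustrative defaults, and ffmpegCaptureArgs is a hypothetical helper, not part of the existing scripts.

```javascript
// Build ffmpeg arguments that record an X11 display (e.g. one served by Xvfb).
// Defaults are assumptions; adjust them to match the Xvfb screen in use.
function ffmpegCaptureArgs({ display = ':99', width = 1280, height = 800, fps = 30, output }) {
  return [
    '-y',                                // overwrite an existing output file
    '-f', 'x11grab',                     // capture from an X11 display
    '-video_size', `${width}x${height}`, // must fit inside the Xvfb screen
    '-framerate', String(fps),
    '-i', display,
    '-c:v', 'libx264',
    '-pix_fmt', 'yuv420p',               // widely playable pixel format
    output,
  ];
}
```

The array can be handed to child_process.spawn('ffmpeg', args); sending SIGINT to the child ends the recording and lets ffmpeg finalize the file.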

Coverage by file

Area	Files (LOC)	Coverage
App shell	App, main, hooks (588)	~75%
Header	Header.tsx (438)	~75%
Grid	TabGrid, TabCard (494)	~80%
Stack	StackView.tsx (764)	~70%
Zoom entrance	ZoomEntrance, pre-overlay (417)	~65%
Welcome	WelcomeOverlay.tsx (254)	~80%
Refresh popup	RefreshPopup.tsx (151)	~80%
Content script	zoom-out.ts (67)	~75%
Commands & icon	index, tab-ready (219)	~70%
Capture/storage	capture, storage (1260)	~55%
Messages	messages.ts (424)	~65%
Backup	backup.ts (271)	~50%
Clustering	clustering.ts (500)	~30%
Analytics	analytics.ts (159)	~30%

Weighted total: ~60–65% LOC · ~50% branch
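
The LOC-weighted figure can be reproduced from the table by weighting each area's estimate by its line count. A small sketch using the table's numbers as given; the listed files sum to 6,006 LOC, so the remaining ~356 of the 6,362 total are assumed to fall outside the table.

```javascript
// [area, LOC, estimated line coverage], copied from the table above.
const AREAS = [
  ['App shell', 588, 0.75],
  ['Header', 438, 0.75],
  ['Grid', 494, 0.80],
  ['Stack', 764, 0.70],
  ['Zoom entrance', 417, 0.65],
  ['Welcome', 254, 0.80],
  ['Refresh popup', 151, 0.80],
  ['Content script', 67, 0.75],
  ['Commands & icon', 219, 0.70],
  ['Capture/storage', 1260, 0.55],
  ['Messages', 424, 0.65],
  ['Backup', 271, 0.50],
  ['Clustering', 500, 0.30],
  ['Analytics', 159, 0.30],
];

// Weight each area's coverage estimate by its share of the listed LOC.
function weightedCoverage(rows) {
  const totalLoc = rows.reduce((sum, [, loc]) => sum + loc, 0);
  const coveredLoc = rows.reduce((sum, [, loc, cov]) => sum + loc * cov, 0);
  return { totalLoc, coverage: coveredLoc / totalLoc };
}

const summary = weightedCoverage(AREAS);
console.log(`${summary.totalLoc} LOC, ${(summary.coverage * 100).toFixed(1)}%`);
// prints: 6006 LOC, 63.3%
```

The result sits inside the ~60–65% band; Capture/storage dominates the weighting because it holds about a fifth of the listed lines.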

49 features, 11 areas

A Bootstrap (4)

B Header controls (7)

C Grid layout (8)

D Stack layout (12)

E Zoom entrance/exit (3)

F Refresh screenshots (4)

G Content-script pinch (1)

H Keyboard commands (2)

I Capture + storage (4)

J Debug + backup (3)

K Analytics (1)
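
Item numbers used elsewhere in this plan appear to follow from numbering the features consecutively in area order. A sketch that derives the ranges from the counts above, under that assumption; itemRanges is an illustrative helper.

```javascript
// Areas and feature counts in the order listed above.
const FEATURE_AREAS = [
  ['A', 'Bootstrap', 4],
  ['B', 'Header controls', 7],
  ['C', 'Grid layout', 8],
  ['D', 'Stack layout', 12],
  ['E', 'Zoom entrance/exit', 3],
  ['F', 'Refresh screenshots', 4],
  ['G', 'Content-script pinch', 1],
  ['H', 'Keyboard commands', 2],
  ['I', 'Capture + storage', 4],
  ['J', 'Debug + backup', 3],
  ['K', 'Analytics', 1],
];

// Assign consecutive item numbers to each area, starting at 1.
function itemRanges(areas) {
  let next = 1;
  return areas.map(([id, name, count]) => {
    const range = { id, name, first: next, last: next + count - 1 };
    next += count;
    return range;
  });
}

for (const { id, name, first, last } of itemRanges(FEATURE_AREAS)) {
  console.log(`${id} ${name}: items ${first === last ? first : `${first}-${last}`}`);
}
```

Under this numbering the last item is 49, matching the feature total, with G at item 39 and K at item 49.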

A – B: Bootstrap & Header

A. Bootstrap (4)

Install / new-tab override loads tab.html
First-time welcome overlay (bounce)
Congratulations + confetti (5 s)
Toolbar icon / single-instance policy

B. Header (7)

Search input + clear
Search shortcuts (/, Cmd+F, Esc)
Layout toggle: grid ↔ stack
Overview spring-zoom
Hamburger menu + outside-click close
Alt/Option reveals Debug submenu
About dialog

C – D: Grid & Stack

C. Grid (8)

Window sections + current marker
TabCard render (thumb, favicon, title)
3D tilt on hover
Close button X
Click → zoom-exit → activate
DnD reorder + cross-window
Cluster groups render
Ungroup all

D. Stack (12)

3D fanned stack geometry
Focused mode (viewport-scaled)
Hover → focus
Mouse-past-slot hand-off
Pinch zoom-out
Pinch zoom-in
Pinch-in focused → open tab
500 ms cooldown
Horizontal wheel scroll
Close window
Close tab from card
DnD reorder (horizontal)

E – H: Zoom, Refresh, Commands

E. Zoom entrance (3)

Pre-overlay before React mount
Zoom-entrance on NTP load
Zoom-exit on card click

F. Refresh (4)

Launches 585×501 popup
Progress UI (catwalk + bar)
Stop button halts loop
Completion returns focus

G. Content script (1)

Pinch-out → open Tabby

H. Commands (2)

Cmd+Shift+X — expose mode
Cmd+Shift+A — expose all

I – K: Capture, Backup, Analytics

I. Capture + storage (4)

Per-tab capture on activation / update
Restricted URLs skipped
Thumbnail re-association after restart
Thumbnail healing after cross-window DnD

J. Debug + backup (3)

getDebugSettings round-trip
Multi-Tabby toggle
Folder backup mirror + restore

K. Analytics (1)

OnNewTabPageLoaded, First Time Usage

Functional vs visual split

Kind	Count	Examples
Functional only	16	close window, commands, persistence, debug settings
Functional + PNG	10	layouts, menus, clusters, search empty state
Functional + MP4	19	zoom, pinch, DnD, welcome flow, refresh progress
Visual only	—	every visual test also asserts

Example assertions

// Item 5 — Search filters tabs
await page.fill('input[placeholder^="Search"]', 'alpha')
await expect(page.locator('[data-tab-id]')).toHaveCount(1)

// Item 15 — Close button removes tab
const countTabs = () => sw.evaluate(() => chrome.tabs.query({}).then(t => t.length))
const before = await countTabs()
await page.locator('[data-tab-id] button.close').first().click()
// Tab removal is asynchronous, so poll instead of reading the count once
await expect.poll(countTabs).toBe(before - 1)

// Item 42 — Thumbnail captured on activation
await page.goto('http://127.0.0.1:17001/')
await waitForCaptures(sw, ['http://127.0.0.1:17001/'], 10_000)
const rows = await sw.evaluate(() => self.__tabbyStorage.getAllRaw())
expect(rows.find(r => r.url === 'http://127.0.0.1:17001/').dataUrl.length)
  .toBeGreaterThan(1000)
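
The waitForCaptures helper used in the Item 42 example is not defined in this plan. A hypothetical sketch of how it might poll storage through the service worker, assuming each row exposes url and dataUrl as the assertion above implies; the 250 ms default interval is an arbitrary choice.

```javascript
// Hypothetical sketch: poll storage until every URL has a thumbnail row.
// `sw` is any object with an async evaluate(fn), such as a Playwright Worker.
async function waitForCaptures(sw, urls, timeoutMs, intervalMs = 250) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    // self.__tabbyStorage is the test hook the example above relies on.
    const rows = await sw.evaluate(() => self.__tabbyStorage.getAllRaw());
    const captured = new Set(rows.filter((r) => r.dataUrl).map((r) => r.url));
    const missing = urls.filter((u) => !captured.has(u));
    if (missing.length === 0) return rows;
    if (Date.now() >= deadline) {
      throw new Error(`No capture after ${timeoutMs} ms for: ${missing.join(', ')}`);
    }
    // Back off briefly before asking the worker again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Throwing with the list of missing URLs keeps a timeout failure readable in CI logs.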

What we do NOT test

gesture: showDirectoryPicker, permission re-auth prompts
network: real Supabase clustering, real Mixpanel
timing: frame-accurate animation timing
perf: FPS counters, jank detection
a11y: accessibility, themes, multi-monitor
browser: visualViewport.scale in content script
retry: capture rate-limit / retry edge branches

Explicitly out of scope so expectations stay clear.

Harness structure

test/e2e/
  setup.mjs                       # shared helpers
  functional/
    bootstrap.test.mjs            # items 1–4
    header.test.mjs               # items 5–11
    grid.test.mjs                 # items 12–19
    stack.test.mjs                # items 20–31
    zoom.test.mjs                 # items 32–34
    refresh.test.mjs              # items 35–38
    commands.test.mjs             # items 39–41
    capture.test.mjs              # items 42–45
    backup.test.mjs               # items 46–48
  visual/                         # PNGs → docs/e2e-screens/
  video/                          # MP4s → docs/videos/ (Xvfb wrapper)
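
A runner for the functional/ directory could be quite small. The sketch below is one possible shape, assuming each test file is a standalone script that exits non-zero on failure; selectTestFiles and runFunctional are hypothetical names.

```javascript
// Pick the functional test files from a directory listing, in a stable order.
function selectTestFiles(entries) {
  return entries.filter((name) => name.endsWith('.test.mjs')).sort();
}

// Run each file in its own Node process and stop at the first failure.
// Returns the exit code the runner should finish with.
async function runFunctional(dir = 'test/e2e/functional') {
  const { readdirSync } = await import('node:fs');
  const { spawnSync } = await import('node:child_process');
  for (const file of selectTestFiles(readdirSync(dir))) {
    const { status } = spawnSync(process.execPath, [`${dir}/${file}`], {
      stdio: 'inherit',
    });
    // status is null when the child was killed by a signal; treat that as failure.
    if (status !== 0) return status ?? 1;
  }
  return 0;
}
```

Running files in separate processes keeps one suite's persistent browser profile from leaking into the next, at the cost of a browser launch per file.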

"test:e2e":        "node test/e2e/run-functional.mjs",
"test:e2e:visual": "node test/e2e/run-visual.mjs",
"test:e2e:video":  "bash test/e2e/run-video.sh"

Phased rollout

Phase	Scope	Coverage	Runtime
1. Smoke	Bootstrap, layouts, search, click, close, DnD, capture, persistence	~30%	<2 min
2. Breadth	Menus, clusters, commands, backup, all remaining functional	~55%	~5 min
3. Visual	PNGs + MP4s under Xvfb + Dropbox upload	~65%	~10 min

Each phase is independently shippable.

Four decisions

Phase 1 only, or full plan upfront?
Short per-feature clips, or one narrative video?
Mock Supabase + Mixpanel to push past 75% coverage?
Location: new test/e2e/ tree, or extend scripts/?

Pick an option for each, then I start writing code.

Questions?

Full plan: docs/e2e-test-plan.md
Slides: docs/e2e-test-plan-slides.html