January 27, 2026

Vibe Engineering: Visual Testing for Browser Extensions: From Puppeteer to GIF-Powered PR Reviews

Dzianis Vashchuk

Originally on Medium

Author: Dzianis Vashchuk | Site: Medium | Published: 2026-01-27T21:50:56Z

When AI agents write a lot of code, reviewing and testing browser extensions becomes notoriously difficult. Extensions live in a sandboxed environment, interact with unpredictable web content, and require complex setup involving service workers, content scripts, and browser APIs. At VibeBrowser, we built an AI-powered browser agent that needs to work reliably across thousands of websites. Here's how we tackled extension testing with Puppeteer and made code reviews seamless with automated GIF reports.

The Challenge

Browser extensions are hard to test because:

- No standard test runner: unlike web apps, extensions don't run in Node.js
- Complex lifecycle: service workers, content scripts, and popups all need coordination
- Real browser required: mocking Chrome APIs only gets you so far
- Visual verification: many bugs are only visible when you see the actual UI

Our Solution: Puppeteer with Extension Loading

We use Puppeteer to launch Chrome with our extension pre-installed:

    const puppeteer = require('puppeteer');

    // extensionPath: path to the unpacked extension directory
    const browser = await puppeteer.launch({
      headless: false,
      args: [
        `--disable-extensions-except=${extensionPath}`,
        `--load-extension=${extensionPath}`,
        '--no-first-run',
      ],
    });

Test Harness Architecture

Our test harness (tests/lib/harness.js) handles the complexity:

    class TestHarness {
      async init(testName) {
        // Each run gets its own output directory
        this.testDir = `.test/${testName}-${timestamp}`;
        this.browser = await puppeteer.launch({...});
        await this.waitForServiceWorker();
        this.recorder = new VideoRecorder(this.testDir);
      }

      async screenshot(name) {
        const path = `${this.testDir}/screenshots/${name}.png`;
        await this.page.screenshot({ path, fullPage: false });
        // Register the frame so it ends up in the GIF
        this.recorder.addScreenshot(path, name);
      }
    }

Testing Different Scenarios

We test multiple extension use cases:

1. Extension Mock Tests: core functionality with mocked LLM responses

    test('agent completes navigation task', async () => {
      await harness.openSidepanel();
      await harness.screenshot('01_sidepanel-open');
      await harness.sendMessage('Go to example.com');
      await harness.waitForAgentIdle();
      await harness.screenshot('02_navigation-complete');
      expect(page.url()).toContain('example.com');
    });

2. MCP Server Tests: external tool integration

    test('MCP tools execute correctly', async () => {
      await harness.enableMCP();
      await harness.screenshot('01_mcp-enabled');
      await harness.executeTool('browser_navigate', { url: 'https://...' });
      await harness.screenshot('02_tool-executed');
    });

3. Google Workspace Tests: real-world productivity workflows

    test('creates Google Doc', async () => {
      await harness.authenticate();
      await harness.sendMessage('Create a new Google Doc titled "Meeting Notes"');
      await harness.waitForAgentIdle();
      expect(await page.$('div[data-doc-title="Meeting Notes"]')).toBeTruthy();
    });

The GIF-Powered Review Experience

Screenshots are great, but watching the test flow as an animation is even better. We built a VideoRecorder class that:

- Collects screenshots during test execution
- Generates optimized GIFs using ffmpeg
- Uploads to GitHub releases for embedding in PR comments

    class VideoRecorder {
      async createVideo() {
        // Two passes: build a color palette, then encode the GIF with it
        await this.generatePalette();
        await this.createGifWithPalette();
      }
    }

Automated PR Comments

Our CI script (scripts/create-gh-report.sh) automatically:

- Finds the most recent run for each test type
- Generates GIFs if missing
- Posts a clean PR comment with all test results

The comment lists each test file and ends with a timestamp, for example:

    tests/extension.mock.test.js
    tests/google-workspace.test.js
    ---
    2024-01-27T21:33:08Z

Key Design Decisions

1. Crop, Don't Squish: frames are scaled to the target width with the aspect ratio preserved, then cropped to the viewport height rather than squeezed to fit, which keeps text readable:

    scale=${width}:-1,crop=${width}:${height}:0:0

2. One Entry Per Test Type: the report shows the most recent run for each unique test, not duplicates:

    if echo "$SEEN_TYPES" | grep -q "|${test_type}|"; then
      continue
    fi

3. Fast Playback: GIFs play at 2 fps (0.5s per frame) so reviewers can quickly scan results without waiting.

Results

This approach gives us:

- Confidence: every PR shows exactly what the tests did
- Speed: reviewers see test behavior in seconds, not minutes
- Debugging: failed tests have visual evidence of what went wrong
- Documentation: GIFs serve as living documentation of features

Try It Yourself

The core components you need:

- Puppeteer with the --load-extension flag
- ffmpeg for GIF generation
- GitHub releases for artifact hosting (free and fast)
- A simple bash script to tie it together

Browser extension testing doesn't have to be painful. With the right tooling, you can have visual, automated, and reviewable tests that make your whole team more confident in shipping.

We're building VibeBrowser, an AI agent that lives in your browser. Follow our journey at vibebrowser.app
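
For readers who want to try the GIF step themselves, here is a minimal sketch of the two-pass palette approach that generatePalette and createGifWithPalette refer to. It is not the VibeBrowser implementation. It assumes the screenshots have already been copied to sequentially numbered files (a hypothetical frame_%03d.png pattern), and the function names, the palette.png and report.gif file names, and the option names are all illustrative. The filter string and the 2 fps rate come from the article; palettegen and paletteuse are standard ffmpeg filters.

```javascript
const { execFileSync } = require('node:child_process');
const path = require('node:path');

/**
 * Build the argument lists for the two ffmpeg passes.
 * All option names here are hypothetical, chosen for this sketch.
 */
function buildGifCommands({ framePattern, outDir, width, height, fps = 2 }) {
  const palette = path.join(outDir, 'palette.png');
  const gif = path.join(outDir, 'report.gif');

  // Scale to the target width, then crop to the viewport height.
  const fit = `scale=${width}:-1,crop=${width}:${height}:0:0`;

  // Read the numbered frames as an image sequence at the playback rate.
  const input = ['-y', '-framerate', String(fps), '-i', framePattern];

  return {
    // Pass 1: derive a color palette from all frames.
    palette: [...input, '-vf', `${fit},palettegen`, palette],
    // Pass 2: encode the GIF, mapping every frame onto that palette.
    gif: [...input, '-i', palette, '-lavfi', `${fit}[x];[x][1:v]paletteuse`, gif],
  };
}

/** Run both passes. Requires ffmpeg on the PATH. */
function createGif(options) {
  const commands = buildGifCommands(options);
  execFileSync('ffmpeg', commands.palette, { stdio: 'inherit' });
  execFileSync('ffmpeg', commands.gif, { stdio: 'inherit' });
}
```

Calling createGif({ framePattern: 'frames/frame_%03d.png', outDir: 'frames', width: 1280, height: 800 }) would run both passes. Keeping the argument construction separate from the process call makes the command lines easy to inspect or unit test without ffmpeg installed.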
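
The "most recent run for each test type" rule can also be sketched in JavaScript. The article's script does this in bash with a SEEN_TYPES string; the version below swaps in a Map instead. It assumes run directories are named like the harness's `${testName}-${timestamp}` and that the timestamp is a purely numeric suffix such as Date.now(), which the article does not confirm. The function name latestRunPerType is hypothetical.

```javascript
/**
 * Keep only the newest run directory for each test type.
 * Assumption: names look like "<testName>-<numericTimestamp>".
 */
function latestRunPerType(dirNames) {
  const latest = new Map();

  for (const dir of dirNames) {
    // The timestamp is everything after the last hyphen.
    const cut = dir.lastIndexOf('-');
    if (cut <= 0) continue; // no test name before the hyphen

    const type = dir.slice(0, cut);
    const suffix = dir.slice(cut + 1);
    if (!/^\d+$/.test(suffix)) continue; // not a run directory

    const stamp = Number(suffix);
    const seen = latest.get(type);
    if (!seen || stamp > seen.stamp) {
      latest.set(type, { dir, stamp });
    }
  }

  // One entry per test type, in order of first appearance.
  return [...latest.values()].map((run) => run.dir);
}
```

Unlike the grep-based check, this version does not depend on the directories being listed newest first, because it compares timestamps directly.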