vibetunnel/docs/testing.md
2025-07-18 08:37:16 +02:00

8.7 KiB
Raw Permalink Blame History

Testing

VibeTunnel uses modern testing frameworks across platforms: Swift Testing for macOS/iOS, Vitest for Node.js unit tests, and Playwright for end-to-end web testing. Tests are organized by platform and type, with comprehensive coverage requirements.

Key Files

Test Configurations

  • web/vitest.config.ts (main config)
  • web/vitest.config.e2e.ts (E2E config)
  • web/playwright.config.ts (Playwright E2E)

Test Utilities

  • web/src/test/test-utils.ts (mock helpers)
  • mac/VibeTunnelTests/Utilities/TestTags.swift (test categorization)

Platform Tests

  • mac/VibeTunnelTests/ (Swift tests)
  • web/src/test/ (Node.js tests)
  • web/tests/ (Playwright E2E tests)

Test Types

macOS Unit Tests

Swift Testing framework tests covering core functionality:

// From mac/VibeTunnelTests/ServerManagerTests.swift:14-40
@Test("Starting and stopping Bun server", .tags(.critical))
func serverLifecycle() async throws {
    let manager = ServerManager.shared
    await manager.stop()
    await manager.start()
    #expect(manager.isRunning)
    await manager.stop()
    #expect(!manager.isRunning)
}

Core Test Files:

  • ServerManagerTests.swift - Server lifecycle and management
  • TerminalManagerTests.swift - Terminal session handling
  • TTYForwardManagerTests.swift - TTY forwarding logic
  • SessionMonitorTests.swift - Session monitoring
  • NetworkUtilityTests.swift - Network operations
  • CLIInstallerTests.swift - CLI installation
  • NgrokServiceTests.swift - Ngrok integration
  • DashboardKeychainTests.swift - Keychain operations

Test Tags (mac/VibeTunnelTests/Utilities/TestTags.swift):

  • .critical - Core functionality tests
  • .networking - Network-related tests
  • .concurrency - Async/concurrent operations
  • .security - Security features
  • .integration - Cross-component tests

Node.js Unit Tests

Vitest-based testing with unit and E2E capabilities:

Test Configuration (web/vitest.config.ts):

  • Global test mode enabled
  • Node environment
  • Coverage thresholds: 80% across all metrics
  • Custom test utilities setup (web/src/test/setup.ts)

E2E Tests (web/src/test/e2e/):

  • hq-mode.e2e.test.ts - HQ mode with multiple remotes
  • server-smoke.e2e.test.ts - Basic server functionality

Test Utilities (web/src/test/test-utils.ts):

// Mock session creation helper
export const createMockSession = (overrides?: Partial<MockSession>): MockSession => ({
  id: 'test-session-123',
  command: 'bash',
  workingDir: '/tmp',
  status: 'running',
  ...overrides,
});

Playwright End-to-End Tests

Browser automation tests for full user workflows:

Configuration (web/playwright.config.ts):

use: {
  // Global timeout for assertions
  expect: { timeout: 5000 },
  
  // Action timeout (click, fill, etc.)
  actionTimeout: 10000,
  
  // Navigation timeout
  navigationTimeout: 10000,
  
  // Trace on retry for debugging
  trace: 'on-first-retry',
}

Playwright Best Practices

Use Auto-Waiting Instead of Arbitrary Delays

Bad: Arbitrary timeouts

await page.waitForTimeout(1000); // Don't do this!

Good: Wait for specific conditions

// Wait for element to be visible
await page.waitForSelector('vibe-terminal', { state: 'visible' });

// Wait for loading indicator to disappear
await page.locator('.loading-spinner').waitFor({ state: 'hidden' });

// Wait for specific text to appear
await page.getByText('Session created').waitFor();

Use Web-First Assertions

Web-first assertions automatically wait and retry:

// These assertions auto-wait
await expect(page.locator('session-card')).toBeVisible();
await expect(page).toHaveURL(/\?session=/);
await expect(sessionCard).toContainText('RUNNING');

Prefer User-Facing Locators

Locator Priority (best to worst):

  1. getByRole() - semantic HTML roles
  2. getByText() - visible text content
  3. getByTestId() - explicit test IDs
  4. locator() with CSS - last resort
// Good examples
await page.getByRole('button', { name: 'Create Session' }).click();
await page.getByText('Session Name').fill('My Session');
await page.getByTestId('terminal-output').waitFor();

VibeTunnel-Specific Patterns

Waiting for Terminal Ready

// Wait for terminal component to be visible
await page.waitForSelector('vibe-terminal', { state: 'visible' });

// Wait for terminal to have content
await page.waitForFunction(() => {
  const terminal = document.querySelector('vibe-terminal');
  return terminal && (
    terminal.textContent?.trim().length > 0 ||
    !!terminal.shadowRoot ||
    !!terminal.querySelector('.xterm')
  );
});

Session Creation

// Wait for navigation after session creation
await expect(page).toHaveURL(/\?session=/, { timeout: 2000 });

// Wait for terminal to be ready
await page.locator('vibe-terminal').waitFor({ state: 'visible' });

Terminal Output

// Wait for specific text in terminal
await page.waitForFunction(
  (searchText) => {
    const terminal = document.querySelector('vibe-terminal');
    return terminal?.textContent?.includes(searchText);
  },
  'Expected output'
);

// Wait for shell prompt
await page.waitForFunction(() => {
  const terminal = document.querySelector('vibe-terminal');
  const content = terminal?.textContent || '';
  return /[$>#%]\s*$/.test(content);
});

Running Tests

macOS Tests

# Run all tests via Xcode
xcodebuild test -project mac/VibeTunnel.xcodeproj -scheme VibeTunnel

# Run specific test tags
xcodebuild test -project mac/VibeTunnel.xcodeproj -scheme VibeTunnel -only-testing:VibeTunnelTests/ServerManagerTests

Node.js Tests

# Run all unit tests
cd web && pnpm run test

# Run with coverage
pnpm run test:coverage

# Run E2E tests
pnpm run test:e2e

Playwright Tests

# Install browsers (first time)
cd web && pnpm exec playwright install

# Run all Playwright tests
pnpm run test:playwright

# Run with UI mode
pnpm exec playwright test --ui

# Debug specific test
pnpm exec playwright test --debug tests/session-management.spec.ts

Test Organization

macOS Test Structure

mac/VibeTunnelTests/
├── Utilities/
│   ├── TestTags.swift      - Test categorization
│   ├── TestFixtures.swift  - Shared test data
│   └── MockHTTPClient.swift - HTTP client mocks
├── ServerManagerTests.swift
├── TerminalManagerTests.swift
├── TTYForwardManagerTests.swift
├── SessionMonitorTests.swift
├── NetworkUtilityTests.swift
├── CLIInstallerTests.swift
├── NgrokServiceTests.swift
├── DashboardKeychainTests.swift
├── ModelTests.swift
├── SessionIdHandlingTests.swift
└── VibeTunnelTests.swift

Web Test Structure

web/
├── src/test/           # Unit tests
│   ├── e2e/
│   │   ├── hq-mode.e2e.test.ts
│   │   └── server-smoke.e2e.test.ts
│   ├── setup.ts
│   └── test-utils.ts
└── tests/              # Playwright E2E tests
    ├── session-management.spec.ts
    ├── terminal-interaction.spec.ts
    └── fixtures/

Coverage Requirements

Node.js Coverage (web/vitest.config.ts):

  • Provider: V8
  • Reporters: text, json, html, lcov
  • Thresholds: 80% for lines, functions, branches, statements
  • Excludes: node_modules, test files, config files

Playwright Coverage: Focus on critical user journeys rather than code coverage.

Custom Matchers & Utilities

Node.js Custom Matchers (web/src/test/setup.ts):

  • toBeValidSession() - Validates session object structure

Test Utilities:

  • createMockSession() - Generate test session data
  • createTestServer() - Spin up Express server for testing
  • waitForWebSocket() - WebSocket timing helper
  • mockWebSocketServer() - Mock WS server implementation

Debugging Flaky Tests

Enable Trace Recording

// In playwright.config.ts
use: {
  trace: 'on-first-retry',
}

Use Debug Mode

# Run with headed browser and inspector
pnpm exec playwright test --debug

Add Strategic Logging

console.log('Waiting for terminal to be ready...');
await page.locator('vibe-terminal').waitFor();
console.log('Terminal is ready');

Best Practices Summary

  1. Never use waitForTimeout() - always wait for specific conditions
  2. Use web-first assertions that auto-wait
  3. Prefer semantic locators over CSS selectors
  4. Wait for observable conditions not arbitrary time
  5. Configure appropriate timeouts for your application
  6. Keep tests isolated and independent
  7. Use debugging tools for flaky tests
  8. Test at the right level - unit for logic, E2E for workflows