vibetunnel/web/server-spec.md
Mario Zechner 96a5f1c3d8 docs: add comprehensive server specification

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-20 12:20:21 +02:00

VibeTunnel Server Specification

This document provides a comprehensive specification of the VibeTunnel server architecture, including PTY management, terminal state management, distributed HQ mode, and all protocols. This specification is designed to enable implementation in any programming language.

Table of Contents

  1. Overview
  2. Server Modes
  3. Authentication
  4. Session Management
  5. Terminal Management
  6. Binary Buffer Protocol
  7. Stream Format
  8. API Endpoints
  9. WebSocket Protocols
  10. HQ Mode Architecture
  11. File System Structure
  12. Implementation Notes

Overview

VibeTunnel is a terminal session management server that provides:

  • Remote terminal session creation and management
  • Real-time terminal output streaming
  • Session persistence and replay
  • Distributed architecture with HQ mode for managing multiple servers
  • Binary-optimized terminal buffer synchronization

Server Modes

The server can operate in three modes:

1. Normal Mode (Default)

  • Standalone server managing local terminal sessions
  • Optional Basic Authentication
  • No connection to other servers

2. HQ Mode (--hq flag)

  • Acts as a headquarters server managing multiple remote servers
  • Aggregates sessions from all registered remotes
  • Proxies API requests to appropriate remote servers
  • Maintains health checks on all remotes

3. Remote Mode (--hq-url flag)

  • Registers with an HQ server
  • Accepts both Basic Auth and Bearer token authentication
  • Operates independently if HQ is unavailable
  • Provides all normal mode functionality

Authentication

Configuration

Environment Variables

  • VIBETUNNEL_USERNAME - Username for Basic Authentication
  • VIBETUNNEL_PASSWORD - Password for Basic Authentication
  • Set both variables or neither

Command Line Arguments

  • --username - Local server username (overrides env var)
  • --password - Local server password (overrides env var)
  • --hq-url - URL of HQ server to register with (enables remote mode)
  • --hq-username - Username for authenticating with HQ
  • --hq-password - Password for authenticating with HQ
  • --name - Unique name for this remote server (required with --hq-url)
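The flags and environment variables above imply a few startup checks. The following is a minimal sketch of those checks, not the server's actual option parser: `CliOptions`, `ResolvedConfig`, and `resolveMode` are hypothetical names, and rejecting `--hq` combined with `--hq-url` is an assumption, since the specification does not say how the two interact.

```typescript
// Hypothetical option shape; fields mirror the flags listed above.
interface CliOptions {
  hq?: boolean;
  hqUrl?: string;
  name?: string;
  username?: string;
  password?: string;
}

type ServerMode = "normal" | "hq" | "remote";

interface ResolvedConfig {
  mode: ServerMode;
  username: string | undefined;
  password: string | undefined;
}

function resolveMode(
  opts: CliOptions,
  env: Record<string, string | undefined> = {},
): ResolvedConfig {
  // Command line arguments override the environment variables.
  const username = opts.username ?? env["VIBETUNNEL_USERNAME"];
  const password = opts.password ?? env["VIBETUNNEL_PASSWORD"];
  if ((username === undefined) !== (password === undefined)) {
    throw new Error("username and password must be provided together");
  }
  if (opts.hq === true && opts.hqUrl !== undefined) {
    // Assumption: the specification does not define this combination.
    throw new Error("--hq and --hq-url cannot be combined");
  }
  if (opts.hqUrl !== undefined && (opts.name === undefined || opts.name === "")) {
    throw new Error("--name is required with --hq-url");
  }
  const mode: ServerMode =
    opts.hq === true ? "hq" : opts.hqUrl !== undefined ? "remote" : "normal";
  return { mode, username, password };
}
```

For example, `resolveMode({ hqUrl: "https://hq.example.com", name: "r1" })` yields remote mode, while passing only a username throws.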

Authentication Flow

Basic Authentication

  • Standard HTTP Basic Auth: Authorization: Basic base64(username:password)
  • Used by clients to authenticate with any server
  • Used by remote servers to authenticate with HQ during registration

Bearer Token Authentication

  • Format: Authorization: Bearer <token>
  • Remote servers generate a unique token (UUID v4) during registration
  • HQ uses this token for all API calls to the remote
  • Remote servers accept both Basic Auth and Bearer token

Authentication Middleware

  1. Skip auth if not configured (no username/password and not in remote mode)
  2. Skip auth for WebSocket upgrade requests (handled separately)
  3. Check Bearer token first if server is in remote mode
  4. Fall back to Basic Auth check
  5. Return 401 Unauthorized with WWW-Authenticate: Basic realm="VibeTunnel" if failed
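The five middleware steps can be sketched as a pure decision function. This is an illustration under stated assumptions rather than the server's implementation: `AuthConfig`, `AuthRequest`, and `checkAuth` are hypothetical names, remote mode is modelled as "a bearer token is configured", and credentials are assumed to be ASCII so that the global `btoa` can encode them.

```typescript
interface AuthConfig {
  username?: string;    // Basic Auth credentials, if configured
  password?: string;
  bearerToken?: string; // Present only in remote mode (assumption of this sketch)
}

interface AuthRequest {
  authorization?: string; // Raw Authorization header value, if any
  isWebSocketUpgrade: boolean;
}

type AuthResult =
  | { ok: true }
  | { ok: false; status: 401; headers: { "WWW-Authenticate": string } };

function checkAuth(req: AuthRequest, cfg: AuthConfig): AuthResult {
  const basicConfigured = cfg.username !== undefined && cfg.password !== undefined;
  // 1. Skip auth if nothing is configured and the server is not a remote.
  if (!basicConfigured && cfg.bearerToken === undefined) return { ok: true };
  // 2. WebSocket upgrade requests are handled separately.
  if (req.isWebSocketUpgrade) return { ok: true };
  const header = req.authorization ?? "";
  // 3. Check the Bearer token first when in remote mode.
  if (cfg.bearerToken !== undefined && header === `Bearer ${cfg.bearerToken}`) {
    return { ok: true };
  }
  // 4. Fall back to Basic Auth.
  if (basicConfigured) {
    const expected = `Basic ${btoa(`${cfg.username}:${cfg.password}`)}`;
    if (header === expected) return { ok: true };
  }
  // 5. Reject with a Basic challenge.
  return {
    ok: false,
    status: 401,
    headers: { "WWW-Authenticate": 'Basic realm="VibeTunnel"' },
  };
}
```

A production implementation would likely compare credentials in constant time; plain string equality is used here only to keep the sketch short.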

Session Management

Session States

  • starting - Session is being created
  • running - Session is active
  • exited - Session has terminated

Session Data Structure

interface Session {
  id: string;              // UUID v4
  name: string;            // User-friendly name
  command: string;         // Command line as string
  workingDir: string;      // Working directory path
  status: string;          // Session state
  exitCode?: number;       // Exit code if exited
  startedAt: string;       // ISO 8601 timestamp
  lastModified: string;    // ISO 8601 timestamp
  pid?: number;            // Process ID
  waiting?: boolean;       // If waiting for input
  remoteName?: string;     // Name of remote server (HQ mode)
}

PTY Service

The PTY service manages the actual terminal processes using node-pty:

interface PtyConfig {
  implementation: 'node-pty';
  controlPath: string;     // Base directory for session data
}

Key responsibilities:

  • Create PTY sessions with specified dimensions
  • Manage session lifecycle (create, kill, cleanup)
  • Handle input/output to/from PTY
  • Resize terminal dimensions
  • Track session metadata

Terminal Management

The Terminal Manager maintains server-side terminal state for efficient buffer synchronization:

Terminal State

interface TerminalState {
  cols: number;           // Terminal width
  rows: number;           // Terminal height
  buffer: string[][];     // 2D array of [char, style] pairs
  cursor: {
    x: number;
    y: number;
    visible: boolean;
  };
  scrollback: string[][]; // Historical lines
  title: string;          // Terminal title
  applicationKeypad: boolean;
  applicationCursor: boolean;
  bracketedPasteMode: boolean;
  origin: boolean;
  reverseWraparound: boolean;
  wraparound: boolean;
  insertMode: boolean;
}

Buffer Management

  1. Parse ANSI escape sequences from PTY output
  2. Update terminal state based on control sequences
  3. Maintain accurate buffer representation
  4. Support terminal operations:
    • Cursor movement
    • Text insertion/deletion
    • Screen clearing
    • Scrolling
    • Style changes

Binary Buffer Protocol

The binary buffer protocol provides efficient terminal state synchronization:

Snapshot Encoding

[4 bytes: magic "SNAP"]
[4 bytes: version (1)]
[4 bytes: cols]
[4 bytes: rows]
[4 bytes: cursor X]
[4 bytes: cursor Y]
[1 byte: cursor visible]
[4 bytes: scrollback length]
[scrollback data...]
[4 bytes: buffer length]
[buffer data...]
[4 bytes: title length]
[title UTF-8 bytes]
[1 byte: flags]
  - bit 0: applicationKeypad
  - bit 1: applicationCursor
  - bit 2: bracketedPasteMode
  - bit 3: origin
  - bit 4: reverseWraparound
  - bit 5: wraparound
  - bit 6: insertMode
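As an illustration of the layout above, the sketch below packs the trailing flags byte and writes the fixed 25-byte prefix (magic through cursor visibility). The function names are hypothetical. Integers are written as unsigned 32-bit little-endian values; little-endian follows the Compatibility notes, while treating the fields as unsigned is an assumption.

```typescript
// Flag names in bit order (bit 0 first), as listed in the snapshot layout.
const FLAG_NAMES = [
  "applicationKeypad",
  "applicationCursor",
  "bracketedPasteMode",
  "origin",
  "reverseWraparound",
  "wraparound",
  "insertMode",
] as const;

type FlagName = (typeof FLAG_NAMES)[number];
type TerminalFlags = Record<FlagName, boolean>;

function packFlags(flags: TerminalFlags): number {
  let byte = 0;
  FLAG_NAMES.forEach((name, bit) => {
    if (flags[name]) byte |= 1 << bit;
  });
  return byte;
}

function unpackFlags(byte: number): TerminalFlags {
  const out = {} as TerminalFlags;
  FLAG_NAMES.forEach((name, bit) => {
    out[name] = (byte & (1 << bit)) !== 0;
  });
  return out;
}

// Writes magic, version, cols, rows, cursor X, cursor Y, cursor visible.
function encodeSnapshotPrefix(
  cols: number,
  rows: number,
  cursorX: number,
  cursorY: number,
  cursorVisible: boolean,
): Uint8Array {
  const bytes = new Uint8Array(25);
  const view = new DataView(bytes.buffer);
  bytes.set([0x53, 0x4e, 0x41, 0x50], 0); // ASCII "SNAP"
  view.setUint32(4, 1, true);             // version
  view.setUint32(8, cols, true);
  view.setUint32(12, rows, true);
  view.setUint32(16, cursorX, true);
  view.setUint32(20, cursorY, true);
  view.setUint8(24, cursorVisible ? 1 : 0);
  return bytes;
}
```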

Line Encoding

Each line is encoded as:

[4 bytes: line length]
[4 bytes: number of cells]
[cell data...]

Cell Encoding

Each cell is encoded as:

[4 bytes: character UTF-8 length]
[character UTF-8 bytes]
[4 bytes: style]

Style is a 32-bit integer:

  • Bits 0-7: Foreground color (256 colors)
  • Bits 8-15: Background color (256 colors)
  • Bit 16: Bold
  • Bit 17: Italic
  • Bit 18: Underline
  • Bit 19: Blink
  • Bit 20: Inverse
  • Bit 21: Hidden
  • Bit 22: Strikethrough
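The style word and cell layout can be sketched as follows. `CellStyle`, `packStyle`, and `encodeCell` are hypothetical names introduced for illustration, and the little-endian byte order follows the Compatibility notes.

```typescript
interface CellStyle {
  fg: number; // 0-255
  bg: number; // 0-255
  bold?: boolean;
  italic?: boolean;
  underline?: boolean;
  blink?: boolean;
  inverse?: boolean;
  hidden?: boolean;
  strikethrough?: boolean;
}

type AttributeName = Exclude<keyof CellStyle, "fg" | "bg">;

// Attribute bit positions from the list above.
const ATTRIBUTE_BITS: ReadonlyArray<[AttributeName, number]> = [
  ["bold", 16],
  ["italic", 17],
  ["underline", 18],
  ["blink", 19],
  ["inverse", 20],
  ["hidden", 21],
  ["strikethrough", 22],
];

function packStyle(style: CellStyle): number {
  let value = (style.fg & 0xff) | ((style.bg & 0xff) << 8);
  for (const [name, bit] of ATTRIBUTE_BITS) {
    if (style[name]) value |= 1 << bit;
  }
  return value >>> 0; // keep the result an unsigned 32-bit integer
}

// [4 bytes: UTF-8 length][UTF-8 bytes][4 bytes: style]
function encodeCell(char: string, style: number): Uint8Array {
  const utf8 = new TextEncoder().encode(char);
  const out = new Uint8Array(4 + utf8.length + 4);
  const view = new DataView(out.buffer);
  view.setUint32(0, utf8.length, true);
  out.set(utf8, 4);
  view.setUint32(4 + utf8.length, style, true);
  return out;
}
```

A bold "A" with foreground 2 and background 1 packs to style 65794 (0x00010102) and encodes to 9 bytes.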

Stream Format

Session output is stored in asciicast v2 format:

Header

{
  "version": 2,
  "width": 80,
  "height": 24,
  "timestamp": 1234567890,
  "env": {"TERM": "xterm-256color"}
}

Events

Each subsequent line is an event:

[timestamp, type, data]

Types:

  • "o" - Output data (UTF-8 string)
  • "i" - Input data (UTF-8 string)
  • "r" - Resize event (e.g., "80x24")
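Because the stream is newline-delimited JSON (one header object followed by one array per event), reading and writing it needs little more than `JSON.stringify` and `JSON.parse`. The sketch below shows one way to do this; `serializeStream` and `parseStream` are hypothetical helper names.

```typescript
interface AsciicastHeader {
  version: 2;
  width: number;
  height: number;
  timestamp: number;
  env?: Record<string, string>;
}

type StreamEventType = "o" | "i" | "r";
type StreamEvent = [number, StreamEventType, string];

// One JSON value per line: the header first, then each event.
function serializeStream(header: AsciicastHeader, events: StreamEvent[]): string {
  const lines = [JSON.stringify(header), ...events.map((e) => JSON.stringify(e))];
  return lines.join("\n") + "\n";
}

function parseStream(text: string): { header: AsciicastHeader; events: StreamEvent[] } {
  const lines = text.split("\n").filter((line) => line.length > 0);
  const [first, ...rest] = lines;
  if (first === undefined) throw new Error("empty stream");
  return {
    header: JSON.parse(first) as AsciicastHeader,
    events: rest.map((line) => JSON.parse(line) as StreamEvent),
  };
}
```

Since stream files are append-only, a writer can emit the header once and then append one serialized event line per PTY read.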

API Endpoints

Health Check

GET /api/health
Response: {"status": "ok", "timestamp": "2024-01-01T00:00:00.000Z"}

Session Management

List Sessions

GET /api/sessions
Response: Session[]

In HQ mode, aggregates sessions from all registered remotes.

Create Session

POST /api/sessions
Body: {
  "command": ["bash", "-l"],
  "workingDir": "/home/user",
  "name": "My Session",
  "remoteId": "remote-uuid"  // Optional, HQ mode only
}
Response: {"sessionId": "uuid"}

Get Session Info

GET /api/sessions/:sessionId
Response: Session

Kill Session

DELETE /api/sessions/:sessionId
Response: {"success": true, "message": "Session killed"}

Cleanup Session

DELETE /api/sessions/:sessionId/cleanup
Response: {"success": true, "message": "Session cleaned up"}

Cleanup All Exited

POST /api/cleanup-exited
Response: {
  "success": true,
  "message": "N exited sessions cleaned up across all servers",
  "localCleaned": 5,
  "remoteResults": [
    {"remoteName": "server1", "cleaned": 3},
    {"remoteName": "server2", "cleaned": 2, "error": "timeout"}
  ]
}

Terminal I/O

Stream Session Output (SSE)

GET /api/sessions/:sessionId/stream
Response: Server-Sent Events stream

Event format:

event: output
data: {"data": "terminal output...", "timestamp": 1234567890}

event: exit
data: {"exitCode": 0}
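A client that does not use a built-in `EventSource` has to split the response into events itself. The following sketch parses a chunk of text that is assumed to contain only complete events; `parseSseChunk` is a hypothetical helper, and buffering of partial events across network reads is left out.

```typescript
interface SseEvent {
  event: string;
  data: unknown; // Parsed JSON payload
}

function parseSseChunk(chunk: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of chunk.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no "event:" field is present
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice("data:".length).trim());
      }
    }
    if (dataLines.length > 0) {
      events.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}
```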

Get Session Snapshot

GET /api/sessions/:sessionId/snapshot
Response: Optimized asciicast v2 format (text/plain)

Returns events after the last clear screen command.
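One possible reading of this rule is sketched below. It assumes that "clear screen" means the ANSI Erase in Display sequence `ESC [ 2 J`; the specification does not name the exact sequence, and `eventsAfterLastClear` is a hypothetical helper. The sketch keeps the clear sequence itself so that a replay starts from an empty screen.

```typescript
type StreamEvent = [number, "o" | "i" | "r", string];

// Assumption: ED 2 is the clear-screen command being referred to.
const CLEAR_SCREEN = "\x1b[2J";

function eventsAfterLastClear(events: StreamEvent[]): StreamEvent[] {
  // Scan backwards for the most recent output event containing a clear.
  for (let i = events.length - 1; i >= 0; i--) {
    const ev = events[i];
    if (ev === undefined) continue;
    const [time, type, data] = ev;
    const at = type === "o" ? data.lastIndexOf(CLEAR_SCREEN) : -1;
    if (at >= 0) {
      // Drop everything before the clear, including earlier text in the same event.
      return [[time, type, data.slice(at)], ...events.slice(i + 1)];
    }
  }
  return events.slice(); // No clear found: return every event
}
```

A fuller implementation would probably also carry forward the most recent resize event that precedes the clear, so the replay uses the right dimensions.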

Get Buffer Stats

GET /api/sessions/:sessionId/buffer/stats
Response: {
  "lines": 100,
  "cells": 8000,
  "scrollbackLines": 500,
  "lastModified": "2024-01-01T00:00:00.000Z"
}

Get Buffer

GET /api/sessions/:sessionId/buffer?format=binary
Response: Binary encoded buffer (application/octet-stream)

GET /api/sessions/:sessionId/buffer?format=json
Response: JSON representation of terminal state

Send Input

POST /api/sessions/:sessionId/input
Body: {"text": "ls -la\n"}
Response: {"success": true}

Special keys:

  • "arrow_up", "arrow_down", "arrow_left", "arrow_right"
  • "escape", "enter", "ctrl_enter", "shift_enter"
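The specification lists the special key names but not the bytes written to the PTY. The sketch below assumes the key name arrives in the `text` field and maps the unambiguous keys to their conventional xterm sequences. Both points are assumptions, and `resolveInput` is a hypothetical helper. No sequence is assumed for `ctrl_enter` or `shift_enter`, because terminals differ in how they encode them.

```typescript
// Assumption: conventional xterm/VT100 sequences for the unambiguous keys.
const KEY_SEQUENCES = new Map<string, string>([
  ["arrow_up", "\x1b[A"],
  ["arrow_down", "\x1b[B"],
  ["arrow_right", "\x1b[C"],
  ["arrow_left", "\x1b[D"],
  ["escape", "\x1b"],
  ["enter", "\r"],
]);

// Listed by the specification, but left unmapped in this sketch.
const UNMAPPED_KEYS = new Set<string>(["ctrl_enter", "shift_enter"]);

function resolveInput(text: string): string {
  if (UNMAPPED_KEYS.has(text)) {
    throw new Error(`no byte sequence is assumed for "${text}" in this sketch`);
  }
  // Anything that is not a special key name is sent to the PTY verbatim.
  return KEY_SEQUENCES.get(text) ?? text;
}
```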

Resize Terminal

POST /api/sessions/:sessionId/resize
Body: {"cols": 120, "rows": 40}
Response: {"success": true, "cols": 120, "rows": 40}

File System

Browse Directory

GET /api/fs/browse?path=/home/user
Response: {
  "absolutePath": "/home/user",
  "files": [
    {
      "name": "document.txt",
      "created": "2024-01-01T00:00:00.000Z",
      "lastModified": "2024-01-01T00:00:00.000Z",
      "size": 1024,
      "isDir": false
    }
  ]
}

Create Directory

POST /api/mkdir
Body: {"path": "/home/user", "name": "newfolder"}
Response: {
  "success": true,
  "path": "/home/user/newfolder",
  "message": "Directory 'newfolder' created successfully"
}

HQ Mode Endpoints

Register Remote

POST /api/remotes/register
Headers: Authorization: Basic <hq-credentials>
Body: {
  "id": "remote-uuid",
  "name": "unique-remote-name",
  "url": "http://remote-server:4020",
  "token": "bearer-token-uuid"
}
Response: {
  "success": true,
  "remote": {"id": "remote-uuid", "name": "unique-remote-name"}
}

Unregister Remote

DELETE /api/remotes/:remoteId
Headers: Authorization: Basic <hq-credentials>
Response: {"success": true}

List Remotes

GET /api/remotes
Response: [
  {
    "id": "remote-uuid",
    "name": "unique-remote-name",
    "url": "http://remote-server:4020",
    "sessionCount": 5,
    "lastHeartbeat": "2024-01-01T00:00:00.000Z"
  }
]

WebSocket Protocols

Buffer Synchronization WebSocket

Endpoint: /buffers

Client → Server Messages

Subscribe to session:

{"type": "subscribe", "sessionId": "session-uuid"}

Unsubscribe from session:

{"type": "unsubscribe", "sessionId": "session-uuid"}

Heartbeat response:

{"type": "pong"}
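A server-side handler for these three client messages might look like the sketch below. `handleClientMessage` is a hypothetical name and the error strings are illustrative; the reply uses the error shape documented under Server → Client Messages.

```typescript
interface ErrorMessage {
  type: "error";
  message: string;
}

// Applies a client message to the connection's subscription set.
// Returns an error message to send back, or null when no reply is needed.
function handleClientMessage(raw: string, subscriptions: Set<string>): ErrorMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { type: "error", message: "Invalid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { type: "error", message: "Message must be a JSON object" };
  }
  const msg = parsed as { type?: unknown; sessionId?: unknown };
  switch (msg.type) {
    case "subscribe":
    case "unsubscribe":
      if (typeof msg.sessionId !== "string") {
        return { type: "error", message: "Missing sessionId" };
      }
      if (msg.type === "subscribe") subscriptions.add(msg.sessionId);
      else subscriptions.delete(msg.sessionId);
      return null;
    case "pong":
      return null; // Heartbeat acknowledged; nothing to send
    default:
      return { type: "error", message: "Unknown message type" };
  }
}
```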

Server → Client Messages

Heartbeat:

{"type": "ping"}

Error:

{"type": "error", "message": "Error description"}

Binary buffer update:

[1 byte: 0xBF magic byte]
[4 bytes: session ID length (little-endian)]
[N bytes: session ID UTF-8]
[M bytes: encoded buffer snapshot]
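The frame layout above can be encoded and decoded with a `DataView`. The sketch below treats the snapshot as opaque bytes; `encodeBufferUpdate` and `decodeBufferUpdate` are hypothetical names.

```typescript
const BUFFER_MAGIC = 0xbf;

function encodeBufferUpdate(sessionId: string, snapshot: Uint8Array): Uint8Array {
  const id = new TextEncoder().encode(sessionId);
  const frame = new Uint8Array(1 + 4 + id.length + snapshot.length);
  frame[0] = BUFFER_MAGIC;
  new DataView(frame.buffer).setUint32(1, id.length, true); // little-endian
  frame.set(id, 5);
  frame.set(snapshot, 5 + id.length);
  return frame;
}

function decodeBufferUpdate(frame: Uint8Array): { sessionId: string; snapshot: Uint8Array } {
  if (frame.length < 5 || frame[0] !== BUFFER_MAGIC) {
    throw new Error("not a buffer update frame");
  }
  // Respect byteOffset in case the frame is a view into a larger buffer.
  const view = new DataView(frame.buffer, frame.byteOffset, frame.byteLength);
  const idLength = view.getUint32(1, true);
  if (5 + idLength > frame.length) {
    throw new Error("truncated session ID");
  }
  const sessionId = new TextDecoder().decode(frame.subarray(5, 5 + idLength));
  return { sessionId, snapshot: frame.subarray(5 + idLength) };
}
```

The magic byte lets a client tell binary buffer updates apart from any other binary traffic on the same socket before it reads the length field.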

HQ Mode Architecture

Remote Registration

  1. Remote server starts with --hq-url, --hq-username, --hq-password, --name
  2. Remote generates unique ID (UUID v4) and token (UUID v4)
  3. Remote sends POST to /api/remotes/register with Basic Auth
  4. HQ validates unique name and stores remote info
  5. Remote registration is complete
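Steps 2 and 3 amount to building a single authenticated POST. The sketch below constructs that request without sending it. `RegistrationOptions` and `buildRegistrationRequest` are hypothetical names, the `id` and `token` are passed in by the caller (for example from `crypto.randomUUID()`), and the `Content-Type` header is an assumption because the endpoint description does not list it.

```typescript
interface RegistrationOptions {
  hqUrl: string;      // --hq-url
  hqUsername: string; // --hq-username
  hqPassword: string; // --hq-password
  name: string;       // --name
  remoteUrl: string;  // URL at which HQ can reach this remote
  id: string;         // UUID v4 generated by the remote
  token: string;      // UUID v4 that HQ will later send as a Bearer token
}

interface HttpRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

function buildRegistrationRequest(opts: RegistrationOptions): HttpRequest {
  // Assumes ASCII credentials, since btoa only accepts Latin-1 input.
  const credentials = btoa(`${opts.hqUsername}:${opts.hqPassword}`);
  return {
    url: opts.hqUrl.replace(/\/+$/, "") + "/api/remotes/register",
    method: "POST",
    headers: {
      Authorization: `Basic ${credentials}`,
      "Content-Type": "application/json", // assumption: JSON request body
    },
    body: JSON.stringify({
      id: opts.id,
      name: opts.name,
      url: opts.remoteUrl,
      token: opts.token,
    }),
  };
}
```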

Health Checking

  1. HQ checks each remote every 15 seconds
  2. First tries GET /api/health with Bearer token
  3. Falls back to GET /api/sessions if health endpoint not found
  4. Updates session tracking from sessions response
  5. Removes remote if health check fails

Request Proxying

  1. Session proxy middleware intercepts requests with session IDs
  2. Looks up which remote owns the session
  3. Forwards request to remote with Bearer token auth
  4. Returns remote's response to client
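The lookup and forwarding steps can be sketched as a function that maps an incoming request to the outgoing one. `RemoteInfo`, `ProxyRequest`, and `buildProxyRequest` are hypothetical names, and the session ID is assumed to be the path segment after `/api/sessions/`. The 30-second timeout is taken from the Performance Considerations section.

```typescript
interface RemoteInfo {
  name: string;
  url: string;
  token: string;
}

interface ProxyRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  timeoutMs: number;
}

// Returns the request to forward, or null when the session is not owned
// by a remote and should be handled locally.
function buildProxyRequest(
  method: string,
  path: string,
  sessionOwners: Map<string, RemoteInfo>,
): ProxyRequest | null {
  const sessionId = /^\/api\/sessions\/([^/?]+)/.exec(path)?.[1];
  if (sessionId === undefined) return null;
  const remote = sessionOwners.get(sessionId);
  if (remote === undefined) return null;
  return {
    url: remote.url.replace(/\/+$/, "") + path,
    method,
    headers: { Authorization: `Bearer ${remote.token}` },
    timeoutMs: 30000,
  };
}
```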

Session Aggregation

  1. GET /api/sessions in HQ mode fetches from all remotes
  2. Adds remoteName field to each session
  3. Tracks session ownership for future proxying
  4. Returns combined list sorted by last modified
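The aggregation steps reduce to a merge, a tag, and a sort. In the sketch below `aggregateSessions` is a hypothetical helper, only the session fields needed for the merge are declared, and "sorted by last modified" is assumed to mean newest first, which the specification does not state.

```typescript
// Subset of the Session structure that aggregation needs.
interface SessionSummary {
  id: string;
  lastModified: string; // ISO 8601 timestamp
  remoteName?: string;
}

interface RemoteSessions {
  name: string;
  sessions: SessionSummary[];
}

function aggregateSessions(
  local: SessionSummary[],
  remotes: RemoteSessions[],
  sessionOwners: Map<string, string>, // session ID -> remote name
): SessionSummary[] {
  const combined: SessionSummary[] = [...local];
  for (const remote of remotes) {
    for (const session of remote.sessions) {
      sessionOwners.set(session.id, remote.name); // remember for later proxying
      combined.push({ ...session, remoteName: remote.name });
    }
  }
  // Assumption: newest first.
  return combined.sort(
    (a, b) => Date.parse(b.lastModified) - Date.parse(a.lastModified),
  );
}
```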

File System Structure

~/.vibetunnel/control/
├── {session-id}/
│   ├── info.json       # Session metadata
│   ├── stream-out      # Asciicast v2 format output
│   └── stream-in       # Input log (optional)

info.json Structure

{
  "version": 1,
  "session_id": "uuid",
  "name": "Session Name",
  "cmdline": ["bash", "-l"],
  "cwd": "/home/user",
  "env": {},
  "term": "xterm-256color",
  "width": 80,
  "height": 24,
  "started_at": "2024-01-01T00:00:00.000Z",
  "pid": 12345,
  "status": "running",
  "exit_code": null
}
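The on-disk `info.json` uses snake_case fields and a `cmdline` array, while the API `Session` structure uses camelCase and a `command` string. The specification does not define the conversion, so the mapping below is an assumption made for illustration: `sessionFromInfo` is a hypothetical helper, the command is produced by joining `cmdline` with spaces, and `lastModified` is supplied by the caller (for example from a file modification time).

```typescript
interface SessionInfoFile {
  version: number;
  session_id: string;
  name: string;
  cmdline: string[];
  cwd: string;
  started_at: string;
  pid?: number;
  status: string;
  exit_code: number | null;
}

interface Session {
  id: string;
  name: string;
  command: string;
  workingDir: string;
  status: string;
  exitCode?: number;
  startedAt: string;
  lastModified: string;
  pid?: number;
}

function sessionFromInfo(info: SessionInfoFile, lastModified: string): Session {
  const session: Session = {
    id: info.session_id,
    name: info.name,
    command: info.cmdline.join(" "), // assumption: space-joined command line
    workingDir: info.cwd,
    status: info.status,
    startedAt: info.started_at,
    lastModified,
  };
  // Optional fields are set only when the file provides a value.
  if (info.exit_code !== null) session.exitCode = info.exit_code;
  if (info.pid !== undefined) session.pid = info.pid;
  return session;
}
```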

Implementation Notes

Error Handling

  • All endpoints should return appropriate HTTP status codes
  • Error responses should include {"error": "Description"}
  • On a WebSocket error, the server should send an error message before closing the connection

Security Considerations

  • The HQ URL must use HTTPS
  • Tokens should be cryptographically random (UUID v4)
  • File system access is restricted to the home directory and the temp directory
  • Input validation on all user-provided paths
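The path restriction can be enforced by normalizing the requested path and checking it against a list of allowed roots. The sketch below works on POSIX-style paths only and is purely lexical: it does not resolve symbolic links, which a real server would also need to do. `normalizePosix` and `isPathAllowed` are hypothetical helpers, and the roots in the usage example are placeholders.

```typescript
// Collapses ".", "..", and repeated slashes in an absolute POSIX path.
function normalizePosix(path: string): string {
  const parts: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") parts.pop();
    else parts.push(part);
  }
  return "/" + parts.join("/");
}

function isPathAllowed(path: string, allowedRoots: string[]): boolean {
  if (!path.startsWith("/")) return false; // reject relative paths outright
  const normalized = normalizePosix(path);
  return allowedRoots.some((root) => {
    const base = normalizePosix(root);
    if (base === "/") return true;
    // Match the root itself or anything beneath it, but not a sibling
    // directory that merely shares a prefix (such as /home/username).
    return normalized === base || normalized.startsWith(base + "/");
  });
}
```

With roots `["/home/user", "/tmp"]`, the path `/home/user/docs` is accepted, while `/home/user/../other` and `/home/username` are rejected.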

Performance Considerations

  • Stream files are append-only for efficiency
  • Binary buffer protocol minimizes data transfer
  • Health checks have 5-second timeout
  • Proxy requests have 30-second timeout
  • Buffer updates are debounced to avoid flooding

Compatibility

  • UTF-8 encoding throughout
  • Little-endian byte order for binary protocol
  • ISO 8601 timestamps in UTC
  • Line endings normalized to LF (\n)