vibetunnel/docs/cjk-ime-input.md
Tao Xu 31e48b6674
Fix CJK IME issues: language detection, visible input, performance optimizations (#495)
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2025-08-05 02:34:24 +02:00

15 KiB
Raw Blame History

VibeTunnel CJK IME Input Implementation

Overview

VibeTunnel provides comprehensive Chinese, Japanese, and Korean (CJK) Input Method Editor (IME) support across both desktop and mobile platforms. The implementation uses platform-specific approaches to ensure optimal user experience:

  • Desktop: Invisible input element with native browser IME integration
  • Mobile: Native virtual keyboard with direct input handling

Architecture

Core Components

SessionView
├── InputManager (Main input coordination layer)
│   ├── Platform detection (mobile vs desktop)
│   ├── DesktopIMEInput component integration (desktop only)
│   ├── Keyboard input handling
│   ├── WebSocket/HTTP input routing
│   └── Terminal cursor position access
├── DesktopIMEInput (Desktop-specific IME component)
│   ├── Invisible input element creation
│   ├── IME composition event handling
│   ├── Global paste handling
│   ├── Dynamic cursor positioning
│   └── Focus management
├── DirectKeyboardManager (Mobile input handling)
│   ├── Native virtual keyboard integration
│   ├── Direct input processing
│   └── Quick keys toolbar
├── LifecycleEventManager (Event interception & coordination)
└── Terminal Components (Cursor position providers)

Implementation Details

Cursor Position Tracking

File: cursor-position.ts

The cursor position tracking system uses a shared utility function that works consistently across both terminal types (XTerm.js and binary buffer modes):

Coordinate System

export function calculateCursorPosition(
  cursorX: number,        // 0-based column position
  cursorY: number,        // 0-based row position  
  fontSize: number,       // Terminal font size in pixels
  container: Element,     // Terminal container element
  sessionStatus: string   // Session status for validation
): { x: number; y: number } | null

Position Calculation Process

  1. Character Measurement: Dynamically measures actual character width using font metrics
  2. Absolute Positioning: Calculates page-absolute cursor coordinates
  3. Container Relative: Converts to position relative to #session-terminal container
  4. IME Positioning: Returns coordinates suitable for IME input placement

Terminal Type Support

  • XTerm Terminal (vibe-terminal): Uses terminal.buffer.active.cursorX/Y from XTerm.js
  • Binary Terminal (vibe-terminal-binary): Uses buffer.cursorX/Y from WebSocket buffer data

Key Features

  • Precise Alignment: Accounts for exact character width and line height
  • Container Aware: Handles side panels and complex layouts
  • Font Responsive: Adapts to different font sizes and families
  • Platform Consistent: Same calculation logic across all terminal types

Error Handling

The function includes comprehensive error handling and graceful fallbacks:

  • Returns null when session is not running
  • Returns null when container element is not found
  • Returns null when character measurement fails
  • Falls back to absolute coordinates if session container is missing

Platform Detection

File: mobile-utils.ts

VibeTunnel automatically detects the platform and chooses the appropriate IME strategy:

export function detectMobile(): boolean {
  return /Android|webOS|iPhone|iPad|iPod|BlackBerry|IEMobile|Opera Mini/i.test(
    navigator.userAgent
  );
}

Desktop Implementation

1. DesktopIMEInput Component

File: ime-input.ts:32-382

A dedicated component for desktop browsers that creates and manages an invisible input element:

  • Positioned dynamically at terminal cursor location
  • Completely invisible (opacity: 0, 1px x 1px, pointerEvents: none)
  • Handles all CJK composition events through standard DOM APIs
  • Placeholder: "CJK Input"
  • Auto-focus with retention mechanism to prevent focus loss
  • Clean lifecycle management with proper cleanup

2. Desktop Input Manager Integration

File: input-manager.ts:71-129

The InputManager detects platform and creates the appropriate IME component:

private setupIMEInput(): void {
  // Skip IME input setup on mobile devices (they use native keyboard)
  if (detectMobile()) {
    logger.log('Skipping IME input setup on mobile device');
    return;
  }

  // Create desktop IME input component
  this.imeInput = new DesktopIMEInput({
    container: terminalContainer,
    onTextInput: (text: string) => this.sendInputText(text),
    onSpecialKey: (key: string) => this.sendInput(key),
    getCursorInfo: () => {
      // Dynamic cursor position calculation
      const cursorInfo = terminalElement.getCursorInfo();
      const pixelX = terminalRect.left - containerRect.left + cursorX * charWidth;
      const pixelY = terminalRect.top - containerRect.top + cursorY * lineHeight + lineHeight;
      return { x: pixelX, y: pixelY };
    }
  });
}

3. Desktop Focus Retention

File: ime-input.ts:317-343

Desktop IME requires special focus handling to prevent losing focus during composition:

private startFocusRetention(): void {
  // Skip in test environment to avoid infinite loops
  if (typeof process !== 'undefined' && process.env?.NODE_ENV === 'test') {
    return;
  }
  
  this.focusRetentionInterval = setInterval(() => {
    if (document.activeElement !== this.input) {
      this.input.focus();
    }
  }, 100);
}

Mobile Implementation

1. Direct Keyboard Manager

File: direct-keyboard-manager.ts

Mobile devices use the native virtual keyboard with a visible input field:

  • Standard HTML input element (not hidden)
  • Native virtual keyboard with CJK support
  • Quick keys toolbar for common terminal operations
  • No special IME handling needed (OS provides it)

2. Mobile Input Flow

Files: session-view.ts, lifecycle-event-manager.ts

Mobile input handling follows a different flow:

  1. User taps terminal area
  2. Native virtual keyboard appears with CJK support
  3. User types or selects from IME candidates
  4. Input is sent directly to terminal
  5. No invisible elements or composition tracking needed

Platform Differences

Key Implementation Differences

Aspect Desktop Mobile
Input Element Invisible 1px × 1px input Visible standard input field
IME Handling Custom composition events Native OS keyboard
Positioning Follows terminal cursor Fixed position or overlay
Focus Management Active focus retention Standard focus behavior
Keyboard Physical + software IME Virtual keyboard with IME
Integration Completely transparent Visible UI component
Performance Minimal overhead Standard input performance

Technical Architecture Differences

Desktop Implementation

// Creates invisible input at cursor position
const input = document.createElement('input');
input.style.opacity = '0';
input.style.width = '1px';
input.style.height = '1px';
input.style.pointerEvents = 'none';

// Handles IME composition events
input.addEventListener('compositionstart', handleStart);
input.addEventListener('compositionend', handleEnd);

// Positions at terminal cursor
input.style.left = `${cursorX}px`;
input.style.top = `${cursorY}px`;

Mobile Implementation

// Uses DirectKeyboardManager with visible input
const input = document.createElement('input');
input.type = 'text';
input.placeholder = 'Type here...';
// Standard visible input - no special IME handling needed

// OS handles IME automatically through virtual keyboard
// No composition event handling required

User Experience Differences

Desktop Experience

  • Seamless: No visible UI changes
  • Cursor following: IME popup appears at terminal cursor
  • Click to focus: Click anywhere in terminal area
  • Traditional: Works like native terminal IME
  • Paste support: Global paste handling anywhere in terminal

Mobile Experience

  • Touch-first: Designed for finger interaction
  • Visible input: Clear indication of where to type
  • Quick keys: Easy access to terminal-specific keys
  • Gesture support: Touch gestures and haptic feedback
  • Keyboard management: Handles virtual keyboard show/hide

Platform-Specific Features

Desktop Features

  • Dynamic cursor positioning: IME popup follows terminal cursor exactly
  • Global paste handling: Paste works anywhere in terminal area
  • Composition state tracking: Via data-ime-composing DOM attribute
  • Focus retention: Active mechanism prevents accidental focus loss
  • Invisible integration: Zero visual footprint for users
  • Performance optimized: Minimal resource usage when not composing

Mobile Features

  • Native virtual keyboard: Full OS-level CJK IME integration
  • Quick keys toolbar: Touch-friendly terminal keys (Tab, Esc, Ctrl, etc.)
  • Touch-optimized UI: Larger tap targets and touch gestures
  • Auto-capitalization control: Intelligently disabled for terminal accuracy
  • Viewport management: Graceful handling of keyboard show/hide animations
  • Direct input mode: Option to use hidden input for power users

User Experience

Desktop Workflow

User clicks terminal → Invisible input focuses → Types CJK → 
Browser shows IME candidates → User selects → Text appears in terminal

Mobile Workflow

User taps terminal → Virtual keyboard appears → Types CJK → 
OS shows IME candidates → User selects → Text appears in terminal

Visual Behavior

  • Desktop: Completely invisible, native IME popup at cursor position
  • Mobile: Standard input field with native virtual keyboard
  • Both platforms: Seamless CJK text input with full IME support

Performance

Resource Usage

  • Memory: <1KB (1 invisible DOM element + event listeners)
  • CPU: ~0.1ms per event (negligible overhead)
  • Impact on English users: None (actually improves paste reliability)

Optimization Features

  • Event handlers only active during IME usage
  • Dynamic positioning only calculated when needed
  • Minimal DOM footprint (single invisible input element)
  • Clean event delegation and lifecycle management
  • Automatic focus management with click-to-focus behavior
  • Proper cleanup prevents memory leaks during session changes

Code Reference

Primary Files

  • cursor-position.ts - Shared cursor position calculation
    • 14-20 - Main calculateCursorPosition() function signature
    • 32-46 - Character width measurement using test elements
    • 48-69 - Coordinate conversion (absolute → container-relative)
    • 70-72 - Error handling and cleanup
  • ime-input.ts - Desktop IME component implementation
    • 32-48 - DesktopIMEInput class definition
    • 50-80 - Invisible input element creation
    • 82-132 - Event listener setup (composition, paste, focus)
    • 134-156 - IME composition event handling
    • 317-343 - Focus retention mechanism
  • input-manager.ts - Input coordination and platform detection
    • 71-129 - Platform detection and IME setup
    • 131-144 - IME state checking during keyboard input
    • 453-458 - Cleanup and lifecycle management
  • direct-keyboard-manager.ts - Mobile keyboard handling
    • Complete mobile input implementation
  • mobile-utils.ts - Mobile detection utilities

Supporting Files

  • cursor-position.ts - Shared cursor position calculation utility
  • terminal.ts - XTerm cursor position API via getCursorInfo() (uses shared utility)
  • vibe-terminal-binary.ts - Binary terminal cursor position API (uses shared utility)
  • session-view.ts - Container element and terminal integration
  • lifecycle-event-manager.ts - Event coordination and interception
  • ime-constants.ts - IME-related key filtering utilities
  • terminal-constants.ts - Centralized terminal element IDs and selectors

Browser Compatibility

Works with all major browsers that support:

  • IME composition events (compositionstart, compositionupdate, compositionend)
  • Clipboard API for paste functionality
  • Standard DOM positioning APIs

Tested with:

  • Chrome, Firefox, Safari, Edge
  • macOS, Windows, Linux IME systems
  • Chinese (Simplified/Traditional), Japanese, Korean input methods

Configuration

Automatic Platform Detection

CJK IME support is automatically configured based on the detected platform:

  • Desktop: Invisible IME input with cursor following
  • Mobile: Native virtual keyboard with OS IME

Requirements

  1. User has CJK input method enabled in their OS
  2. Desktop: User clicks in terminal area to focus
  3. Mobile: User taps terminal or input field
  4. User switches to CJK input mode in their OS

Troubleshooting

Common Issues

  • IME candidates not showing: Ensure browser supports composition events
  • Text not appearing: Check if terminal session is active and receiving input
  • Paste not working: Verify clipboard permissions in browser

Debug Information

Comprehensive logging available in browser console:

  • 🔍 Setting up IME input on desktop device - Platform detection
  • [ime-input] - Desktop IME component events
  • [direct-keyboard-manager] - Mobile keyboard events
  • State tracking through DOM attributes:
    • data-ime-composing - IME composition active (desktop)
    • data-ime-input-focused - IME input has focus (desktop)
  • Mobile detection logs showing user agent analysis

Recent Improvements (v1.0.0-beta.16+)

Unified Cursor Position Tracking

  • Shared Utility: Created cursor-position.ts for consistent cursor calculation across all terminal types
  • Container-Aware Positioning: Fixed IME positioning issues with side panels and complex layouts
  • Precise Alignment: Improved character width measurement for pixel-perfect cursor alignment
  • Debug Logging: Enhanced debug output with comprehensive coordinate information

Technical Improvements

  • Code Deduplication: Eliminated ~120 lines of duplicate cursor calculation code
  • Maintainability: Single source of truth for cursor positioning logic
  • Type Safety: Improved TypeScript interfaces and error handling
  • Performance: More efficient coordinate conversion with optimized calculations

Element ID Centralization

  • Constants File: Created terminal-constants.ts to centralize all critical terminal element IDs
  • Prevention of Breakage: Changes to IDs like session-terminal, buffer-container, or terminal-container now only require updates in one location
  • Consistent References: All components now import TERMINAL_IDS constants instead of using hardcoded strings
  • Type Safety: Constants are strongly typed to prevent typos and ensure consistent usage across the codebase

Status: Production Ready
Platforms: Desktop (Windows, macOS, Linux) and Mobile (iOS, Android)
Version: VibeTunnel Web v1.0.0-beta.16+
Last Updated: 2025-08-02