# Peekaboo MCP Server

A macOS utility exposed via a Node.js MCP server for advanced screen captures, image analysis, and window management.

## 🚀 Installation & Setup

### Prerequisites

Before installing Peekaboo, ensure your system meets these requirements:

**System Requirements:**
- **macOS 12.0+** (Monterey or later)
- **Node.js 18.0+**
- **Swift 5.7+** (for building the native CLI)
- **Xcode Command Line Tools**

**Install Prerequisites:**
```bash
# Install Node.js (if not already installed)
brew install node

# Install Xcode Command Line Tools (if not already installed)
xcode-select --install
```

### Installation Methods

#### Method 1: NPM Installation (Recommended)

```bash
# Install globally for system-wide access
npm install -g peekaboo-mcp

# Or install locally in your project
npm install peekaboo-mcp
```

#### Method 2: From Source

```bash
# Clone the repository
git clone https://github.com/yourusername/peekaboo.git
cd peekaboo

# Install Node.js dependencies
npm install

# Build the TypeScript server
npm run build

# Build the Swift CLI component
cd swift-cli
swift build -c release

# Copy the binary to the project root
cp .build/release/peekaboo ../peekaboo

# Return to project root
cd ..

# Optional: Link for global access
npm link
```

### 🔧 Configuration

#### Environment Setup

Create a `.env` file in your project or set environment variables:

```bash
# AI Provider Configuration (Optional)
AI_PROVIDERS='[
  {
    "type": "ollama",
    "baseUrl": "http://localhost:11434",
    "model": "llava",
    "enabled": true
  },
  {
    "type": "openai",
    "apiKey": "your-openai-api-key",
    "model": "gpt-4-vision-preview",
    "enabled": false
  }
]'

# Logging Configuration
LOG_LEVEL="INFO"
PEEKABOO_LOG_FILE="/tmp/peekaboo-mcp.log"

# Optional: Custom paths for screenshots
PEEKABOO_DEFAULT_SAVE_PATH="~/Pictures/Screenshots"
```
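
To illustrate how a value like `AI_PROVIDERS` could be consumed, here is a minimal TypeScript sketch. The `AiProvider` interface and the `parseAiProviders` and `isEnabledProvider` helpers are hypothetical names inferred from the example above, not Peekaboo's actual implementation.

```typescript
// Hypothetical shape of one AI_PROVIDERS entry, inferred from the example above.
interface AiProvider {
  type: string;
  model: string;
  enabled: boolean;
  baseUrl?: string;
  apiKey?: string;
}

// True when the value looks like a provider entry and is switched on.
function isEnabledProvider(value: unknown): value is AiProvider {
  if (typeof value !== "object" || value === null) return false;
  const entry = value as Record<string, unknown>;
  return (
    typeof entry["type"] === "string" &&
    typeof entry["model"] === "string" &&
    entry["enabled"] === true
  );
}

// Parse the raw environment value and keep only well-formed, enabled entries.
// Returns an empty list when the variable is unset or is not a JSON array.
function parseAiProviders(raw: string | undefined): AiProvider[] {
  if (!raw) return [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(isEnabledProvider);
}

// Same two providers as the .env example above.
const exampleValue = JSON.stringify([
  { type: "ollama", baseUrl: "http://localhost:11434", model: "llava", enabled: true },
  { type: "openai", apiKey: "your-openai-api-key", model: "gpt-4-vision-preview", enabled: false },
]);

const enabledProviders = parseAiProviders(exampleValue);
console.log(enabledProviders.map((p) => `${p.type}/${p.model}`));
```

In a real server the input would more likely come from `process.env.AI_PROVIDERS`. Treating malformed input as "no providers" is one possible design choice; failing loudly at startup is an equally reasonable alternative.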

#### MCP Server Configuration

Add Peekaboo to your MCP client configuration:

**For Claude Desktop (`~/Library/Application Support/Claude/claude_desktop_config.json`):**
```json
{
  "mcpServers": {
    "peekaboo": {
      "command": "peekaboo-mcp",
      "args": [],
      "env": {
        "AI_PROVIDERS": "[{\"type\":\"ollama\",\"baseUrl\":\"http://localhost:11434\",\"model\":\"llava\",\"enabled\":true}]"
      }
    }
  }
}
```

**For other MCP clients:**
```json
{
  "server": {
    "command": "node",
    "args": ["/path/to/peekaboo/dist/index.js"],
    "env": {
      "AI_PROVIDERS": "[{\"type\":\"ollama\",\"baseUrl\":\"http://localhost:11434\",\"model\":\"llava\",\"enabled\":true}]"
    }
  }
}
```

### 🔐 Permissions Setup

Peekaboo requires specific macOS permissions to function properly:

#### 1. Screen Recording Permission

**Grant permission via System Preferences** (on macOS 13 and later, the app is named **System Settings** and the pane is **Privacy & Security**):
1. Open **System Preferences** → **Security & Privacy** → **Privacy**
2. Select **Screen Recording** from the left sidebar
3. Click the **lock icon** and enter your password
4. Click **+** and add your terminal application or MCP client
5. Restart the application

**For common applications:**
- **Terminal.app**: `/Applications/Utilities/Terminal.app`
- **Claude Desktop**: `/Applications/Claude.app`
- **VS Code**: `/Applications/Visual Studio Code.app`

#### 2. Accessibility Permission (Optional)

For advanced window management features:
1. Open **System Preferences** → **Security & Privacy** → **Privacy**
2. Select **Accessibility** from the left sidebar
3. Add your terminal/MCP client application

### ✅ Verification

Test your installation:

```bash
# Test the Swift CLI directly
./peekaboo --help

# Test server status
./peekaboo list server_status --json-output

# Test screen capture (requires permissions)
./peekaboo image --mode screen --format png

# Start the MCP server for testing
peekaboo-mcp
```

**Expected output for server status:**
```json
{
  "success": true,
  "data": {
    "swift_cli_available": true,
    "permissions": {
      "screen_recording": true
    },
    "system_info": {
      "macos_version": "14.0"
    }
  }
}
```
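
If you want to check that status output programmatically, something along these lines may help. This is a sketch only: the `ServerStatus` interface mirrors the expected output above, and `findStatusProblems` is a hypothetical helper, not part of Peekaboo.

```typescript
// Shape of the server_status payload, mirrored from the expected output above.
interface ServerStatus {
  success: boolean;
  data?: {
    swift_cli_available: boolean;
    permissions: { screen_recording: boolean };
    system_info: { macos_version: string };
  };
}

// Turn a status response into a list of human-readable problems.
// An empty list means the installation looks healthy.
function findStatusProblems(status: ServerStatus): string[] {
  if (!status.success || !status.data) {
    return ["server_status did not report success"];
  }
  const problems: string[] = [];
  if (!status.data.swift_cli_available) {
    problems.push("Swift CLI unavailable: rebuild with `swift build -c release`");
  }
  if (!status.data.permissions.screen_recording) {
    problems.push("Screen Recording permission is missing");
  }
  return problems;
}

// Example input: the expected output above, with the permission flipped to false.
const statusWithoutPermission: ServerStatus = JSON.parse(`{
  "success": true,
  "data": {
    "swift_cli_available": true,
    "permissions": { "screen_recording": false },
    "system_info": { "macos_version": "14.0" }
  }
}`);

console.log(findStatusProblems(statusWithoutPermission));
```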

### 🎯 Quick Start

Once installed and configured:

1. **Capture Screenshot:**
   ```bash
   peekaboo-mcp
   # In your MCP client: "Take a screenshot of my screen"
   ```

2. **List Applications:**
   ```bash
   # In your MCP client: "Show me all running applications"
   ```

3. **Analyze Screenshot:**
   ```bash
   # In your MCP client: "Take a screenshot and tell me what's on my screen"
   ```

### 🐛 Troubleshooting

**Common Issues:**

| Issue | Solution |
|-------|----------|
| `Permission denied` errors | Grant Screen Recording permission in System Preferences |
| `Swift CLI unavailable` | Rebuild Swift CLI: `cd swift-cli && swift build -c release` |
| `AI analysis failed` | Check AI provider configuration and network connectivity |
| `Command not found: peekaboo-mcp` | Run `npm link` or check global npm installation |

**Debug Mode:**
```bash
# Enable verbose logging
LOG_LEVEL=DEBUG peekaboo-mcp

# Check permissions
./peekaboo list server_status --json-output
```

**Get Help:**
- 📚 [Documentation](./docs/)
- 🐛 [Issues](https://github.com/yourusername/peekaboo/issues)
- 💬 [Discussions](https://github.com/yourusername/peekaboo/discussions)

---

## 🛠️ Available Tools

Once installed, Peekaboo provides three powerful MCP tools:

### 📸 `peekaboo.image` - Screen Capture

**Parameters:**
- `mode`: `"screen"` | `"window"` | `"multi"` (default: "screen")
- `app`: Application identifier for window/multi modes
- `path`: Custom save path (optional)

**Example:**
```json
{
  "name": "peekaboo.image",
  "arguments": {
    "mode": "window",
    "app": "Safari"
  }
}
```
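
The parameter rules above could be enforced with a small validator like the following sketch. It is hand-rolled to stay self-contained (the real server reportedly uses Zod); `parseImageArgs` and `optionalString` are hypothetical names, and the rule that `app` is required in `"window"` mode is an assumption, since the list above only says `app` applies to window/multi modes.

```typescript
type CaptureMode = "screen" | "window" | "multi";

interface ImageArgs {
  mode: CaptureMode;
  app?: string;
  path?: string;
}

const CAPTURE_MODES: readonly string[] = ["screen", "window", "multi"];

// Accept undefined or a string; reject anything else with a clear message.
function optionalString(value: unknown, name: string): string | undefined {
  if (value === undefined) return undefined;
  if (typeof value !== "string") throw new Error(`${name} must be a string`);
  return value;
}

// Validate raw tool arguments and apply the documented default mode ("screen").
function parseImageArgs(raw: Record<string, unknown>): ImageArgs {
  const mode = raw["mode"] === undefined ? "screen" : raw["mode"];
  if (typeof mode !== "string" || CAPTURE_MODES.indexOf(mode) === -1) {
    throw new Error(`mode must be one of: ${CAPTURE_MODES.join(", ")}`);
  }
  const app = optionalString(raw["app"], "app");
  const path = optionalString(raw["path"], "path");
  // Assumption: capturing a window needs a target application.
  if (mode === "window" && !app) {
    throw new Error('app is required when mode is "window"');
  }
  return { mode: mode as CaptureMode, app, path };
}

console.log(parseImageArgs({ mode: "window", app: "Safari" }));
console.log(parseImageArgs({})); // falls back to mode "screen"
```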

### 📋 `peekaboo.list` - Application Listing

**Parameters:**
- `item_type`: `"running_applications"` | `"application_windows"` | `"server_status"`
- `app`: Application identifier (required for application_windows)

**Example:**
```json
{
  "name": "peekaboo.list",
  "arguments": {
    "item_type": "running_applications"
  }
}
```

### 🧩 `peekaboo.analyze` - AI Analysis

**Parameters:**
- `image_path`: Absolute path to image file
- `question`: Question/prompt for AI analysis

**Example:**
```json
{
  "name": "peekaboo.analyze",
  "arguments": {
    "image_path": "/tmp/screenshot.png",
    "question": "What applications are visible in this screenshot?"
  }
}
```

## 🎯 Key Features

### Screen Capture
- **Multi-display support**: Captures each display separately
- **Window targeting**: Intelligent app/window matching with fuzzy search
- **Format flexibility**: PNG, JPEG, WebP, HEIF support
- **Automatic naming**: Timestamps and descriptive filenames
- **Permission handling**: Automatic screen recording permission checks

### Application Management
- **Running app enumeration**: Complete system application listing
- **Window discovery**: Per-app window enumeration with metadata
- **Fuzzy matching**: Find apps by partial name, bundle ID, or PID
- **Real-time status**: Active/background status, window counts

### AI Integration
- **Provider agnostic**: Support for Ollama, OpenAI, and other providers
- **Image analysis**: Natural language querying of captured content
- **Configurable**: Environment-based provider selection

## 🏛️ Project Structure

```
Peekaboo/
├── src/                             # Node.js MCP Server (TypeScript)
│   ├── index.ts                     # Main MCP server entry point
│   ├── tools/                       # Individual tool implementations
│   │   ├── image.ts                 # Screen capture tool
│   │   ├── analyze.ts               # AI analysis tool
│   │   └── list.ts                  # Application/window listing
│   ├── utils/                       # Utility modules
│   │   ├── swift-cli.ts             # Swift CLI integration
│   │   ├── ai-providers.ts          # AI provider management
│   │   └── server-status.ts         # Server status utilities
│   └── types/                       # Shared type definitions
├── swift-cli/                       # Native Swift CLI
│   └── Sources/peekaboo/            # Swift source files
│       ├── main.swift               # CLI entry point
│       ├── ImageCommand.swift       # Image capture implementation
│       ├── ListCommand.swift        # Application listing
│       ├── Models.swift             # Data structures
│       ├── ApplicationFinder.swift  # App discovery logic
│       ├── WindowManager.swift      # Window management
│       ├── PermissionsChecker.swift # macOS permissions
│       └── JSONOutput.swift         # JSON response formatting
├── package.json                     # Node.js dependencies
├── tsconfig.json                    # TypeScript configuration
└── README.md                        # This file
```

## 🔧 Technical Details

### Swift CLI JSON Output
The Swift CLI outputs structured JSON when called with `--json-output`:

```json
{
  "success": true,
  "data": {
    "applications": [
      {
        "app_name": "Safari",
        "bundle_id": "com.apple.Safari",
        "pid": 1234,
        "is_active": true,
        "window_count": 2
      }
    ]
  },
  "debug_logs": ["Found 50 applications"]
}
```
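
On the TypeScript side, output like this can be given a typed shape before use. The sketch below mirrors the field names from the sample above; the `CliResponse` and `AppInfo` types and the `parseApplications` helper are illustrative names, not necessarily those used in `src/types/`.

```typescript
// One entry of data.applications, mirrored from the sample output above.
interface AppInfo {
  app_name: string;
  bundle_id: string;
  pid: number;
  is_active: boolean;
  window_count: number;
}

// Generic envelope shared by CLI responses (illustrative).
interface CliResponse<T> {
  success: boolean;
  data?: T;
  debug_logs?: string[];
}

// Parse the CLI's stdout and return the application list.
// Throws if the output is not valid JSON or the CLI reported failure.
function parseApplications(stdout: string): AppInfo[] {
  const response = JSON.parse(stdout) as CliResponse<{ applications: AppInfo[] }>;
  if (!response.success || !response.data) {
    throw new Error("Swift CLI reported failure");
  }
  return response.data.applications;
}

// Sample stdout, identical in content to the JSON shown above.
const sampleStdout = JSON.stringify({
  success: true,
  data: {
    applications: [
      { app_name: "Safari", bundle_id: "com.apple.Safari", pid: 1234, is_active: true, window_count: 2 },
    ],
  },
  debug_logs: ["Found 50 applications"],
});

const activeApps = parseApplications(sampleStdout).filter((app) => app.is_active);
console.log(activeApps.map((app) => `${app.app_name} (${app.window_count} windows)`));
```

Note that the `as` cast only asserts the shape at compile time; a schema library such as Zod would be needed to verify it at runtime.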

### MCP Integration
The Node.js server translates between MCP's JSON-RPC protocol and the Swift CLI's JSON output, providing:
- **Schema validation** via Zod
- **Error handling** with proper MCP error codes
- **Logging** via Pino logger
- **Type safety** throughout the TypeScript codebase
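
As a rough illustration of that translation step, the sketch below converts the CLI's stdout into an MCP tool result (a `content` array plus an optional `isError` flag). The `toToolResult` helper is hypothetical, and the `error.message` field on failed CLI responses is an assumption: this README only shows successful output.

```typescript
// CLI envelope; the `error` field is an assumed shape for failures.
interface CliEnvelope {
  success: boolean;
  data?: unknown;
  error?: { message: string };
}

// Minimal MCP tool result carrying text content.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Build an error-flagged tool result with a single text item.
function errorResult(text: string): ToolResult {
  return { content: [{ type: "text", text }], isError: true };
}

// Translate raw CLI stdout into a tool result, never throwing.
function toToolResult(stdout: string): ToolResult {
  let response: CliEnvelope;
  try {
    response = JSON.parse(stdout) as CliEnvelope;
  } catch {
    return errorResult("Swift CLI returned invalid JSON");
  }
  if (!response.success) {
    return errorResult(response.error?.message ?? "Swift CLI reported failure");
  }
  // Serialize the payload as pretty-printed text for the client.
  const text = JSON.stringify(response.data ?? null, null, 2);
  return { content: [{ type: "text", text }] };
}

console.log(toToolResult('{"success": true, "data": {"applications": []}}'));
console.log(toToolResult("not json"));
```

Returning failures as results with `isError: true`, rather than throwing, keeps tool-level errors visible to the client; protocol-level problems would still be reported as JSON-RPC errors.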

### Permission Model
Peekaboo respects macOS security by:
- **Checking screen recording permissions** before capture operations
- **Graceful degradation** when permissions are missing
- **Clear error messages** guiding users to grant required permissions

## 🧪 Testing

### Manual Testing
```bash
# Test Swift CLI directly
./peekaboo list apps --json-output | head -20

# Test MCP integration
echo '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}' | node dist/index.js

# Test image capture
echo '{"jsonrpc": "2.0", "id": 2, "method": "tools/call", "params": {"name": "peekaboo.image", "arguments": {"mode": "screen"}}}' | node dist/index.js
```
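
Hand-writing these JSON-RPC payloads gets error-prone as arguments grow. A small helper such as the following sketch can generate them; `buildToolCall` is a hypothetical name, and the message shape simply follows the `tools/call` example above.

```typescript
// JSON-RPC 2.0 request envelope, as used in the manual tests above.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Build a single-line tools/call request for the given tool and arguments.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): string {
  const request: JsonRpcRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
  return JSON.stringify(request);
}

// Equivalent to the "Test image capture" payload above.
const imageRequest = buildToolCall(2, "peekaboo.image", { mode: "screen" });
console.log(imageRequest);
```

The printed line can be piped into `node dist/index.js` the same way as the `echo` commands above.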

### Automated Testing
```bash
# TypeScript compilation
npm run build

# Swift compilation
cd swift-cli && swift build
```

## 🐛 Known Issues

- **FileHandle warning**: Non-critical Swift warning about TextOutputStream conformance
- **AI Provider Config**: Requires `AI_PROVIDERS` environment variable for analysis features

## 🚀 Future Enhancements

- [ ] **OCR Integration**: Built-in text extraction from screenshots
- [ ] **Video Capture**: Screen recording capabilities
- [ ] **Annotation Tools**: Drawing/markup on captured images
- [ ] **Cloud Storage**: Direct upload to cloud providers
- [ ] **Hotkey Support**: System-wide keyboard shortcuts

## 📄 License

MIT License - see LICENSE file for details.

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

---

**🎉 Peekaboo is ready to use!** The project successfully combines the power of native macOS APIs with modern Node.js tooling to create a comprehensive screen capture and analysis solution.