diff --git a/README.md b/README.md
index b49cc96..51a7740 100644
--- a/README.md
+++ b/README.md
@@ -1,98 +1,213 @@
-# Peekaboo—screenshot got you! Now you see it, now it's saved.
+# Peekaboo v1.0 — The screenshot tool that just works™
 ![Peekaboo Banner](assets/banner.png)
-👀 → 📸 → 💾 — **Unattended screenshot automation that actually works**
+👀 → 📸 → 💾 — **Zero-click screenshots with AI superpowers**
+
+---
+
+## ✨ **FEATURES**
+
+🎯 **Clean CLI** • 🤫 **Quiet Mode** • 🤖 **Dual AI Support** • ⚡ **Non-Interactive** • 📊 **Smart Defaults** • 🪟 **Multi-Window AI**
 ---
 ## 🚀 **THE MAGIC**
-**Peekaboo** is your silent screenshot assassin. Point it at any app, and SNAP! — it's captured and saved before you can blink.
+**Peekaboo** captures any app, any window, any time — no clicking required. Now with a beautiful command-line interface and AI vision analysis.
-- 🎯 **Smart targeting**: App names or bundle IDs
-- 🚀 **Auto-launch**: Sleeping apps? No problem!
-- 👁 **Brings apps forward**: Always gets the shot
-- 🏗 **Creates directories**: Paths don't exist? Fixed!
-- 🎨 **Multi-format**: PNG, JPG, PDF — you name it
-- 💥 **Zero interaction**: 100% unattended operation
-- 🧠 **Smart filenames**: Model-friendly names with app info
-- ⚡ **Optimized speed**: 70% faster capture delays
-- 🤖 **AI Vision Analysis**: Local Ollama integration with auto-model detection
-- ☁️ **Cloud AI Ready**: Self-documenting for Claude, Windsurf, ChatGPT integration
+### 🎯 **Core Features**
+- **Smart capture**: App window by default, fullscreen when no app specified
+- **Zero interaction**: Uses window IDs, not mouse clicks
+- **AI vision**: Ask questions about your screenshots (Ollama + Claude CLI)
+- **Quiet mode**: Perfect for scripts and automation (`-q`)
+- **Multi-window**: Capture all app windows separately (`-m`)
+- **Format control**: PNG, JPG, PDF with auto-detection
+- **Smart paths**: Auto-generated filenames or custom paths
+- **Fast & reliable**: Optimized delays, robust error handling
+
+### 🌟 **Key Highlights**
+- **Smart Multi-Window AI**: Automatically analyzes ALL windows for multi-window apps
+- **Timeout Protection**: 90-second timeout prevents hanging on slow models
+- **Clean CLI Design**: Consistent flags, short aliases, logical defaults
+- **Claude CLI support**: Smart provider selection (Ollama preferred)
+- **Performance tracking**: See how long AI analysis takes
+- **Comprehensive help**: Clear sections, real examples
 ---
-## 🎪 **HOW TO USE**
-
-### 🎯 **Basic Usage**
-*Simple screenshot capture*
+## 🎯 **QUICK START**
 ```bash
-# 👀 Quick shot with smart filename
-osascript peekaboo.scpt "Safari"
-# → /tmp/peekaboo_safari_20250522_143052.png
+# Install (one-time)
+chmod +x peekaboo.scpt
-# 🎯 Custom output path
-osascript peekaboo.scpt "Safari" "/Users/you/Desktop/safari.png"
-
-# 🎯 Bundle ID targeting
-osascript peekaboo.scpt "com.apple.TextEdit" "/tmp/textedit.jpg"
+# Basic usage
+osascript peekaboo.scpt              # Capture fullscreen
+osascript peekaboo.scpt Safari       # Capture Safari window
+osascript peekaboo.scpt help         # Show all options
 ```
-### 🎪 **Advanced Features**
-*All the power. All the windows. All the time.*
+---
+## 📖 **COMMAND REFERENCE**
+
+### 🎨 **Command Structure**
+```
+peekaboo [app] [options]              # Capture app or fullscreen
+peekaboo analyze "question" [opts]    # Analyze existing image
+peekaboo list|ls                      # List running apps
+peekaboo help|-h                      # Show help
+```
+
+### 🏷️ **Options**
+| Option | Short | Description |
+|--------|-------|-------------|
+| `--output ` | `-o` | Output file or directory path |
+| `--fullscreen` | `-f` | Force fullscreen capture |
+| `--window` | `-w` | Single window (default with app) |
+| `--multi` | `-m` | Capture all app windows |
+| `--ask "question"` | `-a` | AI analysis of screenshot |
+| `--quiet` | `-q` | Minimal output (just path) |
+| `--verbose` | `-v` | Debug output |
+| `--format ` | | Output format: png\|jpg\|pdf |
+| `--model ` | | AI model (e.g., llava:7b) |
+| `--provider ` | | AI provider: auto\|ollama\|claude |
+
+---
+
+## 🎪 **USAGE EXAMPLES**
+
+### 📸 **Basic Screenshots**
 ```bash
-# 🔍 What's running right now?
+# Simplest captures
+osascript peekaboo.scpt                    # Fullscreen → /tmp/peekaboo_fullscreen_[timestamp].png
+osascript peekaboo.scpt Safari             # Safari window → /tmp/peekaboo_safari_[timestamp].png
+osascript peekaboo.scpt com.apple.Terminal # Using bundle ID → /tmp/peekaboo_com_apple_terminal_[timestamp].png
+
+# Custom output paths
+osascript peekaboo.scpt Safari -o ~/Desktop/safari.png
+osascript peekaboo.scpt Finder -o ~/screenshots/finder.jpg --format jpg
+osascript peekaboo.scpt -f -o ~/fullscreen.pdf  # Fullscreen as PDF
+```
+
+### 🤫 **Quiet Mode** (Perfect for Scripts)
+```bash
+# Just get the file path - no extra output
+FILE=$(osascript peekaboo.scpt Safari -q)
+echo "Screenshot saved to: $FILE"
+
+# Use in scripts
+SCREENSHOT=$(osascript peekaboo.scpt Terminal -q)
+scp "$SCREENSHOT" user@server:/uploads/
+
+# Chain commands
+osascript peekaboo.scpt Finder -q | pbcopy  # Copy path to clipboard
+```
+
+### 🎭 **Multi-Window Capture**
+```bash
+# Capture all windows of an app
+osascript peekaboo.scpt Chrome -m
+# Creates: /tmp/peekaboo_chrome_[timestamp]_window_1_[title].png
+#          /tmp/peekaboo_chrome_[timestamp]_window_2_[title].png
+#          etc.
+
+# Save to specific directory
+osascript peekaboo.scpt Safari -m -o ~/safari-windows/
+# Creates: ~/safari-windows/peekaboo_safari_[timestamp]_window_1_[title].png
+#          ~/safari-windows/peekaboo_safari_[timestamp]_window_2_[title].png
+```
+
+### 🤖 **AI Vision Analysis**
+```bash
+# One-step: Screenshot + Analysis
+osascript peekaboo.scpt Safari -a "What website is this?"
+osascript peekaboo.scpt Terminal -a "Are there any error messages?"
+osascript peekaboo.scpt -f -a "Describe what's on my screen"
+
+# Specify AI model
+osascript peekaboo.scpt Xcode -a "Is the build successful?" --model llava:13b
+
+# Two-step: Analyze existing image
+osascript peekaboo.scpt analyze screenshot.png "What do you see?"
+osascript peekaboo.scpt analyze error.png "Explain this error" --provider ollama
+```
+
+### 🔍 **App Discovery**
+```bash
+# List all running apps with window info
 osascript peekaboo.scpt list
+osascript peekaboo.scpt ls  # Short alias
-# 👀 Quick shot to /tmp with timestamp
-osascript peekaboo.scpt "Chrome"
+# Output:
+# • Google Chrome (com.google.Chrome)
+#   Windows: 3
+#     - "GitHub - Project"
+#     - "Documentation"
+#     - "Stack Overflow"
+# • Safari (com.apple.Safari)
+#   Windows: 2
+#     - "Apple.com"
+#     - "News"
+```
-# 🎭 Capture ALL windows with smart names
-osascript peekaboo.scpt "Chrome" "/tmp/chrome.png" --multi
+### 🎯 **Advanced Combinations**
+```bash
+# Quiet fullscreen with custom path and format
+osascript peekaboo.scpt -f -o ~/desktop-capture --format jpg -q
-# 🪟 Just the front window
-osascript peekaboo.scpt "TextEdit" "/tmp/textedit.png" --window
+# Multi-window with AI analysis (analyzes first window)
+osascript peekaboo.scpt Chrome -m -a "What tabs are open?"
-# 🤖 AI analysis: Screenshot + question in one step
-osascript peekaboo.scpt "Safari" --ask "What's on this page?"
+# Verbose mode for debugging
+osascript peekaboo.scpt Safari -v -o ~/debug.png
-# 🔍 Analyze existing image
-osascript peekaboo.scpt analyze "/tmp/screenshot.png" "Any errors visible?"
+# Force window mode on fullscreen request
+osascript peekaboo.scpt Safari -f -w  # -w overrides -f
 ```
 ---
 ## ⚡ **QUICK WINS**
-### 🎯 **Basic Shot**
+### 🎯 **Basic Captures**
 ```bash
-# Quick shot with auto-generated filename
-osascript peekaboo.scpt "Finder"
+# Fullscreen (no app specified)
+osascript peekaboo.scpt
 ```
-**Result**: Full screen with Finder in focus → `/tmp/peekaboo_finder_20250522_143052.png`
-*Notice the smart filename: app name + timestamp, all lowercase with underscores*
+**Result**: Full screen → `/tmp/peekaboo_fullscreen_20250522_143052.png`
 ```bash
-# Custom path
-osascript peekaboo.scpt "Finder" "/Desktop/finder.png"
+# App window with smart filename
+osascript peekaboo.scpt Finder
 ```
-**Result**: Full screen with Finder in focus → `finder.png`
+**Result**: Finder window → `/tmp/peekaboo_finder_20250522_143052.png`
+
+```bash
+# Custom output path
+osascript peekaboo.scpt Finder -o ~/Desktop/finder.png
+```
+**Result**: Finder window → `~/Desktop/finder.png`
 ### 🎭 **Multi-Window Magic**
 ```bash
-osascript peekaboo.scpt "Safari" "/tmp/safari.png" --multi
+osascript peekaboo.scpt Safari -m
 ```
 **Result**: Multiple files with smart names:
-- `safari_window_1_github.png`
-- `safari_window_2_documentation.png`
-- `safari_window_3_google_search.png`
+- `/tmp/peekaboo_safari_20250522_143052_window_1_github.png`
+- `/tmp/peekaboo_safari_20250522_143052_window_2_docs.png`
+- `/tmp/peekaboo_safari_20250522_143052_window_3_search.png`
+
+```bash
+# Save to specific directory
+osascript peekaboo.scpt Chrome -m -o ~/screenshots/
+```
+**Result**: All Chrome windows saved to `~/screenshots/` directory
 ### 🔍 **App Discovery**
 ```bash
-osascript peekaboo.scpt list
+osascript peekaboo.scpt list  # or use 'ls'
 ```
 **Result**: Every running app + window titles. No guessing!
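The default file names shown in this hunk (`/tmp/peekaboo_finder_20250522_143052.png`, `/tmp/peekaboo_com_apple_terminal_[timestamp].png`) suggest a pattern: lowercase the app name, replace non-alphanumeric characters with underscores, then append a timestamp. Below is a minimal shell sketch of that pattern. The helper name `peekaboo_default_path` is hypothetical and not part of `peekaboo.scpt`, and the script's actual sanitising rules may differ.

```shell
# Hypothetical helper (an assumption, not the script's real code):
# rebuilds the default path pattern seen in the README examples.
peekaboo_default_path() {
  local app="$1"
  # Optional second argument pins the timestamp; otherwise use the current time.
  local stamp="${2:-$(date +%Y%m%d_%H%M%S)}"
  local slug
  # Lowercase, turn each run of non-alphanumerics into one underscore,
  # then trim underscores from both ends.
  slug=$(printf '%s' "$app" | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/_/g; s/^_+//; s/_+$//')
  # No app name corresponds to the fullscreen case in the README.
  printf '/tmp/peekaboo_%s_%s.png\n' "${slug:-fullscreen}" "$stamp"
}

peekaboo_default_path "com.apple.Terminal" 20250522_143052
# prints /tmp/peekaboo_com_apple_terminal_20250522_143052.png
```

A script that needs to predict an output path before capturing could call a helper like this, though reading the path back from `-q` is the more reliable option.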
@@ -118,54 +233,90 @@ Peekaboo speaks all the languages:
 ```bash
 # PNG (default) - smart filename in /tmp
-osascript peekaboo.scpt "Safari"
+osascript peekaboo.scpt Safari
 # → /tmp/peekaboo_safari_20250522_143052.png
-# PNG with custom path
-osascript peekaboo.scpt "Safari" "/tmp/shot.png"
-
-# JPG - smaller files
-osascript peekaboo.scpt "Safari" "/tmp/shot.jpg"
+# JPG with format flag
+osascript peekaboo.scpt Safari -o ~/shot --format jpg
+# → ~/shot.jpg
 # PDF - vector goodness
-osascript peekaboo.scpt "Safari" "/tmp/shot.pdf"
+osascript peekaboo.scpt Safari -o ~/doc.pdf
+# → ~/doc.pdf (format auto-detected from extension)
+
+# Mix and match options
+osascript peekaboo.scpt -f --format jpg -o ~/fullscreen -q
+# → ~/fullscreen.jpg (quiet mode just prints path)
 ```
 ---
 ## 🤖 **AI VISION ANALYSIS** ⭐
-Peekaboo integrates with **Ollama** for powerful local AI vision analysis - ask questions about your screenshots! No cloud, no API keys, just pure local magic.
+Peekaboo integrates with AI providers for powerful vision analysis - ask questions about your screenshots! Supports both **Ollama** (local, privacy-focused) and **Claude CLI** (cloud-based).
+
+**🪟 NEW: Smart Multi-Window AI** - When analyzing apps with multiple windows, Peekaboo automatically captures and analyzes ALL windows, giving you comprehensive insights about each one!
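The format examples earlier in this hunk imply how `-o` and `--format` interact: a recognised extension selects the format (`~/doc.pdf`), and a path without one gets the chosen format appended (`~/shot` plus `--format jpg` becomes `~/shot.jpg`). Here is a rough shell sketch of that resolution. `resolve_output` is a hypothetical name, and both the PNG fallback and the choice to let an explicit format win over the extension are assumptions rather than confirmed behaviour of `peekaboo.scpt`.

```shell
# Hypothetical sketch of output-path resolution; not taken from peekaboo.scpt.
# Prints "<final path> <format>" separated by a single space.
resolve_output() {
  local path="$1" format="${2:-}"
  local base="${path##*/}" ext=""
  # Only text after the last dot of the file name counts as an extension,
  # so a dot inside a directory name is ignored.
  case "$base" in
    *.*) ext=$(printf '%s' "${base##*.}" | tr '[:upper:]' '[:lower:]') ;;
  esac
  case "$ext" in
    png|jpg|pdf)
      # Known extension: keep the path; infer the format unless one was given.
      printf '%s %s\n' "$path" "${format:-$ext}" ;;
    *)
      # No known extension: fall back to PNG (assumed default) and append it.
      format="${format:-png}"
      printf '%s.%s %s\n' "$path" "$format" "$format" ;;
  esac
}

resolve_output /tmp/shot jpg   # prints /tmp/shot.jpg jpg
resolve_output /tmp/doc.pdf    # prints /tmp/doc.pdf pdf
```

The space-separated output is only convenient for paths without spaces; a real implementation would probably return the two values separately.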
 ### 🎯 **Key Features**
-- **🧠 Smart Model Auto-Detection** - Automatically picks the best available vision model
+- **🤖 Smart Provider Selection** - Auto-detects Ollama or Claude CLI
+- **🧠 Smart Model Auto-Detection** - Automatically picks the best available vision model (Ollama)
 - **📏 Intelligent Image Resizing** - Auto-compresses large screenshots (>5MB → 2048px) for optimal AI processing
+- **🪟 Smart Multi-Window Analysis** - Automatically analyzes ALL windows when app has multiple windows
 - **⚡ One or Two-Step Workflows** - Screenshot+analyze or analyze existing images
-- **🔒 100% Local & Private** - Everything runs on your machine via Ollama
-- **🎯 Zero Configuration** - Just install Ollama + model, Peekaboo handles the rest
+- **🔒 Privacy Options** - Choose between local (Ollama) or cloud (Claude) analysis
+- **⏱️ Performance Tracking** - Shows analysis time for each request
+- **⛰️ Timeout Protection** - 90-second timeout prevents hanging on slow models
+- **🎯 Zero Configuration** - Just install your preferred AI provider, Peekaboo handles the rest
 ### 🚀 **One-Step: Screenshot + Analysis**
 ```bash
-# Take screenshot and analyze it in one command
-osascript peekaboo.scpt "Safari" --ask "What's the main content on this page?"
-osascript peekaboo.scpt "Terminal" --ask "Any error messages visible?"
-osascript peekaboo.scpt "Xcode" --ask "Is the build successful?"
-osascript peekaboo.scpt "Chrome" --ask "What product is being shown?" --model llava:13b
+# Take screenshot and analyze it in one command (auto-selects provider)
+osascript peekaboo.scpt Safari -a "What's the main content on this page?"
+osascript peekaboo.scpt Terminal -a "Any error messages visible?"
+osascript peekaboo.scpt Xcode -a "Is the build successful?"
-# Fullscreen analysis (no app targeting needed)
-osascript peekaboo.scpt --ask "Describe what's on my screen"
-osascript peekaboo.scpt --verbose --ask "Any UI errors or warnings visible?"
+# Multi-window apps: Automatically analyzes ALL windows!
+osascript peekaboo.scpt Chrome -a "What tabs are open?"
+# 🤖 Result: Window 1 "GitHub": Shows a pull request page...
+#           Window 2 "Docs": Shows API documentation...
+#           Window 3 "Gmail": Shows email inbox...
+
+# Force single window with -w flag
+osascript peekaboo.scpt Chrome -w -a "What's on this tab?"
+
+# Specify AI provider explicitly
+osascript peekaboo.scpt Chrome -a "What product is shown?" --provider ollama
+osascript peekaboo.scpt Safari -a "Describe the page" --provider claude
+
+# Specify custom model (Ollama)
+osascript peekaboo.scpt Chrome -a "What product is being shown?" --model llava:13b
+
+# Fullscreen analysis (no app specified)
+osascript peekaboo.scpt -f -a "Describe what's on my screen"
+osascript peekaboo.scpt -a "Any UI errors or warnings visible?" -v
+
+# Quiet mode for scripting (just outputs path after analysis)
+osascript peekaboo.scpt Terminal -a "Find errors" -q
 ```
 ### 🔍 **Two-Step: Analyze Existing Images**
 ```bash
 # Analyze screenshots you already have
-osascript peekaboo.scpt analyze "/tmp/screenshot.png" "Describe what you see"
-osascript peekaboo.scpt analyze "/path/error.png" "What error is shown?"
-osascript peekaboo.scpt analyze "/Desktop/ui.png" "Any UI issues?" --model qwen2.5vl:7b
+osascript peekaboo.scpt analyze /tmp/screenshot.png "Describe what you see"
+osascript peekaboo.scpt analyze error.png "What error is shown?"
+osascript peekaboo.scpt analyze ui.png "Any UI issues?" --model qwen2.5vl:7b
 ```
-### 🛠️ **Complete Ollama Setup Guide**
+### 🤖 **AI Provider Comparison**
+
+| Provider | Type | Image Analysis | Setup | Best For |
+|----------|------|---------------|-------|----------|
+| **Ollama** | Local | ✅ Direct file analysis | Install + pull models | Privacy, automation |
+| **Claude CLI** | Cloud | ❌ Limited support* | Install CLI | Text prompts |
+
+*Claude CLI currently doesn't support direct image file analysis but can work with images through interactive mode or MCP integrations.
+
+### 🛠️ **Complete Ollama Setup Guide** (Recommended for Image Analysis)
 #### 1️⃣ **Install Ollama**
 ```bash
@@ -258,6 +409,50 @@ osascript peekaboo.scpt "VS Code" --ask "Any syntax errors or warnings in the co
 osascript peekaboo.scpt --ask "Describe the overall layout and any issues"
 ```
+### 🪟 **Smart Multi-Window Analysis**
+When an app has multiple windows, Peekaboo automatically analyzes ALL of them:
+
+```bash
+# Chrome with 3 tabs open? Peekaboo analyzes them all!
+osascript peekaboo.scpt Chrome -a "What's on each tab?"
+
+# Result format:
+# Peekaboo 👀: Multi-window AI Analysis Complete! 🤖
+#
+# 📸 App: Chrome (3 windows)
+# ❓ Question: What's on each tab?
+# 🤖 Model: qwen2.5vl:7b
+#
+# 💬 Results for each window:
+#
+# 🪟 Window 1: "GitHub - Pull Request #42"
+# This shows a pull request for adding authentication...
+#
+# 🪟 Window 2: "Stack Overflow - Python threading"
+# A Stack Overflow page discussing Python threading concepts...
+#
+# 🪟 Window 3: "Gmail - Inbox (42)"
+# Gmail inbox showing 42 unread emails...
+```
+
+**Smart Defaults:**
+- ✅ Multi-window apps → Analyzes ALL windows automatically
+- ✅ Single window apps → Analyzes the one window
+- ✅ Want just one window? → Use `-w` flag to force single window mode
+- ✅ Quiet mode → Returns condensed results for each window
+
+### ⏱️ **Performance Tracking & Timeouts**
+Every AI analysis shows execution time and has built-in protection:
+```
+Peekaboo 👀: Analysis via qwen2.5vl:7b took 7 sec.
+Peekaboo 👀: Analysis timed out after 90 seconds.
+```
+
+**Timeout Protection:**
+- ⏰ 90-second timeout prevents hanging on large models
+- 🛡️ Clear error messages if model is too slow
+- 💡 Suggests using smaller models on timeout
+
 **Perfect for:**
 - 🧪 **Automated UI Testing** - "Any error messages visible?"
 - 📊 **Dashboard Monitoring** - "Are all systems green?"
@@ -265,6 +460,7 @@ osascript peekaboo.scpt --ask "Describe the overall layout and any issues"
 - 📸 **Content Verification** - "Does this page look correct?"
 - 🔍 **Visual QA Automation** - "Any broken UI elements?"
 - 📱 **App State Verification** - "Is the login successful?"
+- ⏱️ **Performance Benchmarking** - Compare model speeds
 ---
@@ -392,14 +588,14 @@ osascript peekaboo.scpt "Chrome" --multi → chrome_window_1_github.png
 ### 🎯 **Targeting Options**
 ```bash
 # By name (easy) - smart filename
-osascript peekaboo.scpt "Safari"
+osascript peekaboo.scpt Safari
 # → /tmp/peekaboo_safari_20250522_143052.png
 # By name with custom path
-osascript peekaboo.scpt "Safari" "/tmp/safari.png"
+osascript peekaboo.scpt Safari -o /tmp/safari.png
 # By bundle ID (precise) - gets sanitized
-osascript peekaboo.scpt "com.apple.Safari"
+osascript peekaboo.scpt com.apple.Safari
 # → /tmp/peekaboo_com_apple_safari_20250522_143052.png
 # By display name (works too!) - spaces become underscores
@@ -410,13 +606,15 @@ osascript peekaboo.scpt "Final Cut Pro"
 ### 🎪 **Pro Features**
 ```bash
 # Multi-window capture
---multi            # All windows with descriptive names
+-m, --multi        # All windows with descriptive names
 # Window modes
---window           # Front window only (unattended!)
+-w, --window       # Front window only (unattended!)
+-f, --fullscreen   # Force fullscreen capture
-# Debug mode
---verbose          # See what's happening under the hood
+# Output control
+-q, --quiet        # Minimal output (just path)
+-v, --verbose      # See what's happening under the hood
 ```
 ### 🔍 **Discovery Mode**
@@ -435,15 +633,20 @@ Shows you:
 ### 📊 **Documentation Screenshots**
 ```bash
-# Quick capture to /tmp
-osascript peekaboo.scpt "Xcode" --multi
-osascript peekaboo.scpt "Terminal" --multi
-osascript peekaboo.scpt "Safari" --multi
+# Quick capture to /tmp with descriptive names
+osascript peekaboo.scpt Xcode -m
+osascript peekaboo.scpt Terminal -m
+osascript peekaboo.scpt Safari -m
-# Capture your entire workflow with custom paths
-osascript peekaboo.scpt "Xcode" "/docs/xcode.png" --multi
-osascript peekaboo.scpt "Terminal" "/docs/terminal.png" --multi
-osascript peekaboo.scpt "Safari" "/docs/browser.png" --multi
+# Capture your entire workflow to specific directory
+osascript peekaboo.scpt Xcode -m -o /docs/
+osascript peekaboo.scpt Terminal -m -o /docs/
+osascript peekaboo.scpt Safari -m -o /docs/
+
+# Or specific files
+osascript peekaboo.scpt Xcode -o /docs/xcode.png
+osascript peekaboo.scpt Terminal -o /docs/terminal.png
+osascript peekaboo.scpt Safari -o /docs/browser.png
 ```
 ### 🚀 **CI/CD Integration**
@@ -453,35 +656,49 @@ osascript peekaboo.scpt "Your App"
 # → /tmp/peekaboo_your_app_20250522_143052.png
 # Automated visual testing with AI
-osascript peekaboo.scpt "Your App" --ask "Any error messages or crashes visible?"
-osascript peekaboo.scpt "Your App" --ask "Is the login screen displayed correctly?"
+osascript peekaboo.scpt "Your App" -a "Any error messages or crashes visible?"
+osascript peekaboo.scpt "Your App" -a "Is the login screen displayed correctly?"
 # Custom path with timestamp
-osascript peekaboo.scpt "Your App" "/test-results/app-$(date +%s).png"
+osascript peekaboo.scpt "Your App" -o "/test-results/app-$(date +%s).png"
+
+# Quiet mode for scripts (just outputs path)
+SCREENSHOT=$(osascript peekaboo.scpt "Your App" -q)
+echo "Screenshot saved: $SCREENSHOT"
 ```
 ### 🎬 **Content Creation**
 ```bash
 # Before/after shots with AI descriptions
-osascript peekaboo.scpt "Photoshop" --ask "Describe the current design state"
+osascript peekaboo.scpt Photoshop -a "Describe the current design state"
 # ... do your work ...
-osascript peekaboo.scpt "Photoshop" --ask "What changes were made to the design?"
+osascript peekaboo.scpt Photoshop -a "What changes were made to the design?"
 # Traditional before/after shots
-osascript peekaboo.scpt "Photoshop" "/content/before.png"
+osascript peekaboo.scpt Photoshop -o /content/before.png
 # ... do your work ...
-osascript peekaboo.scpt "Photoshop" "/content/after.png"
+osascript peekaboo.scpt Photoshop -o /content/after.png
+
+# Capture all design windows
+osascript peekaboo.scpt Photoshop -m -o /content/designs/
 ```
 ### 🧪 **Automated QA & Testing**
 ```bash
 # Visual regression testing
-osascript peekaboo.scpt "Your App" --ask "Does the UI look correct?"
-osascript peekaboo.scpt "Safari" --ask "Are there any broken images or layout issues?"
-osascript peekaboo.scpt "Terminal" --ask "Any red error text visible?"
+osascript peekaboo.scpt "Your App" -a "Does the UI look correct?"
+osascript peekaboo.scpt Safari -a "Are there any broken images or layout issues?"
+osascript peekaboo.scpt Terminal -a "Any red error text visible?"
 # Dashboard monitoring
-osascript peekaboo.scpt analyze "/tmp/dashboard.png" "Are all metrics green?"
+osascript peekaboo.scpt analyze /tmp/dashboard.png "Are all metrics green?"
+
+# Quiet mode for test scripts
+if osascript peekaboo.scpt "Your App" -a "Any errors?" -q | grep -q "No errors"; then
+    echo "✅ Test passed"
+else
+    echo "❌ Test failed"
+fi
 ```
 ---
@@ -496,9 +713,11 @@ osascript peekaboo.scpt analyze "/tmp/dashboard.png" "Are all metrics green?"
 ```bash
 # See what's actually running
 osascript peekaboo.scpt list
+# or
+osascript peekaboo.scpt ls
 # Try the bundle ID instead
-osascript peekaboo.scpt "com.company.AppName" "/tmp/shot.png"
+osascript peekaboo.scpt com.company.AppName -o /tmp/shot.png
 ```
 ### 📁 **File Not Created?**
@@ -508,7 +727,9 @@ osascript peekaboo.scpt "com.company.AppName" "/tmp/shot.png"
 ### 🐛 **Debug Mode**
 ```bash
-osascript peekaboo.scpt "Safari" "/tmp/debug.png" --verbose
+osascript peekaboo.scpt Safari -o /tmp/debug.png -v
+# or
+osascript peekaboo.scpt Safari --output /tmp/debug.png --verbose
 ```
 ---
@@ -520,17 +741,20 @@ osascript peekaboo.scpt "Safari" "/tmp/debug.png" --verbose
 | **Basic screenshots** | ✅ Full screen capture with app targeting |
 | **App targeting** | ✅ By name or bundle ID |
 | **Multi-format** | ✅ PNG, JPG, PDF support |
-| **App discovery** | ✅ `list` command shows running apps |
-| **Multi-window** | ✅ `--multi` captures all app windows |
+| **App discovery** | ✅ `list`/`ls` command shows running apps |
+| **Multi-window** | ✅ `-m`/`--multi` captures all app windows |
 | **Smart naming** | ✅ Descriptive filenames for windows |
-| **Window modes** | ✅ `--window` for front window only |
+| **Window modes** | ✅ `-w`/`--window` for front window only |
 | **Auto paths** | ✅ Optional output path with smart /tmp defaults |
 | **Smart filenames** | ✅ Model-friendly: app_name_timestamp format |
-| **AI Vision Analysis** | ✅ Local Ollama integration with auto-model detection |
+| **AI Vision Analysis** | ✅ Ollama + Claude CLI support with smart fallback |
 | **Smart AI Models** | ✅ Auto-picks best: qwen2.5vl > llava > phi3 > minicpm |
 | **Smart Image Compression** | ✅ Auto-resizes large images (>5MB → 2048px) for AI |
+| **AI Provider Selection** | ✅ Auto-detect or specify with `--provider` flag |
+| **Performance Tracking** | ✅ Shows analysis time for benchmarking |
 | **Cloud AI Integration** | ✅ Self-documenting for Claude, Windsurf, ChatGPT, etc. |
-| **Verbose logging** | ✅ `--verbose` for debugging |
+| **Quiet mode** | ✅ `-q`/`--quiet` for minimal output |
+| **Verbose logging** | ✅ `-v`/`--verbose` for debugging |
 ---
@@ -625,6 +849,7 @@ property verboseLogging : false -- Debug output
 ### 🤖 **AI-Powered Vision**
 - **Local analysis**: Private Ollama integration, no cloud
 - **Smart model selection**: Auto-picks best available model
+- **Multi-window intelligence**: Analyzes ALL windows automatically
 - **One or two-step**: Screenshot+analyze or analyze existing images
 - **Perfect for automation**: Visual testing, error detection, QA
@@ -645,9 +870,11 @@ Built in the style of the legendary **terminator.scpt** — because good pattern
 ```
 📁 Peekaboo/
-├── 🎯 peekaboo.scpt          # Main screenshot tool
-├── 🧪 test_screenshotter.sh  # Test suite
-└── 📖 README.md              # This awesomeness
+├── 🎯 peekaboo.scpt          # Main screenshot tool (v1.0)
+├── 🧪 test_peekaboo.sh       # Comprehensive test suite
+├── 📖 README.md              # This awesomeness
+└── 🎨 assets/
+    └── banner.png            # Project banner
 ```
 ---
diff --git a/peekaboo.scpt b/peekaboo.scpt
index 26b69c8..eedbe68 100755
--- a/peekaboo.scpt
+++ b/peekaboo.scpt
@@ -1,8 +1,12 @@
 #!/usr/bin/osascript
 --------------------------------------------------------------------------------
--- peekaboo_enhanced.scpt - v1.0.0 "Peekaboo Pro! 👀 → 📸 → 💾"
+-- peekaboo.scpt - v1.0.0 "Peekaboo Pro! 👀 → 📸 → 💾"
 -- Enhanced screenshot capture with multi-window support and app discovery
 -- Peekaboo—screenshot got you! Now you see it, now it's saved.
+--
+-- IMPORTANT: This script uses non-interactive screencapture methods
+-- Do NOT use flags like -o -W which require user interaction
+-- Instead use -l for specific window capture
 --------------------------------------------------------------------------------
 --#region Configuration Properties
@@ -17,6 +21,11 @@ property maxWindowTitleLength : 50
 property defaultVisionModel : "qwen2.5vl:7b"
 -- Prioritized list of vision models (best to fallback)
 property visionModelPriority : {"qwen2.5vl:7b", "llava:7b", "llava-phi3:3.8b", "minicpm-v:8b", "gemma3:4b", "llava:latest", "qwen2.5vl:3b", "llava:13b", "llava-llama3:8b"}
+-- AI Provider Configuration
+property aiProvider : "auto" -- "auto", "ollama", "claude"
+property claudeModel : "sonnet" -- default Claude model alias
+-- AI Analysis Timeout (90 seconds)
+property aiAnalysisTimeout : 90
 --#endregion Configuration Properties
 --#region Helper Functions
@@ -137,6 +146,83 @@ on trimWhitespace(theText)
     end repeat
     return newText
 end trimWhitespace
+
+on formatCaptureOutput(outputPath, appName, mode, isQuiet)
+    if isQuiet then
+        return outputPath
+    else
+        set msg to scriptInfoPrefix & "Screenshot captured successfully! 📸" & linefeed
+        set msg to msg & "• File: " & outputPath & linefeed
+        set msg to msg & "• App: " & appName & linefeed
+        set msg to msg & "• Mode: " & mode
+        return msg
+    end if
+end formatCaptureOutput
+
+on formatMultiOutput(capturedFiles, appName, isQuiet)
+    if isQuiet then
+        -- Just return paths separated by newlines
+        set paths to ""
+        repeat with fileInfo in capturedFiles
+            set filePath to item 1 of fileInfo
+            set paths to paths & filePath & linefeed
+        end repeat
+        return paths
+    else
+        set windowCount to count of capturedFiles
+        set msg to scriptInfoPrefix & "Multi-window capture successful! Captured " & windowCount & " window(s) for " & appName & ":" & linefeed
+        repeat with fileInfo in capturedFiles
+            set filePath to item 1 of fileInfo
+            set winTitle to item 2 of fileInfo
+            set msg to msg & " 📸 " & filePath & " → \"" & winTitle & "\"" & linefeed
+        end repeat
+        return msg
+    end if
+end formatMultiOutput
+
+on formatMultiWindowAnalysis(capturedFiles, analysisResults, appName, question, model, isQuiet)
+    if isQuiet then
+        -- In quiet mode, return condensed results
+        set output to ""
+        repeat with result in analysisResults
+            set winTitle to windowTitle of result
+            set answer to answer of result
+            set output to output & scriptInfoPrefix & "Window \"" & winTitle & "\": " & answer & linefeed
+        end repeat
+        return output
+    else
+        -- Full formatted output
+        set windowCount to count of capturedFiles
+        set msg to scriptInfoPrefix & "Multi-window AI Analysis Complete! 🤖" & linefeed & linefeed
+        set msg to msg & "📸 App: " & appName & " (" & windowCount & " windows)" & linefeed
+        set msg to msg & "❓ Question: " & question & linefeed
+        set msg to msg & "🤖 Model: " & model & linefeed & linefeed
+
+        set msg to msg & "💬 Results for each window:" & linefeed & linefeed
+
+        set windowNum to 1
+        repeat with result in analysisResults
+            set winTitle to windowTitle of result
+            set winIndex to windowIndex of result
+            set answer to answer of result
+            set success to success of result
+
+            set msg to msg & "🪟 Window " & windowNum & ": \"" & winTitle & "\"" & linefeed
+            if success then
+                set msg to msg & answer & linefeed & linefeed
+            else
+                set msg to msg & "⚠️ Analysis failed: " & answer & linefeed & linefeed
+            end if
+
+            set windowNum to windowNum + 1
+        end repeat
+
+        -- Add timing info if available
+        set msg to msg & scriptInfoPrefix & "Analysis of " & windowCount & " windows complete."
+ + return msg + end if +end formatMultiWindowAnalysis --#endregion Helper Functions --#region AI Analysis Functions @@ -152,6 +238,16 @@ on checkOllamaAvailable() end try end checkOllamaAvailable +on checkClaudeAvailable() + try + -- Check if claude command exists + do shell script "claude --version >/dev/null 2>&1" + return true + on error + return false + end try +end checkClaudeAvailable + on getAvailableVisionModels() set availableModels to {} try @@ -225,11 +321,14 @@ on getOllamaInstallInstructions() return instructions end getOllamaInstallInstructions -on analyzeImageWithAI(imagePath, question, requestedModel) +on analyzeImageWithOllama(imagePath, question, requestedModel) my logVerbose("Analyzing image with AI: " & imagePath) my logVerbose("Requested model: " & requestedModel) my logVerbose("Question: " & question) + -- Record start time + set startTime to do shell script "date +%s" + -- Check if Ollama is available if not my checkOllamaAvailable() then return my formatErrorMessage("Ollama Error", "Ollama is not installed or not in PATH." & linefeed & linefeed & my getOllamaInstallInstructions(), "ollama unavailable") @@ -278,7 +377,8 @@ on analyzeImageWithAI(imagePath, question, requestedModel) close access fileRef end try end try - set curlCmd to "curl -s -X POST http://localhost:11434/api/generate -H 'Content-Type: application/json' -d @" & quoted form of jsonTempFile + -- Add timeout to curl command (60 seconds) + set curlCmd to "curl -s -X POST http://localhost:11434/api/generate -H 'Content-Type: application/json' -d @" & quoted form of jsonTempFile & " --max-time " & aiAnalysisTimeout set response to do shell script curlCmd @@ -308,10 +408,29 @@ on analyzeImageWithAI(imagePath, question, requestedModel) error "Could not parse response: " & response end if - return scriptInfoPrefix & "AI Analysis Complete! 
πŸ€–" & linefeed & linefeed & "πŸ“Έ Image: " & imagePath & linefeed & "❓ Question: " & question & linefeed & "πŸ€– Model: " & modelToUse & linefeed & linefeed & "πŸ’¬ Answer:" & linefeed & aiResponse + -- Calculate elapsed time + set endTime to do shell script "date +%s" + set elapsedTime to (endTime as number) - (startTime as number) + -- Simple formatting - just show seconds + set elapsedTimeFormatted to elapsedTime as string + + set resultMsg to scriptInfoPrefix & "AI Analysis Complete! πŸ€–" & linefeed & linefeed + set resultMsg to resultMsg & "πŸ“Έ Image: " & imagePath & linefeed + set resultMsg to resultMsg & "❓ Question: " & question & linefeed + set resultMsg to resultMsg & "πŸ€– Model: " & modelToUse & linefeed & linefeed + set resultMsg to resultMsg & "πŸ’¬ Answer:" & linefeed & aiResponse & linefeed & linefeed + set resultMsg to resultMsg & scriptInfoPrefix & "Analysis via " & modelToUse & " took " & elapsedTimeFormatted & " sec." + + return resultMsg on error errMsg - if errMsg contains "model" and errMsg contains "not found" then + -- Calculate elapsed time even on error + set endTime to do shell script "date +%s" + set elapsedTime to (endTime as number) - (startTime as number) + + if errMsg contains "curl" and (errMsg contains "timed out" or errMsg contains "timeout" or elapsedTime β‰₯ aiAnalysisTimeout) then + return my formatErrorMessage("Timeout Error", "AI analysis timed out after " & aiAnalysisTimeout & " seconds." & linefeed & linefeed & "The model '" & modelToUse & "' may be too large or slow for your system." & linefeed & linefeed & "Try:" & linefeed & "β€’ Using a smaller model (e.g., llava-phi3:3.8b)" & linefeed & "β€’ Checking if Ollama is responding: ollama list" & linefeed & "β€’ Restarting Ollama service", "timeout") + else if errMsg contains "model" and errMsg contains "not found" then return my formatErrorMessage("Model Error", "Model '" & modelToUse & "' not found." 
& linefeed & linefeed & "Install it with: ollama pull " & modelToUse & linefeed & linefeed & my getOllamaInstallInstructions(), "model not found") else return my formatErrorMessage("Analysis Error", "Failed to analyze image: " & errMsg & linefeed & linefeed & "Make sure Ollama is running and the model is available.", "ollama execution") @@ -327,6 +446,99 @@ on escapeJSON(inputText) set escapedText to my replaceText(escapedText, tab, "\\t") return escapedText end escapeJSON + +on analyzeImageWithClaude(imagePath, question, modelAlias) + my logVerbose("Analyzing image with Claude: " & imagePath) + my logVerbose("Model: " & modelAlias) + my logVerbose("Question: " & question) + + -- Record start time + set startTime to do shell script "date +%s" + + -- Check if Claude is available + if not my checkClaudeAvailable() then + return my formatErrorMessage("Claude Error", "Claude CLI is not installed." & linefeed & linefeed & "Install it from: https://claude.ai/code", "claude unavailable") + end if + + -- Get Claude version + set claudeVersion to "" + try + set claudeVersion to do shell script "claude --version 2>/dev/null | head -1" + on error + set claudeVersion to "unknown" + end try + + try + -- Note: Claude CLI doesn't support direct image file analysis + -- This is a limitation of the current Claude CLI implementation + set errorMsg to "Claude CLI currently doesn't support direct image file analysis." & linefeed & linefeed + set errorMsg to errorMsg & "Claude can analyze images through:" & linefeed + set errorMsg to errorMsg & "• Copy/paste images in interactive mode" & linefeed + set errorMsg to errorMsg & "• MCP (Model Context Protocol) integrations" & linefeed & linefeed + set errorMsg to errorMsg & "For automated image analysis, please use Ollama with vision models instead." 
+ + -- Calculate elapsed time even for error + set endTime to do shell script "date +%s" + set elapsedTime to (endTime as number) - (startTime as number) + set elapsedTimeFormatted to elapsedTime as string + + set errorMsg to errorMsg & linefeed & linefeed & scriptInfoPrefix & "Claude " & claudeVersion & " check took " & elapsedTimeFormatted & " sec." + + return my formatErrorMessage("Claude Limitation", errorMsg, "feature not supported") + + on error errMsg + return my formatErrorMessage("Claude Analysis Error", "Failed to analyze image with Claude: " & errMsg, "claude execution") + end try +end analyzeImageWithClaude + +on analyzeImageWithAI(imagePath, question, requestedModel, requestedProvider) + my logVerbose("Starting AI analysis with smart provider selection") + my logVerbose("Requested provider: " & requestedProvider) + + -- Determine which AI provider to use + set ollamaAvailable to my checkOllamaAvailable() + set claudeAvailable to my checkClaudeAvailable() + + my logVerbose("Ollama available: " & ollamaAvailable) + my logVerbose("Claude available: " & claudeAvailable) + + -- If neither is available, provide helpful error + if not ollamaAvailable and not claudeAvailable then + set errorMsg to "Neither Ollama nor Claude CLI is installed." 
& linefeed & linefeed + set errorMsg to errorMsg & "Install one of these AI providers:" & linefeed & linefeed + set errorMsg to errorMsg & "🤖 Ollama (local, privacy-focused):" & linefeed + set errorMsg to errorMsg & my getOllamaInstallInstructions() & linefeed & linefeed + set errorMsg to errorMsg & "☁️ Claude CLI (cloud-based):" & linefeed + set errorMsg to errorMsg & "Install from: https://claude.ai/code" + return my formatErrorMessage("No AI Provider", errorMsg, "no ai provider") + end if + + -- Smart selection based on availability and preference + if requestedProvider is "ollama" and ollamaAvailable then + return my analyzeImageWithOllama(imagePath, question, requestedModel) + else if requestedProvider is "claude" and claudeAvailable then + return my analyzeImageWithClaude(imagePath, question, requestedModel) + else if requestedProvider is "auto" then + -- Auto mode: prefer Ollama, fallback to Claude + if ollamaAvailable then + return my analyzeImageWithOllama(imagePath, question, requestedModel) + else if claudeAvailable then + return my analyzeImageWithClaude(imagePath, question, requestedModel) + end if + else + -- Requested provider not available, try the other one + if ollamaAvailable then + my logVerbose("Requested provider not available, using Ollama instead") + return my analyzeImageWithOllama(imagePath, question, requestedModel) + else if claudeAvailable then + my logVerbose("Requested provider not available, using Claude instead") + return my analyzeImageWithClaude(imagePath, question, requestedModel) + end if + end if + + -- Should never reach here + return my formatErrorMessage("Provider Error", "Unable to determine AI provider", "provider selection") +end analyzeImageWithAI --#endregion AI Analysis Functions --#region App Discovery Functions @@ -593,10 +805,17 @@ on captureScreenshot(outputPath, captureMode, appName) set screencaptureCmd to "screencapture -x" if captureMode is "window" then - -- Use frontmost window without interaction - set 
screencaptureCmd to screencaptureCmd & " -o -W" + -- IMPORTANT: Do NOT use -o -W flags as they require user interaction! + -- Instead, get the window ID of the frontmost window programmatically + try + -- Get the window ID of the frontmost window of the frontmost app + set windowID to do shell script "osascript -e 'tell application \"System Events\" to get the id of the first window of (first process whose frontmost is true)' 2>/dev/null" + set screencaptureCmd to screencaptureCmd & " -l" & windowID + on error + -- Fallback to full screen if we can't get window ID + my logVerbose("Could not get window ID, falling back to full screen capture") + end try end if - -- Remove interactive mode - not suitable for unattended operation -- Add format flag if not PNG (default) if fileExt is not "png" then @@ -637,10 +856,16 @@ on captureMultipleWindows(appName, baseOutputPath) -- Get detailed window status first set windowStatus to my getAppWindowStatus(appName) - -- Check if it's an error - if (windowStatus starts with scriptInfoPrefix) then - return windowStatus -- Return the descriptive error - end if + -- Check if it's an error (string) or success (record) + try + set statusClass to class of windowStatus + if statusClass is text or statusClass is string then + -- It's an error message + return windowStatus + end if + on error + -- Assume it's a record and continue + end try -- Extract window info from successful status set windowInfo to windowInfo of windowStatus @@ -708,119 +933,181 @@ end captureMultipleWindows on run argv set appSpecificErrorOccurred to false try - my logVerbose("Starting Screenshotter Enhanced v2.0.0") + my logVerbose("Starting Peekaboo v2.0.0") set argCount to count argv - -- Handle special commands - if argCount = 1 then - set command to item 1 of argv - if command is "list" or command is "--list" or command is "-l" then - set appList to my listRunningApps() - return my formatAppList(appList) - else if command is "help" or command is "--help" or 
command is "-h" then - return my usageText() - end if - end if - - -- Handle analyze command for existing images (two-step workflow) - if argCount ≥ 3 then - set firstArg to item 1 of argv - if firstArg is "analyze" or firstArg is "--analyze" then - set imagePath to item 2 of argv - set question to item 3 of argv - set modelToUse to defaultVisionModel - - -- Check for custom model - if argCount ≥ 5 then - set modelFlag to item 4 of argv - if modelFlag is "--model" then - set modelToUse to item 5 of argv - end if - end if - - return my analyzeImageWithAI(imagePath, question, modelToUse) - end if - end if - - if argCount < 1 then return my usageText() - - -- Initialize variables - set captureMode to "screen" -- default + -- Initialize all variables + set command to "" -- "capture", "analyze", "list", "help" + set appIdentifier to "" + set outputPath to "" + set outputSpecified to false + set captureMode to "" -- will be determined + set forceFullscreen to false set multiWindow to false set analyzeMode to false set analysisQuestion to "" set visionModel to defaultVisionModel - set outputPath to "" - set pathProvided to false - set appIdentifier to "" + set requestedProvider to aiProvider + set outputFormat to "" + set quietMode to false - -- Parse all arguments to find options and app identifier + -- Handle no arguments - default to fullscreen + if argCount = 0 then + set command to "capture" + set forceFullscreen to true + else + -- Check first argument for commands + set firstArg to item 1 of argv + if firstArg is "list" or firstArg is "ls" then + return my formatAppList(my listRunningApps()) + else if firstArg is "help" or firstArg is "-h" or firstArg is "--help" then + return my usageText() + else if firstArg is "analyze" then + set command to "analyze" + -- analyze command requires at least image and question + if argCount < 3 then + return my formatErrorMessage("Argument Error", "analyze command requires: analyze \"question\"" & linefeed & linefeed & my 
usageText(), "validation") + end if + set appIdentifier to item 2 of argv -- actually the image path + set analysisQuestion to item 3 of argv + set analyzeMode to true + else + -- Regular capture command + set command to "capture" + -- Check if first arg is a flag or app name + if not (firstArg starts with "-") then + set appIdentifier to firstArg + end if + end if + end if + + -- Parse remaining arguments set i to 1 + if command is "analyze" then set i to 4 -- Skip "analyze image question" + if command is "capture" and appIdentifier is not "" then set i to 2 -- Skip app name + repeat while i ≤ argCount set arg to item i of argv - if arg is "--window" or arg is "-w" then - set captureMode to "window" - else if arg is "--multi" or arg is "-m" then - set multiWindow to true - else if arg is "--verbose" or arg is "-v" then - set verboseLogging to true - else if arg is "--ask" or arg is "--analyze" then - set analyzeMode to true + + -- Handle flags with values + if arg is "--output" or arg is "-o" then + if i < argCount then + set i to i + 1 + set outputPath to item i of argv + set outputSpecified to true + else + return my formatErrorMessage("Argument Error", arg & " requires a path parameter", "validation") + end if + else if arg is "--ask" or arg is "-a" then if i < argCount then set i to i + 1 set analysisQuestion to item i of argv + set analyzeMode to true else - return my formatErrorMessage("Argument Error", "--ask requires a question parameter" & linefeed & linefeed & my usageText(), "validation") + return my formatErrorMessage("Argument Error", arg & " requires a question parameter", "validation") end if else if arg is "--model" then if i < argCount then set i to i + 1 set visionModel to item i of argv else - return my formatErrorMessage("Argument Error", "--model requires a model name parameter" & linefeed & linefeed & my usageText(), "validation") + return my formatErrorMessage("Argument Error", "--model requires a model name parameter", "validation") end if 
- else if not (arg starts with "--") then - if appIdentifier is "" then - -- First non-option argument is the app identifier - set appIdentifier to arg - else if outputPath is "" then - -- Second non-option argument is the output path - set outputPath to arg - set pathProvided to true + else if arg is "--provider" then + if i < argCount then + set i to i + 1 + set requestedProvider to item i of argv + if requestedProvider is not "auto" and requestedProvider is not "ollama" and requestedProvider is not "claude" then + return my formatErrorMessage("Argument Error", "--provider must be 'auto', 'ollama', or 'claude'", "validation") + end if + else + return my formatErrorMessage("Argument Error", "--provider requires a provider name parameter", "validation") end if + else if arg is "--format" then + if i < argCount then + set i to i + 1 + set outputFormat to item i of argv + if outputFormat is not "png" and outputFormat is not "jpg" and outputFormat is not "pdf" then + return my formatErrorMessage("Argument Error", "--format must be 'png', 'jpg', or 'pdf'", "validation") + end if + else + return my formatErrorMessage("Argument Error", "--format requires a format parameter", "validation") + end if + + -- Handle boolean flags + else if arg is "--fullscreen" or arg is "-f" then + set forceFullscreen to true + else if arg is "--window" or arg is "-w" then + set captureMode to "window" + else if arg is "--multi" or arg is "-m" then + set multiWindow to true + else if arg is "--verbose" or arg is "-v" then + set verboseLogging to true + else if arg is "--quiet" or arg is "-q" then + set quietMode to true + + -- Handle positional argument (output path for old-style compatibility) + else if not (arg starts with "-") and command is "capture" and not outputSpecified then + set outputPath to arg + set outputSpecified to true end if + set i to i + 1 end repeat - -- Handle case where only analysis is requested (full screen mode) - if appIdentifier is "" and analyzeMode then - set 
appIdentifier to "fullscreen" + -- Handle analyze command + if command is "analyze" then + -- For analyze command, appIdentifier contains the image path + return my analyzeImageWithAI(appIdentifier, analysisQuestion, visionModel, requestedProvider) + end if + + -- For capture command, determine capture mode + if captureMode is "" then + if forceFullscreen or appIdentifier is "" then + set captureMode to "screen" + else + -- App specified, default to window capture + set captureMode to "window" + end if end if -- Set default output path if none provided - if not pathProvided then + if outputPath is "" then set timestamp to do shell script "date +%Y%m%d_%H%M%S" -- Create model-friendly filename with app name - if appIdentifier is "fullscreen" then + if appIdentifier is "" or appIdentifier is "fullscreen" then set appNameForFile to "fullscreen" else set appNameForFile to my sanitizeAppName(appIdentifier) end if - set outputPath to "/tmp/peekaboo_" & appNameForFile & "_" & timestamp & ".png" + + -- Determine extension based on format + set fileExt to outputFormat + if fileExt is "" then set fileExt to defaultScreenshotFormat + + set outputPath to "/tmp/peekaboo_" & appNameForFile & "_" & timestamp & "." & fileExt + else + -- Check if user specified a directory for multi-window mode + if multiWindow and outputPath ends with "/" then + set timestamp to do shell script "date +%Y%m%d_%H%M%S" + set appNameForFile to my sanitizeAppName(appIdentifier) + set fileExt to outputFormat + if fileExt is "" then set fileExt to defaultScreenshotFormat + set outputPath to outputPath & "peekaboo_" & appNameForFile & "_" & timestamp & "." & fileExt + else if outputFormat is not "" and not (outputPath ends with ("." & outputFormat)) then + -- Apply format if specified but not in path + set outputPath to outputPath & "." & outputFormat + end if end if - -- Validate arguments - if appIdentifier is "" then - return my formatErrorMessage("Argument Error", "App identifier cannot be empty." 
& linefeed & linefeed & my usageText(), "validation") - end if - - if pathProvided and not my isValidPath(outputPath) then - return my formatErrorMessage("Argument Error", "Output path must be an absolute path starting with '/'." & linefeed & linefeed & my usageText(), "validation") + -- Validate output path + if outputSpecified and not my isValidPath(outputPath) then + return my formatErrorMessage("Argument Error", "Output path must be an absolute path starting with '/'.", "validation") end if -- Resolve app identifier with detailed diagnostics - if appIdentifier is "fullscreen" then + if appIdentifier is "" or appIdentifier is "fullscreen" then set appInfo to {appName:"fullscreen", bundleID:"fullscreen", isRunning:true, resolvedBy:"fullscreen"} else set appInfo to my resolveAppIdentifier(appIdentifier) @@ -833,13 +1120,13 @@ on run argv set errorDetails to errorDetails & " This appears to be a bundle ID. Common issues:" & linefeed set errorDetails to errorDetails & "• Bundle ID may be incorrect (try 'com.apple.' prefix for system apps)" & linefeed set errorDetails to errorDetails & "• App may not be installed" & linefeed - set errorDetails to errorDetails & "• Use 'osascript peekaboo_enhanced.scpt list' to see available apps" + set errorDetails to errorDetails & "• Use 'osascript peekaboo.scpt list' to see available apps" else set errorDetails to errorDetails & " This appears to be an app name. 
Common issues:" & linefeed set errorDetails to errorDetails & "• App name may be incorrect (case-sensitive)" & linefeed set errorDetails to errorDetails & "• App may not be installed or running" & linefeed set errorDetails to errorDetails & "• Try the full app name (e.g., 'Activity Monitor' not 'Activity')" & linefeed - set errorDetails to errorDetails & "• Use 'osascript peekaboo_enhanced.scpt list' to see running apps" + set errorDetails to errorDetails & "• Use 'osascript peekaboo.scpt list' to see running apps" end if return my formatErrorMessage("App Resolution Error", errorDetails, "app resolution") @@ -853,41 +1140,99 @@ on run argv set frontError to my bringAppToFront(appInfo) if frontError is not "" then return frontError - -- Pre-capture window validation for better error messages - if multiWindow or captureMode is "window" then + -- Smart multi-window detection for AI analysis + if analyzeMode and resolvedAppName is not "fullscreen" and not forceFullscreen then + -- Check how many windows the app has set windowStatus to my getAppWindowStatus(resolvedAppName) - if (windowStatus starts with scriptInfoPrefix) then - -- Add context about what the user was trying to do - if multiWindow then - set contextError to "Multi-window capture failed: " & windowStatus - set contextError to contextError & linefeed & "💡 Suggestion: Try basic screenshot mode without --multi flag" - else - set contextError to "Window capture failed: " & windowStatus - set contextError to contextError & linefeed & "💡 Suggestion: Try full-screen capture mode without --window flag" + try + set statusClass to class of windowStatus + if statusClass is not text and statusClass is not string then + -- It's a success record + set totalWindows to totalWindows of windowStatus + if totalWindows > 1 and not multiWindow and captureMode is not "screen" then + -- Automatically enable multi-window mode for AI analysis + set multiWindow to true + my logVerbose("Auto-enabling multi-window 
mode for AI analysis (app has " & totalWindows & " windows)") + end if end if - return contextError - end if - - -- Log successful window detection - set statusMsg to message of windowStatus - my logVerbose("Window validation passed: " & statusMsg) + on error + -- Continue without auto-enabling + end try + end if + + -- Pre-capture window validation for better error messages + if (multiWindow or captureMode is "window") and resolvedAppName is not "fullscreen" then + set windowStatus to my getAppWindowStatus(resolvedAppName) + -- Check if it's an error (string starting with prefix) or success (record) + try + set statusClass to class of windowStatus + if statusClass is text or statusClass is string then + -- It's an error message + if multiWindow then + set contextError to "Multi-window capture failed: " & windowStatus + set contextError to contextError & linefeed & "💡 Suggestion: Try basic screenshot mode without --multi flag" + else + set contextError to "Window capture failed: " & windowStatus + set contextError to contextError & linefeed & "💡 Suggestion: Try full-screen capture mode without --window flag" + end if + return contextError + else + -- It's a success record + set statusMsg to message of windowStatus + my logVerbose("Window validation passed: " & statusMsg) + end if + on error + -- Fallback if type check fails + my logVerbose("Window validation status check bypassed") + end try end if -- Handle multi-window capture if multiWindow then set capturedFiles to my captureMultipleWindows(resolvedAppName, outputPath) - if capturedFiles starts with scriptInfoPrefix then - return capturedFiles -- Error message - else - set windowCount to count of capturedFiles - set resultMsg to scriptInfoPrefix & "Multi-window capture successful! 
Captured " & windowCount & " window(s) for " & resolvedAppName & ":" & linefeed + -- Check if it's an error (string) or success (list) + try + set capturedClass to class of capturedFiles + if capturedClass is text or capturedClass is string then + return capturedFiles -- Error message + end if + on error + -- Continue with list processing + end try + + -- If AI analysis requested, analyze all captured windows + if analyzeMode and (count of capturedFiles) > 0 then + set analysisResults to {} + set allSuccess to true + repeat with fileInfo in capturedFiles set filePath to item 1 of fileInfo - set winTitle to item 2 of fileInfo - set resultMsg to resultMsg & " 📸 " & filePath & " → \"" & winTitle & "\"" & linefeed + set windowTitle to item 2 of fileInfo + set windowIndex to item 3 of fileInfo + + set analysisResult to my analyzeImageWithAI(filePath, analysisQuestion, visionModel, requestedProvider) + + if analysisResult starts with scriptInfoPrefix and analysisResult contains "Analysis Complete" then + -- Extract just the answer part from the analysis + set answerStart to (offset of "💬 Answer:" in analysisResult) + 10 + set answerEnd to (offset of (scriptInfoPrefix & "Analysis via") in analysisResult) - 1 + if answerStart > 10 and answerEnd > answerStart then + set windowAnswer to text answerStart thru answerEnd of analysisResult + else + set windowAnswer to analysisResult + end if + set end of analysisResults to {windowTitle:windowTitle, windowIndex:windowIndex, answer:windowAnswer, success:true} + else + set allSuccess to false + set end of analysisResults to {windowTitle:windowTitle, windowIndex:windowIndex, answer:analysisResult, success:false} + end if end repeat - set resultMsg to resultMsg & linefeed & "💡 All windows captured with descriptive filenames. Each file shows a different window of " & resolvedAppName & "." 
- return resultMsg + + -- Format multi-window AI analysis results + return my formatMultiWindowAnalysis(capturedFiles, analysisResults, resolvedAppName, analysisQuestion, visionModel, quietMode) + else + -- Process successful capture without AI + return my formatMultiOutput(capturedFiles, resolvedAppName, quietMode) end if else -- Single capture @@ -900,7 +1245,7 @@ on run argv -- If AI analysis requested, analyze the screenshot if analyzeMode then - set analysisResult to my analyzeImageWithAI(screenshotResult, analysisQuestion, visionModel) + set analysisResult to my analyzeImageWithAI(screenshotResult, analysisQuestion, visionModel, requestedProvider) if analysisResult starts with scriptInfoPrefix and analysisResult contains "Analysis Complete" then -- Successful analysis return analysisResult @@ -910,7 +1255,7 @@ on run argv end if else -- Regular screenshot without analysis - return scriptInfoPrefix & "Screenshot captured successfully! 📸" & linefeed & "• File: " & screenshotResult & linefeed & "• App: " & resolvedAppName & linefeed & "• Mode: " & modeDescription & linefeed & "💡 The " & modeDescription & " of " & resolvedAppName & " has been saved." + return my formatCaptureOutput(screenshotResult, resolvedAppName, modeDescription, quietMode) end if end if end if @@ -925,60 +1270,72 @@ end run --#region Usage Function on usageText() set LF to linefeed - set scriptName to "peekaboo_enhanced.scpt" + set scriptName to "peekaboo.scpt" - set outText to scriptName & " - v1.0.0 \"Peekaboo Pro! 👀 → 📸 → 💾\" – Enhanced AppleScript Screenshot Utility" & LF & LF - set outText to outText & "Peekaboo—screenshot got you! Now you see it, now it's saved." & LF - set outText to outText & "Takes unattended screenshots with multi-window support and app discovery." & LF & LF + set outText to "Peekaboo v1.0.0 - Screenshot automation that actually works! 
👀 → 📸 → 💾" & LF & LF - set outText to outText & "Usage:" & LF - set outText to outText & " osascript " & scriptName & " \"\" [\"\"] [options]" & LF - set outText to outText & " osascript " & scriptName & " analyze \"\" \"\" [--model model_name]" & LF - set outText to outText & " osascript " & scriptName & " list" & LF - set outText to outText & " osascript " & scriptName & " help" & LF & LF + set outText to outText & "USAGE:" & LF + set outText to outText & " peekaboo [app] [options] # Screenshot app or fullscreen" & LF + set outText to outText & " peekaboo analyze \"question\" [opts] # Analyze existing image" & LF + set outText to outText & " peekaboo list # List running apps" & LF + set outText to outText & " peekaboo help # Show this help" & LF & LF - set outText to outText & "Parameters:" & LF - set outText to outText & " app_name_or_bundle_id: Application name (e.g., 'Safari') or bundle ID (e.g., 'com.apple.Safari')" & LF - set outText to outText & " output_path: Optional absolute path for screenshot file(s)" & LF - set outText to outText & " If not provided, saves to /tmp/peekaboo_appname_TIMESTAMP.png" & LF & LF + set outText to outText & "COMMANDS:" & LF + set outText to outText & " [app] App name or bundle ID (optional, defaults to fullscreen)" & LF + set outText to outText & " analyze Analyze existing image with AI vision" & LF + set outText to outText & " list, ls List all running apps with window info" & LF + set outText to outText & " help, -h Show this help message" & LF & LF - set outText to outText & "Options:" & LF - set outText to outText & " --window, -w: Capture frontmost window only" & LF - set outText to outText & " --multi, -m: Capture all windows with descriptive names" & LF - set outText to outText & " --ask \"question\": AI analysis of screenshot (requires Ollama)" & LF - set outText to outText & " --model model_name: Custom vision model (auto-detects best available)" & LF - set outText to outText & " --verbose, -v: Enable 
verbose logging" & LF & LF + set outText to outText & "OPTIONS:" & LF + set outText to outText & " -o, --output Output file or directory path" & LF + set outText to outText & " -f, --fullscreen Force fullscreen capture" & LF + set outText to outText & " -w, --window Single window capture (default with app)" & LF + set outText to outText & " -m, --multi Capture all app windows separately" & LF + set outText to outText & " -a, --ask \"question\" AI analysis of screenshot" & LF + set outText to outText & " --model AI model (e.g., llava:7b)" & LF + set outText to outText & " --provider AI provider: auto|ollama|claude" & LF + set outText to outText & " --format Output format: png|jpg|pdf" & LF + set outText to outText & " -v, --verbose Enable debug output" & LF + set outText to outText & " -q, --quiet Minimal output (just file path)" & LF & LF - set outText to outText & "Commands:" & LF - set outText to outText & " list: List all running apps with window titles" & LF - set outText to outText & " analyze: Analyze existing image with AI vision" & LF - set outText to outText & " help: Show this help message" & LF & LF + set outText to outText & "EXAMPLES:" & LF + set outText to outText & " # Basic captures" & LF + set outText to outText & " peekaboo # Fullscreen" & LF + set outText to outText & " peekaboo Safari # Safari window" & LF + set outText to outText & " peekaboo Safari -o ~/Desktop/safari.png # Specific path" & LF + set outText to outText & " peekaboo -f -o screenshot.jpg --format jpg # Fullscreen as JPG" & LF & LF - set outText to outText & "Examples:" & LF - set outText to outText & " # List running applications:" & LF - set outText to outText & " osascript " & scriptName & " list" & LF - set outText to outText & " # Screenshot Safari to /tmp with timestamp:" & LF - set outText to outText & " osascript " & scriptName & " \"Safari\"" & LF - set outText to outText & " # Full screen capture with custom path:" & LF - set outText to outText & " osascript " & 
scriptName & " \"Safari\" \"/Users/username/Desktop/safari.png\"" & LF - set outText to outText & " # Front window only:" & LF - set outText to outText & " osascript " & scriptName & " \"TextEdit\" \"/tmp/textedit.png\" --window" & LF - set outText to outText & " # All windows with descriptive names:" & LF - set outText to outText & " osascript " & scriptName & " \"Safari\" \"/tmp/safari_windows.png\" --multi" & LF - set outText to outText & " # One-step: Screenshot + AI analysis:" & LF - set outText to outText & " osascript " & scriptName & " \"Safari\" --ask \"What's on this page?\"" & LF - set outText to outText & " # Two-step: Analyze existing image:" & LF - set outText to outText & " osascript " & scriptName & " analyze \"/tmp/screenshot.png\" \"Describe what you see\"" & LF - set outText to outText & " # Custom model:" & LF - set outText to outText & " osascript " & scriptName & " \"Safari\" --ask \"Any errors?\" --model llava:13b" & LF & LF + set outText to outText & " # Multi-window capture" & LF + set outText to outText & " peekaboo Chrome -m # All Chrome windows" & LF + set outText to outText & " peekaboo Safari -m -o ~/screenshots/ # To directory" & LF & LF + + set outText to outText & " # AI analysis" & LF + set outText to outText & " peekaboo Safari -a \"What's on this page?\" # Screenshot + analyze" & LF + set outText to outText & " peekaboo -f -a \"Any errors visible?\" # Fullscreen + analyze" & LF + set outText to outText & " peekaboo analyze photo.png \"What is this?\" # Analyze existing" & LF + set outText to outText & " peekaboo Terminal -a \"Show the error\" --model llava:13b" & LF & LF + + set outText to outText & " # Other commands" & LF + set outText to outText & " peekaboo list # Show running apps" & LF + set outText to outText & " peekaboo help # This help" & LF & LF + + set outText to outText & "Note: When using with osascript, quote arguments and escape as needed:" & LF + set outText to outText & " osascript peekaboo.scpt Safari -a 
\"What's shown?\"" & LF & LF set outText to outText & "AI Analysis Features:" & LF - set outText to outText & " • Local inference with Ollama (private, no data sent to cloud)" & LF - set outText to outText & " • Auto-detects best available vision model from your Ollama install" & LF - set outText to outText & " • Priority: qwen2.5vl:7b > llava:7b > llava-phi3:3.8b > minicpm-v:8b" & LF + set outText to outText & " • Smart provider detection: auto-detects Ollama or Claude CLI" & LF + set outText to outText & " • Smart multi-window: Automatically analyzes ALL windows for multi-window apps" & LF + set outText to outText & " - App has 3 windows? Analyzes all 3 and reports on each" & LF + set outText to outText & " - Use -w flag to force single window analysis" & LF + set outText to outText & " • Ollama: Local inference with vision models (recommended)" & LF + set outText to outText & " - Supports direct image file analysis" & LF + set outText to outText & " - Priority: qwen2.5vl:7b > llava:7b > llava-phi3:3.8b > minicpm-v:8b" & LF + set outText to outText & " • Claude: Limited support (CLI doesn't analyze image files)" & LF + set outText to outText & " - Claude CLI detected but can't process image files directly" & LF + set outText to outText & " - Use Ollama for automated image analysis" & LF set outText to outText & " • One-step: Screenshot + analysis in single command" & LF set outText to outText & " • Two-step: Analyze existing images separately" & LF - set outText to outText & " • Detailed setup guide if models missing" & LF & LF + set outText to outText & " • Timeout protection: 90-second timeout prevents hanging" & LF & LF set outText to outText & "Multi-Window Features:" & LF set outText to outText & " • --multi creates separate files with descriptive names" & LF @@ -987,6 +1344,7 @@ on usageText() set outText to outText & " • Each window is focused before capture for accuracy" & LF & LF set outText to outText & "Notes:" & LF + set 
outText to outText & " • Default behavior: App specified = window capture, No app = full screen" & LF
 set outText to outText & " • Requires Screen Recording permission in System Preferences" & LF
 set outText to outText & " • Accessibility permission may be needed for window enumeration" & LF
 set outText to outText & " • Window titles longer than " & maxWindowTitleLength & " characters are truncated" & LF
diff --git a/test_peekaboo.sh b/test_peekaboo.sh
index 6e88c40..45982a1 100755
--- a/test_peekaboo.sh
+++ b/test_peekaboo.sh
@@ -166,9 +166,13 @@ run_ai_test() {
 # Build command arguments based on test type
 case "$test_type" in
 "one-step")
- cmd_args=("$app_or_image" "--ask" "$question")
+ cmd_args=("$app_or_image" "-a" "$question")
 if [[ -n "$model" ]]; then
- cmd_args+=(--model "$model")
+ if [[ "$model" == "--provider"* ]]; then
+ cmd_args+=($model)
+ else
+ cmd_args+=(--model "$model")
+ fi
 fi
 ;;
 "two-step")
@@ -181,13 +185,21 @@ run_ai_test() {
 fi
 cmd_args=("analyze" "$test_image" "$question")
 if [[ -n "$model" ]]; then
- cmd_args+=(--model "$model")
+ if [[ "$model" == "--provider"* ]]; then
+ cmd_args+=($model)
+ else
+ cmd_args+=(--model "$model")
+ fi
 fi
 ;;
 "analyze-only")
 cmd_args=("analyze" "$app_or_image" "$question")
 if [[ -n "$model" ]]; then
- cmd_args+=(--model "$model")
+ if [[ "$model" == "--provider"* ]]; then
+ cmd_args+=($model)
+ else
+ cmd_args+=(--model "$model")
+ fi
 fi
 ;;
 esac
@@ -205,6 +217,11 @@ run_ai_test() {
 if [[ $exit_code -eq 0 ]] && [[ "$result" == *"AI Analysis Complete"* ]]; then
 log_success "$test_name - AI analysis completed successfully"
 log_info " Model used: $(echo "$result" | grep "🤖 Model:" | cut -d: -f2 | xargs || echo "Unknown")"
+ # Check for timing info
+ if [[ "$result" == *"took"* && "$result" == *"sec."* ]]; then
+ local timing=$(echo "$result" | grep -o "took [0-9.]* sec\."
|| echo "")
+ log_info " Timing: $timing"
+ fi
 # Show first few words of AI response
 local ai_answer=$(echo "$result" | sed -n '/💬 Answer:/,$ p' | tail -n +2 | head -1 | cut -c1-60)
 if [[ -n "$ai_answer" ]]; then
@@ -232,6 +249,43 @@ run_ai_test() {
 log_warning "$test_name - Skipped (expected)"
 ((TESTS_FAILED--)) # Don't count as failure
 ;;
+ "timing")
+ if [[ $exit_code -eq 0 ]] && [[ "$result" == *"took"* && "$result" == *"sec."* ]]; then
+ local timing=$(echo "$result" | grep -o "took [0-9.]* sec\." || echo "")
+ log_success "$test_name - Timing info present: $timing"
+ else
+ log_error "$test_name - Expected timing info but not found"
+ fi
+ ;;
+ "multi-window-success")
+ if [[ $exit_code -eq 0 ]] && [[ "$result" == *"Multi-window AI Analysis Complete"* ]]; then
+ log_success "$test_name - Multi-window AI analysis completed successfully"
+ # Count analyzed windows
+ local window_count=$(echo "$result" | grep -c "🪟 Window" || echo "0")
+ log_info " Analyzed $window_count windows"
+ # Check for timing info
+ if [[ "$result" == *"Analysis of"* && "$result" == *"windows complete"* ]]; then
+ log_info " Multi-window analysis completed"
+ fi
+ elif [[ "$result" == *"AI Analysis Complete"* ]]; then
+ # Single window fallback (app might have closed windows)
+ log_success "$test_name - Completed (single window mode)"
+ else
+ log_error "$test_name - Expected multi-window analysis but got: $(echo "$result" | head -1)"
+ fi
+ ;;
+ "claude-limitation")
+ if [[ "$result" == *"Claude Limitation"* || "$result" == *"doesn't support direct image file analysis"* ]]; then
+ log_success "$test_name - Claude limitation correctly reported"
+ # Check for timing even in error
+ if [[ "$result" == *"took"* && "$result" == *"sec."* ]]; then
+ local timing=$(echo "$result" | grep -o "took [0-9.]* sec\."
|| echo "")
+ log_info " Timing: $timing"
+ fi
+ else
+ log_error "$test_name - Expected Claude limitation message"
+ fi
+ ;;
 esac
 echo ""
@@ -309,57 +363,112 @@ run_basic_tests() {
 log_info "=== BASIC FUNCTIONALITY TESTS ==="
 echo ""
- # Test 1: Basic Finder test (Classic)
- run_test "Classic: Basic Finder test" \
- "$PEEKABOO_CLASSIC" \
- "Finder" \
- "$TEST_OUTPUT_DIR/classic_finder_${TIMESTAMP}.png" \
- "success"
+ # Test 1: Basic app window capture
+ ((TESTS_RUN++))
+ log_info "Running test: Basic app window capture"
+ if result=$(osascript "$PEEKABOO_SCRIPT" Finder -q 2>&1); then
+ if [[ "$result" =~ ^/tmp/peekaboo_finder_[0-9_]+\.png$ ]]; then
+ log_success "Basic app window capture - Success"
+ else
+ log_error "Basic app window capture - Unexpected output: $result"
+ fi
+ else
+ log_error "Basic app window capture - Failed"
+ fi
+ echo ""
- # Test 2: Basic Finder test (Pro)
- run_test "Pro: Basic Finder test" \
- "$PEEKABOO_PRO" \
- "Finder" \
- "$TEST_OUTPUT_DIR/pro_finder_${TIMESTAMP}.png" \
- "success"
+ # Test 2: Fullscreen capture (no app)
+ ((TESTS_RUN++))
+ log_info "Running test: Fullscreen capture"
+ if result=$(osascript "$PEEKABOO_SCRIPT" -q 2>&1); then
+ if [[ "$result" =~ ^/tmp/peekaboo_fullscreen_[0-9_]+\.png$ ]]; then
+ log_success "Fullscreen capture - Success"
+ else
+ log_error "Fullscreen capture - Unexpected output: $result"
+ fi
+ else
+ log_error "Fullscreen capture - Failed"
+ fi
+ echo ""
- # Test 3: Bundle ID test
- run_test "Classic: Bundle ID test" \
- "$PEEKABOO_CLASSIC" \
- "com.apple.finder" \
- "$TEST_OUTPUT_DIR/classic_finder_bundle_${TIMESTAMP}.png" \
- "success"
+ # Test 3: Custom output path
+ ((TESTS_RUN++))
+ log_info "Running test: Custom output path"
+ local custom_path="$TEST_OUTPUT_DIR/custom_test_${TIMESTAMP}.png"
+ if result=$(osascript "$PEEKABOO_SCRIPT" Safari -o "$custom_path" -q 2>&1); then
+ if [[ "$result" == "$custom_path" ]] && [[ -f "$custom_path" ]]; then
+ log_success "Custom output path - File created
correctly"
+ else
+ log_error "Custom output path - Output mismatch or file missing"
+ fi
+ else
+ log_error "Custom output path - Failed"
+ fi
+ echo ""
- # Test 4: TextEdit test
- run_test "Classic: TextEdit test" \
- "$PEEKABOO_CLASSIC" \
- "TextEdit" \
- "$TEST_OUTPUT_DIR/classic_textedit_${TIMESTAMP}.png" \
- "success"
+ # Test 4: Bundle ID support
+ ((TESTS_RUN++))
+ log_info "Running test: Bundle ID support"
+ if result=$(osascript "$PEEKABOO_SCRIPT" com.apple.finder -q 2>&1); then
+ if [[ "$result" =~ ^/tmp/peekaboo_com_apple_finder_[0-9_]+\.png$ ]]; then
+ log_success "Bundle ID support - Success"
+ else
+ log_error "Bundle ID support - Unexpected output: $result"
+ fi
+ else
+ log_error "Bundle ID support - Failed"
+ fi
+ echo ""
 }
 run_format_tests() {
 log_info "=== FORMAT SUPPORT TESTS ==="
 echo ""
- # Test different formats
- run_test "Classic: PNG format" \
- "$PEEKABOO_CLASSIC" \
- "Finder" \
- "$TEST_OUTPUT_DIR/format_png_${TIMESTAMP}.png" \
- "success"
+ # Test 1: PNG format (default)
+ ((TESTS_RUN++))
+ log_info "Running test: PNG format (default)"
+ local png_path="$TEST_OUTPUT_DIR/format_test_${TIMESTAMP}.png"
+ if result=$(osascript "$PEEKABOO_SCRIPT" Finder -o "$png_path" -q 2>&1); then
+ if [[ -f "$png_path" ]]; then
+ log_success "PNG format - File created successfully"
+ else
+ log_error "PNG format - File not created"
+ fi
+ else
+ log_error "PNG format - Failed"
+ fi
+ echo ""
- run_test "Classic: JPG format" \
- "$PEEKABOO_CLASSIC" \
- "Finder" \
- "$TEST_OUTPUT_DIR/format_jpg_${TIMESTAMP}.jpg" \
- "success"
+ # Test 2: JPG format with --format flag
+ ((TESTS_RUN++))
+ log_info "Running test: JPG format with flag"
+ local jpg_base="$TEST_OUTPUT_DIR/format_jpg_${TIMESTAMP}"
+ if result=$(osascript "$PEEKABOO_SCRIPT" Finder -o "$jpg_base" --format jpg -q 2>&1); then
+ if [[ -f "${jpg_base}.jpg" ]]; then
+ log_success "JPG format - File created with correct extension"
+ else
+ log_error "JPG format - File not created or wrong extension"
+ fi
+ else
+ log_error "JPG format - Failed"
+ fi
+ echo ""
- run_test "Classic: PDF format" \
- "$PEEKABOO_CLASSIC" \
- "TextEdit" \
- "$TEST_OUTPUT_DIR/format_pdf_${TIMESTAMP}.pdf" \
- "success"
+ # Test 3: PDF format via extension
+ ((TESTS_RUN++))
+ log_info "Running test: PDF format via extension"
+ local pdf_path="$TEST_OUTPUT_DIR/format_pdf_${TIMESTAMP}.pdf"
+ if result=$(osascript "$PEEKABOO_SCRIPT" Finder -o "$pdf_path" -q 2>&1); then
+ if [[ -f "$pdf_path" ]]; then
+ log_success "PDF format - File created with auto-detected format"
+ else
+ log_error "PDF format - File not created"
+ fi
+ else
+ log_error "PDF format - Failed"
+ fi
+ echo ""
 run_test "Pro: No extension (default PNG)" \
 "$PEEKABOO_PRO" \
@@ -369,36 +478,68 @@ run_format_tests() {
 }
 run_advanced_tests() {
- log_info "=== ADVANCED PEEKABOO PRO TESTS ==="
+ log_info "=== ADVANCED FEATURE TESTS ==="
 echo ""
- # Test window mode
- run_test "Pro: Window mode test" \
- "$PEEKABOO_PRO" \
- "Finder" \
- "$TEST_OUTPUT_DIR/pro_window_${TIMESTAMP}.png" \
- "success" \
- "--window"
+ # Test 1: Multi-window mode
+ ((TESTS_RUN++))
+ log_info "Running test: Multi-window mode"
+ if result=$(osascript "$PEEKABOO_SCRIPT" Finder -m -o "$TEST_OUTPUT_DIR/" 2>&1); then
+ # Check if multiple files were created
+ local window_files=$(ls "$TEST_OUTPUT_DIR"/peekaboo_finder_*_window_*.png 2>/dev/null | wc -l)
+ if [[ $window_files -gt 0 ]]; then
+ log_success "Multi-window mode - Created $window_files window files"
+ else
+ log_error "Multi-window mode - No window files created"
+ fi
+ else
+ log_error "Multi-window mode - Failed: $result"
+ fi
+ echo ""
- # Test multi-window mode
- run_test "Pro: Multi-window mode" \
- "$PEEKABOO_PRO" \
- "Finder" \
- "$TEST_OUTPUT_DIR/pro_multi_${TIMESTAMP}.png" \
- "success" \
- "--multi"
+ # Test 2: Forced fullscreen with app
+ ((TESTS_RUN++))
+ log_info "Running test: Forced fullscreen with app"
+ if result=$(osascript "$PEEKABOO_SCRIPT" Safari -f -q 2>&1); then
+ if [[ "$result" =~
fullscreen ]]; then
+ log_success "Forced fullscreen - Correctly captured fullscreen despite app"
+ else
+ log_error "Forced fullscreen - Wrong capture mode"
+ fi
+ else
+ log_error "Forced fullscreen - Failed"
+ fi
+ echo ""
- # Test verbose mode
- run_test "Pro: Verbose mode" \
- "$PEEKABOO_PRO" \
- "Finder" \
- "$TEST_OUTPUT_DIR/pro_verbose_${TIMESTAMP}.png" \
- "success" \
- "--verbose"
+ # Test 3: Verbose mode
+ ((TESTS_RUN++))
+ log_info "Running test: Verbose mode"
+ if result=$(osascript "$PEEKABOO_SCRIPT" Finder -v -q 2>&1); then
+ # Verbose should still output just path in quiet mode
+ if [[ "$result" =~ ^/tmp/peekaboo_finder_[0-9_]+\.png$ ]]; then
+ log_success "Verbose mode - Works with quiet mode"
+ else
+ log_warning "Verbose mode - May have extra output"
+ fi
+ else
+ log_error "Verbose mode - Failed"
+ fi
+ echo ""
- # Test combined flags
- run_test "Pro: Window + Verbose" \
- "$PEEKABOO_PRO" \
+ # Test 4: Combined options
+ ((TESTS_RUN++))
+ log_info "Running test: Combined options"
+ local combo_path="$TEST_OUTPUT_DIR/combo_${TIMESTAMP}"
+ if result=$(osascript "$PEEKABOO_SCRIPT" TextEdit -w -o "$combo_path" --format jpg -v -q 2>&1); then
+ if [[ -f "${combo_path}.jpg" ]]; then
+ log_success "Combined options - All options work together"
+ else
+ log_error "Combined options - File not created correctly"
+ fi
+ else
+ log_error "Combined options - Failed"
+ fi
+ echo ""
 "TextEdit" \
 "$TEST_OUTPUT_DIR/pro_combined_${TIMESTAMP}.png" \
 "success" \
@@ -406,20 +547,65 @@ run_advanced_tests() {
 }
 run_discovery_tests() {
- log_info "=== APP DISCOVERY TESTS ==="
+ log_info "=== COMMAND TESTS ==="
 echo ""
- # Test list command
- run_command_test "Pro: List running apps" \
- "$PEEKABOO_PRO" \
- "list" \
- "Running Applications"
+ # Test 1: List command
+ ((TESTS_RUN++))
+ log_info "Running test: List command"
+ if result=$(osascript "$PEEKABOO_SCRIPT" list 2>&1); then
+ if [[ "$result" == *"Running Applications:"* ]]; then
+ local app_count=$(echo "$result"
| grep -c "^•" || echo "0")
+ log_success "List command - Found $app_count running applications"
+ else
+ log_error "List command - Unexpected output"
+ fi
+ else
+ log_error "List command - Failed"
+ fi
+ echo ""
- # Test help command
- run_command_test "Pro: Help command" \
- "$PEEKABOO_PRO" \
- "help" \
- "Peekaboo Pro"
+ # Test 2: ls alias
+ ((TESTS_RUN++))
+ log_info "Running test: ls command alias"
+ if result=$(osascript "$PEEKABOO_SCRIPT" ls 2>&1); then
+ if [[ "$result" == *"Running Applications:"* ]]; then
+ log_success "ls alias - Works correctly"
+ else
+ log_error "ls alias - Unexpected output"
+ fi
+ else
+ log_error "ls alias - Failed"
+ fi
+ echo ""
+
+ # Test 3: Help command
+ ((TESTS_RUN++))
+ log_info "Running test: Help command"
+ if result=$(osascript "$PEEKABOO_SCRIPT" help 2>&1); then
+ if [[ "$result" == *"USAGE:"* ]] && [[ "$result" == *"OPTIONS:"* ]]; then
+ log_success "Help command - Shows proper help text"
+ else
+ log_error "Help command - Missing expected sections"
+ fi
+ else
+ log_error "Help command - Failed"
+ fi
+ echo ""
+
+ # Test 4: -h flag
+ ((TESTS_RUN++))
+ log_info "Running test: -h help flag"
+ if result=$(osascript "$PEEKABOO_SCRIPT" -h 2>&1); then
+ if [[ "$result" == *"USAGE:"* ]]; then
+ log_success "-h flag - Shows help correctly"
+ else
+ log_error "-h flag - Unexpected output"
+ fi
+ else
+ log_error "-h flag - Failed"
+ fi
+ echo ""
 run_command_test "Classic: Help command" \
 "$PEEKABOO_CLASSIC" \
@@ -521,26 +707,66 @@ run_ai_analysis_tests() {
 log_info "=== AI VISION ANALYSIS TESTS ==="
 echo ""
- # Check if Ollama is available
- if !
check_ollama_available; then
- log_warning "Ollama not found - skipping AI analysis tests"
- log_info "To enable AI tests: curl -fsSL https://ollama.ai/install.sh | sh && ollama pull llava:7b"
+ # Check which providers are available
+ local ollama_available=false
+ local claude_available=false
+
+ if check_ollama_available; then
+ ollama_available=true
+ log_info "✅ Ollama is available"
+ else
+ log_warning "❌ Ollama not found"
+ fi
+
+ if command -v claude >/dev/null 2>&1; then
+ claude_available=true
+ log_info "✅ Claude CLI is available"
+ else
+ log_warning "❌ Claude CLI not found"
+ fi
+
+ if [[ "$ollama_available" == false && "$claude_available" == false ]]; then
+ log_warning "No AI providers found - skipping AI analysis tests"
+ log_info "To enable AI tests:"
+ log_info " • Ollama: curl -fsSL https://ollama.ai/install.sh | sh && ollama pull llava:7b"
+ log_info " • Claude: Install from https://claude.ai/code"
 return
 fi
- # Get available vision models
- local models=($(get_test_vision_models))
- if [[ ${#models[@]} -eq 0 ]]; then
- log_warning "No vision models found - skipping AI analysis tests"
- log_info "To enable AI tests: ollama pull qwen2.5vl:7b # or llava:7b"
- return
+ # Test Ollama if available
+ if [[ "$ollama_available" == true ]]; then
+ # Get available vision models
+ local models=($(get_test_vision_models))
+ if [[ ${#models[@]} -eq 0 ]]; then
+ log_warning "No Ollama vision models found - skipping Ollama tests"
+ log_info "To enable: ollama pull qwen2.5vl:7b # or llava:7b"
+ else
+ log_info "Found Ollama vision models: ${models[*]}"
+ local test_model="${models[0]}" # Use first available model
+
+ # Run Ollama-specific tests
+ run_ollama_tests "$test_model"
+ fi
 fi
- log_info "Found vision models: ${models[*]}"
- local test_model="${models[0]}" # Use first available model
+ # Test Claude if available
+ if [[ "$claude_available" == true ]]; then
+ run_claude_tests
+ fi
+
+ # Test provider selection
+ if [[ "$ollama_available" == true
|| "$claude_available" == true ]]; then
+ run_provider_selection_tests "$ollama_available" "$claude_available"
+ fi
+}
+
+run_ollama_tests() {
+ local test_model="$1"
+ log_info ""
+ log_info "--- Ollama Provider Tests ---"
 # Test 1: One-step AI analysis (screenshot + analyze)
- run_ai_test "AI: One-step screenshot + analysis" \
+ run_ai_test "Ollama: One-step screenshot + analysis" \
 "$PEEKABOO_SCRIPT" \
 "one-step" \
 "Finder" \
@@ -581,7 +807,54 @@ run_ai_analysis_tests() {
 log_warning "Could not create test screenshot for analysis"
 fi
- # Test 5: Error handling - invalid model
+ # Test 5: Multi-window AI analysis (if supported app available)
+ # Try to find an app with multiple windows
+ local multi_window_app=""
+ for app in "Safari" "Chrome" "Google Chrome" "Firefox" "TextEdit"; do
+ if osascript -e "tell application \"System Events\" to get name of every process whose name is \"$app\"" >/dev/null 2>&1; then
+ # Check if app has multiple windows
+ local window_count=$(osascript -e "tell application \"System Events\" to tell process \"$app\" to count windows" 2>/dev/null || echo "0")
+ if [[ $window_count -gt 1 ]]; then
+ multi_window_app="$app"
+ break
+ fi
+ fi
+ done
+
+ if [[ -n "$multi_window_app" ]]; then
+ run_ai_test "Ollama: Multi-window AI analysis" \
+ "$PEEKABOO_SCRIPT" \
+ "one-step" \
+ "$multi_window_app" \
+ "What's in each window?" \
+ "" \
+ "multi-window-success"
+ else
+ log_warning "No app with multiple windows found - skipping multi-window AI test"
+ fi
+
+ # Test 6: Force single window mode with -w flag
+ if [[ -n "$multi_window_app" ]]; then
+ ((TESTS_RUN++))
+ log_info "Running AI test: Force single window with -w flag"
+ if result=$(osascript "$PEEKABOO_SCRIPT" "$multi_window_app" -w -a "What's on this tab?"
2>&1); then
+ if [[ "$result" == *"AI Analysis Complete"* ]] && [[ "$result" != *"Multi-window"* ]]; then
+ log_success "Single window mode - Correctly analyzed only one window"
+ else
+ log_error "Single window mode - Unexpected result"
+ fi
+ else
+ log_error "Single window mode - Failed"
+ fi
+ echo ""
+ fi
+
+ # Note: Timeout testing (90 seconds) is not included in automated tests
+ # to avoid long test runs. The timeout is implemented with curl --max-time 90
+ log_info "Note: AI timeout protection (90s) is active but not tested here"
+ echo ""
+
+ # Test 7: Error handling - invalid model
 run_ai_test "AI: Invalid model error handling" \
 "$PEEKABOO_SCRIPT" \
 "one-step" \
@@ -590,7 +863,7 @@ run_ai_analysis_tests() {
 "nonexistent-model:999b" \
 "error"
- # Test 6: Error handling - invalid image path
+ # Test 8: Error handling - invalid image path
 run_ai_test "AI: Invalid image path error handling" \
 "$PEEKABOO_SCRIPT" \
 "analyze-only" \
@@ -618,10 +891,106 @@ run_ai_analysis_tests() {
 run_ai_test "AI: Complex question with special chars" \
 "$PEEKABOO_SCRIPT" \
 "one-step" \
- "Finder" \
- "Is there any text that says \"Finder\" or similar? What colors do you see?" \
+ "Safari" \
+ "What's the URL? Are there any errors?" \
 "" \
 "success"
+
+ # Test 9: Timing verification
+ run_ai_test "AI: Verify timing output" \
+ "$PEEKABOO_SCRIPT" \
+ "one-step" \
+ "Finder" \
+ "What is shown?" \
+ "" \
+ "timing"
+}
+
+run_claude_tests() {
+ log_info ""
+ log_info "--- Claude Provider Tests ---"
+
+ # Test 1: Claude provider selection
+ run_ai_test "Claude: Provider selection test" \
+ "$PEEKABOO_SCRIPT" \
+ "one-step" \
+ "Finder" \
+ "What do you see?"
\
+ "--provider claude" \
+ "claude-limitation"
+
+ # Test 2: Claude analyze command
+ run_ai_test "Claude: Analyze existing image" \
+ "$PEEKABOO_SCRIPT" \
+ "analyze-only" \
+ "$TEST_OUTPUT_DIR/test_image.png" \
+ "Describe this" \
+ "--provider claude" \
+ "claude-limitation"
+
+ # Test 3: Claude timing verification
+ ((TESTS_RUN++))
+ log_info "Running AI test: Claude timing verification"
+ local result
+ if result=$(osascript "$PEEKABOO_SCRIPT" "Safari" "--ask" "Test" "--provider" "claude" 2>&1); then
+ if [[ "$result" == *"check took"* && "$result" == *"sec."* ]]; then
+ log_success "Claude: Timing verification - Shows execution time"
+ else
+ log_error "Claude: Timing verification - Missing timing info"
+ fi
+ else
+ log_error "Claude: Timing verification - Unexpected error: $result"
+ fi
+ echo ""
+}
+
+run_provider_selection_tests() {
+ local ollama_available="$1"
+ local claude_available="$2"
+
+ log_info ""
+ log_info "--- Provider Selection Tests ---"
+
+ # Test auto selection
+ ((TESTS_RUN++))
+ log_info "Running AI test: Auto provider selection"
+ local result
+ if result=$(osascript "$PEEKABOO_SCRIPT" "Finder" "--ask" "What is shown?" 2>&1); then
+ if [[ "$ollama_available" == true ]]; then
+ if [[ "$result" == *"Model:"* || "$result" == *"Analysis via"* ]]; then
+ log_success "Provider: Auto selection - Correctly used Ollama"
+ else
+ log_error "Provider: Auto selection - Unexpected result"
+ fi
+ else
+ if [[ "$result" == *"Claude"* ]]; then
+ log_success "Provider: Auto selection - Correctly fell back to Claude"
+ else
+ log_error "Provider: Auto selection - Unexpected result"
+ fi
+ fi
+ fi
+ echo ""
+
+ # Test explicit Ollama selection
+ if [[ "$ollama_available" == true ]]; then
+ run_ai_test "Provider: Explicit Ollama selection" \
+ "$PEEKABOO_SCRIPT" \
+ "one-step" \
+ "Finder" \
+ "What do you see?"
\
+ "--provider ollama" \
+ "success"
+ fi
+
+ # Test invalid provider
+ run_ai_test "Provider: Invalid provider error" \
+ "$PEEKABOO_SCRIPT" \
+ "one-step" \
+ "Finder" \
+ "Test" \
+ "--provider invalid" \
+ "error"
 }
 run_performance_tests() {
@@ -751,29 +1120,29 @@ show_usage_tests() {
 echo ""
 # Test Classic usage
- log_info "Testing Classic usage output..."
- local classic_usage
- if classic_usage=$(osascript "$PEEKABOO_CLASSIC" 2>&1); then
- if [[ "$classic_usage" == *"Usage:"* ]] && [[ "$classic_usage" == *"Peekaboo"* ]]; then
- log_success "Classic usage test - Proper usage information displayed"
+ log_info "Testing help output..."
+ local help_output
+ if help_output=$(osascript "$PEEKABOO_SCRIPT" help 2>&1); then
+ if [[ "$help_output" == *"USAGE:"* ]] && [[ "$help_output" == *"Peekaboo"* ]]; then
+ log_success "Help output test - Proper usage information displayed"
 else
- log_error "Classic usage test - Usage information incomplete"
+ log_error "Help output test - Usage information incomplete"
 fi
 else
- log_error "Classic usage test - Failed to get usage output"
+ log_error "Help output test - Failed to get help output"
 fi
- # Test Pro usage
- log_info "Testing Pro usage output..."
- local pro_usage
- if pro_usage=$(osascript "$PEEKABOO_PRO" 2>&1); then
- if [[ "$pro_usage" == *"Usage:"* ]] && [[ "$pro_usage" == *"Peekaboo Pro"* ]]; then
- log_success "Pro usage test - Proper usage information displayed"
+ # Test no arguments (should capture fullscreen)
+ log_info "Testing no arguments (fullscreen capture)..."
+ local no_args_output
+ if no_args_output=$(osascript "$PEEKABOO_SCRIPT" -q 2>&1); then
+ if [[ "$no_args_output" =~ ^/tmp/peekaboo_fullscreen_[0-9_]+\.png$ ]]; then
+ log_success "No args test - Correctly captures fullscreen"
 else
- log_error "Pro usage test - Usage information incomplete"
+ log_error "No args test - Unexpected output: $no_args_output"
 fi
 else
- log_error "Pro usage test - Failed to get usage output"
+ log_error "No args test - Failed"
 fi
 echo ""
 }
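Review note: the three `cmd_args` branches in `run_ai_test` repeat the same provider-versus-model check, and the `grep -c ... || echo "0"` pattern used for `window_count` and `app_count` can yield two lines of output, because `grep -c` already prints `0` before exiting non-zero when nothing matches. One possible way to factor these out is sketched below. The helper names `append_model_args`, `extract_timing`, and `count_matches` are hypothetical and do not appear in the diff, and the sketch assumes a provider override always arrives as a single string such as `--provider claude`.

```bash
#!/usr/bin/env bash
# Sketch only: these helper names are hypothetical and not part of the diff.

# Append a provider override or a --model flag to the caller's cmd_args array.
append_model_args() {
    local model="$1"
    [[ -z "$model" ]] && return 0
    if [[ "$model" == "--provider"* ]]; then
        # Split "--provider claude" into two arguments without pathname expansion.
        local -a parts
        read -r -a parts <<< "$model"
        cmd_args+=("${parts[@]}")
    else
        cmd_args+=(--model "$model")
    fi
}

# Print the first "took N sec." fragment found in $1, or nothing if absent.
extract_timing() {
    grep -o "took [0-9.]* sec\." <<< "$1" | head -n 1
}

# Count the lines of $2 that match pattern $1; always prints exactly one number.
count_matches() {
    grep -c -- "$1" <<< "$2" || true
}

# Example usage with made-up values.
cmd_args=("Finder" "-a" "What do you see?")
append_model_args "--provider claude"
printf '%s\n' "${cmd_args[@]}"
extract_timing "Analysis took 3.2 sec. in total"
count_matches "Window" "nothing to see"
```

With helpers along these lines, each `case` branch would reduce to building the base `cmd_args` and calling `append_model_args "$model"`, and the unquoted `cmd_args+=($model)` expansion would no longer be needed.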