mirror of https://github.com/samsonjs/Peekaboo.git (synced 2026-04-27 15:07:41 +00:00)
🤖 Added AI Vision Analysis with Smart Model Selection
Major new feature: Local AI vision analysis with Ollama integration

Features:
• One-step: Screenshot + AI analysis in single command
• Two-step: Analyze existing images separately
• Smart model auto-detection with priority ranking
• Simplified ollama run commands (no complex API calls)
• Comprehensive error handling and setup instructions

Priority models: qwen2.5vl:7b > llava:7b > llava-phi3:3.8b > minicpm-v:8b

Examples:
osascript peekaboo.scpt "Safari" --ask "What's on this page?"
osascript peekaboo.scpt analyze "/tmp/shot.png" "Any errors?"

Perfect for automated testing, QA, and visual verification!

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
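The priority ranking described in the commit message can be sketched in shell as follows. This is a minimal illustration, not the code from `peekaboo.scpt`: `pick_model` and `PRIORITY` are hypothetical names, the installed models are passed in as arguments instead of being read from `ollama list`, and the script's extra fallback (first installed model whose name looks like a vision model) is left out.

```bash
#!/bin/sh
# Hypothetical sketch of priority-based model selection.
# PRIORITY mirrors the order given in the commit message.
PRIORITY="qwen2.5vl:7b llava:7b llava-phi3:3.8b minicpm-v:8b"

# pick_model REQUESTED INSTALLED...
# Prints the requested model if it is installed, otherwise the first
# installed model in PRIORITY order. Returns 1 when nothing matches.
pick_model() {
    requested=$1
    shift
    # 1. Honor an explicit request when that model is installed.
    for m in "$@"; do
        if [ -n "$requested" ] && [ "$m" = "$requested" ]; then
            printf '%s\n' "$m"
            return 0
        fi
    done
    # 2. Otherwise walk the priority list from best to fallback.
    for p in $PRIORITY; do
        for m in "$@"; do
            if [ "$m" = "$p" ]; then
                printf '%s\n' "$p"
                return 0
            fi
        done
    done
    return 1
}

# llava:13b is requested but not installed, so the best-ranked
# installed model is printed instead: llava:7b
pick_model "llava:13b" "minicpm-v:8b" "llava:7b"
```

In practice the installed list would come from something like the `ollama list | tail -n +2 | awk '{print $1}'` pipeline that the commit itself uses.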
parent 939c60aaaf
commit a5132f53c1
2 changed files with 288 additions and 6 deletions
92
README.md
@@ -18,6 +18,7 @@
 - 💥 **Zero interaction**: 100% unattended operation
 - 🧠 **Smart filenames**: Model-friendly names with app info
 - ⚡ **Optimized speed**: 70% faster capture delays
+- 🤖 **AI Vision Analysis**: Local Ollama integration with auto-model detection
 
 ---
 
@@ -53,6 +54,12 @@ osascript peekaboo.scpt "Chrome" "/tmp/chrome.png" --multi
 
 # 🪟 Just the front window
 osascript peekaboo.scpt "TextEdit" "/tmp/textedit.png" --window
+
+# 🤖 AI analysis: Screenshot + question in one step
+osascript peekaboo.scpt "Safari" --ask "What's on this page?"
+
+# 🔍 Analyze existing image
+osascript peekaboo.scpt analyze "/tmp/screenshot.png" "Any errors visible?"
 ```
 
 ---
@@ -125,6 +132,60 @@ osascript peekaboo.scpt "Safari" "/tmp/shot.pdf"
 
 ---
 
+## 🤖 **AI VISION ANALYSIS**
+
+Peekaboo integrates with **Ollama** for local AI vision analysis - ask questions about your screenshots!
+
+### 🚀 **One-Step: Screenshot + Analysis**
+```bash
+# Take screenshot and analyze it in one command
+osascript peekaboo.scpt "Safari" --ask "What's the main content on this page?"
+osascript peekaboo.scpt "Terminal" --ask "Any error messages visible?"
+osascript peekaboo.scpt "Xcode" --ask "Is the build successful?"
+osascript peekaboo.scpt "Chrome" --ask "What product is being shown?" --model llava:13b
+```
+
+### 🔍 **Two-Step: Analyze Existing Images**
+```bash
+# Analyze screenshots you already have
+osascript peekaboo.scpt analyze "/tmp/screenshot.png" "Describe what you see"
+osascript peekaboo.scpt analyze "/path/error.png" "What error is shown?"
+osascript peekaboo.scpt analyze "/Desktop/ui.png" "Any UI issues?" --model qwen2.5vl:7b
+```
+
+### 🧠 **Smart Model Selection**
+Peekaboo automatically picks the best available vision model:
+
+**Priority order:**
+1. `qwen2.5vl:7b` (6GB) - Best doc/chart understanding
+2. `llava:7b` (4.7GB) - Solid all-rounder
+3. `llava-phi3:3.8b` (2.9GB) - Tiny but chatty
+4. `minicpm-v:8b` (5.5GB) - Killer OCR
+5. `gemma3:4b` (3.3GB) - Multilingual support
+
+### ⚡ **Quick Setup**
+```bash
+# Install Ollama
+curl -fsSL https://ollama.ai/install.sh | sh
+
+# Pull a vision model (pick one)
+ollama pull qwen2.5vl:7b # Recommended: best overall
+ollama pull llava:7b # Popular: good balance
+ollama pull llava-phi3:3.8b # Lightweight: low RAM
+
+# Ready to analyze!
+osascript peekaboo.scpt "Safari" --ask "What's on this page?"
+```
+
+**Perfect for:**
+- 🧪 Automated UI testing
+- 📊 Dashboard monitoring
+- 🐛 Error detection
+- 📸 Content verification
+- 🔍 Visual QA automation
+
+---
+
 ## 🧠 **SMART FILENAMES**
 
 Peekaboo automatically generates **model-friendly** filenames that are perfect for automation:
@@ -214,18 +275,38 @@ osascript peekaboo.scpt "Safari" "/docs/browser.png" --multi
 osascript peekaboo.scpt "Your App"
 # → /tmp/peekaboo_your_app_20250522_143052.png
 
+# Automated visual testing with AI
+osascript peekaboo.scpt "Your App" --ask "Any error messages or crashes visible?"
+osascript peekaboo.scpt "Your App" --ask "Is the login screen displayed correctly?"
+
 # Custom path with timestamp
 osascript peekaboo.scpt "Your App" "/test-results/app-$(date +%s).png"
 ```
 
 ### 🎬 **Content Creation**
 ```bash
-# Before/after shots
+# Before/after shots with AI descriptions
+osascript peekaboo.scpt "Photoshop" --ask "Describe the current design state"
+# ... do your work ...
+osascript peekaboo.scpt "Photoshop" --ask "What changes were made to the design?"
+
+# Traditional before/after shots
 osascript peekaboo.scpt "Photoshop" "/content/before.png"
 # ... do your work ...
 osascript peekaboo.scpt "Photoshop" "/content/after.png"
 ```
 
+### 🧪 **Automated QA & Testing**
+```bash
+# Visual regression testing
+osascript peekaboo.scpt "Your App" --ask "Does the UI look correct?"
+osascript peekaboo.scpt "Safari" --ask "Are there any broken images or layout issues?"
+osascript peekaboo.scpt "Terminal" --ask "Any red error text visible?"
+
+# Dashboard monitoring
+osascript peekaboo.scpt analyze "/tmp/dashboard.png" "Are all metrics green?"
+```
+
 ---
 
 ## 🚨 **TROUBLESHOOTING**
@@ -268,6 +349,8 @@ osascript peekaboo.scpt "Safari" "/tmp/debug.png" --verbose
 | **Window modes** | ✅ `--window` for front window only |
 | **Auto paths** | ✅ Optional output path with smart /tmp defaults |
 | **Smart filenames** | ✅ Model-friendly: app_name_timestamp format |
+| **AI Vision Analysis** | ✅ Local Ollama integration with auto-model detection |
+| **Smart AI Models** | ✅ Auto-picks best: qwen2.5vl > llava > phi3 > minicpm |
 | **Verbose logging** | ✅ `--verbose` for debugging |
 
 ---
@@ -318,6 +401,7 @@ property verboseLogging : false -- Debug output
 - **Smart filenames**: Model-friendly with app names
 - **Smart targeting**: Works with app names OR bundle IDs
 - **Smart delays**: Optimized for speed (70% faster)
+- **Smart AI analysis**: Auto-detects best vision model
 - Auto-launches sleeping apps and brings them forward
 
 ### 🎭 **Multi-Window Mastery**
@@ -331,6 +415,12 @@ property verboseLogging : false -- Debug output
 - **0.1s multi-window focus** (down from 0.3s)
 - Responsive and practical for daily use
 
+### 🤖 **AI-Powered Vision**
+- **Local analysis**: Private Ollama integration, no cloud
+- **Smart model selection**: Auto-picks best available model
+- **One or two-step**: Screenshot+analyze or analyze existing images
+- **Perfect for automation**: Visual testing, error detection, QA
+
 ### 🔍 **Discovery Built-In**
 - See exactly what's running
 - Get precise window titles
202
peekaboo.scpt
@@ -13,6 +13,10 @@ property windowActivationDelay : 0.2
 property enhancedErrorReporting : true
 property verboseLogging : false
 property maxWindowTitleLength : 50
+-- AI Analysis Configuration
+property defaultVisionModel : "qwen2.5vl:7b"
+-- Prioritized list of vision models (best to fallback)
+property visionModelPriority : {"qwen2.5vl:7b", "llava:7b", "llava-phi3:3.8b", "minicpm-v:8b", "gemma3:4b", "llava:latest", "qwen2.5vl:3b", "llava:13b", "llava-llama3:8b"}
 --#endregion Configuration Properties
 
 --#region Helper Functions
@@ -135,6 +139,125 @@ on trimWhitespace(theText)
 end trimWhitespace
 --#endregion Helper Functions
 
+--#region AI Analysis Functions
+on checkOllamaAvailable()
+	try
+		do shell script "ollama --version >/dev/null 2>&1"
+		return true
+	on error
+		return false
+	end try
+end checkOllamaAvailable
+
+on getAvailableVisionModels()
+	set availableModels to {}
+	try
+		set ollamaList to do shell script "ollama list 2>/dev/null | tail -n +2 | awk '{print $1}' | grep -v '^$'"
+		set modelLines to paragraphs of ollamaList
+		repeat with modelLine in modelLines
+			set modelName to contents of modelLine
+			if modelName is not "" then
+				set end of availableModels to modelName
+			end if
+		end repeat
+	on error
+		-- Return empty list if ollama list fails
+	end try
+	return availableModels
+end getAvailableVisionModels
+
+on findBestVisionModel(requestedModel)
+	my logVerbose("Finding best vision model, requested: " & requestedModel)
+
+	set availableModels to my getAvailableVisionModels()
+	my logVerbose("Available models: " & (availableModels as string))
+
+	-- If specific model requested and available, use it
+	if requestedModel is not defaultVisionModel then
+		repeat with availModel in availableModels
+			if contents of availModel is requestedModel then
+				my logVerbose("Using requested model: " & requestedModel)
+				return requestedModel
+			end if
+		end repeat
+		-- Requested model not found, will fall back to priority list
+		my logVerbose("Requested model '" & requestedModel & "' not found, checking priority list")
+	end if
+
+	-- Find best available model from priority list
+	repeat with priorityModel in visionModelPriority
+		repeat with availModel in availableModels
+			if contents of availModel is contents of priorityModel then
+				my logVerbose("Using priority model: " & contents of priorityModel)
+				return contents of priorityModel
+			end if
+		end repeat
+	end repeat
+
+	-- No priority models available, use first available vision model
+	repeat with availModel in availableModels
+		set modelName to contents of availModel
+		if modelName contains "llava" or modelName contains "qwen" or modelName contains "gemma" or modelName contains "minicpm" then
+			my logVerbose("Using first available vision model: " & modelName)
+			return modelName
+		end if
+	end repeat
+
+	-- No vision models found
+	return ""
+end findBestVisionModel
+
+on getOllamaInstallInstructions()
+	set instructions to scriptInfoPrefix & "AI Analysis requires Ollama with a vision model." & linefeed & linefeed
+	set instructions to instructions & "🚀 Quick Setup:" & linefeed
+	set instructions to instructions & "1. Install Ollama: curl -fsSL https://ollama.ai/install.sh | sh" & linefeed
+	set instructions to instructions & "2. Pull a vision model: ollama pull " & defaultVisionModel & linefeed
+	set instructions to instructions & "3. Models are ready to use!" & linefeed & linefeed
+	set instructions to instructions & "💡 Recommended models:" & linefeed
+	set instructions to instructions & " • qwen2.5vl:7b (6GB) - Best doc/chart understanding" & linefeed
+	set instructions to instructions & " • llava:7b (4.7GB) - Solid all-rounder" & linefeed
+	set instructions to instructions & " • llava-phi3:3.8b (2.9GB) - Tiny but chatty" & linefeed
+	set instructions to instructions & " • minicpm-v:8b (5.5GB) - Killer OCR" & linefeed & linefeed
+	set instructions to instructions & "Then retry your Peekaboo command with --ask or --analyze!"
+	return instructions
+end getOllamaInstallInstructions
+
+on analyzeImageWithAI(imagePath, question, requestedModel)
+	my logVerbose("Analyzing image with AI: " & imagePath)
+	my logVerbose("Requested model: " & requestedModel)
+	my logVerbose("Question: " & question)
+
+	-- Check if Ollama is available
+	if not my checkOllamaAvailable() then
+		return my formatErrorMessage("Ollama Error", "Ollama is not installed or not in PATH." & linefeed & linefeed & my getOllamaInstallInstructions(), "ollama unavailable")
+	end if
+
+	-- Find best available vision model
+	set modelToUse to my findBestVisionModel(requestedModel)
+	if modelToUse is "" then
+		return my formatErrorMessage("Model Error", "No vision models found." & linefeed & linefeed & my getOllamaInstallInstructions(), "no vision models")
+	end if
+
+	-- Use ollama run command (much simpler than API)
+	try
+		my logVerbose("Using model: " & modelToUse)
+		set ollamaCmd to "ollama run " & quoted form of modelToUse & " --image " & quoted form of imagePath & " " & quoted form of question
+		my logVerbose("Running: " & ollamaCmd)
+
+		set aiResponse to do shell script ollamaCmd
+
+		return scriptInfoPrefix & "AI Analysis Complete! 🤖" & linefeed & linefeed & "📸 Image: " & imagePath & linefeed & "❓ Question: " & question & linefeed & "🤖 Model: " & modelToUse & linefeed & linefeed & "💬 Answer:" & linefeed & aiResponse
+
+	on error errMsg
+		if errMsg contains "model" and errMsg contains "not found" then
+			return my formatErrorMessage("Model Error", "Model '" & modelToUse & "' not found." & linefeed & linefeed & "Install it with: ollama pull " & modelToUse & linefeed & linefeed & my getOllamaInstallInstructions(), "model not found")
+		else
+			return my formatErrorMessage("Analysis Error", "Failed to analyze image: " & errMsg & linefeed & linefeed & "Make sure Ollama is running and the model is available.", "ollama execution")
+		end if
+	end try
+end analyzeImageWithAI
+--#endregion AI Analysis Functions
+
 --#region App Discovery Functions
 on listRunningApps()
 	set appList to {}
@@ -523,6 +646,26 @@ on run argv
 		end if
 	end if
 
+	-- Handle analyze command for existing images (two-step workflow)
+	if argCount ≥ 3 then
+		set firstArg to item 1 of argv
+		if firstArg is "analyze" or firstArg is "--analyze" then
+			set imagePath to item 2 of argv
+			set question to item 3 of argv
+			set modelToUse to defaultVisionModel
+
+			-- Check for custom model
+			if argCount ≥ 5 then
+				set modelFlag to item 4 of argv
+				if modelFlag is "--model" then
+					set modelToUse to item 5 of argv
+				end if
+			end if
+
+			return my analyzeImageWithAI(imagePath, question, modelToUse)
+		end if
+	end if
+
 	if argCount < 1 then return my usageText()
 
 	set appIdentifier to item 1 of argv
@@ -538,19 +681,38 @@ on run argv
 	end if
 	set captureMode to "screen" -- default
 	set multiWindow to false
+	set analyzeMode to false
+	set analysisQuestion to ""
+	set visionModel to defaultVisionModel
 
 	-- Parse additional options
 	if argCount > 2 then
-		repeat with i from 3 to argCount
+		set i to 3
+		repeat while i ≤ argCount
 			set arg to item i of argv
 			if arg is "--window" or arg is "-w" then
 				set captureMode to "window"
-				-- Remove interactive mode option
 			else if arg is "--multi" or arg is "-m" then
 				set multiWindow to true
 			else if arg is "--verbose" or arg is "-v" then
 				set verboseLogging to true
+			else if arg is "--ask" or arg is "--analyze" then
+				set analyzeMode to true
+				if i < argCount then
+					set i to i + 1
+					set analysisQuestion to item i of argv
+				else
+					return my formatErrorMessage("Argument Error", "--ask requires a question parameter" & linefeed & linefeed & my usageText(), "validation")
+				end if
+			else if arg is "--model" then
+				if i < argCount then
+					set i to i + 1
+					set visionModel to item i of argv
+				else
+					return my formatErrorMessage("Argument Error", "--model requires a model name parameter" & linefeed & linefeed & my usageText(), "validation")
+				end if
 			end if
+			set i to i + 1
 		end repeat
 	end if
 
@@ -638,7 +800,20 @@ on run argv
 			set modeDescription to "full screen"
 			if captureMode is "window" then set modeDescription to "front window only"
 
-			return scriptInfoPrefix & "Screenshot captured successfully! 📸" & linefeed & "• File: " & screenshotResult & linefeed & "• App: " & resolvedAppName & linefeed & "• Mode: " & modeDescription & linefeed & "💡 The " & modeDescription & " of " & resolvedAppName & " has been saved."
+			-- If AI analysis requested, analyze the screenshot
+			if analyzeMode then
+				set analysisResult to my analyzeImageWithAI(screenshotResult, analysisQuestion, visionModel)
+				if analysisResult starts with scriptInfoPrefix and analysisResult contains "Analysis Complete" then
+					-- Successful analysis
+					return analysisResult
+				else
+					-- Analysis failed, return screenshot success + analysis error
+					return scriptInfoPrefix & "Screenshot captured successfully! 📸" & linefeed & "• File: " & screenshotResult & linefeed & "• App: " & resolvedAppName & linefeed & "• Mode: " & modeDescription & linefeed & linefeed & "⚠️ AI Analysis failed:" & linefeed & analysisResult
+				end if
+			else
+				-- Regular screenshot without analysis
+				return scriptInfoPrefix & "Screenshot captured successfully! 📸" & linefeed & "• File: " & screenshotResult & linefeed & "• App: " & resolvedAppName & linefeed & "• Mode: " & modeDescription & linefeed & "💡 The " & modeDescription & " of " & resolvedAppName & " has been saved."
+			end if
 		end if
 	end if
 
@@ -660,6 +835,7 @@ on usageText()
 
 	set outText to outText & "Usage:" & LF
 	set outText to outText & " osascript " & scriptName & " \"<app_name_or_bundle_id>\" [\"<output_path>\"] [options]" & LF
+	set outText to outText & " osascript " & scriptName & " analyze \"<image_path>\" \"<question>\" [--model model_name]" & LF
 	set outText to outText & " osascript " & scriptName & " list" & LF
 	set outText to outText & " osascript " & scriptName & " help" & LF & LF
 
@@ -670,12 +846,14 @@ on usageText()
 
 	set outText to outText & "Options:" & LF
 	set outText to outText & " --window, -w: Capture frontmost window only" & LF
-	set outText to outText & " --interactive, -i: Interactive window selection" & LF
 	set outText to outText & " --multi, -m: Capture all windows with descriptive names" & LF
+	set outText to outText & " --ask \"question\": AI analysis of screenshot (requires Ollama)" & LF
+	set outText to outText & " --model model_name: Custom vision model (auto-detects best available)" & LF
 	set outText to outText & " --verbose, -v: Enable verbose logging" & LF & LF
 
 	set outText to outText & "Commands:" & LF
 	set outText to outText & " list: List all running apps with window titles" & LF
+	set outText to outText & " analyze: Analyze existing image with AI vision" & LF
 	set outText to outText & " help: Show this help message" & LF & LF
 
 	set outText to outText & "Examples:" & LF
@@ -688,7 +866,21 @@ on usageText()
 	set outText to outText & " # Front window only:" & LF
 	set outText to outText & " osascript " & scriptName & " \"TextEdit\" \"/tmp/textedit.png\" --window" & LF
 	set outText to outText & " # All windows with descriptive names:" & LF
-	set outText to outText & " osascript " & scriptName & " \"Safari\" \"/tmp/safari_windows.png\" --multi" & LF & LF
+	set outText to outText & " osascript " & scriptName & " \"Safari\" \"/tmp/safari_windows.png\" --multi" & LF
+	set outText to outText & " # One-step: Screenshot + AI analysis:" & LF
+	set outText to outText & " osascript " & scriptName & " \"Safari\" --ask \"What's on this page?\"" & LF
+	set outText to outText & " # Two-step: Analyze existing image:" & LF
+	set outText to outText & " osascript " & scriptName & " analyze \"/tmp/screenshot.png\" \"Describe what you see\"" & LF
+	set outText to outText & " # Custom model:" & LF
+	set outText to outText & " osascript " & scriptName & " \"Safari\" --ask \"Any errors?\" --model llava:13b" & LF & LF
+
+	set outText to outText & "AI Analysis Features:" & LF
+	set outText to outText & " • Local inference with Ollama (private, no data sent to cloud)" & LF
+	set outText to outText & " • Auto-detects best available vision model from your Ollama install" & LF
+	set outText to outText & " • Priority: qwen2.5vl:7b > llava:7b > llava-phi3:3.8b > minicpm-v:8b" & LF
+	set outText to outText & " • One-step: Screenshot + analysis in single command" & LF
+	set outText to outText & " • Two-step: Analyze existing images separately" & LF
+	set outText to outText & " • Detailed setup guide if models missing" & LF & LF
 
 	set outText to outText & "Multi-Window Features:" & LF
 	set outText to outText & " • --multi creates separate files with descriptive names" & LF
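The option-parsing change in `on run argv` replaces `repeat with i from 3 to argCount` with a manual `repeat while` loop so that `--ask` and `--model` can consume the argument that follows them. The same pattern can be sketched in shell. This is only an illustration under simplifying assumptions: `parse_opts` and its output format are hypothetical, and `--multi` and `--verbose` are left out.

```bash
#!/bin/sh
# Hypothetical sketch: flags that take a value consume the next argument.
parse_opts() {
    mode="screen"
    question=""
    model="qwen2.5vl:7b"   # same default as defaultVisionModel in the diff
    while [ "$#" -gt 0 ]; do
        case $1 in
            --window|-w)
                mode="window"
                ;;
            --ask|--analyze)
                # A value must follow the flag, otherwise fail validation.
                if [ "$#" -lt 2 ]; then
                    echo "error: --ask requires a question parameter" >&2
                    return 2
                fi
                question=$2
                shift   # skip the value that was just consumed
                ;;
            --model)
                if [ "$#" -lt 2 ]; then
                    echo "error: --model requires a model name parameter" >&2
                    return 2
                fi
                model=$2
                shift
                ;;
        esac
        shift   # advance past the flag itself
    done
    printf 'mode=%s question=%s model=%s\n' "$mode" "$question" "$model"
}

# Prints: mode=window question=Any errors? model=llava:13b
parse_opts --window --ask "Any errors?" --model llava:13b
```

A counted `for` loop cannot skip ahead after reading a value, which is why both the AppleScript change and this sketch advance the position by hand.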