From ee6aecda82488092bf25db97e43bbb7c8a955372 Mon Sep 17 00:00:00 2001
From: Peter Steinberger
Date: Sun, 8 Jun 2025 03:49:54 +0100
Subject: [PATCH] update docs

---
 CHANGELOG.md | 36 ++++++++++++++++++++++++++++++++++++
 docs/spec.md | 47 +++++++++++++++++++++++++++++++++--------------
 2 files changed, 69 insertions(+), 14 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 956f47f..7033824 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,42 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [1.0.0-beta.18] - 2025-01-08
+
+### Fixed
+- Fixed a bug where providing an empty string for the `capture_focus` parameter in the `image` tool would cause a validation error. The schema now correctly handles this case and applies the default value ('background'), making the parameter truly optional.
+
+## [1.0.0-beta.17] - 2025-01-08
+
+### Added
+- The `image` tool's analysis capability has been significantly enhanced. When a capture results in multiple images (e.g., targeting an application with multiple windows) and a `question` is provided, the tool will now perform an AI analysis for **every single captured image**.
+- The analysis results are returned in a single, clearly formatted text block, with each window's analysis presented under a descriptive header.
+
+## [1.0.0-beta.16] - 2025-01-08
+
+### Enhanced
+- **Smart Path Handling**: The Swift CLI now intelligently detects whether a provided path is intended as a file or directory:
+  - **File paths** (with extensions): Uses exact path for single screen captures, appends screen identifiers for multiple captures
+  - **Directory paths** (no extension or trailing `/`): Places generated filenames inside the directory
+  - **Auto-Creation**: Automatically creates intermediate directories as needed for both file and directory paths
+  - **Edge Cases**: Properly handles special directory indicators (`.`, `..`), hidden files, unicode characters, and paths with spaces
+
+### Improved
+- **Enhanced Error Messages**: File write errors now provide detailed, actionable guidance:
+  - Permission denied errors include specific directory permission checks
+  - Missing directory errors suggest ensuring parent directories exist
+  - Disk space errors clearly indicate insufficient storage
+  - Generic I/O errors include underlying system error details
+
+### Added
+- **Comprehensive Test Coverage**: Added 52+ new tests covering path handling, error scenarios, and edge cases
+- **Path Logic Validation**: Tests for file vs directory detection, multiple format support, and special character handling
+
+### Fixed
+- Fixed original issue where `/tmp/screenshot.png` was incorrectly treated as a directory instead of a filename
+- Improved file extension preservation when appending screen/window identifiers to filenames
+- Enhanced path validation for complex nested directory structures
+
 ## [1.0.0-beta.15] - 2025-01-08
 
 ### Improved
diff --git a/docs/spec.md b/docs/spec.md
index 44c7fef..367b6ba 100644
--- a/docs/spec.md
+++ b/docs/spec.md
@@ -1,4 +1,4 @@
-## Peekaboo: Full & Final Detailed Specification v1.1.1
+## Peekaboo: Full & Final Detailed Specification v1.1.2
 https://aistudio.google.com/prompts/1B0Va41QEZz5ZMiGmLl2gDme8kQ-LQPW-
 
 **Project Vision:** Peekaboo is a macOS utility exposed via a Node.js MCP server, enabling AI agents to perform advanced screen captures, image analysis via user-configured AI providers, and query application/window information. The core macOS interactions are handled by a native Swift command-line interface (CLI) named `peekaboo`, which is called by the Node.js server. All image captures automatically exclude window shadows/frames.
@@ -132,10 +132,7 @@ Configured AI Providers (from PEEKABOO_AI_PROVIDERS ENV): `: App identifier.
-    * `--path `: Base output directory or file prefix/path.
+    * `--path `: Output path for the captured image(s). Can be either a file path or directory path.
+        * **File Path Logic**: If the path appears to be a file (contains an extension and doesn't end with `/`), the CLI intelligently handles it:
+            * For single screen capture (`--screen-index` specified): Uses the exact file path provided.
+            * For multiple screen/window capture: Appends screen/window identifiers to avoid overwriting (e.g., `/tmp/capture.png` becomes `/tmp/capture_1_timestamp.png`, `/tmp/capture_2_timestamp.png`).
+        * **Directory Path Logic**: If the path appears to be a directory (no extension or ends with `/`), generated filenames are placed in that directory.
+        * **Auto-Creation**: The CLI automatically creates intermediate directories as needed for both file and directory paths.
+        * **Edge Cases**: Special directory indicators like `.` and `..` are handled correctly.
     * `--mode `: `ModeEnum` is `screen, window, multi`. Default logic: if `--app` then `window`, else `screen`.
     * `--window-title `: For `mode window`.
     * `--window-index `: For `mode window`.
@@ -512,7 +517,21 @@ Comprehensive testing is crucial for ensuring the reliability and correctness of
 * **Node.js Server & Swift CLI**: Tests that verify the correct interaction between the Node.js server and the Swift CLI. This involves the Node.js server actually spawning the Swift CLI process and validating that arguments are passed correctly and JSON responses are parsed as expected. These tests might use a real (but controlled) Swift CLI binary.
 * **Node.js Server & AI Providers**: Tests that verify the interaction with AI providers. These would typically involve mocking the AI provider SDKs/APIs to simulate various responses (success, error, specific content) and ensure the Node.js server handles them correctly.
 
-#### C. End-to-End (E2E) Tests
+#### C. Path Handling & Error Message Tests
+
+* **Path Logic Testing**: Comprehensive tests for the enhanced Swift CLI path handling:
+    * **File vs Directory Detection**: Tests validating the logic that determines whether a path is intended as a file or directory.
+    * **Single vs Multiple Capture**: Tests ensuring single screen captures use exact file paths, while multiple captures append identifiers appropriately.
+    * **Auto-Creation**: Tests verifying automatic creation of intermediate directories for both file and directory paths.
+    * **Special Cases**: Tests for edge cases like `.`, `..`, hidden files, unicode characters, and paths with spaces.
+    * **Extension Preservation**: Tests ensuring file extensions are preserved correctly when appending screen/window identifiers.
+
+* **Enhanced Error Messages**: Tests for the improved error reporting system:
+    * **File Write Errors**: Tests validating detailed error messages for permission denied, missing directories, disk space issues, and generic I/O errors.
+    * **Error Context**: Tests ensuring error messages include helpful guidance for common issues.
+    * **Error Code Consistency**: Tests verifying error codes remain stable and exit codes are consistent.
+
+#### D. End-to-End (E2E) Tests
 
 E2E tests validate the entire system flow from the perspective of an MCP client. They ensure all components work together as expected.
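
The `--path` rules documented in this patch (a file path has an extension and no trailing `/`; anything else is a directory; multi-capture file paths get an identifier inserted before the extension) can be illustrated with a small TypeScript sketch. This is not the Swift CLI's implementation: the helper names `classifyOutputPath` and `withIdentifier` are hypothetical, and treating a dot-only name such as `.hidden` as having no extension is an assumption of this sketch, since the patch does not say how hidden files are classified.

```typescript
type PathKind = "file" | "directory";

// Final path component: everything after the last "/".
function baseName(p: string): string {
  const slash = p.lastIndexOf("/");
  return slash === -1 ? p : p.slice(slash + 1);
}

// Index of the dot that starts the extension, or -1 when there is none.
// Assumption: a leading dot (".hidden", ".", "..") or a trailing dot
// does not count as an extension.
function extensionStart(base: string): number {
  const dot = base.lastIndexOf(".");
  return dot > 0 && dot < base.length - 1 ? dot : -1;
}

// Hypothetical helper mirroring the documented rule:
// extension present and no trailing "/" means file, otherwise directory.
function classifyOutputPath(p: string): PathKind {
  if (p.endsWith("/")) return "directory";
  return extensionStart(baseName(p)) !== -1 ? "file" : "directory";
}

// Hypothetical helper: insert "_<index>_<timestamp>" before the extension,
// so "/tmp/capture.png" becomes "/tmp/capture_1_<timestamp>.png".
function withIdentifier(filePath: string, index: number, timestamp: string): string {
  const base = baseName(filePath);
  const dir = filePath.slice(0, filePath.length - base.length);
  const dot = extensionStart(base);
  const stem = dot === -1 ? base : base.slice(0, dot);
  const ext = dot === -1 ? "" : base.slice(dot);
  return `${dir}${stem}_${index}_${timestamp}${ext}`;
}
```

Under these assumptions, `/tmp/screenshot.png` classifies as a file (the case the changelog says was previously misdetected), while `/tmp/shots`, `/tmp/shots/`, `.`, and `..` classify as directories. Keeping the extension split in one helper means classification and identifier insertion cannot disagree about where the extension begins.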