# Tool Schema Description: Debugging and Resolution

This document outlines the process undertaken to debug and resolve issues related to tool schema generation and compatibility, particularly for the `image` tool in the Peekaboo MCP server. The goal was to ensure that tool parameters, including their descriptions, were correctly processed by the `zodToJsonSchema` function and displayed accurately in the client (Cursor, powered by Gemini 2.5 Pro).

## 1. Initial Problem

The primary issue was that the `image` tool's parameters were not loading or being displayed correctly in the client. The client often reported an "incompatible schema" error for this tool, while other tools like `analyze` and `list` (which had simpler schemas) were working correctly after some initial refinements. This indicated a problem specific to the complexity or structure of the `image` tool's Zod schema or how it was being converted to a JSON schema.

## 2. Refinement of `zodToJsonSchema`

A significant early step was to refactor the `zodToJsonSchema` function located in `src/index.ts`. The initial version was somewhat simplistic and did not robustly handle various Zod constructs:

* Extracting descriptions from `.describe()` calls, especially when nested within `.optional()` or `.default()`.
* Properly representing Zod unions (`z.union()`), objects (`z.object()`), enums (`z.enum()`), and custom types (`z.custom()`).

The refactoring involved:

* Creating a recursive helper function (`unwrapZodSchema`) to get to the core Zod type and its description, peeling off wrappers like `ZodOptional` and `ZodDefault`.
* Ensuring that descriptions from `.describe()` calls were consistently picked up and added to the `description` field in the resulting JSON schema properties.
* Explicitly handling different Zod types (`ZodString`, `ZodBoolean`, `ZodNumber`, `ZodEnum`, `ZodObject`, `ZodArray`, `ZodUnion`, `ZodLiteral`, `ZodNativeEnum`, `ZodEffects` for transformations/refinements) to build a more accurate JSON schema representation.

This refactoring was crucial for the `analyze` and `list` tools to display their parameters correctly and laid the foundation for debugging the `image` tool.

## 3. Debugging the `imageToolSchema`

The `imageToolSchema` was the most complex, involving several optional fields, enums, a union of objects, and a custom Zod type. The debugging approach was methodical:

1. **Bottom-Up Simplification**: The `imageToolSchema` was initially reduced to its simplest possible form (e.g., a single optional string field like `app`). The `imageToolHandler` logic was also temporarily stubbed out to return a static success message to avoid TypeScript errors due to the schema changes.
2. **Incremental Re-addition of Fields**: One by one, each original field was added back to the `imageToolSchema`, followed by a build and client test:
    * `app` (optional string)
    * `question` (optional string)
    * `return_data` (optional boolean with default)
    * `format` (optional enum with default)
    * `capture_focus` (optional enum with default)
    * `path` (optional string)
    * `mode` (optional enum without Zod default)

    All these fields, when added individually or in combination, resulted in a schema that was correctly displayed in the client. This confirmed that basic Zod types, optionals, defaults, and enums were being handled correctly by the improved `zodToJsonSchema` and were compatible with the client/model.
3. **Identifying the Problematic Fields**: The issues arose when reintroducing the more complex fields:
    * `window_specifier`: An optional `z.union([z.object({ title: ... }), z.object({ index: ... })])`. This field, surprisingly, *did* work and its parameters displayed correctly once the main tool description in `src/index.ts` was shortened (see section 4).
    * `provider_config`: This was the most problematic. Initially defined in `imageToolSchema` as `z.custom().optional().describe(...)`, where `AIProviderConfig` was a `z.union([OllamaConfig, OpenAIConfig])`.

## 4. Addressing UI Space and Main Descriptions

During testing, it became apparent that the client UI (Cursor's tooltip/parameter display area) had limited space. The verbose, multi-line `description` strings in `src/index.ts` (which manually listed parameters as a hack) were consuming this space, preventing the schema-derived parameters from being fully visible.

**Solution**: The main `description` strings for all tools in `src/index.ts` were shortened to be concise summaries. This allowed the client to properly display the parameter details generated by `zodToJsonSchema`.

## 5. Resolving `provider_config` Incompatibility

Even with the refined `zodToJsonSchema` and shortened main descriptions, the `image` tool would often trigger an "incompatible schema" error message from the Gemini model when `provider_config` was included with its `z.custom(z.union(...))` definition. However, paradoxically, the client UI *would sometimes still display the parameters correctly*, suggesting a discrepancy between the client's display-rendering schema validation and the model's execution-time schema validation.

**The Fix**: The `provider_config` field in `imageToolSchema` (in `src/tools/image.ts`) was changed to mirror the structure used in the `analyzeToolSchema` (which was working reliably). Instead of `z.custom()`, it became a direct `z.object()` definition:

```typescript
provider_config: z
  .object({
    type: z
      .enum(["auto", "ollama", "openai"])
      .default("auto")
      .describe(
        "AI provider type. 'auto' uses server default.",
      ),
    model: z
      .string()
      .optional()
      .describe(
        "Optional model name. Uses server default if omitted.",
      ),
  })
  .optional()
  .describe(
    "Optional. Specify AI provider/model for analysis.",
  ),
```

This simpler, more explicit Zod structure for `provider_config` resolved the incompatibility. The client was then able to consistently load and display all parameters for the `image` tool without the "incompatible schema" error blocking its usability (though the error message itself sometimes lingered, possibly due to caching, it no longer prevented parameter display).

## 6. Restoring Handler Logic and Testing

After the schema was confirmed to be working, the stubbed-out implementations of `imageToolHandler`, `buildSwiftCliArgs`, and `generateImageCaptureSummary` in `src/tools/image.ts` needed to be reverted to their original, fully functional code. Following this, `npm test` was run to ensure all unit and integration tests passed.

## 7. Fine-tuning Description Length for Client UI

After successfully resolving the Gemini model's schema compatibility issues, a separate observation was made regarding the client UI (Cursor). When the main tool descriptions in `src/index.ts` were made very verbose (e.g., including long use-case examples directly in the description string), Cursor's UI would not display these long descriptions, defaulting to showing only the parameter list generated from the schema. This was *not* a Gemini model rejection but rather a UI display limitation or choice.

**Solution**: The main descriptions were adjusted to be moderately detailed, providing core capabilities and multi-screen/window behavior, while omitting extremely long examples. This length was found to be acceptable for Cursor's UI, allowing it to display the richer description alongside the schema-derived parameters.

## Key Learnings:

* **Schema Simplicity**: The Gemini model's schema validation appears to be sensitive to complex Zod structures, especially combinations like `z.custom()` wrapping `z.union()` of `z.object()`s. Favoring more direct and explicit Zod definitions (e.g., `z.object()` with clearly defined properties) improves compatibility.
* **`zodToJsonSchema` Robustness**: A comprehensive `zodToJsonSchema` function that correctly handles various Zod types and extracts `.describe()` metadata is crucial for accurate schema generation.
* **Client UI vs. Model Validation**: There can be slight differences in how a client UI parses/displays a schema and how the underlying model validates it for execution. Successful display in the UI is a good sign but not a definitive guarantee of model compatibility if complex types are involved. The inverse can also occur: the model may accept a schema that a client UI truncates or simplifies for display due to its own constraints.
* **Main Description Length for Client UI**: Client UIs (like Cursor) may have their own limitations or display preferences for the length of the main tool description string. Overly verbose descriptions might not be fully displayed. It's important to balance richness of information with conciseness suitable for the UI, relying on the schema for detailed parameter information.
* **Concise Main Descriptions (Initial Approach)**: Initially, keeping the primary `description` field for a tool (in `src/index.ts`) very concise was a workaround for UI space issues, allowing schema-derived parameters to appear. The final approach found a middle ground.
* **Iterative Debugging**: The bottom-up approach (simplifying then incrementally adding complexity) was highly effective in isolating the problematic parts of the schema.

By addressing these points, particularly the structure of `provider_config`, the verbosity of main descriptions for UI compatibility, and ensuring a robust `zodToJsonSchema` implementation, the Peekaboo tools' schemas were made fully compatible and are now presented effectively in the client.
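
## Appendix: Sketch of the Unwrapping Approach

To make the wrapper-peeling idea from section 2 concrete, here is a minimal, dependency-free sketch. It is not the actual code in `src/index.ts`. The `SchemaNode` type is a hypothetical stand-in for Zod's classes (a real implementation would inspect `ZodOptional`, `ZodDefault`, and so on), and `unwrapSchema` and `toJsonSchema` are illustrative names. The example input mirrors the `provider_config` definition from section 5.

```typescript
// Hypothetical stand-in for a Zod schema node; NOT Zod's real internals.
type SchemaNode =
  | { kind: "string" | "boolean" | "number"; description?: string }
  | { kind: "enum"; values: string[]; description?: string }
  | { kind: "object"; shape: Record<string, SchemaNode>; description?: string }
  | { kind: "union"; options: SchemaNode[]; description?: string }
  | { kind: "optional"; inner: SchemaNode; description?: string }
  | { kind: "default"; inner: SchemaNode; value: unknown; description?: string };

type CoreNode = Exclude<SchemaNode, { kind: "optional" | "default" }>;

interface Unwrapped {
  core: CoreNode;
  description?: string;
  optional: boolean;
  defaultValue?: unknown;
}

// Recursively peel optional/default wrappers, keeping the outermost
// description that is present and any default value found on the way down.
function unwrapSchema(node: SchemaNode): Unwrapped {
  if (node.kind === "optional" || node.kind === "default") {
    const inner = unwrapSchema(node.inner);
    return {
      core: inner.core,
      description: node.description ?? inner.description,
      optional: true, // in this sketch, a field with a default is not required
      defaultValue: node.kind === "default" ? node.value : inner.defaultValue,
    };
  }
  return { core: node, description: node.description, optional: false };
}

type JsonSchema = Record<string, unknown>;

// Convert a node to a JSON-schema-like object, one case per core kind.
function toJsonSchema(node: SchemaNode): JsonSchema {
  const { core, description, defaultValue } = unwrapSchema(node);
  let out: JsonSchema;
  switch (core.kind) {
    case "enum":
      out = { type: "string", enum: core.values };
      break;
    case "object": {
      const properties: Record<string, JsonSchema> = {};
      const required: string[] = [];
      for (const [key, child] of Object.entries(core.shape)) {
        properties[key] = toJsonSchema(child);
        if (!unwrapSchema(child).optional) required.push(key);
      }
      out = { type: "object", properties };
      if (required.length > 0) out.required = required;
      break;
    }
    case "union":
      out = { anyOf: core.options.map((option) => toJsonSchema(option)) };
      break;
    default:
      out = { type: core.kind }; // "string", "boolean", or "number"
  }
  if (description !== undefined) out.description = description;
  if (defaultValue !== undefined) out.default = defaultValue;
  return out;
}

// Stand-in equivalent of the `provider_config` field from section 5.
const providerConfig: SchemaNode = {
  kind: "optional",
  description: "Optional. Specify AI provider/model for analysis.",
  inner: {
    kind: "object",
    shape: {
      type: {
        kind: "default",
        value: "auto",
        description: "AI provider type. 'auto' uses server default.",
        inner: { kind: "enum", values: ["auto", "ollama", "openai"] },
      },
      model: {
        kind: "optional",
        description: "Optional model name. Uses server default if omitted.",
        inner: { kind: "string" },
      },
    },
  },
};

console.log(JSON.stringify(toJsonSchema(providerConfig), null, 2));
```

Because both nested fields are wrapped (one with a default, one as optional), the resulting object schema carries no `required` list, and each property keeps the description that was attached to its outermost wrapper. This is the kind of flat, explicit output that the direct `z.object()` definition is meant to produce.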