
MCP Workflow Integration

farscry works in two modes: directly through a CLI pipe, or as a local MCP server that an MCP-compatible workflow can call when it needs screenshot context.

farscry extract screenshot.png | your-runner "fix this"

The workflow receives typed VASP context instead of a raw image.

With the MCP server, the workflow can call farscry_extract or farscry_diff without manual piping.
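For orientation, an MCP host invokes a server tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request for farscry_extract. The helper name build_tool_call and the argument name path are assumptions made for illustration; the real input schema is whatever the farscry server advertises through tools/list.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique within a session


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical argument name: check the server's tools/list response for the real schema.
message = build_tool_call("farscry_extract", {"path": "screenshot.png"})
print(message)
```

In practice the MCP host builds and sends this message itself; the sketch only shows what "without manual piping" looks like on the wire.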

Run farscry setup to auto-detect your agent and get the config snippet to paste:

farscry setup

Or add manually to your agent’s MCP config file:

File: ~/.claude/mcp.json

{
  "mcpServers": {
    "farscry": {
      "command": "farscry",
      "args": ["serve", "--mcp"]
    }
  }
}
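If the config file already lists other MCP servers, pasting by hand risks overwriting them. A minimal sketch of a merge step in Python follows, assuming the file is plain JSON with a top-level mcpServers object as in the snippet above. The function name add_farscry_server is hypothetical and not part of farscry.

```python
import json
from pathlib import Path

# Server entry copied from the config snippet above.
FARSCRY_ENTRY = {"command": "farscry", "args": ["serve", "--mcp"]}


def add_farscry_server(config_path: Path) -> dict:
    """Add the farscry entry to an MCP config file, keeping other servers intact."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        # Treat an empty file the same as a missing one.
        config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})["farscry"] = FARSCRY_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_farscry_server(Path.home() / ".claude" / "mcp.json") would target the file named above; other agents may keep their MCP config elsewhere.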

The MCP host starts the farscry server when the session begins and keeps OCR engines warm for the duration.

Without farscry, the workflow receives a raw image and must interpret the full screenshot again, adding latency and cost.

With farscry MCP:

vasp_version: 1.0
state_id: phash:a3f7c2b1...
screen_type: error
confidence: high
agent_context: "Payment error - card declined, retry available"
---
[bottom] error "Payment failed - card declined" at (20,350)
[bottom] button "Retry" enabled:true at (400,420)
[bottom] button "Back" enabled:true at (400,470)
affordances:
click → "Retry" at (400,420) enabled:true
click → "Back" at (400,470) enabled:true
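To illustrate why typed output is cheaper to consume than a raw image, the sketch below parses the sample above into header fields and a list of affordances. The line patterns are inferred from this single sample and are not a VASP grammar; parse_vasp, Affordance, and AFFORDANCE_RE are hypothetical names introduced here.

```python
import re
from typing import NamedTuple


class Affordance(NamedTuple):
    action: str
    label: str
    x: int
    y: int
    enabled: bool


# Pattern inferred from the sample output only, e.g.:
#   click → "Retry" at (400,420) enabled:true
AFFORDANCE_RE = re.compile(
    r'^(\w+) → "([^"]*)" at \((\d+),(\d+)\) enabled:(true|false)$'
)


def parse_vasp(text: str) -> tuple[dict[str, str], list[Affordance]]:
    """Split VASP text into header fields and the affordance list."""
    header_text, _, body = text.partition("\n---\n")
    header = {}
    for line in header_text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            header[key.strip()] = value.strip().strip('"')
    affordances = []
    for line in body.splitlines():
        match = AFFORDANCE_RE.match(line.strip())
        if match:  # element lines and the "affordances:" label do not match
            action, label, x, y, enabled = match.groups()
            affordances.append(
                Affordance(action, label, int(x), int(y), enabled == "true")
            )
    return header, affordances


SAMPLE = """\
vasp_version: 1.0
state_id: phash:a3f7c2b1...
screen_type: error
confidence: high
agent_context: "Payment error - card declined, retry available"
---
[bottom] error "Payment failed - card declined" at (20,350)
[bottom] button "Retry" enabled:true at (400,420)
[bottom] button "Back" enabled:true at (400,470)
affordances:
click → "Retry" at (400,420) enabled:true
click → "Back" at (400,470) enabled:true
"""

header, affordances = parse_vasp(SAMPLE)
print(header["screen_type"], [a.label for a in affordances])
# prints: error ['Retry', 'Back']
```

A workflow could then pick the enabled affordance labelled "Retry" and click at its coordinates without a second vision pass.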

farscry extract terminal.png | your-runner "fix this build error"

farscry extract before.png -o before.vasp
farscry diff before.png after.png | your-runner "did the save succeed?"

farscry extract form.png --affordances | your-runner "fill in the form with test data"

  • Use farscry extract screenshot.png --context for a one-line agent_context summary when you don’t need the full tree.
  • Use farscry diff before asking “did this work?”. It avoids a full-screen vision pass and returns an exact typed delta.
  • Use --affordances when the workflow needs to interact with UI elements. It lists exactly what can be clicked or typed.
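When the CLI is driven from a script rather than a shell pipe, the same flags can be passed through a small wrapper. This is a sketch under stated assumptions: extract_argv and run_extract are hypothetical helpers, only the --context and --affordances flags mentioned above are used, and whether the two can be combined in one call is not confirmed here.

```python
import subprocess


def extract_argv(image: str, *, context: bool = False, affordances: bool = False) -> list[str]:
    """Build a `farscry extract` command line from the flags named in the tips above."""
    argv = ["farscry", "extract", image]
    if context:
        argv.append("--context")
    if affordances:
        argv.append("--affordances")
    return argv


def run_extract(image: str, **flags: bool) -> str:
    """Run farscry and return its stdout.

    Raises FileNotFoundError if farscry is not on PATH and
    subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        extract_argv(image, **flags), capture_output=True, text=True, check=True
    )
    return result.stdout
```

For example, run_extract("form.png", affordances=True) mirrors the form-filling command shown earlier, with the output returned as a string instead of piped to a runner.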