Quick Start
Install farscry
Use any one of these:

    npm install -g farscry
    pip install farscry
    brew install teles-forge/tap/farscry
    cargo install farscry
    curl -fsSL https://farscry.dev/install | sh
Wire up your agent
    farscry setup

Detects Claude Code, Devin, Codex, and Aider. Shows MCP config for Claude Code, Cursor, Windsurf, and Zed. farscry never modifies your files.
Extract from a screenshot
    farscry extract screenshot.png

Output:

    === farscry visual context ===
    source: screenshot.png
    screen_type: config
    state_id: phash:3d9b1e7a...
    confidence: high
    lang: eng
    agent_context: "Payment Settings - 3 editable fields, Save available"
    ---
    [top-left] heading "Payment Settings"
    [middle-left] label "Max Value:"
    [middle-center] input "1500" value="1500" enabled:true
    [middle-right] button "Save Changes" enabled:true
    [bottom-left] error "Value must be <= 10000"
    affordances:
    click: "Save Changes" enabled: true
    type: input "Max Value" current: "1500"
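Because the header is plain `key: value` text, a script can read one field before deciding whether to hand the result to an agent. The helper below is a minimal sketch, not part of farscry: `vasp_field` is a hypothetical name, and it assumes each header field sits on its own line above the `---` separator, as in the sample output.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with farscry): print the value of one
# header field from farscry's text output read on stdin.
# Assumes one "key: value" pair per line, above the first '---' separator.
vasp_field() {
  local key="$1"
  awk -v key="$key" '
    /^---/ { exit }                      # header ends at the separator
    index($0, key ": ") == 1 {           # line starts with "key: "
      print substr($0, length(key) + 3)  # everything after "key: "
      exit
    }
  '
}

# Usage (requires farscry on PATH):
#   confidence=$(farscry extract screenshot.png | vasp_field confidence)
#   [ "$confidence" = "high" ] || echo "low-confidence extraction" >&2
```

Stopping at the separator keeps element text such as an on-screen label reading "confidence: low" from being mistaken for a header field.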
Diff two screenshots
    farscry diff before.png after.png

Output:

    === farscry diff ===
    state_id: phash:3d9b1e7a...
    delta_from: phash:8f4a2c3d...
    context_similarity: 0.847
    context_changed: true
    ---
    appeared: error "Card declined"
    changed: button "Submit" -> disabled
    removed: label "spinner"
    unchanged: [9 elements]
    Token savings: ~312 tokens saved vs re-sending both images
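One way to use the diff is as a gate: only involve the agent when the screen actually changed. The functions below are a sketch under the assumption that the diff output has one field per line, as in the sample; `context_changed` and `diff_changes` are hypothetical helper names, not farscry commands.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with farscry); both read diff output on stdin.

# Succeeds (exit 0) when the diff reports "context_changed: true".
context_changed() {
  grep -Eq '^context_changed:[[:space:]]*true[[:space:]]*$'
}

# Prints only the lines describing elements that appeared, changed, or were removed.
diff_changes() {
  grep -E '^(appeared|changed|removed): '
}

# Usage (requires farscry and claude on PATH):
#   if farscry diff before.png after.png | context_changed; then
#     farscry extract after.png | claude -p "fix this"
#   fi
```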
Pipe to an agent
    farscry extract screen.png | claude -p "fix this"
    farscry extract --from-clipboard | claude -p "fix this"

farscry writes to stdout. Pipe anywhere.
Visual debugging with annotate
farscry annotate shows you exactly what farscry sees: the same screenshot with bounding boxes drawn over each detected element.

    farscry annotate screenshot.png -o annotated.png
    # or from clipboard:
    farscry annotate --from-clipboard -o /tmp/out.png

Use this to:
- Verify farscry is detecting elements correctly before wiring your agent
- Debug agent failures: did farscry miss the button?
- Share annotated screenshots with your team
Add the fannot alias for a one-command workflow:

    echo "alias fannot='farscry annotate --from-clipboard -o /tmp/farscry_annotated.png && open /tmp/farscry_annotated.png'" >> ~/.zshrc && source ~/.zshrc

Then: screenshot → fannot → annotated image opens automatically.
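The alias relies on `open`, which is the macOS opener. For a shell config shared across machines, a function can pick the opener at call time. This is a sketch, not something farscry provides: `image_opener` is a hypothetical helper, and it assumes every non-macOS system has `xdg-open`. Define it instead of the alias, not alongside it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: name the command that opens an image on this OS.
# Assumption: anything that is not macOS has xdg-open available.
image_opener() {
  case "$(uname -s)" in
    Darwin) echo open ;;
    *)      echo xdg-open ;;
  esac
}

# Function form of the fannot alias: annotate the clipboard image, then open it.
# The optional first argument overrides the output path.
fannot() {
  local out="${1:-/tmp/farscry_annotated.png}"
  farscry annotate --from-clipboard -o "$out" && "$(image_opener)" "$out"
}
```

On Linux, `--from-clipboard` additionally needs xclip, as noted under known limitations.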
Zero-friction workflow
The fastest way to use farscry. One command, every time.

    farscry setup

Detects claude, devin, codex, and aider. Shows the alias to add and the MCP config to paste. Offers to create ~/.farscry/smart-paste.sh and show terminal key binding instructions.
Then add the short alias:

    echo "alias fp='farscry paste'" >> ~/.zshrc && source ~/.zshrc

Now: screenshot → fp → done.
Smart paste: Cmd+V auto-detects images
After running farscry setup, answer y to “Configure smart paste?” to create the script and see instructions for your terminal.
The script (~/.farscry/smart-paste.sh) checks whether the clipboard contains an image:
- Image in clipboard → runs farscry paste → sends to your agent
- Text in clipboard → falls back to normal paste (pbpaste/xclip/Get-Clipboard)
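The branching described above can be sketched in a few lines of shell. This is an illustration of the idea only, not the script that farscry setup generates: `clipboard_has_image` and `smart_paste` are hypothetical names, the macOS check assumes the type list printed by AppleScript's `clipboard info`, and the Linux check assumes xclip is installed. Windows is not covered.

```shell
#!/usr/bin/env bash
# Sketch only: NOT the generated ~/.farscry/smart-paste.sh.

# Succeeds when the clipboard currently holds image data.
clipboard_has_image() {
  case "$(uname -s)" in
    Darwin)
      # "clipboard info" lists the data types on the clipboard; image
      # entries appear with names such as PNGf, TIFF, JPEG, or GIF.
      osascript -e 'clipboard info' 2>/dev/null | grep -Eq 'PNGf|TIFF|JPEG|GIF'
      ;;
    *)
      # xclip prints the MIME types the clipboard owner offers, one per line.
      xclip -selection clipboard -t TARGETS -o 2>/dev/null | grep -q '^image/'
      ;;
  esac
}

# Image on the clipboard: hand off to farscry. Otherwise: plain text paste.
smart_paste() {
  if clipboard_has_image; then
    farscry paste
  elif [ "$(uname -s)" = "Darwin" ]; then
    pbpaste
  else
    xclip -selection clipboard -o
  fi
}
```

Checking the advertised clipboard types, rather than reading the clipboard contents, keeps the text path cheap and avoids dumping binary data into the terminal.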
macOS (iTerm2):

    Preferences → Keys → Key Bindings → +
    Shortcut: Cmd+V
    Action: Run Command
    Command: ~/.farscry/smart-paste.sh

macOS (Warp):

    Settings → Features → Custom Key Bindings
    Key: Cmd+V
    Action: Run Command: ~/.farscry/smart-paste.sh

macOS (Terminal.app): Not supported natively. Use fp instead.

Linux (Gnome Terminal): Add to ~/.bashrc:

    bind -x '"\C-v": ~/.farscry/smart-paste.sh'

Linux (Kitty) (~/.config/kitty/kitty.conf):

    map ctrl+v launch --stdin-source=@last_cmd_output ~/.farscry/smart-paste.sh

Windows Terminal:

    Settings → Actions → Add new
    Command: wt.exe new-tab powershell -Command ~/.farscry/smart-paste.ps1
    Keys: ctrl+v

Result: Screenshot with any tool → press Cmd+V in your terminal → farscry detects the image and sends it to your agent. No command to type.
Agent integrations
Claude Code

    farscry extract screen.png | claude -p "fix this"
    farscry extract --from-clipboard | claude -p "fix this"

Devin

    devin -p "$(farscry extract screen.png): fix this"
    devin -p "$(farscry extract --from-clipboard): fix this"

Codex

    farscry extract screen.png | codex exec "fix this:"
    farscry extract --from-clipboard | codex exec "fix this:"

MCP (all agents, recommended)

    farscry serve --mcp

Supports multiple images via the image_paths parameter.
Supported image formats
PNG, JPEG, GIF, WEBP, TIFF. Input can come from the clipboard, a file, or stdin. To put an image on the clipboard: Cmd+Ctrl+Shift+4 (plain Cmd+Shift+4 saves to a file by default), Shottr, or Cmd+C on an image file in Finder.
Common flags
| Flag | Description |
|---|---|
| --json | JSON output instead of VASP text |
| --affordances | Show only interactive elements |
| --context | One-line agent_context summary |
| --lang por | Explicit language (default: auto-detect) |
| -v | Verbose, show processing steps |
Known limitations in v0.1.0
| Scenario | Status | Notes |
|---|---|---|
| Text-heavy UIs (terminal, config, forms) | Works well | Core use case |
| Icon-only toolbars | Partial | Buttons without text labels are missed |
| Charts, graphs, images | Not supported | OCR extracts no structured data |
| --from-clipboard on Linux | Requires xclip | apt install xclip |
| Windows | Untested in v0.1.0 | Binary ships, not CI-validated |
Next steps
- CLI Reference: extract
- CLI Reference: diff
- MCP Server: keep OCR warm, integrate with MCP-compatible agents
- VASP Format: the output schema