farscry extract

Converts a screenshot into typed VASP output: a screen classification, a UI tree, and a list of affordances.

Usage

farscry extract <image>
farscry extract <image> [options]
cat <image> | farscry extract
farscry extract --from-clipboard

Examples

# Single image
farscry extract screenshot.png
# From stdin
cat screenshot.png | farscry extract
# From clipboard (Cmd+Ctrl+Shift+4 on macOS copies a selection to the clipboard)
farscry extract --from-clipboard
# Batch (parallel processing)
farscry extract *.png
farscry extract img1.png img2.png img3.png
# JSON output
farscry extract screenshot.png --json
# Save to file
farscry extract screenshot.png -o context.vasp
# Affordances only
farscry extract screenshot.png --affordances
# One-line agent_context summary
farscry extract screenshot.png --context
# Explicit language
farscry extract screenshot.png --lang por
# Multi-language
farscry extract screenshot.png --lang eng+por

Options

| Flag | Default | Description |
| --- | --- | --- |
| --from-clipboard | false | Read image from system clipboard (macOS and Linux only) |
| --json | false | Output JSON instead of VASP |
| -o <file> | stdout | Write output to file |
| --affordances | false | Output only interactive elements |
| --context | false | Output only the one-line agent_context summary |
| --text-only | false | Suppress image forwarding to workflow |
| --lang <code> | auto | Force language (e.g. eng, por, eng+por) |
| --max-size <n>mb | 10mb | Override the 10MB input size limit |
| -v | false | Verbose; show processing steps |
| --debug | false | Full debug output to stderr |

Output format

See VASP Overview for the full schema.

=== farscry visual context ===
screen_type: config
state_id: phash:<16-char-hex>
confidence: high
lang: eng
agent_context: "<one-line summary>"
---
[top-center] heading "Payment Settings"
[middle-right] button "Save Changes" enabled:true
[bottom] error "Value must be ≤ 10000"
affordances:
click → "Save Changes" at (400,300)
type → "Max Value" at (200,120)
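
For scripts that consume the text format rather than --json, the sample above can be picked apart with standard tools. The sketch below is not part of farscry: vasp_field and list_affordances are hypothetical helper names, and the patterns assume that header lines look like `key: value` and affordance lines look like `action → "label" at (x,y)`, exactly as in the sample. The full schema in VASP Overview may allow shapes this does not handle.

```shell
# Hypothetical helpers, not shipped with farscry.
# Both read VASP text on stdin and assume the line shapes shown in the sample.

# Print the value of one header field, e.g. screen_type or lang.
vasp_field() {
  sed -n "s/^$1: //p" | head -n 1
}

# Print one "action|label|x|y" line per affordance.
list_affordances() {
  sed -n 's/^[[:space:]]*\([a-z][a-z]*\) → "\(.*\)" at (\([0-9][0-9]*\),\([0-9][0-9]*\))$/\1|\2|\3|\4/p'
}
```

If the real output matches the sample, `farscry extract screenshot.png | list_affordances` would print `click|Save Changes|400|300` for the first affordance. Note that vasp_field returns the value verbatim, so agent_context comes back with its surrounding quotes.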

Supported input formats

| Format | Magic bytes |
| --- | --- |
| PNG | 89 50 4E 47 |
| JPEG | FF D8 FF |
| WebP | 52 49 46 46 |
| GIF | 47 49 46 38 |
| TIFF | 49 49 2A 00 / 4D 4D 00 2A |

Input is validated by magic bytes; the file extension is ignored.
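
To see what a magic-byte check looks like in practice, here is a minimal sketch that classifies a file by its first four bytes using the signatures listed in the table. It is a hypothetical pre-flight helper (detect_image_format is an invented name), not farscry's own validation code, which may inspect more bytes than this.

```shell
# Hypothetical pre-flight check, not part of farscry.
# Prints the format name for a known signature; returns 1 otherwise.
detect_image_format() {
  # First 4 bytes as uppercase hex with no separators, e.g. 89504E47.
  case "$(head -c 4 "$1" | od -An -tx1 | tr -d ' \t\n' | tr 'a-f' 'A-F')" in
    89504E47) echo png ;;
    FFD8FF*) echo jpeg ;;
    52494646) echo webp ;;  # "RIFF"; other RIFF containers would match too
    47494638) echo gif ;;
    49492A00 | 4D4D002A) echo tiff ;;
    *) return 1 ;;
  esac
}
```

Because the check reads bytes rather than names, a PNG saved as `shot.jpg` is still reported as png, and a text file renamed to `shot.png` is rejected.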

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | Input error (file not found, wrong format, too large) |
| 2 | Processing error (OCR failed) |
| 3 | Configuration error (language not installed) |
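
In a script, the exit code can drive reporting or retry logic. The sketch below maps each documented code to a message. describe_farscry_exit is a hypothetical helper name and the message wording is illustrative; only the code numbers and their meanings come from the table.

```shell
# Hypothetical helper, not shipped with farscry.
# Maps a documented exit code to a short human-readable message.
describe_farscry_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "input error: check path, format, and size" ;;
    2) echo "processing error: OCR failed" ;;
    3) echo "configuration error: language not installed" ;;
    *) echo "unexpected exit code $1" ;;
  esac
}

# Example use after a run:
#   farscry extract screenshot.png -o context.vasp
#   describe_farscry_exit "$?" >&2
```

Separating code 1 from codes 2 and 3 is useful in batch jobs: an input error is specific to one file, while a configuration error will likely repeat for every file until it is fixed.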

Performance

| Platform | Warm daemon | Cold CLI |
| --- | --- | --- |
| Apple Silicon M-series (CoreML) | 38ms | ~350ms |
| x86 CPU (ORT) | ~222ms | ~350ms |

The first run downloads OCR assets (~12MB); subsequent runs use the local cache. Use farscry serve --mcp to keep the OCR engines warm and get the warm-daemon latencies consistently (38ms on Apple Silicon, ~222ms on x86).