Accessibility Describe

A RocketRide image-filter node that turns an image into a structured scene description optimized for blind and visually impaired users.

What it does

Receives an image on its input lane, sends it to Google Gemini Vision (via the google-genai SDK), and emits a text description designed for assistive use, for example real-time narration on smart glasses. The description covers environment type, hazards with positions, key objects, visible text read verbatim (OCR), people, and navigation guidance, kept under 150 words by the default prompt.

The node buffers the incoming image stream, base64-encodes it as a data URL, and sends it together with the analysis prompt in a single generate_content call. Requests run with temperature: 0.3 and max_output_tokens: 1024. Transient failures (timeouts, connection errors, 5xx responses) are retried up to 3 times with exponential backoff. API errors are translated into user-friendly messages covering authentication, rate limiting, safety blocks, model unavailability, and timeouts.
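The encoding and retry steps described above can be sketched as follows. This is an illustration under stated assumptions, not the node's actual code: the names `to_data_url`, `TransientError`, and `call_with_retries`, the 1-second base delay, and the doubling factor are all hypothetical. Only the data URL encoding and the limit of 3 retries come from the description.

```python
import base64
import time

MAX_RETRIES = 3  # from the description: transient failures are retried up to 3 times


def to_data_url(image_bytes: bytes, mime_type: str) -> str:
    """Base64-encode buffered image bytes into a data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


class TransientError(Exception):
    """Hypothetical marker for timeouts, connection errors, and 5xx responses."""


def call_with_retries(fn, max_retries=MAX_RETRIES, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on TransientError, retry with exponentially growing delays.

    The base delay and doubling factor are assumptions made for this sketch.
    """
    for attempt in range(max_retries + 1):  # one initial attempt plus the retries
        try:
            return fn()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s with the defaults
```

Passing `sleep` as a parameter keeps the backoff schedule testable without real waiting. A production version would typically also add jitter to the delays.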

The default model is gemini-2.5-flash. A Google AI API key is required: the node fails at startup without one, and rejects keys starting with sk- (OpenAI keys) with a clear error.

Configuration

Lanes

Lane	Direction	Description
`image`	input	The image to describe (streamed; any image MIME type)
`text`	output	The accessibility-optimized scene description

Fields

Field	Type	Description
`model`	string	Google Gemini vision model
`modelTotalTokens`	number	Maximum context length in tokens
`systemPrompt`	string	System instructions that define the accessibility description behavior and priorities. Default: "You are an accessibility-focused scene analyzer designed to help blind and visually impaired users understand their surroundings through image descriptions."
`prompt`	string	Prompt template for generating accessibility descriptions from images. Default: "Describe this image for a blind person. Include: environment type, hazards with positions, key objects with clock positions, visible text, people, and navigation guidance. Keep under 150 words."
`prioritizeHazards`	string	How aggressively to prioritize hazard detection. Default: "high"
`spatialFormat`	string	How to describe spatial positions. Default: "clock"
`profile`	string	The Gemini vision model profile used for accessibility descriptions. Default: "gemini-2.5-flash"

If accessibility.systemPrompt or accessibility.prompt is left empty, the node falls back first to the generic systemPrompt or prompt config value, and then to its built-in default.
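The fallback order can be sketched as a small lookup. This is an illustration only: the flat dictionary with dotted keys, the name `resolve_text`, and the choice to treat whitespace-only values as empty are assumptions, not the node's documented behavior.

```python
# Built-in defaults, copied from the Fields table.
BUILTIN_DEFAULTS = {
    "systemPrompt": (
        "You are an accessibility-focused scene analyzer designed to help blind "
        "and visually impaired users understand their surroundings through "
        "image descriptions."
    ),
    "prompt": (
        "Describe this image for a blind person. Include: environment type, "
        "hazards with positions, key objects with clock positions, visible text, "
        "people, and navigation guidance. Keep under 150 words."
    ),
}


def resolve_text(config: dict, key: str) -> str:
    """Return the namespaced value, else the generic value, else the built-in default."""
    for candidate in (config.get(f"accessibility.{key}"), config.get(key)):
        # Assumption: whitespace-only strings count as "left empty".
        if isinstance(candidate, str) and candidate.strip():
            return candidate
    return BUILTIN_DEFAULTS[key]
```

For example, `resolve_text({"prompt": "Describe briefly."}, "prompt")` would return the generic value because no `accessibility.prompt` is set.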

Hazard priority

Value	Effect
`high` (default)	The model must lead with hazards; if none exist it explicitly states the area appears safe
`medium`	Hazards are included in their spatial context when present
`low`	Standard description order, no extra hazard emphasis

Spatial format

Value	Effect
`clock` (default)	Clock positions (12 o'clock = straight ahead)
`relative`	Relative directions (left, right, ahead, behind)
`both`	Both clock positions and relative directions

Both settings are applied as modifiers appended to the system prompt at runtime.
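One way to picture how the two settings become prompt modifiers is the sketch below. The modifier wording paraphrases the Effect columns above; the exact text the node appends is not documented here, and the names `HAZARD_MODIFIERS`, `SPATIAL_MODIFIERS`, and `build_system_prompt` are hypothetical.

```python
# Hypothetical modifier text, paraphrased from the tables above.
HAZARD_MODIFIERS = {
    "high": "Lead with hazards. If none exist, state explicitly that the area appears safe.",
    "medium": "Include hazards in their spatial context when present.",
    "low": "",  # standard order, no extra hazard emphasis
}

SPATIAL_MODIFIERS = {
    "clock": "Describe positions as clock positions (12 o'clock = straight ahead).",
    "relative": "Describe positions as relative directions (left, right, ahead, behind).",
    "both": "Describe positions with both clock positions and relative directions.",
}


def build_system_prompt(base: str, prioritize_hazards: str = "high",
                        spatial_format: str = "clock") -> str:
    """Append the non-empty modifiers for both settings to the base system prompt.

    Unknown setting values raise KeyError in this sketch.
    """
    parts = [base.strip()]
    for modifier in (HAZARD_MODIFIERS[prioritize_hazards],
                     SPATIAL_MODIFIERS[spatial_format]):
        if modifier:
            parts.append(modifier)
    return "\n\n".join(parts)
```

Because the modifiers are appended rather than substituted, a custom system prompt keeps its own wording and still receives the hazard and spatial instructions.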

Profiles

Profile	Model	Notes
`gemini-2.5-flash` (default)	`gemini-2.5-flash`	Fast and efficient, suitable for real-time use
`gemini-2.5-pro`	`gemini-2.5-pro`	Highest quality
`gemini-2.0-flash`	`gemini-2.0-flash`	Balanced

All profiles use a 1M (1,048,576) token context window.
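The profile table amounts to a lookup from profile name to model settings. A minimal sketch, assuming a plain dictionary and a fallback to the default profile for unknown names (the fallback and the name `resolve_profile` are assumptions; the node may instead reject unknown profiles):

```python
CONTEXT_WINDOW = 1_048_576  # 1M tokens, shared by all profiles

PROFILES = {
    "gemini-2.5-flash": {"model": "gemini-2.5-flash", "modelTotalTokens": CONTEXT_WINDOW},
    "gemini-2.5-pro": {"model": "gemini-2.5-pro", "modelTotalTokens": CONTEXT_WINDOW},
    "gemini-2.0-flash": {"model": "gemini-2.0-flash", "modelTotalTokens": CONTEXT_WINDOW},
}

DEFAULT_PROFILE = "gemini-2.5-flash"


def resolve_profile(name=None) -> dict:
    """Return the settings for a profile, using the default when name is missing or unknown."""
    return PROFILES.get(name or DEFAULT_PROFILE, PROFILES[DEFAULT_PROFILE])
```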

Default output structure

ENVIRONMENT  - type of place
HAZARDS      - obstacles, stairs, vehicles (with positions)
KEY OBJECTS  - notable items with clock positions and distances
TEXT         - any visible text read verbatim
PEOPLE       - count, positions, and actions
NAVIGATION   - clear path forward, turns, or barriers

Customize the Analysis Prompt (accessibility.prompt) field to change this structure.

Authentication

Requires a Google AI API key: get one at aistudio.google.com/apikey and set it in the node's API key field. The node validates the key at startup: a missing key raises an error immediately, and a key with the sk- prefix is rejected as an OpenAI key.
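The two startup checks can be sketched as a small validator. The function name `validate_api_key`, the use of `ValueError`, the error wording, and the whitespace stripping are assumptions made for illustration; only the missing-key check and the sk- prefix check come from the description above.

```python
def validate_api_key(api_key):
    """Reject missing keys and keys that look like OpenAI keys; return the cleaned key."""
    if not api_key or not api_key.strip():
        raise ValueError(
            "A Google AI API key is required. Get one at aistudio.google.com/apikey."
        )
    key = api_key.strip()  # assumption: surrounding whitespace is ignored
    if key.startswith("sk-"):
        raise ValueError(
            "This looks like an OpenAI key (sk- prefix). Use a Google AI API key instead."
        )
    return key
```

Running the checks at startup rather than on the first request means a misconfigured node fails before any image is sent.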

Schema

Field	Type	Description	Default
`accessibility.prioritizeHazards`	`string`	Hazard Priority: how aggressively to prioritize hazard detection	`"high"`
`accessibility.prompt`	`string`	Analysis Prompt: prompt template for generating accessibility descriptions from images	`"Describe this image for a blind person. Include: environment type, hazards with positions, key objects with clock positions, visible text, people, and navigation guidance. Keep under 150 words."`
`accessibility.spatialFormat`	`string`	Spatial Format: how to describe spatial positions	`"clock"`
`accessibility.systemPrompt`	`string`	System Instructions: define the accessibility description behavior and priorities	`"You are an accessibility-focused scene analyzer designed to help blind and visually impaired users understand their surroundings through image descriptions."`
`accessibility_describe.profile`	`string`	Vision Model: select the Gemini vision model for accessibility descriptions	`"gemini-2.5-flash"`
`model`	`string`	Model: Google Gemini vision model	
`modelTotalTokens`	`number`	Tokens: maximum context length in tokens	

Dependencies

google-genai >=1.14.0
