Accessibility Describe
A RocketRide image-filter node that turns an image into a structured scene description optimized for blind and visually impaired users.
What it does
Receives an image on its input lane, sends it to Google Gemini Vision (via the
google-genai SDK), and emits a text description designed for assistive use, for
example real-time narration on smart glasses. The description covers environment type,
hazards with positions, key objects, visible text read verbatim (OCR), people, and
navigation guidance, kept under 150 words by the default prompt.
The node buffers the incoming image stream, base64-encodes it as a data URL, and sends
it together with the analysis prompt in a single generate_content call. Requests run
with temperature: 0.3 and max_output_tokens: 1024. Transient failures (timeouts,
connection errors, 5xx responses) are retried up to 3 times with exponential backoff.
API errors are translated into user-friendly messages covering authentication, rate
limiting, safety blocks, model unavailability, and timeouts.
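The encode and retry steps described above can be sketched with the standard library alone. This is a minimal illustration, not the node's actual code: `TransientError`, `to_data_url`, `call_with_retries`, and the 1-second base delay are hypothetical names and values, and "retried up to 3 times" is read here as three retries after the first attempt.

```python
import base64
import time


class TransientError(Exception):
    """Hypothetical stand-in for timeouts, connection errors, and 5xx responses."""


def to_data_url(chunks, mime_type):
    """Join buffered stream chunks and encode them as a base64 data URL."""
    raw = b"".join(chunks)
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def call_with_retries(request, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run request(); on TransientError, retry up to max_retries times.

    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    (assumed schedule; the source only says "exponential backoff").
    """
    for attempt in range(max_retries + 1):
        try:
            return request()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the last error
            sleep(base_delay * (2 ** attempt))
```

Passing `sleep` as a parameter keeps the backoff schedule testable without waiting; in real use the default `time.sleep` applies, and `request` would wrap the single `generate_content` call.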
The default model is gemini-2.5-flash. A Google AI API key is required: the node
fails at startup without one, and rejects keys starting with sk- (OpenAI keys) with a
clear error.
Configuration
Lanes
| Lane | Direction | Description |
|---|---|---|
| image | input | The image to describe (streamed; any image MIME type) |
| text | output | The accessibility-optimized scene description |
Fields
| Field | Type | Description |
|---|---|---|
| model | string | Google Gemini vision model |
| modelTotalTokens | number | Maximum context length in tokens |
| systemPrompt | string | Default "You are an accessibility-focused scene analyzer designed to help blind and visually impaired users understand their surroundings through image descriptions.". Define the accessibility description behavior and priorities |
| prompt | string | Default "Describe this image for a blind person. Include: environment type, hazards with positions, key objects with clock positions, visible text, people, and navigation guidance. Keep under 150 words.". Prompt template for generating accessibility descriptions from images |
| prioritizeHazards | string | Default "high". How aggressively to prioritize hazard detection |
| spatialFormat | string | Default "clock". How to describe spatial positions |
| profile | string | Default "gemini-2.5-flash". Select the Gemini vision model for accessibility descriptions |
If accessibility.systemPrompt or accessibility.prompt is left empty, the node falls
back to a generic systemPrompt / prompt config value, then to its built-in defaults.
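The fallback order can be illustrated as a three-level lookup. This sketch assumes the configuration arrives as a flat dict keyed by dotted field names, which the source does not state; `resolve_prompt` is a hypothetical helper, and the default strings are copied from the Fields table.

```python
# Built-in defaults, copied from the Fields table.
DEFAULT_SYSTEM_PROMPT = (
    "You are an accessibility-focused scene analyzer designed to help blind "
    "and visually impaired users understand their surroundings through image "
    "descriptions."
)
DEFAULT_PROMPT = (
    "Describe this image for a blind person. Include: environment type, "
    "hazards with positions, key objects with clock positions, visible text, "
    "people, and navigation guidance. Keep under 150 words."
)


def resolve_prompt(config, name, default):
    """Return the first non-empty value among the three fallback levels."""
    # Assumption: config is a flat dict such as {"accessibility.prompt": "..."}.
    for key in (f"accessibility.{name}", name):
        value = config.get(key)
        if isinstance(value, str) and value.strip():
            return value
    return default
```

For example, `resolve_prompt({}, "prompt", DEFAULT_PROMPT)` returns the built-in prompt because neither the prefixed nor the generic key is set.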
Hazard priority
| Value | Effect |
|---|---|
| high (default) | The model must lead with hazards; if none exist it explicitly states the area appears safe |
| medium | Hazards are included in their spatial context when present |
| low | Standard description order, no extra hazard emphasis |
Spatial format
| Value | Effect |
|---|---|
| clock (default) | Clock positions (12 o'clock = straight ahead) |
| relative | Relative directions (left, right, ahead, behind) |
| both | Both clock positions and relative directions |
Both settings are applied as modifiers appended to the system prompt at runtime.
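As a rough sketch of how such modifiers might be appended, the snippet below maps each setting to a sentence and joins them onto the base system prompt. The modifier wording is paraphrased from the two tables above and is an assumption, as are the names `HAZARD_MODIFIERS`, `SPATIAL_MODIFIERS`, and `build_system_prompt`; the node's real modifier text may differ.

```python
# Hypothetical modifier text, paraphrased from the Hazard priority table.
HAZARD_MODIFIERS = {
    "high": "Lead with hazards. If there are none, explicitly state that the area appears safe.",
    "medium": "Include hazards in their spatial context when present.",
    "low": "",  # standard order, nothing appended
}

# Hypothetical modifier text, paraphrased from the Spatial format table.
SPATIAL_MODIFIERS = {
    "clock": "Describe positions using clock positions (12 o'clock = straight ahead).",
    "relative": "Describe positions using relative directions (left, right, ahead, behind).",
    "both": "Describe positions using both clock positions and relative directions.",
}


def build_system_prompt(base, prioritize_hazards="high", spatial_format="clock"):
    """Append the hazard and spatial modifiers to the base system prompt."""
    if prioritize_hazards not in HAZARD_MODIFIERS:
        raise ValueError(f"unknown prioritizeHazards value: {prioritize_hazards!r}")
    if spatial_format not in SPATIAL_MODIFIERS:
        raise ValueError(f"unknown spatialFormat value: {spatial_format!r}")
    parts = [base]
    for modifier in (HAZARD_MODIFIERS[prioritize_hazards], SPATIAL_MODIFIERS[spatial_format]):
        if modifier:  # skip empty modifiers such as "low"
            parts.append(modifier)
    return "\n\n".join(parts)
```

Keeping the modifiers in lookup tables means an unsupported value fails loudly instead of silently producing an unmodified prompt.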
Profiles
| Profile | Model | Notes |
|---|---|---|
| gemini-2.5-flash (default) | gemini-2.5-flash | Fast and efficient, suitable for real-time use |
| gemini-2.5-pro | gemini-2.5-pro | Highest quality |
| gemini-2.0-flash | gemini-2.0-flash | Balanced |
All profiles use a 1M (1,048,576) token context window.
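One way to picture the profile table is as a lookup that expands a profile name into the `model` and `modelTotalTokens` fields. The dict layout and the `resolve_profile` helper are hypothetical; only the profile names, model names, and the 1,048,576-token window come from the table above.

```python
CONTEXT_WINDOW = 1_048_576  # 1M tokens, shared by all profiles

PROFILES = {
    "gemini-2.5-flash": {"model": "gemini-2.5-flash", "modelTotalTokens": CONTEXT_WINDOW},
    "gemini-2.5-pro": {"model": "gemini-2.5-pro", "modelTotalTokens": CONTEXT_WINDOW},
    "gemini-2.0-flash": {"model": "gemini-2.0-flash", "modelTotalTokens": CONTEXT_WINDOW},
}
DEFAULT_PROFILE = "gemini-2.5-flash"


def resolve_profile(name=None):
    """Return a copy of the settings for a profile, or for the default profile."""
    return dict(PROFILES[name or DEFAULT_PROFILE])
```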
Default output structure
1. ENVIRONMENT - type of place
2. HAZARDS - obstacles, stairs, vehicles (with positions)
3. KEY OBJECTS - notable items with clock positions and distances
4. TEXT - any visible text read verbatim
5. PEOPLE - count, positions, and actions
6. NAVIGATION - clear path forward, turns, or barriers
Customize the Analysis Prompt (accessibility.prompt) field to change this structure.
Authentication
Requires a Google AI API key: get one at
aistudio.google.com/apikey and set it in the
node's API key field. The node validates the key at startup: a missing key raises an
error immediately, and a key with the sk- prefix is rejected as an OpenAI key.
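The two startup checks can be sketched as follows. `validate_api_key` and the error messages are hypothetical; the source only states that a missing key and an `sk-` prefixed key are both rejected.

```python
def validate_api_key(api_key):
    """Return the trimmed key, or raise ValueError for a missing or OpenAI-style key."""
    if not api_key or not api_key.strip():
        raise ValueError("A Google AI API key is required.")
    key = api_key.strip()
    if key.startswith("sk-"):
        # Keys with the sk- prefix are treated as OpenAI keys.
        raise ValueError(
            "This looks like an OpenAI key (sk- prefix); a Google AI API key is required."
        )
    return key
```

Failing at startup rather than on the first request means a misconfigured node is caught before any image is sent.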
Schema
| Field | Type | Description | Default |
|---|---|---|---|
| accessibility.prioritizeHazards | string | Hazard Priority: How aggressively to prioritize hazard detection | "high" |
| accessibility.prompt | string | Analysis Prompt: Prompt template for generating accessibility descriptions from images | "Describe this image for a blind person. Include: environment type, hazards with positions, key objects with clock positions, visible text, people, and navigation guidance. Keep under 150 words." |
| accessibility.spatialFormat | string | Spatial Format: How to describe spatial positions | "clock" |
| accessibility.systemPrompt | string | System Instructions: Define the accessibility description behavior and priorities | "You are an accessibility-focused scene analyzer designed to help blind and visually impaired users understand their surroundings through image descriptions." |
| accessibility_describe.profile | string | Vision Model: Select the Gemini vision model for accessibility descriptions | "gemini-2.5-flash" |
| model | string | Model: Google Gemini vision model | |
| modelTotalTokens | number | Tokens: Maximum context length in tokens | |
Dependencies
google-genai>=1.14.0