OpenRouter

Evaluate OpenRouter video generation with traces and benchmarks

Frametail wraps the OpenRouter SDK so chat and videoGeneration calls export spans automatically — then you pin datasets, attach scorers, and run immutable benchmarks across models.

Why OpenRouter teams add Frametail

OpenRouter routes requests across many models through one API — useful for swapping video endpoints without rewiring your app. The hard part is knowing whether a model change actually improved output quality. Frametail records each call as a trace with model, latency, and artifact references, then lets you compare runs on a fixed dataset.

You keep OpenRouter for inference and model choice. Frametail is the evaluation layer: benchmarks that do not drift, scorers your team agrees on, and experiments that become linkable release artifacts.

What gets traced

The Frametail SDK wraps `@openrouter/sdk` for `chat.send` (including streaming) and for `videoGeneration` operations: generating video, polling jobs, fetching video content, and listing video models. Each span carries provider metadata, the model id, and sanitized inputs and outputs, so you can debug a call next to its clip in the dashboard.
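To make the idea of a wrapped call concrete, here is a minimal, self-contained sketch of how a wrapper might record one span per call. It is not Frametail's implementation: the `Span` shape, the `sanitize` and `traced` helpers, and the list of sensitive keys are all hypothetical names chosen for illustration.

```typescript
// Hypothetical span shape; field names are illustrative, not Frametail's schema.
interface Span {
  operation: string; // e.g. "videoGeneration.generate"
  provider: string;
  model: string;
  latencyMs: number;
  input: Record<string, unknown>;
  output?: unknown;
  error?: string;
}

// Assumed list of top-level keys that should never be stored in a span.
const SENSITIVE_KEYS = new Set(["apiKey", "authorization", "token"]);

// Replace the values of sensitive top-level keys with a placeholder.
function sanitize(input: Record<string, unknown>): Record<string, unknown> {
  const clean: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    clean[key] = SENSITIVE_KEYS.has(key) ? "[redacted]" : value;
  }
  return clean;
}

// Wrap an async operation so that every call appends exactly one span to `sink`,
// whether the call succeeds or throws.
function traced<I extends Record<string, unknown> & { model: string }, O>(
  operation: string,
  fn: (input: I) => Promise<O>,
  sink: Span[],
): (input: I) => Promise<O> {
  return async (input: I): Promise<O> => {
    const start = Date.now();
    const base = {
      operation,
      provider: "openrouter",
      model: input.model,
      input: sanitize(input),
    };
    try {
      const output = await fn(input);
      sink.push({ ...base, latencyMs: Date.now() - start, output });
      return output;
    } catch (err) {
      sink.push({ ...base, latencyMs: Date.now() - start, error: String(err) });
      throw err; // the caller still sees the original failure
    }
  };
}
```

The wrapper returns a function with the same signature as the one it wraps, which is why application code does not need to change when tracing is switched on.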

Use the same project for OpenRouter and fal if you run multiple providers; benchmarks stay comparable when dataset rows and scorers are pinned.
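One way to read "comparable" here: two benchmark runs can be compared only if they scored the same dataset rows with the same scorers. The check below is a rough sketch under that assumption; `BenchmarkRun`, `sameSet`, and `isComparable` are hypothetical names and do not come from the Frametail SDK.

```typescript
// Hypothetical summary of a benchmark run; field names are illustrative.
interface BenchmarkRun {
  provider: string; // e.g. "openrouter" or "fal"
  model: string;
  datasetRowIds: string[];
  scorers: string[];
}

// True when both arrays contain the same distinct values, ignoring order.
function sameSet(a: string[], b: string[]): boolean {
  const left = new Set(a);
  const right = new Set(b);
  return left.size === right.size && Array.from(left).every((x) => right.has(x));
}

// Runs are comparable only when the pinned rows and the scorer roster match.
// Provider and model are allowed to differ: they are what is being compared.
function isComparable(a: BenchmarkRun, b: BenchmarkRun): boolean {
  return sameSet(a.datasetRowIds, b.datasetRowIds) && sameSet(a.scorers, b.scorers);
}
```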

Typical OpenRouter workflow

Enable tracing in staging, generate video through OpenRouter while you iterate on prompts, then pin representative rows and run a benchmark before promoting a new model slug to production. When live traffic regresses, open the trace for the affected generation instead of searching through generic log lines.

Workflow

  1. Install `frametail` and `@openrouter/sdk`, then create a `FrametailClient` with tracing enabled.
  2. Call `wrapOpenRouterClient` on your OpenRouter instance before making any `chat` or `videoGeneration` calls.
  3. Review traces beside artifacts; tune prompts in experiments.
  4. Pin a dataset and run a benchmark with your scorer roster across model variants.
  5. Share benchmark links when you change OpenRouter model routes in release review.
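
Step 4 can be sketched in a few lines. The example below is a simplified illustration, not the Frametail benchmark API: it uses a stand-in `generate` function instead of real `frametail` or `@openrouter/sdk` calls, and `DatasetRow`, `Scorer`, and `runBenchmark` are hypothetical names. It shows the core idea of scoring every model variant against the same pinned rows with the same scorer roster.

```typescript
// Hypothetical types; names are illustrative, not part of any SDK.
interface DatasetRow {
  id: string;
  prompt: string;
}

interface Scorer {
  name: string;
  score: (row: DatasetRow, output: string) => number;
}

// Stand-in for a provider call: takes a model slug and a prompt and
// returns some reference to the generated output.
type Generate = (model: string, prompt: string) => Promise<string>;

// Run every model against the same frozen rows and return the mean score
// per scorer, keyed by model slug.
async function runBenchmark(
  models: string[],
  rows: DatasetRow[],
  scorers: Scorer[],
  generate: Generate,
): Promise<Record<string, Record<string, number>>> {
  // Copy and freeze the rows so later changes to `rows` cannot alter this run.
  const pinned: readonly DatasetRow[] = Object.freeze(rows.slice());
  const results: Record<string, Record<string, number>> = {};

  for (const model of models) {
    const totals: Record<string, number> = {};
    for (const row of pinned) {
      const output = await generate(model, row.prompt);
      for (const scorer of scorers) {
        totals[scorer.name] = (totals[scorer.name] ?? 0) + scorer.score(row, output);
      }
    }
    const means: Record<string, number> = {};
    for (const scorer of scorers) {
      means[scorer.name] = pinned.length > 0 ? (totals[scorer.name] ?? 0) / pinned.length : 0;
    }
    results[model] = means;
  }
  return results;
}
```

Because each model sees identical rows and scorers, a difference in the resulting means can be attributed to the model change rather than to a shifting dataset.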

Read the technical setup guide →