Core concepts
Projects, traces, benchmarks, scorers, datasets, and prompts — how Frametail models your work.
Organization and project
Your organization owns billing, members, and API keys. A project is the unit of isolation for operational data: traces, prompts, benchmarks, and related artifacts. Most API calls identify their target project through the X-Project-Key request header.
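As a rough illustration of project scoping, the sketch below builds (but does not send) a request that carries the X-Project-Key header. Only the header name comes from this page; the base URL, the /traces path, the key value, and the build_request helper are hypothetical placeholders, and authentication with an organization API key is left out.

```python
import urllib.request

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://api.frametail.example/v1"


def build_request(path: str, project_key: str) -> urllib.request.Request:
    """Build a GET request scoped to one project. Nothing is sent."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        # The header name is taken from the text above; the value is a placeholder.
        headers={"X-Project-Key": project_key},
        method="GET",
    )


# "/traces" is an assumed endpoint name, not a documented one.
req = build_request("/traces", "proj_demo_key")
print(req.full_url)
print(req.header_items())
```

Keeping the project key in one helper means every call is scoped the same way, so switching projects is a single-argument change.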
Traces and spans
A trace is a single end-to-end request or job you care about (for example, one video generation). Spans are timed steps inside that trace — model calls, preprocessing, uploads, or custom stages you instrument. Traces power latency charts, error triage, and comparisons across releases.
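One way to picture the relationship between a trace and its spans is the small in-memory model below. It is a sketch for intuition only: the Span and Trace classes, their field names, and the millisecond timestamps are assumptions, not Frametail's actual schema or SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """One timed step inside a trace (hypothetical shape)."""
    name: str
    start_ms: float
    end_ms: float
    error: Optional[str] = None

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


@dataclass
class Trace:
    """One end-to-end request or job, made up of spans (hypothetical shape)."""
    name: str
    spans: List[Span] = field(default_factory=list)

    def duration_ms(self) -> float:
        # Wall-clock time from the earliest span start to the latest span end.
        if not self.spans:
            return 0.0
        return max(s.end_ms for s in self.spans) - min(s.start_ms for s in self.spans)

    def slowest_span(self) -> Optional[Span]:
        return max(self.spans, key=lambda s: s.duration_ms, default=None)

    def failed_spans(self) -> List[Span]:
        return [s for s in self.spans if s.error is not None]


trace = Trace("video_generation", [
    Span("preprocess", 0, 120),
    Span("model_call", 120, 2120),
    Span("upload", 2120, 2500, error="timeout"),
])
print(trace.duration_ms(), trace.slowest_span().name)
print([s.name for s in trace.failed_spans()])
```

Per-span durations are the kind of data a latency chart needs, and the error field is the kind of data error triage needs.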
Benchmarks
A benchmark runs a defined evaluation over a dataset using one or more scorers. Use benchmarks to track quality over time, compare prompts or models, and gate releases with repeatable metrics.
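Conceptually, a benchmark is a loop over dataset rows that applies each scorer to each output and aggregates the results. The sketch below shows that loop in plain Python, assuming a hypothetical row shape ("input" and "reference" keys), a stand-in generate function, and a made-up release threshold. It does not reflect how Frametail executes benchmarks internally.

```python
from statistics import mean
from typing import Callable, Dict, List

Row = Dict[str, str]
Scorer = Callable[[Row, str], float]  # (dataset row, model output) -> score


def run_benchmark(
    dataset: List[Row],
    generate: Callable[[str], str],
    scorers: Dict[str, Scorer],
) -> Dict[str, float]:
    """Score every row with every scorer and return the mean per scorer."""
    scores: Dict[str, List[float]] = {name: [] for name in scorers}
    for row in dataset:
        output = generate(row["input"])
        for name, scorer in scorers.items():
            scores[name].append(scorer(row, output))
    # An empty dataset yields 0.0 rather than raising.
    return {name: (mean(vals) if vals else 0.0) for name, vals in scores.items()}


dataset = [
    {"input": "hello", "reference": "HELLO"},
    {"input": "world", "reference": "WORLD!"},
]
scorers = {
    "exact_match": lambda row, out: 1.0 if out == row["reference"] else 0.0,
    "non_empty": lambda row, out: 1.0 if out else 0.0,
}

# str.upper stands in for the model or prompt under evaluation.
results = run_benchmark(dataset, str.upper, scorers)

# A release gate is then a comparison against a chosen threshold (0.5 is arbitrary).
release_ok = results["exact_match"] >= 0.5
print(results, release_ok)
```

Because the dataset and scorers are fixed inputs, rerunning the same benchmark against a new prompt or model gives directly comparable numbers.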
Datasets
Datasets are curated tables of inputs (and optional references) that your benchmark jobs iterate over. They keep evaluation reproducible and shareable across teammates.
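For a concrete picture of "inputs and optional references", the sketch below parses a tiny CSV into rows where the reference may be absent. The column names input and reference, the sample rows, and the load_dataset helper are illustrative assumptions, not a documented Frametail dataset format.

```python
import csv
import io
from typing import Dict, List, Optional

# Two sample rows; the second has no reference value.
CSV_TEXT = """input,reference
a red balloon over a lake,balloon
a cat on a skateboard,
"""


def load_dataset(text: str) -> List[Dict[str, Optional[str]]]:
    """Parse CSV text into rows, mapping a blank reference to None."""
    rows: List[Dict[str, Optional[str]]] = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "input": raw["input"],
            "reference": raw.get("reference") or None,
        })
    return rows


rows = load_dataset(CSV_TEXT)
for row in rows:
    print(row)
```

Representing a missing reference as None lets a scorer decide explicitly whether it can run without one.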
Scorers
Scorers encode how outputs are judged — from structured rubrics to model-assisted graders. Attach scorers to benchmarks so every row receives consistent scoring.
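A structured rubric can be thought of as a set of weighted checks applied identically to every output. The sketch below is one minimal way to express that idea. The criteria, weights, and rubric_score function are invented for illustration and are not Frametail's scorer interface.

```python
from typing import Callable, Dict, Tuple

# Each criterion is (weight, check); names and weights are hypothetical.
Criterion = Tuple[int, Callable[[str], bool]]

RUBRIC: Dict[str, Criterion] = {
    "non_empty": (2, lambda out: bool(out.strip())),
    "under_200_chars": (3, lambda out: len(out) <= 200),
    "mentions_subject": (5, lambda out: "balloon" in out.lower()),
}


def rubric_score(output: str, rubric: Dict[str, Criterion] = RUBRIC) -> float:
    """Return the weighted share of rubric criteria the output satisfies (0.0 to 1.0)."""
    total = sum(weight for weight, _ in rubric.values())
    earned = sum(weight for weight, check in rubric.values() if check(output))
    return earned / total


print(rubric_score("A red balloon drifts over the lake."))
print(rubric_score("A cat."))
```

A model-assisted grader would fit the same shape: replace a check with a call that returns a judgment, and keep the aggregation unchanged.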
Prompts
Prompts are versioned text (or parameter sets) tied to your generative workflows. Store canonical prompt bodies in Frametail so experiments and production stay aligned.
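To make "versioned" concrete, the sketch below keeps an append-only history per prompt name and creates a new version only when the body changes. PromptStore, PromptVersion, and the content-digest scheme are hypothetical stand-ins. Frametail's own versioning behavior may differ.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PromptVersion:
    version: int  # 1-based, increases with each changed body
    body: str
    digest: str   # short content hash, used to detect unchanged saves


class PromptStore:
    """In-memory, append-only prompt history (illustration only)."""

    def __init__(self) -> None:
        self._history: Dict[str, List[PromptVersion]] = {}

    def save(self, name: str, body: str) -> PromptVersion:
        history = self._history.setdefault(name, [])
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:12]
        if history and history[-1].digest == digest:
            return history[-1]  # identical body: reuse the latest version
        saved = PromptVersion(len(history) + 1, body, digest)
        history.append(saved)
        return saved

    def get(self, name: str, version: Optional[int] = None) -> PromptVersion:
        history = self._history[name]
        return history[-1] if version is None else history[version - 1]


store = PromptStore()
v1 = store.save("video_intro", "Describe the scene in one sentence.")
same = store.save("video_intro", "Describe the scene in one sentence.")
v2 = store.save("video_intro", "Describe the scene in two sentences.")
print(v1.version, same.version, v2.version)
```

Pinning an experiment or a production deploy to an explicit version number is what keeps the two aligned when the prompt body later changes.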