Cost & token analytics

Track spending and token usage across your LLM traces and benchmarks.

Frametail tracks the cost and token consumption of every LLM call, benchmark run, and live-scored trace. This helps you understand where money goes, optimize expensive operations, and set budgets.

Understanding costs

When you ingest a trace or run a benchmark, Frametail computes cost in integer micro-USD (USD × 1,000,000) to ensure exact sums across thousands of spans. This eliminates the floating-point rounding errors that accumulate with float cost fields.
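To illustrate why integer micro-USD sums stay exact, here is a minimal JavaScript sketch. The span costs are made-up values chosen for the demonstration, not Frametail output.

```javascript
// Illustrative span costs (made-up values), once as float USD and once as integer micro-USD.
const floatCostsUsd = [0.1, 0.2];
const microCosts = [100_000, 200_000];

const sum = (values) => values.reduce((total, v) => total + v, 0);

const floatTotal = sum(floatCostsUsd);
const microTotal = sum(microCosts);

console.log(floatTotal === 0.3);      // false: 0.1 and 0.2 have no exact binary representation
console.log(microTotal === 300_000);  // true: integer addition is exact
console.log(microTotal / 1_000_000);  // 0.3, converted to USD only for display
```

The integer total stays exact as long as it remains below Number.MAX_SAFE_INTEGER, which is roughly 9 billion USD when expressed in micro-USD.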

How cost is calculated

Cost is determined in this order (first match wins):

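  1. Explicit cost: If you send metrics.costMicroUsd directly, Frametail uses it as-is.
  2. Converted legacy cost: If you send metrics.costUsd (deprecated float), Frametail converts it to micro-USD: costMicroUsd = round(costUsd × 1,000,000).
  3. Computed from tokens + model: If token counts and a recognized model are present, Frametail looks up the model's prices (USD per 1M tokens) and computes:
    costMicroUsd = round(inputTokens × inputPricePerMillion + outputTokens × outputPricePerMillion)
    Because prices are quoted per 1M tokens, tokens × price is already in micro-USD. For example, 1,000 input tokens at $2.50 per 1M cost 1,000 × 2.50 = 2,500 micro-USD ($0.0025).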
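The resolution order above can be sketched as a single function. This is an illustrative re-implementation, not Frametail's source: the function name, the catalog shape, the "provider/model" key format, and the fallback to 0 for unpriced spans are all assumptions, and promptTokens/completionTokens are taken to be the input and output token counts.

```javascript
// Hypothetical catalog; prices are USD per 1M tokens, copied from the pricing table on this page.
const catalog = {
  'openai/gpt-4o': { inputPer1M: 2.5, outputPer1M: 10.0 },
};

// First match wins: explicit micro-USD, then legacy float USD, then tokens + model.
function resolveCostMicroUsd(metrics, model) {
  // 1. Explicit integer cost is used as-is.
  if (Number.isInteger(metrics.costMicroUsd)) return metrics.costMicroUsd;

  // 2. Deprecated float USD is converted to micro-USD.
  if (typeof metrics.costUsd === 'number') {
    return Math.round(metrics.costUsd * 1_000_000);
  }

  // 3. Tokens x price per 1M tokens yields micro-USD directly.
  const price = catalog[model];
  if (price && metrics.promptTokens != null && metrics.completionTokens != null) {
    return Math.round(
      metrics.promptTokens * price.inputPer1M +
        metrics.completionTokens * price.outputPer1M
    );
  }

  // Assumption: spans that cannot be priced report a cost of 0.
  return 0;
}

console.log(resolveCostMicroUsd({ promptTokens: 1000, completionTokens: 500 }, 'openai/gpt-4o')); // 7500
console.log(resolveCostMicroUsd({ costUsd: 0.00042 }, 'unknown-model')); // 420
```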

Supported models & pricing

Frametail has built-in pricing for common LLM providers (list prices, rounded):

Provider    Model               Input (USD per 1M tokens)   Output (USD per 1M tokens)
OpenAI      gpt-4o              $2.50                       $10.00
OpenAI      gpt-4.1-mini        $0.40                       $1.60
Anthropic   claude-3.5-sonnet   $3.00                       $15.00
Anthropic   claude-3.5-haiku    $0.80                       $4.00
Google      gemini-2.0-flash    $0.10                       $0.40
Meta        llama-3.3-70b       $0.12                       $0.30

Custom pricing

Override or extend model pricing via the FRAMETAIL_MODEL_PRICING environment variable:

export FRAMETAIL_MODEL_PRICING='{
  "your-vendor/custom-model": {
    "inputPer1M": 0.50,
    "outputPer1M": 1.50
  }
}'

Prices are in USD per 1M tokens. The value is merged with the built-in catalog, and your entries take precedence over built-in entries for the same model.
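As a rough model of the merge, the sketch below overlays a parsed FRAMETAIL_MODEL_PRICING value on a built-in catalog. The function name, the "provider/model" key format for built-in entries, and the per-model (shallow) replacement are assumptions; the overridden gpt-4o rate is a made-up value.

```javascript
// Overlay custom pricing on the built-in catalog; custom entries win on key collisions.
function mergePricing(builtIn, envValue) {
  const custom = envValue ? JSON.parse(envValue) : {};
  return { ...builtIn, ...custom };
}

const builtIn = {
  'openai/gpt-4o': { inputPer1M: 2.5, outputPer1M: 10.0 },
};

// Stand-in for process.env.FRAMETAIL_MODEL_PRICING.
const envValue = JSON.stringify({
  'openai/gpt-4o': { inputPer1M: 2.0, outputPer1M: 8.0 }, // made-up negotiated rate
  'your-vendor/custom-model': { inputPer1M: 0.5, outputPer1M: 1.5 },
});

const pricing = mergePricing(builtIn, envValue);
console.log(pricing['openai/gpt-4o'].inputPer1M);             // 2
console.log(pricing['your-vendor/custom-model'].outputPer1M); // 1.5
```

In this sketch an entry replaces the whole built-in entry for that model, so a custom entry should specify both inputPer1M and outputPer1M.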

SDK usage

When instrumenting with the Frametail SDK, include tokens in your span metrics:

import Frametail from 'frametail'

const client = new Frametail()

const result = await client.trace('generate-response', async (span) => {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: 'Summarize this...' }],
  })

  span.setMetrics({
    promptTokens: response.usage.prompt_tokens,
    completionTokens: response.usage.completion_tokens,
    totalTokens: response.usage.total_tokens,
    // Optionally set explicit cost if you negotiate custom rates:
    // costMicroUsd: Math.round(0.00042 * 1_000_000),
  })

  return response.choices[0].message.content
})

Frametail derives the model from the span name or attributes and computes the cost from the token counts. If the model is not in the pricing catalog, set costMicroUsd explicitly.

Querying usage

Fetch cost and token rollups for a project via the API:

curl https://api.frametail.io/graphql \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "query { projectUsageRollup(projectId: \"...\", startTime: 1718731200000, endTime: 1718817600000) { costMicroUsd spanCount promptTokens completionTokens byModel { model spanCount costMicroUsd totalTokens } } }"
  }'
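The same request can be sketched in JavaScript, where JSON.stringify handles the quote escaping that the curl command does by hand. The function names are hypothetical, and the sketch assumes the endpoint accepts a JSON POST body and that Node 18+ global fetch is available.

```javascript
// Build the JSON body for the projectUsageRollup query shown in the curl example.
function buildRollupRequestBody(projectId, startTime, endTime) {
  if (!Number.isInteger(startTime) || !Number.isInteger(endTime)) {
    throw new TypeError('startTime and endTime must be integer millisecond timestamps');
  }
  const query = `query {
    projectUsageRollup(projectId: ${JSON.stringify(projectId)}, startTime: ${startTime}, endTime: ${endTime}) {
      costMicroUsd spanCount promptTokens completionTokens
      byModel { model spanCount costMicroUsd totalTokens }
    }
  }`;
  return JSON.stringify({ query });
}

// Send the query and return the rollup object from the response.
async function fetchUsageRollup(apiKey, projectId, startTime, endTime) {
  const response = await fetch('https://api.frametail.io/graphql', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: buildRollupRequestBody(projectId, startTime, endTime),
  });
  if (!response.ok) throw new Error(`Rollup query failed: HTTP ${response.status}`);
  const { data } = await response.json();
  return data.projectUsageRollup;
}
```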

Response (example):

{
  "data": {
    "projectUsageRollup": {
      "costMicroUsd": 125000,
      "spanCount": 3200,
      "promptTokens": 720000,
      "completionTokens": 180000,
      "byModel": [
        {
          "model": "openai/gpt-4o",
          "spanCount": 500,
          "costMicroUsd": 85000,
          "totalTokens": 420000
        },
        {
          "model": "openai/gpt-4.1-mini",
          "spanCount": 2700,
          "costMicroUsd": 40000,
          "totalTokens": 480000
        }
      ]
    }
  }
}

Note: costMicroUsd is an integer number of micro-USD. Divide by 1,000,000 to get USD:

125000 micro-USD ÷ 1,000,000 = $0.125
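A small sketch of the conversion, applied to the cost fields of the example response above. The helper names are hypothetical; the idea is to keep integers for all arithmetic and convert to USD only when displaying.

```javascript
// Cost fields copied from the example response above.
const rollup = {
  costMicroUsd: 125000,
  byModel: [
    { model: 'openai/gpt-4o', costMicroUsd: 85000 },
    { model: 'openai/gpt-4.1-mini', costMicroUsd: 40000 },
  ],
};

// Convert integer micro-USD to a display string with six decimal places.
function formatUsd(microUsd) {
  return `$${(microUsd / 1_000_000).toFixed(6)}`;
}

// Each model's share of total spend, as a percentage rounded to one decimal place.
function costShares(r) {
  return r.byModel.map(({ model, costMicroUsd }) => ({
    model,
    cost: formatUsd(costMicroUsd),
    sharePct: r.costMicroUsd === 0 ? 0 : Math.round((costMicroUsd * 1000) / r.costMicroUsd) / 10,
  }));
}

console.log(formatUsd(rollup.costMicroUsd)); // $0.125000
console.log(costShares(rollup));             // shares of 68 and 32 percent
```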

Time ranges

startTime and endTime are Unix timestamps in milliseconds:

// Last 24 hours
const now = Date.now()
const startTime = now - 24 * 60 * 60 * 1000
const endTime = now

// Or a specific date range (UTC); named differently so both snippets can share a scope
const rangeStart = new Date('2024-06-18T00:00:00Z').getTime()
const rangeEnd = new Date('2024-06-19T00:00:00Z').getTime()
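To chart spend per day, one option is to split a longer range into one-day windows and run a rollup query for each. The sketch below assumes half-open windows, where each window's endTime equals the next window's startTime; whether the API treats endTime as inclusive or exclusive is not stated here, so verify that before relying on exact boundaries. dailyWindows is a hypothetical helper.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Split [startTime, endTime) into consecutive windows of at most one day each.
function dailyWindows(startTime, endTime) {
  const windows = [];
  for (let t = startTime; t < endTime; t += DAY_MS) {
    windows.push({ startTime: t, endTime: Math.min(t + DAY_MS, endTime) });
  }
  return windows;
}

const start = Date.parse('2024-06-18T00:00:00Z');
const windows = dailyWindows(start, start + 3 * DAY_MS);

console.log(windows.length);                               // 3
console.log(new Date(windows[2].startTime).toISOString()); // 2024-06-20T00:00:00.000Z
```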

Benchmarks

Every benchmark run aggregates cost across all generated samples. In the benchmark detail view, you can see:

  • Total cost for the run
  • Average cost per sample
  • Cost by model (if multiple models are used)

This helps you understand whether a new model version is worth the extra spend.
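The three figures in the detail view correspond to a simple aggregation over per-sample costs. The sketch below uses hypothetical sample records; the field names and values are illustrative and do not describe a Frametail schema.

```javascript
// Hypothetical per-sample records with costs in integer micro-USD.
const samples = [
  { model: 'openai/gpt-4o', costMicroUsd: 7500 },
  { model: 'openai/gpt-4o', costMicroUsd: 6500 },
  { model: 'openai/gpt-4.1-mini', costMicroUsd: 1000 },
];

// Compute total cost, average cost per sample, and cost by model for one run.
function summarizeRun(runSamples) {
  const byModel = {};
  let totalMicroUsd = 0;
  for (const { model, costMicroUsd } of runSamples) {
    totalMicroUsd += costMicroUsd;
    byModel[model] = (byModel[model] ?? 0) + costMicroUsd;
  }
  // The average may be fractional, so round it to stay in integer micro-USD.
  const avgMicroUsd = runSamples.length
    ? Math.round(totalMicroUsd / runSamples.length)
    : 0;
  return { totalMicroUsd, avgMicroUsd, byModel };
}

const summary = summarizeRun(samples);
console.log(summary.totalMicroUsd); // 15000
console.log(summary.avgMicroUsd);   // 5000
```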

Dashboard & alerts

Future versions of Frametail will include:

  • A Cost Over Time graph showing daily/weekly spend trends.
  • Budget alerts when you approach thresholds.
  • Cost breakdown by model, project, and scorer.

For now, use the API queries above to build your own dashboards.
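Until built-in alerts exist, a budget check can run on top of the costMicroUsd value returned by a rollup query. This is a minimal sketch: budgetStatus and its 80% default warning threshold are hypothetical, and both amounts are integer micro-USD.

```javascript
// Classify spend against a budget; warnRatio is the fraction of the budget that triggers a warning.
function budgetStatus(spentMicroUsd, budgetMicroUsd, warnRatio = 0.8) {
  if (spentMicroUsd >= budgetMicroUsd) return 'over';
  if (spentMicroUsd >= budgetMicroUsd * warnRatio) return 'warn';
  return 'ok';
}

console.log(budgetStatus(125_000, 1_000_000));   // ok
console.log(budgetStatus(850_000, 1_000_000));   // warn
console.log(budgetStatus(1_200_000, 1_000_000)); // over
```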

Troubleshooting

Cost is $0 even though I have tokens:

  • The model may not be in Frametail's pricing catalog. Add it via FRAMETAIL_MODEL_PRICING, or set the cost explicitly via span.setMetrics({ costMicroUsd: ... }).

Cost looks too high / too low:

  • Verify your token counts are correct (print response.usage).
  • Check the model pricing table above; some list prices vary by region.
  • If using custom pricing, double-check the inputPer1M and outputPer1M values.

Costs don't sum correctly:

  • If you're using float costUsd on old spans, rounding differences can accumulate. Migrate to costMicroUsd (integer) for new spans.