API · Updated 2026-03-18

Cache, Cost, and Usage Endpoints

Inspect cache behavior, understand cost, and measure API activity with the operational endpoints most teams reach for after the first successful integration.

These endpoints are for teams that already have RetainDB running and now want to answer operational questions like:

  • Is cache doing useful work?
  • What is this project costing?
  • Which request types are driving usage?

They are not part of the initial setup path, but they become useful quickly once traffic is real.

Endpoints covered here

  • GET /v1/cache/stats
  • GET /v1/cost/summary
  • GET /v1/cost/breakdown
  • GET /v1/cost/savings
  • GET /v1/usage
  • GET /v1/usage/timeseries

Cache stats

Use GET /v1/cache/stats to answer a simple question: is cache helping or just existing?

bash
curl "https://api.retaindb.com/v1/cache/stats" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

Example response:

json
{
  "cache_type": "redis",
  "hit_rate": 0.78,
  "total_requests": 19700,
  "hits": 15420,
  "misses": 4280,
  "size_bytes": 1843200,
  "keys_count": 913,
  "average_latency_ms": 5,
  "uptime_seconds": 86400
}

The fields that matter most at first:

  • hit_rate
  • total_requests
  • average_latency_ms
  • cache_type
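
Those fields are enough to answer the "helping or just existing" question with a few lines of arithmetic. The sketch below works over the example payload above; the 0.5 hit-rate threshold and the consistency tolerance are illustrative choices, not values RetainDB prescribes.

```python
# Sketch: interpret a /v1/cache/stats payload. Field names come from the
# example response above; the thresholds are illustrative assumptions.
stats = {
    "cache_type": "redis",
    "hit_rate": 0.78,
    "total_requests": 19700,
    "hits": 15420,
    "misses": 4280,
    "average_latency_ms": 5,
}

def cache_verdict(s: dict) -> str:
    """Classify whether the cache is doing useful work."""
    computed = s["hits"] / s["total_requests"]  # should track reported hit_rate
    if abs(computed - s["hit_rate"]) > 0.02:
        return "inconsistent stats - investigate"
    if s["hit_rate"] >= 0.5:
        return "cache is helping"
    return "cache is mostly just existing"

print(cache_verdict(stats))
```

Cross-checking the reported hit_rate against hits / total_requests is a cheap sanity check before you trust the number in a dashboard.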

Cost summary

Use GET /v1/cost/summary for the big picture.

Query parameters:

  • project (optional)
  • start_date (optional, ISO 8601 datetime)
  • end_date (optional, ISO 8601 datetime)

bash
curl "https://api.retaindb.com/v1/cost/summary?project=retaindb-quickstart" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

Example response:

json
{
  "org_id": "org_123",
  "project_id": "proj_456",
  "period": {
    "start": "2026-03-01T00:00:00.000Z",
    "end": "2026-03-09T00:00:00.000Z"
  },
  "total_cost_usd": 125.5,
  "total_requests": 125000,
  "cost_by_model": {
    "claude-sonnet": 74.2
  },
  "cost_by_task": {
    "query": 80.1,
    "ingest": 45.4
  },
  "average_cost_per_request": 0.001,
  "estimated_monthly_cost": 512.0
}
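
The derived fields in that response can be reproduced from the raw ones. A minimal sketch, using the example payload above: average_cost_per_request is total cost over total requests, and a naive linear projection over the period gives a first-order monthly figure. Note the naive projection will not necessarily match estimated_monthly_cost; the server-side estimator's formula is not documented here, so treat the projection only as a sanity check.

```python
from datetime import datetime

# Sketch: re-derive the summary's computed fields from its raw ones.
# Field names come from the example response above.
summary = {
    "period": {"start": "2026-03-01T00:00:00.000Z", "end": "2026-03-09T00:00:00.000Z"},
    "total_cost_usd": 125.5,
    "total_requests": 125000,
}

# Average cost per request: should agree with average_cost_per_request.
avg = summary["total_cost_usd"] / summary["total_requests"]

# Period length in days, then a naive linear projection to 30 days.
start = datetime.fromisoformat(summary["period"]["start"].replace("Z", "+00:00"))
end = datetime.fromisoformat(summary["period"]["end"].replace("Z", "+00:00"))
days = (end - start).days
monthly = summary["total_cost_usd"] / days * 30

print(round(avg, 4), days, round(monthly, 2))
```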

Cost breakdown

Use GET /v1/cost/breakdown when the summary is not enough and you need to see where spend is concentrating.

Query parameters:

  • project (optional)
  • group_by (optional: model, task, day, or hour)
  • start_date (optional)
  • end_date (optional)

bash
curl "https://api.retaindb.com/v1/cost/breakdown?group_by=task" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

Good first choice: group_by=task

That usually answers the operational question faster than grouping by model.
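
Once you have a breakdown, the operational question is usually "which few groups cover most of the spend?". A sketch, assuming a flat task-to-cost mapping like the one below; the real /v1/cost/breakdown payload shape may differ, so check the actual response for field names (the "rerank" task here is hypothetical).

```python
# Sketch: find where spend concentrates in a group_by=task breakdown.
# The response shape and the "rerank" task are assumptions for illustration.
breakdown = {"query": 80.1, "ingest": 45.4, "rerank": 3.2}

def top_spenders(groups: dict, threshold: float = 0.8) -> list:
    """Return the smallest set of groups covering `threshold` of total spend."""
    total = sum(groups.values())
    out, covered = [], 0.0
    for name, cost in sorted(groups.items(), key=lambda kv: -kv[1]):
        out.append(name)
        covered += cost
        if covered / total >= threshold:
            break
    return out

print(top_spenders(breakdown))
```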

Cost savings

Use GET /v1/cost/savings when you want RetainDB's optimization story, not just raw spend.

bash
curl "https://api.retaindb.com/v1/cost/savings" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

The response compares actual cost against an “always use the expensive path” baseline.

Look for:

  • actual_cost_usd
  • opus_only_cost_usd
  • savings_usd
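
The three fields relate to one another as a simple subtraction, which makes a quick consistency check and a savings percentage easy to compute. The dollar values below are hypothetical; only the field names come from the list above.

```python
# Sketch: turn the three savings fields into a percentage.
# Values are hypothetical; field names come from the list above.
savings = {
    "actual_cost_usd": 125.5,
    "opus_only_cost_usd": 502.0,
    "savings_usd": 376.5,
}

# Sanity check: baseline minus actual should equal reported savings.
gap = savings["opus_only_cost_usd"] - savings["actual_cost_usd"]
assert abs(gap - savings["savings_usd"]) < 0.01

pct = savings["savings_usd"] / savings["opus_only_cost_usd"] * 100
print(f"{pct:.0f}% saved vs the always-expensive baseline")
```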

Usage summary

Use GET /v1/usage for aggregate usage over a trailing window, set with the days query parameter.

bash
curl "https://api.retaindb.com/v1/usage?days=30" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

The response groups usage by event type and includes:

  • request count
  • total tokens
  • total embedding tokens
  • average latency
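
A derived metric worth computing from that grouping is tokens per request, which tells you whether volume growth is coming from more calls or heavier calls. The payload shape and numbers below are assumptions for illustration; only the metric names mirror the list above.

```python
# Sketch: derive tokens-per-request from a usage summary grouped by event
# type. The payload shape and values here are hypothetical assumptions.
usage = {
    "query": {"requests": 100000, "total_tokens": 42000000, "avg_latency_ms": 120},
    "ingest": {"requests": 25000, "total_tokens": 90000000, "avg_latency_ms": 300},
}

for event, m in usage.items():
    per_req = m["total_tokens"] / m["requests"]
    print(f"{event}: {per_req:.0f} tokens/request")
```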

Usage timeseries

Use GET /v1/usage/timeseries when you need a trend line instead of a single rollup.

Query parameters:

  • days (optional, defaults to 7)
  • project_id (optional)

bash
curl "https://api.retaindb.com/v1/usage/timeseries?days=7" \
  -H "Authorization: Bearer $RETAINDB_API_KEY"

This is the better endpoint for dashboards and regression checks.
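
For a regression check, the usual move is to compare the latest point against the average of the preceding days. A minimal sketch over a list of per-day request counts; the counts and the 0.5 drop factor are hypothetical, and the real /v1/usage/timeseries payload shape may differ.

```python
# Sketch: flag a sharp drop in a 7-day timeseries. The per-day request
# counts and the drop factor are hypothetical assumptions.
daily_requests = [18000, 18500, 17900, 18200, 18400, 18100, 8200]

def dropped_sharply(points: list, factor: float = 0.5) -> bool:
    """True if the latest day fell below `factor` x the prior-days average."""
    baseline = sum(points[:-1]) / (len(points) - 1)
    return points[-1] < factor * baseline

print(dropped_sharply(daily_requests))
```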

Good defaults

  • start with org-wide summary before filtering by project
  • use group_by=task for the first cost investigation
  • use days=7 or days=30 before reaching for custom ranges
  • check cache hit rate and usage trends together; one without the other is easy to misread

Common mistakes

Treating usage as billing

Usage and cost are related, but they are not the same endpoint family. Start with usage for volume and cost for spend.

Filtering too early

If you jump straight to a project filter, you can miss whether the problem is systemic across the org.

Debugging latency from cost pages

These pages help with operational visibility, not request-level trace debugging. For single-request behavior, use endpoint-specific responses and trace ids.

Next step

If you want the dashboard view of similar data, go to usage analytics. If you are tuning request behavior rather than monitoring spend, continue to latency accounting.
