Export real-time resource and execution metrics from your Cerebrium applications to your existing observability platform. Monitor CPU, memory, GPU usage, request counts, and latency alongside your other services. Metrics can be delivered to any OTLP-compatible monitoring platform, including the major managed offerings.

How it works

Cerebrium automatically pushes metrics from your applications to your monitoring platform every 60 seconds using the OpenTelemetry Protocol (OTLP). You provide an OTLP endpoint and authentication credentials through the Cerebrium dashboard, and Cerebrium handles the rest — collecting resource usage and execution data, formatting it as OpenTelemetry metrics, and delivering it to your platform.
  • Metrics are pushed every 60 seconds
  • Failed pushes are retried 3 times with exponential backoff
  • If pushes fail 10 consecutive times, export is automatically paused to avoid noise (you can re-enable at any time from the dashboard)
  • Your credentials are stored encrypted and are never returned in API responses
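
Because export pauses after repeated failures, it can be useful to alert from the monitoring side when Cerebrium metrics stop arriving. A minimal PromQL sketch, assuming a Prometheus-compatible backend (the project ID is the placeholder used elsewhere on this page):
# Fires when no container metrics have been received for 10 minutes
absent_over_time(cerebrium_containers_running_count{project_id="p-abc12345"}[10m])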

Supported destinations

  • Grafana Cloud — Primary supported destination
  • Datadog — Via OTLP endpoint
  • Prometheus — Self-hosted with OTLP receiver enabled
  • Custom — Any OTLP-compatible endpoint (New Relic, Honeycomb, etc.)

What metrics are exported?

Resource Metrics

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| cerebrium_cpu_utilization_cores | Gauge | cores | CPU cores actively in use per app |
| cerebrium_memory_usage_bytes | Gauge | bytes | Memory actively in use per app |
| cerebrium_gpu_memory_usage_bytes | Gauge | bytes | GPU VRAM in use per app |
| cerebrium_gpu_compute_utilization_percent | Gauge | percent | GPU compute utilization (0-100) per app |
| cerebrium_containers_running_count | Gauge | count | Number of running containers per app |
| cerebrium_containers_ready_count | Gauge | count | Number of ready containers per app |
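
For example, the gauges above can be averaged or summed directly in PromQL (a sketch; my-model and p-abc12345 are the placeholder values used elsewhere on this page):
# Average GPU compute utilization for one app over the last 5 minutes
avg_over_time(cerebrium_gpu_compute_utilization_percent{app_name="my-model"}[5m])
# Memory in use across a project, grouped by app
sum by (app_name) (cerebrium_memory_usage_bytes{project_id="p-abc12345"})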

Execution Metrics

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| cerebrium_run_execution_time_ms | Histogram | ms | Time spent executing user code |
| cerebrium_run_queue_time_ms | Histogram | ms | Time spent waiting in queue |
| cerebrium_run_coldstart_time_ms | Histogram | ms | Time for container cold start |
| cerebrium_run_response_time_ms | Histogram | ms | Total end-to-end response time |
| cerebrium_run_total_total | Counter | count | Total run count |
| cerebrium_run_successes_total | Counter | count | Successful run count |
| cerebrium_run_errors_total | Counter | count | Failed run count |
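
The counters combine naturally into rates and ratios. A short PromQL sketch for the error rate of a single app (my-model is a placeholder):
# Fraction of runs that errored over the last 5 minutes
rate(cerebrium_run_errors_total{app_name="my-model"}[5m]) / rate(cerebrium_run_total_total{app_name="my-model"}[5m])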

Labels

Every metric includes the following labels for filtering and grouping:
| Label | Description | Example |
| --- | --- | --- |
| project_id | Your Cerebrium project ID | p-abc12345 |
| app_id | Full application identifier | p-abc12345-my-model |
| app_name | Human-readable app name | my-model |
| region | Deployment region | us-east-1 |
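
These labels work as ordinary PromQL selectors and grouping keys. A sketch using the placeholder values above:
# Running containers per region for a project
sum by (region) (cerebrium_containers_running_count{project_id="p-abc12345"})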

Setup Guide

Step 1: Get your platform credentials

Before heading to the Cerebrium dashboard, you’ll need an OTLP endpoint and authentication credentials from your monitoring platform. The steps below use Grafana Cloud, the primary supported destination; other OTLP-compatible platforms follow the same general pattern.
  1. Sign in to Grafana Cloud
  2. Go to your stack → Connections → Add new connection
  3. Search for “OpenTelemetry” and click Configure
  4. Copy the OTLP endpoint — this will match your stack’s region:
    • US: https://otlp-gateway-prod-us-east-0.grafana.net/otlp
    • EU: https://otlp-gateway-prod-eu-west-0.grafana.net/otlp
    • Other regions will show their specific URL on the configuration page
  5. On the same page, generate an API token with the MetricsPublisher role
  6. The page will show you an Instance ID and the generated token. Run the following in your terminal to create the Basic auth string:
echo -n "INSTANCE_ID:TOKEN" | base64
Copy the output — you’ll paste it in the dashboard in the next step.
Make sure the API token has the MetricsPublisher role. The default Prometheus Remote Write token will not work with the OTLP endpoint.
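
Optionally, you can sanity-check the credentials before configuring Cerebrium by sending an empty OTLP payload straight to the gateway. This is only a sketch: it assumes the standard OTLP/HTTP conventions (metrics POSTed to /v1/metrics under the endpoint, JSON encoding accepted); a 2xx response means the Basic auth string works.
# Substitute your gateway URL from step 4 and the Base64 string from step 6
curl -i -X POST "https://otlp-gateway-prod-us-east-0.grafana.net/otlp/v1/metrics" \
  -H "Authorization: Basic YOUR_BASE64_STRING" \
  -H "Content-Type: application/json" \
  -d '{"resourceMetrics":[]}'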

Step 2: Configure in the Cerebrium dashboard

  1. In the Cerebrium dashboard, go to your project → Integrations → Metrics Export
  2. Paste your OTLP endpoint from Step 1
  3. Add your authentication headers:
    • Header name: Authorization
    • Header value: Basic YOUR_BASE64_STRING (the output from the terminal command in Step 1)
  4. Click Save & Enable
Your metrics will start flowing within 60 seconds. The dashboard will show a green “Connected” status with the time of the last successful export.

Step 3: Verify the connection

Click Test Connection in the dashboard to verify Cerebrium can reach your monitoring platform. You’ll see a success or failure message with details. If the test fails, double-check your endpoint URL and credentials from Step 1.

Viewing Metrics

Once connected, metrics will appear in your monitoring platform within a minute.
  1. Go to your Grafana Cloud dashboard → Explore
  2. Select your Prometheus data source; it will be named something like grafanacloud-yourstack-prom (you can find it under Connections → Data sources if you’re unsure)
  3. Search for metrics starting with cerebrium_
Example queries:
# CPU usage by app (replace with your project ID, e.g. p-9676c59f)
cerebrium_cpu_utilization_cores{project_id="p-9676c59f"}

# Memory for a specific app
cerebrium_memory_usage_bytes{app_name="my-model"}

# Container scaling over time
cerebrium_containers_running_count{project_id="p-9676c59f"}

# Request rate (requests per second over 5 minutes)
rate(cerebrium_run_total_total{app_name="my-model"}[5m])

# p99 execution latency
histogram_quantile(0.99, rate(cerebrium_run_execution_time_ms_bucket{app_name="my-model"}[5m]))

# p99 end-to-end response time
histogram_quantile(0.99, rate(cerebrium_run_response_time_ms_bucket{app_name="my-model"}[5m]))

Managing Metrics Export

You can manage your metrics export configuration from the dashboard at any time by going to Integrations → Metrics Export.
  • Disable export: Toggle the switch off. Your configuration is preserved — you can re-enable at any time without reconfiguring.
  • Update credentials: Enter new authentication headers and click Save Changes. Useful when rotating API keys.
  • Change endpoint: Update the OTLP endpoint field and click Save Changes.
  • Check status: The dashboard shows whether export is connected, the time of the last successful export, and any error messages.
You can also manage metrics export programmatically. Find your Cerebrium API key in the dashboard under Settings → API Keys.
| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /v2/metrics-export/{project_id}/config | Get current export configuration |
| PUT | /v2/metrics-export/{project_id}/config | Update export configuration |
| POST | /v2/metrics-export/{project_id}/test | Test connection to your monitoring platform |
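Get the current export configuration (stored credentials are never included in the response):
curl "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/config" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY"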
Enable with endpoint and credentials:
curl -X PUT "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/config" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "otlpEndpoint": "https://otlp-gateway-prod-us-east-0.grafana.net/otlp",
    "authHeaders": {
      "Authorization": "Basic YOUR_BASE64_CREDENTIALS"
    }
  }'
Test connection:
curl -X POST "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/test" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY"
Disable export:
curl -X PUT "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/config" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'
The authHeaders field is a map of header name → header value. These are stored encrypted and never returned in API responses.
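
When rotating API keys, the same PUT endpoint can update just the credentials. A sketch, assuming partial updates are accepted as in the disable example above:
curl -X PUT "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/config" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"authHeaders": {"Authorization": "Basic NEW_BASE64_CREDENTIALS"}}'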