Export real-time resource and execution metrics from your Cerebrium applications to your existing observability platform. Monitor CPU, memory, GPU usage, request counts, and latency alongside your other services. Metrics can be delivered to any OTLP-compatible monitoring platform, including the major managed offerings.

How it works

Cerebrium automatically pushes metrics from your applications to your monitoring platform every 60 seconds using the OpenTelemetry Protocol (OTLP). You provide an OTLP endpoint and authentication credentials through the Cerebrium dashboard, and Cerebrium handles the rest — collecting resource usage and execution data, formatting it as OpenTelemetry metrics, and delivering it to your platform.
  • Metrics are pushed every 60 seconds
  • Failed pushes are retried 3 times with exponential backoff
  • If pushes fail 10 consecutive times, export is automatically paused to avoid noise (you can re-enable at any time from the dashboard)
  • Your credentials are stored encrypted and are never returned in API responses
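
Because export pauses after repeated failures, it can be useful to alert from the monitoring side when Cerebrium metrics stop arriving. A minimal PromQL sketch, assuming a Prometheus-compatible backend (the project ID is the placeholder used elsewhere on this page):
# Fires when no container metrics have been received for 10 minutes
absent_over_time(cerebrium_containers_running_count{project_id="p-abc12345"}[10m])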

Supported destinations

  • Grafana Cloud — Primary supported destination
  • Datadog — Via OTLP endpoint
  • Prometheus — Self-hosted with OTLP receiver enabled
  • Custom — Any OTLP-compatible endpoint (New Relic, Honeycomb, etc.)

What metrics are exported?

Resource Metrics

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| cerebrium_cpu_utilization_cores | Gauge | cores | CPU cores actively in use per app |
| cerebrium_memory_usage_bytes | Gauge | bytes | Memory actively in use per app |
| cerebrium_gpu_memory_usage_bytes | Gauge | bytes | GPU VRAM in use per app |
| cerebrium_gpu_compute_utilization_percent | Gauge | percent | GPU compute utilization (0-100) per app |
| cerebrium_containers_running_count | Gauge | count | Number of running containers per app |
| cerebrium_containers_ready_count | Gauge | count | Number of ready containers per app |
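
For example, the gauges above can be averaged or summed directly in PromQL (a sketch; my-model and p-abc12345 are the placeholder values used elsewhere on this page):
# Average GPU compute utilization for one app over the last 5 minutes
avg_over_time(cerebrium_gpu_compute_utilization_percent{app_name="my-model"}[5m])
# Memory in use across a project, grouped by app
sum by (app_name) (cerebrium_memory_usage_bytes{project_id="p-abc12345"})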

Execution Metrics

| Metric | Type | Unit | Description |
| --- | --- | --- | --- |
| cerebrium_run_execution_time_ms | Histogram | ms | Time spent executing user code |
| cerebrium_run_queue_time_ms | Histogram | ms | Time spent waiting in queue |
| cerebrium_run_coldstart_time_ms | Histogram | ms | Time for container cold start |
| cerebrium_run_response_time_ms | Histogram | ms | Total end-to-end response time |
| cerebrium_run_total_total | Counter | count | Total run count |
| cerebrium_run_successes_total | Counter | count | Successful run count |
| cerebrium_run_errors_total | Counter | count | Failed run count |
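
The counters combine naturally into rates and ratios. A short PromQL sketch for the error rate of a single app (my-model is a placeholder):
# Fraction of runs that errored over the last 5 minutes
rate(cerebrium_run_errors_total{app_name="my-model"}[5m]) / rate(cerebrium_run_total_total{app_name="my-model"}[5m])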

Labels

Every metric includes the following labels for filtering and grouping:
| Label | Description | Example |
| --- | --- | --- |
| project_id | Your Cerebrium project ID | p-abc12345 |
| app_id | Full application identifier | p-abc12345-my-model |
| app_name | Human-readable app name | my-model |
| region | Deployment region | us-east-1 |
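
These labels work as ordinary PromQL selectors and grouping keys. A sketch using the placeholder values above:
# Running containers per region for a project
sum by (region) (cerebrium_containers_running_count{project_id="p-abc12345"})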

Setup Guide

Step 1: Get your platform credentials

Before heading to the Cerebrium dashboard, you’ll need an OTLP endpoint and authentication credentials from your monitoring platform. The steps below use Grafana Cloud, the primary supported destination; other OTLP-compatible platforms follow the same general pattern.
  1. Sign in to Grafana Cloud
  2. Go to your stack → Connections → Add new connection
  3. Search for “OpenTelemetry” and click Configure
  4. Copy the OTLP endpoint — this will match your stack’s region:
    • US: https://otlp-gateway-prod-us-east-0.grafana.net/otlp
    • EU: https://otlp-gateway-prod-eu-west-0.grafana.net/otlp
    • Other regions will show their specific URL on the configuration page
  5. On the same page, generate an API token with the MetricsPublisher role
  6. The page will show you an Instance ID and the generated token. Run the following in your terminal to create the Basic auth string:
echo -n "INSTANCE_ID:TOKEN" | base64
Copy the output — you’ll paste it in the dashboard in the next step.
Make sure the API token has the MetricsPublisher role. The default Prometheus Remote Write token will not work with the OTLP endpoint.
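
Optionally, you can sanity-check the credentials before configuring Cerebrium by sending an empty OTLP payload straight to the gateway. This is only a sketch: it assumes the standard OTLP/HTTP conventions (metrics POSTed to /v1/metrics under the endpoint, JSON encoding accepted); a 2xx response means the Basic auth string works.
# Substitute your gateway URL from step 4 and the Base64 string from step 6
curl -i -X POST "https://otlp-gateway-prod-us-east-0.grafana.net/otlp/v1/metrics" \
  -H "Authorization: Basic YOUR_BASE64_STRING" \
  -H "Content-Type: application/json" \
  -d '{"resourceMetrics":[]}'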

Step 2: Configure in the Cerebrium dashboard

  1. In the Cerebrium dashboard, go to your project → Integrations → Metrics Export
  2. Paste your OTLP endpoint from Step 1
  3. Add your authentication headers:
    • Header name: Authorization
    • Header value: Basic YOUR_BASE64_STRING (the output from the terminal command in Step 1)
  4. Click Save & Enable
Your metrics will start flowing within 60 seconds. The dashboard will show a green “Connected” status with the time of the last successful export.

Step 3: Verify the connection

Click Test Connection in the dashboard to verify Cerebrium can reach your monitoring platform. You’ll see a success or failure message with details. If the test fails, double-check your endpoint URL and credentials from Step 1.

Viewing Metrics

Once connected, metrics will appear in your monitoring platform within a minute.
  1. Go to your Grafana Cloud dashboard → Explore
  2. Select your Prometheus data source; it will be named something like grafanacloud-yourstack-prom (you can find it under Connections → Data sources if you’re unsure)
  3. Search for metrics starting with cerebrium_
Example queries:
# CPU usage by app (replace with your project ID, e.g. p-9676c59f)
cerebrium_cpu_utilization_cores{project_id="p-9676c59f"}

# Memory for a specific app
cerebrium_memory_usage_bytes{app_name="my-model"}

# Container scaling over time
cerebrium_containers_running_count{project_id="p-9676c59f"}

# Request rate (requests per second over 5 minutes)
rate(cerebrium_run_total_total{app_name="my-model"}[5m])

# p99 execution latency
histogram_quantile(0.99, rate(cerebrium_run_execution_time_ms_bucket{app_name="my-model"}[5m]))

# p99 end-to-end response time
histogram_quantile(0.99, rate(cerebrium_run_response_time_ms_bucket{app_name="my-model"}[5m]))

Managing Metrics Export

You can manage your metrics export configuration from the dashboard at any time by going to Integrations → Metrics Export.
  • Disable export: Toggle the switch off. Your configuration is preserved — you can re-enable at any time without reconfiguring.
  • Update credentials: Enter new authentication headers and click Save Changes. Useful when rotating API keys.
  • Change endpoint: Update the OTLP endpoint field and click Save Changes.
  • Check status: The dashboard shows whether export is connected, the time of the last successful export, and any error messages.
You can also manage metrics export programmatically. Find your Cerebrium API key in the dashboard under Settings → API Keys.
| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /v2/metrics-export/{project_id}/config | Get current export configuration |
| PUT | /v2/metrics-export/{project_id}/config | Update export configuration |
| POST | /v2/metrics-export/{project_id}/test | Test connection to your monitoring platform |
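Get the current export configuration (stored credentials are never included in the response):
curl "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/config" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY"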
Enable with endpoint and credentials:
curl -X PUT "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/config" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "enabled": true,
    "otlpEndpoint": "https://otlp-gateway-prod-us-east-0.grafana.net/otlp",
    "authHeaders": {
      "Authorization": "Basic YOUR_BASE64_CREDENTIALS"
    }
  }'
Test connection:
curl -X POST "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/test" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY"
Disable export:
curl -X PUT "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/config" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"enabled": false}'
The authHeaders field is a map of header name → header value. These are stored encrypted and never returned in API responses.
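
When rotating API keys, the same PUT endpoint can update just the credentials. A sketch, assuming partial updates are accepted as in the disable example above:
curl -X PUT "https://rest.cerebrium.ai/v2/metrics-export/YOUR_PROJECT_ID/config" \
  -H "Authorization: Bearer YOUR_CEREBRIUM_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"authHeaders": {"Authorization": "Basic NEW_BASE64_CREDENTIALS"}}'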