Studio Backend¶

This guide covers the deployable relayna-studio backend packaged under studio/backend/. It is the centralized control-plane service that federates registered Relayna services for the Studio frontend.

If you are integrating a downstream service with the SDK first, start with Getting Started.

Role In The Architecture¶

The Studio backend is the single API surface the browser should talk to for control-plane operations.

It owns:

a Redis-backed service registry
Redis-backed event, health, and task-search stores
a federated read layer that proxies registered Relayna services
background workers for event pull sync, health refresh, and retention pruning

The intended request flow is:

browser -> Studio frontend -> /studio/* -> Studio backend -> registered Relayna services

The browser should not call each registered service directly.

Package And Runtime Model¶

The backend is packaged separately from the SDK:

source package root: studio/backend/src/relayna_studio
distribution name: relayna-studio
CLI entrypoint: relayna-studio
ASGI app import: relayna_studio.asgi:app

Repo-backed build targets:

docker build -f studio/backend/Dockerfile -t relayna-studio-backend .

The backend image builds both wheels from the repo root:

relayna
relayna-studio

That split matters operationally: downstream services consume the SDK, while the central control plane runs the backend package.

Runtime Configuration¶

The backend reads configuration from StudioBackendSettings.from_env().

Required¶

Variable	Default	Purpose
`RELAYNA_STUDIO_REDIS_URL`	none	Required Redis connection for registry, events, health, and search state.

Optional network and process settings¶

Variable	Default	Purpose
`RELAYNA_STUDIO_TITLE`	`Relayna Studio Backend`	FastAPI app title and operator-facing name.
`RELAYNA_STUDIO_HOST`	`0.0.0.0`	Bind address for `uvicorn`.
`RELAYNA_STUDIO_PORT`	`8000`	Listening port.
`RELAYNA_STUDIO_APP_STATE_KEY`	`studio`	Key used on `app.state` for the runtime object.
`RELAYNA_STUDIO_FEDERATION_TIMEOUT_SECONDS`	`5.0`	Timeout used by the backend HTTP client when talking to registered services.

Optional registry and capability settings¶

Variable	Default	Purpose
`RELAYNA_STUDIO_REGISTRY_PREFIX`	`studio:services`	Redis prefix for persisted service registry entries.
`RELAYNA_STUDIO_CAPABILITY_REFRESH_ALLOWED_HOSTS`	unset	Comma-separated host suffix allowlist for Studio backend egress to registered services, Loki, Prometheus, and Tempo.
`RELAYNA_STUDIO_CAPABILITY_REFRESH_ALLOWED_NETWORKS`	unset	Comma-separated CIDR allowlist for Studio backend egress to literal IP targets.
`RELAYNA_STUDIO_CAPABILITY_STALE_AFTER_SECONDS`	`180`	Threshold after which a cached capability snapshot is considered stale.

Optional event ingestion settings¶

Variable	Default	Purpose
`RELAYNA_STUDIO_EVENT_STORE_PREFIX`	`studio:events`	Redis prefix for retained Studio events.
`RELAYNA_STUDIO_EVENT_STORE_TTL_SECONDS`	`86400`	TTL for retained events. Use `none`, `null`, or `off` to disable TTL.
`RELAYNA_STUDIO_EVENT_HISTORY_MAXLEN`	`5000`	Max retained event history length.
`RELAYNA_STUDIO_PUSH_INGEST_ENABLED`	`false`	Enables direct `POST /studio/ingest/events` push ingestion. Pull sync remains available when this is disabled.
`RELAYNA_STUDIO_PULL_SYNC_INTERVAL_SECONDS`	`5.0`	Interval for the background pull-sync worker. Use `none`, `null`, or `off` to disable the worker.

Optional health settings¶

Variable	Default	Purpose
`RELAYNA_STUDIO_HEALTH_STORE_PREFIX`	`studio:health`	Redis prefix for service health snapshots.
`RELAYNA_STUDIO_HEALTH_REFRESH_INTERVAL_SECONDS`	`60.0`	Interval for health refresh polling. The default Studio frontend services screen polls `/studio/services` on a similar cadence so updated health summaries appear without a manual page reload. Use `none`, `null`, or `off` to disable.
`RELAYNA_STUDIO_OBSERVATION_STALE_AFTER_SECONDS`	`300`	Threshold after which ingested observations are considered stale.
`RELAYNA_STUDIO_WORKER_HEARTBEAT_STALE_AFTER_SECONDS`	`90`	Threshold after which worker heartbeats are considered stale.

Optional search and retention settings¶

Variable	Default	Purpose
`RELAYNA_STUDIO_TASK_SEARCH_INDEX_PREFIX`	`studio:search`	Redis prefix for task search index state.
`RELAYNA_STUDIO_TASK_INDEX_TTL_SECONDS`	`86400`	TTL for search index entries.
`RELAYNA_STUDIO_RETENTION_PRUNE_INTERVAL_SECONDS`	`60.0`	Interval for retention pruning. Use `none`, `null`, or `off` to disable.

Practical recommendations¶

Local development:
keep defaults for intervals and prefixes
use redis://localhost:6379/0
Shared dev or staging:
keep intervals enabled
use a dedicated Redis DB or dedicated prefixes
set capability refresh allowlists explicitly
Production:
set explicit allowlists for capability refresh
use deployment-specific Redis isolation
keep retention and health intervals intentional rather than relying on defaults

Redis Requirements And Data Use¶

Redis is mandatory for the Studio backend. It stores:

service registry records
retained event envelopes
service health snapshots
task search index state

Important prefixes:

registry: studio:services
events: studio:events
health: studio:health
task search: studio:search

TTL behavior:

event retention uses RELAYNA_STUDIO_EVENT_STORE_TTL_SECONDS
task index retention uses RELAYNA_STUDIO_TASK_INDEX_TTL_SECONDS
prune behavior is driven by RELAYNA_STUDIO_RETENTION_PRUNE_INTERVAL_SECONDS

If you share one Redis instance across environments, change prefixes or DBs so local and non-local control-plane state cannot collide.

Local Source Run¶

From the repo root, prepare Python dependencies:

uv sync --extra dev

From studio/backend/, run the packaged backend target:

make run RELAYNA_STUDIO_REDIS_URL=redis://localhost:6379/0

Or run directly from source:

PYTHONPATH=src:studio/backend/src \
RELAYNA_STUDIO_REDIS_URL=redis://localhost:6379/0 \
uv run python -m relayna_studio

The CLI entrypoint ultimately serves uvicorn relayna_studio.asgi:app.

Docker Run¶

Build:

docker build -f studio/backend/Dockerfile -t relayna-studio-backend .

Tag releases publish the backend image to GHCR as:

ghcr.io/sarattha/relayna-studio-backend

Run:

docker run --rm -p 8000:8000 \
  -e RELAYNA_STUDIO_REDIS_URL=redis://host.docker.internal:6379/0 \
  -e RELAYNA_STUDIO_CAPABILITY_REFRESH_ALLOWED_HOSTS=.svc.local,.cluster.local \
  relayna-studio-backend

The container defaults to:

host: 0.0.0.0
port: 8000

Service Registration And Backend Expectations¶

Each registered service record is anchored by:

service_id
name
base_url
environment
tags
auth_mode
optional log_config

Operational expectations:

service_id should remain stable across deploys
base_url must be reachable from the Studio backend, not merely from a developer browser
base_url must be http or https and must not include query strings, fragments, or user info
environment should be meaningful to operators, not incidental
auth_mode should reflect how the backend is expected to reach the service

Studio backend egress guardrails:

RELAYNA_STUDIO_CAPABILITY_REFRESH_ALLOWED_HOSTS filters allowed host suffixes for registered service URLs, Loki URLs, Prometheus URLs, and Tempo URLs
RELAYNA_STUDIO_CAPABILITY_REFRESH_ALLOWED_NETWORKS filters allowed literal IP networks
AKS service DNS should normally be allowed with host suffixes such as .svc.cluster.local
literal private IP targets are rejected unless their CIDR is explicitly listed in RELAYNA_STUDIO_CAPABILITY_REFRESH_ALLOWED_NETWORKS

If these are set too tightly, registration, refresh, federation, health, event pull-sync, or log queries can fail even when the service is otherwise healthy.

Loki `log_config` for AKS service/app/task views¶

Studio log queries are backend-driven. The browser never talks to Loki directly. Instead, each registered service may carry a log_config that tells the Studio backend how to query Loki for that service.

For the AKS deployment pattern introduced in the latest log-scoping feature, the intended mapping is:

one stable service label for the logical Relayna service
one app label for the concrete emitter or workload, such as API, worker, aggregator, or scaled-job pods
task IDs embedded in log line text when they are not available as Loki labels

Recommended registration shape:

{
  "service_id": "checker-service",
  "name": "Checker Service",
  "base_url": "http://checker-service-api.default.svc.cluster.local:8000",
  "environment": "prod-aks",
  "tags": ["checker", "aks"],
  "auth_mode": "internal_network",
  "log_config": {
    "provider": "loki",
    "base_url": "http://loki.default.svc.cluster.local:3100",
    "tenant_id": null,
    "service_selector_labels": {
      "service": "checker-service"
    },
    "source_label": "app",
    "task_match_mode": "contains",
    "task_match_template": "{task_id}",
    "task_id_label": null,
    "correlation_id_label": "correlation_id",
    "level_label": "level"
  }
}

Field-by-field guidance:

service_selector_labels
the base Loki selector for service-scoped reads
for AKS, this is usually { "service": "<logical-service-name>" }
source_label
the Loki label used to distinguish emitters inside one service scope
for AKS, this is usually "app"
task_match_mode
"label" when task_id exists as a Loki label
"contains" when the task ID appears as plain text in the log line
"regex" when task matching requires a structured log pattern
task_match_template
optional template rendered with {task_id}
default behavior is effectively the raw task id when omitted
correlation_id_label
optional exact-match Loki label for correlation-scoped narrowing
level_label
optional exact-match Loki label for level filtering

How the backend turns that config into Loki queries:

service page, all logs for the service:
{service="checker-service"}
service page, filtered to one app:
{app="checker-service-api",service="checker-service"}
task page, task ID inside line text:
{service="checker-service"} |= "checker_endjdbdgsjmaksdhdsdsdd"
task page, bounded to a lifecycle window:
same query plus Loki start and end derived from the task timeline or manual operator overrides

Important operational behavior:

the backend merges all matching Loki streams into one time-ordered response
source/app suggestions in the UI are discovered from returned log entries; the backend does not maintain a separate app catalog
task pages now pass optional from and to values to the backend, which the Loki provider maps to start and end
when task_match_mode is "contains" or "regex", Studio does not require task_id_label

Prometheus `metrics_config` for AKS service/task views¶

Studio metric queries are also backend-driven. The browser asks Studio for normalized metrics; Studio validates the registered Prometheus URL and sends bounded PromQL query_range requests.

For AKS, a registered service should usually carry both log_config and metrics_config:

{
  "service_id": "checker-service",
  "metrics_config": {
    "provider": "prometheus",
    "base_url": "http://prometheus.observability.svc.cluster.local:9090",
    "namespace": "default",
    "service_selector_labels": {
      "service": "checker-service"
    },
    "runtime_service_label_value": "relayna",
    "namespace_label": "namespace",
    "pod_label": "pod",
    "container_label": "container",
    "step_seconds": 30,
    "task_window_padding_seconds": 120
  }
}

Field-by-field guidance:

service_selector_labels
low-cardinality Kubernetes pod labels that identify pods owned by this logical service
Studio translates these to kube_pod_labels matchers such as label_service="<logical-service-name>" and joins platform metrics by namespace and pod
for AKS, this is usually { "service": "<logical-service-name>" }
runtime_service_label_value
optional value for Relayna runtime metrics where the Prometheus label is named service
set this to "relayna" when SDK runtimes use the create_relayna_lifespan(...) default metrics service name
set it to the logical service name when you construct RelaynaMetrics(service="<logical-service-name>")
namespace, namespace_label, pod_label, and container_label
tell Studio how to select and join Kubernetes/cAdvisor/kube-state-metrics series
task_window_padding_seconds
expands automatic task metric windows before and after the task lifecycle

Studio renders two different metric classes:

Kubernetes pod/container metrics from Prometheus. These are service-level or task-window approximations scoped through kube_pod_labels pod ownership joins, and are never exact per-task CPU/memory.
Relayna runtime metrics from Prometheus. These are aggregate counters, gauges, and histograms with low-cardinality labels only.

Exact per-task CPU/RSS comes from Relayna observation samples in task detail and execution graph data, not Prometheus labels.

Prometheus needs kubelet/cAdvisor, kube-state-metrics, Relayna API/worker /metrics, and Studio backend /metrics scrape targets for complete Studio charts. See AKS observability stack for a deployable starter stack and architecture diagram.

Tempo `trace_config` for task trace correlation¶

Studio trace queries are backend-driven. The browser asks Studio for normalized trace summaries and spans; Studio discovers candidate trace IDs from task detail and task log fields, validates the registered Tempo URL, queries Tempo, and returns a Studio-shaped response.

Registered services that export OpenTelemetry traces to Tempo can add trace_config:

{
  "service_id": "checker-service",
  "trace_config": {
    "provider": "tempo",
    "base_url": "http://tempo.observability.svc.cluster.local:3200",
    "public_base_url": null,
    "tenant_id": null,
    "query_path": "/api/traces/{trace_id}"
  }
}

Field-by-field guidance:

base_url
Tempo URL reachable from the Studio backend
validated by the same outbound URL policy used for registered services, Loki, and Prometheus
public_base_url
optional browser-safe Tempo URL for operators
useful when Studio queries host.docker.internal or cluster DNS but the browser needs localhost or an ingress URL
tenant_id
optional X-Scope-OrgID tenant header for multi-tenant Tempo deployments
query_path
path template containing {trace_id}
defaults to /api/traces/{trace_id}

Relayna SDK trace propagation is optional and off by default. Relayna carries W3C traceparent and tracestate through RabbitMQ headers and attaches trace_id/span_id to structured logs and observations only when an application-owned OpenTelemetry SDK has an active span. Relayna core does not install exporters or send spans to Tempo by itself.

Studio exposes:

GET /studio/tasks/{service_id}/{task_id}/traces

The task detail UI renders a Trace Correlation section. When traces are present, operators can open a Studio-native span detail modal and filter task logs by the span's trace ID. When trace_config is missing or no trace IDs are found, Studio shows a non-error empty state.

Broker DLQ setup for Studio federation¶

Studio can federate live broker DLQ inspection through:

/studio/services/{service_id}/broker/dlq/messages

That only works when the registered Relayna service exposes:

GET /broker/dlq/messages
a capability document advertising broker.dlq.messages

Minimal service-side setup:

from fastapi import FastAPI

from relayna.api import (
    BROKER_DLQ_CAPABILITY_ROUTE_IDS,
    DLQ_CAPABILITY_ROUTE_IDS,
    STATUS_CAPABILITY_ROUTE_IDS,
    create_capabilities_router,
    create_dlq_router,
    create_relayna_lifespan,
    get_relayna_runtime,
    merge_capability_route_ids,
)
from relayna.dlq import DLQService, RabbitMQManagementDLQInspector

BROKER_DLQ_QUEUE_NAMES = [
    "checker.tasks.queue.dlq",
    "checker.aggregation.queue.0.dlq",
]

app = FastAPI(
    lifespan=create_relayna_lifespan(
        topology=topology,
        redis_url="redis://localhost:6379/0",
        dlq_store_prefix="checker-dlq",
        broker_message_inspector=RabbitMQManagementDLQInspector(
            base_url="http://rabbitmq:15672",
            username="guest",
            password="guest",
            vhost="/",
        ),
    )
)
runtime = get_relayna_runtime(app)

app.include_router(
    create_capabilities_router(
        topology=topology,
        supported_routes=merge_capability_route_ids(
            STATUS_CAPABILITY_ROUTE_IDS,
            DLQ_CAPABILITY_ROUTE_IDS,
            BROKER_DLQ_CAPABILITY_ROUTE_IDS,
        ),
        service_title="Checker Service",
    )
)

if runtime.dlq_store is not None:
    app.include_router(
        create_dlq_router(
            dlq_service=DLQService(
                rabbitmq=runtime.rabbitmq,
                dlq_store=runtime.dlq_store,
                status_store=runtime.store,
                broker_message_inspector=runtime.broker_message_inspector,
            ),
            broker_dlq_queue_names=BROKER_DLQ_QUEUE_NAMES,
        )
    )

What each part does:

broker_message_inspector=RabbitMQManagementDLQInspector(...)
enables live message reads from RabbitMQ Management API
broker_dlq_queue_names=[...]
allowlists the DLQ queues the route may inspect
BROKER_DLQ_CAPABILITY_ROUTE_IDS
advertises broker.dlq.messages in the service capability document so Studio can enable broker mode in the UI

Example service-side verification:

curl -s "http://localhost:8000/broker/dlq/messages?limit=20"
curl -s "http://localhost:8000/broker/dlq/messages?queue_name=checker.tasks.queue.dlq&limit=20"
curl -s "http://localhost:8000/broker/dlq/messages?task_id=task-123&limit=20"
curl -s http://localhost:8000/relayna/capabilities

Example Studio-side verification after registration:

curl -s "http://localhost:8000/studio/services/checker-service/broker/dlq/messages?task_id=task-123&limit=20"

Broker-mode reminders:

this route is emergency/live inspection, not the indexed DLQ source of truth
responses do not contain indexed dlq_id, replay state, or cursor pagination
indexed /studio/services/{service_id}/dlq/messages should remain the default operator view when Redis-backed DLQ indexing is healthy

Background Workers¶

The backend can run three periodic workers:

pull sync worker
ingests events from registered services into the central event store
controlled by RELAYNA_STUDIO_PULL_SYNC_INTERVAL_SECONDS
this is the default ingestion path for internal AKS deployments
health refresh worker
refreshes service health snapshots and staleness state
controlled by RELAYNA_STUDIO_HEALTH_REFRESH_INTERVAL_SECONDS
its updated summaries are surfaced in the Studio services screen through frontend polling of /studio/services
retention worker
prunes search-related retained state
controlled by RELAYNA_STUDIO_RETENTION_PRUNE_INTERVAL_SECONDS

Disable a worker by setting its interval to none, null, or off.

Direct push ingestion to POST /studio/ingest/events is disabled by default. Enable it only when a deployment explicitly needs service-side push forwarding by setting RELAYNA_STUDIO_PUSH_INGEST_ENABLED=true.

Examples:

export RELAYNA_STUDIO_PULL_SYNC_INTERVAL_SECONDS=off
export RELAYNA_STUDIO_HEALTH_REFRESH_INTERVAL_SECONDS=null
export RELAYNA_STUDIO_RETENTION_PRUNE_INTERVAL_SECONDS=none

Operator-Facing API Surface¶

All backend routes live under /studio/*.

The main operator surfaces are:

registry CRUD
/studio/services
/studio/services/{service_id}
/studio/services/{service_id}/refresh
health
/studio/services/{service_id}/health/refresh
events
/studio/services/{service_id}/events
/studio/tasks/{service_id}/{task_id}/events
logs
/studio/services/{service_id}/logs
/studio/tasks/{service_id}/{task_id}/logs
search
/studio/services/search
/studio/tasks/search
federated detail
/studio/tasks/{service_id}/{task_id}
federated workflow and DLQ reads
/studio/services/{service_id}/workflow/topology
/studio/services/{service_id}/dlq/messages
/studio/services/{service_id}/broker/dlq/messages
federated execution view
/studio/services/{service_id}/executions/{task_id}/graph
Gateway import catalog
/studio/gateway/services

The frontend contract that consumes these routes is documented in Studio Frontend.

Gateway Import Catalog¶

Relayna Gateway Admin can import Studio-registered services from the Studio backend catalog endpoint:

curl -s http://localhost:8000/studio/gateway/services

Gateway should configure the Studio backend base URL and then fetch the catalog from that base URL. A Gateway deployment can expose settings like these:

export RELAYNA_STUDIO_BASE_URL=http://relayna-studio-backend.studio.svc.cluster.local:8000
export RELAYNA_STUDIO_TOKEN=optional-admin-or-service-token

With that configuration, Gateway reads:

curl -s "$RELAYNA_STUDIO_BASE_URL/studio/gateway/services"

The current Studio endpoint does not require a token by itself. If an ingress, service mesh, or Gateway-side Studio client adds authentication, pass that token outside the catalog payload, for example as an Authorization header. Do not store Studio tokens in service records or forward them into imported upstream credential fields.

The response is a stable, read-only catalog:

{
  "count": 2,
  "services": [
    {
      "studio_service_id": "payments-api",
      "name": "payments-api",
      "display_name": "Payments API",
      "base_url": "https://payments.example.test",
      "environment": "prod",
      "tags": ["core", "payments"],
      "auth_mode": "internal_network",
      "status": "healthy",
      "capabilities": {
        "supported_routes": ["status.latest", "history.list"],
        "feature_flags": ["execution_graphs"]
      },
      "default_route_pattern": "/services/payments-api/*"
    },
    {
      "studio_service_id": "Payments API",
      "name": "payments-api-185bfb25",
      "display_name": "Payments API Legacy",
      "base_url": "https://payments-legacy.example.test",
      "environment": "prod",
      "tags": ["legacy"],
      "auth_mode": "internal_network",
      "status": "degraded",
      "capabilities": {},
      "default_route_pattern": "/services/payments-api-185bfb25/*"
    }
  ]
}

Field mapping:

Export field	Meaning for Gateway import
`studio_service_id`	Stable Studio registry key. Gateway should persist this as `studio_service_id` and use it for idempotent re-import/sync.
`name`	Gateway-safe service-name suggestion derived from `studio_service_id`. Values are lowercase, URL-safe, and unique within the export response.
`display_name`	Operator-facing label from the Studio service `name`.
`base_url`	Candidate upstream base URL. Gateway should treat it as an import preview/default, not as a credential or policy decision.
`environment`	Studio environment label for filtering or preview grouping.
`tags`	Studio registry tags for preview, filtering, or optional Gateway labels.
`auth_mode`	Studio's registered auth mode. Gateway still owns upstream credentials and secret references.
`status`	Health-aware status hint. Gateway should use this for preview/sync state only and continue to fail closed for incomplete services.
`capabilities`	Optional Relayna capability document hints. Gateway may use these to suggest allowed methods or UI defaults, but Studio does not enforce Gateway traffic policy.
`default_route_pattern`	Deterministic route-pattern suggestion based on the exported `name`.

studio_service_id maps to the Studio registry service_id. name is a lowercase URL-safe Gateway service-name suggestion derived from service_id. When two Studio IDs normalize to the same Gateway-safe name, Studio appends a stable short fingerprint so each name and default_route_pattern remains unique in the export response.

Gateway remains responsible for virtual-key authorization, upstream credentials or secret references, enabled traffic state, timeout, max body bytes, fallback services, cost mode, budgets, and fail-closed routing. Studio does not export log_config, metrics_config, trace_config, health internals, last_seen_at, or credential material through this endpoint.

Recommended Gateway import flow:

Read $RELAYNA_STUDIO_BASE_URL/studio/gateway/services.
Show operators a picker with display_name, studio_service_id, environment, status, base_url, tags, and default_route_pattern.
Persist imported records with source = studio and the selected studio_service_id.
Preserve Gateway-owned fields on re-import, including credentials, enabled state, route overrides, timeout, body limits, fallback services, cost mode, project link, and budgets.
Mark imported records incomplete until required Gateway-owned runtime fields are configured.

Verification¶

With the backend running on localhost:8000, verify the core surfaces:

curl -s http://localhost:8000/studio/services
curl -s http://localhost:8000/studio/gateway/services
curl -s "http://localhost:8000/studio/services/search?limit=20"
curl -s "http://localhost:8000/studio/tasks/search?task_id=task-123"

After at least one service is registered:

curl -s -X POST http://localhost:8000/studio/services/my-service/refresh
curl -s -X POST http://localhost:8000/studio/services/my-service/health/refresh
curl -s http://localhost:8000/studio/tasks/my-service/task-123
curl -s http://localhost:8000/studio/services/my-service/workflow/topology
curl -s http://localhost:8000/studio/services/my-service/dlq/messages
curl -s "http://localhost:8000/studio/services/my-service/broker/dlq/messages?task_id=task-123"
curl -s "http://localhost:8000/studio/services/my-service/logs?limit=20&source=runtime-worker"
curl -s "http://localhost:8000/studio/tasks/my-service/task-123/logs?limit=50&source=api&from=2026-04-22T10:00:00Z&to=2026-04-22T10:15:00Z"

Studio log queries normalize a required source field on each log entry, sourced from the service's configured log_config.source_label when present and falling back to "unknown" per entry when the upstream stream omits that label. Both service-scoped and task-scoped log routes also accept an optional exact-match source query parameter; requesting it without log_config.source_label configured returns 422.

Task-scoped log routes also accept optional from and to timestamps. Studio uses these automatically on the task detail page when it can derive a queued to terminal lifecycle window from status history or timeline events, and operators can override that window manually in the UI.

Studio renders ANSI-styled log messages in the browser without changing the raw backend payload. Escape sequences remain in the API response and are interpreted only by the frontend log panels.

Broker DLQ inspection is intentionally separate from indexed DLQ reads:

/studio/services/{service_id}/dlq/messages
reads indexed Relayna DLQ records
includes replay/index metadata such as dlq_id and replay state
supports cursor pagination and indexed filters
/studio/services/{service_id}/broker/dlq/messages
reads live broker messages through the registered service
is available only when the service advertises broker.dlq.messages
returns a read-only broker payload shape without dlq_id, replay state, or cursor pagination

Troubleshooting¶

Redis misconfiguration¶

Symptoms:

backend fails at startup
registry appears empty after restart
event and search views never retain state

Checks:

confirm RELAYNA_STUDIO_REDIS_URL
confirm Redis is reachable from the runtime environment
confirm prefixes or DB selection are not colliding with another deployment

Capability refresh blocked¶

Symptoms:

service registration succeeds
refresh endpoint fails
capabilities remain stale or missing

Checks:

confirm the service base_url is reachable from the backend host
confirm the service or observability backend host suffix matches RELAYNA_STUDIO_CAPABILITY_REFRESH_ALLOWED_HOSTS
for literal IP URLs, confirm the IP is within RELAYNA_STUDIO_CAPABILITY_REFRESH_ALLOWED_NETWORKS

Service unavailable vs stale capability vs stale observation¶

unavailable
the backend cannot successfully read the service right now
stale capability
the service record exists, but the capability snapshot is older than RELAYNA_STUDIO_CAPABILITY_STALE_AFTER_SECONDS
stale observation
the service may still exist, but ingested activity is older than RELAYNA_STUDIO_OBSERVATION_STALE_AFTER_SECONDS

These states are different. A service can be registered but stale, or reachable for some reads while still showing stale activity.

Relationship To Downstream Services¶

The backend works best when downstream services:

expose GET /relayna/capabilities
expose GET /relayna/health/workers when they want Studio to track worker liveness
emit events into GET /events/feed
expose execution graph, workflow, DLQ, and worker health routes where relevant
keep service_id, task IDs, correlation IDs, and base URLs stable

That service-side integration, including worker heartbeat wiring for Studio, is covered in Getting Started.

Studio Backend¶

Role In The Architecture¶

Package And Runtime Model¶

Runtime Configuration¶

Required¶

Optional network and process settings¶

Optional registry and capability settings¶

Optional event ingestion settings¶

Optional health settings¶

Optional search and retention settings¶

Practical recommendations¶

Redis Requirements And Data Use¶

Local Source Run¶

Docker Run¶

Service Registration And Backend Expectations¶

Loki log_config for AKS service/app/task views¶

Prometheus metrics_config for AKS service/task views¶

Tempo trace_config for task trace correlation¶

Broker DLQ setup for Studio federation¶

Background Workers¶

Operator-Facing API Surface¶

Gateway Import Catalog¶

Verification¶

Troubleshooting¶

Redis misconfiguration¶

Capability refresh blocked¶

Service unavailable vs stale capability vs stale observation¶

Relationship To Downstream Services¶

Loki `log_config` for AKS service/app/task views¶

Prometheus `metrics_config` for AKS service/task views¶

Tempo `trace_config` for task trace correlation¶