Dashboard (UI)

The local, branded gpu-train dashboard — start runs, watch live logs, connect providers, and track cost from the browser.

gpu-train ships a local, Research Commons-branded dashboard: a single view over your infrastructure, jobs, providers, and cost. It reads the same registry and SQLite state as the Python API, so anything you launch from code shows up in the dashboard, and anything you launch from the dashboard is visible to code.

Run it

pip install "gpu-train[server]"
gpu-train serve            # opens http://127.0.0.1:8780

Flags: --host (default 127.0.0.1), --port (default 8780), --no-browser. The prebuilt UI is bundled into the wheel, so no Node.js is required to use it.

Loopback only — no auth

The server binds to 127.0.0.1, rejects non-loopback Host headers (anti-DNS-rebinding), restricts CORS/WebSocket origins to loopback, and has no authentication. Do not expose it to a network directly — put it behind an authenticating reverse proxy or an SSH tunnel if you need remote access.
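To illustrate why the Host header check matters: in a DNS-rebinding attack, a hostile domain is made to resolve to 127.0.0.1, so the browser reaches the local server but still sends the attacker's domain in the Host header. Rejecting any Host that is not a loopback name defeats this. The following is a minimal sketch of such a check, not gpu-train's actual implementation; the function name `is_loopback_host` is hypothetical.

```python
import ipaddress


def is_loopback_host(host_header: str) -> bool:
    """Return True if a Host header value names a loopback address.

    Hypothetical helper for illustration only; not taken from gpu-train.
    """
    host = host_header.strip().lower()
    if not host:
        return False
    if host.startswith("["):
        # Bracketed IPv6 literal, optionally with a port: "[::1]:8780" -> "::1"
        end = host.find("]")
        if end == -1:
            return False
        host = host[1:end]
    elif host.count(":") == 1:
        # Hostname or IPv4 with a port: "127.0.0.1:8780" -> "127.0.0.1"
        host = host.rsplit(":", 1)[0]
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other DNS name (e.g. an attacker-controlled domain) is rejected.
        return False
```

With this sketch, `127.0.0.1:8780`, `localhost:8780`, and `[::1]:8780` pass, while a rebinding domain or a LAN address does not.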

Pages

Overview

One unified view of the control plane:

  • KPIs — runs and total cost over the last 30 days, instances running now, and W&B status.
  • Infrastructure — every provider with a live status dot (ready / needs credential / coming soon).
  • Control-plane defaults — runtime, auto-terminate, idle timeout, etc.
  • Recent runs and a 30-day cost report.

The header has a + New run button available on every page.

Start a new run

A dialog to launch from the browser — pick a provider (only connected ones are offered), GPUs, entrypoint, project dir, runtime, and an optional price cap. It submits to the control plane and opens the live run view.

Runs & Run detail

The full job list, and a per-run detail view with live streaming logs (over a WebSocket), status, cost, exit code, a kill button, and a deep-link into the run's Weights & Biases page.

Providers & integrations

Every provider with status badges and an inline Connect form. Enter API keys (RunPod/Vast.ai), a GCP service-account JSON + project/zone, or a Colab tunnel endpoint — right in the UI. Keys are saved to the local credential store (chmod 600), shown back masked, and can be disconnected. Providers configured via environment variables are shown read-only. There's also a Weights & Biases card and a Colab bootstrap cell viewer.
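As a rough illustration of the two properties described above (owner-only file permissions and masked display), here is a minimal sketch. The file layout, the JSON format, and the names `mask_secret` and `save_credentials` are assumptions made for this example; the real credential store may work differently.

```python
import json
import os
import tempfile


def mask_secret(secret: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters (hypothetical masking rule)."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


def save_credentials(path: str, creds: dict) -> None:
    """Write credentials as JSON to a file readable only by its owner."""
    # Create the file with mode 0600 from the start; the umask can only
    # remove permission bits, so it is never created more open than that.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(creds, f)
    # Tighten permissions in case the file already existed with looser ones.
    # On Windows, chmod only toggles the read-only flag.
    os.chmod(path, 0o600)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = os.path.join(tmp, "credentials.json")  # hypothetical location
        save_credentials(store, {"runpod": "rp_abcdef1234"})
        print(oct(os.stat(store).st_mode & 0o777))
        print(mask_secret("rp_abcdef1234"))
```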

Cost

Spend broken down per day and per provider over a configurable window.

HTTP API

The dashboard is a thin client over a small JSON API (handy for scripting):

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /jobs · /jobs/{id} | List / fetch jobs. |
| POST | /jobs | Submit a run. |
| POST | /jobs/{id}/cancel | Cancel a job + terminate its box. |
| GET | /jobs/{id}/events · …/events/ws | Logs (poll or WebSocket stream). |
| GET | /v1/providers | Providers + status + live instances. |
| GET | /v1/config | Control-plane defaults + W&B config. |
| GET/PUT/DELETE | /v1/credentials · /v1/credentials/{provider} · /v1/credentials/wandb | Manage stored credentials (masked). |
| GET | /v1/colab/bootstrap | The Colab bootstrap cell text. |
| GET | /v1/observability/summary | Per-day / per-provider cost report. |
| GET | /health | Liveness + counts. |

Credential reads are always masked; environment-sourced credentials cannot be overwritten or deleted via the API. See Credentials & secrets.
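For scripting against these endpoints, a small client built on the Python standard library is usually enough. The sketch below assumes the server is running on the default address from "Run it" and that responses are JSON. The response shapes are not documented here, so the code returns whatever the server sends without interpreting it. The helper names `build_request` and `call` are hypothetical.

```python
import json
import urllib.request

# Default host and port of `gpu-train serve`; change if you pass --host/--port.
BASE_URL = "http://127.0.0.1:8780"


def build_request(method, path, body=None, base_url=BASE_URL):
    """Build a JSON request for one of the API paths listed above."""
    data = None
    headers = {"Accept": "application/json"}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(method, path, body=None, base_url=BASE_URL, timeout=10.0):
    """Send the request and decode the JSON response (None if the body is empty)."""
    req = build_request(method, path, body, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


# Example usage, with the dashboard server running locally:
#   call("GET", "/health")                 # liveness + counts
#   call("GET", "/jobs")                   # list jobs
#   call("POST", f"/jobs/{job_id}/cancel") # job_id taken from the job list
```

Building the request separately from sending it keeps the URL and header logic easy to inspect without a live server.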