CLI
The gpu-train command — list jobs, tail logs, kill boxes, reconcile, and serve the dashboard.
Installing gpu-train puts a gpu-train command on your PATH. It's a thin
wrapper over the Python API that builds a manager from environment-provided
credentials (and the local credential store), so jobs / logs / kill /
serve work standalone.
gpu-train --version
gpu-train <command> [options]
Commands
jobs
List recent jobs.
gpu-train jobs --status running --limit 50
| Option | Default | Meaning |
|---|---|---|
| --status | all | Filter by status (queued, running, succeeded, …). |
| --limit | 50 | Max rows. |
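For scripting, the same invocation can be assembled from Python. The following is a minimal sketch, assuming gpu-train is on your PATH; `build_jobs_argv` and `list_jobs` are hypothetical helper names, not part of gpu-train, and they use only the two options documented above. The format of the command's output is not specified here, so the sketch returns it as raw text.

```python
import subprocess
from typing import List, Optional


def build_jobs_argv(status: Optional[str] = None, limit: int = 50) -> List[str]:
    """Build the argv for `gpu-train jobs` (hypothetical helper, not part of gpu-train)."""
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    argv = ["gpu-train", "jobs", "--limit", str(limit)]
    if status is not None:
        # Omitting --status keeps the documented default (all statuses).
        argv += ["--status", status]
    return argv


def list_jobs(status: Optional[str] = None, limit: int = 50) -> str:
    """Run the command and return its stdout as text; needs gpu-train on PATH."""
    result = subprocess.run(
        build_jobs_argv(status, limit), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems if a status value ever contains unusual characters.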
logs
Print or follow a job's logs.
gpu-train logs <job-id> # print stored logs
gpu-train logs -f <job-id> # stream until the job ends
| Option | Default | Meaning |
|---|---|---|
| --limit | 1000 | Max lines (non-follow). |
| -f, --follow | off | Stream until the job reaches a terminal state. |
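If you want to consume followed logs from another program, one option is to read the child process's output line by line. This is a sketch only: it assumes `gpu-train logs -f` writes log lines to standard output and exits when the job reaches a terminal state, and `stream_lines` and `follow_logs` are hypothetical names, not part of gpu-train.

```python
import subprocess
from typing import Iterator, List


def stream_lines(argv: List[str]) -> Iterator[str]:
    """Yield stdout lines from a child process as they arrive."""
    with subprocess.Popen(argv, stdout=subprocess.PIPE, text=True, bufsize=1) as proc:
        assert proc.stdout is not None
        for line in proc.stdout:
            yield line.rstrip("\n")
    # Leaving the `with` block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, argv)


def follow_logs(job_id: str) -> Iterator[str]:
    """Follow a job's logs via `gpu-train logs -f` (needs gpu-train on PATH)."""
    return stream_lines(["gpu-train", "logs", "-f", job_id])
```

A caller would iterate over `follow_logs("<job-id>")` and handle each line as it arrives; the loop ends when the command exits.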
kill
Terminate a job (and its box), or everything.
gpu-train kill <job-id>
gpu-train kill --all # the panic button
reconcile
Sweep and terminate instances orphaned by a crashed control plane.
gpu-train reconcile
serve
Serve the branded dashboard (requires the [server] extra).
gpu-train serve --host 127.0.0.1 --port 8780
gpu-train serve --no-browser
See Dashboard (UI).
Environment
The CLI reads credentials from the environment — RUNPOD_API_KEY,
VAST_API_KEY, GOOGLE_APPLICATION_CREDENTIALS (+ GOOGLE_CLOUD_PROJECT /
CLOUDSDK_COMPUTE_ZONE), WANDB_API_KEY — merged with any keys saved from the
dashboard. Set GPU_TRAIN_LOG_LEVEL=DEBUG for verbose logs and GPU_TRAIN_HOME
to relocate the state directory.
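Before running the CLI in a fresh shell or CI job, it can help to check which of these variables are actually set. The sketch below is illustrative: the variable names come from the list above, but the group labels (`runpod`, `vast`, `gcp`, `wandb`) and the `configured_providers` helper are assumptions made for this example, not something gpu-train defines. It inspects the environment only, so keys saved from the dashboard are not visible to it.

```python
import os
from typing import Dict, List, Mapping, Optional

# Variable names are taken from the documentation above; the grouping and
# labels are an illustrative assumption.
CREDENTIAL_VARS: Dict[str, List[str]] = {
    "runpod": ["RUNPOD_API_KEY"],
    "vast": ["VAST_API_KEY"],
    "gcp": ["GOOGLE_APPLICATION_CREDENTIALS"],
    "wandb": ["WANDB_API_KEY"],
}


def configured_providers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the groups whose variables are all set and non-empty in env."""
    env = os.environ if env is None else env
    return [
        label
        for label, names in CREDENTIAL_VARS.items()
        if all(env.get(name) for name in names)
    ]
```

Calling `configured_providers()` with no argument checks the current process environment; passing a dictionary makes the function easy to test.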