Deployment Guide

Production deploy paths and when to pick each one.

For local setup, see Getting started. This guide covers servers.

Method | When | Command
deploy.sh | First local prod-like test on a VM | ./deploy.sh web
Prebuilt images | Prod, CI, anywhere you don’t want to compile on the box | docker compose -f compose/docker-compose.prebuilt.yml up -d
docker buildx bake | You changed code and need fresh images | docker buildx bake -f docker-bake.hcl
Coolify | Coolify host | compose/docker-compose.coolify.yml

Rule: dev machines build, prod machines pull. Building Go + judge toolchains on a 2-vCPU VPS during a deploy window is a bad time.

Set configuration via .env, a secrets manager, or the compose environment block. Must-haves for the data layer:

Variable | Notes
DB_HOST, DB_PASSWORD, … | Postgres
RABBITMQ_HOST, RABBITMQ_USER, RABBITMQ_PASSWORD | Queue
JWT_SIGNING_SECRET | Long random string. Rotate = everyone re-logs-in
JUDGE_PASSWORD | Shared with judge workers
AUTH_PROVIDER_PASSWORD | Web → API OAuth bridge
CORS_ORIGIN | Your web origin(s), comma-separated
ADMIN_EMAILS | Comma-separated admin bootstrap emails
SEED_DATA | false in prod
ELASTIC_ENABLED | true only if ES is deployed
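
A preflight check can catch a missing variable before the containers start. The sketch below is one possible approach, not part of the project: the REQUIRED list is copied from the table above (the elided "…" entries are left out), and the 32-character minimum for JWT_SIGNING_SECRET is an illustrative threshold, not a documented requirement.

```python
import os

# Names copied from the table above. The "DB_HOST, DB_PASSWORD, …" row is
# elided in the guide, so only the two named DB variables are checked.
REQUIRED = [
    "DB_HOST", "DB_PASSWORD",
    "RABBITMQ_HOST", "RABBITMQ_USER", "RABBITMQ_PASSWORD",
    "JWT_SIGNING_SECRET", "JUDGE_PASSWORD", "AUTH_PROVIDER_PASSWORD",
    "CORS_ORIGIN", "ADMIN_EMAILS",
]

# Assumption: 32 characters is an arbitrary floor for "long random string".
MIN_SECRET_LENGTH = 32


def preflight(env):
    """Return a list of human-readable problems found in env (a dict)."""
    problems = [
        f"{name} is not set"
        for name in REQUIRED
        if not env.get(name, "").strip()
    ]
    # Only an explicit "true" is flagged; an unset SEED_DATA is left alone
    # because the guide does not state the default.
    if env.get("SEED_DATA", "").strip().lower() == "true":
        problems.append("SEED_DATA should be false in prod")
    secret = env.get("JWT_SIGNING_SECRET", "")
    if secret and len(secret) < MIN_SECRET_LENGTH:
        problems.append("JWT_SIGNING_SECRET looks too short")
    return problems


if __name__ == "__main__":
    for problem in preflight(dict(os.environ)):
        print(problem)
```

Run it in the same environment the data layer container will see. An empty output means none of the checks fired.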

Judge workers: the same RabbitMQ settings as the data layer, plus NEXTJUDGE_HOST/PORT and a JUDGE_PASSWORD that matches the data layer's.

Web app:

Variable | Notes
NEXT_PUBLIC_API_URL | Browser-reachable API URL
NEXTAUTH_SECRET | Session encryption
NEXTAUTH_URL | Public HTTPS URL of the web app

Full lists: config.go, judge app.py, web .env.example.

Internet → TLS proxy → web :8080
                         ↓ (internal)
                       data layer :5000
                       Postgres, RabbitMQ (never public)
                       judge workers (internal only)

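Firewall ports 5432 (Postgres), 5672 (RabbitMQ), and 5000 (data layer) off from the public internet. Only 443 should be public, plus 80 if it does nothing but redirect to HTTPS.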
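
One way to verify the firewall is to attempt TCP connections to those ports from a machine outside your network. This is a minimal sketch using only the standard library. The port-to-service mapping mirrors the ports named above, and the hostname in the usage note is the example domain from this guide.

```python
import socket

# Ports that should NOT be reachable from the public internet.
INTERNAL_PORTS = {5432: "Postgres", 5672: "RabbitMQ", 5000: "data layer"}


def is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out), or unresolvable: treat as closed.
        return False


def exposed_ports(host, ports=INTERNAL_PORTS):
    """Return the subset of ports that accept connections on host."""
    return [port for port in ports if is_open(host, port)]
```

From an external machine, exposed_ports("nextjudge.example.com") should return an empty list. Running it on the server itself proves little, since local traffic usually bypasses the firewall.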

server {
    listen 443 ssl;
    server_name nextjudge.example.com;
    ssl_certificate /etc/letsencrypt/live/nextjudge.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/nextjudge.example.com/privkey.pem;
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Set NEXTAUTH_URL=https://nextjudge.example.com. A mismatch with the URL in the browser causes login redirect loops.
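
A quick way to spot such a mismatch is to compare the origin (scheme, host, port) of NEXTAUTH_URL with the URL users actually type. The helper below is a hypothetical sanity check, not how NextAuth itself compares URLs. It assumes that differing origins are the usual culprit.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Normalize a URL to scheme://host[:port], dropping default ports."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"


def nextauth_url_matches(nextauth_url, browser_url):
    """True when both URLs share the same scheme, host, and port."""
    return origin(nextauth_url) == origin(browser_url)
```

For example, nextauth_url_matches("http://nextjudge.example.com", "https://nextjudge.example.com/login") is False because the schemes differ, which is the kind of mismatch that tends to cause loops behind a TLS proxy.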

To add judge capacity, scale the judge service:
docker compose -f compose/docker-compose.backend.yml up -d --scale nextjudge-judge=3

Throughput ≈ workers × (1 / avg_submission_seconds), in submissions per second. Use queue depth to decide when to scale, not CPU on the web container.
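
The formula can be turned around to estimate how many workers a given load needs. The sketch below is plain arithmetic on that approximation. It assumes each worker judges one submission at a time, and the numbers in the usage note are made up for illustration.

```python
import math


def throughput_per_minute(workers, avg_submission_seconds):
    """Approximate submissions judged per minute by a pool of workers."""
    return workers * 60.0 / avg_submission_seconds


def workers_needed(submissions_per_minute, avg_submission_seconds):
    """Smallest worker count whose approximate throughput covers the load."""
    needed = submissions_per_minute * avg_submission_seconds / 60.0
    return max(1, math.ceil(needed))
```

With 3 workers and a 6-second average, throughput_per_minute(3, 6) gives 30.0 submissions per minute. A contest peaking at 45 submissions per minute would need workers_needed(45, 6), which is 5.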

  • AutoMigrate runs when the data layer boots, so run pg_dump before upgrades.
  • Keep SEED_DATA=false in prod.
  • Test restoring from backup occasionally (untested backups are wishes).

API health check:
curl -sf https://api.yourdomain.com/healthy && echo ok

Judges have no HTTP health endpoint. Monitor: queue depth, PENDING age, container restarts.
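
For a deploy script, the curl check above can be wrapped in a retry loop so a slow boot does not fail the rollout immediately. This is a minimal sketch using only the standard library. The attempt count and delay are arbitrary defaults, and the URL in the usage note is the placeholder from the curl example.

```python
import time
import urllib.error
import urllib.request


def is_healthy(url, timeout=5.0):
    """True if GET url returns a 2xx status (similar in spirit to curl -sf)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # HTTPError (4xx/5xx) is a URLError subclass; timeouts and refused
        # connections surface as URLError or OSError.
        return False


def wait_until_healthy(url, attempts=10, delay=3.0):
    """Poll url until it is healthy or the attempts run out."""
    for attempt in range(attempts):
        if is_healthy(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A deploy step might call wait_until_healthy("https://api.yourdomain.com/healthy") after restarting the data layer and abort if it returns False. It does not cover the judges, which still need queue-based monitoring.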

  • Keep judge images patched. Untrusted code runs inside the container rather than directly on the host, but that is not a reason to fall behind.
  • Rate-limit /v1/basic_login at the proxy.
  • nsjail is not invincible. Keep judge workers on a network separate from internal admin tools.

Upgrade steps:

  1. Run pg_dump.
  2. Read the changelog and schema_updates.sql.
  3. Pull or build images.
  4. Restart in order: data layer (runs migrations) → judges → web.
  5. Submit a known-AC solution. If that fails, roll back images before investigating novel bugs.

Symptom | Likely cause
API reachable from host, not from browser | NEXT_PUBLIC_API_URL points at an internal hostname
CORS errors in browser console | CORS_ORIGIN missing the web app’s public origin
Submissions stay PENDING | Judge workers down, RabbitMQ auth mismatch, or wrong JUDGE_PASSWORD
Login succeeds then loops | NEXTAUTH_URL does not match the HTTPS URL users visit
Judge PATCH returns 401 | JUDGE_PASSWORD differs between data layer and judge containers
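
The first two rows can be checked from configuration alone. The helpers below are hypothetical diagnostics, not part of the project. They assume CORS_ORIGIN is matched as a comma-separated list of exact origins, and they use a rough heuristic (loopback, private IP, or a dotless name such as a compose service name) for "internal hostname".

```python
import ipaddress
from urllib.parse import urlsplit


def cors_allows(cors_origin, web_origin):
    """True if web_origin appears in the comma-separated CORS_ORIGIN value."""
    allowed = {
        entry.strip().rstrip("/").lower()
        for entry in cors_origin.split(",")
        if entry.strip()
    }
    return web_origin.rstrip("/").lower() in allowed


def looks_internal(api_url):
    """Heuristic: True if a browser on the internet likely cannot reach api_url."""
    host = urlsplit(api_url).hostname or ""
    try:
        ip = ipaddress.ip_address(host)
        return ip.is_private or ip.is_loopback
    except ValueError:
        pass  # Not an IP literal; fall through to the name checks.
    # Dotless names (e.g. a compose service name) resolve only inside the
    # container network. An empty host is also treated as unusable.
    return host == "localhost" or "." not in host
```

For instance, looks_internal("http://data-layer:5000") returns True (the service name here is made up), which matches the "reachable from host, not from browser" symptom.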

For local stack issues, see Getting started troubleshooting.