Deployment Guide
This guide covers how to deploy Playupi from scratch at each growth phase, using the recommended services from the Architecture and Scaling docs. Each phase includes both manual step-by-step instructions and scripted/AI-assisted alternatives where applicable.
Prerequisites (All Phases)
Before any deployment, ensure you have:
| Tool | Install |
|---|---|
| Node.js 20+ | nvm install 20 or nodejs.org |
| npm | Comes with Node.js |
| Git | git-scm.com |
| Vercel CLI | npm i -g vercel |
| Prisma CLI | Included in project deps (npx prisma) |
Repository Setup
```bash
git clone https://github.com/lait-kelomins/playupi.git
cd playupi
npm install
```
Environment Variables
All phases use a .env.local file for local development. Production secrets are set through each platform’s dashboard or CLI.
```bash
# .env.local template
DATABASE_URL="postgresql://..."
DATABASE_URL_POOLED="postgresql://..."
UPSTASH_REDIS_REST_URL="https://..."
UPSTASH_REDIS_REST_TOKEN="..."
JWT_SECRET="..."
NEXT_PUBLIC_SITE_URL="https://playupi.com"
```
Never commit `.env.local` or any file containing secrets to git.
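Loading these variables is not enough on its own — a missing secret should fail the deploy, not the first request. A minimal sketch of a fail-fast check (the helper name and policy are illustrative, not existing project code):

```typescript
// Sketch: verify required secrets at startup. Key names match the
// .env.local template above; the helper itself is an assumption.
const REQUIRED_ENV_KEYS: string[] = [
  "DATABASE_URL",
  "DATABASE_URL_POOLED",
  "UPSTASH_REDIS_REST_URL",
  "UPSTASH_REDIS_REST_TOKEN",
  "JWT_SECRET",
];

// Returns the keys that are unset or empty.
export function missingEnvKeys(env: Record<string, string | undefined>): string[] {
  return REQUIRED_ENV_KEYS.filter((key) => !env[key]);
}

// Usage (illustrative): call once at boot and crash early.
// const missing = missingEnvKeys(process.env);
// if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```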
Phase 1: MVP — $0/mo (Free Tiers)
Target: 0–5K DAU, < 10M events/mo, < 500 games, 1–2 person team.
Architecture Recap
```
Vercel (Free Tier)
├── Next.js SSR (frontend user + admin)
└── Serverless API Routes (/api/v1/*, /api/admin/*)
          │                 │
   Neon PostgreSQL    Upstash Redis
     (free tier)       (free tier)
```
Supporting services: Cloudflare (DNS), Sentry (errors), Betterstack (uptime), PostHog (analytics).
1.1 Database — Neon PostgreSQL
Manual
- Go to neon.tech and create a free account
- Create a new project named `playupi`
- Select the region closest to your users (e.g., `eu-central-1` for Europe)
- Copy both connection strings from the dashboard:
  - Direct (for migrations): `postgresql://user:pass@ep-xxx.neon.tech/playupi?sslmode=require`
  - Pooled (for the application): `postgresql://user:pass@ep-xxx-pooler.neon.tech/playupi?sslmode=require`
- Add them to your `.env.local`:
  ```bash
  DATABASE_URL="postgresql://...@ep-xxx.neon.tech/playupi?sslmode=require"
  DATABASE_URL_POOLED="postgresql://...@ep-xxx-pooler.neon.tech/playupi?sslmode=require"
  ```
- Push the schema: `npx prisma db push`
- Seed development data: `npx tsx scripts/seed.ts`
Scripted
```bash
#!/bin/bash
# setup-db.sh — Requires NEON_API_KEY environment variable
# Install Neon CLI: npm i -g neonctl

neonctl projects create --name playupi --region-id aws-eu-central-1 --output json > /tmp/neon-project.json

PROJECT_ID=$(jq -r '.project.id' /tmp/neon-project.json)
CONN_URI=$(neonctl connection-string --project-id "$PROJECT_ID")
POOLED_URI=$(neonctl connection-string --project-id "$PROJECT_ID" --pooled)

echo "DATABASE_URL=\"$CONN_URI\"" >> .env.local
echo "DATABASE_URL_POOLED=\"$POOLED_URI\"" >> .env.local

npx prisma db push
npx tsx scripts/seed.ts

echo "Database ready."
```
AI-Assisted
Prompt for Claude Code: “Set up a Neon PostgreSQL database for the playupi project. Create the project via the Neon CLI, save the connection strings to `.env.local`, run `prisma db push`, and seed the database.”
1.2 Cache — Upstash Redis
Manual
- Go to console.upstash.com and create an account
- Create a new Redis database:
  - Name: `playupi-cache`
  - Region: same as Neon (e.g., `eu-central-1`)
  - Type: Regional (free tier)
- Copy the REST credentials from the dashboard
- Add to `.env.local`:
  ```bash
  UPSTASH_REDIS_REST_URL="https://xxx.upstash.io"
  UPSTASH_REDIS_REST_TOKEN="AXxx..."
  ```
Scripted
```bash
#!/bin/bash
# setup-redis.sh — Requires UPSTASH_EMAIL and UPSTASH_API_KEY

curl -s -X POST "https://api.upstash.com/v2/redis/database" \
  -u "$UPSTASH_EMAIL:$UPSTASH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"playupi-cache","region":"eu-central-1","tls":true}' \
  > /tmp/upstash-db.json

REST_URL=$(jq -r '.endpoint' /tmp/upstash-db.json)
REST_TOKEN=$(jq -r '.rest_token' /tmp/upstash-db.json)

echo "UPSTASH_REDIS_REST_URL=\"https://$REST_URL\"" >> .env.local
echo "UPSTASH_REDIS_REST_TOKEN=\"$REST_TOKEN\"" >> .env.local

echo "Redis ready."
```
1.3 Hosting — Vercel
Manual
- Go to vercel.com and sign up with your GitHub account
- Click “Import Project” and select the `playupi` repository
- Set the framework preset to Next.js
- Set the root directory to `apps/web` (or wherever the main Next.js app lives)
- Add environment variables in the Vercel dashboard:
  - `DATABASE_URL_POOLED` (from Neon — use the pooled URL)
  - `DATABASE_URL` (from Neon — direct URL, for build-time migrations)
  - `UPSTASH_REDIS_REST_URL`
  - `UPSTASH_REDIS_REST_TOKEN`
  - `JWT_SECRET` (generate with `openssl rand -hex 32`)
  - `NEXT_PUBLIC_SITE_URL=https://playupi.com`
- Deploy:
  - A push to the `main` branch triggers automatic deployment
  - Or deploy manually from the Vercel dashboard
Scripted (Vercel CLI)
```bash
#!/bin/bash
# Link the project (first time only)
vercel link

# Set environment variables
vercel env add DATABASE_URL_POOLED production
vercel env add DATABASE_URL production
vercel env add UPSTASH_REDIS_REST_URL production
vercel env add UPSTASH_REDIS_REST_TOKEN production
vercel env add JWT_SECRET production

# Deploy to production
vercel --prod

echo "Deployed to Vercel."
```
CI/CD (GitHub Actions)
```yaml
name: Deploy to Vercel

on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm

      - run: npm ci
      - run: npx prisma generate
      - run: npm run build

      - uses: amondnet/vercel-action@v25
        with:
          vercel-token: ${{ secrets.VERCEL_TOKEN }}
          vercel-org-id: ${{ secrets.VERCEL_ORG_ID }}
          vercel-project-id: ${{ secrets.VERCEL_PROJECT_ID }}
          vercel-args: --prod
```
1.4 DNS — Cloudflare
Manual
- Go to dash.cloudflare.com and create a free account
- Add your domain `playupi.com`
- Update nameservers at your registrar to Cloudflare’s NS records
- Add DNS records:

  | Type | Name | Value | Proxy |
  |---|---|---|---|
  | CNAME | @ | cname.vercel-dns.com | DNS only (gray cloud) |
  | CNAME | www | cname.vercel-dns.com | DNS only (gray cloud) |

- In Vercel, add the domain `playupi.com` to your project and verify
Important: Set Cloudflare proxy to DNS only (gray cloud) for Vercel. Vercel handles its own edge CDN and SSL. Orange-cloud proxying can cause certificate conflicts.
1.5 Cron Jobs — Vercel Cron + Upstash QStash
Vercel Cron (free tier) supports 2 cron jobs with a minimum daily interval. For the hourly aggregation job, use Upstash QStash (free: 500 messages/day).
Manual
- Add to `vercel.json`:
  ```json
  {
    "crons": [{ "path": "/api/cron/rank", "schedule": "0 */6 * * *" }]
  }
  ```
- Create the route handler `app/api/cron/rank/route.ts` with a `CRON_SECRET` check
- For the hourly aggregation job, set up Upstash QStash:
  - Go to console.upstash.com/qstash
  - Create a schedule that calls `https://playupi.com/api/cron/aggregate` every hour
  - Add `QSTASH_CURRENT_SIGNING_KEY` and `QSTASH_NEXT_SIGNING_KEY` to Vercel env vars
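The `CRON_SECRET` check mentioned above can be sketched as a small helper. This assumes the `Authorization: Bearer <secret>` header convention Vercel Cron uses when a `CRON_SECRET` env var is set; the helper itself is illustrative, not existing project code:

```typescript
// Sketch: reject cron invocations that don't carry the shared secret.
// Header convention (Authorization: Bearer <secret>) is an assumption
// matching Vercel Cron's documented behavior with CRON_SECRET.
export function isAuthorizedCron(
  authorizationHeader: string | null,
  cronSecret: string | undefined,
): boolean {
  if (!cronSecret || !authorizationHeader) return false;
  return authorizationHeader === `Bearer ${cronSecret}`;
}

// Usage inside app/api/cron/rank/route.ts (illustrative):
// export async function GET(req: Request) {
//   if (!isAuthorizedCron(req.headers.get("authorization"), process.env.CRON_SECRET)) {
//     return new Response("Unauthorized", { status: 401 });
//   }
//   // ...run the ranking job...
// }
```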
1.6 Monitoring & Analytics
Sentry (Error Tracking)
- Go to sentry.io and create a free account
- Create a project for Next.js
- Install: `npm install @sentry/nextjs`
- Run `npx @sentry/wizard@latest -i nextjs`
- Add `SENTRY_DSN` and `SENTRY_AUTH_TOKEN` to Vercel env vars
Betterstack (Uptime Monitoring)
- Go to betterstack.com and create a free account
- Add monitors:
  - `https://playupi.com` (homepage)
  - `https://playupi.com/api/v1/games` (API health)
- Configure email alerts for downtime
PostHog (Analytics)
- Go to posthog.com and create a free account
- Create a project and copy the API key
- Add `NEXT_PUBLIC_POSTHOG_KEY` and `NEXT_PUBLIC_POSTHOG_HOST` to Vercel env vars
1.7 Database Migrations (Production)
When schema changes are needed in production:
```bash
# Generate a migration file
npx prisma migrate dev --name add-new-field

# Deploy migration to production (run in CI or manually)
npx prisma migrate deploy
```
The Vercel build command should include `prisma generate` (not `prisma migrate deploy`). Run migrations separately before deploying, either manually or in a CI step.
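As a sketch, the split between build-time generation and deploy-time migration might look like this in `package.json` (the script names are assumptions, not existing project scripts):

```json
{
  "scripts": {
    "build": "prisma generate && next build",
    "migrate:deploy": "prisma migrate deploy"
  }
}
```

CI would run `npm run migrate:deploy` before triggering the Vercel deploy, so schema changes land ahead of the code that depends on them.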
1.8 MVP Deployment Checklist
| Step | Command / Action | Done |
|---|---|---|
| Neon database created | Dashboard or neonctl | |
| Schema pushed | npx prisma db push | |
| Upstash Redis created | Dashboard or API | |
| Vercel project linked | vercel link | |
| Env vars set in Vercel | Dashboard or vercel env add | |
| Domain configured | Cloudflare DNS + Vercel domain | |
| Sentry integrated | npx @sentry/wizard | |
| Betterstack monitors added | Dashboard | |
| PostHog integrated | Dashboard + env vars | |
| SSL/HTTPS verified | Visit https://playupi.com | |
| Security headers verified | Check with securityheaders.com | |
| Cron jobs configured | vercel.json + QStash | |
| First deploy successful | vercel --prod or push to main |
Phase 2: Growth — ~$200–500/mo
Target: 5K–50K DAU, 10M–100M events/mo, 500–5K games, 2–4 person team.
Migration triggers:
- API p95 response time > 500ms
- Database CPU consistently > 60%
- Event write throughput > 1K/sec sustained
- Hitting Vercel free tier limits (~100K invocations/mo)
Architecture Recap
```
        Cloudflare CDN
              │
  Load Balancer (Railway / Fly.io)
          ┌───┴───┐
   Backend x2   Backend x2
   (user API)   (admin API)
          └───┬───┘
 PostgreSQL (Primary + Read Replica)
              │
   Redis (dedicated instance)
```
2.1 Upgrade Vercel (or Migrate Backend)
You have two options at this phase:
Option A: Vercel Pro ($20/mo)
If API routes are still sufficient:
- Upgrade to Vercel Pro in the dashboard
- This gives you 1M serverless invocations/mo, 1TB bandwidth, and more cron jobs
- No code changes needed
Option B: Separate Backend on Railway / Fly.io
When you need persistent processes (background workers, WebSockets):
Manual (Railway)
- Go to railway.app and create an account
- Create a new project
- Add a service from your GitHub repo:
  - Set the root directory to `apps/api`
  - Set the start command: `npm start`
  - Set the build command: `npm run build`
- Add environment variables (same as Vercel, plus backend-specific ones)
- Add a PostgreSQL service from Railway’s template library (or keep Neon)
- Add a Redis service from Railway’s template library (or keep Upstash)
- Configure a custom domain: `api.playupi.com`
- Scale to 2 replicas from the Railway dashboard
Manual (Fly.io)
- Install the Fly CLI: `curl -L https://fly.io/install.sh | sh`
- Sign up: `fly auth signup`
- Create a `fly.toml` in your API directory:
  ```toml
  app = "playupi-api"
  primary_region = "cdg" # Paris

  [build]
  dockerfile = "Dockerfile"

  [http_service]
  internal_port = 3000
  force_https = true
  auto_stop_machines = true
  auto_start_machines = true
  min_machines_running = 2

  [env]
  NODE_ENV = "production"
  ```
- Set secrets: `fly secrets set DATABASE_URL="..." UPSTASH_REDIS_REST_URL="..." JWT_SECRET="..."`
- Deploy: `fly deploy`
- Add a custom domain: `fly certs add api.playupi.com`
Scripted (Railway CLI)
```bash
#!/bin/bash
# Install Railway CLI
npm i -g @railway/cli

# Login
railway login

# Create project and service
railway init --name playupi-api

# Link to repo
railway link

# Set environment variables
railway variables set DATABASE_URL="$DATABASE_URL"
railway variables set DATABASE_URL_POOLED="$DATABASE_URL_POOLED"
railway variables set UPSTASH_REDIS_REST_URL="$UPSTASH_REDIS_REST_URL"
railway variables set UPSTASH_REDIS_REST_TOKEN="$UPSTASH_REDIS_REST_TOKEN"
railway variables set JWT_SECRET="$JWT_SECRET"

# Deploy
railway up

echo "Backend deployed on Railway."
```
2.2 Database — Add Read Replica
Manual (Neon)
- Upgrade to the Neon Launch plan ($19/mo — 300 compute hours, 10 GB)
- In the Neon dashboard, create a read replica endpoint
- Copy the read replica connection string
- Add to the environment:
  ```bash
  DATABASE_URL_READ="postgresql://...@ep-xxx-read.neon.tech/playupi?sslmode=require"
  ```
- Update your backend to route read queries (game lists, search, metrics) to the read replica
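The read/write routing in the last step can be sketched as a pure policy function. In the real app this would wrap two Prisma clients (one per connection string); the helper below is illustrative:

```typescript
// Sketch: decide which connection string serves a query. Reads that
// tolerate slight staleness go to the replica; writes (and anything
// consistency-sensitive, e.g. auth lookups) stay on the primary.
type QueryKind = "read" | "write";

export function connectionUrlFor(
  kind: QueryKind,
  urls: { primary: string; readReplica?: string },
): string {
  if (kind === "read" && urls.readReplica) return urls.readReplica;
  return urls.primary; // fall back to primary when no replica exists
}
```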
Manual (Railway PostgreSQL)
- Create a PostgreSQL service in Railway (comes with the Starter plan at $5/mo)
- Enable read replicas from the Railway dashboard
- Railway automatically provides separate read/write connection strings
2.3 Cache — Upgrade Redis
Option A: Upstash Pay-as-you-go
Upgrade from free to pay-as-you-go in the Upstash dashboard. Cost: ~$0.2 per 100K commands. No action needed beyond enabling billing.
Option B: Dedicated Redis (Railway/Fly.io)
- Add a Redis service in Railway or deploy Redis on Fly.io
- Update the `REDIS_URL` environment variable
- Restart the API services
2.4 Background Workers
For hourly aggregation and ranking jobs that shouldn’t block API requests:
Manual
- Create a separate service in Railway/Fly.io for the worker
- Use a lightweight process runner (e.g., `node-cron`, BullMQ with Redis, or Inngest)
- Configure jobs:
  - Hourly: event aggregation → `DailyMetric` table
  - Every 6 hours: rank recalculation
  - Daily: tag recomputation (New, Trendy)
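The job list above can be expressed as data that the chosen runner consumes at registration time. A sketch — the cron expressions and job names are assumptions matching the schedule described:

```typescript
// Sketch: the worker's job table. Whatever runner is chosen
// (node-cron, BullMQ repeatable jobs, Inngest) would iterate this
// table and register each entry.
export interface ScheduledJob {
  name: string;
  cron: string; // standard 5-field cron expression
}

export const WORKER_JOBS: ScheduledJob[] = [
  { name: "aggregate-events", cron: "0 * * * *" },    // hourly → DailyMetric table
  { name: "recalculate-ranks", cron: "0 */6 * * *" }, // every 6 hours
  { name: "recompute-tags", cron: "0 3 * * *" },      // daily (New, Trendy)
];
```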
Scripted (BullMQ + Railway)
```bash
# In apps/worker/package.json
# Start command: "node dist/worker.js"

railway service create --name playupi-worker
railway variables set REDIS_URL="$REDIS_URL" DATABASE_URL="$DATABASE_URL"
railway up
```
2.5 CI/CD for Growth Phase
```yaml
name: Deploy (Growth)

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npm run lint
      - run: npm run test
      - run: npm audit --audit-level=critical

  migrate:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 20, cache: npm }
      - run: npm ci
      - run: npx prisma migrate deploy
        env:
          DATABASE_URL: ${{ secrets.DATABASE_URL }}

  deploy-frontend:
    needs: migrate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: amondnet/vercel-action@v25
        with:
          vercel-token: ${{ secrets.VERCEL_TOKEN }}
          vercel-org-id: ${{ secrets.VERCEL_ORG_ID }}
          vercel-project-id: ${{ secrets.VERCEL_PROJECT_ID }}
          vercel-args: --prod

  deploy-api:
    needs: migrate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm i -g @railway/cli
      - run: railway up --service playupi-api
        env:
          RAILWAY_TOKEN: ${{ secrets.RAILWAY_TOKEN }}

  deploy-worker:
    needs: migrate
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm i -g @railway/cli
      - run: railway up --service playupi-worker
        env:
          RAILWAY_TOKEN: ${{ secrets.RAILWAY_TOKEN }}
```
2.6 Growth Deployment Checklist
| Step | Action | Done |
|---|---|---|
| Vercel upgraded to Pro OR backend moved to Railway/Fly.io | ||
| Backend running 2+ replicas | ||
| Database upgraded with read replica | ||
| Redis upgraded (pay-as-you-go or dedicated) | ||
| Background worker deployed | ||
| CI/CD pipeline updated for multi-service deploy | ||
| CDN caching enabled for game list endpoints | ||
| APM/monitoring upgraded (response time tracking) | ||
| Load testing performed (target: 1K concurrent users) |
Phase 3: Scale — ~$1,000–3,000/mo
Target: 50K–500K DAU, 100M–1B events/mo, 5K–50K games, 4–10 person team.
Migration triggers:
- Event volume > 10K/sec
- PostgreSQL event table > 500GB
- Metric query time > 2s
- Need real-time or near-real-time ranking updates
Architecture Recap
```
        Cloudflare CDN
              │
        Load Balancer
       ┌──────┴──────┐
  User API x4   Admin API x2
       └──────┬──────┘
 PostgreSQL Cluster (Primary + 2 Read Replicas)
              │
 Kafka/Redpanda ──> ClickHouse (OLAP)
              │
        Redis Cluster
              │
 Meilisearch (dedicated search)
```
3.1 Event Pipeline — Kafka/Redpanda + ClickHouse
This is the biggest architectural change. Events stop going directly to PostgreSQL and instead flow through a streaming pipeline.
Manual (Managed Services)
Redpanda (recommended over Kafka — simpler, lower resource usage):
- Go to cloud.redpanda.com and create an account
- Create a Serverless cluster in your region
- Create a topic: `playupi-events` (partitions: 6, retention: 7 days)
- Copy the bootstrap servers and credentials
ClickHouse:
- Go to clickhouse.cloud and create an account
- Create a service (Development tier, ~$190/mo)
- Create the events table:
  ```sql
  CREATE TABLE events (
      id UUID,
      type LowCardinality(String),
      game_id UUID,
      session_id UUID,
      device_id String,
      platform LowCardinality(String),
      timestamp DateTime64(3),
      received_at DateTime64(3),
      context String -- JSON stored as String, extracted via JSONExtract
  )
  ENGINE = MergeTree()
  PARTITION BY toYYYYMM(timestamp)
  ORDER BY (game_id, type, timestamp)
  TTL timestamp + INTERVAL 90 DAY;
  ```
- Set up a Kafka/Redpanda consumer (ClickHouse has a native Kafka engine):
  ```sql
  CREATE TABLE events_kafka (
      id UUID,
      type String,
      game_id UUID,
      session_id UUID,
      device_id String,
      platform String,
      timestamp DateTime64(3),
      received_at DateTime64(3),
      context String
  )
  ENGINE = Kafka
  SETTINGS
      kafka_broker_list = 'xxx.redpanda.com:9092',
      kafka_topic_list = 'playupi-events',
      kafka_group_name = 'clickhouse-consumer',
      kafka_format = 'JSONEachRow';

  CREATE MATERIALIZED VIEW events_mv TO events AS
  SELECT * FROM events_kafka;
  ```
Update API event ingestion:
Change the event API route to publish to Redpanda instead of inserting into PostgreSQL:
```typescript
// Before: await prisma.event.createMany({ data: events })
// After:
import { Kafka } from 'kafkajs'

const kafka = new Kafka({ brokers: [process.env.REDPANDA_BROKER!] })
const producer = kafka.producer()

// KafkaJS requires an explicit connect before the first send
await producer.connect()

await producer.send({
  topic: 'playupi-events',
  messages: validatedEvents.map(e => ({
    key: e.gameId,
    value: JSON.stringify(e),
  })),
})
```
Prompt for Claude Code: “Refactor the event ingestion API route to publish events to Redpanda/Kafka instead of inserting directly into PostgreSQL. Set up the KafkaJS producer with the connection from the `REDPANDA_BROKER` env var. Keep the Zod validation layer; just change the storage backend.”
3.2 PostgreSQL — Cluster with Connection Pooling
Manual
- Migrate to a managed PostgreSQL provider with clustering support:
  - Neon Scale ($69/mo — 750 compute hours, 50 GB, read replicas)
  - Railway with dedicated PostgreSQL
  - Supabase Pro ($25/mo + compute add-ons)
- Set up 2 read replicas
- Add PgBouncer in front of PostgreSQL (if not using Neon’s built-in pooler):
  ```bash
  # On Railway, add PgBouncer as a separate service
  # Or use Supavisor (Supabase's built-in pooler)
  ```
- Update the application to use read replicas for:
  - Game list queries
  - Search queries
  - Metric dashboard queries
  - Exploration queue reads
3.3 Search — Meilisearch
Manual (Meilisearch Cloud)
- Go to meilisearch.com/cloud and create an account
- Create an index named `games`
- Configure searchable attributes: `title`, `description`, `author.name`
- Configure filterable attributes: `categories`, `platform`, `visibility`, `state`
- Set up a sync job to push game data from PostgreSQL to Meilisearch on game create/update
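The sync job in the last step boils down to mapping a PostgreSQL row to a Meilisearch document whose fields match the searchable/filterable attributes configured above. A sketch — the row shape is an assumption:

```typescript
// Sketch: shape a game row into a Meilisearch document. Field names
// follow the index configuration above; GameRow is a hypothetical
// projection of the Prisma model, not the actual schema.
interface GameRow {
  id: string;
  title: string;
  description: string;
  authorName: string;
  categories: string[];
  platform: string;
  visibility: string;
  state: string;
}

export function toSearchDocument(row: GameRow) {
  return {
    id: row.id,
    title: row.title,
    description: row.description,
    author: { name: row.authorName }, // nested so "author.name" is searchable
    categories: row.categories,
    platform: row.platform,
    visibility: row.visibility,
    state: row.state,
  };
}

// A sync job would call this for created/updated games, then push the
// batch via the Meilisearch JS client's addDocuments().
```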
Manual (Self-hosted on Fly.io)
```bash
fly launch --image getmeili/meilisearch:latest --name playupi-search --region cdg
fly secrets set MEILI_MASTER_KEY="$(openssl rand -hex 16)"
fly scale memory 1024 # 1GB RAM
```
3.4 Redis Cluster
Upgrade from single Redis to a cluster for distributed caching:
- Upstash Pro: Upgrade to a Pro plan with higher throughput limits
- Self-managed: Deploy Redis Cluster on Fly.io or Railway with 3 nodes
- Update the Redis client configuration to use cluster mode
3.5 Scale Deployment Checklist
| Step | Action | Done |
|---|---|---|
| Redpanda cluster created and topic configured | ||
| ClickHouse service created with events table | ||
| Kafka consumer (materialized view) active | ||
| API event ingestion refactored to use Redpanda | ||
| PostgreSQL upgraded with 2 read replicas | ||
| Connection pooling configured (PgBouncer) | ||
| Meilisearch deployed and synced | ||
| Redis upgraded to cluster/pro | ||
| Aggregation jobs read from ClickHouse | ||
| Ranking recalculation runs every 15–60 minutes | ||
| Load testing performed (target: 10K concurrent users) | ||
| Monitoring dashboards for Kafka lag, ClickHouse query time |
Phase 4: Platform — ~$5,000–15,000/mo
Target: 500K+ DAU, 1B+ events/mo, 50K+ games, 10+ person team.
Migration triggers:
- Multiple geographic regions needed
- Dev portal with third-party traffic
- Multiple teams with independent release cycles
Architecture Recap
```
 Cloudflare (CDN + WAF + DDoS)
              │
 API Gateway (rate limiting, auth, routing)
       ┌──────┼──────────┐
  User API  Admin API  Dev Portal API
        (multi-region)
              │
 PostgreSQL (Primary + Global Read Replicas)
 ClickHouse Cluster (sharded)
 Redis (multi-region)
 Meilisearch Cluster
 S3/R2 (object storage for assets)
 Kafka Cluster (event bus)
 Grafana + Prometheus (full observability)
```
4.1 Multi-Region Deployment
Manual
- Identify target regions based on user distribution (e.g., EU, US, Asia)
- Deploy API services in 2–3 regions:
  ```bash
  # Fly.io multi-region
  fly regions add cdg iad nrt # Paris, Virginia, Tokyo
  fly scale count 3 --region cdg
  fly scale count 2 --region iad
  fly scale count 1 --region nrt
  ```
- Configure Cloudflare load balancing to route users to the nearest region
- Set up PostgreSQL global read replicas (one per region):
  - Primary in the main region (writes)
  - Read replicas in secondary regions (reads)
- Deploy Redis in each region for local caching
4.2 Multi-Repo Migration
When team size and release cycles demand it:
- Split the monorepo into:

  | Repo | Contents |
  |---|---|
  | `playupi-web` | Player-facing frontend |
  | `playupi-api` | Public + Admin API |
  | `playupi-admin` | Admin dashboard frontend |
  | `playupi-devportal` | Developer portal (frontend + API) |
  | `playupi-shared` | Shared types, constants, utilities (published as npm package) |
  | `playupi-infra` | Terraform/Pulumi IaC, Docker configs, CI/CD templates |

- Publish `playupi-shared` as a private npm package
- Each repo gets its own CI/CD pipeline
4.3 Infrastructure as Code (Terraform/Pulumi)
At this phase, all infrastructure should be defined as code:
```hcl
# infrastructure/main.tf (Terraform example)

module "database" {
  source   = "./modules/postgres"
  instance = "db.m5.xlarge"
  replicas = 3
  regions  = ["eu-central-1", "us-east-1"]
}

module "redis" {
  source    = "./modules/redis"
  node_type = "cache.m5.large"
  clusters  = 2
  regions   = ["eu-central-1", "us-east-1"]
}

module "clickhouse" {
  source   = "./modules/clickhouse"
  tier     = "production"
  shards   = 3
  replicas = 2
}

module "search" {
  source    = "./modules/meilisearch"
  instances = 3
  memory    = "4GB"
}
```
AI-Assisted
Prompt for Claude Code: “Generate Terraform modules for the Playupi platform infrastructure: PostgreSQL with read replicas, Redis cluster, ClickHouse, and Meilisearch. Target AWS eu-central-1 as primary region with us-east-1 as secondary.”
4.4 API Gateway
Deploy an API gateway for centralized auth, rate limiting, and routing:
Options:
- Kong (open-source, self-hosted on Kubernetes)
- AWS API Gateway (managed, if on AWS)
- Cloudflare Workers (edge-based, integrates with existing Cloudflare)
```
Request → Cloudflare → API Gateway → Service
                          ├── Auth validation
                          ├── Rate limiting
                          ├── Request logging
                          └── Route to correct service
```
4.5 Full Observability Stack
Manual
- Metrics: deploy Prometheus + Grafana (or use the Grafana Cloud free tier)
  ```bash
  # Self-hosted on Fly.io or a dedicated VPS
  fly launch --image grafana/grafana --name playupi-grafana
  fly launch --image prom/prometheus --name playupi-prometheus
  ```
- Logging: deploy Loki or use Betterstack Logs
- Tracing: add OpenTelemetry instrumentation to all services
- Alerting: configure PagerDuty or Opsgenie for on-call rotation
Key Dashboards
| Dashboard | Panels |
|---|---|
| API Health | Request rate, p50/p95/p99 latency, error rate, status codes |
| Database | Query time, connections, replication lag, disk usage |
| Events Pipeline | Kafka lag, ClickHouse insert rate, consumer throughput |
| Business | DAU, plays, new games, ranking changes |
| Infrastructure | CPU, memory, network across all services |
4.6 Object Storage (S3/R2)
Migrate game assets off broker CDNs to your own storage:
- Create a Cloudflare R2 bucket: `playupi-assets`
- Set up an image processing pipeline:
  ```
  Original upload → Resize (32, 64, 128, 256, 512px) → Store in R2 → Serve via Cloudflare CDN
  ```
- Use Cloudflare Image Resizing or a Workers script for on-the-fly transforms
- Migrate existing `thumbnailUrl` references to R2 URLs
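The resize pipeline implies a deterministic object-key scheme so the upload worker and the CDN URLs agree on where each variant lives. A sketch — the `games/<id>/<size>.webp` layout is an assumption, not an existing convention:

```typescript
// Sketch: derive R2 object keys for every thumbnail variant of a game.
// Sizes mirror the pipeline above; the key layout is hypothetical.
export const THUMBNAIL_SIZES = [32, 64, 128, 256, 512] as const;

export function variantKeys(gameId: string): string[] {
  return THUMBNAIL_SIZES.map((size) => `games/${gameId}/${size}.webp`);
}

// The upload worker would write each resized buffer to its key, and the
// frontend would request e.g. https://assets.playupi.com/games/<id>/128.webp
// (domain is illustrative).
```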
4.7 Platform Deployment Checklist
| Step | Action | Done |
|---|---|---|
| Multi-region API deployment (2–3 regions) | ||
| Cloudflare load balancing configured | ||
| PostgreSQL global read replicas | ||
| Multi-region Redis | ||
| Monorepo split into multi-repo | ||
| Shared package published | ||
| Infrastructure as Code (Terraform/Pulumi) | ||
| API gateway deployed | ||
| Full observability (Grafana + Prometheus + Loki) | ||
| On-call rotation configured | ||
| Object storage (R2) for game assets | ||
| Dev portal deployed | ||
| Load testing (target: 100K concurrent users) | ||
| Disaster recovery plan documented and tested |
Quick Reference: Services by Phase
| Service | Phase 1 (MVP) | Phase 2 (Growth) | Phase 3 (Scale) | Phase 4 (Platform) |
|---|---|---|---|---|
| Frontend | Vercel Free | Vercel Pro | Vercel Pro | Vercel Enterprise / CDN |
| Backend | Vercel Serverless | Railway/Fly.io x2 | Railway/Fly.io x4 | Kubernetes / Fly.io multi-region |
| Database | Neon Free | Neon Launch + replica | Neon Scale + 2 replicas | Managed PostgreSQL cluster |
| Cache | Upstash Free | Upstash Pro | Redis Cluster | Multi-region Redis |
| Events | PostgreSQL | PostgreSQL | Redpanda + ClickHouse | Kafka Cluster + ClickHouse Cluster |
| Search | PostgreSQL ILIKE | PostgreSQL tsvector | Meilisearch | Meilisearch Cluster |
| CDN | Vercel Edge | Cloudflare + Vercel | Cloudflare Pro | Cloudflare Enterprise |
| Monitoring | Sentry Free + Betterstack Free | Sentry + APM | Grafana + Prometheus | Full observability stack |
| Analytics | PostHog Free | PostHog | PostHog | PostHog + custom dashboards |
| Assets | Broker CDN | Broker CDN | R2 (optional) | Cloudflare R2 + Image Resizing |
| Cost | $0/mo | ~$200–500/mo | ~$1,000–3,000/mo | ~$5,000–15,000/mo |
Rollback Strategy
For all phases, maintain the ability to roll back:
| Layer | Rollback Method |
|---|---|
| Frontend | Vercel instant rollback (previous deployment) |
| Backend | Railway/Fly.io previous release (fly releases, railway rollback) |
| Database | Never rollback schema — use forward migrations. Neon: branch from PITR for data |
| ClickHouse | Replay events from Kafka retention window |
| Redis | Cache rebuild on restart (no persistent data) |
Emergency Rollback
```bash
# Vercel — roll back to the previous production deployment
vercel rollback

# Fly.io — roll back to a previous release
fly releases
fly deploy --image registry.fly.io/playupi-api:v42 # specific version

# Railway — roll back via dashboard or CLI
railway rollback
```
Security Checklist (All Phases)
Before every production deployment, verify:
- No secrets in git history (`git log --all -p | grep -i "password\|secret\|token"`)
- `npm audit` passes with 0 critical vulnerabilities
- HTTPS enforced on all endpoints
- Security headers present (check securityheaders.com)
- Database not publicly accessible
- Rate limiting active on public endpoints
- Admin auth working (JWT + bcrypt)
- Iframe sandbox attributes in place
- CSP headers configured
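Several checklist items (HTTPS enforcement, security headers, CSP) can be centralized in one place. A sketch shaped for the `headers()` function in `next.config.js` — the exact values are starting points, not a drop-in policy, and the CSP in particular will need `frame-src` adjustments for embedded games:

```typescript
// Sketch: security headers from the checklist. Values are assumptions /
// strict starting points, to be tuned per the Security spec.
export function securityHeaders(): { key: string; value: string }[] {
  return [
    // Enforce HTTPS for two years, including subdomains
    { key: "Strict-Transport-Security", value: "max-age=63072000; includeSubDomains" },
    // Prevent MIME-type sniffing
    { key: "X-Content-Type-Options", value: "nosniff" },
    // Limit referrer leakage to other origins
    { key: "Referrer-Policy", value: "strict-origin-when-cross-origin" },
    // Deliberately strict CSP — loosen frame-src for embedded games as needed
    { key: "Content-Security-Policy", value: "default-src 'self'" },
  ];
}
```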
See Security for the full security specification.