
exzen-core — Staging Environment

Status: Deployed ✅ (2026-05-10)

URLs:

  • Frontend (Mini App): https://telegram-staging.exzentcg.com (Vercel Preview env, pinned to staging branch)
  • Backend API: https://telegram-staging-api.exzentcg.com (Proxmox CT 106, gated by WEBHOOK_SECRET only — no Cloudflare Access)

Snapshot: initial-deploy (taken 2026-05-10 after first green E2E order test)


Purpose

Hosts a stable staging environment for ExzenTCG/exzen-core — the existing Telegram Mini App + Python FastAPI backend currently running in a long-lived terminal on Oracle Cloud. Staging exists so the operator can test changes (UX tweaks, new features, integration changes) end-to-end before promoting them to the production Oracle host. Also a useful first step toward containerising the backend so the production deployment can eventually move off "runs in a terminal" onto a supervised container.

This is a stopgap until the TCG Commerce Operations Platform replaces exzen-core at Phase 10 go-live. Don't over-invest — the goal is "stop testing in prod" until the replacement lands.

Frontend stays on Vercel — Preview environment with branch override

The Next.js Mini App is too coupled to Vercel features (preview deploys, SSR, @vercel/analytics) to bother running on the homelab. Only the Python backend lives in the CT.

Vercel Custom Environments are Pro-tier only ($20/mo), so on Hobby we use Preview environment + branch-specific overrides: every staging-only env var must be scoped to Preview AND restricted to the staging git branch. The custom domain telegram-staging.exzentcg.com is assigned to the project with Git Branch = staging.

Discipline required

Any future env var added with Environment = Preview but without the staging-branch override will be inherited by every PR's preview build, which would point those PRs at the staging backend instead of mocks/prod. Always set the branch override explicitly. (Upgrading to Pro flips this on its head with proper Custom Environments.)
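The check described above can be automated as a small audit. The sketch below is a minimal illustration, assuming the project's env vars have been exported into a list of records with `key`, `target`, and `gitBranch` fields; that record shape and the function name are assumptions for illustration, not something this runbook defines.

```python
from typing import Iterable, Mapping


def find_unscoped_preview_vars(env_vars: Iterable[Mapping]) -> list:
    """Return the keys of vars that target Preview without a branch restriction.

    Assumed record shape (hypothetical export format):
    {"key": str, "target": list of str, "gitBranch": str or None}
    """
    leaky = []
    for var in env_vars:
        targets = var.get("target") or []
        # A Preview-scoped var with no branch is inherited by every PR preview.
        if "preview" in targets and not var.get("gitBranch"):
            leaky.append(var["key"])
    return sorted(leaky)


example = [
    {"key": "BACKEND_API_URL", "target": ["preview"], "gitBranch": "staging"},
    {"key": "WEBHOOK_SECRET", "target": ["preview"], "gitBranch": None},
    {"key": "NEXTAUTH_URL", "target": ["production"]},
]
print(find_unscoped_preview_vars(example))
```

Any key this returns would leak staging values into ordinary PR previews and should get the staging branch override.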

Bots run in polling mode

The bots run python-telegram-bot in long-polling mode: each bot pulls updates from Telegram's API over outbound HTTPS. No inbound webhook exposure is needed for the bots, which is why this CT does not need an n8n-webhooks-style Bypass policy. The only inbound traffic is Vercel BFF → backend HTTP API (gated by WEBHOOK_SECRET).

Real-money risk — mitigate via PayNow config

Production uses real PAYNOW_UEN/PAYNOW_MOBILE. If staging copies those values, a tester scanning a staging QR code would actually pay real money. Leave both blank in the staging .env: build_paynow_payment_data then raises ValueError and the in-app QR block doesn't render (the success view falls through to the legacy "Open Telegram Bot" CTA). If a future change makes the QR fall-through unacceptable, rotate to a dedicated staging UEN/mobile or render a placeholder with a "STAGING — DO NOT PAY" overlay.
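The guard-and-fall-through behaviour can be pictured with a short sketch. Both functions are hypothetical stand-ins, not the real build_paynow_payment_data or the Mini App's success view. Preferring the UEN over the mobile number is also an assumption.

```python
def resolve_paynow_recipient(uen: str, mobile: str) -> str:
    """Hypothetical stand-in for the guard: refuse to proceed without a recipient."""
    # Assumption: UEN takes precedence over mobile when both are set.
    recipient = (uen or "").strip() or (mobile or "").strip()
    if not recipient:
        raise ValueError("PayNow recipient not configured")
    return recipient


def success_view(uen: str, mobile: str) -> str:
    """Pick which block the order success page would show."""
    try:
        resolve_paynow_recipient(uen, mobile)
    except ValueError:
        # Blank config: no QR is built, so fall back to the legacy CTA.
        return "open-telegram-bot-cta"
    return "paynow-qr"


print(success_view("", ""))           # staging with both values blank
print(success_view("", "<mobile>"))   # a recipient is configured
```

The point of the design is that a blank config fails closed: no recipient means no QR, so nothing can be paid by accident.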


Deployment Details

Property Value
CT ID 106
Hostname exzen-staging
IP 192.168.0.56
Backend port 5000 (FastAPI/uvicorn)
RAM 2048 MB
Swap 1024 MB
Disk 10 GB
Cores 1
DNS 1.1.1.1 (same as CT 105 — router DNS has known lookup issues)
Public URL (frontend) https://telegram-staging.exzentcg.com (Vercel)
Public URL (backend) https://telegram-staging-api.exzentcg.com (Cloudflare Tunnel → NPM → CT 106)
Auth (frontend) None at the Cloudflare layer — Mini App handles its own Telegram-initData auth
Auth (backend) WEBHOOK_SECRET header (BFF→backend); no Cloudflare Access
Snapshot initial-deploy (taken after first green end-to-end order test)

Resource sizing rationale

FastAPI + SQLite + the two python-telegram-bot long-polling threads + APScheduler is comfortably under 1 GB resident in production today. 2 GB / 10 GB / 1 core is generous. Bump if integration polling jobs cause pressure later.


Prerequisites — what you create by hand before running this runbook

These are owner-managed (Claude can't edit 00_secrets). Add each to homelab/00_secrets/Credentials Index.md as you go.

  1. BotFather: create two new bots.
    • Staging admin bot, e.g. ExzenTCG_Staging_Admin_bot (record token)
    • Staging customer bot, e.g. ExzenTCG_Staging_bot (record token)
    • On the staging customer bot: set Menu Button URL to https://telegram-staging.exzentcg.com once Vercel is wired up
  2. Staging admin Telegram forum group:
    • Create a new Telegram group → enable Topics → add both staging bots → make staging admin bot an admin with topic permissions
    • The bot will discover the group via /set (see Step 9 below), so no manual chat ID extraction is needed
  3. Google Sheet:
    • Duplicate the production master Sheet → rename to ExzenTCG Staging → share with the existing service account email (unchanged from prod)
    • Record the new Sheet ID
  4. Secrets to generate:
    • WEBHOOK_SECRET: fresh random string (different from prod)
    • NEXTAUTH_SECRET: fresh random string for the Vercel staging project
  5. Vercel: set up Preview + branch override for the staging branch (one-time, in https://vercel.com/exzentcgs-projects/exzentcg):
    • Settings → Domains → Add telegram-staging.exzentcg.com → in the domain row, set Git Branch to staging. This pins this hostname to deploys built from the staging branch only.
    • Settings → Deployment Protection → Vercel Authentication: untick "Require Log In" (or set to "Only Production Deployments" if that option appears). The default Hobby setting gates all Preview deploys behind a Vercel SSO login, which the Telegram webview can't satisfy → Mini App opens to the SSO popup forever. The Mini App enforces its own auth via Telegram initData HMAC and WEBHOOK_SECRET, so this Vercel layer is redundant for this project and breaks the Telegram flow.
    • Settings → Environment Variables: for each variable below, click Add New → Value → Environment = tick Preview → check "Apply to specific Git branches" → enter staging. Repeat for each:
      • BACKEND_API_URL=https://telegram-staging-api.exzentcg.com
      • WEBHOOK_SECRET=<staging webhook secret>
      • CUSTOMER_BOT_TOKEN=<staging customer bot token> (used by /api/auth/telegram to validate Mini App initData HMAC; without it auth fails 500 and the Mini App stays on the access-restricted gate even after Telegram opens the URL correctly)
      • NEXT_PUBLIC_BOT_USERNAME=<staging customer bot username> (e.g. Zencorestore_staging_bot. Used by the admin login page; without it staging falls back to the prod bot handle.)
      • NEXTAUTH_URL=https://telegram-staging.exzentcg.com
      • NEXTAUTH_SECRET=<staging>

Always set the branch override

If you forget the staging branch override on any of these vars, every PR preview in this project will inherit the staging value, including pointing the Mini App at the staging backend. Custom Environments (Pro tier) eliminate this risk; on Hobby it's checklist discipline.

Vercel doesn't auto-rebuild on env-var changes

Adding a new env var (or changing one) does not trigger an automatic redeploy. After updating env vars, either push an empty commit to the staging branch or click Redeploy on the latest staging deployment in the Vercel dashboard (untick "Use existing Build Cache" so the new var is baked into the lambda runtime). Without a fresh build, the running deployment serves the old env values.
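For context on why CUSTOMER_BOT_TOKEN is mandatory: the initData check follows Telegram's published Mini App validation scheme, which derives an HMAC key from the bot token. The sketch below is a Python illustration of that public scheme, not the project's /api/auth/telegram route (which lives in the Next.js app). The function names are hypothetical. It also omits the auth_date freshness check a real route would normally add.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode


def _expected_hash(fields: dict, bot_token: str) -> str:
    # Data-check-string: every field except "hash", sorted by key, joined by newlines.
    data_check_string = "\n".join(f"{k}={v}" for k, v in sorted(fields.items()))
    # The secret key is HMAC-SHA256 of the bot token, keyed with the literal "WebAppData".
    secret_key = hmac.new(b"WebAppData", bot_token.encode(), hashlib.sha256).digest()
    return hmac.new(secret_key, data_check_string.encode(), hashlib.sha256).hexdigest()


def validate_init_data(init_data: str, bot_token: str) -> bool:
    """Return True if the query-string style initData was signed with bot_token."""
    fields = dict(parse_qsl(init_data, keep_blank_values=True))
    received = fields.pop("hash", "")
    expected = _expected_hash(fields, bot_token)
    # Constant-time comparison on bytes.
    return hmac.compare_digest(expected.encode(), received.encode())


def sign_for_demo(fields: dict, bot_token: str) -> str:
    """Build a signed initData string locally (demo helper only)."""
    return urlencode({**fields, "hash": _expected_hash(fields, bot_token)})


demo_token = "123456:DEMO-TOKEN"  # placeholder, not a real bot token
signed = sign_for_demo({"auth_date": "1700000000", "user": '{"id":1}'}, demo_token)
print(validate_init_data(signed, demo_token))
print(validate_init_data(signed, "999999:OTHER-TOKEN"))
```

The second call shows the failure mode behind the access-restricted gate: initData signed by the staging bot cannot validate against any other token, and a missing token cannot validate at all.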


Deployment Plan

Step 1 — Create LXC

From the Proxmox host (192.168.0.200):

pct create 106 local:vztmpl/debian-12-standard_12.12-1_amd64.tar.zst \
  --hostname exzen-staging \
  --cores 1 \
  --memory 2048 \
  --swap 1024 \
  --rootfs local-lvm:10 \
  --net0 name=eth0,bridge=vmbr0,ip=192.168.0.56/24,gw=192.168.0.1,firewall=1 \
  --nameserver 192.168.0.1 \
  --features nesting=1 \
  --onboot 1 \
  --start 1

Step 2 — Install Docker + set DNS

pct enter 106
apt update && apt upgrade -y && apt install curl git -y
curl -fsSL https://get.docker.com | sh
exit

# Fix DNS from Proxmox host (router DNS has known lookup issues — see CT 103, 105)
pct set 106 --nameserver 1.1.1.1
pct reboot 106

Step 3 — Clone exzen-core and prepare config

pct enter 106
mkdir -p /opt/exzen-core && cd /opt/exzen-core
git clone https://github.com/ExzenTCG/exzen-core.git .
git checkout staging   # long-lived staging branch — never deploy main/stablev2 here

Branch model

exzen-core uses two long-lived branches: stablev2 is production (Oracle Cloud manual deploy + Vercel production project) and staging is what this CT tracks (staging Vercel custom domain too). Feature branches PR into staging first; once exercised here, staging is merged into stablev2 to ship.

Create backend/config/.env with staging values (NEVER commit this file — it's gitignored):

cd /opt/exzen-core/backend
mkdir -p config
cp config/templates/.env.example config/.env
nano config/.env

Required staging values (the rest copy from prod via password manager):

# IMPORTANT: ENVIRONMENT=PROD on staging.
# The Settings class derives TELEGRAM_TEST_MODE from ENVIRONMENT==DEV. When
# TELEGRAM_TEST_MODE is true, the bots route their API calls to Telegram's
# *test environment server* (/bot{token}/test) which is a separate Telegram
# instance. Our staging bots are regular @BotFather bots, not test-env bots,
# so they get rejected by the test server. Setting ENVIRONMENT=PROD here
# bypasses that routing. Marketplace integrations are blank on staging so the
# PROD-mode Shopee/Lazada gateways are inert.
ENVIRONMENT=PROD

# Bot tokens — staging-only, from BotFather. Settings reads the *_PROD keys
# when ENVIRONMENT=PROD (NOT plain TELEGRAM_BOT_TOKEN — that name is silently
# ignored thanks to pydantic_settings extra="ignore").
TELEGRAM_BOT_TOKEN_PROD=<staging admin bot token>
CUSTOMER_BOT_TOKEN_PROD=<staging customer bot token>

# Authorised admin user IDs — comma-separated string, NOT JSON array.
# (.env.example shows JSON; that example is misleading — Settings uses str split.)
AUTHORIZED_USER_IDS=<your Telegram user ID>

# Sheets — staging Sheet ID
GOOGLE_SHEET_ID=<staging sheet id>

# Customer Mini App URL — Vercel staging domain
CUSTOMER_MINIAPP_URL=https://telegram-staging.exzentcg.com

# BFF auth — fresh random
WEBHOOK_SECRET=<staging webhook secret>

# Payments — leave blank to disable QR generation (real-money safety guard).
# Once you trust the staging environment and want to test end-to-end PayNow
# flows, you can set a dedicated test mobile/UEN here. WARNING: any QR
# generated then routes real SGD to that recipient.
PAYNOW_MOBILE=
PAYNOW_UEN=
COMPANY_NAME=ExzenTCG (Staging)

# Marketplace integrations — leave blank to disable, or use sandbox/test creds.
# DO NOT use prod marketplace credentials — staging would touch real listings.
PARTNER_ID=
PARTNER_KEY=
SHOP_ID=
NGROK_AUTHTOKEN=

# Sentry — optional, separate project for staging
GLITCHTIP_DSN=
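Several of the rules in the comments above are easy to get wrong by hand, so a pre-flight lint of config/.env can help. The sketch below is an illustrative check written for this runbook, not part of exzen-core. It encodes only the rules stated above (ENVIRONMENT=PROD, the *_PROD token keys, comma-separated admin IDs, blank marketplace credentials). The function names are hypothetical.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def lint_staging_env(values: dict) -> list:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if values.get("ENVIRONMENT") != "PROD":
        problems.append("ENVIRONMENT must be PROD on staging")
    required = ("TELEGRAM_BOT_TOKEN_PROD", "CUSTOMER_BOT_TOKEN_PROD",
                "GOOGLE_SHEET_ID", "WEBHOOK_SECRET")
    for key in required:
        if not values.get(key):
            problems.append(f"{key} is empty")
    # Comma-separated string, not a JSON array: "[111]" fails the digit check.
    ids = [p.strip() for p in values.get("AUTHORIZED_USER_IDS", "").split(",") if p.strip()]
    if not ids or not all(p.isdigit() for p in ids):
        problems.append("AUTHORIZED_USER_IDS must be comma-separated numeric IDs")
    for key in ("PARTNER_ID", "PARTNER_KEY", "SHOP_ID"):
        if values.get(key):
            problems.append(f"{key} should stay blank unless it is a sandbox credential")
    return problems


sample = "ENVIRONMENT=DEV\nAUTHORIZED_USER_IDS=[111]\n"
for problem in lint_staging_env(parse_env(sample)):
    print(problem)
```

Running it against the real file would be a matter of passing the contents of /opt/exzen-core/backend/config/.env to parse_env before the first docker compose up.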

File ownership matters

The bind-mount target /app/config/.env is read by the container's app user (uid 1000). After writing the file as root, chown 1000:1000 /opt/exzen-core/backend/config/.env (and the same for service_account.json) — otherwise the container exits with PermissionError: [Errno 13] Permission denied: '/app/config/.env' on first start.

Place the Sheets service account JSON at backend/config/service_account.json (same file as prod — the duplicated Sheet is shared with the same service account).
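Before starting the container, the ownership of both bind-mounted files can be checked in one pass. This is a minimal sketch using only the standard library. The helper name is hypothetical, and uid 1000 comes from the note above.

```python
import os


def wrong_owner(paths, expected_uid=1000):
    """Return the paths that are missing or not owned by expected_uid."""
    bad = []
    for path in paths:
        try:
            if os.stat(path).st_uid != expected_uid:
                bad.append(path)
        except FileNotFoundError:
            # A missing file would also stop the container, so report it too.
            bad.append(path)
    return bad


config_files = [
    "/opt/exzen-core/backend/config/.env",
    "/opt/exzen-core/backend/config/service_account.json",
]
print(wrong_owner(config_files))
```

An empty list means both files exist and are owned by uid 1000; anything listed needs the chown described above.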

Step 4 — Build and start the backend

The Dockerfile + docker-compose.yml live in the exzen-core repo at backend/ (Python 3.11-slim, non-root uid 1000, named volumes for data/, logs/, orders/, qr_temp/, healthcheck on /health). Confirm the checkout has them before continuing — ls backend/Dockerfile backend/docker-compose.yml should both succeed.

cd /opt/exzen-core/backend
docker compose up -d --build
docker compose ps
docker compose logs --tail 100 exzen-core

Port binding

The compose file binds 5000:5000 (all interfaces). Inbound ACL is enforced by the LXC firewall in Step 5 — only +edge_gw and +admin_desktop reach :5000. No need for an IP-specific bind.

Volume ownership on first boot

Docker creates the four named volumes (exzen-data, exzen-logs, exzen-orders, exzen-qr-temp) as root:root by default. The container runs as uid 1000 and will crash with PermissionError: '/app/logs/shopee_bot.log' on first write. The current Dockerfile runs mkdir -p ... && chown -R app:app inside the image so that later rebuilds inherit the right permissions, but on the very first docker compose up the volume mounts overlay the image directories and reintroduce root ownership. Fix once after first boot:

for v in exzen-data exzen-logs exzen-orders exzen-qr-temp; do
  # Quote the substitution so an unexpected path cannot be word-split
  chown -R 1000:1000 "$(docker volume inspect "backend_$v" --format '{{.Mountpoint}}')"
done
docker compose restart

First-boot checks:

  • Backend logs show Uvicorn running on http://0.0.0.0:5000 and Application started.
  • Both bots log connection success (admin + customer) — look for getMe HTTP/1.1 200 OK for each token.
  • curl http://localhost:5000/health returns 200.
  • SQLite migrations run automatically at startup; volume exzen-data now contains the DB file.
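The /health check above can be turned into a small wait loop, since uvicorn may take a few seconds to come up after docker compose up. This is a minimal sketch using only the standard library. The function names, retry count, and delay are illustrative choices, not values from the repo.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code, or 0 if the connection failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, or timeout


def wait_for_health(url, attempts=10, delay=2.0, fetch=fetch_status, sleep=time.sleep) -> bool:
    """Poll url until it returns 200 or the attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    print(wait_for_health("http://localhost:5000/health"))
```

The fetch and sleep parameters exist only so the loop can be exercised without a running backend.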

Schema bootstrap on a truly fresh DB

main.py calls init_database() at startup, which creates the schema via init_db.init() if data/zendb.sqlite3 doesn't exist, then runs migrations. If you ever wipe backend_exzen-data and the container restart-loops with sqlite3.OperationalError: no such table: customer_orders, manually bootstrap the schema once:

docker compose exec -u app exzen-core python -c "from init_db import init; init()"
docker compose restart

Step 5 — LXC Firewall

Create /etc/pve/firewall/106.fw on the Proxmox host:

[OPTIONS]
enable: 1

[RULES]
IN ACCEPT -source +edge_gw -p tcp -dport 5000      # Only edge-gateway can reach FastAPI
IN ACCEPT -source +admin_desktop -p tcp -dport 22   # SSH from admin
IN ACCEPT -source +admin_desktop -p tcp -dport 5000 # Direct LAN access for dev
IN DROP                                              # Block all other inbound
OUT ACCEPT -dest +router_gw -p udp -dport 53        # DNS
OUT ACCEPT -dest +router_gw -p tcp -dport 53        # DNS
OUT ACCEPT -p udp -dport 53                          # Public DNS fallback (1.1.1.1)
OUT DROP -dest +lan_subnet                           # Lateral movement prevention
OUT ACCEPT -p tcp -dport 443                         # HTTPS — Telegram, Sheets, Vercel BFF, marketplaces
OUT ACCEPT -p tcp -dport 80                          # HTTP fallback (apt, image registries)
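Rule order matters in the OUT section: the LAN drop has to sit above the broad :443 and :80 accepts, or LAN hosts become reachable on those ports. The sketch below is a simplified first-match model of the rules above, not how Proxmox actually compiles them. The values used for +lan_subnet (192.168.0.0/24) and +router_gw (192.168.0.1) are assumptions inferred from the addresses in this runbook. The fall-through default is also an assumption.

```python
from ipaddress import ip_address, ip_network

LAN = ip_network("192.168.0.0/24")  # assumed value of +lan_subnet
ROUTER = ip_address("192.168.0.1")  # assumed value of +router_gw

# (action, destination matcher, protocol or None, port or None), in file order
OUT_RULES = [
    ("ACCEPT", lambda d: d == ROUTER, "udp", 53),
    ("ACCEPT", lambda d: d == ROUTER, "tcp", 53),
    ("ACCEPT", lambda d: True, "udp", 53),
    ("DROP", lambda d: d in LAN, None, None),
    ("ACCEPT", lambda d: True, "tcp", 443),
    ("ACCEPT", lambda d: True, "tcp", 80),
]


def decide(dest: str, proto: str, port: int, rules=OUT_RULES, default="ACCEPT") -> str:
    """Return the action of the first matching rule (simplified model)."""
    addr = ip_address(dest)
    for action, matches, rule_proto, rule_port in rules:
        if not matches(addr):
            continue
        if rule_proto is not None and rule_proto != proto:
            continue
        if rule_port is not None and rule_port != port:
            continue
        return action
    return default  # assumption: whatever outbound policy applies when nothing matches


print(decide("192.168.0.200", "tcp", 8006))  # Proxmox host on the LAN
print(decide("1.1.1.1", "tcp", 443))         # public HTTPS
```

Moving the :443 accept above the LAN drop in this model makes decide("192.168.0.200", "tcp", 443) return ACCEPT, which is the lateral-movement hole the ordering is there to prevent.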

No marketplace allow-list yet

Phase 1 staging starts with marketplace credentials blank — no Shopee/Lazada outbound traffic. If you later wire sandbox/test credentials, tighten OUT ACCEPT :443 to specific marketplace hostnames the same way tcg-staging.fw will in Phase 3.

Verify after boot:

# From inside CT 106

# 1. LAN lateral-movement should be blocked
curl -s --connect-timeout 3 http://192.168.0.16 -o /dev/null -w "HTTP %{http_code}\n"
# Expected: HTTP 000

# 2. Proxmox host should be blocked
curl -s --connect-timeout 3 https://192.168.0.200:8006 -o /dev/null -w "HTTP %{http_code}\n"
# Expected: HTTP 000

# 3. FastAPI self-reachability — use loopback (same iptables OUTPUT caveat as CT 105)
curl -s --connect-timeout 3 http://localhost:5000/health -o /dev/null -w "HTTP %{http_code}\n"
# Expected: HTTP 200

# 4. Telegram API reachable (polling mode requires this)
curl -s --connect-timeout 5 https://api.telegram.org -o /dev/null -w "HTTP %{http_code}\n"
# Expected: HTTP 200 or 302

External reachability test — from CT 101 edge-gateway, proves NPM can proxy:

pct enter 101
curl -s --connect-timeout 3 http://192.168.0.56:5000/health -o /dev/null -w "HTTP %{http_code}\n"
# Expected: HTTP 200
exit

Step 6 — Edge-gateway firewall update

Edit /etc/pve/firewall/101.fw on the Proxmox host. Add before the OUT DROP -dest +lan_subnet line:

OUT ACCEPT -dest 192.168.0.56 -p tcp -dport 5000   # Proxy to exzen-staging FastAPI

Step 7 — NPM Proxy Host

Open NPM admin: http://192.168.0.51:81 → Proxy Hosts → Add:

Field Value
Domain Names telegram-staging-api.exzentcg.com
Scheme http
Forward Hostname / IP 192.168.0.56
Forward Port 5000
Block Common Exploits
Websockets Support (FastAPI here uses plain HTTP/JSON only)
SSL Certificate None (Cloudflare terminates TLS)

Step 8 — Cloudflare Tunnel Route

Cloudflare → Zero Trust → Networks → Tunnels → exzentcg-homelab → Published application routes → Add:

Field Value
Subdomain telegram-staging-api
Domain exzentcg.com
Path (empty)
Service Type HTTP
URL localhost:80
HTTP Host Header telegram-staging-api.exzentcg.com

Cloudflare auto-creates the DNS CNAME (proxied).

No Cloudflare Access app for the backend

Per the chosen design, the backend URL is gated only by WEBHOOK_SECRET (matching the existing prod pattern on Oracle Cloud). Anyone reaching the URL without the secret gets a 401 from FastAPI. If this needs tightening later, add an Access application + Service Token for Vercel.
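The shared-secret gate can be pictured as follows. This is a sketch of the general pattern, not the backend's actual middleware. The header name x-webhook-secret is a hypothetical placeholder, because this runbook does not name the header the BFF sends.

```python
import hmac


def authorize(headers: dict, expected_secret: str, header_name: str = "x-webhook-secret") -> int:
    """Return the status the gate would produce: 200 to proceed, 401 to reject.

    header_name is a placeholder; use whatever header the BFF actually sends.
    """
    if not expected_secret:
        return 401  # fail closed if the backend secret is unset
    # HTTP header names are case-insensitive, so normalise before the lookup.
    supplied = {k.lower(): v for k, v in headers.items()}.get(header_name, "")
    # Constant-time comparison avoids leaking how many characters matched.
    ok = hmac.compare_digest(supplied.encode(), expected_secret.encode())
    return 200 if ok else 401


print(authorize({"X-Webhook-Secret": "s3cret"}, "s3cret"))
print(authorize({}, "s3cret"))
```

Because the secret is the only gate on a public URL, failing closed on an empty secret matters: a blank WEBHOOK_SECRET in .env should never authorise a request that also sends a blank header.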

Step 9 — First-time bootstrap (admin channel + product cache)

Two pieces of state live in the SQLite DB and must be re-seeded after any fresh deploy or volume wipe — bot_settings.customer_order_channel_id and the products_cache rows. Without them the Mini App's /api/products returns 503 Service Unavailable and order notifications have nowhere to land.

  1. Bind the admin channel. In the staging Telegram forum group, send /set to the staging admin bot. Tap 🛒 Set Customer Order Channel in the reply — the bot writes the group + thread mapping to bot_settings. Verify by re-sending /set and confirming the configured channel ID is shown.
  2. Refresh the product cache. DM /refresh_products to the staging admin bot. The bot pulls the staging Sheet's Master Inventory + Imports tabs and writes ~370 rows into products_cache. Bot replies with a count.
  3. (Optional) Configure error log + scheduled report destinations via /set's other buttons.

Step 10 — Vercel staging frontend

(Outside the CT — done in the Vercel dashboard.)

  1. Create a new Vercel project linked to ExzenTCG/exzen-core, root directory frontend, OR add a staging git branch on the existing project promoted to a custom domain.
  2. Add custom domain telegram-staging.exzentcg.com.
  3. Set environment variables (Production scope on the staging project / Preview scope for the staging branch):
    • BACKEND_API_URL=https://telegram-staging-api.exzentcg.com
    • WEBHOOK_SECRET=<staging webhook secret> (must match backend)
    • NEXT_PUBLIC_TG_BOT_NAME=ExzenTCG_Staging_bot
    • NEXTAUTH_URL=https://telegram-staging.exzentcg.com
    • NEXTAUTH_SECRET=<staging>
  4. Deploy the staging branch.
  5. In BotFather, set the staging customer bot's Menu Button URL to https://telegram-staging.exzentcg.com.

Step 11 — End-to-end verification

  • [ ] https://telegram-staging-api.exzentcg.com/health returns 200 (no Cloudflare Access challenge)
  • [ ] https://telegram-staging.exzentcg.com loads in a browser (if you see Vercel SSO instead of the page, recheck Deployment Protection in the prereq)
  • [ ] Open the staging customer bot in Telegram on a separate test account, tap the menu button at the bottom of the chat → Mini App opens (opening the URL directly in a browser hits the "must be opened through ExzenTCG miniapp" gate — that's the Mini App's own initData check, working as designed)
  • [ ] Mini App home page renders without "Access Restricted" (if it doesn't, the most likely cause is CUSTOMER_BOT_TOKEN missing on Vercel — auth route returns 500 silently. See prereq.)
  • [ ] Browse products → add a priced item to cart (only the products with non-zero standard_price in the staging Sheet's Imports tab can be checked out; unpriced items make subtotal=0 and the BFF Zod schema rejects with "Invalid request body") → checkout with Self-Collection (free, safe) → order success page renders
  • [ ] If PAYNOW_* is set in .env: in-app QR block renders with the configured recipient. If PAYNOW_* is blank: success page falls through to the legacy "Open Telegram Bot" CTA (safety guard working as designed)
  • [ ] Order appears in the staging admin forum group with approval buttons
  • [ ] Approving the order in the staging group flips the order status in the staging Mini App
  • [ ] Sending a chat message to the staging customer bot from the test account forwards into the order's thread in the staging admin group
  • [ ] No data hits the production Sheet (check Sheet revision history)
  • [ ] No notifications hit the production admin forum group

Customer bot DM behaviour when PAYNOW_* is blank

On checkout, the customer bot tries to send the QR via DM. With PAYNOW_* blank, that DM fires the "❌ Payment system not configured. Please contact support." message — which is alarmist but harmless on staging (the order itself succeeds, this is just the bot's prod-tone fallback). If the noise bothers you, set PAYNOW_MOBILE to a known test number; otherwise ignore.

Step 12 — Snapshot

pct snapshot 106 initial-deploy --description "exzen-staging initial deployment — FastAPI + python-telegram-bot polling mode"

Updates needed elsewhere (post-deploy)

  • [ ] Add exzen-staging entries to homelab/00_secrets/Credentials Index.md:
    • SSH key, root password (CT 106)
    • Staging admin bot token (@zencorehelper_staging_bot)
    • Staging customer bot token (@Zencorestore_staging_bot)
    • Staging WEBHOOK_SECRET
    • Staging Google Sheet ID
    • Staging admin forum group ID (-1003922201568)
    • Vercel staging project + NEXTAUTH_SECRET
  • [ ] Update homelab/01_planning/Phase 1/Architecture Diagram.md — add CT 106 to IP map and Mermaid diagram
  • [ ] Add CT 106 health check to homelab/02_setup_logs/Phase 1 Health Checks.md (curl backend /health, check docker compose ps)
  • [ ] Update Homepage services.yaml on CT 103 with a card for telegram-staging (LAN-internal entry)
  • [ ] Add a row to root index.md "Homelab Services" table
  • [x] ~~Open a PR on ExzenTCG/exzen-core adding backend/Dockerfile + backend/docker-compose.yml~~ — merged as PR #22
  • [ ] Write homelab/02_setup_logs/exzen-staging Setup Log.md while deploying — raw commands, screenshots of NPM/Cloudflare configs, the BotFather setup, and any deviations from this plan

Troubleshooting

| Problem | Cause | Fix |
|---|---|---|
| ghcr.io / pypi.org DNS lookup fails | Router DNS | pct set 106 --nameserver 1.1.1.1 from Proxmox host, reboot CT |
| Container restart-loops with PermissionError: '/app/config/.env' | Bind-mounted file is root:root but container runs as uid 1000 | chown 1000:1000 /opt/exzen-core/backend/config/.env /opt/exzen-core/backend/config/service_account.json |
| Container restart-loops with PermissionError: '/app/logs/shopee_bot.log' | Volume mountpoint inherited root:root | Run the volume-chown loop documented in Step 4 |
| Container restart-loops with sqlite3.OperationalError: no such table: customer_orders | Fresh DB never bootstrapped | docker compose exec -u app exzen-core python -c "from init_db import init; init()" then docker compose restart |
| Bots reach getMe but get Unauthorized from Telegram | ENVIRONMENT=DEV routes to test-server (/bot{token}/test) which rejects regular bot tokens | Set ENVIRONMENT=PROD in .env, recreate container with docker compose up -d --force-recreate (env_file is read at create time, not restart) |
| Mini App opens to "Access Restricted: This app must be opened through ExzenTCG Miniapp" | CUSTOMER_BOT_TOKEN missing on Vercel: auth route returns 500, Mini App stays unauth | Add CUSTOMER_BOT_TOKEN to Vercel Preview env vars (staging branch override), then redeploy |
| Mini App opens to a Vercel SSO popup | Vercel Authentication enabled on Preview | Disable in Project Settings → Deployment Protection (untick "Require Log In") |
| Mini App opens but /api/products returns 503 "Product cache empty" | products_cache table empty on fresh DB | DM /refresh_products to staging admin bot |
| Order created but admin thread not posted | customer_order_channel_id not set in bot_settings | Send /set → "🛒 Set Customer Order Channel" in staging forum group |
| Checkout returns 400 "Invalid request body" (platform: "Expected string, received null") | Schema didn't allow null for platform (older staging branches before fix fc133c5) | git pull origin staging to get the schema fix, redeploy Vercel |
| Checkout returns 400 "Invalid request body" (subtotal: must be positive) | Picked an unpriced product (price=0) | Pick a product with non-zero standard_price from the Imports sheet |
| Bot tokens copy-paste error | Test directly: curl https://api.telegram.org/bot<TOKEN>/getMe should return {"ok":true,...} | Rotate via BotFather, update .env, recreate container |
| Staging order writes hit prod Sheet | GOOGLE_SHEET_ID still pointing at prod | Update .env, recreate container |
| Mini App loads but order create returns 401 | WEBHOOK_SECRET mismatch between Vercel and backend | Match values exactly, redeploy Vercel AND recreate backend container |
| Vercel env-var change doesn't take effect | Vercel doesn't auto-rebuild on env-var change | Push empty commit to staging OR click Redeploy in Vercel UI (untick "Use existing Build Cache") |

Rollback

If the staging environment goes sideways and you need to start over:

# From Proxmox host
pct stop 106
pct rollback 106 initial-deploy
pct start 106

After rollback, check whether the Step 9 first-time bootstrap state survived, and re-run /set and /refresh_products if it did not. Both pieces of state live in the SQLite DB, so they are present only if the initial-deploy snapshot was taken after you bound the admin channel and refreshed products.

Less destructive: wipe just the app's Docker volumes

If you only want to clear orders/spam test data without losing CT-level config:

cd /opt/exzen-core/backend
docker compose down
docker volume rm backend_exzen-data backend_exzen-logs backend_exzen-orders backend_exzen-qr-temp
docker compose up -d --build
# then re-run the volume-chown loop (Step 4) and Step 9 bootstrap

Faster than a full snapshot rollback. Same end state for the app.

If even the snapshot is bad: pct destroy 106 and re-run this runbook from Step 1. Staging is disposable by design — no real data lives here.


Promoting a staging-validated change to production

Once a feature has been exercised on this CT (real order through, no regressions in bot delivery, no console errors, etc.), promote it to production:

1. Merge staging → stablev2 on exzen-core

# On your local exzen-core checkout
git fetch origin
git checkout stablev2
git pull origin stablev2
git merge --ff-only origin/staging   # fast-forward only; aborts instead of creating a merge commit if the branches have diverged
git push origin stablev2

If a fast-forward isn't possible (someone hot-fixed stablev2 directly), resolve by rebasing staging onto stablev2 and force-pushing staging, then re-running the merge.
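Whether the fast-forward will succeed comes down to one condition: the tip of stablev2 must be an ancestor of the tip of staging. The sketch below models that on a toy commit graph. The commit names and helper functions are illustrative only. Against a real checkout, git merge-base --is-ancestor origin/stablev2 origin/staging answers the same question through its exit code.

```python
def is_ancestor(parents: dict, ancestor: str, descendant: str) -> bool:
    """True if `ancestor` is reachable from `descendant` by following parent links."""
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def can_fast_forward(parents: dict, target_tip: str, source_tip: str) -> bool:
    """A fast-forward of target to source works only if target is behind source."""
    return is_ancestor(parents, target_tip, source_tip)


# Toy history: A <- B <- C on staging, plus a hotfix H made directly on top of A.
history = {"A": [], "B": ["A"], "C": ["B"], "H": ["A"]}
print(can_fast_forward(history, target_tip="A", source_tip="C"))  # stablev2 still at A
print(can_fast_forward(history, target_tip="H", source_tip="C"))  # stablev2 hot-fixed to H
```

The second case is the hotfix scenario described above: H is not in staging's history, so --ff-only refuses until staging has been rebased onto stablev2.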

2. Vercel auto-redeploys the production project

Pushing to stablev2 triggers the production Vercel project to rebuild and deploy the new Mini App. No further action needed for the frontend.

3. Restart the Oracle Cloud backend

The backend on Oracle does NOT auto-deploy. SSH in and restart manually:

ssh oracle-prod
cd /path/to/exzen-core/backend
git pull origin stablev2
# Stop current process (Ctrl-C in tmux/screen, or kill the uvicorn process)
# Re-activate venv if needed, then:
python main.py

Sequence matters

If you push to stablev2 without restarting the Oracle backend within a short window, you'll have a brief skew where the new Vercel frontend talks to the old Oracle backend. The Mini App is designed to degrade gracefully (BFF reads only known fields, success view falls back when payment is absent), but minimise the window — restart Oracle as soon as the Vercel build goes green.

4. Smoke-test production

Place a real (small-amount) order from your operator account through the production Mini App. Confirm the QR + bot DM + admin thread + Sheets write all behave as on staging.

5. Tag the release (optional but useful)

git tag -a "release-$(date +%Y-%m-%d)" -m "Promoted from staging: <one-line summary>"
git push origin --tags

Tags make rollback trivial: git checkout <previous-tag> on Oracle, restart.
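Because the tag names embed an ISO date, plain string sorting puts them in chronological order, which makes it easy to find the tag to roll back to. A minimal sketch follows. The helper name is hypothetical and the tag list is made-up sample data. On Oracle the real list would come from git tag --list "release-*".

```python
def previous_release_tag(tags, current):
    """Return the newest release-* tag older than `current`, or None."""
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    older = sorted(t for t in tags if t.startswith("release-") and t < current)
    return older[-1] if older else None


sample_tags = ["release-2026-05-10", "release-2026-06-02", "v1", "release-2026-05-28"]
print(previous_release_tag(sample_tags, "release-2026-06-02"))
```

Two promotions on the same day would produce the same tag name, so a suffix would be needed in that case. That is a limitation of the date-only naming scheme, not of the lookup.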