Phase 7 Findings — Shopee Connector Overhaul

Closed out 2026-05-17

Phase 7 shipped across 7 feature PRs (#112 PR-1 schema + backfill, #113 PR-2 region/URL derivation, #115 PR-3a token encryption, #116 PR-3b proactive refresh middleware, #117 PR-4 warehouse-location cache, #121 PR-5 profile-based settings UI + legacy cutover, #122 PR-6 diagnostic endpoint + modal) plus five hotfix PRs that surfaced during deploy (#119, #120, #123, #124, #125). All deployed and smoke-validated on CT 105. See Phase 7 Kickoff Plan for the design and D1–D7 decisions.

What shipped

A complete redo of the Shopee connector surface, replacing Phase 6's one-row-per-env_type shape with a multi-profile model and adding operational hardening (encrypted tokens, proactive refresh, warehouse-cache auto-fetch, diagnostics).

Behaviorally:

  • Profiles, not environments. shopee_connector_profile table holds N rows per env_type; exactly one per env is is_active = true (partial unique index). Operators create / clone / activate / delete profiles from /connectors/shopee in the dashboard.
  • Region as a real enum. ShopeeRegion enum covers all 10 Shopee regions plus their TEST_ siblings (19 values). base_url_override is retained as an Advanced/escape-hatch field; the default (region) → base_url derivation handles the normal cases.
  • Derived URLs. push_url and oauth_callback_url are derived at request time from PUBLIC_API_BASE_URL + env_type + profile_id. The dashboard surfaces them read-only for operator registration in Shopee Partner Center; no DB columns hold them.
  • AES-256-GCM at rest for OAuth tokens. PR-3a added access_token_encrypted + refresh_token_encrypted columns; PR-5's migration dropped the plaintext columns once the encrypted path was live. Domain-separated HKDF info strings (shopee-oauth-access-token + refresh-token) keep token encryption isolated from partner_key encryption.
  • Proactive token refresh. tokenRefresher.refreshTokenIfStale() runs inside getShopeeSdk() on every outbound call. A Postgres advisory lock keyed on hashtext('shopee-refresh:' || env || ':' || shop_id) ensures exactly one in-flight refresh per shop; losers poll for the new token (every 200 ms, for up to 5 s) and fall back to the SDK's 401-recovery if the wait times out.
  • Warehouse-location cache. shopee_warehouse_location table caches Shopee's getWarehouseDetail response per profile. Auto-refreshed by a subscriber on shopeeConnectorService.shopee-auth-token.created (first OAuth completion) and operator-triggered via POST /admin/dashboard/connectors/shopee/profiles/:id/refresh-locations.
  • Metadata-only diagnostic surface. GET /admin/dashboard/connectors/shopee/profiles/:id/diagnostics returns the timing + status of the most recent refresh attempt, scopes (when Shopee echoes them), and the error excerpt on failure. Token bytes never leave the server.
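
To illustrate the domain-separated token encryption described above, here is a minimal sketch using Node's built-in crypto module. It is not the code from PR-3a: the function names, the empty salt, the iv/tag/ciphertext layout, and the full refresh label (shopee-oauth-refresh-token) are assumptions made for the example.

```typescript
import { createCipheriv, createDecipheriv, hkdfSync, randomBytes } from "node:crypto";

// Assumed labels: the access label appears in this page; the refresh label is a guess at its sibling.
type TokenInfo = "shopee-oauth-access-token" | "shopee-oauth-refresh-token";

// Derive a 32-byte AES key from the master secret. A different `info` string
// produces an unrelated key, which is what isolates one use from another.
function deriveKey(masterSecret: string, info: TokenInfo): Buffer {
  const salt = Buffer.alloc(0); // assumption: no salt
  return Buffer.from(hkdfSync("sha256", masterSecret, salt, info, 32));
}

// Encrypt with AES-256-GCM; output is base64(iv || authTag || ciphertext) (assumed layout).
function encryptToken(masterSecret: string, info: TokenInfo, plaintext: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(masterSecret, info), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString("base64");
}

// Decrypt; throws if the key, the info label, or the ciphertext does not match.
function decryptToken(masterSecret: string, info: TokenInfo, encoded: string): string {
  const raw = Buffer.from(encoded, "base64");
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(masterSecret, info), raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
}
```

With this shape, a value encrypted under the access-token label fails authentication when decrypted under the refresh-token label, even though both derive from the same master secret.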

Validation

Profile cutover smoke (curl from CT 105, 2026-05-16)

  • GET /admin/dashboard/connectors/shopee/profiles returns both seeded profiles (scp_sandbox configured + active, scp_live empty + active stub).
  • GET /profiles/scp_sandbox returns full credential view with partner_key_fingerprint (v1:9…q0M=·len127), push_partner_key_fingerprint, derived URLs.
  • POST /profiles { display_name, env_type, region, clone_from_id: "scp_sandbox" } clones cleanly; new profile is inactive.
  • POST /profiles/scp_live/activate → 409 Cannot activate profile "Live" — credentials incomplete (partner_key) (pre-flight refused as designed).
  • DELETE /profiles/scp_sandbox → 409 Cannot delete the active "sandbox" profile. Activate a different one first.
  • DELETE /profiles/<clone_id> → 204.
  • Legacy GET /admin/dashboard/connectors/shopee?env=sandbox → 404 (route deleted per cutover plan).

Diagnostic endpoint smoke (2026-05-16)

  • Initial fetch: all last_refresh_* fields null (no refresh fired since PR-6 code deployed).
  • Force-stale expired_at to be inside the 60s buffer + trigger refresh-locations → proactive refresh fires through getShopeeSdk.
  • Re-fetch: last_refresh_attempt_at and refresh_token_last_used_at populated; last_refresh_status: success; access token now valid for another 4h.
  • Endpoint returns only metadata — no access_token / refresh_token strings.
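
The force-stale step relies on a "within the buffer" check. A minimal sketch of such a check follows; the function name, the constant name, and the inclusive boundary are assumptions, and only the 60s buffer comes from the smoke notes above.

```typescript
// The 60s buffer mentioned in the smoke test; the constant name is hypothetical.
const REFRESH_BUFFER_MS = 60_000;

// A token counts as stale once its expiry falls inside the buffer window
// (or has already passed). The inclusive comparison is an assumption.
function isTokenStale(
  expiredAt: Date,
  now: Date = new Date(),
  bufferMs: number = REFRESH_BUFFER_MS,
): boolean {
  return expiredAt.getTime() - now.getTime() <= bufferMs;
}
```

Setting expired_at to any instant less than 60 seconds in the future makes a check of this kind return true, which is what makes the next outbound call trigger a refresh.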

Phase 3 cascade (regression check)

Phase 3's order-update cascade (code=3 ORDER_STATUS_UPDATE → getOrdersDetail → Medusa order workflow) still fires under the Phase 7 image. The encrypted-token + advisory-lock paths are transparent to upstream callers.
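
For readers unfamiliar with the advisory-lock path, the loser side (poll every 200 ms, give up after 5 s) can be sketched roughly as below. This is an illustration only: PollDeps, readExpiry, and waitForFreshToken are hypothetical names, and the clock and sleep are injected so the loop can be exercised without real timers.

```typescript
// Hypothetical dependencies; the real refresher's internals are not shown on this page.
interface PollDeps {
  readExpiry: () => Promise<number>;     // epoch ms expiry of the currently stored token
  sleep: (ms: number) => Promise<void>;  // delay between polls
  now: () => number;                     // current time in epoch ms
}

// Poll until the stored token's expiry moves past the stale value, or time out.
// Returns true if a fresh token appeared; false means the caller should fall
// back to reactive recovery (for example, retrying after a 401).
async function waitForFreshToken(
  deps: PollDeps,
  staleExpiry: number,
  intervalMs = 200,
  timeoutMs = 5_000,
): Promise<boolean> {
  const deadline = deps.now() + timeoutMs;
  while (deps.now() < deadline) {
    await deps.sleep(intervalMs);
    if ((await deps.readExpiry()) > staleExpiry) return true;
  }
  return false;
}
```

Because callers only ever see "a usable token came back" or the existing 401 recovery, this wait stays invisible to upstream code such as the order-update cascade.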

Gotchas + decisions captured in flight

Admin SPA whitescreen — react@18 + react-dom@19 ABI mismatch

Phase 6's dashboard was healthy. After a Dependabot bump landed (react-dom 18.3.1 → 19.2.6 in apps/server/package.json, #66), /app whitescreened with Uncaught TypeError: can't access property "S", $e is undefined at line 372 of the admin bundle. Dependabot only bumped react-dom; react stayed at ^18.3.1, so the bundle initialized react-dom 19 against react 18's internals and the React 19 owner-stack access ($e.S / $e.T) failed against React 18's exports.

Three PRs went to chasing wrong hypotheses (an @swc/core pin per Medusa's whitescreen doc, then Node 22 in Docker). Both produced byte-identical admin bundles (same content-hash filename index-B7IAVb-7.js), which ruled out swc and Node as the cause for our codebase.

Real fix #120: pin react-dom back to ^18.3.1 to match what Medusa 2.15.2's admin SDK + dashboard packages peer-require. New bundle index-YheKpllb.js rendered cleanly. The dashboard app at React 19.2.4 is isolated in its own workspace and unaffected.

Takeaway: when Dependabot opens a major-version bump on only one half of a peer-paired package, reject the PR. react and react-dom must move together.
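
A guard of this kind could be automated in CI. The sketch below is one possible shape, not something that exists in the repo: majorOf and findPeerMajorMismatches are hypothetical helpers, and the version parsing is deliberately naive (it reads the first number that is followed by a dot).

```typescript
// Extract the major version from a range such as "^18.3.1" or "19.2.6".
function majorOf(range: string): number {
  const match = /(\d+)\./.exec(range);
  if (!match) throw new Error(`cannot parse version range: ${range}`);
  return Number(match[1]);
}

// Report every configured pair whose declared major versions differ.
// Pairs with a missing member are skipped rather than flagged.
function findPeerMajorMismatches(
  deps: Record<string, string>,
  pairs: ReadonlyArray<readonly [string, string]>,
): string[] {
  const problems: string[] = [];
  for (const [a, b] of pairs) {
    const va = deps[a];
    const vb = deps[b];
    if (va === undefined || vb === undefined) continue;
    if (majorOf(va) !== majorOf(vb)) problems.push(`${a}@${va} vs ${b}@${vb}`);
  }
  return problems;
}
```

Run against the dependencies block of apps/server/package.json with the pair react / react-dom, a check like this would have flagged the 18 vs 19 split before the bundle was built.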

model.json() columns are deep-merged on update, not replaced

PR-6 set out to drop plaintext token bytes from shopee_auth_token.raw_token_payload. Three iterations failed because of two compounding traps:

  1. { access_token, ...rest } = token doesn't actually strip the fields. The SDK's AccessToken instance carries those fields in a way that survives spread — likely class accessors or non-own enumerables. JSON-roundtrip + delete produces a clean POJO in JS, but…
  2. MedusaService.update<Entity>([{id, json_col: rawSafe}]) deep-merges JSONB. A write of {a, b} over {a, b, leaked} retains leaked on the row.

Verified on staging by injecting access_token: "leaked" into the row via SQL, force-staling the token, and triggering a refresh — the leaked value survived a refresh whose JS write didn't include it.

Fix (#125):

  • Code: write the secret keys as explicit null in raw_token_payload so the merge overrides the existing value with null.
  • Data: a one-shot migration uses Postgres's JSONB - operator to fully drop the keys from any existing row.
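
The merge trap and the explicit-null workaround can be reproduced in isolation. The deepMerge below is a stand-in that mimics the observed behaviour; it is not Medusa's implementation, and nullOutSecrets and SECRET_KEYS are hypothetical names for the example.

```typescript
type Json = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Json {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Stand-in deep merge: keys absent from the patch are kept, nested objects are
// merged recursively, and every other value (including null) overwrites.
function deepMerge(existing: Json, patch: Json): Json {
  const out: Json = { ...existing };
  for (const [key, value] of Object.entries(patch)) {
    const current = out[key];
    out[key] = isPlainObject(current) && isPlainObject(value) ? deepMerge(current, value) : value;
  }
  return out;
}

const SECRET_KEYS = ["access_token", "refresh_token"] as const;

// Write secrets as explicit null so that a merging update overwrites any
// previously stored value instead of silently keeping it.
function nullOutSecrets(payload: Json): Json {
  const safe: Json = { ...payload };
  for (const key of SECRET_KEYS) safe[key] = null;
  return safe;
}
```

Merging a payload that simply omits access_token leaves the old value on the row, while merging nullOutSecrets(payload) replaces it with null. Removing the key entirely still needs the data migration, since a merge can only overwrite.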

Captured to the project memory as Gotcha #4 in reference-medusa-workflow-gotchas — applies to any model.json() column where you need to remove keys, not just raw_token_payload.

Cloudflare Insights SHA-512 mismatch was a red herring

During the whitescreen detour, the browser console showed None of the "sha512" hashes in the integrity attribute match the content of the subresource for static.cloudflareinsights.com/beacon.min.js. The SRI mismatch was real but unrelated — the hash CF served matched the SHA-512 of an empty string, meaning CF was returning empty for the beacon. That doesn't crash React; the $e is undefined error one line down was the actual cause.
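
The empty-body diagnosis is easy to confirm: compute the SRI-style SHA-512 of a zero-length string and compare it with the hash reported in the console. A small sketch using Node's crypto module (sriSha512 is a hypothetical helper name):

```typescript
import { createHash } from "node:crypto";

// Build a Subresource Integrity value ("sha512-" + base64 digest) for a body.
function sriSha512(body: string | Buffer): string {
  return "sha512-" + createHash("sha512").update(body).digest("base64");
}

// The integrity value of an empty response body. If the browser reports this
// hash for a script, the server returned zero bytes for it.
const emptyBodyHash = sriSha512("");
console.log(emptyBodyHash);
```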

Takeaway: when multiple errors print together, the LAST one in the cascade is usually the trigger; the SRI / CORS warnings up top are noise from auxiliary scripts.

deriveOauthCallbackUrl originally pointed at /admin/... — would have required admin auth

PR-2 added a helper that returned a callback URL under /admin/dashboard/connectors/shopee/oauth/callback. That path is auto-protected by Medusa's admin auth middleware — Shopee can't authenticate as a Medusa admin, so the callback would have 401'd.

Fixed during PR-5 by repointing to the public /connectors/shopee/oauth/callback/<env>?profile_id=<id> path that's served by the existing route under apps/server/src/api/connectors/shopee/.... The ?profile_id= query param lets the callback handler pin the target profile active before exchanging the code.
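
A rough sketch of what a request-time derivation of the public callback URL might look like is shown below. The signature, the EnvType union, and the error text are assumptions for the example; only the path shape and the profile_id query param come from the description above.

```typescript
// Assumed env_type values, based on the sandbox and live profiles mentioned on this page.
type EnvType = "sandbox" | "live";

// Derive the public OAuth callback URL from the base URL, env, and profile id.
// Throws when no base URL is available, so callers can render a placeholder.
function deriveOauthCallbackUrl(
  env: EnvType,
  profileId: string,
  baseUrl: string | undefined = process.env.PUBLIC_API_BASE_URL,
): string {
  if (!baseUrl) throw new Error("PUBLIC_API_BASE_URL not set");
  // An absolute path replaces any path component already present on baseUrl.
  const url = new URL(`/connectors/shopee/oauth/callback/${env}`, baseUrl);
  url.searchParams.set("profile_id", profileId);
  return url.toString();
}
```

For example, deriveOauthCallbackUrl("sandbox", "scp_sandbox", "https://tcg-staging.exzentcg.com") yields https://tcg-staging.exzentcg.com/connectors/shopee/oauth/callback/sandbox?profile_id=scp_sandbox, a path outside /admin and therefore outside the admin auth middleware.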

CT 105 missing PUBLIC_API_BASE_URL env var

The PR-2 URL helpers throw if PUBLIC_API_BASE_URL is unset. Staging didn't have it configured, so the per-profile detail endpoint shows (PUBLIC_API_BASE_URL not set) instead of the real derived URLs. Not a code defect — operator config follow-up.

Follow-up #126 now forwards PUBLIC_API_BASE_URL through apps/server/docker-compose.yml into the Medusa container. The host still must set PUBLIC_API_BASE_URL=https://tcg-staging.exzentcg.com in apps/server/.env; the staging runbook is the canonical checklist.

Disk pressure on CT 105 during Docker rebuilds

Two filesystem read-only events during the PR-3a and #119 builds: the LVM thin pool (pve/data) hit 100%, triggering ENOSPC and freezing CT 105's rootfs. The recovery procedure (pct stop → lvextend -L+5G pve/data → fsck.ext4 -fy → pct start → pct fstrim 105) became near-routine during the Phase 7 build cycle. Captured to memory as feedback-lvm-thin-pool-threshold.

CT 105 rootfs grew 20G → 30G over the phase. Worth a periodic pct fstrim 105 from the host now that Docker builds reliably eat 5–10G of layer cache per cycle.

Open follow-ups

  • Ensure PUBLIC_API_BASE_URL is present on CT 105 and future prod (e.g. https://tcg-staging.exzentcg.com) so the per-profile detail page shows real derived URLs instead of the placeholder. Code-side compose forwarding landed in #126; host .env configuration remains an operator step. Same env var needs to land on prod when CT 106 spins up.
  • Capture OAuth scopes on first auth. Shopee's authorize response does carry scope information for some Partner Center apps; the SDK doesn't expose it directly today. A future PR could pull from the raw OAuth callback HTTP response (pre-SDK) and populate shopee_auth_token.scopes so the Diagnostics modal shows what the operator actually has. PR-6's preserve-on-refresh logic is already in place; just need a populator for the initial OAuth.
  • Drop last_refresh_* columns from the model when the scopes story is finalized. Currently the scopes column is always null (never populated by anything) — harmless but noisy.
  • JWT_SECRET rotation tooling — still manual (carried over from Phase 6). With token encryption now also keyed on it, rotation requires re-encrypting partner_key_encrypted, push_partner_key_encrypted, access_token_encrypted, refresh_token_encrypted across all rows. Worth a runbook + a maintenance-mode admin endpoint.
  • Multi-shop per profile — Phase 7 assumes one profile = one shop_id. If a merchant ever wants two shops sharing partner credentials but writing tokens to distinct rows, the schema's shop_id field on shopee_connector_profile becomes the constraint to lift. Not on any current roadmap.
  • Apply Phase 7 to prod (CT 106) when it spins up — same migration chain (20260516180000, …200000, …210000, …220000, …230000, …240000) plus the JSONB-cleanup migration.

Related

  • Phase 7 Kickoff Plan — D1–D7 design decisions
  • Phase 6 Findings — predecessor; Phase 7 consolidated and hardened the multi-environment toggle that Phase 6 introduced
  • Phase 3 Findings — Shopee cascade behavior that Phase 7's encrypted/refresh paths leave intact