Phoenix Roadmap

Engineering Roadmap

Where Phoenix is heading

Our direction for the next few years covers performance, hosting and caching, and AI search, sequenced so that every step delivers value on its own.

Status: Decision doc · no code yet
Horizon: Next few years
Themes: Performance · Hosting/Caching · AI Search
Phases: 5 (P0 → P4)
Overview

The roadmap at a glance

Phases by horizon. P0 and P1 start now in parallel; P2 is the critical-path unlock everything later depends on.

Legend: Ready to start · Critical-path unlock · Interim value · Destination. Positions show sequencing and dependencies, not committed dates.
01 — Delivery

The phases

Five phases, ordered to front-load portable work.

Phase 0

  • AI search: add robots.txt, a dynamic sitemap.xml, and llms.txt, with explicit AI-crawler rules (GPTBot, ClaudeBot, OAI-SearchBot, PerplexityBot, Google-Extended). None of these files exist today.
  • Schema expansion: BreadcrumbList, NewsArticle variant, FAQPage, Author/E-E-A-T; RSS feeds; pagination meta (rel=prev/next).
  • RUM monitoring dashboard: capture real-user metrics in the browser — TTFB, LCP and the rest of Core Web Vitals, broken down by device, network, and region — and surface them on a live dashboard with alerting on regressions. This is the baseline every later phase is measured against.
  • Observability baseline (gates everything after): extend the dashboard with CDN hit-ratio, upstream (FOS/StatsPerform) call-volume, and server response latency p50/p99 — so we catch the next 429 before it happens.
Exit: robots.txt, sitemap.xml, and llms.txt resolve; Rich Results and schema validators pass; RUM dashboard is live with real-user Core Web Vitals and alerting.
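
The AI-crawler rules can be sketched as a small generator. This is a minimal illustration, not the planned implementation: `buildRobotsTxt`, the `/api/` disallow rule, and the example.com host are assumptions; only the crawler names come from the list above.

```typescript
// Hypothetical helper: builds robots.txt text with one group per user agent.
// The allow/disallow policy shown here is an assumption, not a decision.
type CrawlerRule = { userAgent: string; allow: string[]; disallow: string[] };

// Crawler names taken from the roadmap bullet.
const AI_CRAWLERS = ["GPTBot", "ClaudeBot", "OAI-SearchBot", "PerplexityBot", "Google-Extended"];

function buildRobotsTxt(siteUrl: string, rules: CrawlerRule[]): string {
  const groups = rules.map((rule) =>
    [
      `User-agent: ${rule.userAgent}`,
      ...rule.allow.map((path) => `Allow: ${path}`),
      ...rule.disallow.map((path) => `Disallow: ${path}`),
    ].join("\n"),
  );
  // The Sitemap line stands outside the user-agent groups.
  const sitemapLine = `Sitemap: ${new URL("/sitemap.xml", siteUrl).href}`;
  return [...groups, sitemapLine].join("\n\n") + "\n";
}

// Example: a default group plus an explicit group for each AI crawler.
const robotsTxt = buildRobotsTxt("https://www.example.com", [
  { userAgent: "*", allow: ["/"], disallow: ["/api/"] },
  ...AI_CRAWLERS.map((userAgent) => ({ userAgent, allow: ["/"], disallow: [] })),
]);
```

Naming each crawler in its own group keeps the policy per bot explicit, so a later decision to block one of them is a one-line change.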

Phase 1

  • Finish the migration from MUI + styled-components to CSS Modules + Radix islands + semantic HTML; delete MUI/emotion (this removes runtime CSS-in-JS, in line with the zero-JS goal).
  • Enforce portability conventions (this is what makes Astro cheap): fetch data at the route boundary, keep leaf components prop-driven/presentational. Fix the standards' Article.tsx example that fetches inline — that pattern is the one thing that won't port to Astro.
  • Replace the end-of-life react-query v3 with TanStack Query v5; push client fetching to the server where possible.
  • Client bundle audit: global providers currently ship to every page; trim them and push the work server-side.
Exit: zero MUI in the bundle; all content types render from app/; components are portable (CSS Modules + Radix islands, no inline data fetching).
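
The portability convention (fetch at the route boundary, keep leaves prop-driven) can be shown without any framework. This is a sketch under assumptions: `ArticleProps`, `ArticleLoader`, `renderArticle`, and `articleRoute` are hypothetical names, and the leaf returns an HTML string only to keep the example self-contained.

```typescript
// Props are plain data: the leaf never imports a fetch client.
type ArticleProps = { headline: string; body: string };

function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Presentational leaf: a pure function of its props, so the same unit can be
// hosted by a Next page today or an Astro template later.
function renderArticle(props: ArticleProps): string {
  return `<article><h1>${escapeHtml(props.headline)}</h1><p>${escapeHtml(props.body)}</p></article>`;
}

// Stand-in for whatever upstream client the route uses (hypothetical).
type ArticleLoader = (slug: string) => Promise<ArticleProps>;

// Route boundary: the only place that talks to the upstream.
async function articleRoute(slug: string, load: ArticleLoader): Promise<string> {
  const props = await load(slug); // fetch once, at the boundary
  return renderArticle(props); // pass data down as props
}
```

Because `renderArticle` has no data dependency of its own, moving it to another host means rewriting only `articleRoute`.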

Phase 2

Hosting: move off Lambda + API Gateway to ECS/Fargate

  • Long-running containers (autoscaling behind an ALB) → native HTML streaming, persistent in-process cache, and no cold-start TTFB tax.
  • Remove the response buffering in prodServer.js; serve Next's streaming handler.
  • Re-enable compression at the right layer (currently compress:false); audit Vary headers for cache correctness.

Caching & resilience

  • Shared cache: stand up ElastiCache/Redis as the cross-instance cache (Fargate makes a persistent in-process L1 viable too → L1 in-process + L2 Redis).
  • Single-flight request coalescing + retry/backoff + circuit breaker in fetchWrapper for all upstreams (FOS, StatsPerform, Apple News) → kills the 429-class incidents. Caching these calls also means RSC prefetch is served from cache instead of hammering FOS.
  • RSC-aware caching: distinguish document vs RSC responses — key/Vary on the RSC header and normalize ?_rsc at the Akamai property (external/DevOps change, sequence with this work).
  • Keep the existing Akamai per-content-type SWR + Edge-Cache-Tag purge strategy unchanged.
Exit: RSC caching is correct (no poisoning or fragmentation); 429s are gone under load; upstream origin calls collapse (cache hit-ratio climbs); the cold-start tax is removed; TTFB is down; streaming is now possible.
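
The resilience bullet combines three techniques, sketched below in one wrapper. This is an illustration under assumptions, not the existing fetchWrapper: `singleFlight`, `withRetry`, `CircuitBreaker`, and `fetchUpstream` are hypothetical names, and the attempt count, delays, and thresholds are placeholder values.

```typescript
// In-flight requests keyed by URL; concurrent callers share one promise.
const inFlight = new Map<string, Promise<unknown>>();

function singleFlight<T>(key: string, work: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>; // assumes one result type per key
  const started = work().finally(() => inFlight.delete(key));
  inFlight.set(key, started);
  return started;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(work: () => Promise<T>, attempts = 3, baseDelayMs = 100): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await work();
    } catch (error) {
      lastError = error;
      if (attempt < attempts - 1) {
        // Exponential backoff with full jitter: 0 .. base * 2^attempt.
        await sleep(Math.random() * baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private readonly threshold: number;
  private readonly coolDownMs: number;

  constructor(threshold = 5, coolDownMs = 10_000) {
    this.threshold = threshold;
    this.coolDownMs = coolDownMs;
  }

  async run<T>(work: () => Promise<T>): Promise<T> {
    const open = this.failures >= this.threshold;
    if (open && Date.now() - this.openedAt < this.coolDownMs) {
      throw new Error("circuit open: upstream call skipped");
    }
    try {
      const result = await work(); // after the cool-down this acts as a probe
      this.failures = 0;
      return result;
    } catch (error) {
      this.failures += 1;
      if (this.failures >= this.threshold) this.openedAt = Date.now();
      throw error;
    }
  }
}

const breaker = new CircuitBreaker();

// Coalesce first, then guard with the breaker, then retry the actual call.
function fetchUpstream<T>(url: string, fetchJson: (url: string) => Promise<T>): Promise<T> {
  return singleFlight(url, () => breaker.run(() => withRetry(() => fetchJson(url))));
}
```

With coalescing outermost, fifty concurrent requests for the same route produce one upstream call, and the breaker counts a failure only after the retries are exhausted. A production version would likely keep one breaker per upstream and limit the number of probes after the cool-down.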

Phase 3

  • loading.tsx + Suspense boundaries per page section/shelf; stop blocking on Promise.all (the concept ports to Astro islands).
  • Partial Prerendering (PPR) — static shell + streamed dynamic holes — only to the extent the Astro timeline justifies; deep PPR investment is Next-specific. Decision gate at Phase 2 exit.
  • Apply ISR where content allows; wire shared-cache reads into render.
Exit: measurable TTFB/LCP improvement in RUM; above-the-fold content paints before slow shelves.
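
The difference between blocking on Promise.all and streaming per section can be shown with a plain async generator. This is a framework-free sketch of the idea, not how React Suspense is implemented: `Section` and `streamPage` are hypothetical names, sections are emitted in completion order, and error handling is omitted.

```typescript
type Section = { id: string; load: () => Promise<string> };

// Yields the shell immediately, then each section as soon as its own data
// resolves, instead of waiting for the slowest one.
async function* streamPage(shell: string, sections: Section[]): AsyncGenerator<string> {
  yield shell; // above-the-fold bytes leave before any shelf data arrives

  // Start every load at once, tagging each promise with its index.
  const pending = new Map<number, Promise<{ index: number; html: string }>>();
  sections.forEach((section, index) => {
    pending.set(index, section.load().then((html) => ({ index, html })));
  });

  while (pending.size > 0) {
    const done = await Promise.race(pending.values());
    pending.delete(done.index);
    yield `<section id="${sections[done.index].id}">${done.html}</section>`;
  }
}
```

A caller would write each yielded chunk to the response as it arrives, so a slow shelf delays only its own markup.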


Phase 4

Cheap by design: components are portable, hosting is Fargate (Astro SSR runs there natively), and observability lets us compare honestly. Routing is built fresh here: a single splat route that resolves the FOS route and renders.

  • Stand up an Astro SSR app on Fargate; build FOS route resolution as a single src/pages/[...slug].astro splat route (resolve once, render the right template). Replaces Next's routing layer wholesale.
  • Port templates: presentational components drop in; async Server Components become .astro frontmatter (mechanical); Radix bits become client:load/client:visible React islands; CSS Modules + design tokens carry over unchanged.
  • Strangler-fig cutover: route one content type at a time through the CDN to Astro vs Next; compare TTFB + CDN hit-ratio against the Phase 0 baseline; expand when green.
  • Decommission Next.
Exit: all content types are on Astro; Next is removed; one HTML representation per URL (the RSC caching problem is gone for good); pages are zero-JS by default; CDN caching is trivial.
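
The strangler-fig cutover reduces to two decisions: which origin serves a content type, and whether the next content type may move. The sketch below is illustrative only; the real routing would live at the CDN, and `pickOrigin`, `isGreen`, `expandCutover`, and the content-type names are assumptions.

```typescript
type Origin = "astro" | "next";

// Content types already cut over to Astro (illustrative value).
const migrated: ReadonlySet<string> = new Set(["article"]);

function pickOrigin(contentType: string, cutOver: ReadonlySet<string>): Origin {
  return cutOver.has(contentType) ? "astro" : "next";
}

// Measurements compared against the Phase 0 baseline.
type Sample = { ttfbMs: number; cdnHitRatio: number };

// "Green" here means: TTFB no worse and CDN hit-ratio no lower than baseline.
function isGreen(baseline: Sample, candidate: Sample): boolean {
  return candidate.ttfbMs <= baseline.ttfbMs && candidate.cdnHitRatio >= baseline.cdnHitRatio;
}

// Returns a new set; the content type is added only when the gate passes.
function expandCutover(
  cutOver: ReadonlySet<string>,
  contentType: string,
  baseline: Sample,
  candidate: Sample,
): Set<string> {
  const next = new Set(cutOver);
  if (isGreen(baseline, candidate)) next.add(contentType);
  return next;
}
```

Keeping the cut-over set as data makes rollback a matter of removing one entry rather than redeploying either app.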
02 — Watch list

Risks & gates

Don't rebuild components twice

Phase 1 portability conventions are the single biggest cost control — land them before bulk component work.

Akamai cache-key changes are external

DevOps-owned; sequence the RSC-aware caching + ?_rsc normalization with them.
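
The intended keying logic can be written down before the Akamai change is scheduled. This is a TypeScript illustration of the rule, not Akamai configuration: `cacheKey` is a hypothetical function, header names are assumed to arrive lower-cased, and Next may vary RSC payloads on further request headers that this sketch ignores.

```typescript
type Representation = "document" | "rsc";

function cacheKey(rawUrl: string, headers: Record<string, string | undefined>): string {
  const url = new URL(rawUrl);
  // Drop the ?_rsc value so it cannot fragment the cache.
  url.searchParams.delete("_rsc");
  // Sort the remaining parameters so their order does not create new keys.
  url.searchParams.sort();
  // Key on the RSC request header so HTML and RSC payloads never collide.
  const representation: Representation = headers["rsc"] === "1" ? "rsc" : "document";
  return `${representation}:${url.origin}${url.pathname}${url.search}`;
}
```

Two RSC requests that differ only in their `_rsc` value map to one key, while the document request for the same URL maps to a different one, which covers both fragmentation and poisoning.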

Fargate is the critical-path unlock

Streaming (P3) and persistent caching depend on it; de-risk with a single-content-type canary before full cutover.

Phase 3 sizing depends on the Astro timeline

Revisit at the Phase 2 exit; keep PPR light if Astro is near.

Astro shared-state caveat

Global config/login context spanning many islands is more awkward than a single React tree; design island boundaries deliberately in Phase 4.
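
One common way around this caveat is a module-level store that every island imports, since islands do not share a React tree. The sketch below is an assumption about how that might look, not a chosen design: `createStore`, `Session`, and `sessionStore` are hypothetical names.

```typescript
type Listener<T> = (value: T) => void;

// Minimal observable store: one shared value, many subscribers.
function createStore<T>(initial: T) {
  let value = initial;
  const listeners = new Set<Listener<T>>();
  return {
    get: () => value,
    set(next: T) {
      value = next;
      listeners.forEach((listener) => listener(value));
    },
    // Returns an unsubscribe function.
    subscribe(listener: Listener<T>) {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// Shared login state; the shape is illustrative.
type Session = { loggedIn: boolean; userName?: string };
const sessionStore = createStore<Session>({ loggedIn: false });
```

A React island could read the store with `useSyncExternalStore(sessionStore.subscribe, sessionStore.get)`, so a login in one island updates the others without a common provider.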

03 — Proof

Verification per phase

P0

robots/sitemap/llms resolve; schema validators pass; dashboards populated.
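
The "resolve" part of this check is easy to automate. A minimal sketch, assuming each file should answer with HTTP 200; `checkDiscoveryFiles` and `StatusFetcher` are hypothetical names, and the fetcher is injected so the check can run against a fake.

```typescript
// Minimal response shape so a fake can be injected in tests.
type StatusFetcher = (url: string) => Promise<{ status: number }>;

const DISCOVERY_PATHS = ["/robots.txt", "/sitemap.xml", "/llms.txt"];

// Returns one message per file that did not resolve; an empty array means pass.
async function checkDiscoveryFiles(baseUrl: string, fetchStatus: StatusFetcher): Promise<string[]> {
  const results = await Promise.all(
    DISCOVERY_PATHS.map(async (path) => {
      const { status } = await fetchStatus(new URL(path, baseUrl).href);
      return status === 200 ? null : `${path} returned ${status}`;
    }),
  );
  return results.filter((failure): failure is string => failure !== null);
}
```

Against a real environment the fetcher can simply be `(url) => fetch(url)`, since a `Response` carries a numeric `status`.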

P1

zero MUI in bundle; a sample migrated component renders unchanged in an Astro + React-island spike (proves portability).

P2

curl -N <url> shows chunked early bytes and the correct representation per RSC header; load test holds FOS origin call-count flat as concurrency rises (no 429); Redis hit-ratio climbs and RSC prefetch is served from cache, not origin; cold-start tax gone.
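
The cache part of this check can be rehearsed locally with a two-tier sketch. It is an illustration under assumptions: `RemoteCache` stands in for a Redis client, `TieredCache` is a hypothetical name, and TTLs, eviction, and serialization are left out.

```typescript
// Async key-value interface; in production the L2 would be a Redis client.
interface RemoteCache {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

class TieredCache {
  private readonly l1 = new Map<string, string>(); // per-instance, in-process
  private readonly l2: RemoteCache; // shared across instances
  private readonly loadFromOrigin: (key: string) => Promise<string>;
  hits = 0;
  misses = 0;

  constructor(l2: RemoteCache, loadFromOrigin: (key: string) => Promise<string>) {
    this.l2 = l2;
    this.loadFromOrigin = loadFromOrigin;
  }

  async get(key: string): Promise<string> {
    const local = this.l1.get(key);
    if (local !== undefined) {
      this.hits++;
      return local;
    }
    const shared = await this.l2.get(key);
    if (shared !== undefined) {
      this.hits++;
      this.l1.set(key, shared); // warm L1 for the next read
      return shared;
    }
    // Miss in both tiers: go to origin once, then fill both tiers.
    this.misses++;
    const fresh = await this.loadFromOrigin(key);
    this.l1.set(key, fresh);
    await this.l2.set(key, fresh);
    return fresh;
  }

  hitRatio(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Two instances sharing one L2 reach the origin once per key, which is the behaviour the load test looks for: the origin call count stays flat while the hit-ratio climbs.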

P3

RUM shows TTFB/LCP improvement; above-the-fold paints before slow shelves.

P4

per-content-type TTFB + CDN hit-ratio beat the Next baseline before each cutover; Next decommissioned with no regression.

Status

Roadmap and decision doc; no code yet. The first concrete implementation work is Phase 0 (can start in parallel now) and Phase 1 (continue the migration with portability conventions enforced). The critical-path technical unlock is Phase 2 (Fargate + Redis + caching/resilience).