Migration Playbook: Enterprise ESP/SMS Replatforms with Zero Downtime

Why replatform—and why “lift & shift” breaks

You replatform for one of four reasons: (1) data gravity changed and the old ESP/SMS is fighting your warehouse; (2) channels expanded (in-app/push) and orchestration needs grew up; (3) compliance & security standards tightened (SSO/MFA, DPAs, 10DLC); or (4) experimentation velocity hit a ceiling. All good reasons, none of which justify breaking deliverability or consent because you were in a hurry.

The lie of “lift & shift” is that messaging is files and settings. It’s not. It’s identity, consent, placement, and timing—living things that won’t survive a copy-paste. Treat replatforming like a change-control exercise, not a design sprint. Build plumbing first; paint after.

The risk map: where migrations actually fail

  • Deliverability drift: new domain without warm-up; DMARC misalignment; image-only templates; mailing unengaged during “big weeks.”
  • Identity fracture: external_id lost; email/phone merged incorrectly; anonymous sessions not stitched; duplicate profiles ballooning suppressions.
  • Consent regressions: CSV imports with mystery opt-in states; mixed jurisdictions; SMS 10DLC campaigns mismatched to actual traffic; missing HELP/STOP/quiet hours.
  • Event entropy: “Added to Cart” means five things; timestamps drift; attributions garbled; flows firing twice.
  • Ops chaos: no send-freeze; no rollback; last-minute approvals; no owner to push the big red button.

This playbook is designed to neutralize each category before it gets expensive.

Week 0 audit: inventory, identity, consent, and data

Start with a humble spreadsheet. Your future self will call to thank you.

Inventory (flows, campaigns, assets)

  • Flows: name, purpose, triggers, segments, languages, dynamic content, owner, KPI.
  • Campaign cadences: volume by cohort, suppression rules, add-ons (AMP, animations).
  • Assets: template partials, language packs, UGC blocks, product recommendation logic.

Identity contract

  • Primary key per channel (email, phone, app id), external_id (CRM/warehouse).
  • Anonymous → known stitching: how, where, when; cookie/session id mapping.
  • De-duplication rules (email normalization, phone E.164, domain aliases).

Consent contract

  • Email: opt-in source, timestamp, jurisdiction; double opt-in where required.
  • SMS: brand & campaign registrations (10DLC), HELP/STOP keywords, quiet hours on profile.
  • Tools: Dataships audit or OneTrust/Ketch/Transcend preference center + DSAR pipeline.
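
The email line of the consent contract can be made checkable with a small validator that flags any record missing a source, timestamp, or jurisdiction. This is a minimal sketch: the field names and the `double_opt_in_regions` parameter are illustrative assumptions, and which jurisdictions actually require double opt-in is a question for your privacy lead, not for this code.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class EmailConsent:
    # Hypothetical field names; map them to your own consent table.
    email: str
    opt_in_source: Optional[str]
    opt_in_at: Optional[datetime]
    jurisdiction: Optional[str]
    double_opt_in_confirmed: bool = False


def consent_gaps(record: EmailConsent, double_opt_in_regions: set[str]) -> list[str]:
    """Return the reasons a record fails the consent contract (empty list = clean)."""
    gaps = []
    if not record.opt_in_source:
        gaps.append("missing opt-in source")
    if record.opt_in_at is None:
        gaps.append("missing opt-in timestamp")
    if not record.jurisdiction:
        gaps.append("missing jurisdiction")
    elif record.jurisdiction in double_opt_in_regions and not record.double_opt_in_confirmed:
        gaps.append("double opt-in required but not confirmed")
    return gaps
```

Running this over a CSV export before import turns "mystery opt-in states" into a countable list you can fix or suppress.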

Data audit

  • Events (names, schemas, timestamps, ids): Viewed Product, Add to Cart, Checkout Started, Placed Order, Refund Issued, Subscription events.
  • Properties: zero-party data, ZPD (primary_goal, variant_pref), loyalty (points_to_next_reward, tier), geo/timezone.
  • Retention & deletion: what do you keep, for how long, and how do you prove deletion?

Stack patterns: Klaviyo, Braze, Attentive, Postscript, Cordial

Tools are opinions about how work should happen. Respect their biases.

Klaviyo

  • Strengths: Shopify gravity, fast lifecycle, email+SMS in one brain, dynamic blocks, straightforward A/Bs.
  • Watch-outs: multilingual requires language packs/partials; advanced experimentation means workarounds in the tool or bandit logic run in the warehouse.
  • Migration notes: replicate flows with improvements, not just copies; map event names precisely; keep persistent holdouts.

Braze

  • Strengths: multi-channel (email/SMS/push/in-app), robust experimentation, profile APIs.
  • Watch-outs: needs warehouse/CDP discipline; QA pipeline heavier.
  • Migration notes: enforce external_id early; align catalog feeds; port segmentation logic to Braze audience builder.

Attentive / Postscript (SMS)

  • Attentive: growth tooling, compliance guardrails, journey builder, analytics. Enterprise-friendly.
  • Postscript: lean, deep Shopify hooks, fast to iterate.
  • Migration notes: 10DLC brand/campaigns first; map opt-ins/opt-outs; quiet hours on profile; test HELP/STOP.

Cordial

  • Strengths: API-forward, flexible data model, scale.
  • Watch-outs: bring your own experimentation rigor; explicit data contracts needed.

Warehouse & Reverse ETL: Snowflake/BigQuery + dbt + Hightouch/Census is the universal adapter. Keep your identity and risk in your house; let orchestration tools do the last mile.

Deliverability prerequisites: domains, DMARC, warm-up, seeds

Placement is a license, not a feeling. Treat it like change-controlled infrastructure.

  1. Dedicated sender domain: e.g., news.yourbrand.com. Configure SPF/DKIM, align DMARC. Start on p=none, move to quarantine then reject after stabilization.
  2. Tracking CNAMEs: branded click/open domains to avoid mismatched-domain flags.
  3. Engagement band policy: band profiles by days since last engagement (0–30 / 31–60 / 61–90); sunset after two re-engagement touches.
  4. Warm-up plan: 2–3 weeks of banded sends; no blasts to unengaged; proof-first lifecycle before promos.
  5. Seed/panel baselines: measure placement before changes so you know what “good” was.
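
The band policy and warm-up plan in items 3 and 4 can be sketched as two small functions. The band edges come from the list above; the rule of adding one band per warm-up week (most engaged first) is an assumption for illustration, so adjust it to your own volume ramp.

```python
from datetime import date
from typing import Optional

# Band edges follow the 0-30 / 31-60 / 61-90 day policy in item 3.
BANDS = [(30, "0-30"), (60, "31-60"), (90, "61-90")]


def engagement_band(last_engaged: Optional[date], today: date) -> str:
    """Classify a profile by days since its last engagement."""
    if last_engaged is None:
        return "unengaged"
    days = (today - last_engaged).days
    for upper, label in BANDS:
        if days <= upper:
            return label
    return "unengaged"  # past 90 days: candidate for sunset


def warmup_audience(week: int) -> list[str]:
    """Bands eligible to mail in a given warm-up week (1-indexed).

    Assumption: one band is added per week, most engaged first.
    """
    return [label for _, label in BANDS[: max(0, min(week, len(BANDS)))]]
```

Profiles that land in "unengaged" never enter the warm-up audience, which is the point of the "no blasts to unengaged" rule.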

Event & identity mapping (with examples)

Write a dictionary: left column = old platform event/property, right column = new platform schema, with id and timestamp rules.

Legacy              | Target              | Notes
Checkout Started    | checkout_started    | Ensure cart_id present; dedupe rapid retries by cart_id+timestamp
Placed Order        | order_completed     | Include order_id, order_value, currency, items[] (sku, qty, price)
Refund Issued       | refund_issued       | Emit negative revenue if your attribution expects net sales
Subscription Paused | subscription_paused | Key to reason-based saves; store reason enum
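
A dictionary like the one above can be applied as a translation step in the pipeline. The sketch below renames legacy events and drops rapid checkout retries for the same cart. The event dict shape (`event`, `timestamp`, `cart_id` keys) and the 60-second retry window are assumptions for illustration, not values from either platform.

```python
from datetime import datetime, timedelta

# Legacy event name -> target event name, taken from the mapping dictionary.
EVENT_MAP = {
    "Checkout Started": "checkout_started",
    "Placed Order": "order_completed",
    "Refund Issued": "refund_issued",
    "Subscription Paused": "subscription_paused",
}


def translate(events: list[dict], retry_window: timedelta = timedelta(seconds=60)) -> list[dict]:
    """Rename legacy events and drop rapid checkout retries for the same cart."""
    out = []
    last_checkout: dict[str, datetime] = {}  # cart_id -> timestamp of last kept event
    for ev in sorted(events, key=lambda e: e["timestamp"]):
        name = EVENT_MAP.get(ev["event"])
        if name is None:
            # Fail loudly: an unmapped event is a gap in the dictionary.
            raise KeyError(f"unmapped legacy event: {ev['event']}")
        if name == "checkout_started":
            cart_id = ev["cart_id"]  # the mapping notes require cart_id
            prev = last_checkout.get(cart_id)
            if prev is not None and ev["timestamp"] - prev < retry_window:
                continue  # retry inside the window: skip it
            last_checkout[cart_id] = ev["timestamp"]
        out.append({**ev, "event": name})
    return out
```

Raising on unmapped names is a deliberate choice here: silent pass-through is how "Added to Cart" ends up meaning five things.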

Identity merge rules (pseudo)

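-- Normalize email: lowercase and trim
email_norm = LOWER(TRIM(email))
-- Optional, Gmail only: remove dots from the local part (other providers may treat dots as significant)
IF email_norm ENDS WITH '@gmail.com':
    email_norm = REPLACE(LOCAL_PART(email_norm), '.', '') || '@gmail.com'

-- Normalize phone (TO_E164 stands in for your phone-parsing library or UDF)
phone_e164 = TO_E164(phone_raw, country_hint)

-- Merge policy: first non-empty identifier wins
customer_key = COALESCE(NULLIF(external_id, ''), NULLIF(email_norm, ''), phone_e164)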
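
For teams that prototype the merge rules outside the warehouse, the same policy can be sketched in plain Python. The phone handling here is a deliberate simplification (digits only, with an assumed default country code for 10-digit numbers); a production pipeline should use a proper phone-parsing library. All function names are hypothetical.

```python
import re
from typing import Optional


def normalize_email(email: Optional[str]) -> Optional[str]:
    """Lowercase and trim; strip dots from the local part for gmail.com only."""
    if not email or "@" not in email:
        return None
    local, _, domain = email.strip().lower().rpartition("@")
    if not local or not domain:
        return None
    if domain == "gmail.com":
        local = local.replace(".", "")
    return f"{local}@{domain}"


def normalize_phone(raw: Optional[str], default_country_code: str = "1") -> Optional[str]:
    """Simplified E.164-style formatting; not a substitute for a phone library."""
    if not raw:
        return None
    has_plus = raw.strip().startswith("+")
    digits = re.sub(r"\D", "", raw)
    if not digits:
        return None
    # Assumption: a 10-digit number without "+" is national and lacks its country code.
    if not has_plus and len(digits) == 10:
        digits = default_country_code + digits
    return "+" + digits


def customer_key(external_id: Optional[str], email: Optional[str], phone: Optional[str]) -> Optional[str]:
    """First available identifier wins: external_id, then email, then phone."""
    return external_id or normalize_email(email) or normalize_phone(phone)
```

Because empty strings are falsy in Python, blank identifiers fall through to the next candidate instead of becoming the key.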
    

Flow migration strategy: clone, improve, or redesign?

Everything in a replatform begs you to copy. Resist. Use migration to fix what your future team will hate.

  1. Clone when the logic is good and dependencies are stable (e.g., transactional notices).
  2. Improve when you can add a proof-first block, progress header (“You’re {{points_to_next_reward}} from $10 off”), or better segmentation.
  3. Redesign when the model wants it (Braze Canvas, Cordial data flows) or when multilingual needs require language packs instead of duplicated flows.

Start with the spine: Post-Purchase → Second-Purchase Accelerator → Replenishment → Winback → Subscription Saves. Those five pay rent.

Parallel sends & placement validation

There is no zero-downtime without parallelism. For two weeks, run critical flows in both platforms, mailing bands of engaged users. Measure, don’t hope.

  • Seed/panel: daily snapshots; compare inbox rates; don’t chase single-day noise.
  • Complaint dashboards: track per-domain; pause promos if Gmail moves.
  • Holdouts: keep message-level controls for pilot flows; report holdout-adjusted RPR (revenue per recipient) and conversion.
  • Diff logs: when platforms disagree (segment size, dynamic content), log the diff and fix before cutover.
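
A diff log for segment sizes can be as simple as comparing two dictionaries of counts exported from each platform. This is a sketch under assumptions: the 2% tolerance and the segment names in the test are illustrative, and `segment_diffs` is a hypothetical helper, not a feature of any ESP.

```python
def segment_diffs(legacy: dict[str, int], new: dict[str, int], tolerance: float = 0.02) -> list[dict]:
    """List segments whose sizes differ by more than `tolerance` (relative to legacy)."""
    diffs = []
    for name in sorted(set(legacy) | set(new)):
        old_n, new_n = legacy.get(name), new.get(name)
        if old_n is None or new_n is None:
            diffs.append({"segment": name, "legacy": old_n, "new": new_n,
                          "reason": "missing on one platform"})
            continue
        delta = abs(new_n - old_n) / max(old_n, 1)  # guard against empty legacy segments
        if delta > tolerance:
            diffs.append({"segment": name, "legacy": old_n, "new": new_n,
                          "reason": f"size differs by {delta:.1%}"})
    return diffs
```

An empty result is the condition to aim for before cutover; anything else goes into the log with an owner.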

Cutover day playbook & rollback

Go/no-go criteria

  • New domain warm-up stable; seed/panel equal or better than baseline.
  • Complaint rates ≤0.08% at Gmail; stable elsewhere.
  • Critical flows validated with test profiles; no duplicate sends; timestamps correct.
  • Consent & identity tables reconcile (spot-checks across cohorts).
  • Rollback path signed (who presses it; which flows revert; DNS/ESP toggles).
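
The go/no-go criteria can be encoded so the review produces a list of blockers rather than a debate. A minimal sketch: the 0.08% Gmail ceiling comes from the criteria above, while the `CutoverSnapshot` fields are hypothetical names for numbers you would pull from your own dashboards.

```python
from dataclasses import dataclass


@dataclass
class CutoverSnapshot:
    # Hypothetical field names; rates are fractions (0.0005 means 0.05%).
    gmail_complaint_rate: float
    seed_inbox_rate: float       # current seed/panel inbox placement
    baseline_inbox_rate: float   # measured before any changes
    duplicate_sends: int         # found during test-profile validation
    consent_mismatches: int      # rows failing reconciliation spot-checks
    rollback_owner: str          # who is signed off to trigger rollback


GMAIL_COMPLAINT_CEILING = 0.0008  # 0.08%, from the criteria above


def go_no_go(s: CutoverSnapshot) -> tuple[bool, list[str]]:
    """Return (go, blockers). Any blocker means no-go."""
    blockers = []
    if s.seed_inbox_rate < s.baseline_inbox_rate:
        blockers.append("seed/panel placement below baseline")
    if s.gmail_complaint_rate > GMAIL_COMPLAINT_CEILING:
        blockers.append("Gmail complaint rate above 0.08%")
    if s.duplicate_sends > 0:
        blockers.append("duplicate sends in flow validation")
    if s.consent_mismatches > 0:
        blockers.append("consent/identity tables do not reconcile")
    if not s.rollback_owner.strip():
        blockers.append("no signed rollback owner")
    return (not blockers, blockers)
```

The same thresholds can double as rollback triggers after cutover, so the team is not inventing criteria mid-incident.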

Day-of checklist

  • Communication sent: “change freeze” notice; on-call matrix; escalation channel.
  • Disable legacy triggers incrementally; enable new triggers; monitor diffs.
  • Seed/panel check at T+1h, T+4h, T+24h; complaint watch continuous.
  • Post a 24-hour cutover report: flows enabled, issues, mitigations, next checks.

Rollback

Rollback is not failure; it’s discipline. If complaints spike or seeds tank, revert flows, pause promos, fix, and try again. Document why.

Stabilization weeks: what to watch first

  • Deliverability: complaint by domain, read-time proxies, seed/panel trendlines.
  • Revenue: holdout-adjusted RPR (flows vs. campaigns), AOV and conversion where applicable.
  • Retention: 30-day second-purchase rate for cohorts first exposed in new platform.
  • SMS: opt-out rate per send; TCR (The Campaign Registry) flag status; quiet-hour compliance rate.
  • Ops: incident log, QA misses, approval times; retro after 2 weeks.
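
The article does not spell out a formula for holdout-adjusted RPR. One common reading, assumed here, is incremental revenue per recipient: RPR for the messaged group minus RPR for the holdout group over the same window. The function names below are hypothetical.

```python
def revenue_per_recipient(revenue: float, recipients: int) -> float:
    """Plain RPR; returns 0.0 for an empty group rather than dividing by zero."""
    return revenue / recipients if recipients else 0.0


def holdout_adjusted_rpr(treated_revenue: float, treated_n: int,
                         holdout_revenue: float, holdout_n: int) -> float:
    """Incremental revenue per recipient: treated RPR minus holdout RPR.

    Assumption: both groups are measured over the same attribution window.
    """
    return (revenue_per_recipient(treated_revenue, treated_n)
            - revenue_per_recipient(holdout_revenue, holdout_n))
```

For example, 9,000 messaged profiles generating 13,500 in revenue against a 1,000-profile holdout generating 1,200 gives an RPR of 1.50 versus 1.20, so the flow is credited with about 0.30 per recipient rather than the full 1.50.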

Multilingual & regional cutovers without chaos

Duplicating flows per language is how programs drown. Use language packs and partials—translators edit keys, not logic. For RTL (Arabic/Hebrew), flip containers (dir="rtl"), mirror icons, verify fonts, and run extra device QA. Consent text must be localized; quiet hours must respect local timezones. Roll regions in waves: one market per week, not five in a day.
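
The "translators edit keys, not logic" idea can be sketched as a keyed lookup with fallback to a default language, plus a direction flag for RTL locales. The pack contents, key names, and helper functions are all hypothetical; real language packs would live in your ESP's partials or a translation file.

```python
RTL_LANGS = {"ar", "he"}
DEFAULT_LANG = "en"

# Hypothetical language packs: translators edit values, never flow logic.
PACKS = {
    "en": {
        "cta.reorder": "Reorder now",
        "header.points": "You're {points} points from your next reward",
    },
    "ar": {"cta.reorder": "أعد الطلب الآن"},
}


def base_lang(locale: str) -> str:
    """Reduce a locale such as 'ar-SA' or 'ar_SA' to its language code."""
    return locale.replace("_", "-").split("-")[0].lower()


def text_dir(locale: str) -> str:
    return "rtl" if base_lang(locale) in RTL_LANGS else "ltr"


def render(key: str, locale: str, **values) -> str:
    """Look up a key in the locale's pack, falling back to the default language."""
    pack = PACKS.get(base_lang(locale), {})
    template = pack.get(key) or PACKS[DEFAULT_LANG][key]
    return template.format(**values)


def container_open(locale: str) -> str:
    """Opening tag for a template container with the right direction."""
    return f'<div dir="{text_dir(locale)}" lang="{base_lang(locale)}">'
```

The fallback keeps a flow sendable while a translation is missing, and a report of which keys fell back gives translators their to-do list.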

RACI, SLAs, and change control that prevent fire drills

Task                     | R                   | A                 | C                   | I
Domain & DMARC setup     | Deliverability lead | Head of Lifecycle | IT/Sec              | Marketing
Event & identity mapping | Data engineer       | Head of Data      | ESP/SMS admins      | Lifecycle
Consent reconciliation   | Privacy lead        | DPO/Legal         | Lifecycle           | IT/Sec
Parallel send validation | Producer            | Head of Lifecycle | Deliverability      | Stakeholders
Cutover & rollback       | Producer            | Head of Lifecycle | Deliverability/Data | Marketing/Legal

Change-freeze clause (SOW excerpt)

During risk windows (warm-up, parallel sends, cutover, incidents), a change freeze is in effect. Only pre-approved critical fixes may deploy. All other changes are queued. Exceptions require joint approval by Client Lifecycle Lead and Agency Producer.

QA & Send SLA (excerpt)

Agency provides [X] business hours for QA before any scheduled send: device rendering, link/UTM validation, segmentation checks, accessibility (alt text, contrast). No template changes within [Y] minutes of send. Client consolidates feedback through one approver.

45/60-day timeline (milestones by day range)

Days 1–7: Audit & prerequisites

  • Inventory flows/campaigns; identity/consent contracts documented.
  • Dedicated sender domain provisioned; SPF/DKIM/DMARC configured (p=none).
  • Engagement band policy and sunset enforcement turned on in legacy platform.
  • Seed/panel baseline runs; complaint dashboards by domain configured.
  • 10DLC brand/campaign registrations confirmed; HELP/STOP tested.

Days 8–21: Warm-up & mapping

  • Warm-up: engaged cohorts only; lifecycle > promos.
  • Event/identity mapping: schemas in dbt; sample payloads end-to-end; external_id stitched.
  • Consent reconciliation: imports with source/timestamp/jurisdiction; fix mixed states.
  • Template refactor: real text, language packs, accessibility passes.

Days 22–35: Parallel sends & fixes

  • 2–3 critical flows in parallel; segment-size diffs investigated.
  • Seeds: daily; complaints: continuous; anomalies triaged and fixed.
  • Holdout-adjusted RPR collected for pilot flows; document “what changed.”

Days 36–42: Cutover

  • Go/no-go review; cutover plan with timestamps; on-call schedule.
  • Disable legacy triggers; enable new; monitor T+1h, T+4h, T+24h.
  • Rollback if thresholds breached (complaints/placement); retro if executed.

Days 43–60: Stabilization

  • Weekly readout: RPR (flows/campaigns), complaints, seeds, P2-rate (30-day second-purchase rate), SMS opt-outs.
  • Backlog: lift winners into more flows; retire parity builds that aren’t needed.
  • Move DMARC from p=none to quarantine, then to reject, when confident.

Case snapshots: three migrations, three lessons

CPG (Klaviyo → Braze; Attentive retained)

Problem: app messaging needed; data lived in warehouse; legacy flows duplicated per language. Fix: external_id stitch + language packs; warehouse → Braze via Hightouch; parallel for 18 days. Outcome (day 45 vs baseline): complaint steady, inbox unchanged; +7% holdout-adjusted RPR on post-purchase; -22% build hours monthly due to partials.

Apparel (Cordial → Klaviyo; Postscript → Attentive)

Problem: slow experiments; SMS complaints; consent states inconsistent. Fix: Dataships audit; 10DLC re-registration; warm-up new domain; proof-first templates. Outcome: -34% SMS opt-outs, +9% RPR in second-purchase, complaint ≤0.05% Gmail.

Wellness (Klaviyo → Klaviyo consolidation across regions)

Problem: six accounts, duplicated flows, no RTL support. Fix: consolidate to one global pattern; lang packs; dir="rtl" for MEA; staged regional cutovers. Outcome: -40% maintenance hours, +6 pts placement in problematic region, no downtime.

FAQ

Can we migrate in two weeks?

Not safely. You can start in two weeks. Safe cutovers need warm-up, parallel, and validation. If someone promises “done in a week,” they’re selling open rates, not placement.

Do we need a CDP to replatform?

No. A warehouse + dbt + reverse ETL is enough for most enterprises. Add a CDP when you have app events and complex audience choreography.

How much of our calendar should we freeze?

During warm-up and cutover, freeze blasts to unengaged and limit promos to engaged cohorts. Flows can run; focus on proof-first content.

What’s the fastest way to tank placement?

Mail unengaged on a brand-new domain, ignore DMARC, send image-only templates, and “blast because it’s a big week.” Or you could not.

How do we prove we didn’t lose money?

Holdout-adjusted RPR (flows vs. campaigns), complaint by domain, seed/panel trendlines, and P2-rate for cohorts first exposed in the new platform. One weekly slide. No screenshots of pretty carousels.
