How do you replatform ESP/SMS without downtime?

Warm up a dedicated sender domain with DMARC alignment, enforce engagement band policy, run parallel sends for 2–3 critical flows with seed/panel placement checks and complaint monitoring by domain, reconcile identity and consent (including 10DLC), and only cut over when go/no-go criteria are met. Keep a rollback plan. Stabilize for two weeks with weekly readouts.

What are the biggest migration risks?

Deliverability drift (new domain, no warm-up), identity fracture (lost external_id, duplicates), consent regressions (mixed opt-in states, 10DLC mismatch), event entropy (inconsistent schemas), and operational chaos (no freeze/rollback).

What should we measure during cutover?

Seed/panel inbox placement, complaint rate by domain, holdout-adjusted RPR (flows vs campaigns), 30-day second-purchase rate for touched cohorts, and SMS opt-out rate per send.

How long should a migration take?

A safe mid-complexity cutover typically spans about 6 weeks (5 to 7 once the ranges are added up): 1–2 weeks for warm-up and mapping, 2 weeks for parallel sends, 1 week for cutover, and 1–2 weeks for stabilization. Complex data or multilingual programs add time.

Migration Playbook: Enterprise ESP/SMS Replatforms with Zero Downtime

November 17, 2025

Do this today: write a one-page Cutover Reality Check before you schedule a single migration meeting.

Business dials for day 0 → day 45: holdout-adjusted RPR (flows vs. campaigns), complaint rate by domain, seed/panel placement, second-purchase rate for exposed cohorts, and SMS opt-out/10DLC health.
Risk floor: dedicated sender domain + DMARC (already aligned), engagement band policy, send-freeze windows, rollback owner.
Identity contract: what is the customer key (email, phone, external_id)? how will you unify anonymous → known?
Consent contract: where does email/SMS consent live? source of truth? jurisdiction logic? (Dataships/OneTrust/Ketch/Transcend acceptable.)
Pilot scope: parallel sends for 2–3 critical flows, seed/panel checks, go/no-go criteria, and a kill switch.

If a replatform plan can’t answer these five in one page, you’re not migrating—you’re guessing.

Why replatform—and why “lift & shift” breaks

You replatform for one of four reasons: (1) data gravity changed and the old ESP/SMS is fighting your warehouse; (2) channels expanded (in-app/push) and orchestration needs grew up; (3) compliance & security standards tightened (SSO/MFA, DPAs, 10DLC); or (4) experimentation velocity hit a ceiling. All good reasons, none of which justify breaking deliverability or consent because you were in a hurry.

The lie of “lift & shift” is that messaging is files and settings. It’s not. It’s identity, consent, placement, and timing—living things that won’t survive a copy-paste. Treat replatforming like a change-control exercise, not a design sprint. Build plumbing first; paint after.

The risk map: where migrations actually fail

Deliverability drift: new domain without warm-up; DMARC misalignment; image-only templates; mailing unengaged during “big weeks.”
Identity fracture: external_id lost; email/phone merged incorrectly; anonymous sessions not stitched; duplicate profiles ballooning suppressions.
Consent regressions: CSV imports with mystery opt-in states; mixed jurisdictions; SMS 10DLC campaigns mismatched to actual traffic; missing HELP/STOP/quiet hours.
Event entropy: “Added to Cart” means five things; timestamps drift; attributions garbled; flows firing twice.
Ops chaos: no send-freeze; no rollback; last-minute approvals; no owner to push the big red button.

This playbook is designed to neutralize each category before it gets expensive.

Week 0 audit: inventory, identity, consent, and data

Start with a humble spreadsheet. Your future self will call to thank you.

Inventory (flows, campaigns, assets)

Flows: name, purpose, triggers, segments, languages, dynamic content, owner, KPI.
Campaign cadences: volume by cohort, suppression rules, add-ons (AMP, animations).
Assets: template partials, language packs, UGC blocks, product recommendation logic.

Identity contract

Primary key per channel (email, phone, app id), external_id (CRM/warehouse).
Anonymous → known stitching: how, where, when; cookie/session id mapping.
De-duplication rules (email normalization, phone E.164, domain aliases).

Consent contract

Email: opt-in source, timestamp, jurisdiction; double opt-in where required.
SMS: brand & campaign registrations (10DLC), HELP/STOP keywords, quiet hours on profile.
Tools: Dataships audit or OneTrust/Ketch/Transcend preference center + DSAR pipeline.
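
A minimal sketch of how the email consent contract above could be checked before any import. The field names (`opt_in_source`, `opt_in_timestamp`, `jurisdiction`) are hypothetical and not taken from any particular platform; the point is only that a record missing one of the three should not be imported as opted-in.

```python
from datetime import datetime

# Hypothetical field names; adapt them to your own consent export.
REQUIRED_FIELDS = ("opt_in_source", "opt_in_timestamp", "jurisdiction")


def consent_gaps(record: dict) -> list[str]:
    """Return the reasons a consent record should not be imported as opted-in."""
    gaps = [f"missing {name}" for name in REQUIRED_FIELDS if not record.get(name)]
    ts = record.get("opt_in_timestamp")
    if ts:
        try:
            datetime.fromisoformat(ts)  # expects an ISO 8601 string
        except (TypeError, ValueError):
            gaps.append("unparseable opt_in_timestamp")
    return gaps


clean = {
    "email": "a@example.com",
    "opt_in_source": "checkout_form",
    "opt_in_timestamp": "2025-03-01T14:05:00+00:00",
    "jurisdiction": "DE",
}
mystery = {"email": "b@example.com", "opt_in_timestamp": "sometime in 2023"}

print(consent_gaps(clean))    # an empty list: safe to import
print(consent_gaps(mystery))  # lists every gap that needs a human decision
```

Records with gaps are better routed to a review queue than silently defaulted to opted-in or opted-out.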

Data audit

Events (names, schemas, timestamps, ids): Viewed Product, Add to Cart, Checkout Started, Placed Order, Refund Issued, Subscription events.
Properties: zero-party data, ZPD (primary_goal, variant_pref), loyalty (points_to_next_reward, tier), geo/timezone.
Retention & deletion: what do you keep, for how long, and how do you prove deletion?

Stack patterns: Klaviyo, Braze, Attentive, Postscript, Cordial

Tools are opinions about how work should happen. Respect their biases.

Klaviyo

Strengths: Shopify gravity, fast lifecycle, email+SMS in one brain, dynamic blocks, straightforward A/Bs.
Watch-outs: multilingual requires language packs/partials; advanced experimentation via hacks or bandits in warehouse.
Migration notes: replicate flows with improvements, not just copies; map event names precisely; keep persistent holdouts.

Braze

Strengths: multi-channel (email/SMS/push/in-app), robust experimentation, profile APIs.
Watch-outs: needs warehouse/CDP discipline; QA pipeline heavier.
Migration notes: enforce external_id early; align catalog feeds; port segmentation logic to Braze audience builder.

Attentive / Postscript (SMS)

Attentive: growth tooling, compliance guardrails, journey builder, analytics. Enterprise-friendly.
Postscript: lean, deep Shopify hooks, fast to iterate.
Migration notes: 10DLC brand/campaigns first; map opt-ins/opt-outs; quiet hours on profile; test HELP/STOP.
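
As an illustration of "quiet hours on profile", here is a small sketch that checks a planned send time against the recipient's local clock. The 21:00 to 08:00 window is an assumption for the example, not a legal standard; the real window depends on your policy and jurisdiction. The fixed UTC offsets are only there to keep the demo self-contained.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo

# Assumed quiet window in the recipient's local time.
QUIET_START = time(21, 0)
QUIET_END = time(8, 0)


def in_quiet_hours(send_at_utc: datetime, profile_tz: tzinfo) -> bool:
    """True if the send time falls inside the recipient's local quiet window."""
    local = send_at_utc.astimezone(profile_tz).time()
    # The window wraps past midnight, so it is the union of two ranges.
    return local >= QUIET_START or local < QUIET_END


send_at = datetime(2025, 11, 17, 15, 0, tzinfo=timezone.utc)
new_york = timezone(timedelta(hours=-5))  # fixed offset, demo only
tokyo = timezone(timedelta(hours=9))

print(in_quiet_hours(send_at, new_york))  # 10:00 local time
print(in_quiet_hours(send_at, tokyo))     # 00:00 local time
```

In production you would more likely store an IANA zone name on the profile and pass `zoneinfo.ZoneInfo(name)` so that daylight saving shifts are handled for you.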

Cordial

Strengths: API-forward, flexible data model, scale.
Watch-outs: bring your own experimentation rigor; explicit data contracts needed.

Warehouse & Reverse ETL: Snowflake/BigQuery + dbt + Hightouch/Census is the universal adapter. Keep your identity and risk in your house; let orchestration tools do the last mile.

Deliverability prerequisites: domains, DMARC, warm-up, seeds

Placement is a license, not a feeling. Treat it like change-controlled infrastructure.

Dedicated sender domain: e.g., news.yourbrand.com. Configure SPF/DKIM, align DMARC. Start on p=none, move to quarantine then reject after stabilization.
Tracking CNAMEs: branded click/open domains to avoid mismatched-domain flags.
Engagement band policy: 0–30 / 31–60 / 61–90 days since last engagement; sunset after two re-engagement touches.
Warm-up plan: 2–3 weeks of banded sends; no blasts to unengaged; proof-first lifecycle before promos.
Seed/panel baselines: measure placement before changes so you know what “good” was.
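
The engagement bands above can be sketched as a tiny function. The band edges come from the policy line; treating everything beyond 90 days (or with no engagement on record) as `unengaged` is an assumption, and those profiles are the ones that get at most two re-engagement touches before sunset.

```python
from datetime import date, timedelta
from typing import Optional

# Band edges from the policy: 0-30 / 31-60 / 61-90 days since last engagement.
BANDS = ((30, "0-30"), (60, "31-60"), (90, "61-90"))


def engagement_band(last_engaged: Optional[date], today: date) -> str:
    """Map a profile to its engagement band, or 'unengaged' beyond 90 days."""
    if last_engaged is None:
        return "unengaged"
    days = (today - last_engaged).days
    for upper, label in BANDS:
        if days <= upper:
            return label
    return "unengaged"


today = date(2025, 11, 17)
for days_ago in (5, 45, 75, 120):
    print(days_ago, engagement_band(today - timedelta(days=days_ago), today))
```

During warm-up, the send list would be filtered to the most engaged bands first and widened only as placement holds.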

Event & identity mapping (with examples)

Write a dictionary: left column = old platform event/property, right column = new platform schema, with id and timestamp rules.

Legacy	Target	Notes
Checkout Started	checkout_started	Ensure cart_id present; dedupe rapid retries by cart_id+timestamp
Placed Order	order_completed	Include order_id, order_value, currency, items[] (sku, qty, price)
Refund Issued	refund_issued	Emit negative revenue if your attribution expects net sales
Subscription Paused	subscription_paused	Key to reason-based saves; store reason enum
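
The dictionary above can also live in code, which makes unmapped events fail loudly instead of disappearing. This is a sketch only: the event shape (`name`, `ts` as epoch seconds, `cart_id`) and the 60-second retry window are assumptions for illustration.

```python
# Legacy-to-target names taken from the mapping table.
EVENT_NAME_MAP = {
    "Checkout Started": "checkout_started",
    "Placed Order": "order_completed",
    "Refund Issued": "refund_issued",
    "Subscription Paused": "subscription_paused",
}

# Assumption: retries for the same cart within this window count as one event.
RETRY_WINDOW_SECONDS = 60


def translate(events: list[dict]) -> list[dict]:
    """Rename legacy events and drop rapid checkout retries for the same cart_id."""
    out = []
    last_checkout_ts: dict[str, float] = {}
    for ev in sorted(events, key=lambda e: e["ts"]):
        name = EVENT_NAME_MAP.get(ev["name"])
        if name is None:
            raise KeyError(f"unmapped legacy event: {ev['name']}")
        if name == "checkout_started":
            cart_id = ev["cart_id"]  # the mapping requires cart_id to be present
            prev = last_checkout_ts.get(cart_id)
            if prev is not None and ev["ts"] - prev < RETRY_WINDOW_SECONDS:
                continue  # rapid retry, skip it
            last_checkout_ts[cart_id] = ev["ts"]
        out.append({**ev, "name": name})
    return out


legacy_events = [
    {"name": "Checkout Started", "ts": 100, "cart_id": "c1"},
    {"name": "Checkout Started", "ts": 105, "cart_id": "c1"},  # retry
    {"name": "Placed Order", "ts": 200, "order_id": "o1"},
    {"name": "Checkout Started", "ts": 400, "cart_id": "c1"},
]
print([e["name"] for e in translate(legacy_events)])
```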

Identity merge rules (pseudo)

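-- Normalize email: lowercase and trim
email_norm = LOWER(TRIM(email))
-- Optional: remove dots from the local part, for gmail.com addresses only
email_norm = CASE WHEN email_norm LIKE '%@gmail.com'
                  THEN REPLACE(SPLIT_PART(email_norm, '@', 1), '.', '') || '@gmail.com'
                  ELSE email_norm END

-- Normalize phone (TO_E164 stands for whatever E.164 formatter your stack provides)
phone_e164 = TO_E164(phone_raw, country_hint)

-- Merge policy: first non-null identifier wins
customer_key = COALESCE(external_id, email_norm, phone_e164)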
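
For teams that want to unit-test the merge rules outside the warehouse, here is a hedged Python rendering of the same policy. `to_e164` is a deliberately naive stand-in: it assumes numbers without a leading `+` are in national format, and a real pipeline would more likely use a dedicated phone-number library.

```python
import re
from typing import Optional


def normalize_email(email: str, strip_gmail_dots: bool = False) -> str:
    """Lowercase and trim; optionally drop local-part dots for gmail.com only."""
    email = email.strip().lower()
    if "@" not in email:
        return email
    local, _, domain = email.rpartition("@")
    if strip_gmail_dots and domain == "gmail.com":
        local = local.replace(".", "")
    return f"{local}@{domain}"


def to_e164(phone_raw: str, default_country_code: str = "1") -> Optional[str]:
    """Naive E.164 formatting: keep digits, add a country code when no '+' is given."""
    digits = re.sub(r"\D", "", phone_raw)
    if not digits:
        return None
    if not phone_raw.strip().startswith("+"):
        # Assumed national format: drop a trunk-prefix zero, prepend the hint.
        digits = default_country_code + digits.lstrip("0")
    return f"+{digits}" if len(digits) <= 15 else None


def customer_key(external_id: Optional[str], email: Optional[str],
                 phone: Optional[str]) -> Optional[str]:
    """First available identifier wins, mirroring the COALESCE merge policy."""
    if external_id:
        return external_id
    if email:
        return normalize_email(email)
    if phone:
        return to_e164(phone)
    return None


print(normalize_email("  Jane.Doe@Gmail.com ", strip_gmail_dots=True))
print(to_e164("(415) 555-0100"))
print(customer_key(None, None, "+44 20 7946 0958"))
```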

Flow migration strategy: clone, improve, or redesign?

Everything in a replatform begs you to copy. Resist. Use migration to fix what your future team will hate.

Clone when the logic is good and dependencies are stable (e.g., transactional notices).
Improve when you can add a proof-first block, progress header (“You’re {{points_to_next_reward}} from $10 off”), or better segmentation.
Redesign when the model wants it (Braze Canvas, Cordial data flows) or when multilingual needs require language packs instead of duplicated flows.

Start with the spine: Post-Purchase → Second-Purchase Accelerator → Replenishment → Winback → Subscription Saves. Those five pay rent.

Parallel sends & placement validation

There is no zero-downtime without parallelism. For two weeks, run critical flows in both platforms, mailing bands of engaged users. Measure, don’t hope.

Seed/panel: daily snapshots; compare inbox rates; don’t chase single-day noise.
Complaint dashboards: track per-domain; pause promos if Gmail moves.
Holdouts: keep message-level controls for pilot flows; report holdout-adjusted RPR and conversion.
Diff logs: when platforms disagree (segment size, dynamic content), log the diff and fix before cutover.
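
Two of the checks above are easy to make concrete. The sketch below assumes one common reading of holdout-adjusted RPR, namely revenue per recipient in the messaged group minus revenue per recipient in the holdout; if your finance team defines it differently, swap the formula. The segment diff simply lists profiles that only one platform selected.

```python
def rpr(revenue: float, recipients: int) -> float:
    """Revenue per recipient."""
    if recipients <= 0:
        raise ValueError("recipients must be positive")
    return revenue / recipients


def holdout_adjusted_rpr(treated_revenue: float, treated_n: int,
                         holdout_revenue: float, holdout_n: int) -> float:
    """Assumed definition: incremental revenue per recipient over the holdout."""
    return rpr(treated_revenue, treated_n) - rpr(holdout_revenue, holdout_n)


def segment_diff(legacy_ids: set[str], new_ids: set[str]) -> dict[str, list[str]]:
    """Profiles selected by only one platform; each entry belongs in the diff log."""
    return {
        "only_legacy": sorted(legacy_ids - new_ids),
        "only_new": sorted(new_ids - legacy_ids),
    }


# Illustrative numbers, not results from any real program.
lift = holdout_adjusted_rpr(12000.0, 8000, 1000.0, 1000)
print(f"holdout-adjusted RPR: {lift:.2f}")
print(segment_diff({"u1", "u2", "u3"}, {"u2", "u3", "u4"}))
```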

Cutover day playbook & rollback

Go/no-go criteria

New domain warm-up stable; seed/panel equal or better than baseline.
Complaint rates ≤0.08% at Gmail; stable elsewhere.
Critical flows validated with test profiles; no duplicate sends; timestamps correct.
Consent & identity tables reconcile (spot-checks across cohorts).
Rollback path signed (who presses it; which flows revert; DNS/ESP toggles).
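
The criteria above can be written down as an explicit gate so the go/no-go review argues about inputs rather than opinions. This is a sketch under assumptions: the snapshot fields are hypothetical names, and only the 0.08% Gmail complaint ceiling comes from the list itself.

```python
from dataclasses import dataclass

GMAIL_COMPLAINT_CEILING = 0.0008  # 0.08%, from the go/no-go criteria


@dataclass
class CutoverSnapshot:
    seed_inbox_rate: float       # current seed/panel inbox placement
    baseline_inbox_rate: float   # placement measured before any changes
    gmail_complaint_rate: float
    duplicate_sends: int         # found while validating with test profiles
    consent_mismatches: int      # rows that failed reconciliation spot-checks
    rollback_owner: str          # empty if nobody has signed the rollback path


def go_no_go(s: CutoverSnapshot) -> list[str]:
    """Return the blockers; an empty list means 'go'."""
    blockers = []
    if s.seed_inbox_rate < s.baseline_inbox_rate:
        blockers.append("seed/panel placement below baseline")
    if s.gmail_complaint_rate > GMAIL_COMPLAINT_CEILING:
        blockers.append("Gmail complaint rate above 0.08%")
    if s.duplicate_sends > 0:
        blockers.append("duplicate sends detected")
    if s.consent_mismatches > 0:
        blockers.append("consent/identity tables do not reconcile")
    if not s.rollback_owner:
        blockers.append("rollback path has no owner")
    return blockers


ready = CutoverSnapshot(0.94, 0.93, 0.0005, 0, 0, "producer-on-call")
not_ready = CutoverSnapshot(0.90, 0.93, 0.0012, 2, 0, "")
print(go_no_go(ready))
print(go_no_go(not_ready))
```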

Day-of checklist

Communication sent: “change freeze” notice; on-call matrix; escalation channel.
Disable legacy triggers incrementally; enable new triggers; monitor diffs.
Seed/panel check at T+1h, T+4h, T+24h; complaint watch continuous.
Post a 24-hour cutover report: flows enabled, issues, mitigations, next checks.

Rollback

Rollback is not failure; it’s discipline. If complaints spike or seeds tank, revert flows, pause promos, fix, and try again. Document why.

Stabilization weeks: what to watch first

Deliverability: complaint by domain, read-time proxies, seed/panel trendlines.
Revenue: holdout-adjusted RPR (flows vs. campaigns), AOV and conversion where applicable.
Retention: 30-day second-purchase rate for cohorts first exposed in new platform.
SMS: opt-out rate per send; TCR (The Campaign Registry) flag status; quiet-hour compliance rate.
Ops: incident log, QA misses, approval times; retro after 2 weeks.
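
Two of the stabilization metrics are simple ratios, sketched here under the assumption that you can export delivered and complaining recipient addresses per send. The function names are illustrative.

```python
from collections import Counter


def _domain(address: str) -> str:
    """Recipient domain, lowercased."""
    return address.rsplit("@", 1)[-1].lower()


def complaint_rate_by_domain(delivered: list[str], complained: list[str]) -> dict[str, float]:
    """Complaints divided by delivered messages, per recipient domain."""
    sent = Counter(_domain(a) for a in delivered)
    hits = Counter(_domain(a) for a in complained)
    return {d: hits[d] / n for d, n in sent.items()}


def sms_opt_out_rate(opt_outs: int, delivered: int) -> float:
    """Opt-outs per delivered SMS for a single send."""
    return opt_outs / delivered if delivered else 0.0


# Synthetic addresses for illustration only.
delivered = [f"u{i}@gmail.com" for i in range(2000)] + [f"u{i}@yahoo.com" for i in range(1000)]
complained = ["u1@gmail.com", "u2@gmail.com"]

rates = complaint_rate_by_domain(delivered, complained)
print(rates)
print(sms_opt_out_rate(12, 4000))
```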

Multilingual & regional cutovers without chaos

Duplicating flows per language is how programs drown. Use language packs and partials—translators edit keys, not logic. For RTL (Arabic/Hebrew), flip containers (dir="rtl"), mirror icons, verify fonts, and run extra device QA. Consent text must be localized; quiet hours must respect local timezones. Roll regions in waves: one market per week, not five in a day.
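
To make "translators edit keys, not logic" concrete, here is a minimal sketch of a language pack driving one shared template. The pack structure, keys, and copy are hypothetical; the only rule it demonstrates is that text direction and wording live in data while the rendering logic exists once.

```python
from html import escape

# Hypothetical language packs: translators edit these values, never the template.
PACKS = {
    "en": {"dir": "ltr", "greeting": "Thanks for your order, {name}.", "cta": "Track your order"},
    "ar": {"dir": "rtl", "greeting": "شكرًا على طلبك، {name}.", "cta": "تتبع طلبك"},
}
FALLBACK = "en"


def render_block(lang: str, name: str) -> str:
    """Render one shared block; unknown languages fall back to English."""
    pack = PACKS.get(lang, PACKS[FALLBACK])
    greeting = escape(pack["greeting"].format(name=name))
    cta = escape(pack["cta"])
    return f'<div dir="{pack["dir"]}"><p>{greeting}</p><a href="#">{cta}</a></div>'


print(render_block("en", "Sam"))
print(render_block("ar", "Sam"))
```

Adding a market then means adding one pack entry and running device QA, not cloning a flow.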

RACI, SLAs, and change control that prevent fire drills

Task	R	A	C	I
Domain & DMARC setup	Deliverability lead	Head of Lifecycle	IT/Sec	Marketing
Event & identity mapping	Data engineer	Head of Data	ESP/SMS admins	Lifecycle
Consent reconciliation	Privacy lead	DPO/Legal	Lifecycle	IT/Sec
Parallel send validation	Producer	Head of Lifecycle	Deliverability	Stakeholders
Cutover & rollback	Producer	Head of Lifecycle	Deliverability/Data	Marketing/Legal

Change-freeze clause (SOW excerpt)

During risk windows (warm-up, parallel sends, cutover, incidents), a change freeze is in effect. Only pre-approved critical fixes
may deploy. All other changes are queued. Exceptions require joint approval by Client Lifecycle Lead and Agency Producer.

QA & Send SLA (excerpt)

Agency provides [X] business hours for QA before any scheduled send: device rendering, link/UTM validation, segmentation checks,
accessibility (alt text, contrast). No template changes within [Y] minutes of send. Client consolidates feedback through one approver.

45/60-day timeline (day-by-day milestones)

Days 1–7: Audit & prerequisites

Inventory flows/campaigns; identity/consent contracts documented.
Dedicated sender domain provisioned; SPF/DKIM/DMARC configured (p=none).
Engagement band policy and sunset enforcement turned on in legacy platform.
Seed/panel baseline runs; complaint dashboards by domain configured.
10DLC brand/campaign registrations confirmed; HELP/STOP tested.

Days 8–21: Warm-up & mapping

Warm-up: engaged cohorts only; lifecycle > promos.
Event/identity mapping: schemas in dbt; sample payloads end-to-end; external_id stitched.
Consent reconciliation: imports with source/timestamp/jurisdiction; fix mixed states.
Template refactor: real text, language packs, accessibility passes.

Days 22–35: Parallel sends & fixes

2–3 critical flows in parallel; segment-size diffs investigated.
Seeds: daily; complaints: continuous; anomalies triaged and fixed.
Holdout-adjusted RPR collected for pilot flows; document “what changed.”

Days 36–42: Cutover

Go/no-go review; cutover plan with timestamps; on-call schedule.
Disable legacy triggers; enable new; monitor T+1h, T+4h, T+24h.
Rollback if thresholds breached (complaints/placement); retro if executed.

Days 43–60: Stabilization

Weekly readout: RPR (flows/campaigns), complaints, seeds, second-purchase rate (P2-rate), SMS opt-outs.
Backlog: lift winners into more flows; retire parity builds that aren’t needed.
Move DMARC to quarantine → reject when confident.
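
The DMARC progression (p=none, then quarantine, then reject) comes down to changing one tag in one TXT record. The sketch below builds the record for each stage; `news.yourbrand.com` is the example domain used earlier, while the reporting mailbox is a placeholder you would replace with your own.

```python
STAGES = ("none", "quarantine", "reject")


def dmarc_record(domain: str, policy: str, rua: str) -> tuple[str, str]:
    """Return (DNS name, TXT value) for a DMARC policy record."""
    if policy not in STAGES:
        raise ValueError(f"policy must be one of {STAGES}")
    return f"_dmarc.{domain}", f"v=DMARC1; p={policy}; rua=mailto:{rua}"


def next_stage(policy: str) -> str:
    """The next, stricter policy; 'reject' is the final stage."""
    i = STAGES.index(policy)
    return STAGES[min(i + 1, len(STAGES) - 1)]


policy = "none"
for _ in STAGES:
    name, value = dmarc_record("news.yourbrand.com", policy, "dmarc-reports@yourbrand.com")
    print(f'{name} TXT "{value}"')
    policy = next_stage(policy)
```

Each step up is worth treating as its own change-controlled deploy, made only after aggregate reports show legitimate mail passing alignment.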

Case snapshots: three migrations, three lessons

CPG (Klaviyo → Braze; Attentive retained)

Problem: app messaging needed; data lived in warehouse; legacy flows duplicated per language.
Fix: external_id stitch + language packs; warehouse → Braze via Hightouch; parallel for 18 days.
Outcome (day 45 vs baseline): complaint steady, inbox unchanged; +7% holdout-adjusted RPR on post-purchase; -22% build hours monthly due to partials.

Apparel (Cordial → Klaviyo; Postscript → Attentive)

Problem: slow experiments; SMS complaints; consent states inconsistent.
Fix: Dataships audit; 10DLC re-registration; warm-up new domain; proof-first templates.
Outcome: -34% SMS opt-outs, +9% RPR in second-purchase, complaint ≤0.05% Gmail.

Wellness (Klaviyo → Klaviyo consolidation across regions)

Problem: six accounts, duplicated flows, no RTL support.
Fix: consolidate to one global pattern; lang packs; dir="rtl" for MEA; staged regional cutovers.
Outcome: -40% maintenance hours, +6 pts placement in problematic region, no downtime.

FAQ

Can we migrate in two weeks?

Not safely. You can start in two weeks. Safe cutovers need warm-up, parallel, and validation. If someone promises “done in a week,” they’re selling open rates, not placement.

Do we need a CDP to replatform?

No. A warehouse + dbt + reverse ETL is enough for most enterprises. Add a CDP when you have app events and complex audience choreography.

How much of our calendar should we freeze?

During warm-up and cutover, freeze blasts to unengaged and limit promos to engaged cohorts. Flows can run; focus on proof-first content.

What’s the fastest way to tank placement?

Mail unengaged on a brand-new domain, ignore DMARC, send image-only templates, and “blast because it’s a big week.” Or you could not.

How do we prove we didn’t lose money?

Holdout-adjusted RPR (flows vs. campaigns), complaint by domain, seed/panel trendlines, and second-purchase rate (P2-rate) for cohorts first exposed in the new platform. One weekly slide. No screenshots of pretty carousels.
