What business impact did AI recommendations have in this case study?

Holdout-adjusted results: +8.9% PDP add-to-cart (absolute), +5.6% conversion relative, +9.8% AOV, +6.7% revenue per session, and −18% WISMO tickets once the tracking page mirrored recommendations. No heavier discounting was required.

Which tooling did you use and why?

Shopify for catalog and orders; Rebuy for goal-constrained PDP/cart recommendations; Klaviyo for email orchestration and dynamic blocks; Attentive for SMS nudges; Malomo for branded tracking; Snowflake+dbt for features/labels; Hightouch for syncing scores; Gorgias for helpdesk signals. The goal was one spine with few moving parts, plus honest measurement.

How do you ensure recommendations don’t hurt margin or UX?

We enforce inventory and margin floors at the rule level, hide cart add-ons when shipping ETAs would split the order, and limit suggestions to two SKUs with clear proof. If RPS or attach doesn’t rise in holdouts, we delete the module and try the next hypothesis.

What’s the fastest way to start?

Add one two-SKU recommendation block on the top PDPs, constrained by goal/inventory/margin. Run a 50/50 holdout for 14 days. Report revenue per session (RPS), attach rate, and AOV—then decide to scale or kill. Consistency and constraints matter more than the logo on the tool.

Predictive Retention Case Study: AI Product Recommendations Increase Sales Conversion

November 6, 2025

Do this today: Ship a minimum-viable AI recommendation test on your PDP or cart.

Add one constrained recommendation block (e.g., “Pairs well with”) beneath the primary CTA on your top 5 PDPs. Constrain logic by goal/category (hydration/repair/style), inventory, and margin.
Stand up a 50/50 holdout at the session level for 14 days. No cookies gymnastics—your tool should support server- or edge-level assignments.
Track three dials: add-to-cart rate on PDP, attach rate for recommended SKUs, and Revenue per Session (RPS). Report deltas vs. holdout—not screenshots of pretty carousels.
Write a two-paragraph readout (“what changed / what we learned / what we’ll test next”). Kill the block if it doesn’t move RPS; iterate if it does. Next week, copy the winner to cart and checkouts.

When you can explain, on one slide, how your recommendations moved RPS, order attach rate, and AOV—with holdouts—you’re doing AI personalization, not superstition.

TL;DR (What moved and why)

A mid-market DTC brand (AOV ~$52) implemented AI-powered product recommendations across PDP, cart, and post-purchase email. We replaced generic “bestsellers” with goal-constrained recommendations (“For Hydration / Performance / Calm”), gated by inventory and margin, and orchestrated the same choices in email and SMS. With session-level holdouts and CFO-grade dials, the program delivered:

+8.9% PDP add-to-cart rate (absolute), +5.6% order conversion relative
+9.8% AOV driven by attach of 1–2 low-AOV “tip-over” items
+6.7% Revenue per Session (RPS) sitewide, holdout-adjusted
−18% “last-mile” WISMO tickets after we mirrored recommendations on the branded tracking page

No heavier discounting. No homepage redesign. Just relevant choices, consistent across channels, measured with holdouts so we knew they worked.

Context & baseline

The brand: mid-market health/beauty CPG with three core use-cases, healthy traffic, and an honest design system. The problem: PDPs explained the hero SKU well but didn’t show a clear next best step. Cart had a generic “You may also like” carousel that ignored stock and margins. Email recommendations were “bestsellers” stitched on at the end of templates. Measurement was open-rate theater.

Baseline (28-day average):

PDP add-to-cart: 12.6%
Session conversion: 2.4%
AOV: $52.10
Attach rate for any accessory/add-on: 19.3%
Revenue per Session (RPS): $1.25

Hypothesis & guardrails (progress over promo)

Hypothesis: If we replace generic recommendations with goal-constrained, margin-aware, inventory-safe suggestions—and echo the same choices in email/SMS—customers will buy more confidently and attach the right low-AOV items, lifting AOV and conversion without heavier discounts.

Guardrails (non-negotiables):

Consistency: A customer should see the same choices on PDP, cart, and email for the same context.
Constraints: No OOS, no “backorder soon,” no margin-sink SKUs; only items with clean availability and shipping promises.
Coverage: Three goal families (“Hydration / Performance / Calm”), variant-aware where relevant.
Holdouts: 50/50 at session level for 14 days, then 20% persistent control moving forward.
Reporting: RPS, attach rate, AOV, conversion—no vanity metrics.

The stack we used (and why)

Shopify — source of truth for catalog, inventory, and order streams.
Rebuy — goal-constrained recommendations on PDP/cart/checkout; low-lift edge rendering.
Klaviyo — email orchestration; dynamic blocks that follow the same decision logic; persistent holdouts.
Attentive — SMS nudges only for high-intent cohorts; quiet hours enforced.
Malomo — branded tracking page; mirrored recommendations post-delivery.
Snowflake + dbt — churn/LTV labels, features, and holdout analytics; one place to calculate truth.
Hightouch — reverse ETL to sync scores and goal preferences into Klaviyo/Attentive as profile fields.
Gorgias — helpdesk; reason codes fed back into features; macros for “didn’t work / too much / price.”

Signals that matter: events, ZPD, and constraints

Recommendations fail when they ignore reality. We included only the signals we could trust and explain:

Events: Viewed Product, Added to Cart, Checkout Started, Placed Order, Delivered
ZPD: primary_goal (Hydration|Performance|Calm), variant_pref (shade/flavor/size)
Inventory: in-stock, min buffer, backorder cutoff
Margin: floor by SKU and bundle
Fulfillment: shipping SLA for region; avoid “ships later” in cart add-ons

Modeling: rules → collaborative → hybrid

We didn’t start with deep learning. We started with what the team could ship and explain, then layered complexity where it paid.

Rules v0.1: If primary_goal = Hydration, show hydration-tagged SKUs; else show starter kits. Obvious, but instantly better than “bestsellers.”
Collaborative ranking: “People who bought X also bought Y,” filtered by inventory/margin/goal. Rebuy handled the heavy lifting.
Hybrid w/ bandits: Two framings—proof-first vs. offer-first—with multi-armed bandits reallocating traffic to winners by goal cohort.

Explainer policy (human-readable): “If the customer told us they want Hydration, and the item is in stock and meets margin, then offer {bottle + electrolyte pack}. If not, offer the starter kit. Never show OOS. Never recommend something that ships later than the hero.” Everyone could understand it—and that’s why we could scale it.

Implementation: PDP, cart, checkout, and email

PDP (below the primary CTA)

One 2-SKU block titled “For {goal}, most people add…”
10-word proof line under each SKU (“Electrolytes to start your hydration routine”).
“Add” buttons pre-set to the customer’s variant_pref where available.

Cart

Single “complete your set” row; the goal is one add-on, not a buffet.
Hidden when shipping ETA would mismatch the hero (no split shipments).

Checkout

Low-friction, low-AOV add-on (samples) only; no cognitive overload.
Margin floor enforced server-side.

Email (Klaviyo)

Post-purchase Message #3 pulled the same goal-constrained SKUs; loyalty progress header added (“You’re {{points_to_next_reward}} from $10 off—add any of these to unlock it”).
Winback first touch: “What you loved last time + your goal → top 3 picks.”

SMS (Attentive)

Nudge only: “For your {goal}, most add {SKU}. 1-tap → {short_link}.”
Quiet hours enforced; recency-gated against email sends.

Experiments: design, holdouts, and bandits

Holdouts: Session-level 50/50 for 14 days (PDP + cart), then 20% persistent control.

Bandits: Proof-first vs. offer-first framings per goal cohort. Proof-first won for “Hydration/Calm”; offer-first edged in “Performance” (athletic audience). We kept bandits on—taste drifts.

A/Bs: Module order and title wording (“Pairs well with” vs. “Complete your set”). “For {goal}, most people add…” outperformed both by a wide margin—social proof without shouting.

Measurement: what we reported (and what we ignored)

Vanity metrics stayed on the bench. We reported only the dials that pay bills, all holdout-adjusted:

PDP add-to-cart rate (by template & placement)
Attach rate (presence of recommended SKU on the resulting order)
AOV and per-order margin
Session conversion and RPS
Ticket volume per 100 orders for WISMO and “how to use” topics

We ignored click-through on the module itself when it contradicted RPS. “High engagement” that didn’t convert got cut.

Results (holdout-adjusted)

+8.9% PDP add-to-cart (absolute) on templates with goal-constrained recs
+5.6% conversion relative (sessions with recs vs. holdout)
+9.8% AOV from attach of 1–2 low-AOV items and fewer mismatched bundles
+6.7% RPS sitewide
−18% WISMO tickets after adding quickstart + related items to the tracking page

The money line: the program paid for itself in under three weeks—and kept compounding because we didn’t rely on a one-time launch. We ran it like operations, not a project.

The playbook you can copy in 10 steps

Write your three goal families and map SKUs to each.
Enforce inventory and margin floors in the recommendation rule.
Add one block to the top 5 PDPs; hold 50% of sessions for 14 days.
Move the winner to cart; hide when shipping ETAs mismatch.
Mirror the same logic in Klaviyo email & Attentive SMS.
Report RPS, AOV, conversion, attach—only.
Run a bandit on proof-first vs. offer-first framings by goal.
Add tip-over add-ons under loyalty progress headers.
Put quickstart + “pairs well with” on the branded tracking page.
Document “what changed / what we learned / what we test next.” Repeat.

Pitfalls & how we avoided them

Halo of bestseller bias. We killed “bestsellers” after week one; goal-constrained recs won across cohorts.
OOS landmines. Server-side inventory and a simple “min buffer” saved us from recommending ghosts.
Margin sinkers. Margin floors at the rule level; this is commerce, not a carnival.
Carousel fatigue. One row, two items. Decision simplicity beats slot machines.
Disconnected channels. We mirrored decisions across PDP, cart, email, SMS, and tracking; no cognitive dissonance.
Open-rate worship. We measured RPS and attach; the module lived or died by money, not clicks.

Ethics: consent, explainability, fairness

Consent scope. If a profile prefers “deals only,” email can teach but SMS sticks to deals. Match message to channel consent.
Explainability. Keep a one-pager “recommendation policy” in human language. If you can’t explain a suggestion, don’t ship it.
Fairness. No protected-attribute inference. Train on commerce signals + explicit ZPD only. Review features for proxy bias.
Privacy & security. Store only what you use; rotate keys; DPA where required; log access.

90-day roadmap: from carousel to compounding

Weeks 1–2 — Ship & measure

Add one block to top PDPs; 50/50 holdout; report RPS & attach vs. control.
Move the winner to cart; enforce margin/inventory; hide when ETAs mismatch.

Weeks 3–4 — Orchestrate

Mirror logic in Klaviyo and Attentive; enforce quiet hours and recency gates.
Add loyalty progress headers; measure CTR vs. control.

Weeks 5–6 — Learn faster

Bandit on proof-first vs. offer-first; keep exploration on.
Add quickstart + “pairs well with” to tracking page; track WISMO change.

Weeks 7–9 — Personalize gently

Collect one ZPD answer in post-purchase (“What are you after?”) and use it everywhere.
Variant-aware CTAs where applicable.

Weeks 10–12 — Prove & scale

Read out holdout deltas; freeze winners; document the policy; schedule the next round.
Roll out to additional categories, keeping guardrails intact.

FAQ

How do we start if we don’t have a recommender tool yet?

Ship a rules-based block by goal on 3–5 PDPs with a 50/50 holdout. If it doesn’t move RPS, delete. If it does, graduate to Rebuy/Nosto and keep holdouts.

What if our catalog is small?

Constrain by goal and inventory; run two-item suggestions. On small catalogs, guardrails matter more than models.

Won’t this cannibalize our hero SKU?

No—when constrained. We suggest add-ons or the next step, not a lateral switch. Your hero keeps its spotlight.

How much lift should we expect?

Directionally: +4–10% PDP add-to-cart, +3–8% conversion, +5–12% AOV. Your category, prices, and guardrails decide where you land.

Related resources

Personalization at Scale: AI-Curated Journeys
Data-Driven Retention: Dashboards that Connect to Revenue
10 Email Automation Workflows to Boost Retention
Request a retention audit — we’ll show you where RPS is hiding (no heavier discounts required)

Back to blog

Country/region