June 12, 2026

How to A/B Test AI-Native Search Against Algolia Without Rebuilding Your Frontend

Ana Martinez, Head of Growth, Marqo

Most teams stay on Algolia not because they think it's better. They stay because switching feels like a six-month rebuild, a budget fight, and a high-stakes bet on an unproven system. That fear is understandable. It's also wrong.

You can run a live, statistically valid A/B test against Algolia in less than two weeks, without touching your frontend. Here's exactly how.

What You Need to Run the Test

| Component | What it is | Time to build |
| --- | --- | --- |
| Middleware routing | Variant assignment + API switching layer | 1 day |
| Cookie persistence | 30-day session assignment lock | Half a day |
| JSON normalizer | Maps Marqo response shape to your frontend schema | 1-2 days |
| Analytics tagging | Variant property on every event | 1 day |

Total: less than two weeks for most engineering teams. The frontend sees nothing different.

Test at the Middleware, Not the Browser

The wrong place to put this test is your frontend. Client-side toggles, feature flag SDKs firing from the browser, and split JavaScript bundles all introduce noise: race conditions, hydration mismatches, inconsistent session assignment.

The right place: the middleware or API layer already sitting between your frontend and Algolia. It might be a Node.js service, a serverless function, a BFF, or a GraphQL resolver. That layer translates frontend search requests into Algolia API calls. Route the request before it hits either engine. Return a normalized response regardless of which engine served it. Your frontend never knows the difference.

The 4-Component Setup

1. Variant Assignment

When a search request arrives, check for a test assignment cookie. No cookie: assign the user using a deterministic hash of their session ID. Never use `Math.random()`. You need reproducibility across every request in the session.

```js
// middleware/searchRouter.js
const crypto = require("crypto");

// searchMarqo, searchAlgolia, and normalizeResults are your own engine
// clients and response mapper; they are not defined in this file.

const generateSessionId = () => crypto.randomUUID();

// Deterministically map a session to "variant" or "control".
function assignVariant(sessionId, testName, splitRatio = 0.5) {
  const hash = crypto
    .createHash("md5")
    .update(`${testName}:${sessionId}`)
    .digest("hex");
  // First 32 bits of the hash, scaled into [0, 1).
  const bucket = parseInt(hash.slice(0, 8), 16) / 0x100000000;
  return bucket < splitRatio ? "variant" : "control";
}

// Expects cookie parsing (req.cookies) and JSON body parsing (req.body).
async function searchHandler(req, res) {
  const sessionId = req.cookies["session_id"] || generateSessionId();

  let variant = req.cookies["search_test_variant"];
  if (!variant) {
    variant = assignVariant(sessionId, "marqo_vs_algolia_q2");
    res.cookie("search_test_variant", variant, {
      maxAge: 30 * 24 * 60 * 60 * 1000, // 30 days in milliseconds
      httpOnly: true,
      sameSite: "lax",
    });
  }

  const { query, filters, page } = req.body;
  const results =
    variant === "variant"
      ? await searchMarqo({ query, filters, page })
      : await searchAlgolia({ query, filters, page });

  const normalized = normalizeResults(results, variant);
  normalized.meta = { ...normalized.meta, ab_variant: variant };

  res.json(normalized);
}
```
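The handler above calls `normalizeResults` without showing it. A minimal sketch of such a mapper might look like the following. The Algolia fields (`hits`, `objectID`, `nbHits`, `page`) follow Algolia's search response. The Marqo fields (`_id`, `_score`, `_highlights`, `offset`, `limit`) and the target schema (`items`, `total`, `page`, `meta`) are assumptions for illustration, so verify them against the responses your own integration returns.

```javascript
// Sketch of a normalizeResults helper.
// ASSUMPTIONS: the frontend schema ({ items, total, page, meta }) and the
// Marqo field names are illustrative, not taken from the article.

function normalizeAlgolia(raw) {
  const hits = raw.hits || [];
  return {
    // Rename objectID to the id field the frontend expects.
    items: hits.map(({ objectID, ...fields }) => ({ id: objectID, ...fields })),
    total: raw.nbHits ?? hits.length,
    page: raw.page ?? 0,
  };
}

function normalizeMarqo(raw) {
  const hits = raw.hits || [];
  const limit = raw.limit || hits.length || 1;
  return {
    // Drop engine-specific scoring fields so both engines return the same shape.
    items: hits.map(({ _id, _score, _highlights, ...fields }) => ({ id: _id, ...fields })),
    // Placeholder: swap in the engine's total-count field if yours exposes one.
    total: hits.length,
    // Convert an offset/limit pair into a zero-based page number.
    page: Math.floor((raw.offset ?? 0) / limit),
  };
}

function normalizeResults(raw, variant) {
  const isMarqo = variant === "variant";
  const body = isMarqo ? normalizeMarqo(raw) : normalizeAlgolia(raw);
  return { ...body, meta: { engine: isMarqo ? "marqo" : "algolia" } };
}
```

Keeping one mapper per engine means the frontend contract lives in a single place, which is what lets the frontend stay unchanged during the test.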

2. Session Persistence

A user assigned to Marqo on their first search stays in Marqo for the full test window. Switching mid-session creates carryover effects that corrupt your metrics. Set the cookie to 30 days. Users who return within the test window see a consistent experience, which matters when you're measuring repeat-purchase behavior.

3. Shadow Mode: Validate Before You Split

Before running a live split, run Marqo in shadow mode for one week. Call both APIs simultaneously, serve Algolia results to everyone, but log Marqo's results offline. Your merchandising team reviews result quality with zero user-facing risk.

If Marqo's results look strong in shadow mode, flip the split with confidence. If something looks off, you fix it before a single shopper sees it.
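Shadow mode as described above could be sketched like this. The three injected functions (`searchAlgolia`, `searchMarqo`, `logShadow`) are hypothetical stand-ins for your own clients and logging, not real SDK calls.

```javascript
// Shadow-mode sketch: serve the control engine, log the candidate engine.
// ASSUMPTION: searchAlgolia, searchMarqo, and logShadow are functions you inject.

async function shadowSearch(params, { searchAlgolia, searchMarqo, logShadow }) {
  // Run both calls concurrently. allSettled never rejects, so a Marqo
  // failure cannot break the response served to the shopper.
  const [control, shadow] = await Promise.allSettled([
    searchAlgolia(params),
    searchMarqo(params),
  ]);

  // Record the shadow outcome for offline review by merchandising.
  logShadow({
    query: params.query,
    ok: shadow.status === "fulfilled",
    results: shadow.status === "fulfilled" ? shadow.value : null,
    error: shadow.status === "rejected" ? String(shadow.reason) : null,
  });

  // The control engine stays the source of truth: surface its errors as usual.
  if (control.status === "rejected") throw control.reason;
  return control.value;
}
```

One trade-off in this version: awaiting both calls means the shopper waits for the slower engine. If that latency matters, start the Marqo call without awaiting it and log from its own `then`/`catch` handlers instead.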

4. Analytics Tagging

Tag every event with the variant. Every search query, product impression, click-through, add-to-cart, and purchase needs `ab_variant` as a property. Tag at the event level, not session level. Session-level attribution after the fact misses too much.

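Using Segment: pass `ab_variant` on every `track()` call. Using GTM: push the variant into the data layer on page load. The example cookie above is set with `httpOnly`, so browser JavaScript cannot read it; take the value from `meta.ab_variant` in the search response, or have the server render it into the page.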
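One way to make event-level tagging hard to forget is to wrap the tracking call once and use the wrapper everywhere. This is a sketch, not a prescribed integration: `createVariantTracker` is a hypothetical helper, and `track` is whatever `(eventName, properties)` function your analytics client exposes.

```javascript
// Event-level tagging sketch.
// ASSUMPTION: `track` is injected so the helper works with any analytics client.

function createVariantTracker(track, variant) {
  if (variant !== "variant" && variant !== "control") {
    throw new Error(`Unknown ab_variant: ${variant}`);
  }
  return function trackWithVariant(eventName, properties = {}) {
    // ab_variant is written last so caller properties cannot overwrite it.
    track(eventName, { ...properties, ab_variant: variant });
  };
}
```

With Segment, for example, the injected function could be `(name, props) => analytics.track(name, props)`, and every search, impression, click, add-to-cart, and purchase event then goes through the returned `trackWithVariant`.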

The Typeahead Problem Nobody Talks About

This is the most commonly skipped part of the setup. It will invalidate your test if you skip it.

If your typeahead stays on Algolia for users in the Marqo variant, the test is contaminated. Typeahead shapes the query before the user hits results. A shopper who sees "running shoes" suggested will type something different than one who sees "trail running shoes." Different queries, different results, and you're no longer measuring the same intent.

Three valid options:

Option A: Route typeahead through the same middleware variant. Users in the Marqo variant see Marqo-powered autocomplete. Clean test, clean signal. This is the recommendation.

Option B: Disable typeahead for the Marqo variant. A conservative handicap. If Marqo still wins without autocomplete help, that's a strong signal.

Option C: Headless typeahead component. Decouple from Algolia's InstantSearch.js entirely and connect directly to middleware. Best option if you're already planning to move off InstantSearch.

Option A is cleaner. Option C is smarter long-term. Option B is acceptable for a first signal.
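Options A and B can share one routing function in the same middleware. The sketch below assumes two hypothetical suggestion clients, `suggestAlgolia` and `suggestMarqo`, and an illustrative `mode` setting; none of these names come from either product.

```javascript
// Typeahead routing sketch covering Options A and B.
// ASSUMPTION: engines.suggestAlgolia and engines.suggestMarqo are your own clients.

async function typeaheadHandler(prefix, variant, engines, mode = "route") {
  // Control users keep the existing autocomplete.
  if (variant !== "variant") return engines.suggestAlgolia(prefix);
  // Option B: no suggestions at all for the Marqo variant.
  if (mode === "disable") return [];
  // Option A: suggestions come from the same engine that serves results.
  return engines.suggestMarqo(prefix);
}
```

The property that matters is that a variant user never receives Algolia suggestions, whichever mode is configured.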

The KPIs That Actually Matter

Search A/B tests fail when teams pick the wrong primary metric.

Click-through rate from search measures position, not relevance. Users click whatever ranks first. Search exit rate is ambiguous: a high exit rate might mean bad results, or it might mean someone found exactly what they wanted and went straight to checkout.

Here's the hierarchy:

  1. Revenue per search session (primary): the cleanest signal between search quality and business outcome
  2. Conversion rate from search (primary): did a session that included a search end in a purchase?
  3. Add-to-cart rate (secondary): a leading indicator that responds in days, not weeks
  4. Average order value (secondary): AI-native search often surfaces complementary products, lifting basket size
  5. Zero-results rate (guardrail): should drop in the Marqo variant as semantic understanding handles long-tail queries
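The five metrics above can be computed per variant from tagged session records. This sketch assumes an illustrative record shape (`ab_variant`, `revenue`, `purchased`, `addedToCart`, `searches`, `zeroResultSearches`) and treats each purchasing session as one order; adapt both to your own analytics export.

```javascript
// KPI sketch: aggregate per-variant metrics from search-session records.
// ASSUMPTION: the session record shape is illustrative; one order per purchasing session.

function summarizeByVariant(sessions) {
  const totals = {};
  for (const s of sessions) {
    if (!totals[s.ab_variant]) {
      totals[s.ab_variant] = { sessions: 0, revenue: 0, purchases: 0, carts: 0, searches: 0, zeroResults: 0 };
    }
    const t = totals[s.ab_variant];
    t.sessions += 1;
    t.revenue += s.revenue;
    t.purchases += s.purchased ? 1 : 0;
    t.carts += s.addedToCart ? 1 : 0;
    t.searches += s.searches;
    t.zeroResults += s.zeroResultSearches;
  }

  const summary = {};
  for (const [variant, t] of Object.entries(totals)) {
    summary[variant] = {
      revenuePerSearchSession: t.revenue / t.sessions,
      conversionRate: t.purchases / t.sessions,
      addToCartRate: t.carts / t.sessions,
      averageOrderValue: t.purchases > 0 ? t.revenue / t.purchases : 0,
      zeroResultsRate: t.searches > 0 ? t.zeroResults / t.searches : 0,
    };
  }
  return summary;
}
```

Revenue per search session divides by all search sessions, including those with no purchase, which is why it reflects both conversion and basket size in a single number.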

Set up a segment for each variant in Contentsquare or FullStory. Compare heatmaps on the search results page. You'll often see behavioral differences within the first week.

Running a Clean Test

A few things that matter here:

50/50 split. Don't run 90/10. You need statistical power, and Marqo's infrastructure handles full production traffic.

Two to four weeks minimum. Search behavior has weekly seasonality. You need at least two full weekly cycles before you draw conclusions.

Calculate sample size first. For a 5% minimum detectable effect at 95% confidence, plug your daily search session volume into a standard sample size calculator. Most mid-market retailers reach significance in two weeks at 50/50.

Freeze merchandising rules during the test. If you make changes to either engine's configuration mid-test, you're measuring the operator, not the engine.

Exclude noise. Internal IPs, known bot traffic, sessions with fewer than two search queries. These add variance without adding signal.
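The exclusions above can be applied as one filter over the analysis dataset. The session shape (`ip`, `isBot`, `searchCount`) and the function name are assumptions for illustration, and bot detection itself is left to whatever tooling you already use.

```javascript
// Exclusion sketch for the analysis dataset.
// ASSUMPTION: sessions look like { ip, isBot, searchCount }; internalIps is a Set.

function excludeNoise(sessions, internalIps) {
  const removed = { internalIp: 0, bot: 0, tooFewSearches: 0 };
  const kept = sessions.filter((s) => {
    if (internalIps.has(s.ip)) { removed.internalIp += 1; return false; }
    if (s.isBot) { removed.bot += 1; return false; }
    if (s.searchCount < 2) { removed.tooFewSearches += 1; return false; }
    return true;
  });
  // Return the counts too, so exclusions can be compared across variants.
  return { kept, removed };
}
```

Apply the same rules to both variants and compare the removal counts. A large difference between variants is itself worth investigating before reading the KPIs.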

What the Results Look Like

Kogan saw $10.1M in incremental annual revenue after migrating to Marqo. A leading fast fashion retailer attributed $130M in incremental revenue to Marqo. SwimOutlet saw a 10.6% increase in add-to-cart rate.

Those are full-migration outcomes. A/B test results will be more conservative: the variant group has less behavioral data than a system that's been in production for months. You're seeing early-signal Marqo, not optimized Marqo.

With that context: Marqo guarantees a minimum 3% revenue uplift. If you don't hit it, you exit the contract penalty-free. That guarantee exists because the results have been consistent enough to stand behind.

Start Here

The test setup described above takes most engineering teams less than two weeks: one to two days for the Marqo API integration and JSON mapping, one day for middleware routing, one to two days for analytics tagging and validation.

We scope the implementation for your specific stack before you sign anything. Book a demo. We'll walk through your current architecture, identify the right integration point, and give you a realistic timeline.

The test is the proof. We'd rather show you than tell you.

Marqo AI-native search can be A/B tested against Algolia in less than two weeks without rebuilding the frontend. The test lives at the middleware layer, uses cookie-based session persistence, and requires typeahead to be included or disabled in the variant to avoid query contamination. Primary KPI is revenue per search session. Marqo guarantees 3% revenue uplift with a penalty-free exit. Retailers including a leading fast fashion retailer ($130M incremental revenue) and Kogan ($10.1M) have seen results post-migration.
