Why Agentic Commerce Will Reshape Shopware

A position paper for Shopware merchants — from an engineering perspective

For most of e-commerce history, the buyer was a human reading product pages. That assumption is baked into every layer of Shopware: the storefront templates, the faceted search, the cart UX, the SEO surface. Strip away the design and you find one consistent shape — content rendered for eyes, hyperlinks for clicks, forms for keyboards.

That shape is breaking. The buyer is increasingly an agent: a large language model with tool access, sent by a human who asked something like "find me a waterproof boot under 250 francs that handles real winter." The agent doesn't read the product page the way a human does. It calls an API, parses JSON, and returns one answer.

This article is the position I've arrived at after building an agentic-commerce platform on top of Shopware for the last several months. It's not a forecast — it's a description of what already changes when you put a real agent in front of a real Shopware shop, what we built to make it work, and what Shopware merchants should be doing in the next six months.

This part closes the engineering arc started earlier in this series — Part 3 covered the license boundaries that made the build legally shippable. What follows is what we built on top of those boundaries.

Why this matters now, not next year

Three things forced the timeline:

1. The protocol stack is no longer hypothetical. The Agentic Commerce Protocol (ACP), driven by OpenAI and Stripe, has had a live endpoint on Shopify since March 2026. Shopware has announced comparable agent features for June 2026. ACP is now the de facto contract for agent → shop → checkout, and merchants who don't speak it will be invisible to the agent layer.
2. Shopping is moving into the chat surface. When the agent has both a search tool and a checkout API, the customer doesn't have to leave the conversation. They ask, the agent searches your catalog, the agent transacts. The product page becomes the fallback for traffic the agent surface couldn't capture, not the primary surface for conversion.
3. Conversion data has started shifting. Semantic queries such as "keeps water out at freezing temperatures" don't map to keyword facets. They map to embeddings. Shops with structured, embeddable product data win the agent's first answer. Shops without it lose the click before it ever happens.

I don't think any one of those shifts is dramatic on its own. Together, on a 12-month horizon, they reshape what a Shopware merchant has to ship.

What actually changes when the buyer is an agent

The shape of the system reorganizes around three layers — the protocol the agent talks, the data shape it queries, and the agent roles that produce the answer:

The agent bypasses the storefront entirely. Shopware feeds the canonical product table in SQLite; the seeder derives the per-tenant Milvus collection from there; the merchant API exposes ACP, MCP, and UCP outwards; the multi-agent layer produces the answer.

The protocol layer

A human buyer hits HTML. An agent buyer hits a contract — actually three contracts, with different jobs:

ACP — checkout. The Agentic Commerce Protocol defines five RESTful endpoints under /api/checkout_sessions/... (create, get, update, complete, cancel). Auth via Authorization: Bearer <merchant-api-key>, JSON in/out, idempotent on Idempotency-Key. A typical create-checkout request looks like this:
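As a hedged illustration, the sketch below assembles such a request in Python without sending it. The header names and the /api/checkout_sessions path come from the paragraph above. The host, the API key, and the body fields (`items`, `id`, `quantity`) are assumptions for illustration and should be checked against the ACP specification.

```python
import json
import urllib.request
import uuid


def build_create_checkout_request(
    base_url: str,
    api_key: str,
    items: list[dict],
    idempotency_key: str,
) -> urllib.request.Request:
    """Assemble (but do not send) a create-checkout call.

    The body shape is an assumption for illustration, not the spec.
    """
    body = json.dumps({"items": items}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/checkout_sessions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Generate once per logical operation and reuse on every retry.
            "Idempotency-Key": idempotency_key,
        },
    )


# Hypothetical host, key, and product id.
key = str(uuid.uuid4())
req = build_create_checkout_request(
    "https://shop.example",
    "test-key",
    [{"id": "boot-001", "quantity": 1}],
    idempotency_key=key,
)
```

The idempotency key is passed in rather than generated inside the function, so a retry after a network error can send the same key and the shop can recognise the duplicate.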

The response carries the session id, the current status, and a messages array the agent can render or pass on to the customer. From there, the shop becomes a strict state machine:
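A minimal sketch of what "strict" can mean in code, assuming a small set of session statuses. Only `ready_for_payment` is named in this article. The other status names and the transition table are assumptions based on my reading of the public ACP material, so treat them as placeholders.

```python
from enum import Enum


class Status(str, Enum):
    NOT_READY = "not_ready_for_payment"  # assumed name
    READY = "ready_for_payment"          # named in this article
    COMPLETED = "completed"              # assumed name
    CANCELED = "canceled"                # assumed name


# Assumed transition table: terminal states allow nothing further.
ALLOWED = {
    Status.NOT_READY: {Status.NOT_READY, Status.READY, Status.CANCELED},
    Status.READY: {Status.NOT_READY, Status.READY, Status.COMPLETED, Status.CANCELED},
    Status.COMPLETED: set(),
    Status.CANCELED: set(),
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = transition(Status.NOT_READY, Status.READY)
state = transition(state, Status.COMPLETED)
```

Rejecting illegal moves at one choke point keeps every endpoint handler honest: a completed session can never be edited, no matter which handler receives the request.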

MCP — tool exposure. Model Context Protocol is what the agent uses before checkout: searching the catalog, manipulating the cart, fetching recommendations, asking service questions. We expose those operations as MCP tools through a dedicated apps-sdk service — eleven tools, all returning structured data, none rendering HTML. MCP is to agent exploration what ACP is to agent transaction.

UCP — discovery. Universal Commerce Protocol exposes /.well-known/ucp plus an A2A JSON-RPC 2.0 endpoint so an agent that has never heard of your shop can find out which protocols you support, what your checkout looks like, and what capabilities are available — in one round-trip. The agent card at /.well-known/agent-card.json declares those capabilities. Same idea as sitemap.xml for traditional crawlers, but for agent affordances.

Shopware's Store API speaks REST. That's the good news — the structural distance between a Shopware merchant and an ACP-compliant shop is smaller than it looks. The bad news is that none of the existing endpoints are shaped for an agent. They're shaped for a human-driven storefront. The work is the shaping, not the wiring.

The data layer

Faceted search assumes a human is willing to click "Outdoor → Hiking → Footwear → Waterproof". An agent doesn't click. It embeds the query as a 1024-dimensional vector and asks a vector database for the nearest neighbours.
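To make "nearest neighbours" concrete, here is a toy brute-force version using three-dimensional vectors instead of 1024. The product ids and vectors are invented for illustration. A real index avoids comparing the query against every row, but the ranking idea is the same.

```python
import math


def nearest(query: list[float], catalog: dict[str, list[float]], top_k: int = 8) -> list[str]:
    """Rank product ids by Euclidean (L2) distance to the query vector."""
    ranked = sorted(catalog, key=lambda pid: math.dist(query, catalog[pid]))
    return ranked[:top_k]


# Toy embeddings; real ones come from the embedding model.
toy_catalog = {
    "lamp": [0.0, 0.0, 1.0],
    "shoe": [0.6, 0.4, 0.0],
    "boot": [0.9, 0.1, 0.0],
}
hits = nearest([1.0, 0.0, 0.0], toy_catalog, top_k=2)
```

Nothing in this function looks at category names or keywords. Closeness in vector space is the only ranking signal, which is why the text that gets embedded matters so much.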

The catalog therefore needs a parallel representation: not in MySQL, but in a vector index. We use Milvus 2.4 with one collection per tenant (product_catalog_<tenant_slug>), with this schema:

Field	Type	Notes
`id`	VARCHAR(100)	Primary key, mirrors `Product.id`
`sku`	VARCHAR(50)	—
`name`	VARCHAR(200)	—
`category`	VARCHAR(100)	Breadcrumb `"A > B > C"`
`price_cents`	INT64	—
`stock_count`	INT64	—
`text`	VARCHAR(2000)	The string actually embedded — `name + description + category`
`tenant_id`	VARCHAR(64)	Physical isolation, slug regex `[a-z0-9_]+`
`vector`	FLOAT_VECTOR(1024)	`nv-embedqa-e5-v5`-class embedding

Index: IVF_FLAT, metric L2, nlist=128. Loaded into memory via Collection.load() on first use.
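Expressed with the pymilvus ORM classes, the schema and index above could look roughly like this. It is a sketch rather than the platform's actual code: the helper names are hypothetical, and `create_catalog_collection` assumes a connection has already been opened with `pymilvus.connections.connect`.

```python
import re

SLUG_RE = re.compile(r"[a-z0-9_]+")
INDEX_PARAMS = {"index_type": "IVF_FLAT", "metric_type": "L2", "params": {"nlist": 128}}


def collection_name(tenant_slug: str) -> str:
    """Build the per-tenant collection name, enforcing the slug regex."""
    if not SLUG_RE.fullmatch(tenant_slug):
        raise ValueError(f"invalid tenant slug: {tenant_slug!r}")
    return f"product_catalog_{tenant_slug}"


def create_catalog_collection(tenant_slug: str):
    """Create the collection and its vector index (needs a live Milvus connection)."""
    # Imported here so the helpers above work without pymilvus installed.
    from pymilvus import Collection, CollectionSchema, DataType, FieldSchema

    fields = [
        FieldSchema(name="id", dtype=DataType.VARCHAR, max_length=100, is_primary=True),
        FieldSchema(name="sku", dtype=DataType.VARCHAR, max_length=50),
        FieldSchema(name="name", dtype=DataType.VARCHAR, max_length=200),
        FieldSchema(name="category", dtype=DataType.VARCHAR, max_length=100),
        FieldSchema(name="price_cents", dtype=DataType.INT64),
        FieldSchema(name="stock_count", dtype=DataType.INT64),
        FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=2000),
        FieldSchema(name="tenant_id", dtype=DataType.VARCHAR, max_length=64),
        FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=1024),
    ]
    schema = CollectionSchema(fields, description="Per-tenant product catalog")
    collection = Collection(collection_name(tenant_slug), schema)
    collection.create_index(field_name="vector", index_params=INDEX_PARAMS)
    return collection
```

Validating the slug before it ever reaches a collection name is what keeps the isolation physical: a malformed slug fails loudly instead of silently creating or hitting the wrong collection.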

The interesting field is text. It is not the product description as the marketing team wrote it. It's a synthesized string assembled for the embedding model: name first (it's the most reliable signal across catalogs, because descriptions vary wildly in quality while names rarely do), then a stripped-HTML description, then a Category: ... tail to anchor the breadcrumb. This is the difference between a Shopware shop that retrieves well and one that retrieves badly. Two shops with identical SKUs can rank completely differently in the agent surface based on how the embedding text is constructed.
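A minimal sketch of that assembly step, using only the standard library. The function names, the space separators, and the choice to trim the description (rather than the category tail) when the string would exceed the 2000-character field are assumptions for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes and ignore all tags."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def strip_html(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the whitespace left behind by removed tags.
    return " ".join(" ".join(parser.chunks).split())


def build_embedding_text(
    name: str, description_html: str, category_path: list[str], max_len: int = 2000
) -> str:
    """Name first, then plain-text description, then a 'Category: A > B' tail."""
    head = name.strip()
    tail = f"Category: {' > '.join(category_path)}"
    # Reserve room for head, tail, and the two joining spaces.
    budget = max(max_len - len(head) - len(tail) - 2, 0)
    body = strip_html(description_html)[:budget]
    return " ".join(part for part in (head, body, tail) if part)


text = build_embedding_text(
    "Trail Boot", "<p>Waterproof <b>leather</b> boot.</p>", ["Outdoor", "Footwear"]
)
```

Trimming the description instead of the whole string means a very long description can never push the breadcrumb out of the embedded text.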

The agent layer

The single chatbot is a dead pattern. A useful agentic-commerce stack is multi-agent because the jobs aren't the same shape. We run four:

Agent	Role	Retrieval
`search`	Catalog retrieval — RAG with reranker	Milvus per-tenant collection
`recommendation`	Cross-sell / upsell — Agentic RAG with two-stage scoring	Milvus per-tenant collection
`promotion`	Pricing / promo arbiter — strategy, no retrieval	—
`post-purchase`	Multilingual shipping & service messages	—

Each agent runs as its own service with its own config, its own tracing, and its own failure mode. When recommendation times out, search still answers. When promotion decides not to discount, post-purchase still ships. This is operationally different from a single-prompt chatbot in two ways: (a) you can swap one agent's model without touching the other three, and (b) you can put a budget on each agent independently.
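The failure-isolation idea can be sketched with asyncio. In this example coroutines stand in for the separate agent services, and the agent delays, budgets, and product ids are invented. The point is only that one agent blowing its budget yields a fallback instead of failing the whole turn.

```python
import asyncio


async def call_with_budget(name: str, coro, timeout_s: float, fallback):
    """Await one agent, returning the fallback if it exceeds its time budget."""
    try:
        return name, await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        return name, fallback


async def fake_agent(delay_s: float, answer):
    """Stand-in for a network call to an agent service."""
    await asyncio.sleep(delay_s)
    return answer


async def answer_turn() -> dict:
    results = await asyncio.gather(
        # search answers well inside its budget
        call_with_budget("search", fake_agent(0.01, ["boot-001"]), 0.5, []),
        # recommendation is too slow and gets cut off
        call_with_budget("recommendation", fake_agent(1.0, ["sock-007"]), 0.05, []),
    )
    return dict(results)


turn = asyncio.run(answer_turn())
```

Because each call carries its own timeout and fallback, the budgets can be tuned per agent without touching the others.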

The two retrieval-bound agents (search, recommendation) bind the per-tenant collection at config time:

We chose collection-per-tenant over filters-on-shared-collection because the agent toolkit's retriever config doesn't expose filters in its optional-fields whitelist — it hard-codes collection_name, top_k, output_fields, search_params, and vector_field. That's a small detail with a big implication: physical isolation comes for free, you just have to respect the slug regex.
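As a rough illustration of that constraint, the sketch below builds a retriever config from the five whitelisted fields named above and rejects anything else, including `filters`. The builder function is hypothetical and only mimics the whitelist behaviour; the top_k and nprobe values are the ones used elsewhere in this article.

```python
ALLOWED_KEYS = {"collection_name", "top_k", "output_fields", "search_params", "vector_field"}


def build_retriever_config(tenant_slug: str, **overrides) -> dict:
    """Return a per-tenant retriever config, refusing non-whitelisted keys."""
    unknown = set(overrides) - ALLOWED_KEYS
    if unknown:
        raise KeyError(f"not supported by the retriever config: {sorted(unknown)}")
    config = {
        "collection_name": f"product_catalog_{tenant_slug}",
        "top_k": 8,
        "output_fields": ["id", "sku", "name", "category", "price_cents", "stock_count"],
        "search_params": {"metric_type": "L2", "params": {"nprobe": 16}},
        "vector_field": "vector",
    }
    config.update(overrides)
    return config


search_config = build_retriever_config("acme_shop")
```

With no filter slot available, the collection name is the only place tenant scoping can live, which is why the isolation ends up physical rather than logical.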

What we built

Setup

Everything below runs against this stack:

Application services: 9 containers — nginx, merchant (FastAPI + SQLModel), psp, apps-sdk (FastAPI + MCP server), ui (Next.js 15), four agents
Infrastructure: Milvus 2.4 standalone (etcd + MinIO), Phoenix for LLM tracing
Data: SQLite via SQLModel for transactional state (Customer, Product, BrowseHistory, CheckoutSession, telemetry); Milvus for retrieval
Sync: Shopware Store API → SQLModel Product upsert on (tenant_id, source="shopware", external_id), every 30 minutes via Ofelia
Embeddings: Hosted embedding API, Mistral-class quality, 1024-dim
Catalog size in test: 5,000–25,000 products per tenant

The Shopware connection is a direct sync. We pull products from /store-api/product, strip HTML out of the description, join the category tree into a breadcrumb, upsert idempotently, and trigger a tenant-scoped Milvus reseed in the same process. After a successful sync with inserted + updated > 0, the seeder embeds the tenant's source="shopware" slice and replaces that tenant's rows in the per-tenant Milvus collection. Other tenants are untouched.
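The idempotent upsert can be sketched with plain sqlite3 rather than SQLModel. The table is reduced to a few columns, and the product values are invented. The unique key `(tenant_id, source, external_id)` is the one described above. The returned change count plays the role of `inserted + updated`.

```python
import sqlite3

SCHEMA = """
CREATE TABLE product (
    id INTEGER PRIMARY KEY,
    tenant_id TEXT NOT NULL,
    source TEXT NOT NULL,
    external_id TEXT NOT NULL,
    name TEXT NOT NULL,
    price_cents INTEGER NOT NULL,
    UNIQUE (tenant_id, source, external_id)
)
"""

UPSERT = """
INSERT INTO product (tenant_id, source, external_id, name, price_cents)
VALUES (:tenant_id, 'shopware', :external_id, :name, :price_cents)
ON CONFLICT (tenant_id, source, external_id) DO UPDATE SET
    name = excluded.name,
    price_cents = excluded.price_cents
WHERE name != excluded.name OR price_cents != excluded.price_cents
"""


def sync(conn: sqlite3.Connection, tenant_id: str, products: list[dict]) -> int:
    """Upsert products and return how many rows were inserted or updated."""
    before = conn.total_changes
    for product in products:
        conn.execute(UPSERT, {"tenant_id": tenant_id, **product})
    conn.commit()
    return conn.total_changes - before


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
boot = {"external_id": "sw-1", "name": "Trail Boot", "price_cents": 24900}
changed = sync(conn, "acme_shop", [boot])
if changed > 0:
    pass  # this is where a tenant-scoped reseed would be triggered
```

The `WHERE` clause on the update means an unchanged product counts as zero changes, so a sync that finds nothing new does not trigger a reseed.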

The numbers that should drive your sizing

A note on what this section is. The measured operating numbers (milliseconds per request, francs per thousand turns, GPU-hour breakdowns) belong in their own deep-dive, next in this series, where the methodology can stand up to scrutiny. Publishing approximate latencies here without that methodology would be exactly the kind of false precision that makes engineering posts untrustworthy. So: shape numbers below, measured numbers next.

Thresholds a Shopware merchant should know:

Number	Value	Why it matters
Index-tuning threshold	~500K embeddings per tenant	Below it, `IVF_FLAT` with `nlist=128` / `nprobe=16` is fine. Above it, `IVF_PQ` or `HNSW` becomes worth the operational complexity.
Sync cadence	30 minutes	The right answer for typical mid-market catalog change rates. Faster fights embedding-API rate limits without buying anything a merchant notices.
Top-k for retrieval	8	The reranker handles final ordering — the retriever just has to be recall-correct in the top 8.
Embedding dimension	1024	Determined by the embedding model class we use. Lower-dim alternatives (256, 384, 512) trade memory for recall on multilingual product text — a real choice for very large catalogs, not for our scale.
Agent count	4	One per distinct job-shape: two retrieval-bound (search, recommendation), one strategy (promotion), one messaging (post-purchase). Fewer agents conflate distinct job-shapes; more invent capabilities the merchant hasn't asked for.
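One shape number can be derived by arithmetic alone: the raw size of the vectors. The sketch assumes 32-bit floats, which is what a FLOAT_VECTOR field stores, and ignores index overhead and the scalar fields, so it is a lower bound rather than a measurement.

```python
BYTES_PER_FLOAT32 = 4


def raw_vector_bytes(num_products: int, dim: int = 1024) -> int:
    """Raw float32 storage for the vectors alone, without index or scalar fields."""
    return num_products * dim * BYTES_PER_FLOAT32


for n in (5_000, 25_000, 500_000):
    print(f"{n:>7} products: {raw_vector_bytes(n) / 1e6:,.1f} MB")
```

At 25,000 products that is about 102 MB of raw vectors per tenant, and at 500,000 it is about 2 GB. That gap is one reason the index choice only becomes interesting near the threshold in the table.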

Two shape conclusions hold without absolute numbers:

Vector search is cheap relative to LLM work. At 5K–25K products per tenant, retrieval sits well below where latency dominates the agent turn. The cost center is the reranker pass — not the vector index.
Catalog reseed is bulk-amortized. Embedding happens in a single bulk call per sync, not per product. That structural choice is what makes the 30-minute cadence feasible in the first place.

If you want the absolute numbers behind these shapes — embedding throughput, reranker latency distribution, cost per agent turn — the next post in this series measures them properly.

What we deliberately did not solve

Trade-offs are the part most posts skip. Here are ours.

No vector quantization. We use IVF_FLAT, not IVF_PQ or HNSW. At our test scale (5K–25K products per tenant) this is the right call — quantization would buy us scale we don't yet need at the cost of recall and tuning complexity. We'll revisit when a tenant crosses 500K embeddings, which is roughly where the index-tuning return on complexity flips.

No streaming inserts. Bulk reseed every 30 minutes is fine for the catalog change rate of typical mid-market Shopware shops. Real-time inserts would mean introducing Pulsar or Kafka, which is overkill for the per-tenant volume we see.

No agent for upselling at checkout. ACP's checkout state machine is strict — ready_for_payment should not fork into a sales conversation. We considered it and consciously left it out. Upsell happens in recommendation before the cart consolidates, not at the payment step.

No native Shopware plugin yet. The integration is still a Store-API consumer, not a plugin running inside Shopware. That's a deliberate phasing decision — the protocol layer matters more than the integration form factor. A plugin will follow once the contract is stable, not before.

No proprietary protocol invented. We don't ship a "MEMOTECH commerce protocol." The platform speaks ACP, MCP, and UCP — all open specs. Inventing yet another protocol would be a customer-lock-in tax that the open-source base of this platform explicitly tries to avoid.

What this means for Shopware merchants — the next six months

This is the part most engineering posts on agentic commerce skip, so let me be concrete. If you operate a Shopware shop and you're reading this in mid-2026, here are the four moves I'd make in priority order.

1. Audit your product data for embeddability. Open the descriptions of your top fifty SKUs. Are they marketing copy that assumes a human reader, or are they structured statements an embedding model can encode? "Premium boot for the modern adventurer" is bad input. "Waterproof leather hiking boot, ankle-high, GORE-TEX membrane, rated to -10°C, weight 540g per shoe" is good input. The catalog you already have — minus the marketing layer, plus the structured attributes — is the catalog the agent will rank.

2. Stand up a Store API consumer. Whether you build it yourself, buy it, or partner — get a process in place that pulls your products out of Shopware on a schedule and lands them in a vector index. The 30-minute cadence is fine. The integration shape is well-understood. The work is mostly in field mapping (HTML stripping, category breadcrumbs, image URLs) and idempotent upsert keys.

3. Decide your agent posture before Shopware decides for you. Shopware's June 2026 announcement will define a default. Once the default ships, the cheap path will be "use whatever Shopware bundles." That's fine for some merchants. For others — especially regulated, multi-shop, or DACH-language merchants — the default won't fit, and the time to plan the alternative is now, not after the announcement.

4. Don't rip out Shopware. Don't wait for native ACP either. The position I keep arriving at, both for our platform and for clients I advise, is: Shopware stays as the system of record. The agent layer sits beside it, talks to it via Store API, and exposes ACP/UCP outwards. This is unsexy and correct. The merchants I see making the most progress are the ones who treated this as a sidecar problem, not a replatforming problem.

What's still open

Three honest questions I haven't answered yet, and would genuinely like to compare notes on:

1. Where exactly does the index-tuning inflection sit? We say IVF_FLAT is fine below 500K and IVF_PQ / HNSW becomes worth the complexity above it — but that 500K is a folk heuristic from public benchmarks, not our own measurement. We'll have real data once a tenant crosses 250K. Anyone running larger Milvus catalogs in production — I'd like to hear where you actually saw the inflection on recall and tail latency.

2. Where does the MCP/ACP boundary sit long-term? Today we draw it cleanly: MCP for exploration (search, cart, recommendations), ACP for the transaction (checkout state machine). Some clients ask whether stateful flows — multi-turn cart edits, service-handoffs, returns conversations — should live on MCP throughout, with ACP only as the final commit step. We currently bet on the exploration/transaction split. If the boundary moves, we have to follow.

3. How well does multilingual embedding hold up for DACH-specific text? Product descriptions in Swiss High German (with regional terms like Velo or parkieren alongside standard German Fahrrad / parken), Suisse-Romande French labelling, Italian-Swiss long-tail products — embedding behaviour across these varies in ways our German-first benchmark doesn't surface. We test standard German first; the Swiss-specific recall losses are still under-measured.

If you've solved any of these — or have hit walls in the same places — I'd genuinely like to hear about it.

References

Why Your First Commit on Any Fork Should Be a License Audit — Part 3 of this series on the license-boundary work that preceded this build.
ACP — Agentic Commerce Protocol, OpenAI + Stripe specification, 2026. Spec · Repo.
UCP — Universal Commerce Protocol, agent discovery via /.well-known/ucp. Spec · Repo.
Six-protocol comparison (ACP, A2A, UCP, AP2, AXP, StoreSync) — memotech.ch/agentic-commerce.
Shopify ChatGPT Integration — public announcement, March 2026 (referenced on memotech.ch/agentic-commerce).
Shopware Roadmap 2026 — public announcement of agent features, June 2026 (referenced on memotech.ch/agentic-commerce).
Milvus 2.4 — Apache-2.0 vector database, used here for per-tenant collections. milvus.io.

External links retrieved 2026-05-01.

Mehmet Gökçe is a software & data engineer with IT experience since 1998. He runs MEMOTECH (Swiss-based, St. Gallen) and publishes regularly on Agentic Commerce, multi-agent architectures, and e-commerce engineering.

Running a Shopware shop and trying to figure out where the agent layer fits without replatforming? That sidecar problem — exposing ACP/UCP outwards while keeping Shopware as system of record — is what we work on every week. If you're stuck on it, get in touch.

