case studies

Three deployments, anonymized.

Client work stays anonymous on principle — we’ll walk through the specifics on a call. One of these is our own back office, which needs no anonymizing at all.

case 01 · support

Full-inbox AI support for a UK / EU fashion brand

the starting point

~50 cases/day, ~1,500/month — roughly 1.25 FTE of work.
Tight margins; return shipping ate a real chunk of order value.
Orders flowing through a storefront and a separate payment platform that didn’t always agree.
A strong incentive to settle return disputes with partial refunds rather than full returns.

what we built & the outcome

Sophie end to end: cross-system order investigation, AI reply in brand voice, structured state.
Two-ladder negotiation + full money execution behind guards (verified against the real processor, idempotent, owner-checked).
Manual-takeover detection routes flagged senders to humans with context.
Result: ~80% of return requests resolved without a label issued; ~99% uptime; failures surface in monitoring, not complaints; cost a fraction of the hire it replaced.
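For readers wondering what "behind guards" can mean in practice, here is a minimal sketch of the three checks named above: idempotency, verification against the processor, and an owner check. It is an illustration under stated assumptions, not the production code. `Charge`, `RefundGuard`, the in-memory `processor` dict, and the status strings are all hypothetical names, and a real deployment would query the payment processor's API rather than a dict.

```python
from dataclasses import dataclass, field


@dataclass
class Charge:
    """What the payment processor reports for an order (hypothetical shape)."""
    order_id: str
    customer_email: str
    amount_cents: int
    refunded_cents: int = 0


@dataclass
class RefundGuard:
    # Stand-in for a live processor lookup, keyed by order id (assumption).
    processor: dict[str, Charge]
    # Idempotency keys already executed, mapped to the refunded amount.
    executed: dict[str, int] = field(default_factory=dict)

    def refund(self, key: str, order_id: str, requester: str, amount_cents: int) -> str:
        # Idempotency: a repeated key never moves money twice.
        if key in self.executed:
            return "duplicate"
        # Verification: the charge must exist at the processor.
        charge = self.processor.get(order_id)
        if charge is None:
            return "rejected: unknown charge"
        # Owner check: only the customer on the charge can be refunded.
        if requester.lower() != charge.customer_email.lower():
            return "rejected: not the order owner"
        # Amount check: never refund more than what is still refundable.
        remaining = charge.amount_cents - charge.refunded_cents
        if amount_cents <= 0 or amount_cents > remaining:
            return "rejected: amount out of range"
        charge.refunded_cents += amount_cents
        self.executed[key] = amount_cents
        return "refunded"
```

The order of the checks is the design choice worth noting: the idempotency key is consulted first so a retried request short-circuits before any other logic runs, and money only moves after every check has passed.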

case 02 · finance

Self-reconciling bookkeeping — our own

the starting point

Invoices in a folder, transactions in an app, VAT prep as a quarterly archaeology project.

what we built & the outcome

Bank-first ingestion from a multi-currency business account — every transaction lands in the ledger automatically.
Automatic matching of transactions to invoices, in and out.
A web app with an exception case board: anything ambiguous waits for a human to confirm, decline, or modify.
Result: the ledger stays continuously current instead of being reconstructed each quarter; human time collapses to reviewing exceptions; every number traces to a source.
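The matching-plus-exceptions idea can be sketched in a few lines. This is a simplified illustration, not our actual matcher: the `Invoice` and `Transaction` shapes, the exact-amount rule, and the 30-day window are all assumptions chosen for the example. The point it shows is the split itself: a transaction is matched automatically only when exactly one open invoice fits, and everything else goes to the exception board.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Invoice:
    number: str
    amount: Decimal
    issued: date


@dataclass(frozen=True)
class Transaction:
    ref: str
    amount: Decimal
    booked: date


def match(transactions, invoices, window_days=30):
    """Return (matched, exceptions); auto-match only unambiguous cases."""
    open_invoices = list(invoices)
    matched, exceptions = [], []
    for tx in transactions:
        # Candidate rule (assumed): same amount, dated within the window.
        candidates = [
            inv for inv in open_invoices
            if inv.amount == tx.amount
            and abs((tx.booked - inv.issued).days) <= window_days
        ]
        if len(candidates) == 1:
            matched.append((tx.ref, candidates[0].number))
            open_invoices.remove(candidates[0])  # an invoice is matched once
        else:
            # Zero or several candidates: leave the decision to a human.
            exceptions.append((tx.ref, [inv.number for inv in candidates]))
    return matched, exceptions
```

Amounts use `Decimal` rather than floats so that equality comparisons on money are exact.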

case 03 · marketing

Ad-account forensics for an e-commerce advertiser

the starting point

Platform dashboards showed a healthy gross ROAS; the bank account disagreed.
A traffic trough that looked like a tracking failure or a demand collapse, depending on who you asked.
A merchant-feed suspension muddying the picture.

what we did & the outcome

Rebuilt the account’s economics from raw data: spend vs. contribution margin net of returns.
Verified the tracking pipeline end to end; separated measurement artifacts from real demand.
Isolated the feed suspension’s actual impact (free listings only — paid was unaffected).
Result: the account was break-even to negative net of returns despite a healthy gross ROAS, so the budget was not scaled into a loss; a standing automated audit now reruns the checks.
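The gap between a healthy gross ROAS and a loss net of returns is simple arithmetic, sketched below. The function name, its parameters, and the numbers in the usage note are made up for illustration and are not the client's figures. The model is also deliberately simplified: it treats returned revenue as lost and applies one blended variable-cost rate, ignoring return shipping and restocking costs.

```python
def net_roas_view(spend, gross_revenue, return_rate, variable_cost_rate):
    """Compare the dashboard view (gross ROAS) with contribution after ads."""
    # Revenue that stays with the business once returns are refunded.
    kept_revenue = gross_revenue * (1 - return_rate)
    # Contribution margin: kept revenue minus variable costs (goods, shipping, fees).
    contribution = kept_revenue * (1 - variable_cost_rate)
    return {
        "gross_roas": gross_revenue / spend,
        "contribution_after_ads": contribution - spend,
    }
```

With illustrative inputs of 10,000 in spend, 40,000 in gross revenue, a 30% return rate, and 65% variable costs, the gross ROAS is 4.0 while contribution after ads comes out at about -200: the dashboard looks healthy and the account loses money.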

Curious what this looks like for your business?

We can talk through any of these in detail — the prompts, the guard design, and what went wrong along the way. The honest version is more useful than the polished one.

Talk to us →