Sample audit

AI Integration Audit — an excerpt.

Disclaimer. Fabricated example. Real audits are confidential and tailored to your codebase — this page just shows what 5 days of senior engineering review actually produces.

Audit for Loop, a hypothetical Shopify-app SaaS adding AI product description generation. Two of the five deliverable sections shown in full; the rest summarized.

Metadata

Client: Loop (hypothetical example)
Stack: Next.js 14, Node 20, Postgres, Shopify Storefront API
MAU: ~2,400 active stores
Audit window: 5 days, async
Author: Robin Solanki

Executive summary

Loop has a clean codebase ready for AI integration. Of the three candidate features the team is considering, "AI product description generation" is the highest-value, lowest-cost ship. The recommended architecture uses Claude Haiku for cost efficiency at this scale; expected per-store cost is ~$0.40/month at p95 usage. Estimated 2-week build to staging, 3 weeks to production.

Section 1 of 5 — full

Codebase review

Architecture today.

Next.js App Router with API routes calling Postgres directly via Prisma. Background jobs in BullMQ on a Redis side-car. Frontend renders product description previews via SSR. No existing LLM dependencies.

Where AI fits cleanly.

A new API route at /api/generate-description taking product attributes (title, category, materials, price tier) and returning streaming text. The frontend already has a description editor; it just needs a "Generate" button wired to the new route.
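To make the route contract concrete, here is a minimal sketch of the attribute payload the route might accept and the prompt it could assemble before calling the model. The interface fields, function name, and prompt wording are illustrative assumptions, not Loop's actual code.

```typescript
// Hypothetical request shape for /api/generate-description.
// Field names are assumptions based on the attributes listed above.
export interface ProductAttributes {
  title: string;
  category: string;
  materials: string[];
  priceTier: "budget" | "mid" | "premium";
}

// Assemble a deterministic prompt from the attributes; the route would
// stream the model's completion of this prompt back to the editor.
export function buildPrompt(attrs: ProductAttributes): string {
  return [
    "Write a concise product description for an online store.",
    `Title: ${attrs.title}`,
    `Category: ${attrs.category}`,
    `Materials: ${attrs.materials.join(", ")}`,
    `Price tier: ${attrs.priceTier}`,
    "Tone: clear and factual. Length: 2-3 sentences.",
  ].join("\n");
}
```

Keeping prompt assembly in a pure function like this also makes it trivial to snapshot-test prompts independently of the streaming plumbing.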

Refactors needed first.

Two:

  1. Move the existing description-validation logic out of the React component into a shared lib/description.ts so the AI-generated output passes through the same validators (banned words, length limits, locale rules).
  2. The current Prisma schema has description as a single text column. Add a descriptionGeneratedBy enum ('human' | 'ai' | 'edited-ai') and a descriptionGeneratedAt timestamp for telemetry.
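The first refactor could look roughly like the following sketch of a shared lib/description.ts. The specific rules, limits, and names are placeholder assumptions; the point is that human-written and AI-generated text flow through the same validators.

```typescript
// Illustrative shared validator module (lib/description.ts).
// BANNED_WORDS and MAX_LENGTH are example values, not Loop's real rules.
const BANNED_WORDS = ["guarantee", "cure", "miracle"];
const MAX_LENGTH = 500;

export interface ValidationResult {
  ok: boolean;
  errors: string[];
}

// Runs every description, regardless of origin, through the same checks:
// length limits and banned words here; locale rules would slot in alongside.
export function validateDescription(text: string): ValidationResult {
  const errors: string[] = [];
  if (text.length > MAX_LENGTH) {
    errors.push(`too long: ${text.length} > ${MAX_LENGTH} characters`);
  }
  const lower = text.toLowerCase();
  for (const word of BANNED_WORDS) {
    if (lower.includes(word)) errors.push(`banned word: "${word}"`);
  }
  return { ok: errors.length === 0, errors };
}
```

The React component and the new generation route would both import this module, so a model update can never bypass rules the editor already enforces.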

Refactors to skip.

The team is considering moving to a microservices architecture before adding AI. Skip — there’s no microservices benefit at 2,400 MAU. Add the feature in the monolith first.

Existing infra you can reuse.

BullMQ for batch generation jobs (e.g. “regenerate all descriptions in this category”). Redis for response caching keyed by product attribute hash.
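The Redis cache key could be derived like this: a stable hash of the product attributes, so identical inputs reuse a prior generation instead of paying for a new one. The key prefix and field names are assumptions.

```typescript
import { createHash } from "node:crypto";

// Sketch of the attribute-hash cache key suggested above. The "desc:"
// prefix and field names are illustrative, not Loop's actual schema.
export function cacheKey(attrs: {
  title: string;
  category: string;
  materials: string[];
  priceTier: string;
}): string {
  // Canonicalize before hashing so field order and materials order
  // never produce spurious cache misses.
  const canonical = JSON.stringify({
    title: attrs.title,
    category: attrs.category,
    materials: [...attrs.materials].sort(),
    priceTier: attrs.priceTier,
  });
  return "desc:" + createHash("sha256").update(canonical).digest("hex");
}
```

A TTL on these keys (days, not hours) keeps the cache useful for batch regeneration jobs without serving stale copy after a merchant edits attributes.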

Section 2 of 5 — full

Feature prioritization

Candidate 1: AI product description generation (recommended)

User value: High — saves merchants 15–30 min per product
Engineering cost: 2 weeks (one engineer)
LLM cost: ~$0.40/store/month at p95 (1,500 generations/month/store, Haiku)
Risk: Low — generations are reviewed before publish, so model errors don't reach end-customers
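As a back-of-envelope check on the ~$0.40/store/month figure: the token counts below are illustrative assumptions, and the prices are roughly Claude 3 Haiku's published $0.25 / $1.25 per million input/output tokens, not the audit's exact model.

```typescript
// Placeholder per-token prices (≈ Claude 3 Haiku list pricing).
const INPUT_PRICE_PER_TOKEN = 0.25 / 1_000_000;
const OUTPUT_PRICE_PER_TOKEN = 1.25 / 1_000_000;

// Monthly LLM cost for one store, given usage and per-request token sizes.
export function monthlyCostPerStore(
  generationsPerMonth: number,
  inputTokens: number,
  outputTokens: number,
): number {
  const perGeneration =
    inputTokens * INPUT_PRICE_PER_TOKEN + outputTokens * OUTPUT_PRICE_PER_TOKEN;
  return generationsPerMonth * perGeneration;
}
```

At 1,500 generations/month with assumed ~500 input and ~150 output tokens per request, this lands around $0.47/store/month, the same ballpark as the quoted p95 figure.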
Candidate 2: Personalized email subject lines (do later)

User value: Medium — estimated 8–12% open-rate lift based on industry benchmarks
Engineering cost: 3 weeks (touches the email-sending path, which has more edge cases)
LLM cost: ~$1.20/store/month at p95
Risk: Medium — bad subject lines go straight to customers
Candidate 3: AI customer support chatbot (don't build)

User value: Low for Loop's segment (small Shopify stores rarely have support volume)
Engineering cost: 6+ weeks (RAG over store inventory, conversation memory, escalation paths)
LLM cost: Highly variable, $5–20/store/month
Risk: High — bad answers go directly to paying customers

Recommendation

Build Candidate 1 first as a 2-week sprint. Revisit Candidate 2 in Q3 once you have generation telemetry. Park Candidate 3 indefinitely.

Sections 3–5 — summarized

Remaining sections

Section 3 — Architecture for the top recommendation

The full audit covers the Claude Haiku vs. GPT-4o-mini cost math, prompt and chain design, an eval strategy built on a 200-item golden set, fallback behavior when the API rate-limits, and where to cache vs. where not to.

Section 4 — Cost projection

The full audit covers per-request cost at p50/p95/p99, monthly cost at the current ~2,400 MAU, projected cost at 10,000 MAU, and the point where self-hosting becomes cheaper than API calls: roughly 50,000 MAU at current rates.
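The shape of that breakeven calculation can be sketched as a linear cost comparison. All numbers below are placeholder assumptions chosen only to illustrate the structure; the full audit derives the ~50,000 MAU figure from Loop's actual usage and rates.

```typescript
// API spend scales linearly with stores; self-hosting is a large fixed
// cost plus a smaller per-store slope. Breakeven is where the lines cross:
//   apiCost * n = fixed + variable * n
export function breakevenMau(
  apiCostPerStorePerMonth: number, // e.g. ~$0.40 from Section 2
  selfHostFixedPerMonth: number,   // GPUs + ops, independent of store count
  selfHostVariablePerStore: number // marginal serving cost per store
): number {
  return selfHostFixedPerMonth /
    (apiCostPerStorePerMonth - selfHostVariablePerStore);
}
```

With an assumed $0.40/store API cost, $17.5k/month fixed self-hosting cost, and $0.05/store marginal cost, the lines cross near 50,000 stores, which is why breakeven sits far above Loop's current scale.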

Section 5 — Build sequence

The full audit lays out week-by-week tasks: prompt drafting and eval setup → MVP API route → frontend wiring → batch regeneration → telemetry dashboards. It also includes a risk register: rate-limit blowouts, prompt injection through merchant input, and output drift after model updates.

What the full audit looks like

This is roughly 30% of what a real audit looks like. The full doc is 12–15 pages, indexed by section, with code references and architecture diagrams. Services are currently paused — talk to me and we’ll figure out timing.