AI and machine learning
SDEN audits the AI integrations a business already runs, designs the custom workflows it should run next, and ships them to production (RAG, agents, classification, generation) with the evaluation harnesses that keep them honest.
What this domain covers
Most CEOs and founders we meet are already running AI, usually three or four tools, often a homemade ChatGPT workflow, sometimes a vendor agent nobody has audited. The question is rarely whether to use AI. It is which of those integrations is load-bearing, which is leaking trust, and what should be built in-house instead. SDEN's AI engagements take three shapes. First, an AI audit: an inventory of every AI integration in the business, what data it touches, where it sits in critical paths, and a ranked remediation backlog with a build-or-buy verdict for each item. Second, custom AI workflows: designed against a measurable outcome, shipped with an evaluation harness, owned by the client. Third, embedded AI engineering, where SDEN sits inside an existing team as the discipline lead until the team can carry the work itself.
The hard part of shipping AI is not picking a model. It is deciding what to measure, building the evaluation harness that measures it, and keeping a live feedback loop running once the product is in production. We start every AI engagement with the question the model is supposed to answer for the user, and we refuse to write code until we agree on how we will know whether the answer is good. From there we choose the simplest architecture that meets the bar: a well-prompted hosted model where it works, retrieval-augmented generation (RAG) over your data where the answers depend on private content, and fine-tuning only when prompt engineering and RAG have hit a ceiling.
Production-readiness for AI features at SDEN means a documented latency budget, a per-request cost ceiling, deterministic guardrails on the inputs and outputs (PII redaction, jailbreak detection, refusal taxonomy), and a logged evaluation pipeline that runs against a held-out set every time the prompt or the model changes. Models are commodities; the evaluation discipline is the moat.
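As a rough illustration of that release gate, here is a minimal sketch in Python. It assumes exact-match scoring against a held-out set and worst-case latency and cost checks; the names `GoldenExample`, `ModelResult`, `ReleaseBudget`, and `evaluate_candidate` are hypothetical and not part of any SDEN tooling or vendor SDK. A real harness would likely use task-specific scoring and percentile latency instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class GoldenExample:
    prompt: str
    expected: str  # reference answer from the held-out set


@dataclass(frozen=True)
class ModelResult:
    answer: str
    latency_ms: float
    cost_usd: float


@dataclass(frozen=True)
class ReleaseBudget:
    min_accuracy: float    # quality bar agreed before any code is written
    max_latency_ms: float  # documented latency budget (worst case, an assumption)
    max_cost_usd: float    # per-request cost ceiling


def evaluate_candidate(
    run_model: Callable[[str], ModelResult],
    golden: Sequence[GoldenExample],
    budget: ReleaseBudget,
) -> Dict[str, object]:
    """Run a candidate prompt/model over the held-out set and check it against the budget."""
    if not golden:
        raise ValueError("the held-out set must not be empty")

    results = [run_model(example.prompt) for example in golden]

    # Exact match after normalisation is an assumed scoring rule for this sketch.
    correct = sum(
        result.answer.strip().lower() == example.expected.strip().lower()
        for result, example in zip(results, golden)
    )
    accuracy = correct / len(golden)
    worst_latency = max(result.latency_ms for result in results)
    worst_cost = max(result.cost_usd for result in results)

    failures: List[str] = []
    if accuracy < budget.min_accuracy:
        failures.append("accuracy")
    if worst_latency > budget.max_latency_ms:
        failures.append("latency")
    if worst_cost > budget.max_cost_usd:
        failures.append("cost")

    return {
        "accuracy": accuracy,
        "worst_latency_ms": worst_latency,
        "worst_cost_usd": worst_cost,
        "failures": failures,
        "ship": not failures,
    }
```

Wired into CI, a gate like this would run on every prompt or model change and block the release whenever `ship` is false, with the returned report logged for later comparison.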
AI and machine learning: the SDEN defaults
Defaults we ship
- AI integration audit with a remediation backlog scoped into shippable issues
- OpenAI, Anthropic Claude, and open-weight models, chosen by cost, latency, and privacy requirements
- RAG with hybrid retrieval (semantic + lexical) and explicit citation
- Offline eval harness plus an online A/B test before any prompt or model change ships
- PII redaction and prompt-injection guardrails at the boundary
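To make the hybrid retrieval default above concrete, here is one common way to merge a semantic ranking with a lexical one: reciprocal rank fusion (RRF). This is an illustrative choice, not a statement of the fusion method SDEN uses. The function name, the document IDs, and the constant `k = 60` (a conventional value for RRF) are assumptions for this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a keyword (e.g. BM25) search.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
lexical_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([semantic_hits, lexical_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Because the fused list keeps document IDs, each passage handed to the model can carry its ID through to the answer, which is what makes explicit citation possible.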
Deliverables
- AI audit report: inventory, risk register (OWASP LLM Top 10 + data exposure), and a ranked remediation backlog
- Use case definition with measurable success criteria
- Evaluation harness committed to your repo with a golden dataset
- Production runtime with latency, cost, and quality dashboards
- Guardrails: input validation, output filtering, refusal handling
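As a sketch of what a deterministic input guardrail from the list above might look like, the snippet below redacts email addresses and phone-like numbers before text reaches a model. The two regular expressions are deliberately simple and illustrative; they are assumptions for this example, not an exhaustive PII detector, and `redact_pii` is a hypothetical helper name.

```python
import re
from typing import Dict, Tuple

# Illustrative patterns only: production redaction needs broader, tested coverage.
_PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact_pii(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace matched PII with a label and report how many of each kind were found."""
    counts: Dict[str, int] = {}
    for label, pattern in _PII_PATTERNS.items():
        text, replaced = pattern.subn(f"[{label}]", text)
        counts[label] = replaced
    return text, counts


clean, found = redact_pii("Mail jane.doe@example.com or call +1 415 555 0100.")
print(clean)  # Mail [EMAIL] or call [PHONE].
print(found)  # {'EMAIL': 1, 'PHONE': 1}
```

Returning the counts alongside the redacted text lets the runtime log how often the guardrail fires without logging the sensitive values themselves.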
What we refuse to ship
We will not ship an AI feature without an evaluation harness. Demos that work in the founders' hands and break in production are how AI projects lose budget.