Do you train models from scratch?

Almost never. We bias toward frontier hosted models plus retrieval-augmented generation (RAG) over your data, and only fine-tune when prompt engineering and RAG have hit a ceiling. The interesting work is the evaluation harness. The rest is configuration.

How do you keep client data out of model providers?

By contract and by architecture. We deploy on enterprise tiers that contractually exclude training, in the region the client requires. When nothing can leave the network, we run open-weight models on the client's own cloud.

What is the realistic timeline for a first AI feature in production?

Six to ten weeks from kick-off to a deployed feature with a written evaluation, a non-AI fallback, and a dashboard the team owns. Faster is usually a mistake: the work that gets compressed is the work that decides whether the feature survives month three.

Engineering domain · Workflows, agents & custom models

AI and machine learning

SDEN audits the AI integrations a business already runs, designs the custom workflows it should run next (RAG, agents, classification, generation), and ships them to production with the evaluation harnesses that keep them honest.

What this domain covers

Most CEOs and founders we meet are already running AI, usually three or four tools, often a homemade ChatGPT workflow, sometimes a vendor agent nobody has audited. The question is rarely whether to use AI. It is which of those integrations is load-bearing, which is leaking trust, and what should be built in-house instead.

SDEN's AI engagements take three shapes. First, an AI audit: an inventory of every AI integration in the business, what data it touches, where it sits in critical paths, and a ranked remediation backlog with a build-or-buy verdict for each item. Second, custom AI workflows: designed against a measurable outcome, shipped with an evaluation harness, owned by the client. Third, embedded AI engineering, where SDEN sits inside an existing team as the discipline lead until the team can carry the work itself.

The hard part of shipping AI is not picking a model. It is deciding what to measure, building the evaluation harness that measures it, and keeping a live feedback loop running once the product is in production. We start every AI engagement with the question the model is supposed to answer for the user, and we refuse to write code until we agree on how we will know whether the answer is good. From there we choose the simplest architecture that meets the bar: a well-prompted hosted model where it works, retrieval-augmented generation (RAG) over your data where the answers depend on private content, and fine-tuning only when prompt engineering and RAG have hit a ceiling.
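To make "how we will know whether the answer is good" concrete, here is a minimal sketch of an evaluation harness in Python. It is an illustration, not SDEN's actual tooling: the names EvalCase, pass_rate and meets_bar are hypothetical, and the substring check stands in for whatever pass criterion is agreed with the client.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    """One held-out example: a user question and a phrase a good answer must contain.

    The substring criterion is an assumption made for this sketch; a real
    harness might use graded rubrics or human review instead.
    """
    question: str
    must_contain: str


def pass_rate(answer_fn: Callable[[str], str], cases: Sequence[EvalCase]) -> float:
    """Return the fraction of cases whose answer contains the expected phrase."""
    if not cases:
        raise ValueError("the held-out set must not be empty")
    passed = sum(
        1
        for case in cases
        if case.must_contain.lower() in answer_fn(case.question).lower()
    )
    return passed / len(cases)


def meets_bar(answer_fn: Callable[[str], str], cases: Sequence[EvalCase], bar: float) -> bool:
    """True when the candidate (prompt, RAG pipeline, or fine-tune) clears the agreed bar."""
    return pass_rate(answer_fn, cases) >= bar
```

Because answer_fn is just a function from question to answer, the same held-out set can score a well-prompted hosted model, a RAG pipeline, and a fine-tuned model side by side, which is what lets the simplest architecture that meets the bar win.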

Production-readiness for AI features at SDEN means a documented latency budget, a per-request cost ceiling, deterministic guardrails on the inputs and outputs (PII redaction, jailbreak detection, refusal taxonomy), and a logged evaluation pipeline that runs against a held-out set every time the prompt or the model changes. Models are commodities; the evaluation discipline is the moat.
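As a rough sketch of what those production-readiness checks can look like in code, the Python below pairs one deterministic guardrail (email redaction, as a simple stand-in for PII redaction) with a release gate that compares an evaluation run against the documented budgets. Every name and threshold here (redact_emails, ReleaseBudget, EvalRun, gate_failures) is a hypothetical chosen for illustration, and the email pattern is deliberately simplistic.

```python
import re
from dataclasses import dataclass
from typing import List

# Simplistic email pattern, assumed for this sketch; real PII redaction covers far more.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Deterministic guardrail: replace anything that looks like an email address."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)


@dataclass(frozen=True)
class ReleaseBudget:
    """The documented limits a feature must stay within (hypothetical fields)."""
    max_p95_latency_ms: float
    max_cost_per_request_usd: float
    min_pass_rate: float


@dataclass(frozen=True)
class EvalRun:
    """Measurements from one run of the evaluation pipeline on the held-out set."""
    p95_latency_ms: float
    cost_per_request_usd: float
    pass_rate: float


def gate_failures(run: EvalRun, budget: ReleaseBudget) -> List[str]:
    """Return the list of violated limits; an empty list means the change may ship."""
    failures: List[str] = []
    if run.p95_latency_ms > budget.max_p95_latency_ms:
        failures.append("latency budget exceeded")
    if run.cost_per_request_usd > budget.max_cost_per_request_usd:
        failures.append("per-request cost ceiling exceeded")
    if run.pass_rate < budget.min_pass_rate:
        failures.append("held-out pass rate below bar")
    return failures
```

Running a gate like this on every prompt or model change keeps the decision to ship mechanical: the change either stays inside the agreed budgets or it names the limit it broke.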

What we ship by default