Which cloud should we be on?

It depends on the workload, the data-residency constraints, and the team's existing skills. For US and Canadian clients with strict residency requirements, the US and Canadian regions of AWS, GCP, and Azure are all viable. For AI-heavy workloads, the answer often involves splitting across providers. We make the choice against a written set of criteria, not on vendor preference.

Should we use Kubernetes?

Probably not unless you already do. Kubernetes is excellent at scale but expensive in operational cost for teams under fifty engineers. We default to managed services (Cloud Run, ECS, App Runner) and move to Kubernetes only when the workload genuinely needs it.

How do you control cloud spend without slowing development down?

By giving developers visibility, not by adding approval gates. Every developer can see what their feature costs in production; that visibility plus quarterly architecture reviews produces sustainable spend without bureaucracy. Approval gates produce gaming.

How do you handle backups and disaster recovery?

Encrypted, restorable, and tested. We restore from backups on a schedule, usually quarterly, to confirm the backups work. Untested backups are not backups. Restore-tested backups are the deliverable.
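As a rough illustration of what "restore-tested" can mean in practice, the sketch below flags any backup that is unencrypted or has not been restored successfully within the last quarter. The record fields, the 92-day window, and the backup names are assumptions made for the example, not a description of a specific tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Hypothetical policy: a backup counts as verified only if a restore from it
# succeeded within roughly the last quarter.
MAX_RESTORE_TEST_AGE = timedelta(days=92)


@dataclass
class BackupRecord:
    name: str
    encrypted: bool
    last_successful_restore: Optional[date]  # None means never restore-tested


def unverified_backups(records, today):
    """Return the names of backups that fail the encrypted-and-tested policy."""
    failing = []
    for record in records:
        stale = (
            record.last_successful_restore is None
            or today - record.last_successful_restore > MAX_RESTORE_TEST_AGE
        )
        if not record.encrypted or stale:
            failing.append(record.name)
    return failing


# Made-up inventory for illustration only.
records = [
    BackupRecord("orders-db", True, date(2026, 1, 10)),
    BackupRecord("audit-log", True, None),
    BackupRecord("media-store", False, date(2026, 2, 1)),
]
print(unverified_backups(records, today=date(2026, 3, 1)))
# prints ['audit-log', 'media-store']
```

A check like this is only a report; the restore itself still has to be run against real data on the schedule.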

Do you set up the cloud once, or do you keep operating it?

Both shapes are available. Most engagements include a transfer to the client's team at the end. For clients without a platform engineering function, we offer day-2 operations as a separate engagement with explicit SLAs.

Cloud management in the AI era: from cost-out to capability

The premise

Cloud management in 2026 looks superficially like the cloud management of three years ago: the same providers, the same primitives, the same vocabulary. Underneath, the shape of the workload has changed enough that the playbook has to be rewritten.

Inference workloads, GPU spend, data-residency regulation, and the operational realities of running AI features at production scale have pushed cloud from a cost-out conversation to a capability conversation. The team that treats it only as a bill to be optimized is no longer competitive on what the application can do.

This article is about what changed, what the new operational defaults look like, and how SDEN approaches cloud engagements in this new environment.

Why this matters now

The bill is the symptom, not the disease

Cloud cost overruns in 2026 are almost always an architecture problem, not a discounting problem.

A familiar engagement pattern is the finance team escalating a cloud bill that doubled in six months. The reflex is to negotiate harder with the vendor or to chase the obvious culprits: idle instances, oversized databases, snapshots nobody is using. The reflex captures real savings on the first pass and almost nothing on the second.

The real driver of cost in 2026 cloud deployments is usually structural: a workload pattern that does not match the pricing model of the resources it runs on. Bursty AI inference on always-on instances. Analytical queries against the operational database. Network egress between regions that nobody noticed in the architecture review. Logs and traces collected at full fidelity, retained indefinitely.

Fixing the architecture produces order-of-magnitude savings. Negotiating the contract produces single-digit ones. The order matters.
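A back-of-the-envelope comparison makes the gap concrete. The sketch below takes one of the patterns named above, bursty inference on always-on instances, and compares a negotiated discount with a move to capacity that is billed only while busy. Every number in it (hourly rate, instance count, busy hours, pay-per-use premium, discount) is a made-up assumption chosen for illustration.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def always_on_monthly(hourly_rate, instances):
    """Cost of instances that run all month regardless of load."""
    return hourly_rate * instances * HOURS_PER_MONTH


def pay_per_use_monthly(hourly_rate, instances, busy_hours, premium):
    """Cost of capacity billed only while busy, at a higher unit price."""
    return hourly_rate * premium * instances * busy_hours


# All figures below are hypothetical.
RATE = 4.00        # USD per GPU instance hour
INSTANCES = 4      # fleet sized for peak load
BUSY_HOURS = 60    # hours per month the fleet is actually serving traffic
PREMIUM = 1.5      # pay-per-use capacity assumed 50% pricier per hour
DISCOUNT = 0.08    # negotiated contract discount

baseline = always_on_monthly(RATE, INSTANCES)
negotiated = baseline * (1 - DISCOUNT)
rearchitected = pay_per_use_monthly(RATE, INSTANCES, BUSY_HOURS, PREMIUM)

print(f"always-on baseline:   ${baseline:,.0f}")
print(f"after negotiation:    ${negotiated:,.0f}")
print(f"after rearchitecture: ${rearchitected:,.0f}")
print(f"baseline / rearchitected: {baseline / rearchitected:.1f}x")
```

Under these assumptions the contract discount trims 8% off the bill while matching the pricing model to the workload cuts it roughly eightfold. The exact ratio depends entirely on how bursty the workload really is.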

Fig.: The bill is the symptom, not the disease

What the discipline now covers

From provisioning to capability

Cloud engineering still includes the work it has always included: architecture design, provisioning through infrastructure-as-code, deployment automation, observability, and the operational discipline of running production. That is the floor.

What is new is the layer above the floor. Capacity planning for inference workloads with bursty, expensive GPU profiles. Multi-region architecture that respects data-residency regulation in a world where US, Canadian, and EU rules diverge meaningfully. Hybrid models where the operational workload runs on the cheapest available compute and the model serving runs where the latency and the licensing allow. None of these were core to the cloud engineer's job five years ago. All of them are now.

SDEN's cloud engagements increasingly look like architectural engagements (designing the shape of the deployment) rather than provisioning engagements. The provisioning has been automated for years; the design has not.

Fig.: From provisioning to capability

Where AI changes the work

Operational defaults for an AI-shaped workload

AI workloads break some of the assumptions classical cloud architectures rely on. They are bursty rather than steady, expensive per call rather than per byte, dependent on third-party providers whose latency and availability the cloud architect does not control, and sensitive to where the data is physically located in ways the rest of the application is not.

The operational defaults that work for this shape are different. Caching, batching, and graceful degradation at the application layer become first-class concerns. Provider abstraction at the model layer becomes mandatory, because every quarter the right model is a different model. And cost observability has to be wired into the application itself, not just into the cloud bill, because the unit economics of an AI feature are decided per request and need to be visible per request.
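One way to picture provider abstraction, caching, and per-request cost visibility working together is a thin gateway in front of the model call. The sketch below is a minimal illustration under stated assumptions: `ModelGateway`, the two stand-in providers, and their per-call prices are all hypothetical, and a real gateway would call actual provider SDKs and use a shared cache.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Completion:
    text: str
    cost_usd: float
    provider: str
    cached: bool = False


# A provider is any callable that takes a prompt and returns a Completion.
Provider = Callable[[str], Completion]


@dataclass
class ModelGateway:
    providers: Dict[str, Provider]
    active: str  # which provider to use; meant to be set from configuration
    cache: Dict[str, Completion] = field(default_factory=dict)
    ledger: List[dict] = field(default_factory=list)

    def complete(self, prompt: str, feature: str) -> Completion:
        hit = self.cache.get(prompt)
        if hit is not None:
            # Cache hits are recorded at zero model cost. The cache is keyed
            # by prompt only, so a hit may come from a previous provider.
            result = Completion(hit.text, 0.0, hit.provider, cached=True)
        else:
            result = self.providers[self.active](prompt)
            self.cache[prompt] = result
        # Per-request cost record, tagged with the feature that triggered it.
        self.ledger.append({
            "feature": feature,
            "provider": result.provider,
            "cost_usd": result.cost_usd,
            "cached": result.cached,
        })
        return result


# Stand-in providers with invented prices.
def provider_a(prompt: str) -> Completion:
    return Completion(f"A:{prompt}", 0.004, "provider-a")


def provider_b(prompt: str) -> Completion:
    return Completion(f"B:{prompt}", 0.001, "provider-b")


gateway = ModelGateway(
    providers={"provider-a": provider_a, "provider-b": provider_b},
    active="provider-a",
)
gateway.complete("summarize ticket 1", feature="support-summary")
gateway.complete("summarize ticket 1", feature="support-summary")  # cache hit
gateway.active = "provider-b"  # switching models is a one-line change
gateway.complete("summarize ticket 2", feature="support-summary")

for entry in gateway.ledger:
    print(entry)
```

The application code calls `complete` and never names a provider, which is what keeps a model swap out of the feature code.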

When SDEN designs cloud architecture for an AI-using product, this is where most of the work goes: not into Kubernetes manifests, but into the layer that decides what happens when the model takes too long, costs too much, or returns the wrong thing.
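The three failure cases named here (too slow, too expensive, wrong output) can each map to an explicit degradation path. The following is a minimal sketch using only the Python standard library; `guarded_call`, the fallback message, the cost estimator, and the validity check are hypothetical names and placeholder logic, not a production design.

```python
import concurrent.futures as cf
import time

FALLBACK = "Summary unavailable right now."


def guarded_call(model_fn, prompt, *, timeout_s, max_cost_usd,
                 estimate_cost, is_valid):
    """Return (text, reason); reason is 'ok' or the cause of degradation."""
    # Costs too much: refuse before spending anything.
    if estimate_cost(prompt) > max_cost_usd:
        return FALLBACK, "over_budget"
    pool = cf.ThreadPoolExecutor(max_workers=1)
    try:
        text = pool.submit(model_fn, prompt).result(timeout=timeout_s)
    except cf.TimeoutError:
        # Takes too long: stop waiting. The abandoned call keeps running in
        # its thread and may still be billed by the provider.
        return FALLBACK, "timeout"
    except Exception:
        return FALLBACK, "provider_error"
    finally:
        pool.shutdown(wait=False)  # never block the request on a slow call
    # Returns the wrong thing: validate before showing it to the user.
    if not is_valid(text):
        return FALLBACK, "invalid_output"
    return text, "ok"


# Stand-in models and checks, invented for the example.
def fast_model(prompt):
    return "Short summary of: " + prompt


def slow_model(prompt):
    time.sleep(0.3)
    return "too late"


def empty_model(prompt):
    return "   "


def estimate_cost(prompt):
    return 0.00001 * len(prompt)  # hypothetical price per character


def is_valid(text):
    return bool(text.strip())


def run(model_fn, prompt, timeout_s=2.0):
    return guarded_call(model_fn, prompt, timeout_s=timeout_s,
                        max_cost_usd=0.01, estimate_cost=estimate_cost,
                        is_valid=is_valid)


print(run(fast_model, "ticket 42"))
print(run(slow_model, "ticket 42", timeout_s=0.05))
print(run(empty_model, "ticket 42"))
print(run(fast_model, "x" * 5000))
```

Returning the reason alongside the text lets the caller log why a request degraded, which is the signal needed to tune timeouts and budgets later.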

Fig.: Operational defaults for an AI-shaped workload

Before / after

What changes in the cloud stack when AI shows up

Four practical shifts visible in production deployments where AI has become a load-bearing part of the application.

Before

Capacity is planned against steady-state CPU and memory profiles. The autoscaler keeps the cluster within range, and the bill is predictable.

After

Capacity is planned against bursty GPU profiles with order-of-magnitude swings between idle and peak. The autoscaler's decisions are now business decisions: more capacity means a higher cost per request.

Takeaway · Capacity planning becomes a unit-economics conversation, not an SRE conversation.

Before

Observability tracks request latency, error rates, and saturation: the classical golden signals.

After

Observability tracks model latency, model errors, model cost-per-request, model output quality, and the cache hit rate that determines whether the feature is solvent.

Takeaway · The dashboard the SRE looks at has new rows. Some of them belong to the product team.
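To see why the cache hit rate can decide whether a feature is solvent, consider a simple cost model: a cache hit skips the model call, and every request pays a small cache and serving overhead. The sketch below encodes that model; the function names and all the prices are assumptions for illustration.

```python
def cost_per_request(hit_rate, model_cost, cache_cost):
    """Expected cost when a fraction hit_rate of requests skips the model."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return (1.0 - hit_rate) * model_cost + cache_cost


def break_even_hit_rate(revenue, model_cost, cache_cost):
    """Lowest hit rate at which cost per request does not exceed revenue.

    Returns None when even a 100% hit rate cannot cover the cache overhead.
    """
    if revenue < cache_cost:
        return None
    needed = 1.0 - (revenue - cache_cost) / model_cost
    return max(0.0, needed)


# Hypothetical unit economics, in USD per request.
MODEL_COST = 0.02     # one uncached model call
CACHE_COST = 0.0005   # cache lookup and serving overhead
REVENUE = 0.008       # what one request is worth to the business

threshold = break_even_hit_rate(REVENUE, MODEL_COST, CACHE_COST)
print(f"break-even hit rate: {threshold:.1%}")
for rate in (0.5, 0.7):
    cost = cost_per_request(rate, MODEL_COST, CACHE_COST)
    status = "solvent" if cost <= REVENUE else "losing money"
    print(f"hit rate {rate:.0%}: ${cost:.4f} per request, {status}")
```

With these invented numbers the feature loses money at a 50% hit rate and is solvent at 70%, which is why the hit rate belongs on the same dashboard as latency.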

Before

The cloud bill is broken down by service: compute, storage, network.

After

The cloud bill is broken down by product feature, with model spend attributed to the workflow that triggered it. Finance can answer the question of what an AI feature actually costs.

Takeaway · Attribution becomes a first-class engineering deliverable, because the alternative is flying blind.
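Attribution of this kind usually comes down to tagging each billable event with the feature that triggered it and rolling the events up. The sketch below shows the roll-up step only; the event shape, the feature names, and the costs are hypothetical, and untagged spend is surfaced rather than hidden.

```python
from collections import defaultdict


def spend_by_feature(events):
    """Sum cost per feature, largest first; untagged spend is made visible."""
    totals = defaultdict(float)
    for event in events:
        totals[event.get("feature", "unattributed")] += event["cost_usd"]
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Invented usage events; in practice these would come from request logs
# or billing exports.
events = [
    {"feature": "support-summary", "cost_usd": 0.5},
    {"feature": "search-rerank", "cost_usd": 0.125},
    {"feature": "support-summary", "cost_usd": 0.25},
    {"cost_usd": 0.0625},  # missing tag: shows up as "unattributed"
]

for feature, total in spend_by_feature(events).items():
    print(f"{feature:<16} ${total:.4f}")
```

The size of the "unattributed" row is itself a useful metric: it measures how much of the bill nobody can yet explain.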

Before

Region selection is a one-time architectural decision at project kick-off.

After

Region selection is revisited every time a new regulation lands or a new provider opens a region, and the architecture is designed so that the move is a configuration change, not a rewrite.

Takeaway · Data-residency rules now move faster than projects. The architecture has to keep up.
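A minimal way to make a region move "a configuration change, not a rewrite" is to keep residency rules as data and resolve the region at the edge of the application. In the sketch below, the rule table, the region names, and `region_for` are illustrative assumptions; the region names are placeholders, not real provider identifiers.

```python
# Residency rules as data: jurisdiction -> regions allowed, in preference order.
RESIDENCY_RULES = {
    "CA": ["ca-central"],
    "US": ["us-east", "us-west"],
    "EU": ["eu-west"],
}

# Regions where the application is currently deployed.
DEPLOYED_REGIONS = {"us-east", "ca-central", "eu-west"}


def region_for(jurisdiction, rules, deployed):
    """Pick the first allowed region that is actually deployed."""
    for region in rules.get(jurisdiction, []):
        if region in deployed:
            return region
    # Fail loudly rather than silently storing data in the wrong place.
    raise LookupError(f"no compliant region deployed for {jurisdiction!r}")


print(region_for("US", RESIDENCY_RULES, DEPLOYED_REGIONS))  # prints us-east

# A new rule or a newly opened region is a data change, not a code change.
updated_rules = dict(RESIDENCY_RULES, CA=["ca-east", "ca-central"])
updated_regions = DEPLOYED_REGIONS | {"ca-east"}
print(region_for("CA", updated_rules, updated_regions))  # prints ca-east
```

The application code that stores or processes data calls `region_for` and never hard-codes a region, so a regulatory change touches the rule table and the deployment, not the feature.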

Fig.: What changes in the cloud stack when AI shows up

How SDEN ships cloud

Three defaults on every cloud engagement

We do not ship clouds. We ship architectures that happen to run on one. The pillars below are what we hold to.

Infrastructure as code, end-to-end

Every piece of the infrastructure is described in code, versioned, reviewed, and reproducible. We do not click-ops production. We do not click-ops staging either.

Cost observability at the feature level

Cost is attributed to features and traced to the requests that drove it. Surprises in the bill become exceptions, not the norm.

Region and provider portability where it matters

We pick one cloud as the home base, but the architecture keeps escape paths open for the workloads that may need them, whether for data-residency, sovereign-cloud, or cost-driven reasons.

What good looks like

The infrastructure the team trusts to grow into

Cloud success is invisible. The team uses the infrastructure and stops thinking about it.

A working cloud architecture lets the engineering team focus on the product, not on the plumbing. Deployments are routine. The bill is predictable, and when it is not, the team can explain why. Capacity is provisioned ahead of demand and decommissioned when demand ends. Regulators are answered with documentation that was generated, not retroactively assembled.

When SDEN finishes a cloud engagement, the test is simple: does the engineering team operate the system without us, comfortably, six months later? If they still need us, we have not done the job.

The technology under the hood matters less than this property. It can be AWS, GCP, Azure, or a hybrid; it can be Kubernetes, Nomad, or serverless. What it cannot be is a black box only one engineer understands.

Fig.: The infrastructure the team trusts to grow into

FAQ

Cloud: questions we get asked.

Direct answers to the questions we get asked the most. If yours isn't covered, write to the team.

