How Azure Kubernetes Service secures AI agent workloads across networking, policy, image scanning, and runtime detection on shared GPU clusters.
I've been thinking a lot about AI and security, but also about the infrastructure fundamentals that have to apply to all your workloads. So some themes keep coming up in what I'm writing at the moment: how different workloads look these days; dependencies, provenance, isolation, trust, identity and auth; GPUs finally dragging their tardy selves to the management party; and how doing the necessary work will improve all the non-AI workloads too. This time I'm looking at cluster hardening on AKS.
Securing Kubernetes is hard enough, but AI workloads, and especially agents, make it even harder because they're unpredictable and often untrusted. I dig into how AKS and some new community projects around auth give you the tools to tackle this, from networking to policy to identity and monitoring.
Part of the problem is that securely partitioning and sharing GPUs is still pretty new; because we didn't need that for IaaS, the primitives weren't there to build into Kubernetes. The good news is that once you know how to secure a cluster for AI workloads, all your other workloads benefit too.
Because I write for different sites, pieces of my thinking in this area are spread across multiple places; you could read this piece alongside a couple of others:
Mary Branscombe|June 02, 2026
Mary Branscombe|March 19, 2026
Mary Branscombe|February 09, 2026
Microsoft
Kubernetes
AI
cluster hardening
Azure Kubernetes Service (AKS)
Agent gateway
zero trust
identity
authentication
policy
networking
registries
sandbox
Hyperlight
GPU
sponsored content