As LLMs spread across the enterprise - assistants, co-pilots, IDE integrations, automation agents, and custom LLM-based applications - we are effectively onboarding a new digital workforce. And like any workforce, it needs identity, permission boundaries, and oversight.
That is the essence of Zero Trust for AI. This article focuses on the brains - the LLMs.
The LLM landscape is inconsistent by default
Enterprises now rely on:
• Foundation model APIs secured by static keys
• Cloud-hosted LLMs tied to workload identity
• Self-hosted models that often lack strong controls
Because each handles security differently, governance becomes fragmented before it even begins.
Clients and agent frameworks amplify the sprawl
Chatbots, IDE tools like Claude Code or Cursor, and frameworks like LangChain or CrewAI each integrate LLMs in their own way. Each stores its own secrets, picks its own models, and implements its own logging. The result is gaps, duplication, and blind spots.
Model choice comes with real security implications
Reasoning models, small fast models, multimodal models, and tool-calling models all expose different risks. Most organizations have little insight into which models are used where or how those choices affect data exposure.
Zero Trust begins with real enterprise identity
Every LLM call or agent action should reflect a verifiable identity:
• Workforce identity from Okta, Entra ID, Ping, Google Workspace
• Workload identity from AWS IAM, GCP IAM, Azure Managed Identity, or Kubernetes
No new identities are invented. We extend the identity fabric the enterprise already trusts.
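As an illustrative sketch (every name and claim shape here is hypothetical, not a real product API), a control point might normalize identities from these different providers into one canonical principal before any policy check:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Principal:
    """Canonical identity used for every downstream policy decision."""
    source: str   # e.g. "workforce-idp" or "workload-idp"
    subject: str  # provider-specific stable subject identifier
    kind: str     # "workforce" or "workload"

def normalize_identity(claims: dict) -> Principal:
    """Map a verified token's claims onto a canonical Principal.

    Assumes the token was already cryptographically verified upstream;
    this step only normalizes, it does not authenticate.
    """
    issuer = claims["iss"]
    if "okta" in issuer or "login.microsoftonline" in issuer:
        return Principal(source="workforce-idp", subject=claims["sub"], kind="workforce")
    # Anything else is treated as workload identity (IAM role, k8s SA, ...)
    return Principal(source="workload-idp", subject=claims["sub"], kind="workload")
```

The point of the normalization step is that every later decision - policy, vaulting, audit - sees one identity shape regardless of where the caller authenticated.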
Public and private service edges meet you where your workloads run
• The Public Service Edge supports clients, assistants, and SaaS-like access
• The Private Service Edge supports workloads inside isolated VPCs and internal networks
Both enforce the same identity, policy and governance model.
A real control point needs a policy engine and a secure vault
The Policy Engine
It determines which identities can use which models or tools, and under what conditions. User intent, model capability, data sensitivity, and workflow stage all influence the decision.
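A minimal sketch of such a decision, with all attribute names and rules invented for illustration (a real engine would also weigh user intent and workflow stage):

```python
def authorize(identity_kind: str, model: str, data_sensitivity: str,
              allowed_models: dict) -> bool:
    """Return True if this identity may call this model at this sensitivity.

    `allowed_models` maps a sensitivity tier to the set of models cleared
    for it, e.g. {"restricted": {"self-hosted-llm"}}. All tiers and model
    names here are hypothetical.
    """
    cleared = allowed_models.get(data_sensitivity, set())
    if model not in cleared:
        return False
    # Example condition: workloads may not touch restricted data unattended.
    if identity_kind == "workload" and data_sensitivity == "restricted":
        return False
    return True
```

Keeping the decision a pure function of identity, model, and context also makes every allow/deny trivially loggable for audit.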
The Vault (Privileged Access Management for AI)
API keys and provider credentials should never sit inside agent code or config files. A secure vault protects them using per-tenant encryption keys with hardware roots of trust, fully tied to enterprise identity controls.
The vault does not hand out raw provider credentials. Instead, the control point issues short-lived ephemeral tokens that represent the caller’s identity and policy context. These ephemeral tokens are used to access the LLM on the caller’s behalf so the underlying credentials remain hidden and protected.
Rotation becomes painless. When a provider key rotates, the vault updates it without breaking clients or requiring code changes. Agents only ever see the ephemeral token, never the secret itself.
This is Privileged Access Management applied to AI: secure storage, secure brokering, and secure delegation.
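One way to sketch the brokering step (the HMAC-signed token format and field names are illustrative assumptions, not a real product API):

```python
import base64
import hashlib
import hmac
import json
import time

# Held only by the control point, never handed to agents.
BROKER_KEY = b"control-point-signing-key"

def mint_ephemeral_token(subject: str, model: str, ttl_seconds: int = 300) -> str:
    """Issue a short-lived token binding caller identity and policy context.

    The agent presents this token to the control point, which resolves it,
    fetches the real provider credential from the vault, and forwards the
    LLM call. The provider key never leaves the vault.
    """
    payload = {"sub": subject, "model": model, "exp": int(time.time()) + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    sig = hmac.new(BROKER_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{sig}"

def verify_ephemeral_token(token: str):
    """Return the claims if the signature is valid and the token unexpired."""
    body, sig = token.rsplit(".", 1)
    expected = hmac.new(BROKER_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims if claims["exp"] > time.time() else None
```

Because agents hold only this token, rotating the underlying provider key in the vault changes nothing on the agent side - the token still resolves, and the control point simply uses the new credential.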
The result
Agents remain powerful but never over-permissioned. Model selection and tool usage flow through identity and policy. Credentials stay protected. Ephemeral tokens enforce least privilege. Auditability and cost visibility are built in. AI finally operates within a security model the enterprise can trust.
At Ferentin, this is exactly what we are building: an AI Security Services Edge that unifies identity, policy, governance and privileged access controls for the emerging AI workforce.
If your team is navigating AI adoption or trying to rein in early agent sprawl, please connect with us at enterprise@ferentin.com.