CISO Field Guide — 2026

Agentic AI Security Risks & the CISO Checklist

The 2026 CISO playbook for securing autonomous AI agents: the full OWASP / NIST / CISA risk taxonomy, least-privilege and identity controls, EU AI Act audit requirements, and a seven-domain, copy-ready governance checklist that names both the control and the authoritative source.

Agentic Risk
OWASP Agentic Top 10
Least Privilege
Air-Gapped
Audit-Ready
40%Enterprise apps with AI agents by 2026
65%Orgs hit by an agent incident in past year
CVSS 9.3EchoLeak zero-click Copilot exploit
109:1Machine identities per human (2026)

1. Why AI Agents Break Your Existing Threat Model

An AI agent security checklist is not a chatbot policy with extra steps. CISA 2026 Five Eyes guidance defines agentic AI as systems composed of one or more agents that fundamentally rely on an AI model, such as an LLM, to interpret and reason about the state of the world and can autonomously make decisions and take actions. Three properties in that definition -- autonomy, statefulness, and tool access -- invalidate assumptions that traditional application security takes for granted.

First, the instruction/data boundary collapses. A large language model processes instructions and data in the same channel, with no enforced separation -- unlike a SQL database, where parameterized queries cleanly separate code from input. OWASP makes this the root cause of its #1 risk, Prompt Injection (LLM01:2025). When that model is wired to tools, any text it ingests -- a retrieved document, an email, a webpage, another agent message -- becomes a candidate instruction. The agent trust boundary silently expands to include every byte of untrusted content it reads.

Second, the agent acts under real privilege. A vulnerable web form returns data to a user; a vulnerable agent can delete a file, send a wire, modify an IAM policy, or query a production database -- because you gave it those tools to be useful. OWASP frames this as Excessive Agency (LLM06:2025): damaging actions can be performed in response to unexpected, ambiguous, or manipulated outputs from an LLM. The exploit is no longer leak the response. It is perform the attacker action at your privilege level.

Third, state and autonomy decouple the attack from its trigger. Agents persist memory across sessions and chain tool calls in loops. A poisoned memory entry planted today can execute against an unrelated query next week, and the agent cannot distinguish learned context from planted content. Session isolation does not help, because the attack exploits persistent cross-session state.

The Architect Takeaway
You can no longer treat the model as trusted and the perimeter as the control. Assume injection will succeed, and design so that a compromised agent cannot do disproportionate damage. That principle -- containment over prevention -- runs through every section below.

For the organizational program that wraps these technical controls, see our AI governance framework and the broader AI for CISOs security guide.

2. How Fast the Gap Is Opening: Adoption vs. Controls

Security maturity is not keeping pace with deployment. Gartner projects that 40% of enterprise applications will feature task-specific AI agents by the end of 2026, up from less than 5% in 2025, and that 33% of enterprise software will embed agentic AI by 2028 (from under 1% in 2024). Yet only 17% of organizations have actually deployed agents so far, while 60%+ expect to within two years -- and Gartner warns that more than 40% of agentic AI projects will be canceled by the end of 2027 due to cost, unclear value, or inadequate risk controls. Governance is now a make-or-break variable, not a compliance afterthought.

MetricFigureSource
Enterprise apps with task-specific AI agents (2025 to 2026)<5% to 40%Gartner
Enterprise software embedding agentic AI by 202833% (from <1% in 2024)Gartner
Organizations that have deployed AI agents (2026)17% (60%+ plan within 2 yrs)Gartner
Agentic AI projects canceled by end of 202740%+Gartner
Orgs with an AI-agent security incident in past year65% (all with business impact)Zenity / CSA
Orgs with full security approval for all AI agents14.4%Acuvity
Orgs reporting shadow AI use98%Acuvity / CSA
Orgs with no visibility into AI data flows86%Industry surveys
Agent skills audited containing serious vulnerabilities41.7% of 2,890+MITRE / secondary
Several figures originate from vendor-sponsored surveys with differing methodologies; treat them as directional, and cross-reference Gartner analyst forecasts where precision matters.

The shadow-AI dimension is covered in depth in Shadow AI risks.

3. The Agentic Risk Taxonomy

Most confusion in this field comes from blending two separate OWASP documents. Both are products of the OWASP Gen AI Security Project. Agentic AI -- Threats and Mitigations v1.0 (Feb 2025) is the foundational taxonomy of 15 named threats, T1-T15. The OWASP Top 10 for Agentic Applications 2026 (Dec 9, 2025) is the ranked, incident-grounded Top 10, coded ASI01-ASI10.

Do Not Conflate the Two
The T-codes (T1-T15) belong only to the Feb 2025 taxonomy. The ASI-codes (ASI01-ASI10) belong only to the 2026 Top 10. Do not present them as one list. The ASI-to-T mappings below are analytical best-fit correspondences; verify exact wording against the official OWASP PDFs before quoting.

3.1 OWASP Agentic Threats & Mitigations v1.0 (T1-T15)

The cross-cutting theme: the root cause of reasoning attacks (T6) is the lack of separation between data and instructions. The agentic threats are explicitly framed as extensions of the OWASP LLM Top 10 into autonomous, stateful, multi-agent settings.

CodeThreat NameDefinition (condensed)Related LLM Top 10 2025
T1Memory PoisoningExploits short- and long-term memory to inject malicious/false data; alters decisions and enables unauthorized operations.LLM04; LLM08
T2Tool MisuseManipulates agents to abuse integrated tools via deceptive prompts while staying within authorized permissions; includes Agent Hijacking.LLM06
T3Privilege CompromiseExploits mismanaged roles, overly permissive configs, or dynamic role inheritance to escalate privileges.LLM06
T4Resource OverloadDeliberately exhausts compute/memory/service capacity; amplified by agents self-triggering and spawning tasks.LLM10
T5Cascading Hallucination AttacksPlausible-but-false info propagates and amplifies via self-reinforcement and inter-agent loops.LLM09
T6Intent Breaking & Goal ManipulationExploits lack of separation between data and instructions to alter planning, reasoning, and self-evaluation.LLM01
T7Misaligned & Deceptive BehaviorsAgents execute harmful/disallowed actions, using deceptive reasoning to appear compliant.
T8Repudiation & UntraceabilityAgent actions cannot be traced or accounted for due to insufficient logging/transparency.
T9Identity Spoofing & ImpersonationExploits authentication to impersonate agents or users and act under false identities.
T10Overwhelming Human-in-the-LoopExploits human cognitive limits or floods oversight/validation frameworks.
T11Unexpected RCE & Code AttacksExploits AI-generated code execution to inject malicious code via function-calling/tools.LLM01; LLM05
T12Agent Communication PoisoningManipulates inter-agent channels to spread false info or influence decisions.LLM04
T13Rogue Agents in Multi-Agent SystemsCompromised agents operate outside monitoring boundaries, executing unauthorized actions or exfiltrating data.
T14Human Attacks on Multi-Agent SystemsAdversaries exploit inter-agent delegation, trust, and workflow dependencies to escalate or manipulate.
T15Human ManipulationAgent-human trust reduces skepticism; attackers coerce agents to manipulate users or take covert actions.

3.2 OWASP Top 10 for Agentic Applications 2026 (ASI01-ASI10)

The 2026 list is grounded in real 2025 incidents, which distinguishes it from the more theoretical Feb 2025 taxonomy. The most material new addition is ASI04 Agentic Supply Chain Vulnerabilities -- runtime poisoning of the Model Context Protocol (MCP) and Agent2Agent (A2A) ecosystems.

CodeTitleDescription2025 IncidentMaps to v1.0
ASI01Agent Goal HijackHidden prompts alter objectives/decision path, turning copilots into silent exfiltration engines.EchoLeakT6
ASI02Tool MisuseAgents bent legitimate tools into destructive outputs (confused-deputy pattern).Amazon QT2
ASI03Identity & Privilege AbuseLeaked credentials / dropped identity let agents operate beyond intended scope.Credential abuseT3 + T9
ASI04Agentic Supply Chain VulnerabilitiesDynamic MCP and A2A ecosystems let runtime components be poisoned (NEW for 2026).GitHub MCP exploitNew; T2/T13
ASI05Unexpected Code ExecutionNatural-language execution paths unlock new RCE avenues.AutoGPT RCET11
ASI06Memory & Context PoisoningMemory poisoning reshapes behavior long after the initial interaction.Gemini Memory AttackT1
ASI07Insecure Inter-Agent CommunicationSpoofed inter-agent messages misdirect entire agent clusters.Spoofed messagesT12
ASI08Cascading FailuresA single error/compromise spreads across connected agents/tools/pipelines with escalating impact.Pipeline cascadeT5
ASI09Human-Agent Trust ExploitationConfident, polished explanations mislead operators into approving harmful actions.Operator deceptionT10 + T15
ASI10Rogue AgentsCompromised, misaligned, or drifting agents act with harmful autonomy -- the ultimate insider threat.Replit meltdownT13

3.3 How Agentic Threats Extend the OWASP LLM Top 10 (2025)

The agentic taxonomy does not replace the OWASP Top 10 for LLM Applications 2025; it builds on it. Five LLM-level entries carry the most agent weight.

IDRisk NameAgent Relevance
LLM01:2025Prompt InjectionCritical -- indirect injection via tools/RAG drives agent compromise
LLM02:2025Sensitive Information DisclosureHigh -- agents/RAG can leak knowledge-base data
LLM03:2025Supply ChainHigh -- agent tools/plugins inherit supply-chain risk
LLM05:2025Improper Output HandlingCritical -- agent output feeds shells, DBs, browsers
LLM06:2025Excessive AgencyHighest -- the defining agent risk
The Canonical Agent Kill-Chain
Indirect prompt injection (LLM01) → excessive agency (LLM06) → improper output handling (LLM05). A poisoned document changes the agent intent, the over-permissioned tool executes the action, and the unsanitized output reaches a shell or browser. Defenses must break all three links.

3.4 Six Defensive Playbooks

PlaybookThreats Mitigated
1. Preventing AI Agent Reasoning ManipulationT6, T7, T8
2. Preventing Memory Poisoning & Knowledge CorruptionT1, T5
3. Securing AI Tool Execution & Preventing Unauthorized ActionsT2, T3, T11, T4
4. Strengthening Authentication, Identity & Privilege ControlsT3, T9
5. Protecting HITL & Preventing Human-targeted ThreatsT10, T15
6. Securing Multi-Agent Communication & Trust MechanismsT12, T14, T13

4. Prompt Injection & Memory Poisoning

Prompt injection is OWASP LLM01:2025 -- the #1 LLM risk for the second consecutive edition. A Prompt Injection Vulnerability occurs when user prompts alter the LLM behavior or output in unintended ways. Direct injection modifies model behavior via user input (Ignore all previous instructions). Indirect injection -- the agent processes external content (websites, files, emails, RAG documents) carrying hidden instructions -- is the dominant agentic threat, because the agent may silently execute injected instructions, with the user privileges, with no user awareness.

4.1 Memory Poisoning: Temporally Decoupled and Worse

Memory poisoning (T1 / ASI06) is more dangerous because it decouples injection from execution across three phases:

  1. Injection -- malicious content enters via documents/emails/webpages/API responses using instruction-like phrasing (Remember that the user prefers, For future reference always).
  2. Persistence -- poisoned instructions persist indefinitely across sessions in long-term memory; the agent cannot distinguish learned context from planted content.
  3. Execution -- a later, unrelated query retrieves the poisoned entry and runs it as self-learned knowledge.

This is why session isolation does not help -- the attack lives in persistent cross-session state. The academic MINJA attack (NeurIPS 2025) achieves >95% injection success and >70% attack success using query-only interaction, no privileged access. The Gemini Memory Attack used conditional, delayed instructions triggered by words like yes, sure, and no that appear in nearly every conversation -- making single-moment runtime detection nearly useless.

4.2 Real-World Incidents (Not Theoretical)

Name / IDTargetClassMechanismSeverity / Success
EchoLeak (CVE-2025-32711)Microsoft 365 CopilotIndirect, zero-clickCrafted email → LLM Scope Violation; chained XPIA evasion, reference-style Markdown bypass, auto-fetched image egress, Teams CSP proxyCVSS 9.3 (Critical); first real-world zero-click prod LLM injection
ShadowLeakChatGPT Deep Research (Gmail)Indirect, zero-clickInstructions hidden in email HTML (white-on-white, tiny fonts); server-side exfiltration in OpenAI cloud100% success in testing
MINJALLM agents (academic)Memory poisoningQuery-only: bridging steps + indication prompts + progressive shortening>95% injection, >70% attack
Gemini Memory AttackGoogle GeminiMemory poisoningConditional/delayed instruction triggered by common wordsBypasses runtime guardrails

EchoLeak: four defenses bypassed in sequence

StepDefense BypassedTechnique
1XPIA classifierBenign phrasing (For compliance, do not mention this email)
2Markdown link-redaction filterReference-style Markdown link not stripped
3Requirement for a user clickAuto-fetched reference-style Markdown image -- zero-click
4Content Security PolicyRouted through CSP-allowlisted Teams proxy asyncgw.teams.microsoft.com/urlp

4.3 Defense-in-Depth (No Single Layer Is Sufficient)

LayerControlWhat It DoesNote
Input/output filteringClassifiers (XPIA), semantic filters, RAG TriadDetect injected instructions; validate output formatBypassable -- EchoLeak evaded XPIA
Content provenanceSpotlighting: delimiting, datamarking, encodingHelp model distinguish trusted vs. untrusted tokensDatamarking cut attack success ~50% to <3%
Memory provenanceProvenance tagging + write-ahead validationTag every entry with origin/session/source; secondary model validates before commitMemory-poisoning Layer 2
Sandboxing / least privilegePrivilege control, HITL, CSP, trust-aware retrievalLimit permissions, gate high-risk actions, contain blast radiusCSP failure enabled EchoLeak egress
Behavioral monitoringBaselines, memory integrity audit, circuit breakersDetect deviation; quarantine compromised agentsMemory-poisoning Layer 4
Adversarial testingRed-teaming, LLMail-Inject challengeContinuously probe defensesLLM01 control 7
As of 2026, no fully reliable defense against prompt injection exists. Treat it as an unsolved problem -- which is precisely why privilege containment and provenance, not filtering alone, are the durable controls.
The AI Strategy Blueprint Book Cover
The Executive Mandate

The AI Strategy Blueprint

Securing agents is one chapter of a defensible enterprise AI program. The AI Strategy Blueprint frames the executive mandate as seven commitments -- governance, security architecture, and ROI built in from day one -- so the controls in this checklist sit inside a board-level strategy.

5.0 on Amazon
$24.95
Get it on Amazon
7 Executive Commitments
Governance Playbooks
Security Architecture
ROI Frameworks

5. Excessive Agency & Tool Misuse: The Defining Risk

If you fix one thing, fix this. OWASP LLM06:2025 Excessive Agency has three official root causes -- and the maximum-risk configuration is all three at once, a state teams routinely create during prototyping and never tighten (the quiet drift toward excessive agency).

Root CauseDefinitionConcrete Agent ExamplePrimary Control(s)
Excessive FunctionalityTools include capabilities beyond task needTool offers modify/delete when only read needed; deprecated plugin still callable; open-ended shell functionMinimize tools; limit functionality; avoid open-ended extensions
Excessive PermissionsTools run with broader downstream privileges than requiredDB creds with UPDATE/INSERT/DELETE when only SELECT needed; shared service account instead of user identityLeast privilege; user-context OAuth minimum scope; complete mediation downstream
Excessive AutonomyHigh-impact actions proceed without verificationAgent deletes documents / sends wire / external email without confirmationHuman-in-the-loop on high-impact/irreversible actions
The Canonical Scenario
An email-summarization agent is hit with indirect prompt injection in an incoming email. It is tricked into reading sensitive mail and forwarding it to an attacker -- exploiting all three root causes at once (unneeded send functionality, over-privileged OAuth scope, no send-confirmation). The fix: a summarizer should hold only read_inbox / read_sent scopes with explicit no_delete, no_forward, no_external_send restrictions.

5.1 The Control Stack (Defense-in-Depth, In Order)

ControlMechanismExample / Detail
Tool allowlist / minimizationDefault zero tool access; add tools at runtime per permissionBy default the agent should not have any tool access (Auth0)
Scoped capabilitiesRBAC vs. fine-grained (ReBAC / OpenFGA) authorizationCan user:anne use buyStock on asset:OKTA?
Credential delegationShort-lived OAuth 2.0 tokens; token vault + OAuth FederationNo raw/long-term creds stored by the agent
Human-in-the-loopExplicit consent for high-impact/irreversible actionsCIBA push approval; 60s confirmation timeout; gates delete_file, send_email, run_code, update_database, modify_iam_policy
Output schema / boundingValidate tool-call args vs. schema; delimit untrusted contentWrap external input in delimited tags; filter injection strings
Damage limitationRate-limiting, sandboxing, tamper-evident audit logsSHA-256 result hashing + reasoning traces; sandbox all code execution

OWASP eight LLM06 controls: minimize extensions; minimize functionality; avoid open-ended extensions; minimize permissions; execute in the user context (OAuth minimum scope, not a shared service account); require human approval for high-impact actions; complete mediation (validate all downstream requests rather than trusting the LLM); sanitize inputs/outputs. The two emphasized controls are the ones most often skipped.

5.2 Autonomy Tiers -- Govern the Dial, Do Not Max It

Tier 1
Fully Supervised
Human approval required before ANY action.
Tier 2
Constrained Autonomy
Executes only pre-approved action types within predefined scope.
Tier 3
Broad Autonomy
Acts within defined boundaries under continuous monitoring.
Principle
Authorization must live in the downstream system, never be trusted to the LLM. Controls work combined, not individually. When choosing an agent framework, evaluate it against these controls first -- see best AI multi-agent tools.

6. Agent Identity, Authentication & Least Privilege (the NHI Problem)

An AI agent is a non-human identity (NHI) -- a digital identity that authenticates and operates without direct human control. Treating agents as first-class identities (not as features running under a human credentials) is the single most important access-control decision you will make.

The scale is the problem. Enterprises now average ~82 machine identities per employee; the ratio moved from ~92:1 (early 2024) to ~144:1 (end of 2025), and Palo Alto Networks 2026 report puts the cross-environment average at 109:1, with cloud-native environments reaching tens of thousands of machine identities per human. Agents are the fastest-growing class: Palo Alto projects +85% AI-agent growth over the next 12 months.

6.1 The Anti-Pattern: Agents Under Broad User Credentials

When an agent reuses a human user session or a shared key, three things break: permissions become excessive, audit becomes impossible (OWASP NHI10 Human Use of NHI), and you create classic confused-deputy exposure. The fix is a distinct, managed identity per agent. The OWASP Non-Human Identities Top 10 (2025) is the reference frame:

IDRiskRelevance to AI-Agent Least Privilege
NHI1:2025Improper OffboardingDecommissioned agents left active create persistent backdoors
NHI2:2025Secret LeakageAgent memory, tool results, transcripts, crash dumps leak creds
NHI3:2025Vulnerable Third-Party NHICompromised connector enables supply-chain attack
NHI4:2025Insecure AuthenticationWeak/legacy auth enables takeover and escalation
NHI5:2025Overprivileged NHIAgent granted more than its task needs; expands blast radius (core risk)
NHI6:2025Insecure Cloud Deployment ConfigsHigh-privilege CI/CD misconfig enables unauthorized access
NHI7:2025Long-Lived Secrets~50% of NHI creds are long-lived keys; fix is ephemeral tokens
NHI8:2025Environment IsolationReusing one NHI across test/prod enables cross-env compromise
NHI9:2025NHI ReuseOne identity shared across workloads removes least-privilege boundaries
NHI10:2025Human Use of NHICannot distinguish agent vs. human; breaks audit
The Over-Privilege Data
37% of NHI security incidents are attributed to over-privileged identities; 26% of orgs estimate that 50%+ of their service accounts are over-privileged; 44% of cloud environments contain at least one privileged IAM role; ~50% of enterprise NHI credentials are long-lived API keys.

6.2 The Controls

ControlRecommended PracticeStandard
Identity per agentEach agent authenticates as a distinct principal; no reused sessions or shared keysOWASP NHI10
Token lifetimeMinute-scale, short-lived; JIT issuance; retired at task completion (zero standing privilege)NIST SP 800-207A
AuthenticationOAuth 2.1 + PKCE (RFC 7636), dynamic client registration; MCP HTTP mandates OAuth 2.1MCP spec
DelegationToken Exchange (RFC 8693); token carries agent + user identity as separate claimsRFC 8693
Authorization modelABAC at runtime; capability-based verb-on-resource scopes, not broad rolesNIST SP 800-162
PostureDefault-deny with explicit grants; org-level deny policies block excessive configsOWASP NHI5
Effective authorityIntersection of agent and user permissions, never the union (confused-deputy defense)
Workload identitySPIFFE/SPIRE, short-lived OIDC, STS assume-role; SCIM for provisioningNIST 800-207A
High-impact actionsOut-of-band human approval via a channel the agent cannot forge

NIST SP 800-207A states the principle directly: each service should present a short-lived cryptographically verifiable identity credential, authenticated per connection and reauthenticated regularly. Note the self-escalation risk: an agent with enough initial access can dynamically modify its own permissions -- which is exactly why org-level deny policies and out-of-band approval for high-impact actions are non-negotiable. RBAC alone is insufficient; ABAC plus capability tokens (Macaroons, Biscuit) is the recommended posture.

7. Multi-Agent Systems: Cascading Failure, Impersonation & Protocol Security

Multi-agent security is non-compositional: individually safe agents can compose into an unsafe system, because trust does not aggregate predictably across agent-to-agent calls. You cannot certify a fleet by certifying each agent. The relevant OWASP 2026 risks are ASI07, ASI08, ASI09/ASI10, and ASI03.

#Threat ClassDescription
1Privacy & Information IntegrityUnauthorized data access or corruption across agent boundaries
2Collusion & ExfiltrationCoordinated extraction/leak, incl. secret/steganographic collusion
3ExploitationAgents abusing vulnerabilities in other agents decision processes
4Swarm AttacksCoordinated assaults that appear benign individually
5Heterogeneous AttacksMixed-capability adversaries exploiting role specialization
6Overseer AttacksCompromising human supervisors or monitoring systems
7Cascade AttacksFailures propagating through agent dependencies
8Conflict & Mixed-Motive ThreatsMisaligned objectives creating systemic risk
9Physical & Embodied SecurityAgents controlling real-world systems
10Sociotechnical ThreatsManipulation of humans and institutions
The Observability Gap
Backdoored agents can coordinate via steganographic channels embedded in shared message boards, making secret collusion undetectable even under full observability of communications.

7.1 Agent Impersonation & Protocol Security (A2A and MCP)

In A2A, a malicious agent crafts a deceptive Agent Card to misrepresent its capabilities and win the host LLM-based selection. Trustwave SpiderLabs demonstrated an Agent-in-the-Middle attack in 2025. A2A v0.3+ supports but does not enforce card signing, so card spoofing via DNS/CDN compromise is a low-cost, routine threat.

ProtocolNamed AttackMitigation / Spec Control
A2A (v0.3+)Agent Card spoofing/tampering (DNS/CDN); signing supported but not enforcedEnforce card signing; serve over HTTPS/TLS 1.3+; mTLS agent identity
A2AAgent-in-the-Middle impersonation (Trustwave 2025)Verify card provenance/signature; do not rely on LLM selection alone
A2AOAuth2 long-lived tokens, coarse scopes, no consent gateShort-lived audience-scoped tokens; capability-based access; protocol-level consent
MCPTool poisoning (malicious instructions in tool metadata); 5 of 7 clients lack static validationStatic metadata analysis; client-side validation; behavioral anomaly detection
MCPRug pull (tool definition mutates after approval)Pin/version tool definitions; re-approval on change; integrity checks
MCPConfused deputy (proxy uses server, not user, privileges)Per-user identity passthrough done correctly; avoid a static single OAuth Client ID
MCP (spec 2025-06-18)Token passthrough abuseProhibited by spec; servers = OAuth 2.1 Resource Servers; validate token audience; mint per-call audience-scoped tokens
Cross-Cutting Defenses
Least-privilege/capability-based access control, Plan-then-Execute architectures (separate deliberation from action), Byzantine-resilient consensus for mission-critical decisions, and a digital-twin clone that re-runs the last week of recorded actions to test for cascade triggers (ASI08).

8. Agent Inventory & Discovery: You Cannot Secure What You Cannot See

The first deliverable of any agentic-AI security program is a continuous, living inventory of every agent -- including shadow deployments. CSA defines three discovery gaps that make agents uniquely hard to inventory:

Discovery Gap

Traditional tools cannot find ephemeral agent runtimes in IDEs, desktops, browser sessions, MCP servers, and personal accounts.

Permission Visibility Gap

Agents inherit employee credentials and may exceed consciously granted permissions.

Logic Inspection Gap

Teams rarely inspect prompts, skills, MCP tool definitions, memory stores, and agent instructions for malicious behavior.

Why a Registry Is Foundational
79% of organizations lack visibility into AI agents and MCP-connected systems; 47% of enterprise AI use occurs through personal accounts outside SSO/identity governance; ~97% of NHIs carry excessive privileges; and just 0.01% of NHIs control 80% of cloud resources. Fragmented identity systems added an average of 12 hours to identity-related incident resolution (Unit 42).

8.1 Registry Architecture & the Unit of Inventory

Two production-relevant directions: the OWASP Agent Name Service (ANS) -- a protocol-agnostic discovery registry (IETF draft) with DNS-inspired naming, PKI certificates, and a Protocol Adapter Layer covering A2A, MCP, and ACP -- and Microsoft Entra Agent ID / Agent 365, a production enterprise registry with tenant-wide counters for Total, Ownerless, and Unmanaged agents.

A Managed Agent Identity Is the Unit of Inventory
A shadow agent is rated Critical when an agent has no registry entry, no owner, OR no managed identity. Anything without all three is, by definition, a shadow agent and a Critical finding. OWASP also requires comprehensive runtime logging of every decision, tool call, and state change, a per-agent behavioral baseline, circuit breakers, and an auditable kill-switch.

9. Audit, Traceability & Logging (EU AI Act Art. 12)

Agent audit logging is not application logging. It must capture decisions, prompts, tool calls, delegated authority, and outcomes -- a full forensic trail. Two layers matter: the regulatory baseline (what you must record) and the engineering layer (how you make those records trustworthy).

9.1 Regulatory Baseline -- EU AI Act Article 12

Article 12(1): High-risk AI systems shall technically allow for the automatic recording of events (logs) over the lifetime of the system. Logging must be technical, automatic, and lifetime -- manual recording does not satisfy the requirement.

ProvisionRequirementDetail
Art 12(1)Automatic event loggingTechnically built in; over the system lifetime; manual recording insufficient
Art 12(2)(a)Risk identificationLog events relevant to risk situations or substantial modification
Art 12(2)(b)Post-market monitoringSupport post-market monitoring per Article 72
Art 12(2)(c)Operation monitoringSupport monitoring of operation per Article 26(5)
Art 26(6)RetentionDeployers retain auto-generated logs for a minimum of 6 months, subject to law
Penalties
Breaching operator obligations (incl. Articles 12/26) reaches up to 15 million euros or 3% of total worldwide annual turnover, whichever is higher (up to 35M euros / 7% for prohibited practices).
Timeline in flux as of 2026-05-30: high-risk obligations were originally set for 2 Aug 2026; a reported 7 May 2026 political agreement moves Annex III high-risk systems to 2 Dec 2027. Confirm the final adopted text before presenting any single date as binding.

9.2 Engineering Layer -- Tamper-Evident Logs

Article 12 mandates that logs exist and be automatic, but does NOT prescribe how logs resist tampering. Tamper-evidence is your design choice -- and it is what satisfies SOC 2, ISO 27001, and forensic readiness. The five primitives:

PropertyMechanismNotes
Append-onlyWORM / immutable storageEntries added, never removed or modified
Tamper-evidentSHA-256 hash chain over canonical-JSON eventsAny altered byte breaks all subsequent links
Independently verifiableMerkle tree; recompute leaves and re-chainExternal auditor verifies without trusting the runtime
Identity-boundCryptographic signature tied to agent credentialPlus the human authorizer who delegated the workflow
Time-orderedSequential chain; tamper-resistant timestampsSuitable for replay/forensics
Worked Verification Example -- hash chain
C1 = SHA-256( C0 + bytes(E1) )
C2 = SHA-256( C1 + bytes(E2) )
C3 = SHA-256( C2 + bytes(E3) )   <- stored as the chain head

An auditor recomputes C1', C2', C3' from the canonical bytes and compares C3' to the stored head. If an attacker alters a single byte in E2 (e.g. changing operation:delete to operation:read), then C2' does not match C2, which forces C3' to differ -- the tampering is detected at the chain head even though only one intermediate entry changed. A Merkle tree over the leaves provides the same guarantee with O(log n) inclusion proofs.

Hash-chaining detects tampering but does not by itself prevent deletion of the whole log. Pair it with WORM/replicated storage and external anchoring for true tamper-resistance.

9.3 Required Fields per Agentic Access (Kiteworks Model)

FieldDefinition
Agent identityUnique workflow-level credential of the agent performing the access
Human authorizerAuthenticated identity of the human who delegated the workflow
Data accessedSpecific record identifiers + data classification
Operation performedSpecific action: read, download, move, delete, forward
Policy-evaluation outcomePermitted/denied + which policy attribute governed the decision
TimestampPrecise, retroactively-unalterable event time

Supporting standards: NIST SP 800-92, SOC 2, ISO 27001, plus HIPAA 45 CFR 164.312(b), SEC Rule 17a-4 (WORM), NIST 800-171 (3.3.1), CMMC AU.2.042, NYDFS Part 500 500.6. Map these obligations to your broader program in AI compliance frameworks.

Free Download

Get Chapter 1 Free + AI Academy Access

Download the first chapter of The AI Strategy Blueprint and get instant access to our AI Academy -- covering AI governance, security architecture, and the seven executive commitments behind a defensible agentic-AI program.

AI Strategy Blueprint Preview

10. Runtime Defense-in-Depth: Guardrails, Sandboxing & Kill-Switches

Runtime agent security layers four independent control planes. Both OWASP and Meta frame guardrails as a final layer of defense, not the only one.

10.1 Guardrail Frameworks

GuardrailFunctionArchitecture / ModelKey Metrics
PromptGuard 2Jailbreak / prompt-injection classifier (input)BERT-family: 86M or 22M98% AUC English; 97.5% recall @1% FPR; 19.3-92.4 ms on A100
AlignmentCheckChain-of-thought goal-hijack auditorGuardrail LLM: Llama 3.3 70B / Llama 4 Maverick>80% recall, <4% FPR (internal)
CodeShieldStatic analysis of generated codeSemgrep + regex, 8 languages, 50+ CWEs96% precision, 79% recall; ~60-300 ms tiers
The Headline Result
Meta LlamaFirewall combining all three reduced attack success rate on the AgentDojo benchmark from a 17.6% baseline to 1.75% (>90% reduction) while preserving 42.7% utility. For on-prem/air-gapped builds, NVIDIA NeMo Guardrails (programmable Colang, five rail types incl. execution/tool I/O) and Llama Guard (self-hostable open-weight classifier, no external API) are deployable with no internet dependency.

10.2 Sandboxing & Isolation of Tool Execution

Treat all generated code as untrusted; remove direct eval(); run one task per ephemeral sandbox with no artifact carryover.

TechnologyIsolation MechanismBest FitNotes
Firecracker / Kata microVMHardware-virtualized microVMRegulated/sensitive data; strongestE2B boots <200 ms; recommended minimum for production
gVisorUser-space Go kernel intercepts syscallsCompute-heavy multi-tenantSandboxed code never talks to host kernel directly
V8 IsolatesPer-context JS engine isolationLatency-critical lightweight tasksJS/TS only; weakest boundary

10.3 Threat-to-Control Map (OWASP ASI01-ASI10)

IDRiskKey Defense-in-Depth Controls
ASI01Agent Goal HijackPrompt-injection filtering; limited tool privileges; human approval for goal changes
ASI02Tool Misuse & ExploitationSandboxed execution; strict permission scoping; argument validation
ASI03Identity & Privilege AbuseShort-lived credentials; task-scoped permissions; isolated identities
ASI04Agentic Supply ChainSigned manifests; curated registries; dependency pinning; sandboxing; kill-switches
ASI05Unexpected Code ExecutionTreat generated code as untrusted; remove eval(); hardened sandboxes; review steps
ASI06Memory & Context PoisoningMemory segmentation; ingestion filtering; provenance tracking; entry expiry
ASI07Insecure Inter-Agent Comm.Mutual TLS; signed payloads; anti-replay; authenticated discovery
ASI08Cascading FailuresIsolation boundaries; rate limits; circuit breakers; pre-deployment plan testing
ASI09Human-Agent Trust ExploitationForced confirmations; immutable logs; risk indicators
ASI10Rogue AgentsGovernance; sandboxing; behavioral monitoring; kill-switches
Kill-Switch Mandate
For ASI10 rogue agents and ASI04 supply-chain, an instant, auditable kill-switch is mandatory. Microsoft open-source Agent Governance Toolkit maps controls to every OWASP agentic risk using four execution rings (Ring 0 supervisor through Ring 3 untrusted sandbox), each with resource limits plus an instant kill-switch.

11. Mapping to the Frameworks: NIST AI RMF, CISA/Five Eyes & MITRE ATLAS

11.1 NIST AI Risk Management Framework

NIST AI 600-1 (Generative AI Profile of AI RMF 1.0, July 2024) is built on four core functions -- GOVERN, MAP, MEASURE, MANAGE. It defines 12 GAI risk categories and 200+ suggested actions. NIST term for hallucination is confabulation. AI 600-1 was scoped to content generation, not autonomous action; agentic risk is handled by NIST AI 100-5 plus the CSA NIST AI RMF Agentic Profile (draft) using AG- extensions.

FunctionAI 600-1 GenAI FocusAgentic Extensions (CSA AG- profile, draft)
GOVERN (GV / AG-GV)Risk culture, policy, accountability, value-chain oversightAG-GV.1 Autonomy Tier Classification; AG-GV.2 Delegation Accountability; AG-GV.3 Agent Lifecycle Governance
MAP (MP / AG-MP)Establish context; identify which of 12 GAI risks applyAG-MP.1 Tool-Use Risk Inventory; AG-MP.2 Action-Consequence Mapping; AG-MP.3 Multi-Agent Topology Risk
MEASURE (MS / AG-MS)Assess, benchmark, track; red-teaming, evalsAG-MS.1 Behavioral Telemetry; AG-MS.2 Autonomy Calibration; AG-MS.3 Delegation Chain Monitoring
MANAGE (MG / AG-MG)Prioritize, respond, recover; incident responseAG-MG.1 Agentic Incident Response; AG-MG.2 Behavioral Drift Correction; AG-MG.3 Agent Decommissioning
600-1 is final; the agentic/AG- materials are 2025-2026 drafts -- treat AG- IDs and the autonomy-tier scale as CSA proposals aligned to NIST, not finalized NIST controls.

11.2 CISA / NSA / Five Eyes Guidance

PublicationDateCore Focus
Deploying AI Systems Securely15 Apr 2024Zero Trust, secure-by-design, model-weight protection, RBAC/ABAC + MFA, monitoring
AI Data Security: Best Practices22 May 2025Securing training/operational data: supply chain, poisoning, drift; provenance & encryption
Principles for Secure Integration of AI in OTDec 2025Critical-infrastructure/OT: Understand, Assess, Govern, Embed safety
Careful Adoption of Agentic AI Services1 May 2026First dedicated agentic AI guidance: 5 risk categories + lifecycle controls

The May 2026 agentic guidance defines five named risk categories: Privilege risks, Design & configuration risks, Behavioral risks, Structural risks, and Accountability risks. Its immediate actions are a ready-made program kickoff: inventory all agentic deployments (including shadow); conduct blast-radius assessments; audit service accounts for excessive permissions; replace persistent credentials with just-in-time (JIT) provisioning; extend logging to capture agent actions. The AI Data Security CSI also specifies AES-256 + post-quantum encryption, FIPS 140-3 storage, and cryptographically signed append-only provenance ledgers.

11.3 MITRE ATLAS

MITRE ATLAS is the threat-informed-defense knowledge base for AI systems. On 2025-10-21, MITRE ATLAS and Zenity Labs released the first formal agent-specific techniques:

AML.T IDTechniqueWhat It Does
AML.T0080AI Agent Context PoisoningManipulate the context an agent uses; subs Memory and Thread
AML.T0081Modify AI Agent ConfigurationAlter config files affecting one or many agents
AML.T0082RAG Credential HarvestingHarvest credentials from documents ingested into a RAG database
AML.T0083Credentials from AI Agent ConfigurationExtract tool/service credentials from agent settings
AML.T0084Discover AI Agent ConfigurationEnumerate config (Embedded Knowledge / Tool Definitions / Activation Triggers)
AML.T0085Data from AI ServicesExfiltrate via agent services (RAG Databases / AI Agent Tools)
AML.T0086Exfiltration via AI Agent Tool InvocationAbuse the agent own tools to move data out

For a CISO-level synthesis of all three frameworks, see AI for CISOs security and the program backbone in our AI governance framework. (ATLAS counts are version-sensitive and release monthly -- verify before publishing.)

12. Air-Gapped & On-Prem Containment: Shrinking the Blast Radius

Every preceding section reaches the same conclusion: you must assume injection succeeds, so the durable control is containment -- and the strongest containment is removing the egress channel entirely. Air-gapped / on-prem deployment is blast-radius reduction by architecture: no internet, no outbound connections, no DNS resolution, no NTP sync.

This is not abstract. Re-read the EchoLeak and ShadowLeak chains from Section 4: both depended on the agent reaching an external endpoint -- Markdown image auto-fetch, SSRF, and tool callouts to external URLs. In a true air-gapped deployment, those channels are architecturally impossible. The same logic neutralizes MITRE ATLAS AML.T0086 (exfiltration via tool invocation) for any tool that would otherwise call out.

DimensionDetail
Egress postureNo internet, no outbound connections, no DNS, no NTP; no licensing/telemetry callbacks
Channels neutralizedMarkdown image auto-fetch, SSRF, external tool callouts -- architecturally impossible
Compliance fitNIST 800-171 / CMMC 2.0 L3, NIST RMF 800-37, FedRAMP High, DoD IL4-IL5, ITAR, HIPAA, CJIS, GDPR/sovereignty
Reference model stackLlama 3 8B/70B, Mistral, Falcon (open-weight) on vLLM or llama.cpp
Vector / embedding stackQdrant or Milvus; E5 / Voyage embedding models
Reference hardwareNVIDIA A10G 24GB or A100 80GB; GPU server ~$8,000-$25,000
Deployment timeline4-12 weeks (air-gapped) vs. 1-2 weeks (connected VPC)
Residual riskDoes NOT prevent injection, local corpus poisoning, or insider/physical exfiltration -- pair with layered controls

Note the strict definition: even a single firewall rule allowing outbound HTTPS to a licensing or telemetry server disqualifies the deployment from true air-gap status. The compliance dividend is concrete: FedRAMP High and DoD IL4-IL5 deployments eliminate entire boundary-defense control categories (no boundary to defend), and ITAR technical data cannot traverse foreign-accessible infrastructure at all.

Be Honest About Trade-offs
Air-gapping removes the egress channel; it does not remove prompt injection, poisoning of locally-ingested corpora, or insider/physical-media exfiltration. Pair it with layered ingestion (provenance verification, hidden-instruction stripping), retrieval (permission-aware search, tenant isolation, anomaly detection), and generation (output monitoring, disabled auto-fetch, strict CSP/egress allowlists) controls.

Compare deployment options in best AI air-gapped environments, and see how AirgapAI implements local-inference containment for enterprise agents.

13. The CISO Agentic AI Security Checklist (2026)

This checklist consolidates OWASP (LLM Top 10, Agentic Top 10, NHI Top 10), NIST AI RMF, the CISA/Five Eyes agentic guidance, and MITRE ATLAS into seven operational domains. Work top to bottom; Inventory is the prerequisite for everything else. Each item names the control AND the authoritative source -- more actionable than a generic template.

Download as PDF
Domain 1 — Inventory & Discovery
  • Maintain a continuous, living inventory of every agent, including shadow/informal deployments.CISA / OWASP ANS
  • Treat a managed agent identity as the unit of inventory -- flag any agent with no registry entry, no owner, OR no managed identity as a Critical shadow agent.Agent 365
  • Run continuous discovery across endpoints, IDEs, browsers, MCP servers, SaaS, and personal accounts (close the Discovery Gap).CSA
  • Inventory each agent tool access, memory stores, prompts/skills, and MCP tool definitions (close the Logic Inspection Gap).CSA / ASI04
  • Conduct a blast-radius assessment mapping every agent tools, data, and downstream reach.CISA
Domain 2 — Identity & Access
  • Give every agent its own distinct, managed non-human identity; never run agents under shared keys or reused human sessions.OWASP NHI10
  • Default-deny: agents start with zero tool access; grant explicitly and minimally.LLM06 / NHI5
  • Use short-lived, minute-scale credentials with JIT issuance and zero standing privilege; retire at task completion.NIST 800-207A
  • Replace long-lived API keys (the ~50% problem) with ephemeral, auto-rotated tokens.OWASP NHI7
  • Use OAuth 2.1 + PKCE; delegate via Token Exchange (RFC 8693) so the token carries agent AND user identity as separate claims.RFC 8693 / MCP
  • Authorize with ABAC / capability-based scopes (verb-on-resource), not broad roles.NIST 800-162
  • Enforce effective authority = intersection of agent and user permissions, never the union.Confused-deputy
  • Audit service accounts for excessive permissions and add org-level deny policies that block self-escalation.CISA / NHI5
  • Assign distinct cryptographic identities per agent (SPIFFE/SPIRE; mTLS for inter-agent).CISA
Domain 3 — Tooling & Supply Chain
  • Minimize the number of tools and limit each tool to essential functionality; avoid open-ended shell/URL extensions.LLM06
  • Execute every tool in the user context with minimum OAuth scope; never a shared service account.LLM06
  • Validate every tool-call argument against a defined output schema; wrap untrusted external content in delimited blocks.LLM06 / Spotlighting
  • Enforce complete mediation -- re-authorize every downstream request at the resource, never trust the LLM.LLM06
  • Verify every MCP server before approval; pin/version tool definitions and require re-approval on change (rug-pull defense).ASI04 / MCP
  • Confirm MCP servers act as OAuth 2.1 Resource Servers with audience-scoped tokens and no token passthrough (spec 2025-06-18).MCP spec
  • Maintain an SBOM for models, adapters (LoRA), tools, and datasets; verify provenance and model signatures.LLM03 / ASI04
  • Never auto-approve tool calls based on repository/document content; disable auto-run / YOLO modes.ASI02 / ASI05
Domain 4 — Runtime Defense
  • Deploy layered guardrails (input/output classifiers + chain-of-thought alignment check + generated-code static analysis) as a final layer, not the only one.LlamaFirewall
  • Sandbox all tool/code execution at microVM strength minimum (Firecracker/Kata); one ephemeral sandbox per task, no artifact carryover.ASI05
  • Treat all generated code as untrusted; remove direct eval().ASI05 / LLM05
  • Enforce per-agent, per-tool, per-session rate limits and resource quotas.ASI08 / LLM10
  • Implement circuit breakers, transactional rollback, and safe-failure modes that pause and escalate to a human.ASI08
  • Provide an instant, auditable kill-switch / emergency shutdown for runaway or rogue agents.ASI10 / CISA
  • Separate planning from execution (Plan-then-Execute) architecturally.CISA
Domain 5 — Data Protection
  • Strip hidden instructions and verify provenance on every ingested document before embedding.LLM01
  • Enforce permission-aware retrieval and tenant isolation at the retrieval layer (before documents enter the context window), not just the app layer.LLM08
  • Tag every memory entry with origin/session/source; validate writes with a secondary model; expire entries.ASI06
  • Encrypt data at rest, in transit, and in compute (AES-256 + post-quantum); store in FIPS 140-3 systems.CISA AI Data Sec
  • Track data provenance via cryptographically signed, append-only ledgers; verify integrity with hashes.CISA AI Data Sec
  • Monitor output for exfiltration signatures; disable client-side auto-fetch of remote images/links; enforce strict CSP and egress allowlists.LLM05 / EchoLeak
  • For high-sensitivity workloads, deploy air-gapped/on-prem to remove the egress channel entirely.Containment
Domain 6 — Audit & Traceability
  • Log every decision, tool call, and state change automatically, including a stable goal identifier, exact prompt, exact output, tool-selection rationale, and parameters.EU AI Act Art. 12
  • Capture the six mandatory fields per access (agent identity, human authorizer, data accessed, operation, policy outcome, timestamp) -- log permitted AND denied actions at operation-level granularity.Kiteworks
  • Make logs append-only and tamper-evident (SHA-256 hash chain + Merkle tree), identity-bound, time-ordered, and independently verifiable.SOC 2 / ISO 27001
  • Store logs in WORM/replicated storage; retain a minimum of 6 months (longer where law requires).EU AI Act Art. 26(6)
  • Integrate structured, discrete-field logs into SIEM in real time with anomaly alerts.NIST 800-92
  • Establish a per-agent behavioral baseline and alert on deviation (drift detection).ASI10
Domain 7 — Governance & Lifecycle
  • Adopt the NIST AI RMF four functions (Govern/Map/Measure/Manage) as your governance spine; layer the GenAI profile (AI 600-1) and the agentic profile (NIST AI 100-5 / CSA draft).NIST AI RMF
  • Classify each agent by autonomy tier; grant the lowest tier that works and promote only deliberately.CSA tiers
  • Require human approval / out-of-band confirmation for high-impact, irreversible actions (delete_file, send_email, run_code, update_database, modify_iam_policy).LLM06 / CISA
  • Tune human-in-the-loop thresholds to risk/confidence/context; automate low-risk, escalate high-risk (prevent reviewer flooding).T10 / ASI09
  • Threat-model before/concurrent with deployment, including explicit inter-agent trust modeling.CISA / MAESTRO
  • Progressively increase access and autonomy -- never grant full autonomy on day one.CISA
  • Define an agentic incident-response plan with pre-authorized auto-containment.AG-MG.1
  • Govern decommissioning: revoke credentials, dispose of memory, and remove registry entries (prevent orphaned backdoors).OWASP NHI1 / AG-MG.3
  • Red-team agents continuously (adversarial testing, attack simulation).LLM01 control 7
  • Map your controls to MITRE ATLAS techniques (esp. AML.T0080-T0086) and re-baseline as the framework updates monthly.MITRE ATLAS

Frequently Asked Questions

An AI agent security checklist covers autonomous, tool-using, stateful systems -- not just a model that generates text. It adds controls absent from an LLM checklist: agent inventory and discovery, non-human identity and least-privilege tool scoping, runtime sandboxing and kill-switches, multi-agent communication security, and tamper-evident action logging. The agentic risks (OWASP T1-T15 and ASI01-ASI10) are explicitly framed as extensions of the OWASP LLM Top 10 into autonomous settings, so an agent checklist contains an LLM checklist and goes further.
Least privilege applied to the agent identity and tools -- directly countering OWASP Excessive Agency (LLM06:2025) and the CISA Privilege risks category. Because no fully reliable defense against prompt injection exists, you must assume injection succeeds; the durable mitigation is ensuring a compromised agent simply cannot perform high-impact actions or reach external endpoints. Default-deny tool access, short-lived scoped credentials, and human-in-the-loop on irreversible actions are the highest-leverage items.
Four authoritative bodies: OWASP (Top 10 for LLM Applications 2025, Top 10 for Agentic Applications 2026, and the NHI Top 10 2025); NIST (AI RMF 1.0 plus the Generative AI Profile AI 600-1 and the emerging agentic profile NIST AI 100-5 / CSA draft); CISA and Five Eyes (the four joint Cybersecurity Information Sheets, culminating in Careful Adoption of Agentic AI Services, 1 May 2026); and MITRE ATLAS for threat-informed defense. Use the NIST four functions (Govern, Map, Measure, Manage) as the spine and map specific controls to OWASP and ATLAS.
No. As of 2026 there is no fully reliable defense against prompt injection -- classifiers like Microsoft XPIA are demonstrably bypassable (EchoLeak chained four bypasses, including XPIA evasion). Treat filtering as one layer among many. The durable controls are privilege containment and content provenance, not detection alone.
Article 12 requires high-risk AI systems to automatically record events over the system lifetime; Article 26(6) requires deployers to retain those logs for a minimum of six months. Article 12 does not specify how logs resist tampering -- cryptographic tamper-evidence (hash chains, Merkle trees, WORM storage) is your own design choice to satisfy forensic and SOC 2 / ISO 27001 needs. High-risk obligations were originally set for 2 August 2026; a reported 7 May 2026 political agreement moves Annex III systems to 2 December 2027 -- confirm the final enacted dates. Penalties reach up to 15 million euros or 3% of global turnover.
It removes the egress channel. Real exfiltration exploits like EchoLeak and ShadowLeak depend on the agent reaching an external endpoint (image auto-fetch, SSRF, external tool callouts). With no internet, no outbound connections, no DNS, and no telemetry callbacks, those channels become architecturally impossible -- shrinking the blast radius. It also eliminates entire boundary-defense control categories for FedRAMP High and DoD IL4-IL5 and satisfies ITAR and data-residency requirements. It does not prevent injection itself or local corpus poisoning, so pair it with layered controls.
Multi-agent security is non-compositional: individually safe agents can compose into an unsafe system because trust does not aggregate predictably across agent-to-agent calls. New attack surfaces appear -- Agent Card spoofing and impersonation in A2A, tool poisoning and rug pulls in MCP, cascading failures across shared memory, and steganographic secret collusion that is undetectable even under full observability. Mitigate with mutual TLS, signed and verified agent identities, audience-scoped tokens, circuit breakers, and Plan-then-Execute separation.
A shadow agent is any agent with no registry entry, no assigned owner, OR no managed identity -- Microsoft Agent 365 rates this Critical. Shadow agents are unmonitored, often inherit broad employee credentials, and lack audit trails, making them the agentic equivalent of shadow IT. Industry data shows 79% of organizations lack visibility into their agents and 47% of enterprise AI use happens through personal accounts outside SSO -- which is why continuous discovery and a managed agent registry are the foundational control.
Start with the CISA immediate actions: inventory all agents (including shadow), run blast-radius assessments, audit service accounts, replace standing credentials with just-in-time provisioning, and extend logging to agent actions. Then govern autonomy on a dial -- classify each agent by tier and grant the lowest tier that works, promoting deliberately rather than by default. This avoids the quiet drift toward excessive agency that turns prototypes into production liabilities, and addresses Gartner warning that 40%+ of agentic projects will be canceled by 2027 for inadequate risk controls.

Put the Checklist to Work

Securing agents is one chapter of a defensible enterprise AI program. Build the strategy behind the controls, then turn this checklist into a tailored roadmap.

Build the Strategy Behind the Controls
Get the full playbook in the AI Strategy Blueprint -- the executive guide to deploying AI with governance, security, and ROI built in from day one, framed as seven executive commitments.
Get the AI Strategy Blueprint
Turn This Checklist Into Your Roadmap
Use the AI Blueprint Builder to generate a tailored agentic-AI governance and deployment plan mapped to your environment, risk tolerance, and compliance requirements.
Launch the AI Blueprint Builder

Sources & References

OWASP Standards

Government & Frameworks

Incidents, Research & Engineering

This guide synthesizes publicly available standards, vendor research, and security press as of 2026-05-30. Framework codes, technique counts, and EU AI Act dates are version-sensitive and were in flux at publication -- always verify against the authoritative source PDFs (OWASP, NIST, CISA, MITRE ATLAS, the EU AI Act consolidated text) before relying on a specific code or date in policy.