Lindsey HullJune 03, 2026

Governing AI Agents: When Your Assistant Recommends a Phishing Email

Governing AI Agents: What Changes When AI Acts | SBS

4:31

KEY TAKEAWAYS

Recognize the shift: AI has moved from answering questions to taking actions on your behalf. The risk model hasn't caught up.

Old controls miss new threats: Prompt injection, automation bias, and missing security context don't respond to traditional defenses.

Governance has to anticipate action: Acceptable use, vendor oversight, and staff training need to keep pace with what AI agents now do.

Recently, while testing an early-access AI assistant that can take actions on a user's behalf, an employee at a cybersecurity company asked it to help review a suspicious email. The assistant recommended downloading the attachment.

The email turned out to be a KnowBe4 phishing simulation, and the attachment was a test payload, so the incident caused no harm. But the lesson is real: The AI did not know it was looking at a phishing email. It had no security context. It saw an email with an attachment and a user asking for help, and it responded helpfully.

That's the story. The bigger one is what it reveals about a category of tools moving into regulated organizations faster than most policies can keep up with.

From AI That Talks to AI That Acts

The first wave of generative AI was largely conversational, with users asking questions and applying the answers themselves. The associated risks were relatively contained, mostly to inaccurate output or hallucinated facts, along with sensitive data entered into public AI tools.

The current wave is different. AI agents read inboxes, open documents, schedule meetings, send replies, and download files. They don't just produce text. They take actions in your environment. This is a categorical shift in the AI risk model, and most organizations are still governing for the previous one.

5 Risks Every Security Leader Should Understand

1. AI agents have no security context.

They generally don't see what your email filter flagged, what users have reported as phishing, or what your threat intelligence says about a sender. They evaluate content on its surface. A polished phishing email appears legitimate to an AI, sometimes more so than it does to a trained human.

2. Prompt injection is real and underappreciated.

Emails and documents can contain hidden instructions aimed at the AI, not the human reader: invisible text, formatting tricks, or natural-language instructions that tell the agent to forward a thread, share a file, or download an attachment. Any time an AI agent reads untrusted content with permission to act, this is a live attack surface. This isn't hypothetical: In 2025, researchers disclosed EchoLeak (CVE-2025-32711), a zero-click flaw in Microsoft 365 Copilot in which a single crafted email could silently exfiltrate internal data with no user interaction. Microsoft patched it, but it proved the attack class is practical, not theoretical.

3. Automation bias erodes human judgment.

Decades of research show people defer to automated systems, especially confident and articulate ones. When an AI tells a user a message is safe or urgent, the user is more likely to believe it, even when their instincts would otherwise pause. The more capable the agent, the stronger the bias.

4. There's a verification gap.

Agents offer to send replies, download files, and act on documents. Each action is a moment when a human should verify, but speed and convenience push in the opposite direction. A culture of "approve and move on" turns the agent's mistakes into the organization's mistakes.

5. Output review still matters.

When AI edits a document, drafts a client deliverable, or summarizes a thread, it can miss context, introduce errors, or quietly change meaning. Treating AI output as finished work is a quality and compliance risk, not just a productivity question.

Governing AI Agents Demands More Than a Policy

None of these risks are arguments against AI agents. They're genuinely useful, and institutions that ignore them will fall behind. But they require a different governance posture than the chat-era tools they're replacing, one that anticipates action, not just answers.

That governance work is what SBS's Virtual Chief AI Officer (vCAIO) program is built for. Grounded in the NIST AI Risk Management Framework and the NIST Cybersecurity Framework, a vCAIO helps institutions build an executive-level AI strategy, establish acceptable-use guidelines, assess and mitigate risk, evaluate AI-embedded vendors, and train staff on safe, practical use of tools like Microsoft Copilot. It's the structure that turns AI from an unmanaged liability into a strategic advantage.

If AI agents are already in your environment (and they almost certainly are), governing them isn't optional. The only question is whether you'll do it before or after your first incident.

How Can SBS Help?

Govern AI With Confidence

A person standing in front of a screen displaying the text Machine Learning.

Virtual Chief AI Officer

Gain a clear AI strategy with governance, risk management, vendor validation, and pilot projects designed to deliver measurable outcomes.

A human hand shaking an AI-generated robot hand.

Certified Banking AI Strategist

Master AI in banking through strategy, governance, risk management, and vendor evaluation, using practical tools to move from exploration to execution.