Most enterprise security failures don’t happen because a control is missing. They happen because a control behaves differently than everyone assumes.
In modern organizations, identity and access management stretches across hundreds of SaaS applications. Each application brings its own authentication semantics, transitional states, and undocumented edge cases. Security teams attempt to manage this complexity through configuration checks, policy assertions, and vendor APIs. Yet the reality of how users actually authenticate doesn’t live inside any single system. It lives in the interactions between them.
This creates a growing class of risk that violates no explicit rule, triggers no alert, and appears in no dashboard, while quietly undermining the security model. These are not misconfigurations. They are unknown-unknowns. And free-roaming AI agents are emerging as the first technology capable of finding them.
The Limits of Configuration-Driven Security
Most security tooling today is built on a deterministic idea of safety. A control exists, its configuration is documented, that configuration can be queried, and compliance becomes a matter of comparison. When risks are explicit and systems are self-contained, this model works well.
But it begins to fail when systems span multiple vendors, when authentication paths are optional rather than mandatory, and when enforcement is implicit instead of declarative. Identity systems are especially vulnerable to this gap. Their behavior often emerges from how multiple platforms interact, not from how any single one is configured.
SSO Isn’t a Control, It’s a Contract
Single Sign-On is frequently treated as a security control, but in practice it is a protocol contract between two independent parties: the identity provider and the service provider.
The identity provider governs how users authenticate. It enforces MFA, evaluates conditional access policies, issues tokens, and signs assertions. The service provider, however, decides whether those assertions are required, whether passwords remain valid, and whether fallback authentication paths exist at all.
This division of responsibility is fundamental, yet often misunderstood. Configuring SSO in the identity provider does not enforce SSO. It only makes SSO available. Enforcement lives entirely on the application side.
Authentication Reality Is Behavioral
Most security models rely on a simple assumption: if SSO is configured and MFA is enforced, then all users authenticate through the identity provider. That assumption holds only if the service provider requires it.
When an application allows password-based login, local authentication, legacy credentials, or transitional “test” states, MFA enforcement becomes optional from the user’s perspective. And because the identity provider is never involved, it has no visibility into these authentications at all.
The result is a gap between the security model teams believe they have and the behavior that actually occurs.
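To make the division of responsibility concrete, here is a minimal conceptual sketch in Python. It is not any vendor's API: the AppAuthPolicy class and its field names are hypothetical, and the model simply assumes that a service provider which requires SSO rejects every other login path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppAuthPolicy:
    """Hypothetical model of what a service provider accepts at login."""
    sso_configured: bool
    sso_required: bool
    password_login_allowed: bool
    legacy_credentials_allowed: bool = False


def idp_bypass_paths(policy: AppAuthPolicy) -> list[str]:
    """Return the login paths that never touch the identity provider."""
    # Assumption of this model: requiring SSO disables every other path.
    if policy.sso_configured and policy.sso_required:
        return []
    paths = []
    if policy.password_login_allowed:
        paths.append("password")
    if policy.legacy_credentials_allowed:
        paths.append("legacy_credentials")
    return paths


# SSO is available but optional, so passwords still work.
configured_only = AppAuthPolicy(
    sso_configured=True, sso_required=False, password_login_allowed=True
)
# SSO is required, so the identity provider sees every login.
enforced = AppAuthPolicy(
    sso_configured=True, sso_required=True, password_login_allowed=True
)

print(idp_bypass_paths(configured_only))
print(idp_bypass_paths(enforced))
```

In this toy model, the identity provider's MFA policy is identical in both cases. Only the application-side setting decides whether a path around it exists.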
An Unknown-Unknown in the Wild
This gap became visible through the work of a free-roaming AI agent analyzing identity and application behavior across an organization. The agent wasn’t searching for a known misconfiguration or a specific failure mode. Instead, it asked a more fundamental question: does authentication behavior match the security model the organization believes is in place?
Step 1: Establishing Intent: Who Should Be Using SSO
The agent first established intent. Users assigned to the “SSO – Box” group in Microsoft Entra ID were clearly meant to authenticate via Entra, be subject to MFA, and avoid local passwords. Group membership expressed administrative expectations, even if it didn’t enforce them.
Step 2: Observing Reality: Who Actually Authenticated via Entra
The agent then examined Entra sign-in logs for successful authentications to Box. This produced a set of users who demonstrably authenticated via Entra. If SSO were fully enforced, this set would align with the group membership. It didn’t.
Step 3: The Missing Users Problem
Several users who belonged to the SSO group and had active Box accounts were completely absent from Entra’s Box sign-in logs. There was no misconfiguration to flag, no policy violation to alert on, and no failed authentication attempt to investigate. There was simply a gap, an absence of evidence where evidence should have existed.
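The comparison in Steps 1 through 3 amounts to a set difference. The sketch below assumes the three inputs have already been exported as lists of email addresses; the function name, variable names, and sample users are all hypothetical, and no real Entra or Box API is called.

```python
def missing_from_idp(sso_group, active_app_accounts, idp_signin_users):
    """Users expected to use SSO who never appear in the IdP's sign-in logs."""
    # Intent: members of the SSO group who also hold an active app account.
    expected = {u.lower() for u in sso_group} & {u.lower() for u in active_app_accounts}
    # Reality: users with at least one successful IdP sign-in to the app.
    observed = {u.lower() for u in idp_signin_users}
    return expected - observed


# Hypothetical sample data for illustration only.
sso_group = ["alice@example.com", "bob@example.com", "carol@example.com"]
active_app_accounts = ["alice@example.com", "Carol@example.com", "dave@example.com"]
idp_signin_users = ["alice@example.com"]

missing = missing_from_idp(sso_group, active_app_accounts, idp_signin_users)
print(sorted(missing))
```

Here bob is ignored because he has no active account, and dave because he was never expected to use SSO. Only carol is both expected and unobserved, which is the kind of absence the agent treated as a lead rather than a dead end.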
This is where traditional security tooling would stop. The agent didn’t.
Step 4: Crossing the Boundary: Box Native Login Events
The agent crossed the boundary into Box’s own audit logs and examined native LOGIN and ADMIN_LOGIN events generated directly by Box. Those logs showed recent login activity for the same users, occurring after Entra had already recorded their first successful SSO authentication. This ruled out onboarding timing issues or historical artifacts. These users were actively authenticating to Box without Entra being involved.
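One way to sketch this cross-system check is to look for application login events that have no identity provider sign-in shortly before them. The record shapes below (dicts with user, event_type, and time keys), the five-minute window, and the sample events are assumptions made for illustration. Only the LOGIN and ADMIN_LOGIN event type names come from the account above, and a real correlation would need to tune the window against actual log behavior.

```python
from datetime import datetime, timedelta

NATIVE_LOGIN_TYPES = {"LOGIN", "ADMIN_LOGIN"}


def unbacked_logins(app_events, idp_signins, window=timedelta(minutes=5)):
    """Return app login events with no IdP sign-in by the same user
    in the `window` immediately before the event."""
    signins_by_user = {}
    for s in idp_signins:
        signins_by_user.setdefault(s["user"].lower(), []).append(s["time"])

    flagged = []
    for e in app_events:
        if e["event_type"] not in NATIVE_LOGIN_TYPES:
            continue  # only login events are relevant here
        times = signins_by_user.get(e["user"].lower(), [])
        backed = any(timedelta(0) <= e["time"] - t <= window for t in times)
        if not backed:
            flagged.append(e)
    return flagged


# Hypothetical sample data for illustration only.
idp_signins = [
    {"user": "alice@example.com", "time": datetime(2024, 5, 1, 9, 0)},
    {"user": "carol@example.com", "time": datetime(2024, 5, 1, 9, 0)},
]
app_events = [
    {"user": "alice@example.com", "event_type": "LOGIN", "time": datetime(2024, 5, 1, 9, 1)},
    {"user": "carol@example.com", "event_type": "LOGIN", "time": datetime(2024, 5, 3, 14, 0)},
    {"user": "carol@example.com", "event_type": "UPLOAD", "time": datetime(2024, 5, 3, 14, 5)},
    {"user": "dave@example.com", "event_type": "ADMIN_LOGIN", "time": datetime(2024, 5, 4, 8, 0)},
]

flagged = unbacked_logins(app_events, idp_signins)
print([e["user"] for e in flagged])
```

In the sample, alice's login follows an IdP sign-in by one minute and is treated as SSO-backed. Carol's login two days after her only IdP sign-in, and dave's admin login with no IdP record at all, are flagged as authentications the identity provider never saw.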
At this point, only two explanations were plausible. Either users were being temporarily removed from the SSO group at the moment of login, or Box was still allowing non-SSO authentication.
The second explanation was systemic, and correct. SSO was configured, but not enforced. Box was operating in SSO Test Mode.
When the Signal Doesn’t Exist
Box has since introduced an API endpoint that exposes enterprise configuration details, including SSO enforcement and test mode status. Today, this condition can be queried directly.
At the time of the agent’s discovery, it couldn’t.
There was no reliable field indicating enforcement state, no ‘is_test_mode’ or ‘sso_enforced’ flag, and no configuration API that could surface the issue. Security teams had no way to detect this gap through traditional checks. Only behavioral correlation across systems made it visible.
This is what makes unknown-unknowns so difficult. They aren’t the result of careless vendors or missing controls. They emerge when multiple systems interact in ways that no single system fully represents.

Why Humans Don’t Scale Here
Modern enterprises operate hundreds of SaaS applications, each with unique SSO semantics, undocumented edge cases, and independent release cycles. Expecting security teams to continuously reason about every possible authentication path, correlate intent with behavior, and keep pace with these changes is structurally impossible.
Even highly skilled teams miss these issues, not because of negligence but because the system has outgrown human cognitive limits.
What Free-Roaming AI Agents Change
Free-roaming AI agents don’t rely on predefined checks or static assertions. They model expected behavior, observe real behavior, and look for contradictions. They move freely across systems and reason across trust boundaries.
Instead of asking whether SSO is enabled, they ask whether it is possible to authenticate in a way that violates the organization’s security assumptions.
That shift from configuration to behavior is the breakthrough.
Easy Fix, Hard Discovery
Once identified, remediation was trivial. Enabling “Complete SSO activation” in the Box admin console required a single checkbox.
But discovering the need for that checkbox required identity expertise, cross-system reasoning, behavioral validation, hypothesis testing, and data correlation. This is exactly the kind of work free-roaming AI agents are uniquely suited for.
Security Lives Between Systems
This Box example isn’t exceptional. Most large organizations have SSO that is “mostly enforced,” MFA that is “usually applied,” transitional states no one remembers, and legacy paths no one monitors.
Unknown-unknowns don’t live in dashboards. They live in the seams between products.
Toward a New Security Model
The future of security isn’t more controls, more settings, or more dashboards. It is continuous reasoning, behavioral validation, cross-system understanding, and autonomous exploration.
Free-roaming AI agents aren’t an optimization. They are becoming a foundational requirement for securing systems that no human or static tool can fully reason about anymore.
Unknown-unknowns are inevitable, and finding them no longer has to be optional. Book a free assessment to identify Entra ID risks.



