
79% of Companies Can't See What Their AI Agents Are Doing
AI agents gained computer use, credit cards, and trust infrastructure in the same week — while most enterprises still can't track what their agents are doing.
On March 5, OpenAI shipped GPT-5.4 with native computer use — a model that sees your screen, moves your mouse, and executes multi-step workflows across applications without a human touching the keyboard. On the GDPval benchmark, which tests agents across 44 occupations, GPT-5.4 matches or exceeds industry professionals in 83% of comparisons. The same week, Gravitee's State of AI Agent Security 2026 report confirmed that only 21% of executives have complete visibility into what their agents are doing, accessing, or spending.
That means 79% of the companies deploying these agents are operating blind — exactly the accountability gap that AgentPMT's verifiable agent infrastructure was built to close.
The timing of that gap matters. Nearly 70% of enterprises already run AI agents in production. Gartner projects 40% of enterprise applications will embed agents by the end of this year, up from 5% in 2025. Luma launched creative agents this week that coordinate text, image, video, and audio generation for clients including Publicis Groupe and Adidas. The capability expansion is real, and it's accelerating.
What's not accelerating is the infrastructure to verify what those agents actually did. AgentPMT was built to close this exact gap — blockchain wallets that give every agent a verifiable on-chain identity, credit card integration where agents never see payment credentials, and auditable execution logs for every tool call and workflow step. While the rest of the industry scrambles to build trust infrastructure, the platform already ships it. And this week, the biggest names in payments and AI started validating that approach.
The Trust Layer That Just Shipped
Between March 3 and March 6, a series of announcements laid the foundation for what could become the trust infrastructure of agentic commerce.
Mastercard and Google went first, unveiling Verifiable Intent on March 5 — an open-source framework that creates tamper-resistant cryptographic records of what a user authorized when an AI agent acts on their behalf. The record links consumer identity, specific instructions, and transaction outcomes into a single audit trail that all parties can consult during disputes. "As autonomy increases, trust cannot be implied. It must be proven," said Pablo Fourez, Mastercard's CDO. Google's VP of Payments Stavan Parikh called it "a natural accelerator for scaling agentic commerce." IBM, Fiserv, Adyen, Checkout.com, Basis Theory, and Getnet have committed to adopting it. Mastercard open-sourced the specification on GitHub.
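The published specification is the authoritative reference; the function names below are illustrative, not taken from the framework itself. But the core idea — binding a consumer's identity, the specific instruction they authorized, and the transaction outcome under a single signature so that any later modification is detectable — can be sketched with nothing but standard-library primitives:

```python
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # stand-in for the record issuer's real signing key


def sign_intent(user_id: str, instruction: str, outcome: dict) -> dict:
    """Bind identity, instruction, and outcome into one signed record."""
    payload = json.dumps(
        {"user": user_id, "instruction": instruction, "outcome": outcome},
        sort_keys=True,  # canonical ordering so the signature is reproducible
    )
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": sig}


def verify_intent(record: dict) -> bool:
    """Recompute the MAC; any tampering with the payload breaks verification."""
    expected = hmac.new(SECRET, record["payload"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])


record = sign_intent("user-42", "buy 1x running shoes under $120", {"amount": 109.99})
assert verify_intent(record)

# Altering the outcome after the fact is immediately detectable.
tampered = dict(record, payload=record["payload"].replace("109.99", "999.99"))
assert not verify_intent(tampered)
```

A production framework would use asymmetric signatures so that merchants, networks, and dispute handlers can verify the record without holding the signing key; the symmetric HMAC here is only the simplest way to show the tamper-evidence property.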
Two days earlier, Stripe introduced Shared Payment Tokens — a payment primitive built specifically for agents. SPTs let an AI agent initiate a purchase using a customer's preferred payment method without the agent ever seeing the actual card number, CVV, or expiration date. The customer grants permission once. The agent carries a token. The merchant gets paid. Stripe is now the first and only provider that supports agentic network tokens from both Visa and Mastercard alongside buy-now-pay-later tokens from Affirm and Klarna through a single integration. Before SPTs, AI agents defaulted to card-on-file payments, freezing out every alternative payment method. That restriction quietly shaped how millions of agent-assisted transactions were processed — and consumers had no idea they were losing payment flexibility.
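Stripe's actual API looks nothing like this, but the structural property the paragraph describes — the agent carries only an opaque handle while credentials stay in a vault and are injected server-side at charge time — reduces to a small sketch (all class and method names here are invented for illustration):

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class TokenVault:
    """Holds the real card credentials; only the vault ever sees them."""
    _cards: dict = field(default_factory=dict)  # token -> card details

    def grant(self, card_number: str, cvv: str) -> str:
        """Customer grants permission once; the agent gets an opaque token."""
        token = "spt_" + secrets.token_hex(8)
        self._cards[token] = {"number": card_number, "cvv": cvv}
        return token

    def charge(self, token: str, amount: float) -> bool:
        """Credentials are looked up server-side at the moment of purchase."""
        return token in self._cards and amount > 0


class ShoppingAgent:
    """The agent carries only the token, never the card."""

    def __init__(self, token: str):
        self.token = token

    def checkout(self, vault: TokenVault, amount: float) -> bool:
        return vault.charge(self.token, amount)


vault = TokenVault()
token = vault.grant("4242424242424242", "123")  # one-time customer grant
agent = ShoppingAgent(token)

assert agent.checkout(vault, 59.00)        # merchant gets paid
assert token.startswith("spt_")            # the agent's handle is opaque
assert not hasattr(agent, "card_number")   # the agent never held the PAN
```

The design point is that a compromised agent leaks only a revocable token scoped to one customer grant, not a reusable card number.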
Then on March 6, OpenAI released Codex Security in research preview — an AI security agent that scans codebases, identifies vulnerabilities, builds proof-of-concept exploits in sandboxed environments, and proposes fixes. During testing, Codex Security scanned 1.2 million commits and surfaced 10,561 high-severity issues across repositories including OpenSSH, GnuTLS, and Chromium. False positive rates dropped by more than 50% across all repositories during the preview period.
Each of these is a significant step forward. But notice the pattern: Verifiable Intent covers payment authorization. SPTs cover checkout credential isolation. Codex Security covers code vulnerabilities. None of them addresses the fundamental question of agent identity — who is this agent, what is it authorized to do across all systems, and can you prove it after the fact?
AgentPMT's credit card integration already does what Stripe's SPTs are introducing — agents transact without ever seeing credentials, with server-side injection at the moment of purchase. AgentPMT's blockchain wallets on Base provide the tamper-resistant audit trail that Verifiable Intent is building for payments, but extended to every agent action — tool calls, workflow steps, budget consumption — not just card transactions. The difference between a single-domain trust layer and a comprehensive one is the difference between locking your front door and having a security system for the entire building.
The Identity Dark Matter Problem
An investigation published March 3 by The Hacker News gave the underlying crisis a name: identity dark matter. The framing is apt. Dark matter is powerful, invisible, and unmanaged, and that describes exactly how most AI agents operate inside enterprise environments.
The numbers confirm it. Gravitee's security report found that 88% of organizations experienced confirmed or suspected AI agent security incidents in the past year. In healthcare, that number reaches 92.7%. Only 21.9% of teams treat AI agents as independent, identity-bearing entities. Nearly half — 45.6% — still authenticate agents using shared API keys, meaning there is no way to distinguish what one agent did from another. When incidents occur, forensic teams trace API key usage to discover that the key was shared across dozens of processes, none of them individually identifiable.
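The attribution failure is mechanical, not subtle: an audit log can only record the credential presented to it, so when two agents present the same key, their actions collapse into one indistinguishable actor. A toy illustration (key names invented):

```python
audit_log = []


def call_tool(api_key: str, action: str) -> None:
    # The log can only record what the credential tells it.
    audit_log.append({"identity": api_key, "action": action})


# Two different agents, one shared key: attribution is gone.
SHARED_KEY = "sk-team-shared"
call_tool(SHARED_KEY, "export_customer_table")   # done by the reports agent
call_tool(SHARED_KEY, "send_invoice_batch")      # done by the billing agent
assert len({e["identity"] for e in audit_log}) == 1  # forensics sees one actor

# Per-agent credentials restore one-to-one attribution.
audit_log.clear()
call_tool("sk-agent-billing", "send_invoice_batch")
call_tool("sk-agent-reports", "export_customer_table")
assert len({e["identity"] for e in audit_log}) == 2  # two actors, two trails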
The tool ecosystem compounds the problem. BlueRock Security analyzed more than 7,000 MCP servers — the protocol that connects AI agents to external tools and data — and found 36.7% were vulnerable to server-side request forgery. Trend Micro identified 492 MCP servers operating with zero client authentication and zero traffic encryption. And the OpenClaw supply chain attack confirmed 1,184 malicious skills across the ClawHub marketplace — roughly one in five packages. These are the tools agents are using to do their work, and a significant percentage of them are compromised.
Now layer GPT-5.4's computer use on top of this. An agent that can see your screen, navigate applications, click buttons, and fill forms is operating with a fundamentally different risk profile than a chatbot answering questions. The identity and permission problem shifts from inconvenient to dangerous. When a compromised agent can control your browser, the shared API key authenticating it isn't just a governance gap — it's an open door.
AgentPMT addresses this with AgentAddress — an open-source wallet signature mechanism that gives every agent a cryptographically verifiable identity on the Base blockchain. No shared API keys. No orphaned credentials. Every agent has its own wallet, its own identity, and its own auditable history. Dynamic MCP ensures agents only access the tools they need, when they need them, with zero context window consumption from tools they aren't using. The tool surface area stays controlled rather than sprawling across thousands of unverified servers.
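AgentAddress itself uses on-chain wallet signatures on Base; the sketch below substitutes a standard-library HMAC keyed per agent as a stand-in for a wallet keypair (class and field names are invented). The property being demonstrated is the one that matters: each agent signs its own actions with a secret no other agent holds, so every log entry is attributable to exactly one identity and a different agent cannot produce or claim it:

```python
import hashlib
import hmac
import os


class AgentIdentity:
    """Stand-in for a wallet keypair: each agent signs with its own secret."""

    def __init__(self, name: str):
        self.name = name
        self._secret = os.urandom(32)  # never shared between agents
        # A public "address" derived from the secret, akin to a wallet address.
        self.address = hashlib.sha256(self._secret).hexdigest()[:16]

    def sign(self, action: str) -> dict:
        sig = hmac.new(self._secret, action.encode(), hashlib.sha256).hexdigest()
        return {"address": self.address, "action": action, "signature": sig}

    def verify(self, entry: dict) -> bool:
        expected = hmac.new(
            self._secret, entry["action"].encode(), hashlib.sha256
        ).hexdigest()
        return hmac.compare_digest(expected, entry["signature"])


billing = AgentIdentity("billing-agent")
reports = AgentIdentity("reports-agent")

entry = billing.sign("pay_invoice:INV-1009")
assert entry["address"] == billing.address  # every action maps to one agent
assert billing.verify(entry)                # the signer can prove the entry
assert not reports.verify(entry)            # another agent cannot claim it
```

Real wallet signatures improve on this in one important way: they are asymmetric, so anyone can verify an entry against the public address without holding the agent's secret — which is what makes the history independently auditable.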
The Regulatory Pressure Arrives on Schedule
Three days from now — March 11, 2026 — the Secretary of Commerce must publish an evaluation identifying state AI laws that the federal government considers burdensome or conflicting with national policy. The same day, the FTC must issue a policy statement explaining how its prohibition on unfair and deceptive practices applies to AI. Both deadlines were set by President Trump's December 2025 executive order titled "Ensuring a National Policy Framework for Artificial Intelligence."
The stakes are concrete. The Commerce evaluation will flag specific state laws for potential preemption — likely targeting bias testing requirements, impact assessments, and transparency mandates. An AI Litigation Task Force stands ready to challenge flagged laws in federal court. The FTC policy statement will determine whether state laws requiring AI to alter outputs can be classified as compelling deceptive practices under federal law.
While Washington prepares to draw these lines, states keep legislating. On March 6, Oregon's legislature gave final passage to SB 1546, a chatbot safety bill requiring AI platforms to implement safeguards when users express thoughts of suicide or self-harm and to disclose when content is AI-generated. The Senate vote was 26-1. Washington state has a similar bill pending. The Transparency Coalition tracked 78 AI chatbot safety bills across 27 states as of the same week.
The collision is straightforward. The federal government wants to reduce regulatory fragmentation. States want to protect constituents. Companies deploying AI agents need to comply with whatever framework emerges — and right now, nobody knows which one that will be.
Here's what both sides agree on: accountability requires evidence. Whether the regulatory framework comes from one federal standard or 50 state-level requirements, companies will need to demonstrate what their agents did, what they were authorized to do, and what they spent. The audit trail isn't optional under any plausible outcome.
AgentPMT's infrastructure was designed for this reality. Full request/response capture. Workflow step tracking with success and failure logging. On-chain transaction records via blockchain wallets. Budget controls with daily, weekly, monthly, and per-transaction spending caps. Vendor whitelisting and product-level restrictions. Whether March 11 produces a single federal framework or accelerates state-level regulation, the companies operating on AgentPMT already have the evidence layer regulators are going to demand.
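The controls listed above compose naturally: a spend request passes only if it clears the vendor whitelist, the per-transaction cap, and the rolling daily cap — and every attempt, approved or denied, lands in the audit trail. A minimal sketch of that policy gate (not AgentPMT's actual implementation; all names are illustrative):

```python
from datetime import date


class BudgetGuard:
    """Per-transaction and daily caps plus a vendor whitelist, fully logged."""

    def __init__(self, per_txn: float, daily: float, vendors: set):
        self.per_txn, self.daily, self.vendors = per_txn, daily, vendors
        self.spent_today, self.day = 0.0, date.today()
        self.log = []

    def attempt(self, vendor: str, amount: float) -> bool:
        if date.today() != self.day:  # reset the rolling daily window
            self.spent_today, self.day = 0.0, date.today()
        ok = (
            vendor in self.vendors
            and amount <= self.per_txn
            and self.spent_today + amount <= self.daily
        )
        if ok:
            self.spent_today += amount
        # Successes AND failures are logged, so the audit trail is complete.
        self.log.append({"vendor": vendor, "amount": amount, "approved": ok})
        return ok


guard = BudgetGuard(per_txn=50.0, daily=100.0, vendors={"aws", "openai"})
assert guard.attempt("aws", 40.0)       # within all caps
assert not guard.attempt("aws", 75.0)   # exceeds the per-transaction cap
assert not guard.attempt("ebay", 10.0)  # vendor not whitelisted
assert guard.attempt("openai", 50.0)    # still under the daily cap (90 total)
assert not guard.attempt("aws", 20.0)   # would breach the daily cap
assert len(guard.log) == 5              # every attempt recorded either way
```

Logging denials alongside approvals is the part regulators care about: the trail shows not just what agents spent, but what they tried to spend and were stopped from spending.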
What This Means For You
The trust infrastructure for AI agents is being assembled by committee — Mastercard for payment verification, Stripe for checkout credential isolation, OpenAI for code security, Oregon for chatbot safety. Each piece solves a real problem. None of them solves the whole problem.
If you're deploying agents today, here's the practical question: can you prove what your agents did yesterday? Not in aggregate. For each individual agent, for each action, with a verifiable record. If the answer is no, you're operating in the 79% without visibility, in an environment where 88% of organizations have already experienced security incidents.
AgentPMT is the only platform that spans all three trust dimensions the market is trying to build separately. Identity — every agent gets a blockchain wallet and AgentAddress for cryptographic authentication. Verification — x402Direct provides on-chain transaction guarantees, and credit card integration ensures agents never touch payment credentials. Accountability — every tool call, workflow step, and transaction is logged with full context and auditable from a single dashboard. This isn't a patchwork of point solutions. It's the architectural decision that determines whether you can answer the regulators, the auditors, and the board when they ask what your agents actually did.
What to Watch
The March 11 Commerce and FTC evaluations will reveal which state AI laws the federal government considers obstacles — and which survive. That decision shapes the compliance landscape for every company deploying agents through the rest of 2026.
Watch whether Visa develops its own verification standard or adopts Mastercard's Verifiable Intent. A single standard accelerates adoption. Competing standards fragment the trust layer the same way competing payment protocols fragmented agent commerce over the past six months.
The Agentic AI Foundation under the Linux Foundation — with Anthropic, OpenAI, Google, Microsoft, and AWS as platinum members — faces its first real governance test. With 36.7% of MCP servers vulnerable to SSRF and 1,184 confirmed malicious skills in the ecosystem, the foundation's response will signal whether open governance can move at the speed the security crisis demands.
And DeepSeek V4 appears imminent. If it arrives with competitive agent capabilities at lower cost, adoption will accelerate again — and the trust infrastructure gap widens with every new agent deployed without identity, verification, or accountability.
The companies building accountability infrastructure now won't need to retrofit it later. The ones waiting for regulators to force the issue will discover that building audit trails into systems that were never designed to have them is significantly harder — and more expensive — than getting it right the first time.
Explore what comprehensive agent accountability looks like at agentpmt.com.
Key Takeaways
- GPT-5.4 shipped with native computer use, but 79% of companies lack visibility into what their agents are doing
- Mastercard, Google, and Stripe each built a piece of the trust layer this week — nobody built the whole thing
- 88% of organizations have experienced AI agent security incidents, with only 21% having full visibility
- The March 11 Commerce/FTC regulatory deadline is three days away, and audit trails will be non-negotiable under any outcome
Sources
- How Verifiable Intent builds trust in agentic AI commerce — Mastercard
- Mastercard Unveils Open Standard to Verify AI Agent Transactions — PYMNTS
- Supporting additional payment methods for agentic commerce — Stripe Blog
- Klarna Expands Further Into Agentic Commerce via Stripe's Shared Payment Tokens — BusinessWire
- Affirm expands Stripe partnership to support Shared Payment Tokens for agentic commerce — Affirm Investor Relations
- Introducing GPT-5.4 — OpenAI
- OpenAI launches GPT-5.4 with Pro and Thinking versions — TechCrunch
- Codex Security: now in research preview — OpenAI
- OpenAI Codex Security Scanned 1.2 Million Commits and Found 10,561 High-Severity Issues — The Hacker News
- AI Agents: The Next Wave Identity Dark Matter — The Hacker News
- AI went from assistant to autonomous actor and security never caught up — Help Net Security
- State of AI Agent Security 2026 Report — Gravitee
- AI Legislative Update: March 6, 2026 — Transparency Coalition
- Oregon lawmakers pass AI bill for youth mental health — KOIN
- Luma launches creative AI agents powered by Unified Intelligence — TechCrunch
- FTC AI Policy Deadline March 11: Compliance Guide — Digital Applied