AI Agents Need Accountability Before They Can Become Coworkers

Enterprise AI will not be defined only by model capability. The harder question is whether companies can make AI work traceable, governable, auditable, and economically sustainable.

What happened

AI agents are moving from demos into workflows. They write code, classify customer requests, draft contracts, summarize financial data, monitor exceptions, and increasingly participate in business processes that used to be handled only by human employees or traditional software.

The phrase "software as a coworker" captures the shift well. Software is no longer only a tool that waits for a human command. In more workflows, it can take an input, decide what to do next, produce an output, and hand that output back into the organization.

But the coworker metaphor also hides the most important difference. A human coworker can explain a decision, escalate uncertainty, accept correction, and sit inside a chain of responsibility. An AI agent cannot carry legal, managerial, or fiduciary responsibility. If the agent makes a mistake, the company still owns the consequence.

That is why the enterprise AI question is not only, "How much work can we give to AI?" The harder question is, "Can the company take responsibility for the work AI performs?"

Why this matters

In consumer use cases, "mostly right" can be useful. A draft can be edited. A summary can be checked. A brainstorm can be imperfect and still valuable. Enterprise workflows are different.

Payments, credit decisions, insurance claims, regulatory reporting, financial controls, legal review, procurement approvals, and customer remediation processes all carry consequences. A small error can become a financial loss, a compliance issue, a reputational problem, or a governance failure.

This is where many AI projects hit the gap between technical capability and organizational readiness. Gartner has warned that at least 30% of generative AI projects may be abandoned after proof of concept by the end of 2025, citing issues such as poor data quality, inadequate risk controls, escalating costs, and unclear business value. That list is useful because it shows that the bottleneck is not only model performance. It is the operating environment around the model.

Japan's Financial Services Agency has also framed AI in finance as a dual challenge. Its AI discussion paper recognizes the risks of complex AI, including generative AI, while also warning about the opposite risk: falling behind by not taking on the challenge of new technology at all. That is the right tension. The answer is not to avoid AI. The answer is to make AI usable in a form the institution can explain, monitor, and govern.

How I see it

My view is that enterprise AI will not be divided simply between companies that "use AI" and companies that do not. The more important divide will be between companies that can make AI accountable inside workflows and companies that cannot.

Accountability does not mean asking the AI to be responsible. It means designing the process so that the company can remain responsible. That requires at least three layers.

The first is data control. If an AI system uses enterprise data to make or support a decision, the company needs to know what data was used, where it came from, whether it was reliable, and whether it was allowed to be used in that context. Without data control, the organization cannot explain the basis of the output.
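As a rough illustration of the data-control layer, the four questions above (what data, from where, how reliable, allowed in this context) can be attached to each data source and checked before the agent uses it. This is a minimal sketch, not a reference implementation: `DataSource`, `check_data_use`, and the purpose labels are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str                     # what data was used
    origin: str                   # where it came from
    verified: bool                # whether it is considered reliable
    allowed_purposes: frozenset   # contexts in which use is permitted


def check_data_use(source: DataSource, purpose: str) -> tuple[bool, str]:
    """Return (allowed, reason) so the basis of an output can be explained later."""
    if not source.verified:
        return False, f"{source.name}: source is not verified as reliable"
    if purpose not in source.allowed_purposes:
        return False, f"{source.name}: not permitted for purpose '{purpose}'"
    return True, f"{source.name} ({source.origin}): permitted for '{purpose}'"


# Hypothetical source: CRM data cleared for support work only.
crm = DataSource(
    name="crm_contacts",
    origin="internal CRM export",
    verified=True,
    allowed_purposes=frozenset({"customer_support"}),
)

print(check_data_use(crm, "customer_support"))
print(check_data_use(crm, "credit_decision"))
```

The point of returning a reason alongside the yes/no answer is that a refusal is itself something the organization may need to explain.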

The second is auditability. The company needs records of what the AI did, what it referenced, what it produced, who reviewed it, and where it entered the workflow. In other words, the output needs a trace. If the company cannot reconstruct the decision path, the AI output remains a suggestion rather than part of an accountable business process.
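One way to picture such a trace is an append-only log in which every agent action is stored with its references, output, reviewer, and workflow step. The sketch below is illustrative only: `AuditRecord`, `AuditLog`, and the sample case are hypothetical, and the log lives in memory, whereas a production system would presumably need durable, tamper-resistant storage.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    case_id: str
    agent: str                 # which AI system acted
    action: str                # what it did
    references: list           # what it referenced
    output: str                # what it produced
    workflow_step: str         # where the output entered the workflow
    reviewer: Optional[str] = None   # who reviewed it, if anyone
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class AuditLog:
    """Append-only, in-memory log (an assumption made for brevity)."""

    def __init__(self) -> None:
        self._records = []

    def append(self, record: AuditRecord) -> None:
        self._records.append(record)

    def trace(self, case_id: str) -> list:
        """Reconstruct the decision path for one case, in recorded order."""
        return [asdict(r) for r in self._records if r.case_id == case_id]


log = AuditLog()
log.append(AuditRecord("case-1", "claims-agent", "classify",
                       ["policy.pdf"], "category=water_damage", "intake"))
log.append(AuditRecord("case-1", "claims-agent", "draft_decision",
                       ["policy.pdf", "photos.zip"], "approve",
                       "adjudication", reviewer="j.doe"))

print(json.dumps(log.trace("case-1"), indent=2))
```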

The third is human intervention. Not every case should be automated end to end. High-value transactions, unusual fact patterns, low-confidence outputs, regulated decisions, and customer-impacting exceptions should have clear escalation gates. The goal is not to keep humans in every loop. The goal is to know which loops still require humans.
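The escalation gates listed above can be written down as explicit rules rather than left to convention. The following is a minimal sketch under stated assumptions: the `Case` fields, the function names, and both thresholds (`HIGH_VALUE_LIMIT`, `MIN_CONFIDENCE`) are hypothetical placeholders that a real institution would set through its own risk policy.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would come from risk policy.
HIGH_VALUE_LIMIT = 10_000.0
MIN_CONFIDENCE = 0.85


@dataclass
class Case:
    amount: float
    confidence: float                      # model's self-reported confidence
    regulated: bool = False
    unusual_pattern: bool = False
    customer_impacting_exception: bool = False


def escalation_reasons(case: Case) -> list:
    """List every gate the case trips; an empty list means none apply."""
    reasons = []
    if case.amount >= HIGH_VALUE_LIMIT:
        reasons.append("high-value transaction")
    if case.unusual_pattern:
        reasons.append("unusual fact pattern")
    if case.confidence < MIN_CONFIDENCE:
        reasons.append("low-confidence output")
    if case.regulated:
        reasons.append("regulated decision")
    if case.customer_impacting_exception:
        reasons.append("customer-impacting exception")
    return reasons


def route(case: Case) -> str:
    """Send the case to a human if any gate applies, otherwise automate."""
    return "human_review" if escalation_reasons(case) else "auto"


print(route(Case(amount=120.0, confidence=0.97)))
print(route(Case(amount=50_000.0, confidence=0.97)))
print(escalation_reasons(Case(amount=100.0, confidence=0.5, regulated=True)))
```

Returning the reasons, not only the routing decision, lets the reviewer see why a case reached them and lets the organization audit which gates fire most often.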

Implications

For founders, this changes what it means to build an enterprise AI company. The product cannot be only a smarter model interface. It needs to fit into authorization, logging, exception handling, policy management, data permissioning, and review workflows. In many categories, the winning product may be the one that makes AI governable, not simply the one that makes AI impressive.

For corporate buyers, the implication is that AI vendor evaluation should move beyond accuracy demos. The real questions are operational. Can the system explain why it produced an answer? Can it show what data it used? Can it route exceptions? Can it preserve logs? Can it separate low-risk automation from high-risk judgment? Can it be governed by compliance, risk, legal, and business teams together?

For investors, this suggests a different lens on AI infrastructure and application software. If AI agents become part of core workflows, value may accrue to companies that own the responsible execution layer: data access, model routing, audit trails, human approval gates, cost controls, and workflow integration. The agent itself may be visible, but the governance layer may be where enterprise adoption becomes durable.

Inference cost also belongs in this discussion. Gartner has projected major declines in the cost of performing inference on very large models by 2030, but it also notes that agentic models can require many more tokens per task than standard chatbots. Lower unit costs do not automatically remove the economic problem. If AI agents perform more tasks, and more complex tasks, total consumption can still rise.
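A small calculation makes the point concrete. Every number below is invented for illustration and is not taken from Gartner or any vendor: the sketch assumes the price per million tokens falls by 90%, while an agentic workflow uses 20 times more tokens per task and runs three times as many tasks.

```python
def monthly_cost(price_per_million_tokens: float,
                 tokens_per_task: int,
                 tasks_per_month: int) -> float:
    """Total monthly spend = unit price x tokens per task x task volume."""
    return price_per_million_tokens * tokens_per_task * tasks_per_month / 1_000_000


# Illustrative "before": a chatbot-style workload at a higher unit price.
chatbot_today = monthly_cost(10.0, 2_000, 100_000)    # 2000.0

# Illustrative "after": unit price down 90%, but agentic tasks use
# 20x the tokens and volume has tripled.
agent_later = monthly_cost(1.0, 40_000, 300_000)      # 12000.0

print(f"before: {chatbot_today:,.0f}")
print(f"after:  {agent_later:,.0f}")
print(f"change: {agent_later / chatbot_today:.1f}x")
```

Under these assumed figures, a 90% drop in unit price still ends with total spend six times higher, because consumption grew faster than the price fell.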

That means cost control is not just a finance issue. It is part of responsibility design. A company needs to know which workflows justify expensive reasoning, which can use smaller or domain-specific models, and when cost pressure might degrade judgment quality. A process that changes model quality silently because costs are rising is not a reliable business process.
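The difference between a silent downgrade and a governed one can be sketched as a routing function that always records why a model was chosen, and that escalates instead of downgrading when the workflow does not permit it. This is a hedged sketch: the model labels, the `RoutingDecision` fields, and the `downgrade_allowed` policy flag are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass

# Hypothetical labels, not real model names.
LARGE_MODEL = "large-reasoning-model"
SMALL_MODEL = "small-domain-model"
HUMAN_REVIEW = "human_review"


@dataclass
class RoutingDecision:
    workflow: str
    target: str        # model label, or HUMAN_REVIEW
    downgraded: bool   # True only when a cheaper model replaced the preferred one
    reason: str        # recorded so the change is never silent


def choose_model(workflow: str, needs_deep_reasoning: bool,
                 downgrade_allowed: bool, budget_left: float,
                 est_cost_large: float) -> RoutingDecision:
    if not needs_deep_reasoning:
        return RoutingDecision(workflow, SMALL_MODEL, False,
                               "routine task; small model by policy")
    if est_cost_large <= budget_left:
        return RoutingDecision(workflow, LARGE_MODEL, False, "within budget")
    if downgrade_allowed:
        return RoutingDecision(workflow, SMALL_MODEL, True,
                               "budget exceeded; downgrade recorded")
    # Quality may not be traded for cost here, so surface the problem instead.
    return RoutingDecision(workflow, HUMAN_REVIEW, False,
                           "budget exceeded; downgrade not permitted, escalated")


print(choose_model("ticket_tagging", False, True, 5.0, 0.40))
print(choose_model("contract_review", True, False, 0.10, 0.40))
print(choose_model("report_summary", True, True, 0.10, 0.40))
```

The design choice worth noting is that cost pressure produces a visible record or an escalation in every branch, so a change in model quality always leaves evidence.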

Open question

The question I would watch is whether enterprise AI companies can move from capability claims to accountability infrastructure.

If AI agents remain mostly impressive assistants, adoption will continue but may stay around the edges of core operations. If companies can make AI work traceable, governable, auditable, and economically sustainable, then software can become a real coworker in the enterprise sense: not because it carries responsibility like a person, but because the organization has designed a way to take responsibility for the work the software performs.
