Saturday, March 21, 2026

"Not Seeing AI Productivity" Storyline is Inevitable

We will inevitably keep seeing headlines and stories about how many businesses are not seeing financial returns from their investments in artificial intelligence.


Important new technologies rarely show up in the bottom line immediately, and the issues are structural.


First of all, business processes have to be recreated to harness the innovations. 


When electricity first entered factories, managers simply replaced their massive central steam engine with one massive electric motor. Productivity didn't move. Only after firms discovered they could put a small motor on every individual machine (the "unit drive") were they able to redesign the factory floor. 


In 2026 companies are using AI to "chat with docs" or "summarize emails" (overlaying tech on old habits) rather than redesigning their entire supply chains. That will take some time. 


Also, firms must retrain their workers. That imposes real costs (time and money) and may lower productivity in the short run as time and effort are diverted to training. The result is a "J-curve" of productivity: lower productivity in the near term, with the benefits arriving later.


Then there are measurement issues, such as how to quantify improvements in quality, variety and speed. If AI helps a legal team finish a contract in two hours instead of 10, but the firm still charges the same flat fee, the productivity gain is invisible in GDP statistics, even though the human effort required has plummeted.
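The arithmetic behind the flat-fee example can be sketched in a few lines. The fee of 5,000 is a made-up figure for illustration; only the 10-hour and two-hour durations come from the example above.

```python
# Hypothetical flat fee; the source gives hours but no price.
FLAT_FEE = 5_000.0
HOURS_BEFORE = 10.0  # contract work without AI
HOURS_AFTER = 2.0    # contract work with AI


def revenue_per_hour(fee: float, hours: float) -> float:
    """Revenue earned per hour of human work on one contract."""
    return fee / hours


before = revenue_per_hour(FLAT_FEE, HOURS_BEFORE)
after = revenue_per_hour(FLAT_FEE, HOURS_AFTER)

# Billed output is identical in both cases, so revenue-based
# statistics register no change at all.
measured_output_change = FLAT_FEE - FLAT_FEE

# The share of human hours saved is where the gain actually sits.
hours_saved_share = (HOURS_BEFORE - HOURS_AFTER) / HOURS_BEFORE

print(before, after, measured_output_change, hours_saved_share)
```

Revenue per hour rises fivefold and 80 percent of the hours disappear, yet the measured output change is zero.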


| Study | Technology Period | Key Finding | Duration of Lag |
| --- | --- | --- | --- |
| Paul David (1990) | Electricity (1890–1920) | Factories had to be physically demolished and rebuilt to utilize "unit drive" motors before TFP spiked. | ~30–40 years |
| Robert Solow (1987) | Computing (1970–1990) | The "Solow Paradox": You could see the computer age everywhere except in the productivity statistics. | ~20 years |
| Brynjolfsson et al. (2021) | AI & Software (2010–2021) | Formulated the "Productivity J-Curve"; firms must invest in unmeasured "intangible assets" that initially depress earnings. | Ongoing |
| NBER / Juhász et al. (2020) | 19th Century France | Productivity in mechanized spinning was initially lower and more dispersed than hand-spinning due to the need for factory reorganization. | ~15–20 years |
| Man Group / Bara (2026) | Generative AI (2023–2026) | 80% of firms report no macro productivity impact yet, despite task-level gains of 15–55%, due to "workflow friction." | Projected 5–10 years |


In the meantime, leaders will have to try to come up with some quantifiable metrics (directly related or not) to justify the investments. It won’t take much imagination to realize that headcount reductions are one such way to “show” outcomes, even if AI and headcount are only indirectly or loosely related, or not related at all, in the short term.


In 1900 the "electricity bubble" looked real to everyone still using steam. By 1920, the steam users were bankrupt. 


So “productivity proxies” must be developed.


The most immediate impact of AI is the compression of time. 


Firms can measure the "distance" between an idea and its execution. Time-to-prototype can show how many days it takes to move from a natural language prompt or requirement to a functional, testable version.


Draft-to-final ratio might be used by marketing and legal firms to measure the time spent on the "first 80 percent" of a task versus the "final 20 percent" of human polishing.
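One possible way to compute a draft-to-final ratio is sketched below. The function name, the choice to divide drafting hours by polishing hours, and the sample hours are all assumptions; the text does not fix a formula.

```python
def draft_to_final_ratio(draft_hours: float, polish_hours: float) -> float:
    """Hours spent reaching a first draft divided by hours of human polishing."""
    if polish_hours <= 0:
        raise ValueError("polish_hours must be positive")
    return draft_hours / polish_hours


# Hypothetical figures for one marketing brief.
before_ai = draft_to_final_ratio(draft_hours=8.0, polish_hours=2.0)
with_ai = draft_to_final_ratio(draft_hours=1.0, polish_hours=2.0)

print(before_ai, with_ai)
```

A falling ratio would suggest that AI is absorbing the drafting work while human time concentrates on the final polish.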


For engineering teams, the metric isn't just "lines of code," but the number of successful pushes to production per developer per week. 
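A minimal sketch of that metric follows, assuming a simple deploy log of success flags and an explicitly supplied team size. The function name and sample data are hypothetical.

```python
def successful_pushes_per_dev_week(
    deploy_succeeded: list[bool], team_size: int, weeks: int
) -> float:
    """Average number of successful production pushes per developer per week."""
    if team_size <= 0 or weeks <= 0:
        raise ValueError("team_size and weeks must be positive")
    successes = sum(1 for ok in deploy_succeeded if ok)
    return successes / (team_size * weeks)


# Hypothetical log: 14 deploys over two weeks, 12 of which succeeded.
log = [True] * 12 + [False] * 2
rate = successful_pushes_per_dev_week(log, team_size=3, weeks=2)

print(rate)
```

Passing the team size explicitly, rather than inferring it from the log, keeps developers who shipped nothing in the denominator.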


Larger firms might try to assess the reduction in total "human hours" spent in meetings.


Query-to-find latency is a measurement of how long it takes an employee to retrieve a specific piece of internal tribal knowledge. AI should reduce that latency. 


Admin-to-maker ratio tracks whether the percentage of an employee's day spent on "coordination" is shrinking in favor of "creation." 
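As a rough sketch, the coordination share could be computed from a time log like this. The category labels and the minutes logged are assumptions, since the text does not say how time is classified.

```python
# Assumed labels for "coordination" work; everything else counts as "creation".
COORDINATION = {"meetings", "email", "status_reports"}


def coordination_share(minutes_by_category: dict[str, float]) -> float:
    """Fraction of logged time spent on coordination rather than creation."""
    total = sum(minutes_by_category.values())
    if total == 0:
        return 0.0
    coordination = sum(
        minutes
        for category, minutes in minutes_by_category.items()
        if category in COORDINATION
    )
    return coordination / total


# Hypothetical eight-hour day.
day = {"meetings": 120, "email": 60, "coding": 240, "design": 60}
share = coordination_share(day)

print(share)
```

Tracking this share week over week, rather than as a single snapshot, shows whether coordination is actually shrinking.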


“Agents” will also need new metrics that quantify outcomes as though the agent were a digital employee rather than a software tool.


Autonomous completion rate is the percentage of workflows that an AI agent initiates and completes without a human "click" or intervention.


Human-in-the-loop friction measures how often an agent has to "hand back" a task to a human because it hit a reasoning wall. A falling HITL rate is a leading indicator of future productivity.
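The two agent metrics above can be sketched from a log of workflow runs. The `WorkflowRun` record and its fields are hypothetical; real agent platforms will log this differently.

```python
from dataclasses import dataclass


@dataclass
class WorkflowRun:
    completed: bool
    human_interventions: int  # clicks, approvals, or hand-backs during the run


def autonomous_completion_rate(runs: list[WorkflowRun]) -> float:
    """Share of runs the agent finished with zero human interventions."""
    if not runs:
        return 0.0
    autonomous = sum(
        1 for r in runs if r.completed and r.human_interventions == 0
    )
    return autonomous / len(runs)


def hand_back_rate(runs: list[WorkflowRun]) -> float:
    """Share of runs in which a human had to step in at least once."""
    if not runs:
        return 0.0
    return sum(1 for r in runs if r.human_interventions > 0) / len(runs)


# Hypothetical sample: three clean runs and one that was handed back.
runs = [
    WorkflowRun(True, 0),
    WorkflowRun(True, 0),
    WorkflowRun(True, 0),
    WorkflowRun(False, 1),
]
acr = autonomous_completion_rate(runs)
hitl = hand_back_rate(runs)

print(acr, hitl)
```

The two rates need not sum to one, because a run can fail without any human ever touching it.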


Token efficiency per outcome calculates the cost of AI "thinking" (API/Compute costs) relative to a successful business outcome. 
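A simple version of this calculation is sketched below. The per-1,000-token price and the token counts are invented for illustration, and real pricing usually differs between input and output tokens.

```python
def cost_per_successful_outcome(
    total_tokens: int, price_per_1k_tokens: float, successful_outcomes: int
) -> float:
    """Total model spend divided by the number of successful business outcomes."""
    if successful_outcomes <= 0:
        raise ValueError("need at least one successful outcome")
    total_cost = (total_tokens / 1000) * price_per_1k_tokens
    return total_cost / successful_outcomes


# Hypothetical month: 2 million tokens at 0.01 per 1k, 40 resolved tickets.
unit_cost = cost_per_successful_outcome(
    total_tokens=2_000_000, price_per_1k_tokens=0.01, successful_outcomes=40
)

print(round(unit_cost, 2))
```

Dividing by successful outcomes rather than by requests means failed or abandoned runs still count toward the cost.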



| Business Function | Traditional Metric (Lagging) | AI Proxy Metric (Leading) | Why it Matters |
| --- | --- | --- | --- |
| Software Engineering | Lines of Code, Story Points | PR Cycle Time | Measures how fast code is reviewed and merged, not just typed. |
| Legal, Compliance | Billable Hours | Review Velocity per Page | Shows the acceleration of document ingestion and risk flagging. |
| Customer Support | First Response Time | Resolution via Zero-Touch | Measures the percentage of issues solved entirely by agents. |
| R&D | Patents Filed, Products Launched | Iteration Cycles per Quarter | Shows how many "failed fast" experiments the firm can run. |
| Human Resources | Headcount Growth | Talent Density (Revenue/FTE) | Measures if the firm is scaling output without scaling people. |


The “productivity lag” is entirely predictable, and so are the storylines about it. It remains a significant practical problem for the firms making the investments.

