An analysis of 4,500 work-related artificial intelligence use cases suggests we are only in the very early stages of applying AI at work, and that most use cases have not yet reached a stage where we can measure return on investment or productivity impact.
That is worth keeping in mind.
Most use cases so far only affect speed or time savings. Few are directly integrated into customer-facing, revenue-generating activities.
The vast majority of use cases are very basic, says a Section AI report. Some 14 percent of workers say their most valuable AI use case is Google search replacement. As helpful as that might be, it is hard to measure productivity gains at this point.
About 17 percent of workers use AI for drafting, editing, and summarizing documents. Again, productivity improvements are difficult to measure in those cases, though perhaps more quantifiable in terms of time savings.
So far, Section AI researchers have found that only two percent of users have built automations, such as for copy generation, which would save more time.
About three percent say their most valuable use case is data analysis or code generation, and there the ROI seems easiest to document in terms of time saved or effort avoided, rather than in revenue-generating metrics.
In fact, nearly a quarter of respondents say AI does not save them any time at all, which might seem odd unless those users are still spending time learning how to use AI, which would, of course, consume time rather than save it.
In other cases, they might be spending time checking the answers and output, which again adds time.
The point is that we are in the early stages of deployment, where it remains difficult to assess productivity gains.
As unhelpful as it might be, transformative technologies often fail to show up in productivity statistics for years, or even decades, after their introduction, as the Solow Paradox describes.
Measuring language model impact by "minutes saved per task" captures only the shallowest layer of value, many would argue. The reason is that what we can easily measure is often not what matters most.
Productivity metrics are generally designed to measure output per hour (quantity). They are notoriously bad at measuring quality.
If a model helps a software engineer write safer, more robust code, or helps a marketer generate a campaign that resonates better with customers, standard productivity metrics might show zero gain (or even a loss).
Also, in the early stages of adoption, productivity often dips, since firms and workers must invest time and capital in training, restructuring workflows, and figuring out how to use the new tools.
This "intangible capital" investment does not produce immediate revenue.
Also, as is typical early on, adopters are mostly using language models to do existing tasks faster, such as writing emails. True productivity explosions only occur when businesses re-architect their entire workflows to do things that were previously impossible, rather than just speeding up legacy processes.
That sort of measurable productivity gain cannot be demonstrated so soon.