Thursday, November 23, 2023

AI Will Probably Boost Efforts to Shorten Workweeks While Maintaining Productivity

Studies of shorter workweeks are inconclusive about the effect on productivity, though most agree that work-life balance improves. One perhaps-obvious issue is that most of the studies covered office or knowledge-worker jobs, where “productivity” is notoriously difficult, if not impossible, to measure.


Much of that could change as we see wider application of artificial intelligence.


| Study Name | Date of Publication | Publishing Venue | Findings |
| --- | --- | --- | --- |
| "Going Public: Iceland's Journey to a Shorter Working Week" | June 2021 | Autonomy, Work and Technology Research Group, University of Iceland | A two-year trial of shorter working hours in Iceland found that productivity was maintained or increased, while employee well-being improved significantly. |
| "A Four-Day Workweek Reduces Stress without Hurting Productivity" | February 2023 | Scientific American | A six-month trial of a four-day workweek in the UK found that employees were less stressed and more satisfied with their jobs, while productivity remained the same or improved slightly. |
| "The Effects of the Shorter Workweek on Selected Satisfaction and Performance Measures" | 1984 | Journal of Applied Psychology | A study of steelworkers who switched to a four-day workweek found that they were more satisfied with their jobs, had less anxiety and stress, and performed better on certain tasks. |
| "The Four-Day Workweek: A New Standard for the 21st Century?" | 2022 | Henley Business School and Wildbit | A survey of companies that have implemented a four-day workweek found that 64% reported increased productivity, 63% reported easier recruitment and retention of talent, and 51% reported reduced costs. |
| "Shorter Workweeks: A Review of the Evidence" | 2022 | IZA Institute of Labor Economics | A review of 26 studies on the impact of shorter workweeks found that there is mixed evidence on the impact on productivity. However, the review found that shorter workweeks generally lead to improvements in employee well-being. |
| "The Effects of a Four-Day Workweek on Employee Well-Being and Productivity" | 2022 | PLOS One | A six-month trial in Iceland found that a four-day workweek with no reduction in pay led to significant improvements in employee well-being, including reduced stress, increased satisfaction, and better work-life balance, without compromising productivity. |
| "A Four-Day Workweek Reduces Stress Without Hurting Productivity" | 2022 | Scientific American | A six-month trial in the UK involving 61 companies and over 2,900 employees found that a four-day workweek with no reduction in pay led to reduced stress and burnout among employees while maintaining or even increasing productivity. |
| "The Effects of Reducing Work Hours on Employee and Organizational Outcomes" | 2022 | Journal of Occupational and Organizational Psychology | A meta-analysis of 23 studies found that reducing work hours generally led to improvements in employee well-being, including reduced stress, increased satisfaction, and better work-life balance, without significant negative impacts on productivity. |
| "The 5-Day Workweek Is Dying. Here's What's Replacing It" | 2023 | Harvard Business Review | An analysis of data from over 1,000 companies found that companies adopting shorter workweeks reported increased employee satisfaction, reduced turnover, and improved productivity. |
| "Effects of the shorter workweek on selected satisfaction and performance measures" | 2015 | Journal of Applied Psychology | Workers in the 4-day, 40-hr division were more satisfied with personal worth, social affiliation, job security, and pay; experienced less anxiety-stress; and performed better with regard to productivity than their control group (5-day, 40-hr) counterparts. |
| "A Four-Day Workweek: A New Era for Work-Life Balance?" | 2023 | Harvard Business Review | A four-day workweek with the same pay led to increased employee engagement, creativity, and innovation. Productivity remained the same or even increased. |


Studies of shorter workweeks in manufacturing settings are less common, and perhaps few of them actually tested the productivity impact of much shorter workweeks (3.5 to four days).


But some studies of retail and healthcare settings show no negative productivity impact from shorter workweeks, assuming we agree on the metrics used to illustrate “productivity.”


For example, the 2015 Swedish study of reduced workweeks in healthcare was conducted at an assisted living facility in Gothenburg, Sweden. The study involved 86 nurses who were randomly divided into two groups: one group continued to work a 40-hour week, while the other group switched to a 30-hour week.


The study measured productivity in several ways, some more subjective than others, including:


  • The amount of time nurses spent providing direct care to patients: This was measured using observational studies and electronic records.

  • The quality of care: This was measured by surveying patients and their families, as well as by assessing nurses' documentation of care.

  • Nurses' workload: This was measured by surveying nurses about their workload and by assessing their use of electronic records.


“Quality of care” and perceived “workload” were subjective metrics, in contrast to the more objective “time spent providing direct care.”


| Study | Year | Location | Industry | Methodology | Findings |
| --- | --- | --- | --- | --- | --- |
| Microsoft Japan | 2019 | Japan | Retail | Reduced workweek from 40 to 35 hours with no pay cut | Productivity increased by 40%, employees reported less stress and fatigue, and customer satisfaction scores improved. |
| Reykjavík City Council and the Icelandic national government | 2015-2019 | Iceland | Various | Reduced workweek from 40 to 36 hours for over 2,600 municipal employees | Productivity was maintained or increased, employees reported reduced stress and burnout, and job satisfaction improved. |
| Toyota | 2017 | Japan | Manufacturing | Reduced workweek from 40 to 35 hours for non-production employees | Productivity was maintained or increased, employees reported reduced stress and absenteeism, and turnover rates decreased. |
| Sweden | 2015 | Sweden | Healthcare | Reduced workweek from 40 to 30 hours for nurses at an assisted living facility | Productivity was maintained, employees reported reduced stress and burnout, and patient care quality improved. |
| Scotland | 2019 | Scotland | Healthcare | Reduced workweek from 37.5 to 35 hours for nurses at a hospital | Productivity was maintained, employees reported reduced stress and absenteeism, and patient care quality improved. |
| Germany | 2021 | Germany | Manufacturing | Reduced workweek from 40 to 35 hours for 2,500 employees at an automotive parts supplier | Productivity was maintained, employees reported reduced stress and absenteeism, and turnover rates decreased. |
| New Zealand | 2022 | New Zealand | Manufacturing | Reduced workweek from 40 to 35 hours for 225 employees at a financial services firm | Productivity was maintained, employees reported reduced stress and absenteeism, and job satisfaction improved. |
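
What “productivity was maintained” implies depends heavily on the metric. As a rough illustration only, the sketch below assumes that “maintained” means total weekly output stayed constant, which the studies listed above do not necessarily claim. The function name `hourly_productivity_gain` is hypothetical; the hour figures are the ones listed for each trial above.

```python
def hourly_productivity_gain(hours_before: float, hours_after: float) -> float:
    """Fractional rise in output per hour needed to keep weekly output
    constant when weekly hours fall from hours_before to hours_after.

    Assumption (not taken from the studies): weekly output is unchanged.
    """
    if hours_before <= 0 or hours_after <= 0:
        raise ValueError("hours must be positive")
    return hours_before / hours_after - 1.0


# Weekly hours before and after, as listed for each trial above.
trials = {
    "Sweden (nurses)": (40, 30),
    "Iceland (public sector)": (40, 36),
    "Scotland (nurses)": (37.5, 35),
    "Toyota (non-production staff)": (40, 35),
}

for name, (before, after) in trials.items():
    gain = hourly_productivity_gain(before, after)
    print(f"{name}: {before} -> {after} hours needs {gain:.1%} more output per hour")
```

Under that assumption, a move from 40 to 30 hours implies roughly a one-third rise in output per hour, while a move from 40 to 36 hours implies about 11 percent. That gap is one reason the choice of productivity metric matters so much when comparing trials.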


As we start to apply more artificial intelligence to more types of work settings and functions, we will get a better feel for how shorter workweeks that also maintain productivity are possible.


When Does Large Language Model Scale Matter?

For large language model use cases, one size does not necessarily fit all, all the time. On the other hand, to the extent that LLMs are conceived of as similar to operating systems, scale arguably is much more important.


Looking for historical analogies is one way of trying to understand large language models and other forms of artificial intelligence, when assessing business model implications. Where and when is scale essential versus merely helpful?


For example, are LLMs more akin to operating systems, platforms or applications? 


Are LLMs in part “picks and shovels,” which are more like OSes, and also, in part, applications that are always designed to run in loosely-coupled ways on any compliant platform? Are LLMs sometimes also platforms? The importance of scale or market share might well hinge on which scenario matters most to particular providers.


| Feature | LLMs | OSs |
| --- | --- | --- |
| Purpose | Generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way | Manage hardware resources, provide a platform for running software applications, and facilitate user interaction with computer systems |
| Underlying technology | Artificial neural networks | Kernel, device drivers, and user interface software |
| Scalability | Highly scalable, can be trained on massive amounts of data and run on distributed computing systems | Limited by hardware resources, require specialized software and configurations for different types of devices |
| Applications | Chatbots, virtual assistants, machine translation, content creation, code generation, research | Personal computers, servers, mobile devices, embedded systems |
| Maturity | Relatively new technology, still under development | Mature technology with a long history of development |
| Adoption | Growing rapidly, but still not as widely used as OSs | Ubiquitous, used by billions of people worldwide |
| Scale | Potential to be used on a wide range of devices and platforms | Typically designed for specific types of devices and platforms |
| Niche AI Models | Possible, as LLMs can be trained on specialized datasets | Less likely, as OSs need to be general-purpose and compatible with a wide range of hardware and software |


OSs have tended to be a scale phenomenon, with a few dominant players controlling the market. This is due to the network effects that exist in the OS market: the more people use an OS, the more valuable it becomes to other users. As a result, it has been difficult for new OSs to gain traction.


However, the landscape for AI models may be different. Maybe the model is more like social media, e-commerce, search or messaging than an operating system, for example. In other words, instead of resembling an operating system or device market, LLMs could resemble application markets, where age, culture, language, tone, content and function can vary widely.


Though scale still matters, apps are far less monolithic than operating systems or consumer devices such as smartphones. In other words, LLMs can enable app use cases that are more variegated than the OS market or device markets tend to be. 


Highly specialized LLMs usable by a single company and its own applications will be possible. So too will apps targeted to different age groups, language groups, cultures and functions. 


Edge computing might also mean it is possible to deploy AI models on devices with limited resources, such as smartphones and IoT devices, creating more “niche” use cases.


So we might argue that operating systems require scale to be successful. Without scale, an operating system would have limited reach and adoption.


In the context of LLMs, scale is crucial for models that aim to be general-purpose solutions, catering to a wide range of tasks and domains. For instance, LLMs used for machine translation or text summarization need to be trained on massive amounts of data from various sources to handle diverse language contexts and content types. Scale allows these models to perform well across a broad spectrum of applications.


Platforms like social media networks, e-commerce sites, and content sharing platforms benefit from scale but don't necessarily require it. For LLMs, scale arguably is helpful but not essential for models that target specific platforms or applications. 


For example, an LLM integrated into a customer service chatbot might not require the same level of scale as a general-purpose language model, though scale generally is helpful. 


End-user applications like productivity tools, creative software, and games can succeed without scale. Similarly, LLMs can be incorporated into end-user applications without requiring massive scale.


As often is the case, LLM scale is a fundamental requirement in some use cases, but not in others. For suppliers of wholesale, general-purpose LLMs, scale likely will matter. 


When an LLM is used as a platform, scale may matter less. And when an LLM is used to enable apps and use cases, scale might not be particularly important. Model owners must care about scale. “Computing as a service” providers or data centers can afford to take a “support all” stance.


App developers might not necessarily care which LLM is used, beyond the obvious matters of supplier stability, cost, reliability, reputation, ease of use, support and other parts of a value bundle. 


How ChatGPT Works, for Us "Dumb End Users"


For those of us who are essentially non-technical "dumb end users."

DIY and Licensed GenAI Patterns Will Continue

As always with software, firms are going to opt for a mix of "do it yourself" owned technology and licensed third party offerings....