Sunday, December 22, 2024

Agentic AI Could Change User Interface (Again)

The annual letter penned by Microsoft CEO Satya Nadella points to the hoped-for value of artificial intelligence agents that “can take action on our behalf.”


That should have all sorts of implications. Today, users typically issue commands or input data, and software executes tasks. With agentic AI, software would do things on a user’s behalf without requiring that explicit work from the user.


When asked to arrange a meeting on a given subject, for example, an agent might query attendee calendars, send out invites and prepare an agenda, sparing the human the many steps otherwise required.
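
A minimal sketch of how that kind of delegation might look, assuming hypothetical placeholder helpers (query_free_slots, send_invite, draft_agenda) rather than any real calendar or email API:

```python
# Hypothetical sketch only: an agent decomposing "arrange a meeting" into steps.
# query_free_slots, send_invite and draft_agenda are made-up placeholders,
# not real APIs; a real agent would call calendar and email services instead.

from datetime import datetime, timedelta

def query_free_slots(attendees):
    # Placeholder: pretend every attendee is free tomorrow at 10:00.
    return [datetime.now().replace(hour=10, minute=0, second=0, microsecond=0)
            + timedelta(days=1)]

def send_invite(attendee, slot, subject):
    # Placeholder: a real agent would hit a calendar API here.
    print(f"Invite sent to {attendee} for {slot:%Y-%m-%d %H:%M} re: {subject}")

def draft_agenda(subject):
    # Placeholder: a real agent might use a language model here.
    return f"Agenda for '{subject}': 1. Goals  2. Discussion  3. Next steps"

def arrange_meeting(subject, attendees):
    slot = query_free_slots(attendees)[0]     # pick the first common slot
    for person in attendees:
        send_invite(person, slot, subject)    # one invite per attendee
    print(draft_agenda(subject))

arrange_meeting("Q1 roadmap", ["ana@example.com", "raj@example.com"])
```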


That might change the way some types of software are created, allowing non-technical people to build apps. A user might simply tell an agent to “build a basic web app for a recipe database,” with no coding knowledge required.


Lots of other manual tasks might also be automated. Think of photo tags. Instead of manual tag creation, an agent could automatically tag photos and create collections. 
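
A rough sketch of that idea, with a stand-in classify function in place of a real image-recognition model:

```python
# Hypothetical sketch only: auto-tagging photos and grouping them into collections.
# classify() is a stand-in for a real image-recognition model; here it just
# keys off filenames so the example runs on its own.

from collections import defaultdict

def classify(filename):
    # Placeholder for an image classifier; returns a list of tags.
    tags = [word for word in ("beach", "dog", "sunset") if word in filename]
    return tags or ["untagged"]

photos = ["beach_sunset.jpg", "dog_park.jpg", "beach_dog.jpg"]

collections = defaultdict(list)
for photo in photos:
    for tag in classify(photo):          # the agent tags each photo...
        collections[tag].append(photo)   # ...and files it into a collection

for tag, items in collections.items():
    print(tag, "->", items)
```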


Agents might draft routine reports or monitor and adjust system performance, without active human intervention. Where today software “waits” for a directive, agents would work in the background, anticipating what needs to be done, and often doing that. 
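
A minimal sketch of that background, anticipatory behavior, with read_cpu_load and scale_out as hypothetical stand-ins for real monitoring and remediation hooks:

```python
# Hypothetical sketch only: an agent that watches a metric in the background
# and acts without waiting for a command. read_cpu_load and scale_out are
# made-up stand-ins for real monitoring and remediation hooks.

import random
import time

THRESHOLD = 0.8

def read_cpu_load():
    # Placeholder: a real agent would read an actual system metric.
    return random.uniform(0.2, 1.0)

def scale_out():
    # Placeholder: a real agent might add capacity or draft a status report.
    print("High load detected; taking corrective action.")

for _ in range(5):                 # a real agent would loop indefinitely
    load = read_cpu_load()
    print(f"load={load:.2f}")
    if load > THRESHOLD:
        scale_out()                # act on the user's behalf, unprompted
    time.sleep(0.1)
```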


Agents could also deepen the personalization already based on user behavior, drawing on preferences that might not always be explicitly stated.


Nadella sees several key changes in how users interact with computers and software. First, a shift in user interface: “a new natural user interface that is multimodal,” he says.


Think back to the user interfaces of the past, and the progression. We started with command line interfaces requiring typing on a keyboard in a structured way. No audio, no video, no speech, no gestures, no mouse or pointing. 


Over time, we got graphical, “what you see is what you get” mouse-oriented interactions, which were a huge improvement over command line interfaces. Graphical interfaces meant people could use and control computers without the technical knowledge formerly required.


| Era | Time Period | Interface Type | Key Features | Impact on Usability |
|---|---|---|---|---|
| Batch Processing | 1940s–1950s | Punch Cards | Input via physical cards with holes representing data and commands. | Required specialized knowledge; interaction was slow and indirect. |
| Command-Line Interfaces (CLI) | 1960s–1980s | Text-Based Commands | Typing commands into a terminal to execute programs or tasks. | Greater flexibility for users but required memorization and technical expertise. |
| Graphical User Interfaces (GUI) | 1980s–1990s | Visual Desktop Interface | WYSIWYG (What You See Is What You Get) design; icons, windows, and mouse control. | Made computers accessible to non-technical users; revolutionized personal computing. |
| Web-Based Interfaces | 1990s–2000s | Internet Browsers | Interfacing through websites using hyperlinks and forms. | Simplified information access and expanded computer use to online interactions. |
| Touchscreen Interfaces | 2007–present | Multi-Touch Gestures | Direct manipulation of elements on-screen using fingers. | Intuitive for all age groups; foundational for smartphones and tablets. |
| Voice Interfaces | 2010s–present | Natural Language Commands | Voice assistants like Siri, Alexa, and Google Assistant. | Enabled hands-free operation but often struggles with context and nuance. |


Beyond that, AI should bring multimodal and multimedia input and output: speech, images, sound and video. Not just natural language interaction, but multimedia input and output as well.


Software also will become more anticipatory and more able to “do things” on a user’s behalf.


Nadella places that within the broader sweep of computing. “Can computers understand us instead of us having to understand computers?”


“Can computers help us reason, plan, and act more effectively” as we digitize more of the world’s information?


The way people interact with software also could change. Instead of “using apps” we will more often “ask questions and get answers.”

