Tuesday, December 5, 2023

Large Language Model Adoption Metrics Will Be Difficult

Large language model adoption and use might very well surpass all prior examples of internet app/site usage and engagement, in large part because virtually all existing sites and apps will embed LLM functionality into the fabric of their operations--and in some cases into the core of those operations. 


In other words, the possible fundamental difference between LLM adoption and that of all other major successful apps and sites is that LLMs can be deployed into the operations of virtually all existing sites almost immediately. 


People might be using LLMs and not be aware of it. For example, we might estimate that natural language interfaces powered by AI are already used by 20 percent to 40 percent of smartphone users. That is, in all likelihood, a proxy for daily active use. 


So one issue will be how fast LLMs are likewise incorporated into natural language interfaces for smartphone apps, for example. 


All of which points to the different metrics we might have to develop for LLM usage. It has been relatively simple to track daily or monthly active users for specific apps or sites. It will be harder to track engagement and usage for LLMs that are embedded into the fabric of other experiences. 


Some suppliers will likely use metrics such as application programming interface (API) calls or event logging as ways of illustrating usage or engagement. But most of the measurements are likely to be rather indirect. 


Examples might be increases in specific actions (number of searches, completed forms, voice interactions) or in time spent on relevant pages. 
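As a rough illustration of such indirect measurement, a supplier could count the distinct users appearing in its event logs for LLM-backed actions as a proxy for daily active use. A minimal sketch, assuming a hypothetical log format (the field names are illustrative, not any particular vendor's schema):

```python
from collections import defaultdict
from datetime import date

# Hypothetical event records emitted whenever an embedded LLM feature is invoked
# (an API call, a completed form assist, a voice query, and so on).
events = [
    {"user_id": "u1", "day": date(2023, 12, 4), "action": "search"},
    {"user_id": "u2", "day": date(2023, 12, 4), "action": "voice_query"},
    {"user_id": "u1", "day": date(2023, 12, 5), "action": "form_completion"},
    {"user_id": "u3", "day": date(2023, 12, 5), "action": "search"},
]

# Count distinct users per day: an indirect proxy for "daily active use"
# of the underlying LLM, even though users never opened an "LLM app."
daily_users = defaultdict(set)
for event in events:
    daily_users[event["day"]].add(event["user_id"])

for day, users in sorted(daily_users.items()):
    print(day, "proxy DAU:", len(users))
```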


All that noted, it might still take some time for any single LLM (there will be multiple contestants) to reach 10-percent adoption or usage levels. 


Taking nothing away from the breathtaking eruption of ChatGPT, not even ChatGPT really emerged from “nowhere.” Several years elapsed between the release of GPT-1 and the popularization of GPT-3 in the form of ChatGPT. 


And as important as large language models might be, we likely are quite some years away from the point where even 10 percent of internet users avail themselves of an LLM on a daily basis. In fact, based on history, it could take five to 10 years for any single LLM to reach the level of 10-percent daily active users. 


The caveat is that most of the successful early apps or websites provided one main value: search, e-commerce, social networking or entertainment. LLMs are likely to be embedded into multiple functions for any business or consumer, and that embedding might happen “in the background,” so users might have no idea they are “using” features of an LLM. 


So the adoption curve, and the time to reach 10-percent usage, might be shortened, as cumulative use of LLMs will occur across a potentially large array of use cases, apps and website interactions, including customer service, search, e-commerce, recommendations and natural language queries of any sort. 
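One way to picture that cumulative effect: the union of users touched by LLM-backed features across many surfaces can cross an adoption threshold long before any single app does. A toy sketch with purely illustrative user sets:

```python
# Hypothetical sets of users who touched an LLM-backed feature on each surface.
llm_users_by_surface = {
    "customer_service_bot": {"u1", "u2", "u3"},
    "site_search": {"u2", "u4"},
    "recommendations": {"u5"},
}

# Any single surface may reach only a small share of users...
for surface, users in llm_users_by_surface.items():
    print(surface, len(users))

# ...but cumulative LLM "reach" is the union across all surfaces.
all_llm_users = set().union(*llm_users_by_surface.values())
print("cumulative users touched by an LLM:", len(all_llm_users))
```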


So all past experience with successful apps and sites might not be predictive. To the extent that LLMs underpin interactions and usage with virtually all major apps and sites, “adoption” might not be the relevant metric. 


Instead, some measure of indirect usage will probably be more important. 


source: Tooltester


That is not an unusual time frame, even if many make much of ChatGPT's initial attainment of one million total users. The most popular apps and sites launched since 2000 have generally required three to five years to reach 10 percent usage by the internet population. 


Ignoring “sampling” or “novelty” behavior, all large language models have some way to go to reach 10-percent levels of regular use by internet users. There are 5.3 billion internet users globally. One million users barely registers. 


Any large language model would have to hit a level of about 53 million regular users to reach one-percent adoption. 
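The arithmetic behind those thresholds is straightforward; a quick sketch using the 5.3 billion figure cited above:

```python
INTERNET_USERS = 5_300_000_000  # roughly 5.3 billion internet users, as cited above

for share in (0.01, 0.10):
    needed = INTERNET_USERS * share
    print(f"{share:.0%} adoption requires about {needed / 1e6:,.0f} million regular users")

# Output:
# 1% adoption requires about 53 million regular users
# 10% adoption requires about 530 million regular users
```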


| App/Website | Category | Launch Date | 10% Users Reached | Time to Reach 10% (Years) |
| --- | --- | --- | --- | --- |
| Facebook | Social Media | Feb 2004 | Sept 2009 | 5.75 |
| WhatsApp | Messaging | Jan 2009 | Feb 2014 | 5 |
| Messenger | Messaging | Aug 2011 | Apr 2016 | 4.67 |
| Instagram | Social Media | Oct 2010 | Jun 2013 | 2.83 |
| TikTok | Social Media | Sept 2016 | Nov 2020 | 4.17 |
| Amazon | E-commerce | July 1995 | Jun 2008 | 12.92 |
| eBay | E-commerce | Sept 1995 | Dec 2000 | 5.33 |
| Alibaba | E-commerce | Apr 1999 | Nov 2013 | 14.67 |
| Google Search | Search | Jan 1998 | Dec 2002 | 4.83 |
| Bing | Search | Jun 2009 | Jul 2012 | 3.17 |
| Baidu | Search | Jan 2000 | Dec 2008 | 8.83 |
| YouTube | Video Streaming | Feb 2005 | Jul 2011 | 6.5 |
| Netflix | Video Streaming | Jan 1997 | Sep 2014 | 17.67 |
| Disney+ | Video Streaming | Nov 2019 | Nov 2022 | 3 |
| Spotify | Music Streaming | Oct 2008 | May 2015 | 6.67 |
| Apple Music | Music Streaming | Jun 2015 | May 2020 | 5 |


Reaching a level of 10-percent adoption could take a while. Consider that it took Amazon nearly 25 years to reach a level of 10-percent daily active users (DAUs). It took Amazon 13 years to reach a level of 10-percent monthly active users. 


Granted, Amazon might be considered an outlier. But many other popular and successful apps still required eight to nine years, and sometimes more than a decade, to reach a 10-percent DAU level of usage. 
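The “time to reach” columns in these tables appear to be simple elapsed-time calculations between launch month and milestone month; a minimal sketch of that arithmetic:

```python
def years_between(launch, milestone):
    """Elapsed time in years between two (year, month) pairs."""
    launch_year, launch_month = launch
    milestone_year, milestone_month = milestone
    months = (milestone_year - launch_year) * 12 + (milestone_month - launch_month)
    return round(months / 12, 2)

# Facebook: launched Feb 2004, reached 10% DAU in Apr 2012
print(years_between((2004, 2), (2012, 4)))   # 8.17
# Amazon: launched Jul 1995, reached 10% MAU in Jun 2008
print(years_between((1995, 7), (2008, 6)))   # 12.92
```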


| App/Website | Category | Launch Date | 10% DAU Reached | Time to Reach 10% DAU (Years) | 10% MAU Reached | Time to Reach 10% MAU (Years) |
| --- | --- | --- | --- | --- | --- | --- |
| Facebook | Social Media | Feb 2004 | Apr 2012 | 8.17 | Sept 2009 | 5.75 |
| WhatsApp | Messaging | Jan 2009 | Feb 2020 | 11 | Feb 2014 | 5 |
| Messenger | Messaging | Aug 2011 | Jul 2023 | 11.83 | Apr 2016 | 4.67 |
| Instagram | Social Media | Oct 2010 | May 2018 | 7.5 | Jun 2013 | 2.83 |
| TikTok | Social Media | Sept 2016 | Nov 2020 | 4.17 | Nov 2020 | 4.17 |
| Amazon | E-commerce | July 1995 | Jun 2020 | 24.83 | Jun 2008 | 12.92 |
| eBay | E-commerce | Sept 1995 | Dec 2016 | 21 | Dec 2000 | 5.33 |
| Alibaba | E-commerce | Apr 1999 | Dec 2021 | 22.67 | Nov 2013 | 14.67 |
| Google Search | Search | Jan 1998 | Dec 2007 | 9.83 | Dec 2002 | 4.83 |
| Bing | Search | Jun 2009 | Aug 2018 | 9.17 | Jul 2012 | 3.17 |
| Baidu | Search | Jan 2000 | Dec 2011 | 11.83 | Dec 2008 | 8.83 |


Likewise, there is a difference between “daily active users” and other measurements of usage, including:


  • New users

  • Retained users

  • Returning users / resurrected users

  • Churned users

  • Cohorts (useful for churn analysis)

  • Monthly active users

  • Daily active users


All these are ways of measuring engagement or usage. It is too early to cite accurate “daily active users” for any large language model such as ChatGPT, Bard or others. 
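For illustration, several of these measures could be derived from the same activity log of user identifiers and active days. A minimal sketch, with the segment definitions and the 30-day window chosen here purely as illustrative assumptions:

```python
from datetime import date, timedelta

# Hypothetical activity log: user_id -> set of days the user was active.
activity = {
    "u1": {date(2023, 11, 1), date(2023, 12, 4), date(2023, 12, 5)},
    "u2": {date(2023, 12, 5)},
    "u3": {date(2023, 10, 15)},
}

today = date(2023, 12, 5)
month_ago = today - timedelta(days=30)

# Daily and monthly active users.
dau = {u for u, days in activity.items() if today in days}
mau = {u for u, days in activity.items() if any(d >= month_ago for d in days)}

# Illustrative segment definitions (assumptions, not standard thresholds):
new_users = {u for u, days in activity.items() if min(days) >= month_ago}
returning = {u for u, days in activity.items()
             if min(days) < month_ago and any(d >= month_ago for d in days)}
churned = {u for u, days in activity.items() if max(days) < month_ago}

print("DAU:", len(dau), "MAU:", len(mau))
print("new:", new_users, "returning:", returning, "churned:", churned)
```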


| Metric | Description | Value (Example) |
| --- | --- | --- |
| User Acquisition | Measures how users find and install the app or visit the website. | 10,000 downloads per month, 500 organic website visits per day. |
| Active Users | Measures the number of users who interact with the app or website within a given timeframe. | 5,000 daily active users (DAUs), 10,000 monthly active users (MAUs). |
| Engagement | Measures how users interact with the app or website. | 30 minutes average session duration, 5 pages viewed per session, 20% app retention rate (returning users). |
| Traffic Sources | Measures where users come from to find the app or website. | 40% organic search, 30% social media, 20% direct traffic (bookmarks, typed URL). |
| Conversion Rate | Measures the percentage of users who complete a desired action (e.g., purchase, sign-up). | 2% conversion rate for free trial sign-ups, 5% conversion rate for product purchases. |
| User Flow | Measures the path users take through the app or website. | 70% of users land on the homepage, 20% go directly to a product page, 10% bounce from the landing page. |
| Device Usage | Measures the types of devices users access the app or website from. | 60% mobile, 30% desktop, 10% tablet. |
| Location Data | Measures where users are located when accessing the app or website. | 50% of users from the United States, 20% from Europe, 15% from Asia. |
| User Feedback | Measures user sentiment and satisfaction through surveys, reviews, and support tickets. | 4.5-star average rating on app store, 80% positive sentiment in survey responses. |


source: Tooltester

