Saturday, June 14, 2025

Why Meta Invested in Scale AI

Language model “hallucinations” might always be an issue to some degree, but Meta’s recent investment in Scale AI underscores the importance of techniques such as “human-in-the-loop” (HITL) data labeling.


“Edge cases” often are the issue, as human language is inherently ambiguous: a single word can have multiple meanings depending on context, tone, and cultural nuance. Machines struggle with that ambiguity without explicit human guidance, and that is where humans help.


Tasks involving sentiment analysis, summarization or dialogue generation are inherently subjective: there isn't always one "correct" answer, and human guidance is helpful there.


It often is noted that language models do not possess common sense or real-world knowledge in the way humans do, so HITL helps prevent models from generating nonsensical or logically flawed responses.


And while AI models are generally good at learning from patterns, they often struggle with "edge cases" involving unusual, rare, or complex scenarios that aren't well-represented in the training data.


Human annotators can identify, interpret, and correctly label these edge cases. 


Likewise, human-in-the-loop processes allow for the identification and mitigation of biases in the source data.


Also, HITL helps LLMs generate responses that are more aligned with human preferences and ethical guidelines: safe, useful and contextually appropriate for human users.
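As a rough illustration, a minimal human-in-the-loop labeling loop might route low-confidence model predictions to a human annotator. This is only a sketch: the model, the confidence threshold and the annotator function below are hypothetical stand-ins, not any particular vendor's pipeline.

```python
# Minimal human-in-the-loop (HITL) labeling sketch: items the model is not confident
# about are routed to a human annotator. All names and thresholds are illustrative.

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff below which a human reviews the item


def model_predict(text: str) -> tuple[str, float]:
    """Stand-in for a sentiment model; returns (label, confidence)."""
    positive_words = {"great", "love", "excellent"}
    hits = sum(word in positive_words for word in text.lower().split())
    if hits >= 2:
        return "positive", 0.9
    if hits == 1:
        return "positive", 0.6  # ambiguous: only weak evidence
    return "negative", 0.7


def human_review(text: str) -> str:
    """Stand-in for a human annotator resolving ambiguous or edge-case items."""
    return input(f'Label this text ("positive"/"negative"): {text!r} ').strip()


def label_dataset(texts: list[str]) -> list[tuple[str, str]]:
    labeled = []
    for text in texts:
        prediction, confidence = model_predict(text)
        if confidence < CONFIDENCE_THRESHOLD:
            prediction = human_review(text)  # edge cases and subjective items go to a person
        labeled.append((text, prediction))
    return labeled
```

The design point is simply that the human effort is concentrated on the ambiguous, subjective or rare items, which is where model-only labeling tends to fail.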


Friday, June 13, 2025

Zero-Click Already is Changing Search

The implications of zero-click search (where a search does not end in a click to one of the results on the search engine results page) for search providers, advertisers and content providers are understandably huge. 


Oddly enough, the data so far might suggest that zero-click has had a neutral impact on Google search revenue, whatever the possible future impact. Advertisers, for example, are working to shift their ad buys in ways that favor inclusion in the AI summaries.


Content providers arguably have been hardest hit, as the decrease in Web traffic affects their ability to monetize content using advertising placements.


In 2024, an estimated 65 percent of all global searches on Google were zero-click, according to Briskon. Mobile searches tend to feature zero-click rates above 75 percent. Ahrefs analyzed 300,000 keywords and found that the presence of an “AI Overview” in the search results correlated with a 34.5 percent lower average clickthrough rate for the top-ranking page, compared to similar informational keywords without an AI Overview.


Bain’s February 2025 research suggests that about 80 percent of consumers now rely on “zero-click” results in at least 40 percent of their searches, reducing organic web traffic by an estimated 15 percent to 25 percent. 
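As a back-of-the-envelope illustration (not Ahrefs' or Bain's methodology), combining the cited 34.5 percent clickthrough reduction with an assumed baseline clickthrough rate and an assumed share of searches that show an AI Overview yields a traffic loss in roughly the range Bain describes:

```python
# Back-of-the-envelope traffic estimate. The 34.5 percent CTR reduction is the Ahrefs
# figure cited above; the search volume, baseline CTR and AI Overview share are
# assumptions made purely for illustration.

monthly_searches = 1_000_000      # assumed search volume for a keyword set
baseline_ctr = 0.28               # assumed CTR for the top result with no AI Overview
ai_overview_ctr_drop = 0.345      # cited: 34.5% lower CTR when an AI Overview appears
ai_overview_share = 0.50          # assumed share of searches that trigger an AI Overview

clicks_without = monthly_searches * baseline_ctr
clicks_with = (
    monthly_searches * (1 - ai_overview_share) * baseline_ctr
    + monthly_searches * ai_overview_share * baseline_ctr * (1 - ai_overview_ctr_drop)
)

print(f"Clicks without AI Overviews: {clicks_without:,.0f}")
print(f"Clicks with AI Overviews:    {clicks_with:,.0f}")
print(f"Estimated organic traffic loss: {1 - clicks_with / clicks_without:.1%}")
# Under these assumptions the loss is about 17 percent, inside the 15-to-25 percent range cited.
```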


It is not hard to understand why the trend exists. Some 40 percent to 70 percent of generative AI users rely on the platforms to conduct research and summarize information (68 percent), follow the latest news and weather (48 percent), and ask for shopping recommendations (42 percent).

   

source: Bain


Thursday, June 12, 2025

Will AI Lead to Cognitive Costs "Near the Cost of Electricity"?

“The cost of intelligence should eventually converge to near the cost of electricity,” says OpenAI head Sam Altman. The economic implications could be quite significant. 


For example, the pricing of services that rely on intelligence (consulting, legal, creative) could approach the marginal cost of electricity, disrupting traditional business models. In other words, at least some cognitive tasks might become commoditized. 


Even if we might not be able to directly compare the “cost” of cognitive activity and equivalent operations conducted by an AI model, we might all agree that AI generally uses more energy and resources upfront than humans to achieve a similar single outcome, but then can scale to produce vastly more output with lower marginal cost. 


In other words, the AI advantage comes when we scale the activities. Looking at the matter in terms of water or electricity consumption, humans use relatively little energy and water in performing knowledge work, but throughput is limited and cost scales roughly linearly with quantity.


If one human produces one unit of work, then 10 units require 10 humans. AI outperforms at scale.
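A toy cost model makes that scaling point concrete. All the dollar figures below are invented assumptions, not estimates; the only point is the shape of the two cost curves.

```python
# Toy cost model: human cognitive work scales roughly linearly with output, while
# AI carries a large fixed (training and infrastructure) cost plus a small marginal
# (inference/electricity) cost per task. All dollar figures are invented assumptions.

HUMAN_COST_PER_TASK = 50.00       # assumed fully loaded human cost per unit of work
AI_FIXED_COST = 1_000_000.00      # assumed model and infrastructure cost
AI_MARGINAL_COST = 0.05           # assumed electricity/inference cost per task


def human_cost(tasks: int) -> float:
    return tasks * HUMAN_COST_PER_TASK


def ai_cost(tasks: int) -> float:
    return AI_FIXED_COST + tasks * AI_MARGINAL_COST


# Break-even is fixed_cost / (human_per_task - ai_marginal): about 20,000 tasks here.
for n in (1, 1_000, 100_000, 1_000_000):
    print(f"{n:>9,} tasks: human ${human_cost(n):>13,.0f} vs AI ${ai_cost(n):>13,.0f}")
```

Below the break-even point the single human is cheaper; beyond it, the near-zero marginal cost dominates, which is the sense in which the AI advantage appears only at scale.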


From an environmental perspective, a single human brain is “greener” than a single massive AI for one unit of task; however, to match an AI that can do one million tasks, you’d need an army of humans whose combined footprint (millions of computers, offices, and lives) might then rival or exceed the AI’s footprint.


There are other imponderables as well. At least some might speculate that we are entering an age where cognitive labor scales like software: infinite supply, zero distribution cost, and quality improving constantly. 


To be sure, the cost of employing cognitive workers is far more complicated than simple consumption of electricity and water. Still, AI’s impact on cognitive work does seem to create economies of scale.


By that logic, AI doesn’t just automate tasks; it commoditizes thinking. Of course, that looks only at cognitive input costs, not outputs. We probably are going to have to look at outcomes produced by using AI at scale, such as curing a particular disease or reducing production costs for some product. 


Still, it is shocking to ponder the economic implications of cognitive costs converging toward the cost of the electricity and water required to produce the models and inferences.


Waymo Features a Rider Cost "Premium" Compared to Uber or Lyft, By Design

It appears there is an early adopter pricing premium being paid for Waymo rides, compared to either Uber or Lyft, according to a study by Obi.


To be sure, that is by design: Waymo entered the market with a premium pricing position. So going driverless doesn’t mean a cheaper ride; instead it is deliberately priced at a premium to either Uber or Lyft. 

source: Obi

Wednesday, June 11, 2025

Why the AI Era of Computing is Different

If we can say computing has moved through distinct eras, each with distinct properties, it is not unreasonable to predict that artificial intelligence represents the next era. And though earlier generations are normally defined by hardware, that is less true of more-recent eras, where virtualization is prevalent and the focus is more on applications than hardware. 


But AI might shift matters further. 


Era | Key Feature | Key Technologies
Mainframe Era (1950s–1970s) | Centralized computing | IBM mainframes
Personal Computing Era (1980s–1990s) | Decentralization, personal access | PCs, MS-DOS, Windows
Internet Era (1990s–2000s) | Connectivity, information access | Web browsers, search engines
Mobile & Cloud Era (2000s–2020s) | Always-on, distributed services | Smartphones, AWS, Google Cloud


The AI era should feature software “learning” more than “programming.” Where traditional software follows explicit rules, AI models learn from data, discovering patterns without being explicitly programmed.
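A minimal sketch of that distinction, using an invented toy task: in traditional software the rule is hand-coded by a programmer, while in a learning-based approach the decision parameter is fit from labeled examples. The spam-filter task, feature and examples are illustrative only.

```python
# Illustrative contrast between an explicitly coded rule and a parameter "learned"
# from data. The task, feature and examples are invented for illustration.

# Traditional software: the programmer writes the rule directly.
def is_spam_rule(subject: str) -> bool:
    return "free money" in subject.lower()


# Learning-based approach: the decision threshold is fit from labeled examples
# rather than hand-coded.
examples = [
    ("WIN FREE MONEY NOW!!!", True),
    ("Quarterly report attached", False),
    ("FREE!!! CLICK HERE!!!", True),
    ("Lunch on Tuesday?", False),
]


def uppercase_ratio(text: str) -> float:
    letters = [c for c in text if c.isalpha()]
    return sum(c.isupper() for c in letters) / max(len(letters), 1)


# "Training": place the threshold midway between the average feature value of each class.
spam_avg = sum(uppercase_ratio(t) for t, y in examples if y) / sum(1 for _, y in examples if y)
ham_avg = sum(uppercase_ratio(t) for t, y in examples if not y) / sum(1 for _, y in examples if not y)
threshold = (spam_avg + ham_avg) / 2


def is_spam_learned(subject: str) -> bool:
    return uppercase_ratio(subject) > threshold
```

The behavior of the second function comes from the data it saw, not from a rule anyone wrote, which is the essential shift the AI era represents.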


AI systems can generalize from experience and sometimes operate autonomously, as in the case of self-driving cars, recommendation systems or robotic process automation.


Voice assistants, chatbots, and multimodal systems mark a transition to more human-centric interfaces, moving beyond keyboards and GUIs.


AI can be considered a distinct era of computing, not because it introduces new tools, but because it changes the nature of computing itself, from explicitly coded systems to systems that evolve and learn.


Tuesday, June 10, 2025

Why Language Models Tackling Very-Complex Problems Often Provide Incorrect Answers

It perhaps already is clear that large language models using “reasoning” (chain of thought, for example) can provide much better accuracy than non-reasoning models. 


Reasoning models consistently perform well on simple queries. Benchmarks such as MMLU (Massive Multitask Language Understanding) and ARC (Abstraction and Reasoning Corpus) show that models reach near-human or superhuman accuracy on tasks that do not require multi-step or abstract reasoning.


But reasoning models frequently exhibit "reasoning is not correctness" gaps as query complexity grows. Performance degrades sharply as the number of required steps increases, as it does for complex tasks. Token usage grows three to five times for complex queries, while accuracy drops 30 percent to 40 percent compared to simple tasks.


Reasoning models using techniques like Chain-of-Thought (CoT) can break down moderately complex queries into steps, improving accuracy and interpretability. However, this comes with increased latency and sometimes only modest gains in factuality or retrieval quality, especially when the domain requires specialized tool usage or external knowledge.
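A minimal sketch of the difference between a direct prompt and a chain-of-thought prompt follows. The question and the prompt wording are illustrative, and no particular model API is assumed.

```python
# Sketch of how a chain-of-thought prompt differs from a direct prompt. The question
# and prompt wording are illustrative; no particular model API is assumed.

question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"

direct_prompt = f"Question: {question}\nAnswer with a single number."

cot_prompt = (
    f"Question: {question}\n"
    "Think through the problem step by step, showing each intermediate calculation, "
    "then state the final answer on its own line."
)

# The chain-of-thought response must include the intermediate steps, which is why
# token usage (and latency) grows even when accuracy improves.
for name, prompt in (("direct", direct_prompt), ("chain-of-thought", cot_prompt)):
    print(f"--- {name} prompt ---\n{prompt}\n")
```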


The biggest issues come with the most-complex tasks. Research shows that even state-of-the-art models experience a "reasoning does not equal experience" fallacy: while they can articulate step-by-step logic, they may lack the knowledge or procedural experience needed for domain-specific reasoning, for example. 


That happens because reasoning models can produce logically coherent steps that nonetheless contain critical factual errors. In scientific problem-solving, 40 percent of model-generated solutions pass syntactic checks but fail empirical validation, for example. 
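A small, invented example of that gap: code that passes a syntactic check (it parses cleanly) but fails empirical validation (it returns the wrong answer when actually run against a test).

```python
# Invented example of the "passes syntactic checks but fails empirical validation" gap.

import ast

generated_solution = """
def celsius_to_fahrenheit(c):
    return c * 5 / 9 + 32   # plausible-looking step, but the ratio is inverted
"""

# Syntactic check: does the code parse?
ast.parse(generated_solution)
print("syntactic check: passed")

# Empirical validation: does it actually produce the right answer?
namespace = {}
exec(generated_solution, namespace)
result = namespace["celsius_to_fahrenheit"](100)   # should be 212
print("empirical check:", "passed" if result == 212 else f"failed (got {result:.1f})")
```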


Such issues are likely to be more concerning for some use cases than for others. For example, advanced mathematics problems such as proving novel theorems or solving high-dimensional optimization problems often require formal reasoning beyond pattern matching.


Problems involving multiple interacting agents (economic simulations, game theory with many players) can overwhelm LLMs due to the exponential growth of possible outcomes.


In a complex negotiation scenario, an LLM might fail to account for second-order effects of one agent’s actions on others. 


Also, problems spanning multiple domains (designing a sustainable energy grid involving engineering, economics, and policy) require integrating diverse knowledge that LLMs were not trained on. 


Of course, one might also counter that humans, working without LLMs, might very well also make mistakes when assessing complex problems, and also produce logically reasoned but still "incorrect" conclusions! 


But there are probably many complex queries that still will benefit, as most queries will not test the limits of advanced theorems, economic simulations, game theory, multiple domains and unexpected human behaviors. 


So for many use cases, even complexity might not be a practical issue for a reasoning LLM, even if such models demonstrably become less proficient as problem complexity rises. And, of course, researchers are working on ways to ameliorate the issues. 


AI Text Only Suffers from Emotional Flatness In Some Use Cases

As with all generalizations, the claim that writing produced by artificial intelligence or generative AI suffers from a lack of emotion requires some elaboration. Not all writing tasks require or even allow much “emotional expression.”


Academic or essay writing; advertising and marketing content; history or instructional content might do just fine with a straightforward style. 


On the other hand, most of us might wonder how well present and future models will be able to handle fiction, where nuance, emotional depth, and subtlety or a unique voice do matter. 


AI might also not be so great for memoirs, reflective essays, or opinion pieces.


The reasons for the difference are pretty simple. Genres such as academic writing, advertising, and instructional content follow established structures and therefore are easier for AI to mimic.


Fiction and personal narratives require a level of creativity, empathy, and emotional understanding that AI systems currently struggle to replicate. AI can mimic certain tones and styles, but it often lacks the unique voice and perspective that human writers bring to their work.


The point is that AI content, which already is prevalent, will seem more appropriate in some genres, compared to others. One size, as they say, does not fit all. And as useful as AI might be for many humans, in many situations, writers are not going to stop writing because they could use AI for that purpose. 


No, writers write because they enjoy the craft of writing, just as musicians play music or artists paint. AI will not deter any of these creators from doing what they enjoy. My brother wouldn't get any enjoyment out of having AI paint a picture. My sister wouldn't prefer that an AI create and play music. I wouldn't be interested in using AI to write on my behalf. I write because I enjoy the process.


Still, to the extent that AI is a tool to automate or speed writing tasks for many, when it is simply a practical task, the inability to fully mimic human nuance will not be an issue. We don't expect nuance in our emails, ads, marketing copy, technical training manuals or instructional material, really. We never expect it for legal, academic or technical writing, either.


"Lack of emotion" is an issue mostly for creative or fiction writing or biographies; film and TV scripts and often musical lyrics.


Access Network Limitations are Not the Performance Gate, Anymore

In the communications connectivity business, mobile or fixed, “more bandwidth” is an unchallenged good. And, to be sure, higher speeds have ...