Our choices of “favored” language models will probably remain somewhat idiosyncratic for a while, until some winnowing of market leaders occurs and a stable structure emerges.
Most casual users will probably just rely on ChatGPT, with no real way to evaluate the nuances of different engines. Others may have some familiarity with a few models, but would have difficulty articulating their impressions of the differences between them.
Models can also change over time. Very early on, I found Gemini frustrating. It was among the best for importing results into Google productivity apps, though the other models have gradually caught up in that regard. But ease of use was not the key performance indicator, at least for me.
Since I do a lot of forecasting-type work, Gemini’s earlier versions were often frustrating because they refused to produce such content. That no longer seems to be the case with the latest models, so I assume short-term guardrails were put in place for essentially regulatory or other business reasons (such as avoiding the embarrassment of nonsensical, libelous, or dangerous answers).
For casual, everyday uses, though, I increasingly rely on Google search, sometimes with its AI Mode enabled, but often without bothering, since AI-generated answers show up in the results anyhow.
So much for language models “killing search.”
As Gemini’s performance has improved, I find I use it more, and ChatGPT less. Grok does seem to provide punchier, more interesting commentary, but Perplexity and Claude seem better when I need to document sources.
As a non-coder, I can’t evaluate the coding use case at all.
But here’s another take on the strengths of various models.
Source: Special Situations Research, Seeking Alpha, Bret Jensen