Neoskeptics: Once upon a recent time, there was a race between a tortoise named ChatGPT, a tortoise named Claude, and a very fast hare named Gemini

Last update: Wednesday 7/16/25

Once upon a recent time, there was a race between a tortoise named ChatGPT, another tortoise named Claude, and a fast-moving hare named Gemini. At the start of the race, everybody knew who was supposed to win. But as quick as a wink, some unexpected things began to happen.

Note: The cartoon image for this note was composed via ChatGPT on GPT-4o.

Indeed, when Sam Altman and Elon Musk founded OpenAI in 2015, they did so because they wanted to prevent super wealthy Google from buying its way to victory in the race to Artificial General Intelligence (AGI). They believed that the only way to provide Google with credible competition was for the rest of the AI community to work together using open source tools.

Then some unexpected things began to happen. OpenAI did not develop open source tools; it developed proprietary tools funded by the billions of dollars it received from Micro$oft, a company that was richer than Google. Shortly thereafter, a group of OpenAI's founding staff broke away to create Anthropic, a firm that also did not develop open source tools; Anthropic developed proprietary tools with the billions it received from super wealthy Amazon (and Google 😳).

A. Rankings | B. Explanations | C. Bottom Line

A. Ranking the chatbots

However, the next turn delivered the blow that caused the hare in the image to become perplexed, immobilized, and two-headed, with each head tracking a different tortoise. ChatGPT and Claude pursued divergent paths to winning the race to AGI. So which tortoise should the hare be chasing? Gemini seems to be chasing both ... but with disappointing results, as the following table shows.

The first column on the left side of the table contains a list of functions that savvy computer users -- who use chatbots and agents, but not APIs -- frequently ask chatbots to perform.

The next three columns note the editor’s ranking of the chatbots for each function as 1st, 2nd, or 3rd. A brief rationale for each ranking appears in the row beneath the ranking. More extensive explanations of the rankings are presented in section B of these notes.

ChatGPT is assumed to be running on GPT-4o; Claude is running on Sonnet; and Gemini is running on Pro, except for images, where it is assumed to be running on Flash because Pro cannot produce images.
Warning: The rankings in the table are current snapshots of rapidly evolving skill sets; four or five months from now the rankings may be quite different.

… Chatbots Ranked by Skills …

Skill	First	Second	Third
1. Memory	ChatGPT	Gemini	n/a
Rationale	ChatGPT acquired this skill in late 2024, so it knows more about its users	Gemini recently acquired this skill, so it is just getting to know its users	Claude cannot remember previous sessions ... yet

2. Search	Gemini	ChatGPT	Claude
Rationale	Uses Google search	Uses Bing	Uses Brave

3. Follow a link	Gemini	n/a	n/a
Rationale	Given a link, Gemini can read an unrestricted page	ChatGPT cannot follow arbitrary links to unrestricted pages	Claude cannot follow arbitrary links to unrestricted pages

4. Labeled images	ChatGPT	Gemini (on Flash)	n/a
Rationale	ChatGPT remembers its user’s preferred image styles	Flash doesn’t know what Pro has learned about its user’s preferences	Claude only produces text output

5. Complex reasoning	Claude	ChatGPT	Gemini
Rationale	See editor’s explanations	See editor’s explanations	See editor’s explanations

6. Writing	Claude	ChatGPT	Gemini
Rationale	See editor’s explanations	See editor’s explanations	See editor’s explanations

7. Summarizing	Claude	ChatGPT	Gemini
Rationale	Top reasoning and writing	Mid reasoning and writing	Low reasoning and writing

8. Tutoring	ChatGPT	Claude	Gemini
Rationale	Top memory + mid reasoning and writing	No memory, but top reasoning and writing	Mid memory and low reasoning and writing

B. More extensive explanations of the rankings

1. Memory

This skill is first on the list because it enhances the effectiveness of all the other skills that a chatbot might possess.

Upon receiving the same prompt from different users, chatbots with memory can customize their responses to each user’s prior knowledge and preferences, as revealed by their previous interactions with that user, even if those interactions occurred weeks ago.
A chatbot’s memory of previous interactions with a user also facilitates spontaneous conversations wherein users ask questions as they come to mind, rather than taking time out to “engineer” carefully prepared prompts. And if a spontaneous prompt elicits an off-topic response, the user can quickly issue a correction, saying something like, “Here’s a better way for me to ask that question”, the same way that the user would rephrase a question when conversing with another human.

2. Search

It is an indisputable fact of Internet life that Google’s search engine is still the best, Bing is second, and Brave is a distant third. Gemini uses Google, ChatGPT uses Bing, and Claude uses Brave.

3. Follow a link

Google’s technology enables Gemini to follow any link to any unrestricted page. The other chatbots can follow some links, but not all.

4. Labeled images

Gemini’s underlying Pro model, like Claude’s underlying Sonnet model, can only produce text; neither model can generate images.

But Gemini can also access Flash, a model that can produce images. Unfortunately, Flash has neither memory of its own nor access to Pro’s memory of previous interactions with a user, so Flash requires a carefully engineered prompt that explicitly states all of a user’s visual preferences.

5. Complex reasoning

Their underlying models enable all three chatbots to engage in complex reasoning, a process which they can display in a “chain of thought” before they provide their responses to a user’s prompt. However, the personal experience of the editor of this blog suggests that Claude is the “smartest”, that Gemini is the most confused, and that ChatGPT lies somewhere in between.

Of course, the editor can’t prove his judgment, but he can provide links to verbatim quotes of the detailed responses of Claude and Gemini to the same two questions, one example among many that he has recently encountered that support his judgment. The reader should compare the straightforward logic of Claude’s responses to the muddled logic of Gemini’s.

Click here for Claude's response to Question 1
Click here for Gemini's response to Question 1
Click here for Claude's response to Question 2
Click here for Gemini's response to Question 2

6. Writing

The reader is asked to compare the responses of Claude and Gemini cited in the previous section once again, this time with an eye towards the specific wording of the responses. Good writing begins with clear thinking, so it should come as no surprise that Claude's writing is "better" than Gemini's. However, the crisp elegance of Claude's phrasing is unexpected. As for Claude's writing skills vs. ChatGPT's, the editor's nod to Claude is but one of many such nods on the Internet.

7. Summarizing

As noted in the table, Claude's top slots in reasoning and writing skills should enable it to produce "better" summaries. The editor has personally observed Claude's superiority in the last few weeks whenever he asked Claude and ChatGPT to summarize complex procedures in online manuals that were required for specific tasks. ChatGPT tended to produce lengthier general summaries that sometimes failed; Claude's concise summaries, which focused on the editor's specific objectives, always worked.

Question: Regular readers of this blog may wonder why the editor uses TL;DR summaries of each week's top stories that were written by ChatGPT. Why doesn't he use Claude?

Answer: The editor uses ChatGPT because he didn't recognize Claude's superior summary skills until recently. Claude's lack of memory will require the editor to construct carefully engineered prompts to elicit its best response ... but the editor will do so, starting next week ... 😎

8. Tutoring

By definition, tutors teach one student at a time by presenting explanations that are customized to that student’s current interests and understanding. Chatbots with memory, like ChatGPT, become ideal tutors, patiently responding to each student’s questions in as much or as little detail as that student desires or requires.

Chatbots can help their users learn new concepts and procedures quickly and effectively. As a tutor, ChatGPT’s memory, i.e., its understanding of its students’ current knowledge and interests, trumps Claude’s eloquence.

C. Bottom line ... "A mixture of chatbots"

ChatGPT = 3 Firsts, 4 Seconds
Claude = 3 Firsts, 1 Second, 1 Third
Gemini = 2 Firsts, 2 Seconds, 4 Thirds

At this time, no chatbot is the best at everything. Computer-savvy users should deploy whichever chatbot is strongest in the skills needed to achieve their objectives. The chosen chatbot will therefore vary from one problem to the next.
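
For readers who like to check the arithmetic, the tallies above can be reproduced with a short Python sketch. The dictionary is transcribed by hand from the table in section A, using the skill names from the section B headings. The names RANKINGS, tally, and best_for are illustrative choices made for this sketch and are not part of any chatbot's tooling.

```python
from collections import Counter

# Rankings transcribed by hand from the table in section A.
# Each tuple is (first, second, third); None marks an "n/a" slot.
RANKINGS = {
    "Memory": ("ChatGPT", "Gemini", None),
    "Search": ("Gemini", "ChatGPT", "Claude"),
    "Follow a link": ("Gemini", None, None),
    "Labeled images": ("ChatGPT", "Gemini", None),
    "Complex reasoning": ("Claude", "ChatGPT", "Gemini"),
    "Writing": ("Claude", "ChatGPT", "Gemini"),
    "Summarizing": ("Claude", "ChatGPT", "Gemini"),
    "Tutoring": ("ChatGPT", "Claude", "Gemini"),
}

PLACES = ("Firsts", "Seconds", "Thirds")


def tally(rankings):
    """Count how many Firsts, Seconds, and Thirds each chatbot earned."""
    counts = {}
    for placings in rankings.values():
        for place, bot in zip(PLACES, placings):
            if bot is not None:  # skip "n/a" slots
                counts.setdefault(bot, Counter())[place] += 1
    return counts


def best_for(skill, rankings=RANKINGS):
    """Return the chatbot ranked first for the given skill."""
    return rankings[skill][0]


if __name__ == "__main__":
    for bot, counter in tally(RANKINGS).items():
        summary = ", ".join(f"{counter[p]} {p}" for p in PLACES if counter[p])
        print(f"{bot} = {summary}")
    print("Best for Tutoring:", best_for("Tutoring"))
```

The best_for lookup is the "mixture of chatbots" idea in miniature: the answer changes with the skill, so a user who needs search and a user who needs a summary would be pointed to different chatbots.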

_____________________________

Links to related notes on this blog:

"Apple's dilemma: Pleasing its iPhone users AND its long term investors", 6/12/25

Section C of this note contains a long verbatim quote from Claude in which Claude provided a rationale for the editor's proposed solution to Apple's dilemma, a rationale that was far more eloquent and more persuasive than the editor's own drafts.

Tuesday, July 15, 2025
