Sunday, March 31, 2024

Building the most powerful LLMs -- Amazon/Anthropic, Databricks, Musk, and Microsoft/OpenAI ... TL;DR + podcast 31Mar24

Last update: Sunday 3/31/24 
Welcome to our 31Mar24 podcast + TL;DR summary of the past week's top AI stories on our "Useful AI News" page: (1) Amazon adds $2.75 billion to Anthropic, (2) DBRX = most powerful open source LLM, (3) Elon Musk's Grok-1.5 LLM, and (4) Microsoft/OpenAI Plot $100 Billion Stargate AI Supercomputer.

Audio podcast ... 9 min 
If the audio fails to start or gets stuck, try reloading the page
TL;DR link  HERE

A. TL;DR summary of Top 4 stories

1. Anthropic | 2. DBRX | 3. Grok-1.5 | 4. AI Supercomputer

All four of this week's top stories are about building the most powerful LLMs.

1) Amazon adds $2.75 billion to Anthropic
According to The NY Times:
  • "Amazon said on Wednesday that it had added $2.75 billion to its investment in Anthropic ... Six months ago, Amazon invested $1.25 billion in Anthropic, making the San Francisco start-up Amazon’s most important A.I. partner. Amazon said at the time that it had the option to bring its total investment to $4 billion. It had until the end of March to do so, according to financial filings
Last month, the Times reported that Anthropic had secured $7.3 billion from investors in 2023. Why? Because the reigning LLM dogma among Big Tech is that bigger is better, i.e., bigger LLMs will usually be more powerful LLMs. Bigger LLMs require more high-powered chips, more AI experts, and more electrical energy -- all of which cost more money. So creative Small Tech start-ups, like OpenAI and Anthropic, need rich Big Tech sponsors, like Microsoft and Amazon. Only Google and Elon Musk are funding their own LLMs ... (Yes, Elon Musk counts as Big Tech.)


2) DBRX = most powerful open source LLM
Databricks is a Little Tech company that did not have a Big Tech partner to pay its bills, so it did not try to build a multibillion-dollar LLM bigger than everyone else's. Instead, it built the best LLM (132 billion parameters) that it could afford to build with "only" $10 million of its own funds.

While reading Wired's spirited "inside story" about how Databricks built its LLM, the editor of this blog found himself quietly cheering the same way he cheered when another "inside" account was read to him as a child: "I think I can, I think I can, I think I can," as the Little Engine That Could chugged its way up the steep hill to victory in the classic children's story.

So he was delighted to read that Databricks, "The Little Tech That Could", produced the DBRX LLM that aced its benchmark tests to become the most powerful open source LLM, beating all of the other open source contenders, including Elon Musk's Grok-1. 

As noted in one of our recent TL;DRs -- Does generative AI really boost profits? -- Big Tech has conceded that the costs of current LLMs are way too high to provide cost-effective generative AI services for their customers. Nevertheless, Big Tech's reigning dogma decrees that bigger LLMs are better LLMs. Therefore, LLMs that cost 10 times as much to develop and operate as today's overhyped LLMs will somehow be more cost-effective. Why? Because cost-effectiveness is an emergent property of GenAI, right? Hmmmmmmm ...

Meanwhile, common sense suggests that smaller, focused, open source LLMs like DBRX, built on high-quality data, would cost less and yield more reliable output. Being open source, they can be customized to meet a corporate customer's specific needs. Moreover, they can be run more securely on the customer's own servers than on remote servers in the cloud (see the sketch below). So here are three cheers for the "Little Tech That Could" and did ... :-)
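For readers who wonder what "running an open source LLM on your own servers" looks like in practice, here is a minimal sketch using the Hugging Face transformers library. The repo id "databricks/dbrx-instruct" is our assumption about where Databricks publishes the weights (check Databricks' announcement for the actual location), and a model of this size needs serious GPU hardware:

```python
# A minimal sketch of running an open source LLM on a company's own hardware.
# Assumptions (not confirmed by the article): the DBRX weights live on
# Hugging Face under the repo id "databricks/dbrx-instruct", and the machine
# has enough GPU memory to hold the model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"  # assumed repo id -- verify before use

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread weights across available GPUs (needs `accelerate`)
    torch_dtype="auto",  # use the checkpoint's native precision
)

# The prompt and the output never leave the customer's own servers --
# nothing is sent to a third-party cloud API.
prompt = "Summarize our internal Q1 sales notes in three bullet points."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The "customized to meet a corporate customer's specific needs" part would come on top of this, e.g., by fine-tuning the downloaded weights on the customer's own data -- something a closed, API-only LLM does not allow.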


3) Elon Musk's Grok-1.5 LLM
Last week Musk declared that Grok-1, the LLM currently underlying his Grok chatbot, was now open source when xAI published its network architecture and model weights.

This week's news is that xAI, Musk's AI development company, has been working on Grok-1.5, a more powerful but proprietary LLM. xAI's blog posted results from four benchmark tests showing Grok-1.5 doing almost as well as Microsoft/OpenAI's GPT-4, Google's Gemini Pro 1.5, and Anthropic's Claude 3 Opus on three of the tests, and slightly better on the coding test. Grok-1.5 also has a 128,000-token context window for long prompts. The blog post promises that Grok-1.5 will be accessible to "our early testers and existing Grok users on the 𝕏 platform in the coming days."

In a brief 3/28/24 post on X, Musk noted that xAI was already working on Grok-2: "Grok 2 should exceed current AI on all metrics. In training now." The post had been viewed 18.5 million times by noon on Saturday 3/30/24. If xAI delivers on this commitment and Musk converts Grok-2 into an open source model available to everyone, the implications would be game-changing.


4) Microsoft/OpenAI Plot $100 Billion Stargate AI Supercomputer
Once again, the well-connected publication The Information scored an exclusive "scoop", this time about Microsoft and OpenAI's plans to build Stargate, the world's largest AI supercomputer, by 2028. The article's opening paragraph captures the essence of the story:
  • "Executives at Microsoft and OpenAI have been drawing up plans for a data center project that would contain a supercomputer with millions of specialized server chips to power OpenAI’s artificial intelligence, according to three people who have been involved in the private conversations about the proposal. The project could cost as much as $100 billion, according to a person who spoke to OpenAI CEO Sam Altman about it and a person who has viewed some of Microsoft’s initial cost estimates."
The article goes on to report that the partners are in the midst of a five-phase development plan. Phases 1 and 2 were not described; the partners are currently in Phase 3; the first AI supercomputer for OpenAI will be built in Phase 4; and the massive Stargate, containing millions of super chips, will be launched in Phase 5. Microsoft will provide the funding, but is hedging its bets:
  • "Microsoft’s willingness to go ahead with the Stargate plan depends in part on OpenAI’s ability to meaningfully improve the capabilities of its AI, one of these people said. OpenAI last year failed to deliver a new model it had promised to Microsoft, showing how difficult the AI frontier can be to predict. Still, OpenAI CEO Sam Altman has said publicly that the main bottleneck holding up better AI is a lack of sufficient servers to develop it."
Indeed, Microsoft seems to be placing bets on a wide range of options. Readers of this blog may recall our TL;DR in early January 2024, Microsoft creates a new small language model (SLM) team. That high-powered team was organized because Microsoft Research had discovered that small language models had "surprising power".



B. Top 4 stories in past week on "Useful AI News"
  1. Other Models
    "Amazon Adds $2.75 Billion to Its Stake in the A.I. Start-Up Anthropic", Karen Weise, NY Times, 3/27/24 *** 
    -- This story also covered by Bloomberg, VentureBeat, The Information, Wall Street Journal, TechCrunch


  2. Other Models
    "Inside the Creation of the World’s Most Powerful Open Source AI Model", Will Knight, Wired, 3/27/24 *** 
    -- Model is called DBRX ... 132 billion parameters
    -- This story also covered by The Information, Business Insider ... and Databricks

  3. Other Models
    "Elon Musk announces Grok-1.5, nearing GPT-4 level performance", Shubham Sharma, VentureBeat, 3/29/24 *** 
    -- This story also covered by TechCrunch, Wall Street Journal, Reuters, Engadget ... and xAI (Musk)


  4. Microsoft
    "Microsoft and OpenAI Plot $100 Billion Stargate AI Supercomputer",  Anissa Gardizy and Amir Efrati, The Information, 3/29/24 *** 
    -- This story also covered by Gizmodo, Reuters

C. Dozen Basic AI FAQs  HERE
This page contains links to responses by Google's Bard chatbot (running Gemini Pro) to 12 questions that should be asked more frequently, but aren't. As a consequence, too many readily understood AI terms have become meaningless buzzwords in the media.


Your comments will be greatly appreciated ... Or just click the "Like" button above the comments section if you enjoyed this blog note.