Tuesday, January 9, 2024

Happy New Year!!! Top AI stories of 2023 ... TL;DR and podcast 9Jan24

Last update: Friday 1/12/24 
Happy New Year!!! ... And welcome to our TL;DR summary + podcast about the top AI stories that were posted on our "Useful AI News" page in 2023. We selected the stories that we thought would prove most useful for predicting what the top AI stories of 2024 will be.

Our first top story is OpenAI's migration of ChatGPT to its new GPT-4 model in March 2023, a move that led to ChatGPT's domination of the AI news sector for the rest of the year. Our second top story covers the most significant Big Tech challenges to OpenAI's supremacy in the final weeks of 2023, i.e., challenges from Amazon, Google, and Microsoft ... yes, Microsoft.

These top stories framed our prediction of the top stories for 2024. Spoiler alert, here's our prediction: 
Whereas large language models (LLMs) dominated AI news throughout 2023, the rapid and widespread proliferation of small language models (SLMs) will be the dominant AI news story throughout 2024. 

Click the "start" button in the audio control (below) to hear the podcast ...  If audio fails to start, or gets stuck, try reloading the page.
 
TL;DR link  HERE

A. TL;DR ... Top AI stories in 2023

1. OpenAI releases ChatGPT's new GPT-4 model
When OpenAI released ChatGPT running on the GPT-3.5 model in November 2022, it was so impressive that one hundred million users signed up for its services within three months. 

When OpenAI introduced its new GPT-4 model in March 2023, it quickly blew GPT-3.5 out of the water because its cognitive skills were far more powerful. We were then told that GPT-4 was far more powerful than GPT-3.5 because it had been trained on far more data. Bigger was better.

Unfortunately, this "explanation" didn't really explain the new cognitive skills that users of these models have found most impressive and/or most useful, e.g., the models' ability to answer complex questions, solve problems, write poems about a suggested theme in a specified literary style, crack jokes in the style of well-known comedians, debug computer code, summarize documents, and so on.

Fortunately, many AI experts involved in the creation of these and other large language models frankly admitted that the training of these models was not designed to imbue them with these higher-level skills. Their sudden, unexpected emergence is still a mystery. Note that "emergent" skills are skills that a model somehow acquires even though it was not trained to acquire them.
  • "From GPT-1 to GPT-4: A Comprehensive Analysis and Comparison of OpenAI’s Evolving Language Models", Dhanshree Shripad Shenwai, Marktechpost,, 7/5/23


2. Big Tech's separate challenges to OpenAI's dominance

2a) Microsoft's most important announcements at its Ignite 2023 conference for developers and IT professionals (Nov 14-17, Seattle, WA)
  • Rather than use ChatGPT as the primary user interface for its LLM-based services, Microsoft confirmed that it would provide a customized interface, called a "Copilot", for each of its office productivity apps. The copilots would help users use each productivity app more effectively, so there would be a copilot for Word, for Excel, for Teams, for GitHub, etc. Enterprise customers who subscribed to Microsoft's office apps would need an additional subscription for each copilot.
  • It renamed "Bing Chat" to "Microsoft Copilot" and subsequently announced that Microsoft Copilot running GPT-4 would be free, whereas OpenAI charged its "Plus" users a $20 per month subscription fee.
  • It announced its intention to build its own AI chips rather than buy Nvidia's AI chips, which were in short supply.
  • It enabled Microsoft's Azure cloud to support open source LLMs, e.g., Meta's Llama.
These initiatives enhanced Microsoft's strategic independence from OpenAI's LLMs and Nvidia's chips, and offered potential opportunities for Microsoft to earn fees from open source LLMs that used its Azure cloud. 
  • "The Inside Story of Microsoft’s Partnership with OpenAI", Charles Duhigg, The New Yorker, 12/1/22

2b) The "AI Big Bang" a/k/a The fall and rise of Sam Altman ... 11/17/23 to 11/21/23
... Note that this event is only included as a top AI story of 2023 because of its potential influence on Microsoft's actions in 2024 ...

  • The tumultuous long weekend began on Friday evening when the OpenAI board abruptly notified Sam Altman that he was fired. In its dismissal notice the OpenAI board said that it no longer trusted Sam Altman, but it did not say why. However, the Washington Post reported that

    "Four years ago, one of Altman’s mentors, Y Combinator founder Paul Graham, flew from the United Kingdom to San Francisco to give his protégé the boot, according to three people familiar with the incident, which has not been previously reported."

    "Graham had surprised the tech world in 2014 by tapping Altman, then in his 20s, to lead the vaunted Silicon Valley incubator. Five years later, he flew across the Atlantic with concerns that the company’s president put his own interests ahead of the organization — worries that would be echoed by OpenAI’s board."

  • On Tuesday evening, Altman returned to OpenAI as CEO; three out of the four members of the board were dismissed; and two new outsiders were appointed to the board. 
Microsoft CEO Satya Nadella's confident leadership won this "battle" handily, especially his reassuring assertions that Microsoft's $13 billion investment in OpenAI gave it full access to OpenAI's revolutionary LLM technology, regardless of what happened to Sam Altman or to OpenAI. Indeed, Microsoft's stock hit an all-time high after Altman's return was announced. Of course Sam Altman won, but the three board members who were dismissed lost, and OpenAI lost. OpenAI lost because it failed to fulfill its non-profit mission. OpenAI is now a de facto profit-oriented operation beholden to Microsoft and other investors.

2c) Amazon's "re:Invent 2023" customer conference (Nov 27-Dec 1, Las Vegas, NV)  
Amazon played catch-up with regard to the deployment of generative AI technology; more specifically, Amazon played catch-up to Microsoft. 

Few were surprised by Amazon's announcements of an array of generative AI services that it would provide for enterprise customers of its AWS cloud services. Its most memorable announcements were for its chatbot and its image generator:
  • Amazon's chatbot is called "Q"
  • Its image generator is called "Titan Image Generator"
Note that "Titan" is the name of Amazon's LLM. Also note that Amazon had previously announced its intention to produce its own AI training chips, called "Trainium"; at re:Invent, Amazon announced the development of a more powerful generation of Trainium chips. 

2d) Google announced "Gemini", its new family of language models, on 12/6/23
Gemini comes in three versions: Gemini Nano, Gemini Pro, and Gemini Ultra.
  • Nano, the lightest version, is a small language model (SLM) that will eventually run on all Android devices, but currently only runs on Google's Pixel smartphones. It will not require Internet connectivity.

  • Pro is an LLM, more powerful than Nano, that currently runs Bard on all devices. Bard can be accessed at bard.google.com.

  • Ultra is the most powerful LLM, but Google has not yet announced a release date.
At present Nano and Pro accept text input, then deliver text output, but all versions will eventually be multimodal, accepting input as text, images, video, audio, and code.

All models were trained on Google's own chips (Tensor Processing Units) and will run in data centers on those chips.

Google's official announcement presented the results of 18 benchmark tests that compared Gemini Ultra with GPT-4. Gemini performed better than GPT-4 by small margins on all but one test.
  • Stock market ... the most important benchmark???
    "Alphabet soars as Wall Street cheers arrival of AI model Gemini", Aditya Soni, Reuters, 12/7/23

2e) Microsoft announced Phi-2, a small language model (SLM), on 12/12/23
... Note that this event is included as a top AI event of 2023 because of its potential influence on everybody's actions in 2024 ...

Microsoft published the results of a Phi-2 vs. Ultra test on its blog in a note titled: "The surprising power of small language models". Phi-2 is not a large language model (LLM); Phi-2 is a small language model (SLM). Whereas Gemini Ultra contains 1.56 trillion parameters, Phi-2 only contains 2.7 billion parameters; so Ultra is about 580 times as large as Phi-2.

Microsoft posed the kind of question to Phi-2 and to Gemini Ultra that students might encounter in an introductory physics course, specifically: to calculate the speed of a skier at the bottom of a hill, given the skier's mass, the gravitational constant, and the height of the hill (a quick sketch of the underlying calculation follows the quote below). Phi-2 calculated the correct answer, but Ultra's answer was wrong. Adding insult to injury, Microsoft asked Phi-2 to identify the logical error in Ultra's calculations ... which it did. How could this happen? The answer implied by Microsoft's blog note is plausible:
"The massive increase in the size of language models to hundreds of billions of parameters has unlocked a host of emerging capabilities that have redefined the landscape of natural language processing. A question remains whether such emergent abilities can be achieved at a smaller scale using strategic choices for training, e.g., data selection ...
... training data quality plays a critical role in model performance. This has been known for decades, but we take this insight to its extreme by focusing on “textbook-quality” data, following upon our prior work “Textbooks Are All You Need.” 
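
As an aside, the skier question itself reduces to conservation of energy: the potential energy at the top of the hill converts entirely to kinetic energy at the bottom, so the skier's mass cancels and the final speed depends only on the height. Here is a minimal sketch of the expected calculation, using hypothetical numbers, since Microsoft's exact prompt values are not reproduced here:

    import math

    g = 9.8        # gravitational acceleration, m/s^2
    height = 30.0  # hypothetical hill height, meters
    mass = 70.0    # hypothetical skier mass, kg (cancels out of the final formula)

    # Energy conservation: m*g*h = (1/2)*m*v^2  =>  v = sqrt(2*g*h)
    speed = math.sqrt(2 * g * height)
    print(f"Speed at the bottom: {speed:.1f} m/s")  # ~24.2 m/s, independent of mass
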
By spending a couple of hours searching the Internet for other notes about small language models, the editor of this blog discovered that many other researchers in the AI community had published similar findings about the surprising power of SLMs; see the reading list below, followed by a minimal sketch of running such a model locally. Contrary to the conventional wisdom that dominated most discussions of language models throughout 2023, bigger is not necessarily better.
  • "The Rise of Small Language Models— Efficient & Customizable", Bijit Ghosh, Medium, 11/26/23
  • "9 Best Small Language Models Released in 2023", Sandhra Jayan, AIM, 12/7/23
  • "Small language models an emerging GenAI force", Antone Gonsalves, TechTarget, 12/15/23
  • "Everything You Need to Know about Small Language Models (SLM) and its Applications", Tanya Malhotra, MarTechPost, 12/5/23
  • "7 Steps to Running a Small Language Model on a Local CPU", Aryan Garg, KDnuggets, 11/14/23


B. Our prediction for top AI stories in 2024 ...

Whereas large language models (LLMs), especially ChatGPT's GPT-4, dominated AI news in 2023, the rapid and widespread proliferation of small language models (SLMs) will be the dominant AI news story throughout 2024.

LLMs aim to know something about everything, so they currently contain trillions of parameters. By contrast, SLMs are focused: their knowledge domains are limited, so they need only billions, not trillions, of parameters. Nevertheless, as noted in Microsoft's report and elsewhere, SLMs have surprising power, sometimes comparable to that of LLMs.
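
A rough back-of-envelope illustration of that difference, assuming 16-bit (two-byte) weights, a common storage format for model parameters, and using the parameter counts cited in the Phi-2 discussion above:

    bytes_per_param = 2        # assuming 16-bit weights

    phi2_params = 2.7e9        # Phi-2, per Microsoft's report
    ultra_params = 1.56e12     # Gemini Ultra, per the figure cited above

    print(f"Phi-2 weights: ~{phi2_params * bytes_per_param / 1e9:.1f} GB")    # ~5.4 GB, laptop territory
    print(f"Ultra weights: ~{ultra_params * bytes_per_param / 1e12:.1f} TB")  # ~3.1 TB, data-center territory
    print(f"Size ratio:    ~{ultra_params / phi2_params:.0f}x")               # ~578x

The storage figures are only a crude proxy for development and operating costs, but they make it clear why SLMs can run on devices and budgets that LLMs cannot.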

-- SLMs should cost a fraction of what LLMs cost to develop and operate. Whereas LLMs cost billions to develop and operate, one would expect that SLMs could be developed and operated for hundreds of millions, perhaps even tens of millions.

-- SLMs can be trained on high-quality private data that their developers license from the data's owners, licenses that neither violate property rights nor invade individual privacy.

Therefore Big Tech's expensive LLMs will no longer dominate the language model space. More cost-effective SLMs will be developed by scores of smaller firms, e.g., Anthropic, firms that can afford to develop highly effective but far less expensive models.

With so many potential competitors, the shared development of open source platforms offers greater potential profitability to the contributors than if each developer pursued the separate development of its own closed platform. Indeed, open source development would be more likely to resolve the mystery of emergent properties.

None of the above implies that the Big Tech firms will not produce their own SLMs in 2024. Indeed, Google's Nano is an SLM. Nevertheless, Big Tech will face stiff competition from Smaller Tech. However, given its public confirmation of the surprising power of SLMs, it is surprising that Microsoft did not announce its own commercial SLM. Instead, the Microsoft report makes the following declaration:
"With its compact size, Phi-2 is an ideal playground for researchers, including for exploration around mechanistic interpretability, safety improvements, or fine-tuning experimentation on a variety of tasks. We have made Phi-2(opens in new tab) available in the Azure AI Studio model catalog to foster research and development on language models."

Given that Microsoft, in effect, had outsourced a $13 billion contract to OpenAI to develop a commercial LLM for Microsoft, why didn't Microsoft award another contract to OpenAI to develop a commercial SLM? Perhaps Microsoft is reluctant to extend its dependence on OpenAI at this time, given the stunning instability that OpenAI displayed during the "AI Big Bang" weekend just a few weeks ago. 


C. Top stories of 2023 ...
  1. OpenAI releases ChatGPT's new GPT-4 model
    "GPT-4 has arrived. It will blow ChatGPT out of the water.", Drew Harwell and Nitasha Tiku, Washington Post3/14/23
    -- This story also covered by NY TimesThe AtlanticMIT Tech Review, TechCrunchThe Verge, ForbesNature  ... and OpenAI 

  2. Big Tech's most significant challenges to OpenAI and GPT-4 in the final weeks of 2023 ... Only our main headline stories are displayed, but links to related articles can be found on the TL;DR+ Podcast page for each week

    a) Microsoft's Ignite 2023 conference for developers and IT professionals (Nov 14-17, Seattle, WA)
    -- Overview: "Microsoft Ignite 2023: Copilot AI expansions, custom chips and all the other announcements", TechCrunch, 11/15 & 16/23 ... This page contains TechCrunch summaries of every announcement made by Microsoft.  
    -- Other related stories: TL;DR + Podcast 19Nov23

    b) The "AI Big Bang" a/k/a The fall and rise of Sam Altman
    -- "OpenAI Ousts CEO Sam Altman", Will Knight, Wired, 11/17/23 
    -- "Sam Altman Is Reinstated as OpenAI’s Chief Executive", Cade Metz, Mike Isaac, Tripp Mickle, Karen Weise and Kevin Roose, NY Times, 11/22/23
    -- Other related stories: TL;DR and podcast 26Nov23

    c) Amazon's re:Invent 2023 conference (Nov 27-Dec 1, Las Vegas, NV) 
    -- Overview: "Here’s everything Amazon Web Services announced at AWS re:Invent", Christine Hall, TechCrunch, 11/29/23
    -- Other related stories: TL;DR and Podcast 3Dec23

    d) Google announced "Gemini" ... its new language models
    -- "Google DeepMind's Demis Hassabis Says Gemini Is a New Breed of AI", Will Knight, Wired, 12/6/23 
    -- Other related stories TL;DR and podcast 10Dec23


    e) Small Language Models (SLMs)
    -- "Microsoft releases Phi-2, a small language model AI that outperforms Llama 2, Mistral 7B", Carl Franzen, VentureBeat, 12/12/23 ... Note: Phi-2 also performed better than Gemini Ultra on a test of logical reasoning.
    -- "The surprising power of Small Language Models (SLMs)", Mojan Javaheripi and Sébastien Bubeck, Microsoft Research, 12/12/23
    -- Other related stories: TL;DR and podcast 21Dec23

This page contains links to responses by Google's Bard chatbot, running Gemini Pro, to 12 questions that should be asked more frequently, but aren't. As a consequence, too many readily understood AI terms have become meaningless buzzwords in the media.


Your comments will be greatly appreciated ... Or just click the "Like" button above the comments section if you enjoyed this blog note.