Neoskeptics: Google's new Gemini 1.5 ... plus OpenAI's disruptions ... TL;DR 18Feb24

Last update: Monday 2/19/24

Welcome to our 18Feb24 TL;DR summary about the past week's top AI stories on our "Useful AI News" page ➡ 1) Google's new Gemini 1.5, (2) ChatGPT's personal memory, (3) OpenAI's Sora video generator, and 4) OpenAI's Internet search

TL;DR link ➡ HERE

A. TL;DR ... Top 4 stories in past week ...

Most readers of this blog know that Google has been striving mightily to catch up to the Microsoft/OpenAi partnership ever since the partners released their GPT-4 language model in March 2023. Google made game changing progress in December 2023 when it released its new Gemini family of language models. It made additional progress in early February 2024 by offering subscriptions to users that provided them with access to Gemini enhancements to the applications in Google's Workspace, enhancements that were akin to Microsoft's Copilots

However, Google surprised the Ai community this week by releasing Gemini 1.5, a powerful upgrade to its initial model. Unfortunately for Google, this substantial effort to close the remaining gap with the Microsoft/OpenAI partnership was offset by Open/AI's simultaneous announcements of comparably powerful upgrades to GPT-4. Indeed, these disruptive enhancements are the most significant upgrades to the partners' language model since they moved from GPT-3.5 to GPT-4.

1) Google's new Gemini 1.5

According to Demis Hassabis, CEO of Google DeepMind,

"The first Gemini 1.5 model we’re releasing for early testing is Gemini 1.5 Pro. It’s a mid-size multimodal model, optimized for scaling across a wide-range of tasks, and performs at a similar level to 1.0 Ultra, our largest model to date. It also introduces a breakthrough experimental feature in long-context understanding.

Gemini 1.5 Pro comes with a standard 128,000 token context window. But starting today, a limited group of developers and enterprise customers can try it with a context window of up to 1 million tokens via AI Studio and Vertex AI in private preview

... 1.5 Pro can seamlessly analyze, classify and summarize large amounts of content within a given prompt. For example, when given the 402-page transcripts from Apollo 11’s mission to the moon, it can reason about conversations, events and details found across the document

... 1.5 Pro can perform more relevant problem-solving tasks across longer blocks of code. When given a prompt with more than 100,000 lines of code, it can better reason across examples, suggest helpful modifications and give explanations about how different parts of the code works...

... 1.5 Pro can perform highly-sophisticated understanding and reasoning tasks for different modalities, including video. For instance, when given a 44-minute silent Buster Keaton movie, the model can accurately analyze various plot points and events, and even reason about small details in the movie that could easily be missed."

Impressive, very impressive.

2) ChatGPT's personal memory

According to OpenAI:

"We’re testing memory with ChatGPT. Remembering things you discuss across all chats saves you from having to repeat information and makes future conversations more helpful ...

... As you chat with ChatGPT, you can ask it to remember something specific or let it pick up details itself. ChatGPT’s memory will get better the more you use it and you'll start to notice the improvements over time ...

... You can turn off memory at any time (Settings > Personalization > Memory). While memory is off, you won't create or use memories."

When the editor of this blog read the first report in tech media about this enhancement (see our fourth top event this week, below), he dismissed it as "interesting", "useful", and possibly "intrusive" ... then he encountered reports that OpenAi was developing a "search" function that would challenge Google's dominance in search and it hit him: Wow!!! Telling ChatGPT what you like or dislike, or merely letting ChatGPT infer your preferences from your queries might provide an OpenAI search engine with a substantial advantage over Google's search engine with regards to issues or items that were important enough for you to ask ChatGPT to explain them to you.

3) OpenAI's Sora video generator

Here's a quote from OpenAI's introduction to Sora:

"Introducing Sora, our text-to-video model. Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt."

Given the many absurdly inaccurate still images that DALL-E has produced, the photo-realism of Sora's videos is astounding. Here are links to a few examples of the prompts and resulting videos:

a short fluffy monster ... Wired
Prompt = "animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. the art style is 3d and realistic, with a focus on lighting and texture. the mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. the use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image.”
giant wooly mammoths ... Wired
Prompt = "several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field.”
Tokyo with snowflakes and cherry blossoms ... Wired
Prompt = "Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes."

4) OpenAI's Internet search

According to an exclusive report in The Information:

"OpenAI has been developing a web search product that would bring the Microsoft-backed startup into more direct competition with Google, according to someone with knowledge of OpenAI’s plans. The search service would be partly powered by Bing, this person said.

The move to launch a search app comes a year after Microsoft CEO Satya Nadella said his company would “make Google dance” by incorporating artificial intelligence from OpenAI into Microsoft’s Bing search engine. That partnership has failed to dent Google’s search dominance."

In other words, OpenAI is developing a search engine, but has not yet completed this project. Its serious commitment plus Microsoft's vast resources do not guarantee that this project will succeed; but it does guarantee that Google must maintain the "Code Red" high alert that it declared in March 2023 when it first perceived that the Microsoft/OpenAi partnership posed an existential threat.

B. Top 4 stories in past week ...

Google
"Google unveils Gemini 1.5, a next-gen AI model with million-token context window", Michael Nuñez, VentureBeat, 2/15/24 ***
-- This story also covered by The Verge, TechCrunch, Bloomberg, Wired, Mashable ... and Google

OpenAI
"ChatGPT is getting ‘memory’ to remember who you are and what you like" , David Pierce, The Verge, 2/13/24 ***
-- This story also covered by Bloomberg, Wired, NY Times ... and OpenAI
OpenAI
"OpenAI’s Sora Turns AI Prompts Into Photorealistic Videos", Steven Levy, Wired, 2/13/24 ***
-- This story also covered by VentureBeat, The Verge, Bloomberg, TechCrunch, Engadget, Mashable, NY Times
OpenAI
"OpenAI Develops Web Search Product in Challenge to Google", The Information, 2/13/24 ... Link downloads the full exclusive paywalled article for 3 free reads ***
-- This story also covered by Bloomberg, Gizmodo,

C. ➡ Dozen Basic AI FAQs

This page contains links to responses by Google's Bard chatbot running Gemini Pro to 12 questions that should be asked more frequently, but aren't. As consequence, too many readily understood AI terms have become meaningless buzzwords in the media.

Neoskeptics

Pages

Monday, February 19, 2024

Google's new Gemini 1.5 ... plus OpenAI's disruptions ... TL;DR 18Feb24

No comments:

Post a Comment