Monday, February 19, 2024

Google's new Gemini 1.5 ... plus OpenAI's disruptions ... TL;DR and podcast 18Feb24

Last update: Monday 2/19/24 
Welcome to our 18Feb24 TL;DR summary + podcast about the past week's top AI stories on our "Useful AI News" page: 1) Google's new Gemini 1.5, 2) ChatGPT's personal memory, 3) OpenAI's Sora video generator, and 4) OpenAI's Internet search


Click link for podcast (opens in new tab) 
Click the "start" button when the podcast page is loaded
... 
 If audio fails to start, or gets stuck, try reloading that page
TL;DR link  HERE
 
A. TL;DR ... Top 4 stories in past week  ...
Most readers of this blog know that Google has been striving mightily to catch up to the Microsoft/OpenAI partnership ever since the partners released their GPT-4 language model in March 2023. Google made game-changing progress in December 2023 when it released its new Gemini family of language models. It made additional progress in early February 2024 by offering subscriptions that gave users access to Gemini enhancements to the applications in Google's Workspace, enhancements that were akin to Microsoft's Copilots.

However, Google surprised the AI community this week by releasing Gemini 1.5, a powerful upgrade to its initial model. Unfortunately for Google, this substantial effort to close the remaining gap with the Microsoft/OpenAI partnership was offset by OpenAI's simultaneous announcements of comparably powerful upgrades to GPT-4. Indeed, these disruptive enhancements are the most significant upgrades to the partners' language model since they moved from GPT-3.5 to GPT-4.

1)  Google's new Gemini 1.5
"The first Gemini 1.5 model we’re releasing for early testing is Gemini 1.5 Pro. It’s a mid-size multimodal model, optimized for scaling across a wide-range of tasks, and performs at a similar level to 1.0 Ultra, our largest model to date. It also introduces a breakthrough experimental feature in long-context understanding.
Gemini 1.5 Pro comes with a standard 128,000 token context window. But starting today, a limited group of developers and enterprise customers can try it with a context window of up to 1 million tokens via AI Studio and Vertex AI in private preview 
... 1.5 Pro can seamlessly analyze, classify and summarize large amounts of content within a given prompt. For example, when given the 402-page transcripts from Apollo 11’s mission to the moon, it can reason about conversations, events and details found across the document 
... 1.5 Pro can perform more relevant problem-solving tasks across longer blocks of code. When given a prompt with more than 100,000 lines of code, it can better reason across examples, suggest helpful modifications and give explanations about how different parts of the code works... 
... 1.5 Pro can perform highly-sophisticated understanding and reasoning tasks for different modalities, including video. For instance, when given a 44-minute silent Buster Keaton movie, the model can accurately analyze various plot points and events, and even reason about small details in the movie that could easily be missed."
Impressive, very impressive.

2) ChatGPT's personal memory
According to OpenAI:
"We’re testing memory with ChatGPT. Remembering things you discuss across all chats saves you from having to repeat information and makes future conversations more helpful ...  
... As you chat with ChatGPT, you can ask it to remember something specific or let it pick up details itself. ChatGPT’s memory will get better the more you use it and you'll start to notice the improvements over time ...  
... You can turn off memory at any time (Settings > Personalization > Memory). While memory is off, you won't create or use memories."
When the editor of this blog read the first report in tech media about this enhancement, he dismissed it as "interesting", "useful", and possibly "intrusive". Then he encountered reports that OpenAI was developing a "search" function that would challenge Google's dominance in search (see our fourth top story this week, below), and it hit him: Wow!!! Telling ChatGPT what you like or dislike, or merely letting ChatGPT infer your preferences from your queries, might give an OpenAI search engine a substantial advantage over Google's search engine with regard to issues or items that were important enough for you to ask ChatGPT to explain them to you.

3) OpenAI's Sora video generator
Here's a quote from OpenAI's introduction to Sora: 
"Introducing Sora, our text-to-video model. Sora can generate videos up to a minute long while maintaining visual quality and adherence to the user’s prompt."

Given the many absurdly inaccurate still images that DALL-E has produced, the photo-realism of Sora's videos is astounding. Here are links to a few examples of the prompts and resulting videos:

  • a short fluffy monster ... Wired
    Prompt = "animated scene features a close-up of a short fluffy monster kneeling beside a melting red candle. the art style is 3d and realistic, with a focus on lighting and texture. the mood of the painting is one of wonder and curiosity, as the monster gazes at the flame with wide eyes and open mouth. its pose and expression convey a sense of innocence and playfulness, as if it is exploring the world around it for the first time. the use of warm colors and dramatic lighting further enhances the cozy atmosphere of the image."

  • giant wooly mammoths ... Wired
    Prompt = "several giant wooly mammoths approach treading through a snowy meadow, their long wooly fur lightly blows in the wind as they walk, snow covered trees and dramatic snow capped mountains in the distance, mid afternoon light with wispy clouds and a sun high in the distance creates a warm glow, the low camera view is stunning capturing the large furry mammal with beautiful photography, depth of field."

  • Tokyo with snowflakes and cherry blossoms ... Wired
    Prompt = "Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes."

4) OpenAI's Internet search
According to an exclusive report in The Information:
"OpenAI has been developing a web search product that would bring the Microsoft-backed startup into more direct competition with Google, according to someone with knowledge of OpenAI’s plans. The search service would be partly powered by Bing, this person said.
The move to launch a search app comes a year after Microsoft CEO Satya Nadella said his company would “make Google dance” by incorporating artificial intelligence from OpenAI into Microsoft’s Bing search engine. That partnership has failed to dent Google’s search dominance."
In other words, OpenAI is developing a search engine, but has not yet completed this project. Its serious commitment plus Microsoft's vast resources do not guarantee that this project will succeed, but they do guarantee that Google must maintain the "Code Red" high alert that it declared in December 2022 when it first perceived that the Microsoft/OpenAI partnership posed an existential threat.


B. Top 4 stories in past week ... 
  1. Google
    "Google unveils Gemini 1.5, a next-gen AI model with million-token context window", Michael Nuñez, VentureBeat, 2/15/24 ***
    -- This story was also covered by The Verge, TechCrunch, Bloomberg, Wired, Mashable ... and Google
