From choosing what to watch to fact-checking videos: 6 impressive things you can do with Google’s Gemini 1.5 Pro

Earlier this month, Alphabet Inc.’s latest “experimental” artificial intelligence model, Google Gemini 1.5 Pro, was released to select users, such as developers and enterprise customers, via the company’s GenAI development tool, AI Studio. The latest iteration of the model goes far beyond previous versions and can process more than 100,000 words at once.

What happened: Over the weekend, Rowan Cheung, the founder of The Rundown AI, took to X, formerly Twitter, and shared a series of posts explaining six “impressive capabilities” users can experience with Gemini 1.5 Pro.


First: Understand long videos

Cheung uploaded “the entire NBA dunk contest” from Saturday night and asked Gemini 1.5 Pro to answer “which dunk had the highest score.”

According to the founder of The Rundown AI, the model identified the specific dunk that scored a perfect 50, along with its details, relying only on its understanding of the long-context video.

Second: Understand complete film transcripts

Cheung then asked Gemini 1.5 Pro to “compare and contrast” transcripts of the films “Interstellar” and “Ad Astra” to help him decide which one he should watch next.

Google’s latest AI model “was able to easily understand, compare and contrast entire transcripts of both films.” All he had to do was upload the transcripts and give the right prompt.

Third: Overcome the language barrier

Cheung successfully used Gemini 1.5 Pro to translate into a language spoken by fewer than 2,000 people. He translated a newsletter from English to Saterlandic, with the model working from a complete language manual supplied at inference time.

Fourth: Distinguish fake videos from real ones

Cheung also asked Google’s Gemini 1.5 Pro to discern whether a video from OpenAI’s Sora was produced by AI or not. All he had to do was upload the video and ask, “Could this video be generated by artificial intelligence?”

While Gemini 1.5 Pro didn’t provide an accurate answer, it highlighted “key factors as to why it might be AI-generated.”

Fifth: Make it easier to understand the nuances of long documents

Cheung used the AI model to extract “Table 8” from the Gemini 1.5 Pro paper written by DeepMind. Gemini 1.5 Pro can find, understand and explain a small figure in a long document.

Sixth: Get a personalized review of a movie

The Rundown AI founder also used Gemini 1.5 Pro to get a personalized review of Christopher Nolan’s “Interstellar.” He uploaded the transcript of the film and asked the AI model to extract the three most significant quotes, which the model did brilliantly.

Why it matters: While the model is not yet accessible to the public, Cheung has been granted early access to Gemini 1.5 Pro by Google DeepMind.

This advanced model boasts significantly greater data processing capacity than its predecessor, Gemini 1.0 Pro. It can handle approximately 700,000 words or 30,000 lines of code, marking a notable 35x increase over the capabilities of Gemini 1.0 Pro.

Furthermore, the functionality of the model goes beyond text processing. It can handle up to 11 hours of audio or one hour of video in multiple languages.

This expanded functionality is made possible by Gemini 1.5 Pro’s support for up to one million tokens, with Google reporting successful tests with up to 10 million tokens.

However, it’s worth noting that the current version of Gemini 1.5 Pro, accessible to most developers and customers, is limited to processing around 100,000 words at a time.

