Connect with us

Technology

Google Unveils VideoPrism: An All-Purpose AI Model for Video Comprehension

Published

on

VideoPrism is useful for a wide range of tasks related to understanding and analyzing videos, according to Google. The model performs exceptionally well in identifying objects and activities in videos, locating related videos, and, when paired with a language model, providing descriptions of video content and responding to queries about videos.

The foundation of VideoPrism is the Vision Transformer (ViT) architecture, which enables the model to process temporal and spatial information from videos.

The group used 36 million high-quality video-text pairs and 582 million video clips with noisy or artificially generated parallel text to train VideoPrism on a self-generated, sizable, and varied dataset. This dataset is the largest of its kind, according to Google.

According to Google, VideoPrism is distinct since it makes use of two complimentary pre-training signals: While the video content provides information about the visual dynamics, the text descriptions explain how the objects appear in the videos.

There were two stages to the training: Initially, the model was trained to identify videos that corresponded with descriptive text. After that, it developed the ability to anticipate video gaps.

VideoPrism used a single, frozen model to achieve state-of-the-art results in 30 out of 33 video comprehension benchmark evaluation cases with minimal adaptation effort.

It performed better in combination with large language models for video text retrieval, video captioning, and video question answering, and it outperformed other baseline video models in classification and localization tasks.

Additionally, VideoPrism outperformed models created especially for scientific applications like animal behavior analysis and ecology. Google sees this as a chance to enhance its video analytics in numerous ways.

In order to fully realize the potential of video models in fields like scientific research, education, and healthcare, the research team hopes that VideoPrism will open new avenues for advancements at the nexus of artificial intelligence and video analytics.

Technology

AI Features of the Google Pixel 8a Leaked before the Device’s Planned Release

Published

on

A new smartphone from Google is anticipated to be unveiled during its May 14–15 I/O conference. The forthcoming device, dubbed Pixel 8a, will be a more subdued version of the Pixel 8. Despite being frequently spotted online, the smartphone has not yet received any official announcements from the company. A promotional video that was leaked is showcasing the AI features of the Pixel 8a, just weeks before its much-anticipated release. Furthermore, internet leaks have disclosed software support and special features.

Tipster Steve Hemmerstoffer obtained a promotional video for the Pixel 8a through MySmartPrice. The forthcoming smartphone is anticipated to include certain Pixel-only features, some of which are demonstrated in the video. As per the video, the Pixel 8a will support Google’s Best Take feature, which substitutes faces from multiple group photos or burst photos to “replace” faces that have their eyes closed or display undesirable expressions.

There will be support for Circle to Search on the Pixel 8a, a feature that is presently present on some Pixel and Samsung Galaxy smartphones. Additionally, the leaked video implies that the smartphone will come equipped with Google’s Audio Magic Eraser, an artificial intelligence (AI) tool for eliminating unwanted background noise from recorded videos. In addition, as shown in the video, the Pixel 8a will support live translation during voice calls.

The phone will have “seven years of security updates” and the Tensor G3 chip, according to the leaked teasers. It’s unclear, though, if the phone will get the same amount of Android OS updates as the more expensive Pixel 8 series phones that have the same processor. In the days preceding its planned May 14 launch, the company is anticipated to disclose additional information about the device.

Continue Reading

Technology

Apple Unveils a new Artificial Intelligence Model Compatible with Laptops and Phones

Published

on

All of the major tech companies, with the exception of Apple, have made their generative AI models available for use in commercial settings. The business is, nevertheless, actively engaged in that area. Wednesday saw the release of Open-source Efficient Language Models (OpenELM), a collection of four incredibly compact language models—the Hugging Face model library—by its researchers. According to the company, OpenELM works incredibly well for text-related tasks like composing emails. The models are now ready for development and the company has maintained them as open source.

In comparison to models from other tech giants like Microsoft and Google, the model is extremely small, as previously mentioned. 270 million, 450 million, 1.1 billion, and 3 billion parameters are present in Apple’s latest models. On the other hand, Google’s Gemma model has 2 billion parameters, whereas Microsoft’s Phi-3 model has 3.8 billion. Minimal versions are compatible with phones and laptops and require less power to operate.

Apple CEO Tim Cook made a hint in February about the impending release of generative AI features on Apple products. He said that Apple has been working on this project for a long time. About the details of the AI features, there is, however, no more information available.

Apple, meanwhile, has declared that it will hold a press conference to introduce a few new items this month. Media invites to the “special Apple Event” on May 7 at 7 AM PT (7:30 PM IST) have already begun to arrive from the company. The invite’s image, which shows an Apple Pencil, suggests that the event will primarily focus on iPads.

It seems that Apple will host the event entirely online, following in the footsteps of October’s “Scary Fast” event. It is implied in every invitation that Apple has sent out that viewers will be able to watch the event online. Invitations for a live event have not yet been distributed.
Apple has released other AI models before this one. The business previously released the MGIE image editing model, which enables users to edit photos using prompts.

Continue Reading

Technology

Google Expands the Availability of AI Support with Gemini AI to Android 10 and 11

Published

on

Android 10 and 11 are now compatible with Google’s Gemini AI, which was previously limited to Android 12 and above. As noted by 9to5google, this modification greatly expands the pool of users who can take advantage of AI-powered support for their tablets and smartphones.

Due to a recent app update, Google has lowered the minimum requirement for Gemini, which now makes its advanced AI features accessible to a wider range of users. Previously, Gemini required Android 12 or later to function. The AI assistant can now be installed and used on Android 10 devices thanks to the updated Gemini app, version v1.0.626720042, which can be downloaded from the Google Play Store.

This expansion, which shows Google’s goal to make AI technology more inclusive, was first mentioned by Sumanta Das on X and then further highlighted by Artem Russakoviskii. Only the most recent versions of Android were compatible with Gemini when it was first released earlier this year. Google’s latest update demonstrates the company’s dedication to expanding the user base for its AI technology.

Gemini is now fully operational after updating the Google app and Play Services, according to testers using Android 10 devices. Tests conducted on an Android 10 Google Pixel revealed that Gemini functions seamlessly and a user experience akin to that of more recent models.

Because users with older Android devices will now have access to the same AI capabilities as those with more recent models, the wider compatibility has important implications for them. Expanding Gemini’s support further demonstrates Google’s dedication to making advanced AI accessible to a larger segment of the Android user base.

Users of Android 10 and 11 can now access Gemini, and they can anticipate regular updates and new features. This action marks a significant turning point in Google’s AI development and opens the door for future functional and accessibility enhancements, improving everyone’s Android experience.

Continue Reading

Trending

error: Content is protected !!