Technology

MM1, a Family of Multimodal AI Models with up to 30 billion Parameters, is being Developed by Apple Researchers

In a pre-print paper published online on March 14, Apple researchers presented their work on a multimodal large language model (LLM). The paper describes how the team achieved multimodal capabilities by training the foundation model on both text-only data and images. The Cupertino-based tech giant’s latest advance in AI follows CEO Tim Cook’s statement during a company earnings call that AI features might be released later this year.

The pre-print is hosted on arXiv, an open-access online repository for scholarly papers. Papers posted there are not peer reviewed. Although the paper itself makes no mention of Apple, the project is thought to be connected to the company because the majority of the listed researchers are affiliated with Apple’s machine learning (ML) division.

The project is MM1, a family of multimodal models with up to 30 billion parameters. The paper’s authors describe it as a “performant multimodal LLM (MLLM)” and note that building a model that can comprehend both text and image inputs required careful choices about image encoders, the vision-language connector, other architecture components, and training data.

The paper states: “We demonstrate that achieving state-of-the-art (SOTA) few-shot results across multiple benchmarks, compared to other published pre-training results, requires a careful mix of image-caption, interleaved image-text, and text-only data for large-scale multimodal pre-training.”

To put it simply, the model is still in the pre-training phase and has not yet received the further training needed to produce the intended results. During this phase, the model’s architecture and data-processing pipeline are worked out. The researchers at Apple incorporated computer vision into the model by means of image encoders and a vision-language connector. After testing with a mix of image-caption, interleaved image-text, and text-only data, the team found that the results were competitive with those of other published models at the same stage.

Although this is a significant breakthrough, the research paper does not provide enough evidence to conclude that Apple will integrate a multimodal AI chatbot into its operating system. At this point it is difficult to say even whether the model is multimodal only in the inputs it accepts or also in the outputs it produces (i.e., whether it can generate images). However, if the results hold up under peer review, the tech giant will have made significant progress toward developing a native generative AI foundation model.

AI Features of the Google Pixel 8a Leaked before the Device’s Planned Release

Google is expected to unveil a new smartphone at its I/O conference on May 14–15. The forthcoming device, dubbed the Pixel 8a, will be a pared-down version of the Pixel 8. Although it has been spotted online frequently, the company has not yet officially announced it. Just weeks before the anticipated release, a leaked promotional video showcases the Pixel 8a’s AI features. Other online leaks have revealed details of software support and special features.

Tipster Steve Hemmerstoffer obtained a promotional video for the Pixel 8a through MySmartPrice. The forthcoming smartphone is expected to include certain Pixel-only features, some of which are demonstrated in the video. According to the video, the Pixel 8a will support Google’s Best Take feature, which combines several group or burst photos to replace faces that have their eyes closed or display undesirable expressions.

The Pixel 8a will support Circle to Search, a feature currently available on some Pixel and Samsung Galaxy smartphones. The leaked video also implies that the smartphone will come with Google’s Audio Magic Eraser, an artificial intelligence (AI) tool for removing unwanted background noise from recorded videos. The video further shows the Pixel 8a supporting live translation during voice calls.

According to the leaked teasers, the phone will have the Tensor G3 chip and “seven years of security updates.” It is unclear, though, whether the phone will get the same number of Android OS updates as the more expensive Pixel 8 series phones, which use the same processor. The company is expected to disclose additional information about the device in the days before its planned May 14 launch.

Apple Unveils a new Artificial Intelligence Model Compatible with Laptops and Phones

All of the major tech companies, with the exception of Apple, have made their generative AI models available for commercial use. The company is nevertheless actively working in that area. On Wednesday, its researchers released Open-source Efficient Language Models (OpenELM), a collection of four very compact language models, on the Hugging Face model library. According to the company, OpenELM works very well for text-related tasks such as composing emails. The models are open source and ready for developers to use.

Compared with models from other tech giants such as Microsoft and Google, the OpenELM models are extremely small. Apple’s latest models have 270 million, 450 million, 1.1 billion, and 3 billion parameters. By comparison, Google’s Gemma model has 2 billion parameters and Microsoft’s Phi-3 model has 3.8 billion. Smaller models require less power to operate and can run on phones and laptops.
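
As a rough illustration of why parameter count matters on phones and laptops, the sketch below estimates how much memory each model's weights alone would occupy. It assumes 16-bit weights (2 bytes per parameter), which is an assumption made here for illustration and is not stated in the article. It ignores activations, caches, and any compression, so the figures are only a lower-bound guide. The function name `weight_memory_gb` is hypothetical.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption (not from the article): weights are stored as 16-bit
# values, i.e. 2 bytes each. Real deployments may use other formats.
BYTES_PER_PARAM = 2

# Parameter counts as reported in the article.
MODELS = {
    "OpenELM 270M": 270_000_000,
    "OpenELM 450M": 450_000_000,
    "OpenELM 1.1B": 1_100_000_000,
    "OpenELM 3B": 3_000_000_000,
    "Gemma 2B": 2_000_000_000,
    "Phi-3 3.8B": 3_800_000_000,
}


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        print(f"{name}: ~{weight_memory_gb(params):.2f} GB of weights")
```

Under this assumption, the estimate scales linearly with parameter count, so the smallest OpenELM model needs roughly a tenth of the weight storage of the largest one.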

Apple CEO Tim Cook hinted in February that generative AI features would be coming to Apple products, saying that the company has been working on the project for a long time. No further details about the AI features are available, however.

Apple, meanwhile, has announced that it will hold an event this month to introduce a few new products. The company has already begun sending media invites to the “special Apple Event” on May 7 at 7 AM PT (7:30 PM IST). The invite’s image, which shows an Apple Pencil, suggests that the event will primarily focus on iPads.

It seems that Apple will host the event entirely online, following in the footsteps of October’s “Scary Fast” event. Every invitation Apple has sent out indicates that viewers will be able to watch the event online, and no invitations to an in-person event have been distributed so far.

OpenELM is not the first AI model Apple has released. The company previously released the MGIE image editing model, which enables users to edit photos using prompts.

Google Expands the Availability of AI Support with Gemini AI to Android 10 and 11

Google’s Gemini AI, previously limited to Android 12 and above, is now compatible with Android 10 and 11. As noted by 9to5Google, this change greatly expands the pool of users who can take advantage of AI-powered support on their tablets and smartphones.

With a recent app update, Google has lowered the minimum Android version that Gemini requires, making its AI features accessible to a wider range of users. Previously, Gemini required Android 12 or later. The updated Gemini app, version v1.0.626720042, which can be downloaded from the Google Play Store, can now be installed and used on Android 10 devices.

This expansion was first mentioned by Sumanta Das on X and then further highlighted by Artem Russakovskii. When Gemini was first released earlier this year, it was compatible only with the most recent versions of Android. The update reflects Google’s aim of making its AI technology available to more users.

According to testers using Android 10 devices, Gemini is fully operational once the Google app and Play Services have been updated. Tests on a Google Pixel running Android 10 found that Gemini functions seamlessly and offers a user experience akin to that on more recent devices.

The wider compatibility matters most to users with older Android devices, who will now have access to the same AI capabilities as those with more recent models.

Users of Android 10 and 11 can now access Gemini, and they can anticipate regular updates and new features. This action marks a significant turning point in Google’s AI development and opens the door for future functional and accessibility enhancements, improving everyone’s Android experience.
