

MM1, a Family of Multimodal AI Models with up to 30 billion Parameters, is being Developed by Apple Researchers



In a pre-print paper, Apple researchers presented their work on developing a multimodal large language model (LLM) for artificial intelligence (AI). The paper, published on an online portal on March 14, describes how the team achieved advanced multimodal capabilities by training the foundation model on both text-only data and images. The Cupertino-based tech giant’s new advances in AI follow CEO Tim Cook’s statement during the company’s earnings call that AI features might be released later this year.

The pre-print version of the research paper has been published on arXiv, an open-access online repository for scholarly papers. Papers posted there are not, however, peer reviewed. Although the paper itself makes no mention of Apple, the project is thought to be connected to the company because the majority of the researchers listed are affiliated with its machine learning (ML) division.

The project the researchers are working on is MM1, a family of multimodal models with up to 30 billion parameters. The paper’s authors referred to it as a “performant multimodal LLM (MLLM)” and noted that decisions about image encoders, the vision-language connector, and other architecture components and data choices were made in order to build an AI model that can comprehend both text and image-based inputs.

As an example, the paper states that achieving state-of-the-art (SOTA) few-shot results across multiple benchmarks, compared to other published pre-training results, requires a careful mix of image-caption, interleaved image-text, and text-only data for large-scale multimodal pre-training.

To put it simply, the AI model is presently in the pre-training phase and has not yet received enough training to produce the intended results. In this phase, the algorithm and AI architecture are used to design the model’s workflow and how it will eventually process data. The researchers at Apple were able to incorporate computer vision into the model by means of image encoders and a vision-language connector. After conducting tests using a combination of image-only, image-text, and text-only data sets, the team found that the outcomes were comparable to those of other models at the same stage.

Although this is a significant breakthrough, there is insufficient evidence in this research paper to conclude that Apple will integrate a multimodal AI chatbot into its operating system. It’s difficult to even say at this point whether the AI model is multimodal in terms of receiving inputs or producing output (i.e., whether it can produce AI images or not). However, it can be said that the tech giant has made significant progress toward developing a native generative AI foundation model if the results are verified to be consistent following peer review.


Biden, Kishida Secure Support from Amazon and Nvidia for $50 Million Joint AI Research Program



As the two countries seek to enhance cooperation around the rapidly advancing technology, President Joe Biden and Japanese Prime Minister Fumio Kishida have enlisted Amazon.com Inc. and Nvidia Corp. to fund a new joint artificial intelligence research program.

A senior US official briefed reporters prior to Wednesday’s official visit at the White House, stating that the $50 million project will be a collaborative effort between Tsukuba University outside of Tokyo and the University of Washington in Seattle. A separate collaborative AI research program between Carnegie Mellon University in Pittsburgh and Tokyo’s Keio University is also being planned by the two nations.

The push for greater research into artificial intelligence comes as the Biden administration is weighing a series of new regulations designed to minimize the risks of AI technology, which has emerged as a key focus for tech companies. The White House announced late last month that federal agencies have until the end of the year to determine how they will assess, test, and monitor the impact of government use of AI technology.

In addition to the university-led projects, Microsoft Corp. announced on Tuesday that it would invest $2.9 billion to expand its cloud computing and artificial intelligence infrastructure in Japan. Brad Smith, the president of Microsoft, met with Kishida on Tuesday. The company released a statement announcing its intention to establish a new AI and robotics lab in Japan.

Kishida, who leads Asia’s second-largest economy, urged American business executives on Tuesday to invest more in Japan’s developing technologies.

“Your investments will enable Japan’s economic growth — which will also be capital for more investments from Japan to the US,” Kishida said at a roundtable with business leaders in Washington.



OnePlus and OPPO Collaborate with Google to Introduce Gemini Models for Enhanced Smartphone AI



As anticipated, original equipment manufacturers, or OEMs, are heavily integrating AI into their products. Google is working with OnePlus, OPPO, and other companies to integrate Gemini models into their smartphones. As announced at the Google Cloud Next 24 event, OnePlus and OPPO intend to introduce the Gemini models on their smartphones later in 2024, becoming the first OEMs to do so. The Gemini models are designed to provide users with an enhanced artificial intelligence (AI) experience on their devices.

Thanks to OnePlus and OPPO’s generative AI models, customers in China can already create AI content on the go with devices like the OnePlus 12 and OPPO Find X7.

The AI Eraser tool was recently made available to all OnePlus customers worldwide. This AI-powered tool lets users remove unwanted objects from their photos. For OnePlus and OPPO, AI Eraser is only the beginning.

In the future, the businesses hope to add more AI-powered features like creating original social media content and summarizing news stories and audio.

AndesGPT LLM from OnePlus and OPPO powers AI Eraser. Even though the Samsung Galaxy S24 and Google Pixel 8 series already have this feature, it is still encouraging to see OnePlus and OPPO taking the initiative to include AI capabilities in their products.

With the release of the Gemini models, OnePlus and OPPO devices will be able to provide customers with a more comprehensive and sophisticated AI experience. It is worth remembering that OnePlus and OPPO already use AI and computational mathematics to enhance mobile photography and to power the Trinity Engine, which makes using their phones noticeably smoother.

Over the course of 2024, more original equipment manufacturers should bring AI capabilities to their products. This will probably help Google, because OEMs will use Gemini as the foundation upon which to build their features.



Meta Explores AI-Enabled Search Bar on Instagram



Meta is pressing ahead with its attempt to expand the user base for its generative AI-powered products. In addition to testing the Meta AI chatbot on WhatsApp with users in countries such as India, the company is experimenting with placing Meta AI in the Instagram search bar, both for chatting with the AI and for content discovery.

When you type a query into the search bar, Meta AI initiates a direct message (DM) exchange in which you can ask questions or respond to pre-programmed prompts. Aravind Srinivas, CEO of Perplexity AI, pointed out that the prompt screen’s design is similar to the startup’s search screen.

The feature might also make it easier for you to find fresh Instagram content. As demonstrated in a user-posted video on Threads, you can search for Reels related to a particular topic by tapping on a prompt such as “Beautiful Maui sunset Reels.”

Additionally, TechCrunch spoke with a few users who were able to ask Meta AI to search for Reels recommendations.

By using generative AI to surface new content from networks like Instagram, Meta hopes to go beyond text generation.

Meta confirmed its Instagram AI experiment to TechCrunch. However, the company didn’t say whether it uses generative AI technology for search.

A Meta representative told TechCrunch, “We’re testing a range of our generative AI-powered experiences publicly in a limited capacity. They are under development in varying phases.”

Plenty of posts online discuss the quality of Instagram search. It is therefore not surprising that Meta would want to improve search through the use of generative AI.

Meta also wants Instagram’s content to be easier to discover than TikTok’s. Last year, Google unveiled a new Perspectives feature to display results from Reddit and TikTok. According to reverse engineer Alessandro Paluzzi, who shared the finding on X earlier this week, Instagram is developing a feature called “Visibility off Instagram” that could allow posts to appear in search engine results.


