Technology

Gen AI without the dangers

It’s understandable that generative AI tools such as ChatGPT, Stable Diffusion, and DreamStudio are making headlines. The results are striking and improving at a geometric rate. Intelligent assistants are already transforming search and information analysis, code creation, network security, and article writing.

Gen AI will play a critical role in how businesses run and provide IT services, as well as how business users complete their tasks. There are countless options, but there are also countless dangers. Successful AI development and implementation can be a costly and risky process. Furthermore, the workloads associated with Gen AI and the large language models (LLMs) that drive it are extremely computationally demanding and energy-intensive. Dr. Sajjad Moazeni of the University of Washington estimates that training an LLM with 175 billion or more parameters takes as much energy as 1,000 US households use in a year, though exact figures are unknown. Answering more than 100 million generative AI queries a day consumes about one gigawatt-hour of electricity, roughly the daily energy use of 33,000 US households.
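As a rough sanity check, the arithmetic behind those figures can be sketched in a few lines of Python. The only inputs are the numbers quoted above (100 million queries, one gigawatt-hour, 33,000 households); the function and variable names are illustrative, and the per-query and per-household values are derived from the comparison, not measured.

```python
# Back-of-the-envelope check of the figures quoted in the article.
# Inputs come from the text; everything else is derived arithmetic.
QUERIES_PER_DAY = 100_000_000      # "more than 100 million" queries a day
DAILY_ENERGY_WH = 1_000_000_000    # one gigawatt-hour, expressed in watt-hours
HOUSEHOLDS = 33_000                # US households cited as the equivalent


def energy_per_query_wh(daily_energy_wh: float, queries_per_day: int) -> float:
    """Average energy per query, in watt-hours."""
    return daily_energy_wh / queries_per_day


def implied_household_kwh_per_day(daily_energy_wh: float, households: int) -> float:
    """Daily household consumption implied by the comparison, in kWh."""
    return daily_energy_wh / households / 1_000


per_query = energy_per_query_wh(DAILY_ENERGY_WH, QUERIES_PER_DAY)
per_household = implied_household_kwh_per_day(DAILY_ENERGY_WH, HOUSEHOLDS)
print(f"{per_query:.1f} Wh per query")
print(f"{per_household:.1f} kWh per household per day")
```

On these numbers, each query averages about 10 watt-hours, and the comparison implies roughly 30 kilowatt-hours per household per day.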

How even hyperscalers can afford that much electricity is beyond me. It’s too expensive for the typical business. How can CIOs provide reliable, accurate AI without incurring the energy expenses and environmental impact of a small city?

Six pointers for implementing Gen AI economically and with less risk

Retraining generative AI to perform particular tasks is essential to its applicability in business settings. Expert models produced by retraining are smaller, more accurate, and require less processing power. So, in order to train their own AI models, does every business need to establish a specialized AI development team and a supercomputer? Not at all.

Here are six strategies to create and implement AI without spending a lot of money on expensive hardware or highly skilled personnel.

Start with a foundation model rather than reinventing the wheel

A company might spend money creating custom models for its own use cases. But the expenditure on data scientists, HPC specialists, and supercomputing infrastructure is out of reach for all but the biggest government organizations, businesses, and hyperscalers.

Rather, begin with a foundation model that features a robust application portfolio and an active developer ecosystem. You could use an open-source model like Meta’s Llama 2, or a proprietary model like OpenAI’s ChatGPT. Hugging Face and other communities provide a vast array of open-source models and applications.

Align the model with the intended use

Models can be broadly applicable and computationally demanding, such as GPT, or more narrowly focused, like Med-BERT (an open-source LLM for medical literature). Choosing the appropriate model early in the project can shorten the time to a viable prototype and avoid months of training.

However, exercise caution. Any model may reflect biases in its training data, and generative AI models can fabricate responses and state falsehoods outright. For optimal trustworthiness, seek models trained on clean, transparent data with well-defined governance and explainable decision-making.

Retrain to produce more accurate, smaller models

Retraining foundation models on particular datasets offers various advantages. As the model becomes more accurate in a narrower domain, it sheds parameters the application doesn’t need. For example, retraining an LLM on financial information would trade a general skill like songwriting for the ability to assist a customer with a mortgage application.

With a more compact design, the new banking assistant would still be able to provide superb, extremely accurate services while operating on standard (current) hardware.

Make use of your current infrastructure

A supercomputer with 10,000 GPUs is too big for most businesses to set up. Fortunately, most practical AI training, retraining, and inference can be done without large GPU arrays.

  • Training models of up to 10 billion parameters: contemporary CPUs with integrated AI acceleration can manage training loads in this range at competitive price/performance points. For better performance and lower costs, train overnight during periods of low data center demand.
  • Retraining models of up to 10 billion parameters: modern CPUs can handle it without a GPU, and it takes only minutes.
  • Inference on models ranging from millions to fewer than 20 billion parameters: smaller models can run on standalone edge devices with integrated CPUs. For models with fewer than 20 billion parameters, such as Llama 2, CPUs can respond as quickly and accurately as GPUs.

Execute inference with consideration for hardware

Applications for inference can be fine-tuned and optimized for improved performance on particular hardware configurations and features. Similar to training a model, optimizing one for a given application means striking a balance between processing efficiency, model size, and accuracy.

One way to increase inference speeds up to four times while maintaining accuracy is to quantize a model from 32-bit floating point to 8-bit integers (INT8). Tools such as the Intel® Distribution of OpenVINO™ toolkit manage this optimization and build hardware-aware inference engines that use host accelerators such as integrated GPUs, Intel® Advanced Matrix Extensions (Intel® AMX), and Intel® Advanced Vector Extensions 512 (Intel® AVX-512).
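To make the idea of INT8 quantization concrete, here is a minimal sketch in plain Python. It uses simple symmetric quantization with a single shared scale, which is an assumption chosen for illustration; it is not how OpenVINO or any particular toolkit implements it, and the weight values and function names are hypothetical.

```python
# Minimal sketch of symmetric per-tensor INT8 quantization in pure Python.
# Illustrative only: production toolkits calibrate on real data and typically
# quantize per layer or per channel.


def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.82, -1.27, 0.003, 0.5, -0.44]   # hypothetical FP32 weights
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(f"scale={scale:.4f}, max round-trip error={max_error:.4f}")
```

Each weight shrinks from four bytes to one, and the round-trip error stays within half of the scale step. That trade of a little precision for much less data is what makes the speed and memory gains possible.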

Monitor cloud utilization

Offering AI services through cloud-based AI applications and APIs is a quick, dependable, and scalable route. Customers and business users alike benefit from always-on AI from a service provider, but costs can rise suddenly. If your AI service proves popular, everyone will use it.

Many businesses that began their AI journeys entirely in the cloud are moving workloads that can run well there back to on-premises and co-located infrastructure. For cloud-native enterprises with minimal or no on-premises infrastructure, pay-as-you-go infrastructure-as-a-service is becoming a competitive alternative to rising cloud costs.

You have choices when it comes to Gen AI. Generative AI is surrounded by a lot of hype and mystery, giving the impression that it’s a cutting-edge technology that’s only accessible to the wealthiest companies. Actually, on a typical CPU-based data center or cloud instance, hundreds of high-performance models, including LLMs for generative AI, are accurate and performant. Enterprise-grade generative AI experimentation, prototyping, and deployment tools are rapidly developing in both open-source and proprietary communities.

By utilizing all of their resources, astute CIOs can leverage AI that transforms businesses without incurring the expenses and hazards associated with in-house development.

AI Features of the Google Pixel 8a Leaked before the Device’s Planned Release

Google is expected to unveil a new smartphone during its May 14–15 I/O conference. The forthcoming device, dubbed Pixel 8a, will be a toned-down version of the Pixel 8. Although the smartphone has been spotted online frequently, the company has not yet announced it officially. Just weeks before its anticipated release, a leaked promotional video showcases the Pixel 8a’s AI features. Leaks have also disclosed software support details and special features.

Tipster Steve Hemmerstoffer obtained a promotional video for the Pixel 8a through MySmartPrice. The video demonstrates some of the Pixel-only features the forthcoming smartphone is expected to include. According to the video, the Pixel 8a will support Google’s Best Take feature, which draws on multiple group or burst photos to replace faces that have their eyes closed or show undesirable expressions.

The Pixel 8a will support Circle to Search, a feature currently available on some Pixel and Samsung Galaxy smartphones. The leaked video also implies that the smartphone will come with Google’s Audio Magic Eraser, an artificial intelligence (AI) tool for removing unwanted background noise from recorded videos. In addition, as shown in the video, the Pixel 8a will support live translation during voice calls.

According to the leaked teasers, the phone will have “seven years of security updates” and the Tensor G3 chip. It is unclear, though, whether the phone will get the same number of Android OS updates as the more expensive Pixel 8 series phones that have the same processor. The company is expected to disclose more information about the device in the days before its planned May 14 launch.

Apple Unveils a new Artificial Intelligence Model Compatible with Laptops and Phones

All of the major tech companies, with the exception of Apple, have made their generative AI models available for use in commercial settings. The company is nevertheless actively working in that area. On Wednesday, its researchers released Open-source Efficient Language Models (OpenELM), a collection of four very compact language models, on the Hugging Face model library. According to the company, OpenELM works very well for text-related tasks like composing emails. The company has kept the models open source, and they are now available to developers.

Compared with models from other tech giants like Microsoft and Google, Apple’s models are extremely small. The latest models have 270 million, 450 million, 1.1 billion, and 3 billion parameters. By comparison, Google’s Gemma model has 2 billion parameters, and Microsoft’s Phi-3 model has 3.8 billion. Smaller models can run on phones and laptops and require less power to operate.
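A rough way to see why parameter count matters for phones and laptops is to estimate the memory needed just to hold the weights. The sketch below multiplies the parameter counts quoted above by an assumed storage width; the 2-byte (16-bit) and 1-byte (8-bit) widths and the dictionary labels are assumptions for illustration, not figures published by Apple, Google, or Microsoft, and the estimate ignores activations and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts are from the article; the storage widths are assumed.
MODEL_PARAMS = {
    "OpenELM (270M)": 270_000_000,
    "OpenELM (450M)": 450_000_000,
    "OpenELM (1.1B)": 1_100_000_000,
    "OpenELM (3B)": 3_000_000_000,
    "Gemma (2B)": 2_000_000_000,
    "Phi-3 (3.8B)": 3_800_000_000,
}


def weight_memory_gb(params: int, bytes_per_param: int) -> float:
    """Memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return params * bytes_per_param / 1e9


for name, params in MODEL_PARAMS.items():
    at_16_bit = weight_memory_gb(params, 2)
    at_8_bit = weight_memory_gb(params, 1)
    print(f"{name}: {at_16_bit:.2f} GB at 16-bit, {at_8_bit:.2f} GB at 8-bit")
```

Under these assumptions, the smallest model needs well under a gigabyte for its weights, while the 3-billion-parameter model needs several.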

Apple CEO Tim Cook hinted in February at the impending release of generative AI features on Apple products. He said that Apple has been working on this project for a long time. However, no further details about the AI features are available.

Apple, meanwhile, has announced an event this month to introduce new products. The company has already begun sending media invites to the “special Apple Event” on May 7 at 7 AM PT (7:30 PM IST). The invite’s image, which shows an Apple Pencil, suggests that the event will focus primarily on iPads.

It seems that Apple will host the event entirely online, following in the footsteps of October’s “Scary Fast” event. Every invitation Apple has sent out implies that viewers will be able to watch the event online. Invitations to an in-person event have not yet been distributed.

OpenELM is not Apple’s first AI model release. The company previously released MGIE, an image editing model that lets users edit photos using prompts.

Continue Reading

Google Expands the Availability of AI Support with Gemini AI to Android 10 and 11

Android 10 and 11 are now compatible with Google’s Gemini AI, which was previously limited to Android 12 and above. As noted by 9to5Google, this change greatly expands the pool of users who can take advantage of AI-powered support on their tablets and smartphones.

With a recent app update, Google has lowered the minimum requirement for Gemini, which previously needed Android 12 or later, making its advanced AI features accessible to a wider range of users. The updated Gemini app (v1.0.626720042) can be downloaded from the Google Play Store and installed on Android 10 devices.

This expansion was first mentioned by Sumanta Das on X and then further highlighted by Artem Russakovskii. When Gemini was first released earlier this year, it was compatible only with the most recent versions of Android. The update reflects Google’s goal of making its AI technology more inclusive and expanding its user base.

According to testers using Android 10 devices, Gemini is fully operational once the Google app and Play Services are updated. Tests conducted on a Google Pixel running Android 10 showed that Gemini functions seamlessly and offers a user experience akin to that on more recent models.

The wider compatibility matters most to users of older Android devices, who now have access to the same AI capabilities as those with more recent models.

Users of Android 10 and 11 can now access Gemini, and they can anticipate regular updates and new features. This action marks a significant turning point in Google’s AI development and opens the door for future functional and accessibility enhancements, improving everyone’s Android experience.
