Technology

Gen AI without the dangers

Published

November 28, 2023

Komal

It’s understandable that generative AI tools such as ChatGPT, Stable Diffusion, and DreamStudio are making headlines. The results are striking and improving at a geometric rate. Intelligent assistants are already revolutionizing search and information analysis, as well as code creation, network security, and article writing.

Gen AI will play a critical role in how businesses run and provide IT services, as well as how business users complete their tasks. There are countless options, but there are also countless dangers. Successful AI development and implementation can be a costly and risky process. Furthermore, the workloads associated with Gen AI and the large language models (LLMs) that drive it are extremely computationally demanding and energy-intensive. Dr. Sajjad Moazeni of the University of Washington estimates that training an LLM with 175 billion or more parameters consumes as much energy as 1,000 US households use in a year, though exact figures are unknown. Answering more than 100 million generative AI queries a day uses about one gigawatt-hour of electricity, roughly the daily energy use of 33,000 US households.
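
To put those estimates in per-query terms, the arithmetic can be sketched in a few lines of Python. The three constants are the figures quoted above; the function names are illustrative, and the results are only as reliable as the underlying estimates.

```python
# Figures quoted in the article; these are estimates, not measurements.
DAILY_QUERIES = 100_000_000        # generative AI queries answered per day
DAILY_ENERGY_WH = 1_000_000_000    # one gigawatt-hour, expressed in watt-hours
HOUSEHOLDS = 33_000                # US households with equivalent daily use


def energy_per_query_wh(total_wh: float, queries: int) -> float:
    """Average energy per query, in watt-hours."""
    return total_wh / queries


def energy_per_household_kwh(total_wh: float, households: int) -> float:
    """Implied daily energy use per household, in kilowatt-hours."""
    return total_wh / households / 1000


if __name__ == "__main__":
    print(energy_per_query_wh(DAILY_ENERGY_WH, DAILY_QUERIES))            # Wh per query
    print(round(energy_per_household_kwh(DAILY_ENERGY_WH, HOUSEHOLDS), 1))  # kWh per household per day
```

Under these figures, each query averages about 10 watt-hours, and the comparison implies a household using roughly 30 kilowatt-hours a day.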

How even hyperscalers can afford that much electricity is beyond me. It’s too expensive for the typical business. How can CIOs provide reliable, accurate AI without incurring the energy expenses and environmental impact of a small city?

Six pointers for implementing Gen AI economically and with less risk

Retraining generative AI to perform particular tasks is essential to its applicability in business settings. Expert models produced by retraining are smaller, more accurate, and require less processing power. So, in order to train their own AI models, does every business need to establish a specialized AI development team and a supercomputer? Not at all.

Here are six strategies to create and implement AI without spending a lot of money on expensive hardware or highly skilled personnel.

Start with a foundation model rather than reinventing the wheel.

A company might spend money creating custom models for its own use cases. But the expenditure on data scientists, HPC specialists, and supercomputing infrastructure is out of reach for all but the biggest government organizations, businesses, and hyperscalers.

Rather, begin with a foundation model that features a robust application portfolio and an active developer ecosystem. You could use an open-source model like Meta’s Llama 2, or a proprietary model like OpenAI’s ChatGPT. Hugging Face and other communities provide a vast array of open-source models and applications.

Align the model with the intended use

Models can be broadly applicable and computationally demanding, such as GPT, or more narrowly focused, like Med-BERT (an open-source LLM for medical literature). Choosing the appropriate model early in the project can shorten the time it takes to create a viable prototype and avoid months of training.

However, exercise caution. Any model may reflect biases in the data it was trained on, and generative AI models are capable of lying outright and fabricating responses. For optimal trustworthiness, seek models trained on clean, transparent data with well-defined governance and explainable decision-making.

Retrain to produce more accurate, smaller models

Retraining foundation models on particular datasets offers various advantages. The model sheds parameters it doesn’t need for the application as it becomes more accurate in a narrower field. Retraining an LLM on financial information, for example, would trade a general skill like songwriting for the ability to assist a customer with a mortgage application.

With a more compact design, the new banking assistant would still be able to provide superb, extremely accurate services while running on standard (existing) hardware.

Make use of your current infrastructure

A supercomputer with 10,000 GPUs is too big for most businesses to set up. Fortunately, most practical AI training, retraining, and inference can be done without large GPU arrays.

Training up to 10 billion parameters: Contemporary CPUs with integrated AI acceleration can handle training loads in this range at competitive price/performance points. For better performance and lower costs, train overnight when data center demand is low.
Retraining up to 10 billion parameters: Modern CPUs can retrain models in this range in minutes, with no GPU needed.
Inference from millions to under 20 billion parameters: Smaller models can run on standalone edge devices with integrated CPUs. For models with fewer than 20 billion parameters, such as Llama 2, CPUs can respond as quickly and accurately as GPUs.
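
These rules of thumb can be written down as a small sizing check. This is a minimal sketch that encodes only the thresholds stated above; the function name and workload labels are hypothetical, and real sizing would also depend on the specific CPU, memory, and latency targets.

```python
# Thresholds taken from the guidance above (parameter counts).
CPU_TRAINING_LIMIT = 10_000_000_000    # training and retraining: up to 10 billion
CPU_INFERENCE_LIMIT = 20_000_000_000   # inference: fewer than 20 billion


def cpu_is_candidate(workload: str, parameters: int) -> bool:
    """Return True if the article's guidance suggests a CPU can handle the job.

    `workload` is one of "training", "retraining", or "inference"
    (hypothetical labels chosen for this sketch).
    """
    if workload in ("training", "retraining"):
        return parameters <= CPU_TRAINING_LIMIT
    if workload == "inference":
        return parameters < CPU_INFERENCE_LIMIT
    raise ValueError(f"unknown workload: {workload!r}")


if __name__ == "__main__":
    # A 7-billion-parameter model falls inside both ranges.
    print(cpu_is_candidate("retraining", 7_000_000_000))
    # A 175-billion-parameter model does not.
    print(cpu_is_candidate("training", 175_000_000_000))
```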

Execute inference with consideration for hardware

Applications for inference can be fine-tuned and optimized for improved performance on particular hardware configurations and features. Similar to training a model, optimizing one for a given application means striking a balance between processing efficiency, model size, and accuracy.

For example, quantizing a 32-bit floating-point model to 8-bit integers (INT8) can increase inference speeds about fourfold while largely maintaining accuracy. Tools such as the Intel® Distribution of OpenVINO™ toolkit manage the optimization and build hardware-aware inference engines that take advantage of host accelerators such as integrated GPUs, Intel® Advanced Matrix Extensions (Intel® AMX), and Intel® Advanced Vector Extensions 512 (Intel® AVX-512).
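
The idea behind INT8 quantization can be illustrated without any toolkit. The sketch below assumes the simplest scheme, symmetric quantization with a single scale for the whole tensor; it is not how OpenVINO or any particular tool implements it, and the function names are illustrative.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the INT8 representation."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    # Each restored value differs from the original by at most half a step.
    print(q, scale, restored)
```

Each value now needs 8 bits instead of 32, which is where the memory and bandwidth savings come from; the price is a rounding error of at most half a quantization step per value.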

Monitor cloud utilization

Offering AI services through cloud-based AI applications and APIs is a quick, dependable, and scalable route. Customers and business users alike benefit from always-on AI from a service provider, but costs can rise suddenly. If your AI service is popular, everyone will use it.

Many businesses that began their AI journeys entirely in the cloud are moving workloads that can run well elsewhere back to on-premises and co-located infrastructure. For cloud-native enterprises with minimal or no on-premises infrastructure, pay-as-you-go infrastructure-as-a-service is becoming a competitive alternative to rising cloud costs.
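
One way to keep an eye on this is to compare usage-based cloud spending against the fixed monthly cost of running the same workload elsewhere. The sketch below is a simplified break-even calculation; the prices in the usage example are made-up placeholders, not quotes from any provider.

```python
def monthly_cloud_cost(requests: int, price_per_1k_requests: float) -> float:
    """Usage-based cost for one month of API traffic."""
    return requests / 1000 * price_per_1k_requests


def break_even_requests(fixed_monthly_cost: float, price_per_1k_requests: float) -> float:
    """Monthly request volume at which usage-based cost equals the fixed cost."""
    return fixed_monthly_cost / price_per_1k_requests * 1000


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    price = 0.50           # dollars per 1,000 requests
    fixed_cost = 5000.0    # dollars per month for owned or leased infrastructure
    print(monthly_cloud_cost(2_000_000, price))
    print(break_even_requests(fixed_cost, price))
```

Below the break-even volume the pay-per-use service is cheaper; above it, fixed infrastructure starts to win, provided it can actually serve the load.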

You have choices when it comes to Gen AI. Generative AI is surrounded by a lot of hype and mystery, giving the impression that it’s a cutting-edge technology accessible only to the wealthiest companies. In fact, hundreds of high-performance models, including LLMs for generative AI, run accurately and efficiently on a typical CPU-based data center or cloud instance. Enterprise-grade tools for generative AI experimentation, prototyping, and deployment are developing rapidly in both open-source and proprietary communities.

By utilizing all of their resources, astute CIOs can leverage AI that transforms businesses without incurring the expenses and hazards associated with in-house development.

Technology

OpenAI Launches SearchGPT, a Search Engine Driven by AI

Published

July 26, 2024

Archana Suryawanshi

OpenAI is announcing the highly anticipated launch of SearchGPT, an AI-powered search engine that provides real-time access to information on the internet.

The search engine opens with a large text box that asks, “What are you looking for?” However, rather than just returning a bare list of links, SearchGPT attempts to organize and make sense of them. In one example from OpenAI, the search engine summarizes its findings about music festivals, followed by short descriptions of the events and an attribution link.

Another example explains when to plant tomatoes before breaking down the different varieties of the plant. Once the results are displayed, you can ask follow-up questions or click the sidebar to open other relevant links.

At present, SearchGPT is merely a “prototype.” According to OpenAI spokesperson Kayla Wood, the service, which is powered by the GPT-4 family of models, will initially be available to only 10,000 test users. Wood says OpenAI uses direct content feeds and works with outside partners to build its search results. Eventually, the search features are meant to be integrated directly into ChatGPT.

It’s the beginning of what may grow into a significant challenge to Google, which has hurried to build AI capabilities into its search engine out of concern that users might flock to rivals that offer the tools first. It also puts OpenAI in more direct competition with Perplexity, a company that markets itself as an AI “answer” engine. Publishers have recently accused Perplexity of outright copying their work through an AI summary tool.

OpenAI claims to be taking a notably different approach, suggesting that it has noticed the backlash. In a blog post, the company emphasized that SearchGPT was developed in collaboration with a number of news partners, including organizations such as Vox Media, the parent company of The Verge, and the owners of The Wall Street Journal and The Associated Press. “News partners gave valuable feedback, and we continue to seek their input,” says Wood.

According to the company, publishers will be able to “manage how they appear in OpenAI search features.” They can still appear in search results even if they opt out of having their content used to train OpenAI’s models.

According to OpenAI’s blog post, “SearchGPT is designed to help users connect with publishers by prominently citing and linking to them in searches.” “Responses have clear, in-line, named attribution and links so users know where information is coming from and can quickly engage with even more results in a sidebar with source links.”

Releasing its search engine as a prototype helps OpenAI in several ways. It is also possible for such a tool to miscredit sources or even plagiarize entire articles, as Perplexity was accused of doing.

There have been rumblings about this new product for several months: The Information reported on its development in February, and Bloomberg reported further details in May. Some X users also spotted a new website OpenAI has been developing that referenced the move.

OpenAI has been gradually bringing ChatGPT closer to the real-time web. When GPT-3.5 was released, the model’s knowledge was already months out of date. Last September, OpenAI introduced Browse with Bing, a way for ChatGPT to browse the internet, but it appears far less sophisticated than SearchGPT.

OpenAI’s rapid progress has brought millions of users to ChatGPT, but the company’s costs are mounting. According to a report published in The Information this week, OpenAI’s AI training and inference costs could total $7 billion this year. The millions of people using ChatGPT’s free tier will push compute costs higher still. SearchGPT will be free at launch. However, it currently appears to carry no advertisements, so the company will need to find a way to make money from it soon.

Technology

Google Drops Its Plan to Remove Third-Party Cookies from Chrome

Published

July 23, 2024

Archana Suryawanshi

After years of delays, Google has announced that it no longer plans to remove and replace third-party cookies in its Chrome web browser.

Cookies are small text files that websites place in a user’s browser and that can follow the user as they visit other websites. This practice, which makes it possible to track people across many websites in order to target ads, has powered a large portion of the digital advertising ecosystem.

Google stated in 2020 that it would stop supporting these cookies by the beginning of 2022, once it had determined how to meet the needs of users, publishers, and advertisers and had developed tools to mitigate workarounds.

To do this, Google started the “Privacy Sandbox” project in an effort to find a way to safeguard user privacy while keeping content freely accessible on the open web.

In January 2021, Google declared that it was “extremely confident” in the progress of its plans to replace cookies. One such proposal was “Federated Learning of Cohorts,” which would essentially group individuals based on similar browsing habits, so that only “cohort IDs,” rather than individual user IDs, would be used to target them.
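
The cohort idea can be illustrated with a toy example. The sketch below assumes a SimHash-style locality-sensitive hash over the set of visited domains, which is an assumption made for illustration rather than a detail from this article, and it is in no way Google’s implementation. The domain names and the 8-bit cohort size are hypothetical.

```python
import hashlib
from typing import Iterable, List

COHORT_BITS = 8  # hypothetical: 2**8 = 256 possible cohorts


def _domain_bits(domain: str) -> List[int]:
    """Derive COHORT_BITS pseudo-random bits from a domain name."""
    digest = hashlib.sha256(domain.encode("utf-8")).digest()
    return [(digest[0] >> i) & 1 for i in range(COHORT_BITS)]


def cohort_id(domains: Iterable[str]) -> int:
    """Collapse a browsing history into a small cohort ID (SimHash-style).

    Each domain votes +1 or -1 on every bit; the sign of each total
    becomes one bit of the cohort ID. Histories that share most domains
    tend to agree on most bits, so they tend to share a cohort.
    """
    totals = [0] * COHORT_BITS
    for domain in set(domains):
        for i, bit in enumerate(_domain_bits(domain)):
            totals[i] += 1 if bit else -1
    result = 0
    for i, total in enumerate(totals):
        if total > 0:
            result |= 1 << i
    return result


if __name__ == "__main__":
    history = ["news.example", "recipes.example", "travel.example"]
    print(cohort_id(history))  # an integer between 0 and 255
```

Only the small cohort number would be exposed for ad targeting, not the list of sites that produced it.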

However, Google extended the deadline in June 2021 to give the digital advertising sector more time to finalize plans for better-targeted ads that respect user privacy. Then, in 2022, the company said feedback had shown that advertisers needed still more time to make the switch to Google’s cookie replacement, after some had resisted, arguing that it would have a major negative impact on their businesses.

The company announced in a blog post on Monday that feedback from regulators and advertisers had influenced its latest decision to abandon its plan to remove third-party cookies from its browser.

According to the firm, testing revealed that the change would affect publishers, advertisers, and pretty much everyone involved in internet advertising and would require “significant work by many participants.”

Anthony Chavez, vice president of Privacy Sandbox, commented, “Instead of deprecating third-party cookies, we would introduce a new experience in Chrome that lets people make an informed choice that applies across their web browsing, and they’d be able to adjust that choice at any time.” “We’re discussing this new path with regulators and will engage with the industry as we roll it out.”

Technology

Samsung Galaxy Buds 3 Pro Launch Postponed Because of Problems with Quality Control

Published

July 20, 2024

Archana Suryawanshi

At its Unpacked presentation on July 10, Samsung debuted its newest flagship earbuds, the Galaxy Buds 3 Pro, alongside the Galaxy Z Fold 6, Flip 6, and Galaxy Watch 7. As with its other products, the company began taking preorders for the earbuds immediately after the event, with retail sales set to begin on July 24. However, the Korean giant has been forced to postpone the release of the Galaxy Buds 3 Pro and delay preorder deliveries due to quality control concerns.

Unlike in the rest of the world, the Galaxy Buds 3 Pro went on sale earlier this week in South Korea, Samsung’s home market. However, reports of quality control problems quickly surfaced. These included loose case hinges, earbud joints that did not sit flush, blue dye blotches, and scratches or scuffs on the case cover. The issues appear to be limited to the white Buds 3 Pro; the silver units seem unaffected.

According to a Reddit user, Samsung sent out an email halting sales of the Galaxy Buds 3 Pro. The problems appear to stem from inadequate quality control inspections at Samsung. Numerous user complaints can also be found on the company’s Korean community forum, where one customer claims that the company will tighten quality control and resume sales of the earbuds on July 24.

A Samsung spokesperson stated: “There have been reports relating to a limited number of early production Galaxy Buds 3 Pro devices. We are taking this matter very seriously and remain committed to meeting the highest quality standards of our products. We are urgently assessing and enhancing our quality control processes.”

“To ensure all products meet our quality standards, we have temporarily suspended deliveries of Galaxy Buds 3 Pro devices to distribution channels to conduct a full quality control evaluation before shipments to consumers take place. We sincerely apologize for any inconvenience this may cause.”

Should Korean customers encounter problems with their Buds 3 Pro devices after they have already received them, they should bring them to the closest service center for a replacement.

Possible postponement of the US debut of the Galaxy Buds 3 Pro

Samsung seems to have pushed back the launch date and (some) preorder deliveries of the Galaxy Buds 3 Pro in the US and other markets by one month. If your order is still scheduled for July, inspect your earbuds carefully upon delivery to make sure they have no quality control issues.

On the company’s US store, the Buds 3 Pro are currently scheduled for delivery in late August, one month after the launch date. Additionally, Best Buy no longer takes preorders for the earbuds, and Amazon no longer lists them for sale.

The standard Galaxy Buds 3 are not affected by the quality control issues and are still scheduled for delivery by July 24, the day of launch. Early Galaxy Buds 3 Pro customers have also reported that the ear tips tear easily when being removed. Samsung’s delay, though, doesn’t seem to be related to that issue.