Technology

Quantization of models and the emergence of edge AI

Published

December 26, 2023

Komal

The convergence of edge computing and artificial intelligence has the potential to transform numerous industries. In this context, the rapid development of model quantization, a method that reduces model size and improves portability to enable faster computation, is crucial.

When paired with appropriate methods and tools, edge AI has the potential to completely change how we interact with data and data-driven applications.

Why edge AI?

The goal of edge AI is to bring data processing and models closer to where data is generated, whether that is a remote server, tablet, IoT device, or smartphone. This enables real-time, low-latency AI. Gartner predicts that by 2025, more than half of all data analysis by deep neural networks will happen at the edge. This paradigm shift will bring several benefits.

Decreased latency:

Edge AI eliminates the need to send data back and forth to the cloud by processing data directly on the device. This is critical for applications that rely on real-time data and need quick responses.

Decreased complexity and costs:

Processing data locally at the edge eliminates costly data transfers to and from the cloud.

Privacy preservation:

Data stays on the device, minimizing the security risks of data transmission and data leakage.

Improved scalability:

With edge AI's decentralized approach, applications can scale more easily without depending on a central server for processing power.

For instance, manufacturers can integrate edge AI into defect detection, quality control, and predictive maintenance. By running AI locally on data from smart machines and sensors, they can make better use of real-time data to reduce downtime and improve production processes and efficiency.

The role of model quantization

For edge AI to succeed, AI models must be optimized for performance without sacrificing accuracy. But models are growing larger and more complex, which makes them harder to manage and difficult to deploy at the edge, where devices often have limited resources and cannot support such models.

Model quantization reduces the numerical precision of model parameters (from 32-bit floating point to 8-bit integer, for example), making models lighter and better suited to resource-constrained devices such as mobile phones, edge devices, and embedded systems.
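To make the idea of reducing precision concrete, here is a minimal sketch of one common scheme, affine (scale and zero-point) quantization to 8-bit integers. The function names and sample weights are illustrative assumptions, not taken from any particular library, and production toolkits add per-channel scales, calibration, and other refinements.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers using a scale and a zero point."""
    # Include 0.0 in the range so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_int8(q, scale, zero_point):
    """Recover approximate floats from the 8-bit integers."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical weights, chosen only for illustration.
weights = [-1.2, -0.3, 0.0, 0.45, 2.1]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight now occupies one byte instead of four. The price is a small rounding error, which here stays within half a quantization step (`scale / 2`).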

Three techniques, GPTQ, LoRA, and QLoRA, have emerged as potential game-changers in the field of model quantization:

GPTQ compresses models after training. It is well suited to deploying models in memory-constrained environments.

LoRA fine-tunes large pre-trained models for inferencing. Specifically, it trains small matrices (known as LoRA adapters) whose product forms a low-rank update to a large weight matrix of the pre-trained model, while the original weights stay frozen.
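The sketch below illustrates the low-rank idea behind LoRA under common assumptions: a frozen weight matrix `W` is adjusted by the product of two small matrices `B` and `A`, scaled by `alpha / r`, where `r` is the adapter rank. The helper names and the 4096 x 4096 layer size are hypothetical and chosen only to show the arithmetic.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank."""
    r = len(a)  # A has shape (r, d_in); B has shape (d_out, r)
    delta = matmul(b, a)
    s = alpha / r
    return [[w_ij + s * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Compare parameter counts: full fine-tuning versus a rank-r adapter."""
    return d_out * d_in, r * (d_out + d_in)


# Tiny worked example with rank r = 1.
w = [[1.0, 0.0], [0.0, 1.0]]  # frozen pre-trained weights
b = [[1.0], [2.0]]            # trainable, shape (2, 1)
a = [[0.5, 0.0]]              # trainable, shape (1, 2)
adapted = lora_effective_weight(w, b, a, alpha=1.0)

# Hypothetical 4096 x 4096 layer with a rank-8 adapter.
full, lora = trainable_params(4096, 4096, 8)
```

For the hypothetical layer, the adapter trains 65,536 values instead of 16,777,216, a 256-fold reduction, which is why the approach suits limited hardware.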

QLoRA is a more memory-efficient option: it keeps the pre-trained model in GPU memory in quantized form while training LoRA adapters on top of it. LoRA and QLoRA are particularly helpful when adapting models to new tasks or data sets with limited computational resources.
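A rough back-of-the-envelope calculation shows why lower precision matters for memory. The sketch below assumes a hypothetical 7-billion-parameter model and counts only the weights, ignoring activations, optimizer state, and the small overhead of quantization metadata.

```python
# Bits used to store one weight at each precision.
BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "4-bit": 4}


def weight_memory_gb(num_params, bits):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits / 8 / 1e9


num_params = 7_000_000_000  # hypothetical model size, for illustration only
footprint = {name: weight_memory_gb(num_params, bits)
             for name, bits in BITS_PER_WEIGHT.items()}

for name, gb in footprint.items():
    print(f"{name:>5}: {gb:.1f} GB")
```

Under these assumptions the weights shrink from about 28 GB at 32-bit precision to about 3.5 GB at 4 bits, which can be the difference between needing several accelerators and fitting on one.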

The choice of method depends largely on the project's particular requirements, on whether it is in the fine-tuning or deployment phase, and on the computational resources available. By using these quantization techniques, developers can effectively bring AI to the edge, striking the balance between performance and efficiency that many applications require.

Edge AI use cases and platforms

Edge AI has a wide range of uses, from smart cameras that process images for rail car inspections at train stations, to wearable health devices that detect abnormalities in the wearer's vitals, to smart sensors that monitor inventory on store shelves. This is why IDC projects that spending on edge computing will reach $317 billion by 2028. The edge is changing the way businesses process data.

As businesses recognize the benefits of AI inferencing at the edge, demand for robust edge inferencing stacks and databases will grow. These platforms facilitate local data processing while delivering all the benefits of edge AI, from lower latency to increased data privacy.

Technology

Tech Mahindra and IBM Watsonx Announce a New Era of Reliable AI

Published

May 18, 2024

Kajal Chavan

Tech Mahindra, a global leader in digital solutions and technology consulting, is working with IBM to help organizations around the world accelerate the adoption of generative AI in a sustainable manner.

This partnership combines IBM's Watsonx AI and data platform, including its AI Assistants, with Tech Mahindra's suite of AI offerings, TechM amplifAI0->∞.

By combining Tech Mahindra's AI engineering and consulting expertise with IBM Watsonx's capabilities, customers can now access a range of new generative AI services, frameworks, and solution architectures. This makes it possible to build AI applications that let businesses automate operations using their own reliable data. It also gives companies a foundation on which to build reliable AI models, promotes explainability to help manage bias and risk, and supports scalable AI deployment in on-premises and hybrid cloud environments.

Kunal Purohit, chief digital services officer of Tech Mahindra, says that to revitalize their businesses, organizations should prioritize responsible AI practices and the integration of generative AI technology.

“Our partnership with IBM can facilitate digital transformation for businesses, the uptake of GenAI, modernization, and ultimately business expansion for our international clientele,” Purohit continued.

To further strengthen enterprise AI capabilities, Tech Mahindra has set up a working virtual Watsonx Center of Excellence (CoE). The CoE serves as a co-innovation center where a dedicated team draws on the two organizations' combined competencies to develop distinctive offerings and solutions and to maximize synergies between them.

The joint offerings and solutions developed through this partnership could help enterprises build machine learning models using open-source frameworks while also enabling them to scale and accelerate the impact of generative AI. These AI-driven solutions have the potential to help organizations improve efficiency and productivity responsibly.

IBM Ecosystem General Manager Kate Woolley emphasized the potential of the partnership, adding that generative AI built on a foundation of explainability, openness, and trust can act as a catalyst for innovation and open up new market opportunities.

“Our partnership with Tech Mahindra is anticipated to broaden Watsonx’s user base and enable even more clients to develop reliable AI as we strive to integrate our know-how and technology to support enterprise use cases like digital labor, code modernization, and customer support,” stated Woolley.

This partnership is in line with Tech Mahindra’s ongoing efforts to revolutionize businesses through cutting-edge AI-led products and services. Some of their most recent offerings include Evangelize Pair Programming, Imaging amplifAIer, Operations amplifAIer, Email amplifAIer, Enterprise Knowledge Search, and Generative AI Studio.

Notably, the two companies have worked together before. Earlier this year, Tech Mahindra announced that it would open a Synergy Lounge in partnership with IBM at its Singapore site. The lounge aims to accelerate digital adoption for APAC organizations, helping them implement and use technologies such as artificial intelligence (AI), intelligent automation, edge computing, 5G, hybrid cloud, and cybersecurity.

Beyond Tech Mahindra, IBM Watsonx has featured in other partnerships aimed at accelerating the use of generative AI. Earlier in the year, the GSMA and IBM also announced a collaboration on GSMA Advance's AI Training program and the GSMA Foundry Generative AI program to boost the adoption and capabilities of generative AI in the telecom industry.

The training program is also available digitally and covers both the technical foundations of generative AI and its business strategy. It uses IBM Watsonx to deliver hands-on training for architects and developers seeking in-depth, practical knowledge of generative AI.

Technology

OpenAI Enhances ChatGPT with Google Drive Integration, Streamlined File Access, and Advanced Analytics

Published

May 18, 2024

Kajal Chavan

OpenAI has released a major update to ChatGPT that lets users analyze data directly from Google Drive and OneDrive without having to download and re-upload files. The new feature, which is available only to paying ChatGPT subscribers, will roll out gradually over the coming weeks with the goal of streamlining data analysis and saving customers time and trouble.

According to a blog post by OpenAI, “ChatGPT is now more connected to your data than ever before.” “With the integration of Google Drive and OneDrive, you can directly access and analyse your files – from Excel spreadsheets to PowerPoint presentations – within the chatbot.”

According to OpenAI, ChatGPT can analyze files "more quickly" thanks to this direct access, which is available to ChatGPT Plus, Enterprise, and Teams users. However, the new data analysis tools are currently available only with GPT-4o, the improved version of GPT-4 that powers ChatGPT's premium tiers.

Beyond simple file access, OpenAI has improved ChatGPT's ability to understand and manipulate data. Users can now carry out a variety of data tasks with natural language commands, such as:

Executing analytics-related Python code
Combining and streamlining datasets
Producing graphs with data from files

ChatGPT's charting capabilities have also improved significantly. Users can now expand their views, interact with the generated tables and charts, and customize the visualizations by changing colors, asking questions about particular cells, and more. The chatbot can now create interactive bar, line, pie, and scatter plot charts; for chart types that are not supported, it generates a static version.

OpenAI also emphasized the security of user data. Data from ChatGPT Teams and Enterprise users will not be used to train AI models, and ChatGPT Plus members can opt out of having their data used for training.

Technology

India Leads Asia Pacific in Generative AI Adoption

Published

May 18, 2024

Kajal Chavan

India's use of Generative AI (GenAI) is highlighted in a Deloitte report titled Generative AI in Asia Pacific: Young Employees Lead as Employers Play Catch-Up. According to a survey of 11,900 people across Asia Pacific, India ranks first among 13 countries in the use and adoption of GenAI: 83% of Indian workers and 93% of students actively use the technology.

India's high adoption rate is driven by young, tech-savvy workers known as "Generation AI." These young employees use GenAI to increase productivity, learn new skills, manage workloads, and save time. The shift presents employers with both new opportunities and new challenges.

The study estimates that daily use of GenAI will rise by 182% over the next five years. This growth reflects the belief that GenAI can increase the Asia-Pacific region's contribution to the global economy. Eighty-three percent of Indians think it improves social outcomes, and about seventy-five percent think it has economic benefits.

Key findings:

Workers and students are driving the GenAI revolution in Asia Pacific, though only 50% of them think their managers are aware of their use.
Seventeen percent of Asia Pacific's working hours, or around 1.1 billion hours a year, could be impacted by GenAI.
Developing nations are adopting GenAI at a rate 30% higher than developed economies.
GenAI users in Asia Pacific save around 6.3 hours a week, while Indian users save 7.85 hours.
Among GenAI users who save time, 41% say their work-life balance has improved.
According to their employees, 75% of businesses have not yet adopted GenAI.

Chris Lewin, AI and data capability leader for Deloitte Asia Pacific, stated, “One of the most exciting things about working with GenAI is that it is happening to everything, everywhere, all at once, across the globe.” “Over the past twelve months, we have observed that teams in Italy and Ireland can very immediately relate to the issues that our clients in Indonesia or India are facing.”

A key insight is that while the rapid integration of AI will not lead to immediate job losses, companies that fail to adapt will face consequences. Their employees, especially fresh talent, will be drawn to competitors that offer AI solutions with the potential to transform the nature of modern work.