Nvidia Unveils NIM for Seamless Deployment of AI Models in Production

At its GTC conference today, Nvidia unveiled Nvidia NIM, a new software platform intended to speed up the deployment of custom and pre-trained AI models into production environments. NIM takes the software work Nvidia has done around inferencing and optimizing models and makes it easily accessible by combining a given model with an optimized inferencing engine and packaging it into a container that can be accessed as a microservice.
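
In practice, accessing one of these microservices looks like calling any other web API. Below is a minimal sketch of a request to a locally deployed NIM container, assuming it exposes an OpenAI-style chat completions endpoint on port 8000; the URL, endpoint path, and model name are illustrative assumptions rather than details from the announcement.

    import json
    import urllib.request

    # Assumed local deployment: a NIM container serving an OpenAI-style
    # chat completions API on port 8000. URL and model name are placeholders.
    URL = "http://localhost:8000/v1/chat/completions"

    payload = {
        "model": "meta/llama2-70b",  # placeholder; the served model varies per container
        "messages": [{"role": "user", "content": "In one sentence, what is a microservice?"}],
        "max_tokens": 128,
    }

    request = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

    # urllib sends a POST automatically when a request body is provided.
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
        print(body["choices"][0]["message"]["content"])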

Nvidia argues that it would traditionally take developers weeks, if not months, to ship similar containers, and that is if the company has any in-house AI talent at all. With NIM, Nvidia is clearly aiming to build an ecosystem of AI-ready containers that use its hardware as the foundational layer and these curated microservices as the core software layer for businesses looking to accelerate their AI roadmaps.

NIM currently supports models from Nvidia, AI21 Labs, Adept, Cohere, Getty Images, and Shutterstock, as well as open models from Google, Hugging Face, Meta, Microsoft, Mistral AI, and Stability AI. Nvidia is already collaborating with Amazon, Google, and Microsoft to make these NIM microservices available on SageMaker, Google Kubernetes Engine, and Azure AI, respectively. They will also be integrated into frameworks like Deepset, LangChain, and LlamaIndex.
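
The LangChain integration, for example, could look something like the sketch below. It assumes the langchain-nvidia-ai-endpoints integration package and its ChatNVIDIA wrapper; the package name, class, and model identifier are assumptions here, not details from the announcement.

    # Sketch of routing a LangChain chat model through a NIM endpoint. The
    # langchain-nvidia-ai-endpoints package, the ChatNVIDIA class, and the
    # model id are assumptions, not details from the announcement.
    from langchain_nvidia_ai_endpoints import ChatNVIDIA

    llm = ChatNVIDIA(
        model="mistralai/mistral-7b-instruct-v0.2",  # placeholder model id
        base_url="http://localhost:8000/v1",         # assumed local NIM container
    )

    print(llm.invoke("What does NIM provide on top of a raw model?").content)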

In a press conference held prior to today’s announcements, Manuvir Das, Nvidia’s head of enterprise computing, stated, “We believe that the Nvidia GPU is the best place to run inference of these models on […] and we believe that NVIDIA NIM is the best software package, the best runtime, for developers to build on top of so that they can focus on the enterprise applications — and just let Nvidia do the work to produce these models for them in the most efficient, enterprise-grade manner, so that they can just do the rest of their work.”

For inference engines, Nvidia will use the Triton Inference Server, TensorRT, and TensorRT-LLM. Nvidia microservices available through NIM will include Riva for customizing speech and translation models, cuOpt for routing optimizations, and the Earth-2 model for weather and climate simulations.

The Nvidia RAG LLM operator, for instance, will soon be available as a NIM, a move the company hopes will make it significantly easier to build generative AI chatbots that can pull in custom data.
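
Under the hood, that is the standard retrieval-augmented generation pattern: fetch the passages relevant to a question, then ask the model to answer grounded in them. The toy sketch below shows the shape of it, reusing the assumed local endpoint and placeholder model name from the first example; the naive keyword retriever is a deliberate stand-in for a real vector store.

    # Toy retrieval-augmented generation: pick relevant passages, then ask a
    # NIM-served model to answer grounded in them. Endpoint and model name
    # reuse the assumptions from the earlier sketch.
    import json
    import urllib.request

    DOCUMENTS = [
        "Our Q3 revenue grew 12% year over year.",
        "The support portal migrates to SSO in April.",
    ]

    def retrieve(question: str) -> list[str]:
        # Naive retrieval: keep documents sharing any word with the question.
        words = set(question.lower().split())
        return [d for d in DOCUMENTS if words & set(d.lower().split())]

    def ask(question: str) -> str:
        context = "\n".join(retrieve(question))
        payload = {
            "model": "meta/llama2-70b",  # placeholder model name
            "messages": [
                {"role": "system", "content": f"Answer using only this context:\n{context}"},
                {"role": "user", "content": question},
            ],
        }
        req = urllib.request.Request(
            "http://localhost:8000/v1/chat/completions",  # assumed local endpoint
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["choices"][0]["message"]["content"]

    print(ask("When does the support portal move to SSO?"))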

It wouldn’t be a developer conference without a few customer and partner announcements. Current NIM users include Box, Cloudera, Cohesity, Datastax, Dropbox, and NetApp.

“Established enterprise platforms are sitting on a goldmine of data that can be transformed into generative AI copilots,” said Nvidia founder and CEO Jensen Huang. “These containerized AI microservices, developed with our partner ecosystem, are the building blocks for enterprises in every industry to become AI companies.”
