NVIDIA Makes Case for Training AI Models On-Premises
NVIDIA has announced the general availability of a suite of tools and frameworks for training artificial intelligence (AI) models on platforms deployed in on-premises IT environments.
The NVIDIA AI Enterprise platform is designed to run on VMware vSphere instances deployed on NVIDIA-certified servers from Atos, Dell Technologies, GIGABYTE, Hewlett Packard Enterprise (HPE), Inspur, Lenovo, and Supermicro. In addition, Dell announced that Dell EMC VxRail has become the first hyperconverged infrastructure platform certified to run NVIDIA AI Enterprise.
At the same time, Domino Data Lab revealed that its machine learning operations (MLOps) platform for managing the development of AI models has also been certified compatible with NVIDIA AI Enterprise.
While most AI models are trained in the cloud because of the cost of acquiring platforms based on graphics processing units (GPUs), there comes a point when the amount of data being processed to train an AI model makes it more cost-effective to deploy servers in an on-premises IT environment. The monthly costs associated with consuming GPUs via a cloud service quickly add up, notes Manuvir Das, head of enterprise computing at NVIDIA.
Subscription licenses for NVIDIA AI Enterprise start at $2,000 per CPU socket for one year and include Business Standard Support. Perpetual licenses cost $3,595, with support purchased separately. Customers can also upgrade to Business Critical Support, which is available on a 24×7 basis. Even though the platform’s performance comes from GPUs, NVIDIA is licensing the software on a per-CPU-socket basis. “It’s a pricing model that enterprises are familiar with,” says Das.
It’s not clear at precisely what point it makes more economic sense to train AI models in an on-premises IT environment. To lower the barrier to entry, NVIDIA recently allied with Equinix to make NVIDIA supercomputers for training AI models available for rent via a monthly subscription that starts at $90,000.
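As a rough illustration of that break-even question, the sketch below compares a recurring monthly rental fee against an upfront hardware purchase plus per-socket subscription licensing. Only the $90,000-per-month rental and $2,000-per-socket annual license figures come from the announcements above; the server hardware cost and socket count are hypothetical placeholders, not NVIDIA pricing, and the model ignores power, cooling, staffing and utilization.

```python
# Rough break-even sketch: renting GPU capacity monthly vs. buying servers
# and licensing NVIDIA AI Enterprise per CPU socket. The hardware cost and
# socket count below are hypothetical; only the $90,000/month rental and
# $2,000/socket/year license figures come from the announcements.

MONTHLY_RENTAL = 90_000          # hosted AI supercomputer subscription, per month
LICENSE_PER_SOCKET_YEAR = 2_000  # NVIDIA AI Enterprise subscription, per CPU socket per year
SOCKETS = 16                     # hypothetical: eight dual-socket GPU servers
HARDWARE_COST = 1_500_000        # hypothetical upfront cost of certified servers


def on_prem_cost(months: int) -> float:
    """Upfront hardware plus per-socket subscription licensing over `months`."""
    return HARDWARE_COST + SOCKETS * LICENSE_PER_SOCKET_YEAR * (months / 12)


def rental_cost(months: int) -> float:
    """Recurring monthly rental over `months`."""
    return MONTHLY_RENTAL * months


# Find the first month at which owning becomes cheaper than renting.
for month in range(1, 61):
    if on_prem_cost(month) <= rental_cost(month):
        print(f"Break-even at roughly month {month}")
        break
```

With these placeholder inputs the crossover lands around a year and a half in; different hardware costs, utilization levels and operating expenses would shift that point considerably, which is exactly why the break-even is hard to pin down in general.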
Also read: NVIDIA Looks to Lower AI Barrier to Entry
Reducing AI Modeling Costs
No matter how the work gets sliced up, it’s expensive to build and maintain AI models. The good news is that the amount of data required to train an AI model continues to decline. However, just about every application is eventually going to need to incorporate an AI model, so the number of AI models that need to be trained, maintained and updated is only going to increase exponentially. Right now, many organizations consider themselves lucky to deploy a handful of AI models in production environments over the course of a year.
The pressure to automate the building of AI models in ways that don’t depend on manual processes requiring a team of data scientists and engineers is building. In fact, vendors such as GitLab are making a case for treating AI models much like any other software artifact, managed within the context of a larger set of DevOps best practices.
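As a minimal sketch of what that might look like in practice, the snippet below treats a trained model file as a versioned build artifact: it records the file’s checksum, size and version in a manifest, the way a pipeline step might after training completes. The file names, manifest format and register_model_artifact helper are hypothetical illustrations, not GitLab’s or NVIDIA’s actual tooling.

```python
# Minimal sketch of treating a trained model as a versioned build artifact,
# the kind of step a CI/CD pipeline might run after training. File names and
# the manifest format are hypothetical illustrations, not any vendor's API.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def register_model_artifact(model_path: str, version: str,
                            manifest_path: str = "model_manifest.json") -> dict:
    """Record a model file's checksum, size, and version in a JSON manifest."""
    data = Path(model_path).read_bytes()
    entry = {
        "version": version,
        "file": model_path,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "registered_at": datetime.now(timezone.utc).isoformat(),
    }

    manifest = Path(manifest_path)
    history = json.loads(manifest.read_text()) if manifest.exists() else []
    history.append(entry)
    manifest.write_text(json.dumps(history, indent=2))
    return entry


if __name__ == "__main__":
    # Example: register the model produced by an earlier training step
    # ("model.onnx" is a placeholder path for that output).
    print(register_model_artifact("model.onnx", version="1.4.0"))
```

Versioning and checksumming model files this way lets the same review, rollback and promotion practices used for application builds apply to models as well.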
Each IT team will need to decide for itself how best to manage the building and deployment of AI models, but one thing is certain: there will soon be a lot more of them floating around the enterprise than anyone might have initially imagined.
Read next: NVIDIA, VMware Create the AI-Ready Enterprise Platform at Cloud Scale