NVIDIA woos customers with massive AI training systems

During a presentation to machine learning and data scientists in Taiwan, NVIDIA CEO Jensen Huang announced a new data center-class system that promises customers an unparalleled level of performance for artificial intelligence and high-performance computing (HPC) applications.

NVIDIA’s leadership position in the world of machine learning and AI processing is well understood at this point, though with continued pressure from companies like Intel and Google, the graphics giant must continue to prove to developers and investors that it is not losing a step to the advancements other chip designers are making. With interest in the company’s machine learning capabilities driving the stock to all-time highs, every announcement and partnership is scrutinized.

The company’s data center revenue has increased from $151M in early 2017 to more than $700M last quarter, with an increase of 71% in the last year alone. Huang has put NVIDIA on the right path for growth in a market that it helped develop over the last decade, and with the market for AI and machine learning chips projected to exceed $10B by 2020, there is a much larger opportunity still on the horizon.

The newly announced HGX-2 platform from NVIDIA combines sixteen of the company’s fastest and most powerful Tesla V100 graphics chips in a single server. Using an in-house designed high-speed interconnect called NVSwitch, the chips can operate in tandem to act like a single massive source of compute, enabling customers and developers to solve some of the world’s toughest and largest AI and machine learning problems.

What makes the NVIDIA HGX-2 more interesting is that it represents a platform from which partner companies and system manufacturers can design and build their own machines. Leading server providers like Lenovo and Supermicro have already announced their intent to bring systems to market this year, while top ODMs (original design manufacturers) like Foxconn and Quanta are said to be building HGX-2 based configurations for deployment in some of the world’s largest cloud data centers.

Though no customers for the higher-performing HGX-2 have been announced, Microsoft and Facebook were flagship customers of the previous-generation HGX-1 design. Both used that machine learning platform for advanced AI services.

For now, the only specific customer that is using the HGX-2 platform is NVIDIA itself, offering up the DGX-2 server for machine learning tasks for just south of $400,000.

Performance claims from NVIDIA paint a stark picture for the competition. The company says that the HGX-2 platform is 300x faster than a dual-chip Intel Xeon server for artificial intelligence training and 60x faster for HPC tasks like data analytics and oil & gas research. These are incredibly bold claims, and the gap will likely shift as the development resources of a company like Intel turn to focus on AI implementations, but NVIDIA’s lead is hard to discredit today.

The HGX-2 is the fastest and newest part of a portfolio of GPU-based server platforms NVIDIA offers to customers. These configurations target workloads ranging from the most intense AI and deep learning datasets to speech and language processing, video and image inferencing, and even fluid dynamics and defense. NVIDIA has produced a well-rounded collection of solutions for all manner of machine learning and AI tasks, and that is a significant reason the company has solidified its stance in the market over several years.

NVIDIA will continue to see challenges to that stance ratchet up through the end of the decade. Intel, and even AMD, see the $10B market that artificial intelligence computing creates as the next technological battlefield. But any player wishing to catch up to NVIDIA’s lead will have to clear sizeable R&D hurdles, and have some luck, to have a shot.