AMD is setting the stage for huge AI growth in 2024

Many of my recent opinion pieces on MarketWatch have centered on the world of AI and how computing companies like AMD, Nvidia, and Intel are addressing it. Readers sometimes ask why, or whether, I think the AI revolution we are witnessing is really a revolution at all, or just a fad. I think AMD was under the same kind of pressure from its investors: is growth in the AI space really sustainable, and will it bring value to shareholders?

Kicking off the company’s “Advancing AI” event held in San Jose yesterday, CEO Lisa Su tried to squelch some of that debate. One of the first slides she showed described how the projected TAM (total addressable market, or the market AMD could potentially target with its chips) for data center AI accelerators through 2027 has increased from $150B to a projected $400B in just the last year. That implies a compound annual growth rate of more than 70% over the next four years (a quick sanity check on that math follows below), and it justified the last words I heard from Lisa on stage:

“AI is absolutely the #1 priority at AMD.”

And after witnessing the change in the company’s direction, the acceleration of its AI software development, and its deepening engagement with partners, it’s easy to believe she is telling the truth.
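
For a rough sense of the math behind that 70% figure, here is a quick sanity check. It assumes AMD’s own estimate of roughly $45B for the 2023 market, a number from the company’s presentation rather than an independent source, so treat it as illustrative:

```python
# Back-of-envelope check of the growth rate implied by AMD's TAM figures.
# Assumes AMD's stated ~$45B market size for 2023 (company figure, not
# independently verified) growing to its projected $400B by 2027.
base_2023 = 45      # $B, AMD's estimate for the 2023 market
target_2027 = 400   # $B, AMD's revised 2027 TAM projection
years = 4

# Compound annual growth rate: (end / start)^(1/years) - 1
cagr = (target_2027 / base_2023) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")  # ~72.7%, consistent with the ">70%" claim
```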

AMD has projected that the MI300 family of products will bring at least a $2B revenue uplift, a bold statement for a company that is typically conservative about financial projections of that size.

But the AI-focused event this week detailed why AMD is confident in its products and its position versus competitors. Two new data center AI processors were announced, the Instinct MI300X and the MI300A. At a high level, the MI300X competes with traditional data center GPUs like the Nvidia H100, while the MI300A combines CPU and GPU cores in a single package, creating a hybrid product similar to Nvidia’s Grace Hopper and Intel’s delayed Falcon Shores project.

The focus was on the MI300X, as it is shipping and available today from several partners. AMD went into a lot of detail on its performance claims for the MI300X, all of which need external third-party validation, of course, including matching the Nvidia H100 GPU in AI training workloads and running 40-60% faster in AI inference workloads. The MI300X offers 192GB of HBM3 memory per GPU, while the H100 is limited to 80GB, and that capacity advantage is likely a big reason for the performance of AMD’s new chip.
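
To see why that capacity gap matters, consider a rough back-of-envelope calculation: holding a model’s weights in 16-bit precision takes about two bytes per parameter, before counting the KV cache and activations that inference also needs. The model sizes below are illustrative, not figures from AMD’s presentation:

```python
# Rough memory footprint of LLM weights alone at FP16 (2 bytes/parameter),
# ignoring KV cache, activations, and framework overhead.
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    # params_billions * 1e9 params * bytes, divided by 1e9 bytes/GB
    return params_billions * bytes_per_param

for params in (34, 70):
    gb = weight_memory_gb(params)
    print(f"{params}B params: ~{gb:.0f} GB of weights -> "
          f"one 192GB MI300X: {'fits' if gb <= 192 else 'no'}, "
          f"one 80GB H100: {'fits' if gb <= 80 else 'no'}")
```

A 70B-parameter model’s weights alone come to roughly 140GB, which fits on a single MI300X but would have to be split across two H100s, and avoiding that split saves cost and interconnect traffic.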

Nvidia did announce the H200 just last month, with up to 141GB of memory and better overall performance, but it isn’t available for testing yet, so AMD couldn’t run any comparisons.

The software ecosystem has been a big focus for AMD over the last year, ever since Su stepped on stage at CES in January 2023 to hold up the MI300 chip for the first time, promising big things for its AI strategy. Nvidia’s lead with its CUDA development platform is still a significant hurdle, but AMD has made a lot of inroads, and the evolution of the AI ecosystem toward more standardized models and frameworks is also helping. The new release of its CUDA competitor, ROCm 6, includes many model optimizations and development library improvements. OpenAI has signed up to support the MI300 in the standard release of its Triton development platform going forward, and AMD had representatives from AI companies like Databricks and Lamini on stage backing up its software progress, with one even claiming it had “moved beyond CUDA.”
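
That framework standardization is easy to underestimate. The ROCm builds of PyTorch, for example, expose the same “cuda” device API that developers already write against (backed under the hood by AMD’s HIP runtime), so typical model-level code needs no vendor-specific changes. A minimal sketch:

```python
import torch

# On a ROCm build of PyTorch, the "cuda" device type is backed by AMD's
# HIP runtime, so this same code runs unmodified on an MI300X or an H100.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(4096, 4096).to(device)
x = torch.randn(8, 4096, device=device)
with torch.no_grad():
    y = model(x)
print(y.shape, y.device)
```

The more that developers target frameworks like PyTorch rather than writing CUDA directly, the lower the switching cost to AMD hardware becomes.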

AMD also trotted out a bevy of big names on stage, including Microsoft, Meta, Oracle, Dell, and others. Microsoft announced immediate availability of MI300X-based instances in its Azure cloud infrastructure, and Dell announced it was ready to take orders today. These are important milestones for AMD: because it doesn’t have Nvidia’s market dominance, it needs the support of partners to help sell its product. Luckily for AMD, the shortage of Nvidia GPUs in the market has created a void and plenty of hungry partners that need to fill orders.

For Nvidia, this announcement isn’t a surprise, and the company has been putting things in place for its arrival since AMD first announced the MI300 back in 2022. The H200 and the recent performance uplift announcements from Nvidia were clearly meant to blunt the impact that AMD’s product release would have on its perceived leadership. I don’t imagine we’ll see Nvidia chip sales slow down because of the MI300X; rather, the AMD part will simply fill in bubbles in the production pipeline with vendors like Microsoft, Dell, and Lenovo. As the data center AI market continues to expand, more areas will open up for AMD to sell out of MI300 chips for the foreseeable future.

But that is dangerous for Nvidia long term. As more customers use the AMD MI300X and find that it performs well, has the software support they need to be successful, and maybe even delivers better performance per dollar, it opens the door for AMD in future generations of AI system integrations. When the market shortage eases, more cloud providers, system builders, and developers will consider AMD GPUs where they previously wouldn’t have wanted to take a chance on an unknown.

Intel has more at risk than Nvidia from the MI300X family, simply because it makes AMD the de facto second source for AI compute in the data center, if it wasn’t already. Intel’s efforts to push into the GPU space have seemingly slowed in recent quarters, with most of the emphasis from CEO Pat Gelsinger and team leaning into the Gaudi AI accelerators, which are based on a very different architecture. The company is hosting its own AI event on December 14th, so we’ll know soon how it plans to address the AI markets in both the data center and PC spaces.