Powerful servers and processors are the workhorses of AI. Behind the scenes, Intel® Xeon® processors handle high-performance computing for a wide range of tasks, including AI applications, while the Habana® Gaudi® accelerator tackles deep learning workloads. Together, this infrastructure duo is helping train and deploy revolutionary AI models.

Category: 🤖 Tech
Transcript
00:00 When you're looking at how to infuse AI into your application,
00:05 the hardware conversation happens later.
00:08 When you get to the point where you understand what your workload is,
00:12 whether it's built in-house or developed externally,
00:16 you know what workload you want to run.
00:18 Then you start looking at, okay, infrastructure, what do I run this on?
00:23 AI is not necessarily a specially built AI box
00:28 that you plug into the wall and it does your AI.
00:31 AI needs to run on your general purpose infrastructure.
00:35 I'm Monica Livingston, and I lead the AI Center of Excellence at Intel.
00:43 Being able to run your AI applications on general purpose infrastructure
00:48 is incredibly important because then your cost for additional infrastructure is reduced.
00:55 At Intel, we spend a lot of time adding AI performance features
01:00 into our Intel Xeon Scalable processors.
01:03 For the types of AI that can't just simply run on a processor,
01:08 we are offering accelerators.
01:10 We have Intel discrete GPUs, and we have our Habana Gaudi product,
01:15 which is an AI ASIC specializing in deep learning training and inference.
01:21 When you're thinking about Xeon and Gaudi,
01:24 you would use Gaudi to train very large models.
01:27 So you would have your Gaudi cluster and train a very large model,
01:32 hundreds of billions of parameters.
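To make the Gaudi training path concrete, here is a minimal sketch of a single training step on a Gaudi device, assuming the Habana PyTorch bridge (habana_frameworks) is installed and an HPU is available; the tiny model and random data are stand-ins, not anything named in the video.

```python
import torch
import habana_frameworks.torch.core as htcore  # Habana's PyTorch bridge

device = torch.device("hpu")                    # Gaudi registers as the "hpu" device
model = torch.nn.Linear(1024, 1024).to(device)  # toy stand-in for a large model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

inputs = torch.randn(32, 1024, device=device)
targets = torch.randn(32, 1024, device=device)

optimizer.zero_grad()
loss = torch.nn.functional.mse_loss(model(inputs), targets)
loss.backward()
htcore.mark_step()   # in lazy mode, flush the accumulated graph to the device
optimizer.step()
htcore.mark_step()
print(loss.item())
```

A real run at hundreds of billions of parameters would shard the model and data across a Gaudi cluster, but the per-device step looks the same.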
01:34 If your model is under 20 billion parameters,
01:38 generally that can run inference on Xeon.
01:41 So after you have trained your model, to actually go and run it,
01:45 you can run that on your Xeon boxes.
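As a sketch of that Xeon inference path, the snippet below loads a Hugging Face causal language model on the CPU in bfloat16 and generates text; gpt2 is just a stand-in here for whatever sub-20-billion-parameter model you have trained.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"  # stand-in; substitute your own trained model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()

inputs = tokenizer("AI on general purpose infrastructure", return_tensors="pt")
with torch.inference_mode():  # no gradients needed for serving
    output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```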
01:48 This processor family has a number of AI optimizations in it.
01:53 The AMX feature, Advanced Matrix Extensions, is our newest feature
01:58 that's in this current generation of Xeon processors.
02:01 And it enables us to run deep learning training and inference a lot faster on the CPUs.
02:09 And again, that built-in acceleration would enable a customer or an enterprise
02:14 to run these models on a CPU versus more expensive compute.
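There is no AMX-specific API to call here: on an AMX-capable Xeon, PyTorch's oneDNN backend can dispatch bfloat16 matrix multiplies to the AMX tile instructions automatically, so a hedged sketch of "using AMX" is just ordinary CPU autocast code.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.ReLU(),
    torch.nn.Linear(4096, 1024),
).eval()

x = torch.randn(64, 1024)
# On 4th-gen (or later) Xeon, these bf16 matmuls can hit the AMX units;
# on older CPUs the same code still runs, just without that acceleration.
with torch.inference_mode(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)
print(y.dtype)  # torch.bfloat16 inside the autocast region
```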
02:21 The future is that you will have many generative AI models
02:27 for different types of purposes within your company.
02:30 And if all of them take millions to train,
02:33 that return on investment doesn't look as favorable
02:36 as if you had these smaller, much more efficient models
02:40 that can run on multipurpose architecture.
02:43 And so you're not having to stand up new infrastructure specifically for this.
02:48 (music)
03:00 (sonic logo)
