Powerful servers and processors are the workhorses of AI. Behind the scenes, Intel® Xeon® processors handle high-performance computing for a wide range of tasks, including AI applications, while the Habana® Gaudi® accelerator tackles deep learning workloads. Together, this infrastructure duo is helping train and deploy revolutionary AI models.
TechTranscript
00:00 When you're looking at how to infuse AI in your application,
00:05 the hardware conversation happens later.
00:08 When you get to the point where you understand what your workload is,
00:12 whether it's built in-house or developed externally,
00:16 you know what workload you want to run.
00:18 Then you start looking at, okay, infrastructure, what do I run this on?
00:23 AI is not necessarily a specially built AI box
00:28 that you plug into the wall and it does your AI.
00:31 AI needs to run on your general purpose infrastructure.
00:35 I'm Monica Livingston, and I lead the AI Center of Excellence at Intel.
00:43 Being able to run your AI applications on general purpose infrastructure
00:48 is incredibly important because then your cost for additional infrastructure is reduced.
00:55 At Intel, we spend a lot of time adding AI performance features
01:00 into our Intel Xeon Scalable processors.
01:03 For the types of AI that can't simply run on a processor,
01:08 we are offering accelerators.
01:10 We have Intel discrete GPUs, and we have our Habana Gaudi product,
01:15 which is an AI ASIC specializing in deep learning training and inference.
01:21 When you're trying to think of Xeon and Gaudi,
01:24 you would use Gaudi to train very large models.
01:27 So you would have your Gaudi cluster and train a very large model,
01:32 hundreds of billions of parameters.
01:34 If your model is under 20 billion parameters,
01:38 generally that can run inference on Xeon.
01:41 So after you have trained your model, to actually go and run it,
01:45 you can run that on your Xeon boxes.
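
To make the split described here concrete (train very large models on a Gaudi cluster, run inference for smaller models on Xeon), the sketch below runs CPU inference with PyTorch and Hugging Face Transformers. The model name, prompt, and generation settings are placeholder assumptions; nothing here is named in the video.

```python
# Minimal sketch: CPU inference with a model small enough to fit on a Xeon host.
# "some-org/small-llm-7b" is a hypothetical checkpoint; any causal LM well under
# ~20B parameters that fits in host memory follows the same pattern.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/small-llm-7b"  # placeholder model name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bfloat16 reduces memory and can use AMX on recent Xeons
)
model.eval()

inputs = tokenizer("Summarize this support ticket:", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```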
01:48 This processor family has a number of AI optimizations in it.
01:53 The AMX feature, Advanced Matrix Extensions, is our newest feature
01:58 that's in this current generation of Xeon processors.
02:01 And it enables us to run deep learning training and inference a lot faster on the CPUs.
02:09 And again, that built-in acceleration would enable a customer or an enterprise
02:14 to run these models on a CPU versus more expensive compute.
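
A quick way to confirm that a given Xeon host actually exposes the AMX instructions mentioned above is to read the CPU feature flags. This sketch assumes a Linux host, where the kernel reports AMX support through the amx_tile, amx_bf16, and amx_int8 flags in /proc/cpuinfo.

```python
# Sketch: detect Advanced Matrix Extensions (AMX) on a Linux host by reading
# the CPU feature flags the kernel exposes in /proc/cpuinfo.
def has_amx(cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    flags = set()
    with open(cpuinfo_path) as f:
        for line in f:
            if line.startswith("flags"):
                flags.update(line.split(":", 1)[1].split())
    return {"amx_tile", "amx_bf16", "amx_int8"}.issubset(flags)

if __name__ == "__main__":
    print("AMX available:", has_amx())
```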
02:21 The future is that you will have many generative AI models
02:27 for different types of purposes within your company.
02:30 And if all of them take millions to train,
02:33 that return on investment doesn't look as favorable
02:36 as if you had these smaller models, much more efficient models,
02:40 that can run on multipurpose architecture.
02:43 And so you're not having to stand up new infrastructure specifically for this.
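
As a back-of-the-envelope illustration of that return-on-investment argument, the comparison below uses purely hypothetical cost figures; none of these numbers come from the video.

```python
# Hypothetical cost comparison (all figures invented for illustration only).
num_models = 5                     # generative AI models the company wants to deploy
cost_per_large_model = 2_000_000   # assumed cost to train each very large model
cost_per_small_model = 50_000      # assumed cost to build each smaller, efficient model
cost_dedicated_infra = 500_000     # assumed cost of standing up new, dedicated infrastructure

large_model_approach = num_models * cost_per_large_model + cost_dedicated_infra
small_model_approach = num_models * cost_per_small_model  # reuses existing general-purpose servers

print(f"Large models + new infrastructure: ${large_model_approach:,}")
print(f"Smaller models on existing Xeon:   ${small_model_approach:,}")
```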
02:48 (music)
03:00 (sonic logo)