Andrew Feldman, Co-founder and CEO, Cerebras Systems; Mark Papermaster, Executive Vice President and Chief Technology Officer, Advanced Micro Devices (AMD); Sandra Rivera, Chief Executive Officer, Altera; Sharon Goldman, Fortune
Transcript
Sharon Goldman (00:00): Welcome, everyone. So great to talk to you today. Mark, I'm actually going to turn right to you, and I'm going to start with what is certainly the gorilla in the room, and that is NVIDIA. Is there a way to dethrone NVIDIA, or in AMD's case, to at least close the gap?

Mark Papermaster (00:15): Well, great question. You know, Sharon, we've been competing with NVIDIA for decades, because we've been in the graphics business through our acquisition of ATI. But when you look at the story of AMD, it's twofold. We started with a focus on CPUs to grow the revenue of the company, but what people don't realize is that under the covers we were working on heterogeneous computing and, underneath, our open software stack, ROCm. Now we are competitive: we went from almost zero AI revenue in 2023 to five billion dollars projected this year. And so we have strong, actually leadership, performance in data center GPU inferencing, and watch out on training; we're bringing the competition right to NVIDIA in training. The market needs competition, and that's what we're focused on.

Sharon Goldman (01:02): Andrew and Sandra, this is a complex landscape, and the competitive landscape is very nuanced, whether it comes to training and inference or all sorts of areas of the semiconductor industry. Andrew, I want to ask you about the competitive landscape from other startups and the hyperscalers, and then, Sandra, I'd like to hear about the competitive space on your end. Andrew?

Andrew Feldman (01:25): Well, I think there are so many players in the market because the markets are darn big. We have NVIDIA; we have AMD, under Lisa and Mark's leadership, doing great things. But you also have the hyperscalers seeing opportunity explode and wanting to do their own chips, and you have a collection of startups, I'd like to think led by Cerebras, doing pioneering work. I think when you have this sort of activity, it's because you have an explosion of demand, and we're seeing it on the training side and on the inferencing side. Truth be told, none of us can keep up. It's kind of a Hunger Games out there.

Sharon Goldman (02:08): Sandra, what about you at Altera?
Sandra Rivera (02:16): Well, I'll start by saying that it is still early days for AI, so there's so much innovation ahead of us. It really isn't "how do you take on NVIDIA"; it's how you have the best device for the workload that you're driving. And to Andrew's point, the workloads, the segments, and the customers are so diverse that it is not one-size-fits-all. That's why you have CPUs and GPUs and accelerators that are bespoke and dedicated to AI, and even, in my field, field-programmable gate arrays, FPGAs, which are highly customizable, highly flexible, and highly valued in markets that are changing very dynamically and very quickly: AI, cybersecurity, and certainly wireless standards. Those types of applications lend themselves to devices that are highly programmable and flexible to the workload. So I think there's a lot of opportunity. It's still early days, with lots of innovation to come. And, you know, Mark and I were commenting before the session on the fact that competition is good: it makes you sharper, makes you more focused, and it certainly addresses the very, very broad customer requirements that are out there.
Sharon Goldman (03:30): But Mark, what would you say? I mean, NVIDIA is not just any competitor. I had an analyst tell me that NVIDIA doesn't have any real predators in the wild. Do you see that, or do you see that competition is growing and that there's really a place to close the gap and find that opening, even more so, for your company?

Mark Papermaster (03:48): Yeah, I don't believe that whatsoever. That's why we're making the gains that we are. We're actually getting tremendous pull. Think about the placements we have with first-party applications and several hyperscalers with our GPUs. You saw Microsoft announce that they're in production with our MI300 GPU. Meta announced that they're running the Llama 3.1 405-billion-parameter model with all the inferencing running in production on AMD. And why is that? Because the market demands competition, and so we're getting tremendous pull. We've not seen a moat. Again, we're a GPU, just like NVIDIA, so if code was written in low-level CUDA, it can be ported over in a straightforward way. And people are moving to Triton and JAX and other higher-level, vendor-nonspecific coding, and that's going to really open up competition as well. Nobody wants to be locked into a particular hardware's code.
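(A note on the vendor-agnostic coding Mark mentions: frameworks such as Triton and JAX let developers write against a high-level API and leave hardware-specific code generation to a compiler. The sketch below is illustrative only, not from the panel; it uses JAX, where the same Python function is JIT-compiled for whatever backend happens to be present, a CPU or any vendor's accelerator, with no device-specific kernels.)

```python
import jax
import jax.numpy as jnp

# The model code is written against JAX's NumPy-like API, with no
# reference to any particular vendor's hardware or kernel language.
@jax.jit  # XLA compiles this for whatever backend JAX finds (CPU/GPU/TPU)
def attention_scores(q, k):
    # Scaled dot-product scores, the core operation of a transformer layer.
    return jax.nn.softmax(q @ k.T / jnp.sqrt(q.shape[-1]), axis=-1)

q = jnp.ones((4, 8))
k = jnp.ones((4, 8))
scores = attention_scores(q, k)
print(scores.shape)               # (4, 4)
print(jax.devices()[0].platform)  # 'cpu', 'gpu', ... whichever is present
```

The same source file runs unchanged on different hardware; only the installed JAX backend differs.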
Sandra Rivera (04:47): That's right. And I think this point about abstraction away from the underlying logic device is really powerful, because most data scientists, most subject-matter experts, most of the programmers are really not working at those low levels. They're highly abstracted, and they're just trying to get their workload deployed in the market, at scale, as quickly as possible. So it's not so much "what is the actual device: is it a GPU, is it a CPU, is it an AI accelerator, is it an FPGA?" It's very much "what is the system and the platform, and what is the software that enables it, so that developers can actually be productive very, very quickly?"

Sharon Goldman (05:25): Right. Well, this is the chips panel at Brainstorm AI, so I know folks out there will have questions, and I'm going to come to you in just a couple of minutes, so think them over. But right now, Andrew: the IPO. When I last wrote about Cerebras and the potential IPO, back at the beginning of October, where were things? Where are they now? Can you share?

Andrew Feldman (05:45): I cannot share. This is going to be very boring, but I would like to. Mark's an investor; Mark would like to know.
Andrew Feldman (06:02): I would say the following: Mark's GPUs are running faster than NVIDIA's, and our accelerators are running, in many instances, 75 times faster than NVIDIA's. You can go to Artificial Analysis and check the daily scores on their benchmarks. I think others, not just ours, are running much faster as well. I think what Mark said is sort of fundamental, and it's not just that competition is better for us; I think it's better for you. It means that the AI in your applications will run faster and cost less, and there aren't any markets where that proposition hasn't held. So I think that's the fundamental thing: it's good for everybody if there's competition in AI and innovation at the underlying levels, and we're seeing that across the board.

Sharon Goldman (06:59): This battle for inference speed is really interesting; every week there seems to be a new announcement, including from Cerebras. Do you see that as a way to really diversify your customer base going forward?

Andrew Feldman (07:11): In 2017, OpenAI published a paper which sort of predicted the vast rise in training compute; they called it the scaling laws. Then in September they published a paper that found some of the same rules apply in inference, which means we're going to use a huge amount of inference compute in the years ahead, an absurd amount of inference compute. And we're going to do inference not once but many times on the same query. If you can run fast, you can deliver higher-quality results in less time, so speed will be fundamental to delivering high-quality inference in a reasonable amount of time, and I think that's why you're seeing this extraordinary race to be fast at inference, a need for speed. When I look out into the room, I can see that there's a little bit of gray hair here, and you remember before we had broadband: trying to use the internet before broadband was a disaster, and once we got broadband, all of a sudden you had new applications, you had streaming, you had all these things that were fun, and engagement was high. I think that's what's happening right now with AI: as you get faster, you move into sort of the broadband era of AI inference, and things are engaging, responsive, and higher quality.
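(Andrew's "inference not once but many times on the same query" describes test-time scaling, for example best-of-N sampling: generate several candidate answers and keep the best-scoring one. The Python sketch below is illustrative only; `generate` and `score` are hypothetical stand-ins for a model call and a quality metric.)

```python
import random

def generate(query: str, seed: int) -> str:
    """Stand-in for one model inference call (hypothetical)."""
    random.seed(hash((query, seed)))
    return f"answer-{random.randint(0, 9)}"

def score(answer: str) -> float:
    """Stand-in for a verifier or reward model (hypothetical)."""
    return float(answer.split("-")[1])

def best_of_n(query: str, n: int) -> str:
    # Run inference N times on the same query and keep the best-scoring
    # candidate: answer quality rises with N, and so does inference compute,
    # which is why fast, cheap inference matters.
    candidates = [generate(query, seed) for seed in range(n)]
    return max(candidates, key=score)

print(best_of_n("What is the capital of France?", n=8))
```

Each extra candidate costs one more full inference pass, so a fast accelerator directly buys either lower latency or more candidates per query.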
Sharon Goldman (08:41): Sandra, obviously last week's news about Intel's CEO resigning was big. It was surprising, and of course a lot of people want to know what that means for Altera. I was wondering if you could share your thoughts on the future of Altera. Does it change your strategy in any way?

Sandra Rivera (08:55): Yeah, our plan has been to go out for an equity stake. Intel is still planning to sell an equity stake in Altera, and we're in the middle of that process now, with a lot of great interest, which is encouraging. And we still plan for an IPO in 2026. That's been the plan, and it really hasn't changed. The news from last week was sad on so many levels, but what I try to keep the company and employees focused on is the things that we control, because we really don't control any of that. We control our commitments to customers, our product execution, and our innovation, and that's what we're focused on. But the plan is the plan, and it really has not changed.

Sharon Goldman (09:36): Okay, I'd love to see if there are any questions here. Okay, yes. Do we have a mic? Mic person? Here we are. Just state your name and your affiliation. Thank you.
Audience member (09:51): Hi, Pankaj Kedia, formerly of Intel and Qualcomm. Andrew, just a clarification: is Cerebras training or inference, or both?

Andrew Feldman (09:57): We do both.

Audience member: So my question is maybe for Mark and Sandra. AMD is on both sides of the wire, right? Data center and the edge; same thing with Intel. Fundamentally, what advantage does that give AMD and Intel, and maybe Cerebras? Because when we think about NVIDIA, we think primarily about the data center; at the edge they have graphics, but they don't have the full SoC, if you will. So, fundamentally, how does that better position the three companies end to end?

Mark Papermaster (10:44): I can start. I think it's important to
even think about the analogy that Andrew just stated. It was back to the internet: when all the applications came out, adoption went vertical sector by vertical sector, and each sector's technology needs were different. That's what you're going to see as AI explodes. Of course there's training, and there are different requirements for these foundational large language models; today they're all on a GPU, and as I said earlier, we're now bringing competition there. But there's an advantage when you want to run inference on those largest foundational models, because as a GPU you match up with the math constructs and how the transformers were deployed, and so on. So there's an ease and a facility in getting higher performance on those foundational models. But inferencing will now span out across sectors: it'll be different and unique in the enterprise, and it'll be different at the edge. That's why we're very focused on diversity in our portfolio. We've enabled AI across our entire portfolio, from our CPUs, you look at our AI PCs and the AI enablement across the CPU, GPU, and neural-net engines, all the way through our embedded and FPGA roadmap. So the advantage is that we're going to ease customer adoption, with a unified software stack on top, and ease the optimization. But it's going to be wild. It is early days, and yet to come is the myriad of inference applications across that strata.

Sandra Rivera (12:21): Yeah, and I'd just add that it's back to the right
device for the right workload, and that typically is a decision point around power, performance, area, and cost. When you look at the tremendous cost required to build these foundational models, it probably leads you to certain types of devices that can really handle that workload, back to the point you were making on training. But when you look at the inference workloads in particular, data wants to be computed near where it's created or where it's deployed, and typically that is happening much more broadly at the edge, in what we call edge computing. Those workloads really have finite power, performance, and area requirements. Whether you're in a manufacturing location with an industrial-robotics type of implementation, or in an autonomous vehicle, you really don't have the luxury of hundreds of watts; in many cases it's single-digit watts, if not lower. So that's why, again, it is not one-size-fits-all, and much of the growth and opportunity is happening at the edge, where the data is created and consumed. That's why you see a diversity of devices for running those workloads, whether it's a CPU, GPU, FPGA, or AI accelerator.

Andrew Feldman (13:45): Just one small thing. I think it's a mistake to think
that compute at the edge comes at the expense of the data center. That's just not been the experience over the last 30 years. As we got more compute in our cell phones, data center demand for compute didn't go down; it went up. As we got more compute in our cars, data center demand didn't go down; it went up. As we added more compute in our homes through Alexa and all sorts of devices, data center demand didn't go down; it went up. As we get inference at the edge, data center demand is going to go up. Models have to be trained, and the limitations exactly as described here, limitations of power delivery, of running on a battery, of the amount of storage and the memory capacity of the device that lives at the edge, mean that there's going to be good work there, but some work is going to go back to the data center. So I think it's always framed as either/or, when in fact the rise of the edge drives continued growth in the data center.

Sharon Goldman (14:44): Well, speaking of across the portfolio: critics
say that GPUs are a bit of an environmental disaster, AI chips and generative AI generally, whether it's the hundreds of thousands of GPUs that train these large foundation models or the amount of compute that it takes for inference. I've heard numbers like an OpenAI query taking ten times as much compute as a Google search query. What do you say about that? How do you ensure that your technology does not contribute to that? Does the speed of inference help? What else can help, and what are you doing?

Andrew Feldman (15:29): We have a wafer-scale solution, which means we built the biggest chip in the history of the industry, and we keep data on the chip, where it moves more quickly and uses less power. So we use on the order of a third the power of a GPU for a similar calculation. But even at a third, our industry is consuming a lot of power, and there's no way around that. I think there are a couple of things we have to do. One is we have to work at the algorithmic level to improve the use of the compute. Many, many machines multiply by zero; it's called sparsity. This is a waste. You don't need to do that; you don't need to spend the time and effort and power to do that, because you know the result before you do the calculation. There's a whole set of algorithmic techniques being explored in the industry right now to make the computation that pulls the power more efficient. And then we've got to get more benefit from our AI. As it moves to applications, whether in the identification of drugs or in other things, then the question isn't "oh, look, we're just pulling a lot of power," but how did that compare to other choices we would have made in developing this drug or finding this answer? So I think those are some of the things that come to mind.
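(The sparsity point, skipping multiply-accumulates whose weight is zero because the result is known in advance, can be sketched in a few lines. This toy Python example is illustrative only, not any vendor's actual implementation.)

```python
def dense_matvec(matrix, vector):
    # Multiplies every element, including the zeros: wasted work.
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def sparse_matvec(matrix, vector):
    # Skip zero weights entirely: we know 0 * v == 0 before computing it,
    # so only the nonzero entries cost a multiply-accumulate.
    return [sum(m * vector[j] for j, m in enumerate(row) if m != 0.0)
            for row in matrix]

# A weight matrix that is mostly zeros, as pruned neural networks often are.
w = [[0.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.5],
     [3.0, 0.0, 0.0, 0.0]]
x = [1.0, 2.0, 3.0, 4.0]

print(dense_matvec(w, x))   # [4.0, 6.0, 3.0]
print(sparse_matvec(w, x))  # same result, 3 multiplies instead of 12
```

Hardware that detects and skips the zeros gets the same answer for a fraction of the arithmetic, and therefore the power.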
Mark Papermaster (16:59): I couldn't agree more. In fact, I'll go further and say that achieving energy-efficient AI computing has fundamentally changed the way we solve the problem. I call it holistic design. None of us can think anymore about just one element of the computation chip we're developing. Andrew went to wafer-scale integration; we were among the first to adopt advanced packaging, and we do both horizontal 2.5D connectivity, our MI300 also has that, and 3D stacking. Why? Because there's less energy used as you're solving the problem and calculating on the models. But it is more than that: it is developing new algorithms and new math approximations, and it goes right up through the rack-level integration. That's why, if you look at our acquisitions, including the in-progress ZT Systems, we're not going to get into the systems business, but we need to optimize for power consumption all the way through the generation of AI racks and clusters. The game has changed.

Sandra Rivera (17:59): Yeah, and I
think, back to the adage that necessity is the mother of invention: clearly we will not have enough power, on current course and speed, to run all of the AI that the world wants to run, and so you are going to see a lot more focus and a lot more innovation in battery technology, in nuclear technology and deployments, and in trying to make green energy part of that equation. When you deploy AI, we know we need compute, we need data, and we need algorithms. I will underscore Andrew's point that compute will run its course, in terms of Moore's Law, or the slowing of Moore's Law, though you will still get advancements there. On data, there's so much being created, and it just continues to grow exponentially over time, particularly exploding at the edge. But the real breakthroughs are going to come both from innovation on the power side, in terms of battery or nuclear technology, and from algorithms, which can give you 10x, 100x, 1000x the type of breakthrough capability to get more out of that same platform of compute and data.

Sharon Goldman (19:08): Any other questions from the audience about any of this? I see a hand back there. Coming over.

Audience member (19:18): Hi, Alexei Oreskovic with Fortune. I'm curious about the US CHIPS
Act. From what I understand, it's a lot more expensive to manufacture when these fabs come online in the US versus Taiwan, so I was curious how that impacts your own plans for manufacturing leading-edge stuff.

Sandra Rivera (19:42): Well, maybe I'll just start out by saying that a globally diverse, resilient supply chain is good for everyone. I think we all learned that very painful lesson during the pandemic and the supply constraints that ensued. The CHIPS Act is really just one step of what I think will need to be many steps moving forward to level the playing field, because there are other parts of the world where governments support and subsidize companies to remain on that leading edge, bleeding edge, whether through R&D credits, through tax policy, or just through subsidies. But I'll go back to the fact that competition is good: competition drives more innovation, and customers like choice. The semiconductor industry is so crucially important to the US and the West that, with the CHIPS Act not just in the US but in Europe as well, I believe there's a lot of energy and conviction around continuing those types of policies, to ensure that we can stay on that leading edge of innovation and that customers have choice in terms of where they actually fabricate their semiconductors.

Mark Papermaster (20:56): I'll add on. Geographic diversity is
certainly key. We're the number one or two customer of TSMC in Arizona, and the yields are coming out equal to what they were in Taiwan. So, to your question of whether it's more costly: if you can get the yields equivalent, the cost amortizes over time. It is going to cost more, but in the blend of what customers buy, I don't think that's going to be a major factor, again, provided you can get the yields up, which TSMC has achieved. Then, more broadly, I'm a member of the Industrial Advisory Committee with the Department of Commerce, and I know that I and peers across the industry are all focused on generating more semiconductor design here. That's where the National Semiconductor Technology Center, which is now being amped up and giving grants, is driving more chip development and packaging development, really spawning a rebirth of semiconductor research, both here and with our allies.

Andrew Feldman (22:01): Just one final
thing. When you think about fabs, these are among the greatest things we make as humans. These are 30-billion-dollar factories with five-year, six-year lives. They're extraordinary things. That there were no cutting-edge fabs in the US was bad industrial policy, just bad industrial policy, and that we can now, through the work of the various government acts, bring these fabs to the US improves our whole industry.

Sharon Goldman (22:35): Well, thank you so much. We are out of time. This was great. Thank you so much to Sandra, Mark, and Andrew.
