Andrew Feldman, Co-founder and CEO, Cerebras Systems, Mark Papermaster, Executive Vice President and Chief Technology Officer, Advanced Micro Devices (AMD), Sandra Rivera, Chief Executive Officer, Altera, Sharon Goldman, Fortune
Transcript
00:00Welcome everyone. So great to talk to you today. Mark, I'm actually going to turn
00:05right to you and I'm going to start with what is certainly the gorilla in the
00:09room and that is NVIDIA. Is there a way to dethrone NVIDIA or in AMD's case to
00:15at least close the gap? Well great question and you know Sharon we've been
00:20competing with NVIDIA for decades because we've been in the graphics
00:23business through our acquisition of ATI. But when you look at the story of AMD
00:28it's twofold. We started by focusing on CPUs to grow the revenue of the company
00:33but what people don't realize is under the cover we were working on
00:37heterogeneous computing and our open software stack, ROCm, underneath. Now
00:42we are competitive. We went from almost zero AI revenue in 2023 to
00:47five billion projected this year. And so we're strong, actually in leadership, in
00:53data center GPU inferencing and watch out on training. We're bringing the
00:57competition right to NVIDIA in training. The market needs competition
01:02and that's what we're focused on. Andrew and Sandra so this is a complex
01:08landscape and the competitive landscape is very nuanced whether it comes to
01:13training and inference and all sorts of areas of the semiconductor industry. I
01:17want to ask you also Andrew about the competitive landscape from other
01:21startups, hyperscalers and then Sandra I'd like to hear about the competitive
01:25space on your end. Andrew? Well I think there's so many players in the market
01:30because the markets are darn big. I think we have NVIDIA, we have AMD
01:37under Lisa and Mark's leadership doing great things. But you also have the
01:42hyperscalers seeing opportunity explode and wanting to do their own chips. You
01:47have a collection of startups I'd like to think led by Cerebras doing
01:52pioneering work. I think when you have this sort of activity it's because you
01:58have an explosion of demand and we're seeing it on the training side
02:03we're seeing it on the inferencing side and I think truth be told none of us can
02:08keep up. It's kind of a Hunger Games out there. Sandra what about you at Altera?
02:16Well I'll start by saying that it is still early days of AI and so there's so
02:21much innovation ahead of us and it really isn't you know how do you take on
02:26NVIDIA it is how you have the best device for the workload that you're
02:32driving and to Andrew's point the workloads the segments the customers are
02:36so diverse that it is not a one-size-fits-all and that's why you
02:41have CPUs and GPUs and accelerators that are bespoke and dedicated to AI and even
02:48in my field, field-programmable gate arrays, FPGAs, which are highly
02:52customizable highly flexible and highly valued in markets that are changing very
02:58dynamically very quickly like AI like cybersecurity certainly wireless
03:03standards and those types of applications lend themselves to devices
03:07that are highly programmable and flexible to the workload so I think
03:12there's a lot of opportunity it's still early days lots of innovation to
03:17come and you know Mark and I were commenting before the session on the
03:21fact that competition is good makes you sharper makes you more focused and it
03:26certainly addresses the very very broad customer requirements that are out there.
03:30But Mark what would you say I mean NVIDIA is not just any competitor I mean
03:34like I had an analyst tell me that NVIDIA doesn't have any real predators
03:38in the wild do you see that or do you see that competition is growing and that
03:43there's really a place to close the gap and find that opening even more so for
03:48your company? Yeah I don't believe that whatsoever that's why we're making the
03:52gains that we are we're actually getting tremendous pull you think about the
03:56placements we have with first-party applications and several hyperscalers
04:01with our GPUs you saw Microsoft announced that they're in production
04:04with our MI300 GPU Meta announced that they're running the Llama 3.1 405
04:12billion parameter model all the inferencing's running production on AMD
04:16and why is that because the market demands competition and so we're getting
04:22a tremendous pull we've not seen a moat again we're a GPU just like NVIDIA so if
04:27code was written in low-level CUDA it can be ported over in a straightforward way
04:31and people are moving to Triton and JAX and other higher-level
04:38vendor-independent, you know, nonspecific coding and that's gonna really open up competition as well
04:43nobody wants to be locked into a particular hardware's code. That's right,
04:47yeah and I think this point on abstraction away from the underlying
04:51logic device is really powerful because most data scientists most subject matter
04:56experts most of the programmers are really not at those low levels they're
05:03highly abstracted and they're just trying to get their workload deployed in
05:07the market at scale as quickly as possible so very much this is not so
05:11much what is the actual is it a GPU is it a CPU is it an AI accelerator an
05:17FPGA it's very much what is the system and the platform and then what is the
05:21software that enables that so that developers can actually be productive
05:25very very quickly. Right, well this is the chips panel at Brainstorm AI so I
05:29know folks will have questions out there and I'm gonna come to you in just a couple
05:33minutes so think it over but right now Andrew so IPO when
05:40I last wrote about Cerebras and the potential IPO back in the beginning of
05:45October where are things at now can you share? I cannot share this is gonna be
05:51very boring but I would like to. Mark's an investor Mark would like to know. I
06:02would say the following Mark's GPUs are running faster than NVIDIA's our
06:10accelerators are running in many instances 75 times faster than NVIDIA's
06:16and you can go to Artificial Analysis and check daily scores on their
06:21benchmarks I think others not just ours others are running much faster as well I
06:28think what Mark said is sort of fundamental and it's not just that it's
06:34better for us that there's competition I think it's better for you that there's
06:39competition it means that the AI in your applications will run faster and cost
06:43less and there aren't any markets where that proposition hasn't held and so I
06:50think that's the fundamental thing it's good for everybody if there's competition
06:54in AI if there's innovation at the underlying levels and I think
06:59we're seeing that across the board. This battle for inference speed is really
07:04interesting every week there seems to be a new announcement
07:07including from Cerebras do you see that as a way to really diversify your
07:11customer base going forward? I think in 2017 OpenAI published a paper which
07:18sort of predicted the vast rise in training compute and they called it the
07:24scaling laws and then in September they published a paper that found some of the
07:29same rules apply in inference which means we're gonna use a huge amount of
07:32inference compute in the years ahead an absurd amount of inference compute and
07:37we're gonna do inference not once but many times on the same query and if you
07:42can run fast you can deliver higher quality results in less time and so
07:47speed will be fundamental to the delivery of high quality inference in
07:54a reasonable amount of time and so I think that's why you're seeing this
07:57extraordinary race to be fast at inference a need for speed you know I
08:04think when I look out in the room I can see that there's a little bit of
08:09gray hair here and you remember before we had broadband and trying to
08:14you know use the internet before broadband was a disaster and once we got
08:19broadband all of a sudden you had new applications you had streaming you had
08:23all these things that were fun and the engagement was high and I think
08:28that's what's happening right now with AI is that as you get faster you
08:32move into sort of the broadband era of AI inference and things are
08:37engaging they're responsive and they're higher quality.
08:41Sandra obviously last week's news about Intel's CEO resigning was big it was
08:47surprising and of course a lot of people want to know what that means for Altera
08:51I was wondering if you could share your thoughts on the future of
08:55Altera does it change your strategy in any way? Yeah our plan has been to go out
09:01for an equity stake Intel is still planning to sell an equity stake in
09:05Altera and we're in the middle of that process now with a lot of great
09:07interest which is encouraging and we still plan for an IPO in 2026 that's
09:13been the plan and it really hasn't changed and the news from last
09:18week was you know sad on so many levels but what I try to keep the company and
09:24employees focused on is the things that we control because we really don't
09:27control any of that we control our commitments to customers our product
09:31execution our innovation and that's what we're focused on but the plan is the
09:36plan and it really has not changed. Okay I'd love to see if there are any
09:40questions here okay yes do we have a mic person here we are and just state
09:47your name and your affiliation thank you. Yeah hi
09:51Pankaj Katia, ex-Intel and Qualcomm. Andrew just a clarification is Cerebras
09:57training or inference or both? We do both. So my question is maybe Mark and
10:05Sandra AMD is on both sides of the wire right data center and inference same
10:15thing as ex-Intel or Intel fundamentally what advantage does that give an AMD and
10:24Intel and maybe Cerebras because when we think about NVIDIA we think primarily
10:30about the data center on the edge they have graphics but they don't have the
10:36full SOC if you would right so fundamentally how does that better
10:44position the three companies end to end? I can start I think it's important to
10:52even think about the analogy that Andrew just stated it was back to the internet
10:57and thinking when all the applications came out and it went sort of vertical
11:01sector by vertical sector and it changed its technology needs were different
11:04that's what you're gonna see as AI explodes so of course there's training
11:09and there's different requirements for these you know these foundational large
11:13language models today they're all on a GPU and as I said earlier we're now
11:17bringing competition there but there's an advantage there when you want to run
11:20inference on those largest foundational models because now as a GPU you match up
11:26with the math constructs and how the transformers were deployed etc so there's
11:30an ease and a facility of getting a higher performance on those
11:35foundational models but inferencing now will span out across sectors you
11:41know it'll be different and unique in enterprise it'll be different on the edge that's
11:45why we're very focused on diversity in our portfolio we've enabled AI across
11:49that portfolio our entire portfolio our CPUs are you know you look at our AI PCs
11:55and the AI enablement across the CPU GPU and neural net engines and all the way
12:00through our embedded and FPGA roadmap so I think the advantage
12:05is what we're gonna do is simply ease customer adoption with a unified
12:11software stack on top and just ease the optimization but it's gonna be wild it
12:16is early days and I think you know yet to come is the myriad of inference
12:21applications across that strata. Yeah and I'd just add that back to the right
12:27device for the right workload and that typically is a decision point around
12:32power performance area and cost and when you look at the tremendous cost that's
12:39required to build these foundational models it probably leads you to certain
12:43types of devices that can really do that workload and back to kind of the point
12:47you were making on training but when you look at the inference workloads and
12:52particularly data wants to be computed near where it's created
12:58or where it's deployed and typically that is happening much more broadly at
13:05the edge what we call edge computing and those workloads really have finite power
13:11performance area requirements whether you're in a manufacturing location an
13:14industrial robotics type of implementation an autonomous vehicle you
13:19really don't have the luxury of you know hundreds of watts I mean in many
13:25cases it's single-digit watts if not lower and so that's why back to it is
13:31not a one-size-fits-all and much of the growth and opportunity is happening at
13:35the edge again where the data is created and consumed and that's why you see a
13:40diversity of devices for running those workloads whether it's a CPU GPU FPGA
13:45an AI accelerator. Just one small thing I think it's a mistake to think that
13:50compute at the edge comes at the expense of the data center that's just
13:54not been the experience over the last 30 years as we got more compute in our
13:59cell phones data center demand for compute didn't go down it went up all
14:04right it went in exactly the opposite direction as we got more compute in
14:07our cars data center demand didn't go down it went up as we added more compute in
14:10our homes through Alexa and all sorts of devices data center demand didn't go
14:14down it went up as we get inference at the edge data center demand is going to
14:18go up models have to be trained the limitations exactly described here
14:23of power delivery running on a battery the amount of storage and the
14:29memory capacity of the device that lives at the edge mean that there's going to
14:33be good work there but some work's going to go back to the data center and so
14:37I think it's always framed as either or and in fact the rise of the edge drives
14:44continued growth in the data center well speaking of across the portfolio critics
14:51say that you know GPUs are a bit of an environmental disaster you know AI chips
14:58and generative AI generally whether it's the hundreds of thousands of GPUs
15:05that train these large foundation models or the kind of amount of compute
15:12that it takes for inference you know I've heard numbers like an OpenAI
15:16query takes ten times as much compute as a Google search query what do you
15:22say about that how do you ensure that your technology does not do that does
15:26not contribute to that does the inference does the speed of inference
15:29help with that what else can help and what are you doing? We have a
15:35wafer-scale solution and that means we built the biggest chip in the history of
15:39the industry and we keep data on the chip where it's moved more quickly and uses
15:45less power so we use on the order of a third the power of a GPU for a similar calculation
15:50but even at a third our industry is consuming a lot of power and I mean
15:55there's no way around that and I think there are a couple things we have to do
16:00one is we have to work at the algorithmic level to improve the use of
16:07the compute right many many machines multiply by zero it's called sparsity
16:13this is a waste you don't need to do that you don't need to spend the time
16:16and effort and power to do that because you know the result before you do the
16:19calculation there are a whole set of algorithmic techniques that are being
16:23explored right now in the industry to make the computation that pulls
16:28the power more efficient all right and then we got to get more benefit
16:33from our AI all right and as it moves to have applications whether it be
16:39in the identification of drugs or in other things then the question
16:44isn't oh look we're just pulling a lot of power but how did it compare to other
16:48choices we would have made in the development of this drug or to find this
16:54answer and so I think those are some things that come to my mind. I couldn't
16:59agree more in fact I'll go further to say that to have energy-efficient AI
17:04computing it's fundamentally changed the way that we solve the problem I call it
17:09holistic design we can't anymore any one of us think about just one element of
17:14that computation chip that we're developing you know Andrew went to
17:20wafer-scale integration we were the first to adopt on packaging we do both
17:24horizontal 2.5D connectivity our MI300 also has that
17:29and 3D stacking why because there's less energy as you're solving the
17:33problem and you're calculating on the models but it is more than that it is
17:37developing new algorithms new math approximations and it goes right up
17:41through the rack level integration that's why if you look at our
17:46acquisitions including in progress ZT Systems we're not going to get in the
17:50systems business but we need to optimize for power consumption all the way
17:54through the generation of AI racks and clusters the game has changed. Yeah and I
17:59think that you know back to the adage of necessity is the mother of
18:03invention clearly we will not have enough power to run all of the AI that
18:08the world wants to run on current course and speed and so you are going to see a
18:13lot more focus and a lot more innovation in battery technology in nuclear
18:16technology and deployments and trying to focus on green energy as part of that
18:24equation and when you deploy AI we know we need compute we need data and we
18:32need algorithms and you know I will underscore Andrew's point that compute
18:37will run its course in terms of Moore's Law or slowing down Moore's Law but you
18:43will get advancements there data there's so much data being created and
18:47it just continues to grow exponentially over time particularly exploding at the
18:52edge but the real breakthroughs are going to come in both innovation from a
18:56power perspective in terms of either battery or nuclear technology and from
19:03algorithms which can give you 10x 100x 1000x the type of breakthrough
19:08capability to get more out of that same platform of compute and data. Any other
19:15questions from the audience about any of this I see a hand back there coming
19:18over. Hi Alexei Oreskovic with Fortune I'm curious about the US CHIPS
19:25Act and from what I understand it's a lot more expensive to manufacture when
19:31these fabs come online in the US versus Taiwan and so I was curious how
19:36that impacts your own plans for manufacturing leading-edge stuff. Yeah
19:42well maybe I'll just start out by saying that you know a globally
19:46diverse resilient supply chain is good for everyone I think we all learned that
19:50very painful lesson during the pandemic and a lot of the supply
19:55constraints that ensued and the CHIPS Act is really just one step of I
20:02think what will be required to be many steps moving forward to level the
20:07playing field because there are you know other parts of the world where
20:11governments support and subsidize whether it's through R&D credits
20:16whether it's through tax policy whether it's just subsidies for companies to
20:21remain on that leading edge bleeding edge but I'll just go back to the fact
20:26that competition is good competition drives more innovation customers like
20:30choice and the semiconductor industry is so crucially important to the US and the
20:36West that the CHIPS Act not just in the US but in Europe as well I believe that
20:41there's a lot of energy and conviction around continuing with those types of
20:47policies to ensure that we can stay on that leading edge of innovation and that
20:52customers have choice from the perspective of where they actually
20:56fabricate their semiconductors. I'll add on the geographic diversity is
21:01certainly key we're the number one or two customer of TSMC in Arizona the yields
21:08are coming out equal to what it was in Taiwan and so your question is it more
21:14costly that if you can get the yields equivalent the the cost amortizes over
21:20time and so it is going to be more but that you know in the blend of customer
21:25buy I don't think that's going to be a major factor again if you can get the
21:30yields up which TSMC has achieved and then more broadly I'm a member of the
21:34Industry Advisory Council with the Department of Commerce and I know myself
21:37and peers across the industry are all focused to generate more
21:42semiconductor design here and that's where the National Semiconductor
21:46Technology Center which is now being amped up and giving grants driving more
21:50chip development packaging development really spawning a rebirth of
21:55semiconductor development research both here and with our allies just one final
22:01thing when you guys think about fabs these are about the greatest things we
22:05make as humans these are 30 billion dollar factories that have five-year
22:09six-year lives they're extraordinary things and that there should not be
22:16cutting-edge fabs in the U.S. was bad industrial policy that's just bad
22:21industrial policy and that we can now through the work of the various
22:26government acts bring these fabs to the U.S. improves all of our
22:35industry. Well thank you so much we are out of time this was great thank you so
22:39much to Sandra Mark and Andrew