TechTranscript
00:00 - In an unmarked office building in Austin, Texas...
00:03 - Come on in.
00:04 - There are two small rooms
00:05 where a handful of employees peer into microscopes,
00:07 soldering irons, or tiny tweezers in hand.
00:10 They're designing two types of microchips
00:12 made to power data centers,
00:13 and more recently, the AI boom.
00:15 - When we get this, what do we do?
00:17 We test it.
00:19 First thing that we do, we test it.
00:21 - But these chips aren't coming from NVIDIA,
00:23 AMD, or any of the chip companies
00:25 that have been hitting headlines and market milestones
00:27 since ChatGPT burst on the scene late last year.
00:31 I'm here inside Amazon's Austin, Texas chip lab,
00:33 where it makes its own custom microchips
00:35 to compete with those from Intel, NVIDIA, and other giants.
00:38 And it's actually a way for them to save money
00:40 and boost performance,
00:41 because it's one of the biggest buyers
00:43 of data center chips in the world.
00:45 AWS CEO Adam Selipsky told CNBC
00:48 that the chips that we saw here today
00:50 are powering large language models
00:52 and more for the AI boom.
00:53 - The entire world would like more chips
00:56 for doing generative AI,
00:58 and whether that's GPUs,
01:01 or whether that's Amazon's own chips that we're designing,
01:04 I think that we're in a better position
01:06 than anybody else on earth
01:07 to supply the capacity
01:09 that our customers collectively are gonna want.
01:11 - Amazon Web Services
01:12 is the world's biggest cloud computing provider
01:14 and the most profitable arm of the retail giant,
01:17 with an operating income of $5.4 billion in Q2.
01:21 Although that number has been down year over year
01:23 for three quarters in a row,
01:25 AWS still accounted for 70%
01:27 of Amazon's overall $7.7 billion Q2 operating profit,
01:31 giving it the cash it needs
01:32 for the huge undertaking that is custom silicon,
01:35 and a growing portfolio of developer tools
01:37 that could eventually propel Amazon
01:39 to the center of all the AI buzz.
01:41 - Many of our AWS customers have terabytes,
01:44 or petabytes, or exabytes of data
01:46 already stored on AWS.
01:48 And they know that that data
01:51 is gonna help them customize
01:53 the models that they're using
01:54 to power their generative AI applications.
01:57 - And yet others have acted faster
01:59 and invested more to capture business
02:01 from the generative AI boom.
02:03 Think Microsoft's reported $13 billion investment
02:06 in ChatGPT maker OpenAI,
02:08 and Google's chatbot, Bard,
02:09 followed by its $300 million investment
02:12 in OpenAI rival, Anthropic.
02:14 AWS's profit margins have historically been far higher
02:17 than those at Google Cloud,
02:19 but those margins have been narrowing.
02:21 And although AWS's growth is still impressive,
02:23 that's happening at a slower pace too.
02:25 - Amazon is not used to chasing markets.
02:30 Amazon is used to creating markets.
02:32 And I think for the first time in a long time,
02:34 they're finding themselves on the back foot,
02:36 and they are working to play catch up.
02:39 - CNBC sat down with top AWS executives and analysts
02:43 to ask about custom chips
02:44 and how it plans to make strides in generative AI
02:47 to catch Google and Microsoft,
02:49 and perhaps give a needed boost to AWS too.
02:51 - We end up with a package part like this, right?
03:09 And this is an actual machine learning accelerator
03:14 that was designed, and you can see Annapurna Labs on it.
03:17 - In 2015, Amazon bought Israeli startup, Annapurna Labs,
03:20 to accelerate its dive into the chip business.
03:23 In July, we went to AWS's Annapurna location in Austin
03:26 for an exclusive look at the chip design process
03:29 with lab director, Rami Sinno.
03:31 AWS also designs chips in Silicon Valley, Canada,
03:34 and at a larger lab in Israel,
03:36 then sends them off to be made by chip manufacturers
03:38 like TSMC in Taiwan.
03:40 AWS quietly started production of custom silicon
03:43 back in 2013 with a piece of specialized hardware
03:46 called Nitro, now the highest-volume AWS chip,
03:49 with more than 20 million in use and one in every AWS server.
03:53 Then at AWS's big annual customer conference,
03:55 re:Invent, in 2018,
03:57 Amazon launched its ARM-based server chip, Graviton,
04:00 to rival x86 CPUs from giants like AMD and Intel.
04:04 - It's probably high single digit
04:05 to maybe 10% of total server sales are ARM,
04:08 and a good chunk of those are gonna be Amazon.
04:11 So on the CPU side, they've done quite well.
04:13 - We're into our third generation of our Graviton chip.
04:16 That provides acceleration in terms of speed
04:19 and cost efficiency and power
04:21 for very general kind of web-based workloads.
04:24 - After announcing Graviton in 2018,
04:26 AWS announced its first AI-focused chips.
04:29 VP of product, Matt Wood,
04:31 showed us the two AI chips it has today.
04:33 - This big one here is called Trainium,
04:35 and this small one here is called Inferentia.
04:38 - Inferentia, Amazon's first AI chip, launched in 2019.
04:41 - Which we're on our second generation of,
04:43 which allows customers to deliver very, very low cost,
04:47 high throughput, low latency machine learning inference,
04:50 which is all the predictions of the,
04:52 when you type in a prompt into your generative AI model,
04:56 that's where all that gets processed
04:57 to give you the response.
04:58 With Inferentia, you can get about four times more throughput
05:03 and 10 times lower latency
05:06 than anything else available on AWS today.
05:09 - Trainium came on the market in 2021.
05:11 - All right, so this is a package part.
05:13 And then let me show you the other side.
05:15 What you see here is all the interfaces.
05:18 - Machine learning breaks down
05:19 into these two different stages.
05:21 So you train the machine learning models,
05:23 and then you run inference against those trained models.
05:26 And so we see a lot of customers that are interested
05:29 in training their own machine learning models
05:32 and their own generative AI models.
05:33 And so that's where Trainium really, really helps.
05:36 Trainium provides about 50% improvement
05:40 in terms of price performance,
05:42 relative to any other way
05:43 of training machine learning models on AWS.
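For context on the two stages Wood describes, here is a minimal, framework-level sketch in plain PyTorch (not AWS- or Trainium-specific): a model is first trained against labeled examples, then run in inference mode on new inputs.

```python
import torch
import torch.nn as nn

# A toy model: the specifics don't matter, only the two stages.
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Stage 1: training -- repeatedly adjust weights against labeled examples.
# This is the compute-heavy stage that chips like Trainium target.
x, y = torch.randn(64, 4), torch.randn(64, 1)
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Stage 2: inference -- run the trained model on new inputs, no gradients.
# This is the per-prediction stage that chips like Inferentia target.
with torch.no_grad():
    prediction = model(torch.randn(1, 4))
```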
05:46 - But for now, NVIDIA's GPUs are still king
05:48 when it comes to training LLMs.
05:50 AWS itself just launched new AI acceleration hardware
05:53 powered by NVIDIA H100s.
05:56 - Accelerating performance by up to 6X
05:59 and reducing training costs by up to 40%
06:02 as compared to EC2 P4 instances.
06:05 - NVIDIA chips have a massive software ecosystem
06:09 that's been built up around them over the last 15 years
06:11 that nobody else has.
06:12 The big winner from AI right now is NVIDIA.
06:15 That seems clear.
06:16 - Still, Amazon is not the only non-chip giant
06:18 getting into custom silicon.
06:20 Apple has its M-series of chips.
06:22 And a couple years before Amazon had AI chips,
06:24 Google launched its own cloud tensor processing units,
06:27 or TPUs.
06:28 - Nobody's at the same scale as Google.
06:30 Google's been deploying this stuff for like eight years.
06:32 My assumption is all of the hyperscalers,
06:34 whether they've announced it or not,
06:35 are all working on their own accelerators.
06:38 And by the way,
06:38 many are also working on their own CPUs as well.
06:41 - But when it comes to custom chips,
06:42 Microsoft is lagging behind Amazon and Google.
06:45 Microsoft has yet to announce the Athena AI chip
06:48 it's been working on,
06:48 reportedly in partnership with AMD.
06:51 - I think the true differentiation
06:52 is the technical capabilities that they're bringing to bear.
06:55 Because guess what?
06:56 Microsoft does not have Trainium or Inferentia.
06:58 - Generative AI is the current craze,
07:05 but Amazon was building out a broader AI infrastructure
07:08 for machine learning with dozens of services
07:10 long before it made chips or used them to train LLMs.
07:13 - Late 1990s, we were the first one
07:15 to actually leverage machine learning-based technologies
07:19 to reinvent our recommendation engines.
07:22 And we leveraged machine learning
07:23 to do things like a better product search
07:26 and then automating, leveraging robotics
07:29 and computer vision in our Amazon FCs,
07:32 our fulfillment centers,
07:33 to help products ship faster,
07:36 to actually reinventing
07:38 completely new customer experiences
07:40 with things like Amazon Alexa.
07:43 - But when OpenAI launched ChatGPT in November 2022,
07:46 Microsoft was suddenly dominating the AI headlines,
07:49 followed by Google's Bard in February.
07:52 Two months later,
07:52 Amazon announced its own large language models,
07:54 called Titan, along with Bedrock,
07:56 a cloud service to help developers enhance software
07:58 using generative AI.
08:00 - I think ChatGPT and Microsoft's
08:02 rollout of their initiatives was so fast,
08:04 so aggressive, so quick,
08:05 it caught a lot of the market participants flat-footed.
08:08 Amazon is trying to educate the market
08:11 in order to close the gap.
08:12 But frankly speaking,
08:14 it's going to take a couple of months.
08:15 - Let's rewind the clock even before ChatGPT.
08:17 It's not like after that happened,
08:19 suddenly we hurried and came up with a plan
08:21 because you can't engineer a chip in that quick a time.
08:26 And I think it actually accelerated
08:28 some of the customer conversation
08:29 and their keenness to actually move forward
08:32 with generative AI deployments.
08:34 - Meta also recently announced its own LLM, Llama 2.
08:38 The open-source ChatGPT rival
08:39 is available on Microsoft's Azure cloud platform.
08:42 Now, a leaked internal email shows Amazon CEO Andy Jassy
08:46 is directly overseeing a new central team
08:48 that's building out expansive large language models.
08:51 But so far, AWS has focused on tools
08:54 instead of building a ChatGPT competitor.
08:56 - So if you look at the Bedrock strategy
08:58 that they are trying to focus on,
08:59 they are betting the farm on the fact that enterprises
09:03 might not necessarily be building out their own GPT models.
09:07 - Bedrock gives AWS customers access to LLMs
09:10 made by Anthropic, Stability AI, AI21,
09:13 and Amazon's own Titan.
09:15 - Titan is actually a family of foundational models.
09:17 We have text-based models,
09:19 which are great for generative text,
09:21 so creating marketing copy and advertising,
09:24 chatbots, those sorts of things.
09:25 And then we have an embedding model,
09:27 which is great for personalization
09:29 and ranking those sorts of use cases.
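As a rough illustration of those two Titan model types, here is a hedged sketch using the boto3 Bedrock runtime client. The model IDs, request bodies, and response shapes are best-effort assumptions drawn from public AWS documentation, not from this video, and running it requires AWS credentials with Bedrock access.

```python
import json
import boto3

# Assumed client name and region; requires AWS credentials with Bedrock access.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Text generation, e.g. drafting marketing copy from a prompt.
text_response = client.invoke_model(
    modelId="amazon.titan-text-express-v1",  # assumed Titan text model ID
    body=json.dumps({"inputText": "Write a tagline for a reusable water bottle."}),
)
text_body = json.loads(text_response["body"].read())
print(text_body["results"][0]["outputText"])  # assumed response shape

# Embeddings: a numeric vector usable for personalization and ranking.
embed_response = client.invoke_model(
    modelId="amazon.titan-embed-text-v1",  # assumed Titan embeddings model ID
    body=json.dumps({"inputText": "trail running shoes"}),
)
vector = json.loads(embed_response["body"].read())["embedding"]  # assumed key
```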
09:31 (upbeat music)
09:34 - Amazon says its AI products are being used
09:41 by numerous customers, like Philips, 3M,
09:44 Old Mutual, and HSBC.
09:46 In the Q2 earnings call,
09:47 it said a very significant amount of AWS business
09:50 is now driven by AI
09:51 and the 20-plus machine learning services it offers.
09:54 - We don't believe that one model
09:56 is going to rule the world.
09:58 We understand we want our customers
10:01 to have the state-of-the-art models
10:04 from multiple providers
10:05 because they are going to pick the right tool
10:06 for the right job.
10:08 - One of Amazon's new AI offerings is AWS HealthScribe,
10:11 a service unveiled in July
10:12 to help doctors automatically draft
10:14 patient visit summaries and more.
10:16 Another big tool in the AWS AI stack is CodeWhisperer.
10:19 - CodeWhisperer generates code recommendations
10:22 from natural language prompts
10:24 based on contextual information,
10:26 and it's a tool that helps customers
10:29 understand the context of the task.
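To make that workflow concrete: with an assistant like CodeWhisperer, a developer typically writes a natural-language comment and the tool suggests an implementation in place. The snippet below is hand-written to illustrate that comment-as-prompt pattern; it is not actual CodeWhisperer output.

```python
from datetime import datetime

# Prompt, written as a comment:
# "Function to parse an ISO 8601 date string and return the day of the week."
# A completion in the spirit of what such a tool might suggest:
def day_of_week(iso_date: str) -> str:
    return datetime.fromisoformat(iso_date).strftime("%A")

print(day_of_week("2023-08-12"))  # -> "Saturday"
```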
10:31 - Participants who used CodeWhisperer
10:34 were 27% more likely to complete their task successfully,
10:39 and they did it 57% faster on average.
10:43 - Last year, Microsoft also reported productivity boosts
10:46 from its coding companion, GitHub Copilot.
10:48 And then there's SageMaker,
10:51 Amazon's machine learning hub
10:52 that offers algorithms, models, and more.
10:55 - It's a kind of an aircraft part.
10:56 In one example there,
10:58 it was 45% lighter for a particular carrier.
11:02 - In June, AWS also announced
11:04 a $100 million generative AI innovation center.
11:07 - We have so many customers who are saying,
11:09 "I want to do generative AI,"
11:10 but they don't all necessarily know what that means for them
11:13 in the context of their own businesses.
11:15 And so we're going to bring in solutions architects
11:17 and engineers and strategists and data scientists
11:22 to work with them one-on-one.
11:24 (upbeat music)
11:27 - When companies are choosing between Amazon,
11:28 Google, and Microsoft for their generative AI needs,
11:31 some may choose Bedrock
11:32 because they're already familiar with AWS,
11:34 where they run other applications and store a ton of data.
11:37 - If you took the data that we have in Amazon S3
11:40 that's stored on devices like this,
11:43 you stack them up, one on top of another,
11:47 it would take you all the way
11:49 to the International Space Station
11:51 and almost all the way back,
11:53 and that is a lot of data.
11:56 - At the end of the day,
11:57 Amazon does not need to win headlines.
11:58 Amazon already has a really strong cloud install base.
12:01 All they need to do is to figure out
12:02 how to enable their existing customers
12:04 to expand into value creation motions using generative AI.
12:09 - So how many AWS customers are actually using it
12:11 for machine learning?
12:12 - We have over a hundred thousand customers today
12:15 that are using machine learning on AWS,
12:18 many of which have standardized
12:19 on our machine learning service, which is called SageMaker,
12:22 to build, train, and deploy their own custom models.
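For a sense of what that build-train-deploy flow looks like, here is a hedged sketch with the SageMaker Python SDK. The container image, IAM role, and S3 paths are hypothetical placeholders, and the instance types are only plausible choices (trn1 is Trainium-backed, inf2 is Inferentia-backed).

```python
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()

# Build: point SageMaker at a training container (hypothetical image and role).
estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",
    role="arn:aws:iam::123456789012:role/MySageMakerRole",
    instance_count=1,
    instance_type="ml.trn1.2xlarge",                # Trainium-backed training instance
    output_path="s3://my-bucket/model-artifacts/",  # hypothetical bucket
    sagemaker_session=session,
)

# Train: launch a managed training job against data already in S3.
estimator.fit({"train": "s3://my-bucket/training-data/"})

# Deploy: stand up a real-time inference endpoint for the trained model.
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.inf2.xlarge",  # Inferentia-backed inference instance
)
```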
12:25 - But in reality, that's not a big percentage
12:27 of AWS's millions of customers.
12:29 Although most aren't tapping into it for AI yet,
12:31 that could change.
12:33 - What we are not seeing is enterprises saying,
12:35 oh, wait a minute, Microsoft is so ahead in generative AI,
12:38 let's just go out and let's switch our infrastructure
12:41 strategies, migrate everything to Microsoft.
12:43 That is not happening, because at the end of the day,
12:45 even if you're trying to create a chatbot,
12:47 if you're already an Amazon customer,
12:48 chances are you're going to exploit
12:50 the Amazon ecosystem quite extensively.
12:52 - How quickly can these companies move
12:56 to develop these generative AI applications
12:58 is driven by starting first on the data they have in AWS
13:03 and using compute and machine learning tools
13:08 that we provide.
13:09 Imagine you're cooking dinner and you're using a new recipe.
13:15 It is a lot faster to start with ingredients
13:18 that you already know that have been cut
13:20 and prepared to go ahead and put together the recipe
13:23 than if you have to research all the ingredients,
13:26 get familiar with them, and then learn
13:28 how you're going to put them together
13:29 and cook with them, right?
13:31 That's what AWS customers are doing.
13:33 They have all the different ingredients
13:35 that they're familiar with and they know how to use,
13:37 and whether that's storage or compute,
13:40 or it's machine learning tools like Amazon SageMaker
13:44 and Amazon Bedrock, and they're putting it together
13:46 that much faster.
13:48 - And as generative AI continues to accelerate,
13:50 all the big players are scrambling to establish
13:52 how to use these tools responsibly and securely.
13:55 - Can't tell you how many Fortune 500 companies
13:57 I've talked to who have banned ChatGPT.
14:00 So with our approach of generative AI
14:04 and our Bedrock service, anything you do,
14:07 any model you use through Bedrock
14:09 will be in your own isolated,
14:12 virtual private cloud environment.
14:14 It'll be encrypted.
14:15 It'll have the same AWS access controls.
14:19 - Selipsky joined six other AI players
14:21 at the White House in July to sign pledges
14:23 to ensure that AI tools are secure.
14:25 - There are open problems that still need to be solved,
14:28 especially when you're trying to deal
14:29 with highly regulated industries
14:30 in financial services, healthcare, and beyond.
14:33 We still do not have any well-thought-out
14:36 regulatory guardrails around data protection,
14:40 private information protection,
14:41 and responsible AI capabilities in the space.
14:44 - There's also national security concerns.
14:46 The Biden administration has proposed new rules
14:49 that would require US cloud providers
14:50 like Amazon and Microsoft to seek government permission
14:53 before providing China with cloud computing services
14:56 using AI chips.
14:58 But for now, there's no slowdown in sight
15:00 for the development of new generative AI applications
15:02 or the chips needed to power them,
15:04 and that race is just beginning.
15:06 - So let's say that we're three steps into a race
15:09 and we start asking, "Well, who's ahead?
15:11 Who's behind? How do the runners look?"
15:13 But then you look up and you realize that it's a 10K race.
15:17 And then you realize it's the wrong question to ask.
15:20 Who's where three steps into the race?
15:23 The real question is what's gonna happen
15:24 at the end of the 10K race?
15:26 In this case, we're just at the very dawn of generative AI.
15:31 (upbeat music)