• last year
It feels like overnight, everyone was talking about artificial intelligence. But why? This panel of industry insiders at Imagination In Action’s ‘Forging the Future of Business with AI’ Summit breaks down what factors made this moment in AI happen.

Transcript
00:00 Awesome.
00:01 Thank you so much.
00:03 It's great to be here today and to talk a little bit about generative AI futures.
00:08 You know, this really feels like an unprecedented time in the machine learning space and the
00:13 AI world.
00:14 I started working on this around 2009 and it's never been cool before.
00:19 So it's very, very interesting to have relatives back in Texas want to talk to me about that
00:24 generative AI stuff.
00:27 So excited to be here today to have a chance to kind of look under the hood a little bit
00:31 at some of the companies that are building out this generative AI future and to talk
00:37 about some things in the open source world and the large versus small model world.
00:43 And then also some of the implications for on-device machine learning.
00:47 Excellent.
00:48 And so with that, I'm going to go ahead and kind of introduce our great panelists.
00:55 My name is Paige.
00:56 I work on generative AI at Google, particularly our large language models, our large generative
01:02 models like Gemini.
01:04 I have here with me today Adi.
01:06 Do you want to give a brief introduction?
01:08 Sure.
01:09 So my name is Aditi Joshi.
01:11 I work at Google.
01:12 I've been there for five years.
01:14 I focus on open source in our core ML group.
01:18 Excellent.
01:19 And Kevin Schatz, Cameron?
01:21 Yeah.
01:22 So another Googler.
01:23 So I work in Google Cloud.
01:24 I lead a team in our conversational and generative applied AI organization focused on agentic
01:30 type things.
01:31 Awesome.
01:32 And Sundararajan?
01:33 I'm at Microsoft.
01:34 I lead a team on AI incubations here down the street in Kendall Square.
01:39 Wonderful.
01:40 And then also Marcus.
01:41 Thank you.
01:42 Thank you for hosting me.
01:43 Yeah, Marcus Ruhl.
01:44 I lead the Intel Developer Cloud and I'm building out very large supercomputers that are focused
01:48 on hosting various startups and established companies that are building a variety of large language
01:53 models.
01:54 Prior to that, I was at Nvidia, built out Nvidia's GPU cloud infrastructure.
01:58 Excellent.
01:59 And so are you based locally as well?
02:00 I'm from Silicon Valley.
02:02 Oh, gotcha.
02:03 So we have a nice mix of Silicon Valley and Boston coming here today.
02:07 And with that, I am going to get started.
02:11 I also was not trusting the Wi-Fi connection given how many people we have here in the
02:16 audience and how many people we have in all of the hallway conversations outside.
02:20 So we'll be reading some of the questions.
02:22 And if we have time, we might take a couple of audience questions.
02:25 But let's see how quickly we can get through these ideas.
02:32 So first off, we've already discussed generative AI is kind of at a peak in the hype cycle
02:38 that is unprecedented.
02:40 Why do y'all think this might be the case today?
02:45 And then Kevin, do you want to start?
02:46 Yeah, I'm happy to start.
02:47 I think, you know, in a past life, I spent a lot of time on this concept of digital transformation
02:52 that was like the big sledgehammer from top down of go transform, right?
02:58 Reorganize yourself, be agile, and all these other things.
03:02 And I think, obviously, it didn't work.
03:05 I think that there's the reason it's at a hype cycle is because I think there's a subtlety
03:10 to how it's getting adopted and integrated into applications and into general enterprise
03:15 tasks and just everyday life.
03:17 And it's not this big sledgehammer.
03:19 It's just all of a sudden, you start seeing these trailing indicators of improvement and
03:23 efficiency or new things getting created or rate of adoption of new applications and services.
03:29 And you kind of take that step back, like, why is that happening?
03:31 It's like, oh, gen AI is under the covers, right?
03:35 Software development, whatever we want to talk about.
03:37 I think it's so integrated into tasks that the bottoms up adoption is leading to a set
03:45 of transformations and subtlety in just about every job role.
03:49 And it's kind of neat to watch.
03:53 >> I think I can go next.
03:55 I think the obvious answer is just compute.
03:57 We have compute that's available to us, which allows us to do so much more than we could
04:02 have ever done before.
04:04 But I think what we're seeing now, and this was my lightning talk this morning, is how
04:08 it's really changed how we launch products.
04:11 And how has the role of AI product manager really changed along the way is the ability
04:16 to be able to use these models, to be able to use the tools that are available through
04:21 the use of AI for us as TPM copilots, your technical program manager.
04:26 So when you're going from your MVP, your minimum viable product, to your product market fit
04:30 from zero to one, it's a series of experiments that you're going through.
04:34 That rapid experimentation is so critical in making sure that a product launch is successful.
04:39 I think that's what we're really going to see, our ability to transform the way that
04:44 we launch our products and how fast we launch those products, incorporate the feedback that
04:49 we're getting early into the process, and then keep iterating on that at a scale that's
04:54 been I think unprecedented.
04:55 I think that's just about to get started.
04:59 I love the discussion around generating more interesting experiences in existing applications.
05:09 Some of the places where we've seen the broadest adoption of AI features are in things like
05:14 AI for software development.
05:16 So helping with things like code completion or code explanation in the context of an IDE,
05:21 helping with burning down bugs or code review, and then also in our developer product or
05:27 workplace productivity tools.
05:28 So things like Microsoft Word or Google Docs, Sheets, Excel, really meeting people where
05:35 they are and trying to accelerate the tasks that people would have been doing anyway.
05:40 So it's really, really cool to see.
05:42 Sundararajan?
05:43 I would say what Kostlak called it this morning, it's a ChatGPT moment.
05:50 I think ChatGPT really democratized and showcased the ability for democratic access not only to
05:57 technology but also to opportunities.
05:59 And it's also the fastest growing product in history of all products.
06:03 So experimentation, rapid prototyping, all proved in a matter of weeks to months.
06:10 But that's a combination of all the things that we are talking about.
06:13 The technologies have been evolving, bubbling up to this moment.
06:16 The mathematics have been bubbling up to this moment.
06:19 And now the opportunity and access has basically become overnight available, if not completely
06:26 accessible in terms of cost yet.
06:29 And maybe this is where the GPUs come in.
06:30 Indeed.
06:31 Indeed.
06:32 We like people who use a lot of compute.
06:35 Yeah, so I think at a high level, I think I'm always careful to try and predict the
06:38 peak of the hype cycle.
06:39 I've been around the block for a few years now, and I'm not sure it's the peak yet.
06:45 I think anytime you take the analogy back to the late '90s, when the cost of communication
06:51 went down to near zero.
06:52 I grew up at a time when I remember making a phone call and I was worried about
06:55 how much it was costing me.
06:57 My parents would scold me for being on the phone for too long.
06:59 Growing up in Europe, a bit behind the US in that respect.
07:03 But what's happening right now is the cost of certain compute, of doing things in large
07:07 language models, all of a sudden certain tasks that you just couldn't afford.
07:09 Maybe the compute was available before, but you just couldn't afford it.
07:13 And all of a sudden, that compute is now available at the cost where it's just super attractive
07:17 to automate certain functions.
07:19 And I think it's not just the Moore's Law that, yes, you get more compute just from
07:23 the hardware, but it's just the rate at which these models are evolving and what you can
07:27 do with the same amount of compute power is just mind boggling.
07:30 I was just listening into a talk by Naveen Rao last week at a vision conference.
07:35 He was talking about a factor four of improvement per year, year over year.
07:39 So if you think about Moore's Law, we've been trying to squeeze more out of the physical
07:43 hardware, out of the compute, but you're getting twice the compute power every 18 months.
07:47 What we're seeing now is that, yes, you still get more out of that compute, but the speed
07:51 at which these algorithms are evolving is something like a factor four, which means
07:55 that if you're building a model today and you're spending $100 million building it,
07:58 you just only care about reducing the cost, but it's also the speed, the amount of effort
08:03 it takes to build this maybe in a year from now could be something like a quarter of that.
08:06 So the economics are just so mind boggling, and just the rapid, the pace at which things
08:10 are evolving is just so mind boggling.
08:12 It's incredible.
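As a rough illustration of the arithmetic in the remarks above, here is a minimal Python sketch. It takes the round numbers quoted in the talk at face value (compute doubling every 18 months, algorithms improving by a factor of four per year, a $100 million model build) and assumes simple compounding. The function names are hypothetical, and nothing here is a measured result.

```python
def improvement_factor(rate: float, period_years: float, elapsed_years: float) -> float:
    """Compound gain after `elapsed_years` if capability multiplies by `rate` every `period_years`."""
    return rate ** (elapsed_years / period_years)


def projected_cost(cost_today: float, factor: float) -> float:
    """Cost of the same work after efficiency has improved by `factor`."""
    return cost_today / factor


# Figures as quoted in the talk, treated as assumptions.
hardware_per_year = improvement_factor(2.0, 1.5, 1.0)    # doubling every 18 months
algorithms_per_year = improvement_factor(4.0, 1.0, 1.0)  # factor four per year

if __name__ == "__main__":
    print(f"Hardware gain over one year:    {hardware_per_year:.2f}x")
    print(f"Algorithmic gain over one year: {algorithms_per_year:.2f}x")
    print(f"$100M build one year later:     ${projected_cost(100e6, algorithms_per_year):,.0f}")
```

Under these assumptions, hardware alone gives roughly a 1.59x gain in a year, while the quoted algorithmic rate alone brings a $100 million build down to about $25 million, which matches the "quarter of that" figure mentioned.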
08:13 So I have a thought to add here, but I just want to ask, right, is it like in the move
08:17 to Intel that you're contractually obligated to name drop Moore's Law?
08:21 Part of my job description, I apologize.
08:27 Aside from that, I think it's accessibility in general, right?
08:30 Certainly compute matters, but just how accessible AI is because of large language models.
08:35 We spend so much time thinking about the great things that happen on the output side.
08:39 We tend to overlook how good large language models are at understanding what I meant, right?
08:44 The intent portion of it.
08:46 To me, that's the most fascinating superpower of large language models and generative AI.
08:51 It's great that we can build things with it, and there's a lot of outcomes with it, but
08:54 just the fact that it understands human behavior and human intent and human speech, that's
08:58 the fascinating part.
09:00 I love this conversation about accessibility and about how huge pieces of the world who
09:06 never had access to play with large language models are now being able to open up their
09:12 browser and sort of use them for their experiments, use them as part of their work.
09:17 I remember at Google when the first PaLM models just came out, being able to test them out
09:23 internally, being able to test out some of the image generation models, but having that
09:28 all as just kind of like a sandbox within the context of the company.
09:33 Whereas now, there's really massive potential for everyone in the world to start experimenting
09:39 with these to be able to understand how they could be creative and to accelerate this process
09:45 of getting their ideas out into the world.
09:48 It's scary to think that that was just about a year ago, isn't it?
09:51 Just about a year ago.
09:52 And how many iterations there's been.
09:55 Absolutely.
09:56 And how much more efficient the compute has gotten.
09:58 It's really, really awesome to see.
10:00 And this is a great segue into our next section, which is all about open source versus closed
10:06 source models.
10:08 And I think all of the folks on the panel have experience working with both.
10:14 But really, open source models have massive potential for customization, adaptation.
10:21 How have we started to see how open source is really driving some of the new features
10:27 and the new excitement around these models?
10:29 Adi, do you want to go first?
10:31 I know that you work on a lot of really great open source projects within the context of
10:35 Google.
10:36 But I would focus on OpenXLA, which is the middleware.
10:39 So whether you use PyTorch or JAX or TensorFlow, we're offering a very open source middleware
10:45 way to be able to access different hardware, like Intel CPUs and GPUs and TPUs as well.
10:52 So that's one aspect of it.
10:54 But I think it's that collective wisdom that open source offers with the crowd.
10:59 Because not one company can do all of it.
11:02 We're just a part of the puzzle.
11:04 So working together as an ecosystem, I think that's really the critical piece of it that's
11:08 going to help us to catapult to levels that we've never seen before.
11:12 Absolutely.
11:13 Kevin, do you want to?
11:16 Yeah.
11:17 I mean, look, our model garden supports 140-ish models.
11:21 I think less than 20 of them are built by Google.
11:24 And a large percentage of them are open source coming through communities like Hugging Face
11:28 and GitLab and various others.
11:31 I think open source is a massively important part of this community.
11:35 And I think we've spent so much time in other segments of technology evolution kind of thinking
11:42 about the path from research to applied research towards production-type use cases.
11:50 And I think the gap between research, applied research, and production is pretty well near
11:53 zero at this point.
11:55 We're iterating so quickly and trying to get things out so quickly.
11:58 So I think open source pushes closed source, closed source pushes open source.
12:03 And every day, I come from a different background than AI.
12:06 I spent 20 years in telecom.
12:08 And I could probably take two to three years off in telecom and nothing changed.
12:14 Here you kind of take two to three days off and state of the art looks different.
12:19 Speaking of which, if folks haven't already seen, Llama 3 got released today.
12:25 So already available for folks to use, I believe, on Azure and also on Hugging Face.
12:30 If you haven't had a chance to test it out, that's definitely something to take a look
12:35 at.
12:36 And then also, I love seeing how, you know, as these open source models get released,
12:43 they're increasingly pushing the boundaries in terms of model capabilities and performance.
12:48 Like the Mixtral models that just recently got released.
12:52 And also it sounds like the largest versions of Llama 3, they're surpassing even GPT-4
12:58 in capabilities these days.
12:59 So it's pretty, pretty nifty.
13:01 Do you all want to add anything around open source?
13:05 >> Yeah.
13:06 So I think there are two aspects to these differences.
13:09 One is, as you mentioned, from a developer perspective, you want to have a mixture, you
13:13 want to provide as much variety and the right price point for all kinds of applications.
13:17 So that's just a model perspective.
13:20 But I think behind the hype, there's going to be this realization that we don't understand
13:24 a lot of how these things work.
13:26 And this is where the ability to leverage open source models is going to be a game changer.
13:31 So research has now taken a backseat to Kevin's point, because the ability for universities,
13:38 for example, to continue to innovate and expand the science behind it is gated on access to
13:44 certain GPUs and there's sort of a cap to it.
13:47 So open source is not only -- I think about it not only at the model level, but the entire
13:52 up and down the stack that is helping not only to build new applications.
13:56 I've been talking about, like, okay, you want to get your first application out with a closed
14:00 source model and then you want to use fine tuning with your open source model.
14:06 So that's sort of one aspect of it.
14:07 The other is, like, how do you understand where things fail, where things can get better?
14:11 And the only place you can do that is with open source.
14:15 >> I love that.
14:16 When I was first learning how to do machine learning, I relied so much on all of the great
14:21 resources that were put on GitHub, that were put online, and it was really only by reading
14:26 the documentation and reading through others' code that I was able to learn about model
14:30 internals.
14:32 So I think if we lose that, and if the world just shifts to closed source models, then
14:39 we've lost a really massive opportunity to help educate folks and to help build the next
14:45 generation of research scientists.
14:47 Marcus, do you want to?
14:49 >> Just a few examples.
14:50 If you go back in the history of software, there used to be a time when people used to
14:53 pay for the browsers, the web browsers.
14:55 There used to be a time when people paid for the operating systems in the data center,
14:58 they paid for the databases, and they paid huge premiums.
15:01 And it's just inevitable, I think, to all your points, it's just so much value to have
15:04 a platform that everybody can contribute to, academia, students, startups, anybody that
15:09 can contribute, that can improve it.
15:10 I think for a single company to outdo that in the R&D is going to be really, really difficult,
15:15 I think.
15:16 Again, it doesn't mean in the short term it can't happen.
15:17 Again, there were many examples where it happened in the past.
15:20 People built something that nobody had at that point in time, but then quickly somebody
15:23 else started open sourcing.
15:24 Llama 3 is a great example.
15:25 All of a sudden you have the whole world starting to build around that, and it's very hard to
15:29 keep up with that as a proprietary vendor.
15:32 >> I agree.
15:33 And that's also a great segue into the next section, which is talking about small models
15:38 versus sort of the larger models that we've seen on the most bleeding of edges.
15:43 I think what we've seen in the community is that oftentimes people will take the smaller
15:49 models, they might fine tune them for specific tasks or use cases, and they're also obviously
15:56 much more efficient to deploy, both from a latency perspective as well as a cost perspective,
16:02 and also like a hardware footprint perspective.
16:07 So if we wanted to discuss kind of the strengths and weaknesses of large and small models,
16:14 what have you all seen in terms of this within your own companies, and then also how you're
16:22 thinking about this in terms of generative AI futures?
16:25 Is it going to be one beating the other, or is it going to be a blend of both, as
16:31 Sundararajan mentioned?
16:32 >> Yeah, I'll go there.
16:35 I think it's going to be eventually a mixture.
16:37 There is definitely now a shift a little bit towards custom or fit-for-purpose type of
16:42 SLMs.
16:44 But another thing that maybe not many people know about SLMs is they tend to hallucinate
16:50 a little less, and they seem to offer, like initial research is showing that they seem
16:55 to offer better protection against harmful content.
16:58 So there are a few other benefits to the SLMs than just the resources.
17:04 But yeah, I think in the future it's going to be a mixture of all of the above.
17:08 >> Excellent.
17:09 And for SLM, just for folks who might not be aware, is that small language model?
17:14 >> Good.
17:15 >> Good.
17:16 Excellent.
17:17 >> So I think it's interesting because maybe also about a year ago there were a lot of
17:21 conversations about large generalist models versus highly specialized models.
17:26 I think actually Microsoft Research was one of the first to come out and say, look, you
17:29 can take a really large generalist model and hyper-tune it and it actually performs considerably
17:35 better than some smaller models.
17:37 >> Just even prompt engineering.
17:40 >> Even prompt engineering as well.
17:41 I think that's spot on, right?
17:43 And I think we kind of lost sight and a lot of the conversation was like, do I really
17:47 need my model to understand the latest Taylor Swift songs?
17:51 And it's not that, right?
17:52 It's really the fact that it's trained on all of human language that matters in terms
17:56 of understanding the intent to be able to execute against tasks.
17:59 >> Especially for the emergent capabilities.
18:01 So there is also that concern with SLMs that you need certain scale for the emergent capabilities
18:06 to show up more prominently.
18:08 >> 100%.
18:09 But I do think it will be a hybrid world.
18:12 I think on the small model side, some of the age old techniques around RNNs are starting
18:17 to prove really valuable to iterate and get high performance and high accuracy out of
18:21 smaller models.
18:22 And obviously some of the things that are happening with mixture of experts are really
18:26 driving down the latency of large models.
18:28 So we're going to live kind of in a hybrid world for a long period of time.
18:32 But once we solve the 10X compute problem, let's do everything large.
18:37 >> I love that you called out prompting strategies as well as retrieval methods.
18:43 I think there are a lot of great ways to augment the capabilities of models without necessarily
18:49 trying to bake in all of those smarts.
18:53 >> I think it really depends on open or closed, it really depends on what's the pain point
18:57 that you're trying to solve for the customer at the end of the day.
19:01 Is there a problem?
19:02 Is there a pain point?
19:03 And then what kind of product or solution do you want to be able to build?
19:06 And then working backwards from there, depending on that particular use case, I think that's
19:11 what's really critical in deciding whether it's open or if it's closed source or not.
19:15 Now there's advantages and disadvantages to both.
19:17 Like if you use an open model, of course it's available out there, it's rapid experimentation.
19:22 Those are some of the pluses that you get with it.
19:24 But again, the drawback may be the fact that quality control might not really necessarily
19:29 exist or you have to build that in and that takes quite some time, it takes effort.
19:35 And then from a closed perspective, because they're large models,
19:39 if you do want to build something that requires multimodality, then you want to start thinking
19:42 about a closed model because it offers that multimodality, at least right now.
19:48 So it goes back to, again, the pain point and the customer and the solution that you
19:51 want to build for the pain point that they're experiencing.
19:55 I think it's also really interesting too in the sense that there are some companies that
20:01 are experimenting with releasing these much, much larger models as well as making them
20:10 Apache 2 licensed or making them licensed but only for academic use, in which case people
20:17 can experience the coolness of multimodality and all of these other use cases, even though
20:24 the models are open.
20:26 We had an awesome Gemma release yesterday on this.
20:30 CodeGemma or one of the other specialized versions?
20:32 One of the other specialized.
20:34 CodeGemma also, but there was another.
20:37 Excellent.
20:38 And Markus?
20:39 First of all, we like all language models, so long as they're on Intel hardware.
20:41 We're happy.
20:42 Aside from that, no, I think there's just, and I think this sort of bleeds into the next
20:46 discussion, I think the next version you have to read up about what runs on the client versus
20:51 what runs on the server.
20:52 And I think just simple things like, at some point just economics will dictate that.
20:57 Let's take a copilot as an example.
20:58 If I want to copilot, as a startup I want to give out a copilot.
21:01 If I can run this on the client, if the client has enough compute power to do that, I can
21:05 just offer this as a free service and still be highly, not lose too much money.
21:09 If I try to do this on the server side, it's just going to bankrupt me.
21:12 So and vice versa, if that same user then comes and says, I want to turn on all these
21:15 other new features, great, I can then provide them with additional compute power in the
21:19 cloud and I can augment that and all of a sudden they get a much higher level of service
21:23 and maybe it's going to cost them a hundred or something dollars a month.
21:26 But now it's a professional developer that can afford that.
21:28 So I think the hybrid of those two is ultimately what's going to dictate that.
21:32 So I don't think it's one or the other, it's one and the other.
21:35 Yeah, that's awesome.
21:36 And one of the most interesting patterns that I've seen in the generative AI space is having
21:41 kind of a larger model be the planner model and being able to kind of take in a high level
21:48 task, break it into subcomponents and then be clever about which model to assign to which
21:54 of those subtasks.
21:56 So it might be that for one of the subtasks, you know, you need a model that's a little
22:00 bit more impressive, perhaps a little bit more expensive, needs to do its computation
22:04 on the server side.
22:05 But then for others, you can use a much smaller kind of model locally or even, you know,
22:11 tools that might be available locally as opposed to relying too much on model smarts.
22:17 And to y'all's points, that breaks down the cost, makes the latency go down as well because
22:22 you're not sending everything out into the world.
22:25 And it also means that at the end of the day, your users are getting a much better experience.
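The planner pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any panelist's product works: the model names, the per-call costs, the 1 to 5 complexity score, and the threshold are all hypothetical, and in a real system the large planner model would produce the subtasks and their difficulty estimates.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical per-call costs in dollars, chosen only for illustration.
COST_PER_CALL = {"local_small_model": 0.0, "server_large_model": 0.02}


@dataclass(frozen=True)
class Subtask:
    description: str
    complexity: int  # 1 (trivial) to 5 (hard); assumed to come from the planner model


def route(subtask: Subtask, threshold: int = 3) -> str:
    """Send hard subtasks to the server-side model, everything else to the on-device model."""
    return "server_large_model" if subtask.complexity >= threshold else "local_small_model"


def plan_and_route(subtasks: list[Subtask]) -> dict[str, list[str]]:
    """Group subtask descriptions by the model they were assigned to."""
    assignments: dict[str, list[str]] = {name: [] for name in COST_PER_CALL}
    for subtask in subtasks:
        assignments[route(subtask)].append(subtask.description)
    return assignments


def total_cost(assignments: dict[str, list[str]]) -> float:
    """Sum the hypothetical per-call cost over all assigned subtasks."""
    return sum(COST_PER_CALL[model] * len(items) for model, items in assignments.items())


if __name__ == "__main__":
    subtasks = [
        Subtask("extract dates from the email", 1),
        Subtask("draft a negotiation strategy", 5),
        Subtask("reformat the reply as bullet points", 2),
    ]
    assignments = plan_and_route(subtasks)
    print(assignments)
    print(f"Estimated cost: ${total_cost(assignments):.2f}")
```

In this sketch only the hardest subtask reaches the server, so cost is paid for one call instead of three, which is the effect on cost and latency the speakers describe.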
22:29 I think this is going to drive a lot more compute power locally on the client also,
22:32 because certainly these things you just can't afford as a, you just can't afford them as
22:36 a SaaS provider.
22:37 It's just too expensive.
22:40 That doesn't look good.
22:41 It looks like we're about to get cut off.
22:42 Oh, wow.
22:43 John, are we getting cut off?
22:47 Okay, excellent.
22:48 Well, so I apologize.
22:52 It looks like we had, it looks like we got too excited about the conversation, though
22:58 I think we had a really great discussion about the strengths and weaknesses today of generative
23:03 models.
23:04 If I could just ask one more question to everybody, very quick, one sentence only, top of mind,
23:09 rapid fire.
23:10 What are you most excited about for the next year of generative models, given everything
23:15 that's happened in this past year?
23:21 I'm excited about all the tooling that's surrounding it and bringing those, helping it bring those
23:25 capabilities to every user.
23:28 Excellent.
23:29 I think just the new use cases that we're going to be seeing.
23:31 There's just so much innovation going on, talking to so many startups.
23:33 It's just fascinating the speed at which things are evolving.
23:37 Again, back to the late nineties, once you have the cost of communication down to zero,
23:40 all of a sudden you see people placing their Ubers and all these things.
23:43 None of that existed.
23:44 So I think we'll see something similar over the next year or the next few years.
23:48 I think AI allows us to be much more rigorous and disciplined in our experimentation approach.
23:53 And I think that's going to revolutionize how we build products.
23:57 I think it's learning for me, right?
23:59 When I stepped into this event and I saw everyone here and I started to listen to a number of
24:03 the sessions, it became clear to me that I know a fraction of a percentage of what's
24:06 actually happening in AI right now.
24:10 Excellent.
24:11 So it sounds like generative AI is going to be even more exciting.
24:17 Perhaps we're not at peak hype cycle just yet.
24:19 We have a little bit of ways to go and we should all be excited about what's yet to
24:24 come.
24:25 So thank you so much.
24:26 Thank you to our panelists.
24:27 You all did a great job.
24:27 Thank you.