Shawn Jain, a former OpenAI staff member, talks about his time at OpenAI and how they worked to give AI the ability to understand what it’s seeing and what impact this could have on technology such as autonomous driving.
Subscribe to FORBES: https://www.youtube.com/user/Forbes?sub_confirmation=1
Fuel your success with Forbes. Gain unlimited access to premium journalism, including breaking news, groundbreaking in-depth reported stories, daily digests and more. Plus, members get a front-row seat at members-only events with leading thinkers and doers, access to premium video that can help you get ahead, an ad-light experience, early access to select products including NFT drops and more:
https://account.forbes.com/membership/?utm_source=youtube&utm_medium=display&utm_campaign=growth_non-sub_paid_subscribe_ytdescript
Stay Connected
Forbes newsletters: https://newsletters.editorial.forbes.com
Forbes on Facebook: http://fb.com/forbes
Forbes Video on Twitter: http://www.twitter.com/forbes
Forbes Video on Instagram: http://instagram.com/forbes
More From Forbes: http://forbes.com
Forbes covers the intersection of entrepreneurship, wealth, technology, business and lifestyle with a focus on people and success.
Transcript
00:00 Tell us about your career in AI and how you found yourself at Closed AI.
00:05 I mean OpenAI.
00:07 Thanks for having me here, John.
00:10 It's been a really great conference so far.
00:13 It's a summit like Mount Everest.
00:15 We're trying to go high and we're forgoing conferences.
00:18 That's where you try to hit your quarter and sell something.
00:20 That ain't this.
00:21 Well, good. I guess hopefully we're nearing the summit.
00:25 Yeah. So I'll tell you a little bit about my background.
00:27 I come from a totally science nerd family.
00:32 My dad was a programmer.
00:33 He started a data analytics company.
00:35 My mom was an early employee at a startup that was acquired by SAP.
00:40 My sister studied computer science.
00:42 My grandfather was an amateur astronomer.
00:45 And to reduce the burden of all the calculations he had to do to understand the trajectory
00:52 of planets and everything else he was predicting, he actually wrote computer programs.
00:57 He was on punch cards.
00:59 And this is at a time in India when computers were not widely available.
01:04 And I think back then they were aligned with the Soviet Union to get the technology.
01:08 I don't know if you know this.
01:09 You're like eight years away from 30.
01:12 So you still have a bunch of years to be 30 under 30.
01:15 But yeah. So sorry.
01:18 So that's the kind of house I grew up in.
01:23 I grew up making Linux network servers, wiring my house with Cat5 Ethernet, fixing amplifiers
01:30 by changing out the capacitors and stuff like that.
01:32 And all I wanted to do was go to MIT because that's where the origin of all this amazing
01:36 technology was.
01:37 So I got the opportunity to go.
01:38 I studied Course 6-2 here.
01:41 While I was there -- while I was here, I got involved doing computer vision research.
01:46 Specifically, I got involved in doing scene understanding.
01:51 So this is making models that actually understand videos and create program representations of
01:56 what's going on.
01:57 And scene understanding is important because you can probably see a picture of me and John
02:02 up there.
02:03 These are objects.
02:04 But if there's multiple items in the photo, if there's multiple objects in the photo,
02:08 you need to understand their relationships to understand what's going on in the scene.
02:12 And so that's the project that we were working on at MIT.
02:16 After that, I went over to Uber ATG.
02:18 That's Uber's self-driving group, because that was the best place to apply computer
02:22 vision.
02:24 After I spent some time there working on using LIDAR to improve localization and perception,
02:31 I moved over to Microsoft Research, where I worked on multimodal models and efficient
02:36 models.
02:37 And because I worked on efficient models, I realized that deep learning models really
02:42 needed a lot of compute.
02:44 And that's how I ended up going over to OpenAI, because they had the most investment.
02:49 What year was that?
02:50 I moved over to OpenAI in January of '21.
02:56 So what do self-driving cars have to do with large language models?
03:02 And you kind of foreshadowed a little bit of your interest in this, but how did that
03:07 crystallize with your role at OpenAI?
03:12 Yeah, so I think that self-driving cars and language models are actually not that different
03:18 in a sense.
03:19 So they're both scaling laws problems, riding trends in deep learning.
03:23 And so they're both enjoying this virtuous cycle of more data, more compute leads to
03:27 better results, leads to more investment, and more data and more compute.
03:32 And also, my research work at MIT was about scene understanding.
03:36 And what I realized is that I think that scene understanding is actually a key enabling technology
03:41 for self-driving cars, because they are in scenes, they are in environments with potentially
03:45 hundreds of actors-- other cars, pedestrians, cyclists, bicyclists, and so on.
03:52 And so if you can actually describe a scene as a program, it's actually a form of compression,
03:57 which is actually a really, really good proxy for understanding.
04:01 If you think about large language models today, they actually create a compressed representation
04:06 of the language while they're training through this next token prediction objective function.
04:11 And they seem to demonstrate understanding.
04:13 So that's how I made this connection between scene understanding and self-driving cars.
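The compression view of next-token prediction mentioned above can be written down. As a sketch in standard notation (the symbols $x_t$, $p_\theta$ and $T$ are introduced here for illustration and do not come from the talk), the training loss on a sequence of $T$ tokens is:

```latex
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log_2 p_\theta\left(x_t \mid x_{<t}\right)
```

Measured in bits, this sum is, to within a couple of bits, the length an arithmetic coder driven by the model's predictions would need to encode the sequence, so lowering the loss amounts to compressing the text better.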
04:19 Yeah.
04:20 So let's be voyeuristic.
04:21 Let's let these people know what your life was like at OpenAI.
04:25 So what was it like working there?
04:28 Did you realize the tools you were researching and the tools the organization was creating
04:34 were transformative?
04:35 Was it what you expected?
04:37 Were people surprised to the left and right of you?
04:43 And why did you leave?
04:45 It was such transformative technology.
04:49 What led to you deciding to say, you know what, I'm going to go it alone?
04:54 Well, I went there because I thought it was interesting.
04:56 I wasn't sure if everybody else thought it was interesting.
05:00 And I think--
05:01 How many people worked there when you went there in '21?
05:03 I can't say.
05:04 OK.
05:05 So was it a little, a medium, or a lot?
05:08 Can you say that?
05:09 It was on the smaller side.
05:10 Smaller side.
05:11 OK.
05:12 All right.
05:13 So don't say anything that's going to have repercussions on your career.
05:16 But was it like-- were people looking at each other like, oh my god, this is really happening?
05:23 We're doing this?
05:24 Or everyone was like, yeah, this is what we expected.
05:26 Can you comment on that?
05:26 And say pass if you can't.
05:30 Yeah.
05:31 I think we were getting some amazing results.
05:34 Good enough results that I would double and triple check if they were real.
05:39 That's how surprising they were to me.
05:42 And were you-- so you did research.
05:43 Were you in charge of triple checking?
05:45 Or they're like, holy shit, we better triple check this?
05:48 When you're doing research, you want to be confident that you're presenting good results.
05:52 So I would definitely triple check my own work.
05:55 So you left.
05:58 You're not there anymore.
05:59 You don't have a badge to get in the building.
06:01 They probably took all your files.
06:04 There are files.
06:05 You're welcome.
06:06 How-- was that an easy decision?
06:09 Or were you like, I need to-- I have had enough experience, like dog years.
06:13 Three times seven.
06:14 You've been there 27 years.
06:18 What's your calculation to step out the door?
06:21 There's a dog right over there.
06:22 Here, can you hold the dog up?
06:25 Yeah.
06:27 So what was your calculation to leave?
06:29 Did they ask you to leave?
06:30 They're like, we've had enough of you.
06:31 We don't need any more research.
06:32 Or they didn't like you.
06:34 Or you left because you're like, wait a second.
06:36 I know too much.
06:38 And I want to take this knowledge and do something with it.
06:41 Give us some insight there.
06:44 And if I'm asking too personal questions, say pass, and I'll go somewhere else.
06:50 I love research freedom.
06:52 And so that's the main reason.
06:54 Yeah.
06:55 So do you plan to do research freedom in whatever you do next?
06:58 Or you've done that enough, and you'll hire someone else to do that part of it.
07:02 And you want to do the other stuff?
07:04 Oh, I'm basking in the glory of having my freedom and independence.
07:08 So that's what I'm really enjoying.
07:10 You could read between the lines.
07:11 There's a lot right there that he didn't say that you know.
07:14 I just want to telegraph what just happened there.
07:18 So talk about what are things you'd like to do in this post-AI career that you have right
07:25 now?
07:26 Well, it's not a post-AI career.
07:28 It's very much in the middle of AI, just not at a--
07:31 I meant in terms of your LinkedIn, your job, your post-AI.
07:36 What's going to be the thing next?
07:38 In this next chapter, what do you want to do?
07:40 Well, first of all, I took a little breather to recollect myself after a tremendous sprint
07:47 that I had.
07:50 So now I'm exploring a few different startup ideas in robotics foundation models and in
07:56 time series data generation.
07:58 I'm looking for people who are experts in these areas to collaborate with, to work with.
08:04 If you're that kind of person, definitely want to chat with you.
08:07 So considering a few startup ideas, I was also very heads down in my research.
08:12 So now I'm also starting to reconnect with the startup community.
08:16 And that's what I'm seeing here, a lot of interesting startups.
08:18 Has it been a good experience here so far?
08:20 It's been a great experience.
08:22 So how do you use AI?
08:25 Can you tell us what tools you're using?
08:27 Can you tell us what you're using them for?
08:30 Like, Shawn Jain, what does your day look like?
08:35 What are you accomplishing collaborating with these tools?
08:40 If I am writing code, I'm definitely using some kind of...
08:43 10%, 50%, 80%.
08:45 How much is AI helping you?
08:49 I don't think I can quantify it in terms of how much percentage of my code is AI written
08:54 or not.
08:55 I would say it saves me time and saves me from bugs, especially once I learned how to
09:02 use it.
09:03 So say you get connected with someone who's graduated from MIT and you have something
09:10 in common.
09:11 You're a part of the same extracurricular team or club and they say, "Hey, give me some
09:15 advice.
09:16 I'm about to enter the real world.
09:17 You've been out there and I want to do similar stuff to what you did."
09:21 Given what you know now, what would you tell them?
09:23 Do this or don't do this.
09:25 Like there's a guy who said, "Go West, young man."
09:27 That was a long time ago.
09:28 What would you say to yourself a long time ago who's just graduating, given the AI world
09:35 that's going on, what would you say?
09:38 I think you should work on foundational technologies, technologies that are enabling, that are a
09:43 platform for other people to create all kinds of different startups and other new technologies.
09:48 So I've always chosen to work on what I believe are foundational technologies like computer
09:52 vision, like language models, like better sensors in LIDARs.
09:56 And I think that each of these technologies are now you're seeing creating whole new markets
10:02 for startups in robotics and code assistants or code generation and everywhere.
10:08 So if you are an expert in the foundational technology, you're really, really well placed
10:12 to create the startup that commercializes that technology as well.
10:17 So LIDAR has become cheaper.
10:20 Is that a fact?
10:21 Yeah, LIDAR has become cheaper.
10:23 Now that it's on every iPhone, new iPhone has LIDAR technology and as more cars have
10:29 it, it gets cheaper.
10:33 What can we do with LIDAR that we couldn't do before?
10:37 And then my next question is physical AI.
10:39 Is that a term that you are familiar with and do you think about that and does that
10:43 have anything to do with some of the things you're interested in?
10:46 Or that's sort of an area that's off adjacent to what you're thinking about.
10:52 So two questions there.
10:53 Got it.
10:54 So the first one about LIDAR, it's an interesting one.
10:57 So LIDAR is a laser scanner.
10:58 It shoots out a beam of light, non-visible light, and you measure the time of flight
11:05 until that beam of light reflects back to you.
11:08 And from that, you can calculate the distance using like a distance rate time, simple calculation
11:14 divided by the speed of light.
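The time-of-flight calculation described above can be sketched in a few lines. Distance is the speed of light multiplied by the measured round-trip time, halved because the beam travels out to the target and back. The function name and the 200-nanosecond example value are assumptions made for illustration, not details from the talk.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # exact, by definition of the metre


def tof_to_distance_m(round_trip_seconds: float) -> float:
    """Return the one-way distance to a target from a round-trip time of flight."""
    if round_trip_seconds < 0:
        raise ValueError("time of flight cannot be negative")
    # The pulse covers the distance twice (out and back), hence the division by 2.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


if __name__ == "__main__":
    # A pulse that returns after 200 nanoseconds hit something about 30 m away.
    print(f"{tof_to_distance_m(200e-9):.2f} m")  # prints "29.98 m"
```

Repeating this measurement across many beam directions is what produces a point cloud.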
11:16 So it allows you to create really dense 3D point clouds or 3D if you have a spinning
11:24 LIDAR and probably 2D if you have a 2D LIDAR.
11:30 And I think the interesting thing about LIDAR is because it's a new kind of sensor, the
11:37 applications of it are still being discovered.
11:39 So because LIDAR is on the iPhone, I believe it helps the portrait mode feature in iPhone,
11:46 but it also helps self-driving cars.
11:48 So I'm actually looking forward to seeing what other new cool applications of LIDAR
11:53 actually come out.
11:54 I think they could be on mobile autonomous robots in factories in the homes very soon.
11:59 So physical AI, a term you're familiar with or it's not your thing?
12:06 Do you mean embodied AI?
12:07 Well, I mean like a lot of people interact with AI that's on a 2D screen, but now you
12:11 can throw AI beyond the screen.
12:14 So maybe pass on that one.
12:17 All right.
12:18 I have, you created a feature in ChatGPT and you wrote a white paper when you were
12:23 at OpenAI.
12:24 Can you talk about either of those or that top secret I shouldn't tell people that?
12:28 No, we can talk about the paper.
12:29 Okay.
12:30 But people can't know about the feature you made?
12:32 No, you can use the feature.
12:33 It's out there.
12:34 Yeah.
12:35 Enlighten us.
12:36 So what is it?
12:37 Well, I think you guys can do your own research on it, but it's the advanced data analysis
12:42 feature in ChatGPT.
12:43 I was a contributor to it.
12:44 I didn't invent it by myself.
12:46 It allows a model to write code, execute it and observe the results and debug it.
12:53 So it's kind of, it's a step beyond just completing code.
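The write, execute, observe and debug cycle described above can be illustrated with a toy loop. This is a minimal sketch under stated assumptions and says nothing about how the ChatGPT feature is actually built: `propose` is a hypothetical stand-in for a model call, `toy_propose` fakes one with two canned answers, and the code runs in a plain subprocess with no sandboxing.

```python
import subprocess
import sys
from typing import Callable, Optional, Tuple


def run_python(source: str, timeout_s: float = 10.0) -> Tuple[bool, str]:
    """Run source in a fresh interpreter; return (succeeded, stdout or stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    ok = proc.returncode == 0
    return ok, proc.stdout if ok else proc.stderr


def write_run_debug(
    propose: Callable[[Optional[str]], str], max_attempts: int = 3
) -> Optional[str]:
    """Ask for code, execute it, and feed any error text back into the next attempt."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        source = propose(feedback)      # write
        ok, output = run_python(source) # execute
        if ok:
            return output               # observe a successful result
        feedback = output               # debug: the traceback guides the next try
    return None


def toy_propose(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model: buggy first draft, fixed second draft."""
    if feedback is None:
        return "print(1 / 0)"
    return "print(1 / 1)"


if __name__ == "__main__":
    print(write_run_debug(toy_propose))
```

The point of the loop is that the error output becomes input to the next attempt, which is what separates it from one-shot code completion.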
12:58 Thank you.
13:00 You're welcome.
13:02 Like I said, not just my thing.
13:04 I was a contributor.
13:05 Great.
13:06 You can't tell us how many people worked on it, I bet, right?
13:09 Sorry.
13:10 Yeah.
13:11 No, I saw that coming.
13:12 Yeah.
13:13 All right.
13:14 Did you want to talk, did you want to say about the research paper?
13:16 Yeah, I can talk about the paper.
13:18 That's public work.
13:19 So me and also other co-authors published a paper called Evolution Through Large Models
13:25 when I was at OpenAI.
13:27 And it's a really interesting paper because we tackled a very general, but very real problem
13:32 in this paper, which is how can you get a language model to generate code in a domain
13:38 in which it has little to no training data?
13:41 And in this paper, we actually showed how synthetic data can actually improve code generation
13:46 ability and that the synthetic data can be generated through an evolutionary algorithm.
13:53 And the evolutionary algorithm is actually powered by the language model.
13:57 So if you don't know what an evolutionary algorithm is, it's basically a biologically
14:01 inspired algorithm for optimization.
14:04 You have a set of candidate solutions, and these are called the population.
14:09 You evaluate each candidate's fitness.
14:11 So how good is this particular solution?
14:13 So in the robotics domain, it might be how far does this robot walk using this particular
14:20 control algorithm?
14:23 And then you select the fittest individuals from your population, and then you produce
14:26 offspring, children, from that solution via mutation.
14:31 The interesting thing we did in this paper is that we actually use a language model as
14:34 a mutation operator.
14:36 It's an intelligent mutation operator instead of a random mutation operator.
14:41 And this process was able to generate synthetic data, which we were actually able to train
14:48 the language model on.
14:50 And by training the language model on this data, it actually improved its ability to
14:55 generate more data.
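The evolutionary loop described above (population, fitness, selection, mutation) can be sketched as follows. This is a toy under stated assumptions, not the paper's code: candidates are lists of numbers rather than programs, `TARGET` and `toy_fitness` are made up for illustration, and the mutation operator is random Gaussian noise. The `mutate` argument marks the slot where, per the talk, a language model proposes the change instead.

```python
import random
from typing import Callable, List

Candidate = List[float]  # stand-in for a program; the paper evolves code, not numbers


def evolve(
    population: List[Candidate],
    fitness: Callable[[Candidate], float],
    mutate: Callable[[Candidate], Candidate],
    generations: int,
    n_parents: int,
    rng: random.Random,
) -> Candidate:
    """Select the fittest candidates, refill with mutated copies, and repeat."""
    for _ in range(generations):
        # Evaluate and rank the population: higher fitness is better.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:n_parents]
        # Offspring come from mutating randomly chosen parents.
        children = [
            mutate(rng.choice(parents))
            for _ in range(len(population) - n_parents)
        ]
        # Parents survive unchanged, so the best fitness never gets worse.
        population = parents + children
    return max(population, key=fitness)


TARGET = [1.0, -2.0, 0.5]  # hypothetical optimum for the toy problem


def toy_fitness(candidate: Candidate) -> float:
    """Negative squared distance to TARGET (0.0 would be a perfect score)."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, TARGET))


def make_random_mutation(
    rng: random.Random, scale: float = 0.3
) -> Callable[[Candidate], Candidate]:
    """Random Gaussian mutation; this is the slot a language model would fill."""
    def mutate(candidate: Candidate) -> Candidate:
        return [c + rng.gauss(0.0, scale) for c in candidate]
    return mutate


if __name__ == "__main__":
    rng = random.Random(0)
    start = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(20)]
    best = evolve(start, toy_fitness, make_random_mutation(rng),
                  generations=50, n_parents=5, rng=rng)
    print("best fitness:", toy_fitness(best))
```

Because `mutate` is just a function from one candidate to another, swapping the random operator for a model-driven one leaves the rest of the loop unchanged.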
14:58 And this was a really, really interesting paper.
15:02 I think it's gotten, unfortunately, less attention out there than it should have.
15:08 It was situated in the robotics domain, so the code that we were writing was demonstrated
15:12 in a robotics simulator.
15:15 But the technique is widely applicable.
15:17 Okay, last question.
15:20 A lot of MIT students are getting involved in AI and are going to play probably a disproportionate
15:27 leadership role in what comes next.
15:30 How did your MIT experience shape you to be able to evaluate opportunities, to deal with
15:37 ethical challenges?
15:39 There's a bunch of Pandora's boxes that are opening here.
15:43 Did you feel well-served, and are you hopeful?
15:47 And just give us a little insight.
15:49 You're not at the pinnacle of your career on the way down.
15:51 You're just getting going.
15:53 And I want this audience to kind of understand who are these young kids that are in the room
16:00 when it happened, are contributing to the features, are writing the white papers, are
16:06 going it alone to create the next company that could be a major player, could be the
16:11 next trillion-dollar company.
16:13 Lex Fridman said on this stage, there's going to be a bunch of trillion-dollar companies
16:17 right now that are going to play a big role in our society.
16:21 And your profile is of that.
16:25 And you're not just going in there saying, I'm going to work for the man.
16:28 I'm going to go there, and I'm going to be the guy.
16:32 And I'm excited for what's next.
16:34 But how well are you prepared for that?
16:36 And where do you feel maybe you're lacking, and you need to complement that?
16:40 Maybe someone in here could complement that for you.
16:43 But yeah, give us a little insight into that.
16:45 Yeah.
16:46 I mean, I think going to MIT, I learned a lot of different things.
16:50 I think one of the best parts about an MIT education is that there is no black box that
16:57 you don't open.
16:58 So if you're creating a high-performance ML system, you need to have a full stack understanding
17:05 of what's going on.
17:06 So you need to understand the chips, the memory, the assembly code, the primitives that that
17:11 chip executes well.
17:13 You might need to know about the kernels, the compiler, the front end, and then finally
17:16 the model, too.
17:19 So even though I, at the moment, am not an expert in every one of these things, I do
17:25 feel that because of my MIT education, I can actually go in and learn about each of these
17:31 topics to a level of working understanding that will actually complement all my other
17:35 knowledge.
17:36 So I think that's what I learned from the technical side at MIT.
17:40 And in terms of how I would lead, I would say if I was to create a research group right
17:46 now, the number one thing that I would do is make sure that it's a high-trust environment.
17:51 Because in research, oftentimes you're iterating and running experiments for weeks and weeks
17:55 and weeks and have no concrete results other than to say that I think I understand the
18:00 problem better, but I don't have a result that you can deliver, that you can productionize,
18:05 that you can release.
18:07 So creating a high-trust environment in which people are not afraid to ask questions, in
18:15 which people are not afraid to go out there and experiment, I think is really essential.
18:19 And I think you see that culture at the Media Lab, you see that culture at CSAIL, and that's
18:24 the culture I'd like to propagate.
18:26 Class of 2016, thank you very much.
18:29 [END]