sds podcast episode 309: learning through competition · 2019. 11. 3. · convolutional neural...

SDS PODCAST

EPISODE 309:

LEARNING THROUGH COMPETITION

http://www.superdatascience.com/309

Kirill Eremenko: This is episode number 309 with the legendary data

science instructor, Jose Portilla.

Kirill Eremenko: Welcome to the SuperDataScience Podcast. My name

is Kirill Eremenko, Data Science Coach and Lifestyle

Entrepreneur. And each week, we bring you inspiring

people and ideas to help you build your successful

career in data science. Thanks for being here today,

and now let's make the complex simple.

Kirill Eremenko: This episode is brought to you by my very own book,

Confident Data Skills. This is not your average data

science book. This is a holistic view of data science

with lots of practical applications.

Kirill Eremenko: The whole five steps of the data science process are

covered from asking the question to data preparation,

to analysis, to visualization, and presentation. Plus,

you get career tips ranging from how to approach

interviews, get mentors and master soft skills in the

workplace.

Kirill Eremenko: This book contains over 18 case studies of real world

applications of data science. It comes with

algorithms such as Random Forest, K Nearest

Neighbors, Naive Bayes, Logistic Regression, K-means

Clustering, Thompson sampling, and more.

Kirill Eremenko: However, the best part is yet to come. The best part is

that this book has absolutely zero code. So, how can a

data science book have zero code? Well, easy. We focus

on the intuition behind the data science algorithms, so

you actually understand them, so you feel them

through, and the practical applications. You get plenty


of case studies, plenty of examples of them being

applied.

Kirill Eremenko: And the code is something that you can pick up very

easily once you understand how these things work.

And the benefit of that is that you don't have to sit in

front of a computer to read this book. You can read

this book on a train, on a plane, on a park bench, in

your bed before going to sleep. It's that simple even

though it covers very interesting and sometimes

advanced topics at the same time.

Kirill Eremenko: And check this out. I'm very proud to announce that

we have dozens of five star reviews on Amazon and

Goodreads. This book is even used at UCSD,

University of California San Diego to teach one of their

data science courses. So, if you pick up Confident

Data Skills, you'll be in good company.

Kirill Eremenko: So, to sum up, if you're looking for an exciting and

thought provoking book on data science, you can get

your copy of Confident Data Skills today on Amazon.

It's a purple book. It's hard to miss. And once you get

your copy on Amazon, make sure to head on over to

www.confidentdataskills.com where you can redeem

some additional bonuses and goodies just for buying

the book.

Kirill Eremenko: Make sure not to forget that step. It's absolutely free. It's

included with your purchase of the book, but you do

need to let us know that you bought it. So, once again,

the book is called Confident Data Skills and the

website is confidentdataskills.com. Thanks for

checking it out, and I'm sure you'll enjoy.


Kirill Eremenko: Welcome back to the SuperDataScience Podcast.

Ladies and gentlemen, super pumped to have you

back here on this very special episode of the

SuperDataScience Podcast, because today, we have

none other than the legendary data science instructor,

Jose Portilla.

Kirill Eremenko: Very interesting episode. You're probably wondering

why we recorded it together since we're direct

competitors in the online education space in data

science. Well, we'll answer that question for you right

at the start of the episode. And we thought you'd be

interested to have us both in the same room talking

about your favorite topics such as AI, data science,

and the future of the world.

Kirill Eremenko: So, in this episode, you will hear about neural

networks that create other neural networks, how that

all works and what that means for data scientists.

How to manage and lead a community of over a million

students.

Kirill Eremenko: The question that Jose gets asked the most, as you

can imagine with such large communities, we get

hundreds, I think it's like 500 or so questions per day

that are asked in our courses. And here, you'll find out

what is the most asked question for Jose and how he

answers it.

Kirill Eremenko: You'll also hear about the pyramid of learning and

what is the pinnacle of learning, what you need to do in

order to understand that you have indeed mastered a

topic. And finally, we're going to have a very interesting

debate about artificial general intelligence.


Kirill Eremenko: I really enjoyed chatting to Jose and I can't wait for

you to hear this podcast. So without further ado, I

bring to you the legendary data science instructor,

Jose Portilla.

Kirill Eremenko: Welcome back to SuperDataScience Podcast ladies and

gentlemen, super special guest on the show with me

today, Jose Portilla. Jose, how are you going?

Jose Portilla: Good. Good to be here. You said it right in front of me

as if there was an audience, but we're in an empty

room.

Kirill Eremenko: I know. You got to do it. Man, where are we, Jose?

Jose Portilla: We are at Udemy LIVE in Berlin, in Germany.

Kirill Eremenko: In Berlin. Out of all places ...

Jose Portilla: I know, right?

Kirill Eremenko: ... and we're back in Berlin. What a great party last

night.

Jose Portilla: Yeah. It's fantastic. You had the Birddogs, was it?

Kirill Eremenko: Birddogs. Yeah, if anybody is interested in some cool

cover band in Berlin, check out Birddogs. They were

epic.

Jose Portilla: Yeah, they're fantastic.

Kirill Eremenko: Yeah, Udemy knows how to throw a party.

Jose Portilla: That's very true.

Kirill Eremenko: Yeah. Like a lot of food, a lot of drinks and the

excursion on Friday was really cool.


Jose Portilla: Yeah, the boat tour and then the Boros Collection and

then all that stuff.

Kirill Eremenko: In the bunker that's above ground.

Jose Portilla: I was a little more interested in the building than the

... I don't know. What do you think of ...

Kirill Eremenko: It really was cool.

Jose Portilla: This is getting off topic, but what did you think of the

artwork?

Kirill Eremenko: The artwork, I never understood contemporary art.

Like post modernism. But, what I really liked in this

tour was that they explained it, and that allowed me ...

For example, that one of the trampoline and the arrow,

and the horse. Compared to Picasso, that took five

minutes to put together.

Kirill Eremenko: Maybe it took ages for the person, but it's not ... You

can't really compare that to classic art. It's just

different realms. But the way they explained it is, it's

not about what the artwork is, it's about what it

represents, what the person was thinking and kind of

like the idea that they're provoking you to think about.

Kirill Eremenko: And when you think about it that way, it's like

somebody writing down an idea with pen and paper.

But here, they're just doing it with like sketches or

household goods or whatever else. And in that way

for me, that was much easier to accept. Yeah. So

in that sense, I like the explanations. What about you?

Jose Portilla: That's actually the thing I dislike the most about it. In

my opinion, it's like, if your artwork is so reliant on a


third party having to explain your thesis behind the

artwork, maybe the art is not the best manifestation of

trying to get your message across. Maybe you should

just be writing a paper on whatever topic and it may

be clear to more people. But some of them were like

crazy. Like the one of the images of the houses.

Remember the 9x9?

Kirill Eremenko: Oh, yeah.

Jose Portilla: So, the thing was this ... I guess to explain it to the

listener a little bit. Apparently, there used to be this

old German company that would fly around in a

helicopter, take aerial photos of your home, then go

door to door and try to sell you an aerial photo of your

house. And apparently, it was very popular in the 70s

to have a little aerial photo of your house. And then

Google Maps comes along and they go out of business.

Jose Portilla: So, they have 30,000 essentially stock images that

they did not end up selling, because not everybody

wanted to buy a picture of their house. And then they

gave it to this artist and he manually, instead of the

convolutional neural network or some filter, he just

looked for patterns. So, then he gets like the nine

images where everyone is washing their car, or the

nine images where all the windows are boarded up in

these houses.

Kirill Eremenko: Yeah. And puts them into one big frame or the entire

collection near each other, and then you have to guess

the name like car wash.

Jose Portilla: Yeah, you have to figure out what's the same or similar

thread between all these paintings and images.


Kirill Eremenko: Yeah, definitely some interesting ideas. But fair point

on maybe it's not the best way if you need explanation.

Speaking of the building. So, it's a bunker above

ground from World War II, with two or three meter

thick ceilings and walls. Did they tell you in your

group that in the bunker, like, there's actually a living

residence above ...

Jose Portilla: Yes. The whole building is insane. Because you look at

it from the outside. Yeah, it's like concrete, very

industrial or brutalism looking. And I thought to

myself like, "That's weird, bunkers are usually ... I

thought they were underground." And I was like, "I'm

surprised this could have survived the Berlin

bombings."

Jose Portilla: And then you go inside and they show you how thick

the walls are. You're like, "Oh, this could survive

anything, because they're hugely thick." Yeah. Then

later in the tour, it turns out the owners of the collection live at the

top floor of this bunker. It's so weird.

Kirill Eremenko: And they explained to us how they managed to do that

because in Berlin, you're not ... You want to tell that

story?

Jose Portilla: You probably remember better than I do. Some weird

legal thing, right?

Kirill Eremenko: Yeah. In Berlin, you're not allowed to legally add an

extra floor on top of a building that already exists. And

this was a bunker. They didn't want to live in the

bunker. They wanted to add a floor on top and live there. But

the legal loophole was that bunkers ... This building

doesn't fall under the classification of a house.


Kirill Eremenko: It falls under the classification of a bunker, and

bunkers are normally underground, so everything that

we see above ground in this case is considered the

basement. Basement one, basement two, basement

three, basement four. So, they were like, "Oh, we got to

add a top level. We kind of live in a basement."

Something like that. So, that was really fun. So, are

you having a good time in Berlin overall?

Jose Portilla: Yeah, it's been great to ... I've just been traveling

around Europe for this. Yeah. So, it's been nice to get

to see everybody.

Kirill Eremenko: Very nice. Yeah. Well, today's podcast. First of all,

some of our students who know us both will be ...

Jose Portilla: Their minds blown that we're talking to each other.

Kirill Eremenko: Yeah. It would be like thinking, "What? Did the world

turn around?" Because we are apparently like ... Well,

we're competitors. We compete. Fierce competitors at

each other's throats. So, how would you explain that?

Why are we even talking, let alone recording this

podcast if we're such fierce competitors?

Jose Portilla: It's so funny. Well, we've had this conversation

multiple times, but everyone from the outside thinks

like one of us has to die in order for the other to

survive.

Kirill Eremenko: Hunger Games. Yeah.

Jose Portilla: Yeah. Exactly. But if anything, it's the opposite.

Specifically like at Udemy where ... I don't know. Some

people think like, "Oh, you probably wish that your


competitors come up with really bad courses or

something. That way your courses can reign supreme."

Jose Portilla: When, in fact, the opposite is true, because the worst

thing that can happen to me is that a popular

competitor releases a bad course. Because then

students think, "Oh, even just online education in

general, it wouldn't be that great."

Jose Portilla: Suddenly, it becomes a reflection of not just one

course but their entire online learning experience. So,

one of the best things that can happen to me is to have a

competitor like you with good content. And then it's

like I was telling you earlier, buying a course is not like

buying a car where you buy one car and then many

years later, you're not buying a car until much further

into the future.

Jose Portilla: It's more like buying a book on a topic you like. You're

going to buy multiple books by multiple people. So, the

best thing that can happen to me is to have a competitor with a

good course, which engages a student and then says,

like, "Oh, I can actually learn some of this complex

stuff online. Let me go check out other courses, etc."

So, yeah, it's not some Hunger Games situation.

Kirill Eremenko: Yeah. For sure. And also, engage one course, tell my

friends about it. They'll come and different people like

different styles. You and I have different styles of

teaching, inevitably.

Jose Portilla: Yeah.


Kirill Eremenko: Everybody is unique and somebody might prefer the

way you explain something. Somebody might prefer

the way I ... Or somebody might benefit from both.

Jose Portilla: I was just about to say, different people like both

styles. I would say that the Venn diagram of our

crossover students is huge.

Kirill Eremenko: Yeah.


Kirill Eremenko: For sure. And also, what I like about the competition is

it doesn't let you slack off. I mean either of us, because

we hold each other up to a standard. If there was just

one of us, then the standard drops. You, first of all,

might not notice that your standard of teaching is

dropping. Students might not notice, because they

have nothing to compare it to. And then you won't be

incentivized to improve. I like that I can't let

my standards drop, you can't let your standards drop,

because of the nature of the competitive market.

Jose Portilla: Yeah. The quality of courses is getting better. I don't

know if you remember your first course. Ever look

back at it?

Kirill Eremenko: Yeah, I have. Oh, it was so [crosstalk 00:13:40].

Jose Portilla: I'm so embarrassed how bad my first courses are.

Kirill Eremenko: I know. It's like night and day.


Kirill Eremenko: I do appreciate the effort I put in. Listening back to it,

it took so much courage to start recording.


Jose Portilla: I know. I believe we might not even start doing it,

right?


Jose Portilla: Because you're by yourself in a room recording it not

knowing if anyone will ever even view this. And I'm

shocked that ... I don't know, it's almost like a different

person made that course, because it's like, "I can't

believe I did this."

Kirill Eremenko: Oh, I can only be grateful to that person who I was for

making that leap. That was good. Okay. So now, with

that out of the way. I don't know, let's maybe talk

about what are some of the recent trends, some of the

recent things that you're seeing in the data science AI

industry that you're creating courses on that students

are excited about?

Jose Portilla: Well, let's see. Recent trends. There's always new

updates to the various deep learning libraries. So, like

TensorFlow 2.0 just came out. Like just, just came

out. Maybe like a couple weeks ago maybe.

Kirill Eremenko: No. I think it was in June.

Jose Portilla: Well, that was the beta or alpha, right?


Jose Portilla: I mean, the official 2.0 release was pretty recent. And

then PyTorch 1.0 also came out really recently.

Kirill Eremenko: Okay. Very cool.

Jose Portilla: So, those are some new things. New libraries are

always being developed. Maybe this might not be such

a new trend, since I recently saw the publication date of

this paper, but something I just recently found out about was

the neural architecture search or NAS by Google, or

Google AI where they're basically using recurrent

neural networks to create or search for optimized

architectures for different problems. So, like the

CIFAR-10 dataset, with the 32x32 color images of 10

different topics like planes, frogs, whatever.

Kirill Eremenko: 60,000 images there.

Jose Portilla: Yeah. What they're doing is they're basically deciding

that humans, since we design everything in a very

structured way, like convolutional neural networks are

very structured with the kernels, everything is still

kind of squared, connected. That perhaps there is

some more organic, more optimized connection.

Jose Portilla: So, they're using a recurrent neural network to

actually build the architecture of another network to

solve for the CIFAR-10 dataset problem. And they were

able to actually improve the performance quite a bit

from whatever the state-of-the-art convolutional neural

network could do.

Jose Portilla: And this is with a network of essentially what looks

like to the human eye randomized connections, and

they can even skip layers and stuff. And so that one

really blew my mind to the fact that I used to think

now like, "Oh, the future is like recurrent neural

networks or the future is convolutional neural

networks." When probably the reality is the future is

some unknown random network that another network


has figured out. That's almost like the ... What is it?

Like the I am Robot or ...

Kirill Eremenko: I am Robot. Yeah.

Jose Portilla: Yeah, the [crosstalk 00:16:53].

Kirill Eremenko: I, Robot.

Jose Portilla: I, Robot? Where you have robots building robots. Now,

we have neural networks building other neural

networks.

Kirill Eremenko: That's really cool. And then you can go deeper than

neural networks building neural networks [crosstalk

00:17:05].

Jose Portilla: Yeah. Then the other thing then is like a loop. Almost

like have a neural network build a neural network for

finding neural networks. What's the most optimized

thing? Yeah, that one really blew my mind, because it

really showed that the shape of the actual network

seems to have quite a bit more importance than

the weights.

Jose Portilla: And it's not something I think ... Well, this was

published in 2017. Now, people, I'm sure, are really

thinking about it. But definitely just five years ago, I

don't think that many people were thinking about if a

randomized neural network would actually perform

better than a structured one, given the same

randomized initialization of weights.

Kirill Eremenko: Yeah. Interesting, because you sent me that paper. I

had to look through it. First of all, I was surprised. I

was like, "Yeah, this is 2017." But still, it also, as you


said, blew my mind that you have, from scratch, this

neural network that they created to create new neural

networks was building them from absolutely zero and

outperforming by a small margin like 0.09%

better performance and 1.05x faster than the human ones,

but still outperforming them on the CIFAR, right?

Jose Portilla: Yeah, CIFAR-10. Yeah.

Kirill Eremenko: CIFAR-10 dataset. That was really cool. The way I

understood it, the way it works is it takes the neural

network that it is building or wants to build and

represents it as a variable-length string. So, it puts it

into a text string basically. The representation of the

neural network. And then it iterates through that

string through what was ... Gradient descent, right?

Jose Portilla: Mm-hmm (affirmative).

Kirill Eremenko: To optimize for the accuracy of the image prediction. Is

that about right?

Jose Portilla: Yeah, basically. Yeah. I think maybe the reason I

found out about it so recently was I recently, this year

for sure, even though maybe it was published two,

three years ago. I recently saw the pictures of the

neural network architectures that the RNN was

actually solving for.
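
The "variable length string" idea described above can be made concrete with a small sketch. This is not the method from the paper: as far as I know, the published approach trains the RNN controller with a reinforcement learning signal, whereas the sketch below swaps in plain random search to stay short. The token format, the search space values, and the `toy_score` function are all hypothetical stand-ins; in a real search the score would come from training the sampled network and measuring validation accuracy.

```python
import random

# Hypothetical search space (values chosen only for illustration).
FILTER_SIZES = [1, 3, 5, 7]
NUM_FILTERS = [24, 36, 48, 64]


def sample_architecture(rng, max_layers=6):
    """Sample a variable-length list of layer descriptions."""
    n_layers = rng.randint(2, max_layers)
    layers = []
    for i in range(n_layers):
        layers.append({
            "filter_size": rng.choice(FILTER_SIZES),
            "num_filters": rng.choice(NUM_FILTERS),
            # Indices of earlier layers wired into this one (skip connections).
            "skips": [j for j in range(i) if rng.random() < 0.3],
        })
    return layers


def encode(layers):
    """Serialize an architecture as one text string, one chunk per layer."""
    chunks = []
    for layer in layers:
        skips = ",".join(str(j) for j in layer["skips"])
        chunks.append(f"{layer['filter_size']}x{layer['num_filters']}[{skips}]")
    return "|".join(chunks)


def decode(text):
    """Parse the string produced by encode() back into layer descriptions."""
    layers = []
    for chunk in text.split("|"):
        head, skips = chunk[:-1].split("[")
        size, filters = head.split("x")
        layers.append({
            "filter_size": int(size),
            "num_filters": int(filters),
            "skips": [int(j) for j in skips.split(",") if j],
        })
    return layers


def toy_score(layers):
    """Placeholder for 'train the child network, return validation accuracy'.

    It rewards skip connections and mildly penalizes size. It is NOT a real metric.
    """
    n_skips = sum(len(layer["skips"]) for layer in layers)
    size = sum(layer["filter_size"] ** 2 * layer["num_filters"] for layer in layers)
    return n_skips - size / 10000.0


def random_search(n_trials=50, seed=0):
    """Sample architectures at random and keep the best-scoring one."""
    rng = random.Random(seed)
    candidates = [sample_architecture(rng) for _ in range(n_trials)]
    best = max(candidates, key=toy_score)
    return encode(best), toy_score(best)


if __name__ == "__main__":
    print(random_search())
```

The point of the string encoding is that a sequence model can emit it one token at a time, which is why a recurrent network is a natural fit for proposing architectures of different depths.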

Jose Portilla: And it was the weirdest looking. I mean, it looked like

a little kid drew sloppy lines with random neurons

everywhere. Nothing was even. You would expect

maybe the RNN would find some sort of hidden

structure, right? But it was just ...

Kirill Eremenko: Unstructured.


Jose Portilla: Yeah, for better or for worse, it looks more like an

organic brain. Like an actual biological brain, right?

Kirill Eremenko: That's so cool. You got to send me those images,

because [crosstalk 00:19:53].

Jose Portilla: Yeah, I'll have to find them. You look at them and it's

like there's no way this performs better than a

structured network.

Kirill Eremenko: Have you ever seen those images of when certain parts

of a building or an airplane instead of a human

designing them, they get an AI to build it?

Jose Portilla: Oh, yeah.

Kirill Eremenko: Through reinforcement learning. And it's completely

weird, completely random. Like simple parts that hold

... You know that part under a table that holds the legs

of the table to the main part of the table?


Kirill Eremenko: Like 90 degree type of angle metal thing. Like if you get

an AI to design, it looks completely randomly weird.

And it's like 30% lighter, 100% stronger, less material

required. It looks very organic.

Jose Portilla: Yeah, I remember I was once in a museum. And one of

the exhibits was an antenna that was designed by a ...

It wasn't technically AI. It was like a genetic algorithm

that tried to keep solving for what kind of antenna could

get the strongest signal.

Jose Portilla: And the antenna looks so weird. It looked like a string

of spaghetti like floating in space or something. And it


was like, "Yeah, this is what the algorithm figured

would get the strongest signal in this particular spot."
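
For listeners who have not seen one, a genetic algorithm of the kind Jose mentions can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the antenna design from the museum: the `fitness` function is a made-up stand-in for an antenna simulator, and the population size, mutation rate, and number of generations are arbitrary.

```python
import random


def fitness(genome):
    """Stand-in for a signal-strength simulator (hypothetical).

    Peaks at 0 when every gene equals 0.5.
    """
    return -sum((g - 0.5) ** 2 for g in genome)


def mutate(genome, rng, rate=0.2, scale=0.1):
    """Nudge each gene with probability `rate` by Gaussian noise."""
    return [g + rng.gauss(0, scale) if rng.random() < rate else g for g in genome]


def crossover(a, b, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(n_genes=8, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Keep the fitter half unchanged, refill the rest with mutated offspring.
        parents = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


if __name__ == "__main__":
    best, history = evolve()
    print(history[0], history[-1])
```

Because the fitter half survives each generation untouched, the best score never gets worse over time. Nothing in the loop cares whether the result looks tidy to a human, which fits the spaghetti-shaped antenna in the story.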

Jose Portilla: And it just goes to show that it's really hard to have

intuition for some of this stuff. And it kind of makes

sense. I don't know. The more you study evolution and

biology, certain animals are super weird. Like you see

a platypus, or like a squid has a beak like a bird. It's

so bizarre, but I don't know. Nature is essentially a

really long reinforcement learning algorithm, where it's

like many, many generations, what works, what

doesn't work.

Kirill Eremenko: Yeah. But what I find interesting. I was also thinking

about it just now, that at the same time in nature, a

lot of things are symmetrical.

Jose Portilla: Yeah, right?

Kirill Eremenko: As weird as they are, they're symmetrical, but what AI

designs most of the time is asymmetrical. There's

kind of a combination of both in nature.

Jose Portilla: Yeah. And then not to get too philosophical, but then

you see certain numbers keep popping up in nature

like pi or something.

Kirill Eremenko: Oh, the Fibonacci number.

Jose Portilla: Yeah. Or the fact that the definition of a normal

distribution, the actual function for it has pi in there.

It blows my mind. How is this freaking number

showing up everywhere and things that you wouldn't

think it would show up? You wouldn't think that

relationship of a circle would have much to do with a


normal distribution. But then it happens [crosstalk

00:22:28].
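
For reference, this is the function Jose is referring to, written with mean \mu and standard deviation \sigma, alongside the integral that explains where the pi comes from:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}
```

The factor in front of the exponential is there so that the total probability is 1. The standard way to evaluate the integral on the right is to square it and switch to polar coordinates, which is how the circle gets into the bell curve.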

Kirill Eremenko: And then everything follows. Like the heights of

humans. I don't know, populations of animals,

bacteria. A lot of things are normally distributed in

this world.

Jose Portilla: Yeah. I don't know. There may be some deeper order to

things that we're just not getting, but yeah. Yeah, like I

said, you see a platypus and you're like, "There must

be some random noise here."

Kirill Eremenko: Crazy. All right. Well, shifting gears a little bit. You

teach online. And by the way, congratulations on 1.2 million

students.

Jose Portilla: Yeah. Well, congratulations to you too, to the both of

us, I guess.

Kirill Eremenko: It's crazy. How does it make you feel? 1.2 million

students worldwide.

Jose Portilla: Oh, it feels bizarre. I remember thinking like a long

time ago, like, "Man, when I hit 100,000 students, that

will be it. I would have hit the ultimate goal." And then

you hit that, and then you've hit it too. And then you

think, "Okay. 250,000 students, let's really go for it."

Some crazy goal. Then you hit that and you're like,

"Oh, okay, half a million." Yeah, it's just been

absolutely insane how fast everything has been

growing just in a couple of years.

Kirill Eremenko: Yeah, it is very fast. We're, I think, at 920,000

[crosstalk 00:23:45].


Jose Portilla: Yeah. I bet you, if we had this same conversation even

just some weeks from now, you would have had a

million as well.

Kirill Eremenko: Yeah. Probably.

Jose Portilla: Next time I see you, for sure, you'll have at least a

million, if not much more.

Kirill Eremenko: Yeah. That puts a lot of responsibility, right? You got

to create the right content. The right guidance. It's no

longer just fun and games and just putting out there

like things that you're passionate about. But you also

got to think through what do people need? What do

your 1.2 million students need? What are their

requirements?

Kirill Eremenko: You got to think about the needs of the students. I

guess my question to you is how do you go about that?

How do you go about communicating with your

audience and finding out what is it that you can help

them with the most in this next stage of your journey?

Jose Portilla: That's an interesting question. It's almost like as we've

been progressing through this online education world

and this population of students, the analogies keep

changing. So, at first, it was like, "Okay, I can

structure myself as if I teach a course, like a

classroom of 30 students."

Jose Portilla: Then it starts getting too big. It's like, "Okay. Well,

now, I'll structure myself like a seminar." So, maybe I'll

have a set piece of notes for students like they

would in a large seminar class. Less one on one

interaction.


Jose Portilla: Then it starts getting bigger, it's like, "Well, I guess

now I'm structuring myself as a department in a

college." So now, I have TAs or something and much

more standardized practice across multiple courses or

something like that.

Jose Portilla: And then you're structuring yourself as a university or

something. So then now you have multiple

departments of like, "Oh, Python topics, or R topics or

Tableau topics, etc." And then there's some sort of

structure within those, etc.

Jose Portilla: And now with our scale, it's almost like the analogy

becomes like a city or something. So, then you have to

start thinking of ... At this point, one-on-one

interaction as much as I love it is kind of impossible.

We can't communicate with every citizen if we

have a million people, right?

Jose Portilla: So, then you start trying to think of what does a city

do. So, they may have meetups. So then we try to have

different sources for students to interact with each

other. That's maybe a little more fluid. And this is

something maybe you can have advice for me, because

I know you're probably better at this than I am.

Jose Portilla: But just trying to build that sense of community.

Maybe off of Udemy, because the Q&A forums, for

interaction purposes from one student to another, aren't

exactly optimized. First, we tried Slack. That quickly

got unscalable, because we couldn't pay for every

student, and it deletes the history.

Jose Portilla: Then we tried Gitter, which is kind of like this Slack

based off GitHub, but that was also starting to have


scaling problems. And then we switched to Discord,

which I hadn't really heard of before until someone

suggested it to me. And it's like for online gaming. Do

you know what I mean?

Kirill Eremenko: Yeah, I've heard of them.

Jose Portilla: Yeah. So, it's a free version of essentially what Slack

does. And so, so far, that's what we're using to try to

help scale a sense of community. Yeah, and they can

do things like ... Well, like I said, you're probably

better at this than I am of things like a podcast or

something, to build a sense of community or some sort

of weekly updates, that kind of thing where ...

Jose Portilla: You're not going to be able to talk to each student, but

at least you can try to encourage students talking to

one another. So, I think as we scale larger, trying to

encourage the student interaction is one of my

priorities.

Kirill Eremenko: Yeah, I absolutely agree. I wouldn't say much better

than you at this. At SuperDataScience, we're also

exploring things. So, right now, we are trying out the

Slack approach that you've already tried. We're also

considering an approach of forums, an approach of

building our own system because our whole LMS at

SuperDataScience, the learning management system is

completely custom built by ourselves.

Jose Portilla: Yeah. [Murray 00:28:03] told me that.

Kirill Eremenko: And so we can add on whatever we want to just like ...

We just need to see that there's a need for it and

there's time. But in general, I completely agree with


you that as much as, well, I want to interact with

everybody. I simply physically cannot do that. And

therefore, putting people into groups to talk to each

other. That's the best.

Kirill Eremenko: I'll give you an example. I was at DataScienceGO, the

conference we run in San Diego. I was running a

workshop on Tableau. And there's, I think, like 60

people in the room. All different levels. And I said right

away, "This is a workshop for beginners. If you're

advanced, there's another workshop in this

neighboring room about AI ethics. Go there. You get a

lot of value out of that. This is a workshop for

beginners."

Kirill Eremenko: I think one person changed rooms. But still, there

were a lot of different levels here. Very advanced

people, beginner people. While we were going through

these exercises on building this dashboard, some

people are going really fast ahead and I thought, "What

are you doing in this room? I told you go to the other

one." And they're like, "Yeah. No, I just wanted to play

around, see what the dashboard will be like, see what

the dataset will be like."

Kirill Eremenko: And so what I started doing is said, "All right, if you

went ahead, like far ahead, why don't you get up and

help somebody who's falling behind? There's 60

people, which is not a million, but still, I can't go help

everybody."

Kirill Eremenko: And so the more advanced people, like I remember

specifically Jonathan and Ogo. If they're listening to

this, huge thank you for that. They just got up and


helped out a lot of people. And there were others as

well. And in that sense, nobody was bored. Everybody

was keeping up.

Kirill Eremenko: And I think that sense of community is amazing in

data science. Data scientists want to help each other.

Our job is to facilitate that and find the best way. It

looks like we're both exploring to find like what is the

right medium for this community to thrive.

Jose Portilla: Yeah. I don't know. It almost sounds douchey to say

this, but we really are pioneers in this space, because

there's no one else we can really talk to of like, "How

do you deal with the community of students this

large?" Where you don't have some university or

company level team to handle all of it. So, we have to

explore these different methods.

Jose Portilla: And the other thing I was going to say about the

students interacting with each other. I think students get

a lot out of it as far as the ... There's some more official

term for it, but like the pyramid of knowledge or the

steps to really understanding a topic.

Jose Portilla: Like the very final step is teaching a topic. So, you

know you understand something if you're able to teach

it. So, I think it helps the students to help other

students because then they know that they really

understand the topic if they're able to help out another

student in it.

Kirill Eremenko: That's a great way of putting it.

Jose Portilla: Yeah. There's some official term for this that someone

will have to Google that there's a hierarchy of


understanding. And the very last or top level is the

ability to teach it. It has some sort of proper noun

name, after whoever discovered it.

Kirill Eremenko: Okay. Yeah, I think I've heard of this well before, but it

doesn't come to the top of my head, but I agree with you.

Yeah.

Jose Portilla: Although I teach stuff, I feel like I don't understand

crap. Even though I teach them.

Kirill Eremenko: Why did you love it?

Jose Portilla: Because there's like a new thing every five seconds in this

freaking field.


Jose Portilla: But actually, I was going to say that might be one of

the more positive aspects of the field we work in is that

the libraries are so new sometimes. And because if you

are the world's expert in TensorFlow 2.0 and you are

not a developer at Google that was actually working on

it, the amount of total experience you can have at this

moment in time is at most like one or two years,

right?

Jose Portilla: Technically, it's based on Keras, so you could kind of

have more experience. But for something like PyTorch

1.0 as well, the most experience you could possibly

have to be the world's expert is just a few years versus

like calculus or whatever. It was around since you

were born, so you could have a lot more experience in

it.


Jose Portilla: And I think because in this field, so many people

remember what it's like to be a beginner, because it

was not that long ago that they were a beginner

themselves just by the nature of the field. They don't

mind helping out, because it was not too long ago

themselves that they knew nothing about like

TensorFlow or PyTorch.

Jose Portilla: So, I think that definitely helps out. Just a sense of

community that for whatever reason, data science and

Python has, versus some other ... Not to disparage

other communities, but like ...

Kirill Eremenko: Consulting.

Jose Portilla: Yeah, like consulting or some people in JavaScript or

web development that's been around much longer like

HTML, CSS and JavaScript. There's definitely an

attitude of like, "Oh, you don't get this? Whatever."

Jose Portilla: Because they've had enough time with it like 15 or 20

years since web 1.0 that it's probably faded from their

mind of what it's like to be a beginner versus Python

and data science, where the libraries are constantly

being updated, and there's a new library every year so

to speak. Everyone remembers what it's like to be a

beginner, so they don't mind as much helping out.

Kirill Eremenko: Got you. Is your community mostly beginners? What

did you ...

Jose Portilla: That's a great question about the general skill level of the

community. It depends how you define beginner,

because they come from all walks of life, right?



Jose Portilla: So, there's people that, yeah, they've never

programmed before. But they'll have a PhD in ...

Kirill Eremenko: Psychology.

Jose Portilla: Yeah, psychology or something. So, they're not

beginners in the sense that they're beginners at

learning, because this person is clearly able to be self

motivated and teach themselves complex topics. It's

just that they didn't take a Python class in university

because it wasn't taught there for them.

Jose Portilla: And then there's other people that they already work

at AWS or something or they're already working at

Google, and their boss just said, "Oh, I need you to

learn this esoteric library in Python or R or whatever."

Jose Portilla: And then they're definitely not beginners and they ...

For them, it's almost like they just need to pick and

choose certain lectures from the courses of like, "Oh,

let me quickly just learn these couple of things my boss

told me to learn." I think, yeah, the majority of our

"beginners" ...

Kirill Eremenko: Like newcomers to data science.

Jose Portilla: Exactly, yeah. They're not beginners in the sense that

they don't know anything. They usually have some

sort of expertise in a field outside of data science or

programming. And I think it kind of attracts that mind

that you are already technically adept at something. It

makes you interested in the possibilities of leveraging

data science in Python with your current skillset.

Kirill Eremenko: Definitely. That's something we're also seeing, I think,

because of all ... Between 60% and 70% are


newcomers to data science. Whether just college

students or transitioning into data science. And then

about 20 or so percent are more advanced

practitioners. And then about 10% are managers,

executives, entrepreneurs. But what I find interesting

is that over time ... Because we've been doing this for

years. How long have you been teaching?

Jose Portilla: Since March 2015.

Kirill Eremenko: 2015. I started on Udemy in 2014, but in data science,

it was, I think, June 2015. And so similar timeline,

right?


Kirill Eremenko: And so over that ... That was, what, four years. I've

seen people grow from beginner to intermediate to

almost advanced practitioner level. I've seen people get

jobs and so on. And it's really cool to see this growth

and especially if you get to meet them in person. That

is just fantastic.

Kirill Eremenko: They're like, "Oh, I remember you three years ago, you

were like asking these questions and you were just

starting out into your journey or transitioning from

whatever other career you had. And now, you're a data

scientist. You're coaching others. People are asking

you for advice." That is so inspiring.

Jose Portilla: It blows my mind sometimes, like the careers that

some of my students have been able to get. I was just

talking to someone recently who ended up becoming a

senior developer for AWS. I start thinking to myself,


"Would I be able to get that job? I don't think I would."

Given the interview process and how hard it is.

Jose Portilla: And they're like, "Oh, thank you. Your course helped

me with that so much." I was like, "I don't know if I could

do your job." It blows my mind when you see students

getting jobs that like, "I don't think, I would probably

fail that interview if I wasn't really practicing for it."

Yeah. So, it's crazy the growth of the students and how

fast everything has been going just in the past few

years.

Kirill Eremenko: That's absolutely true. What's the most common

question that students ask you?

Jose Portilla: Where do I find the notes?

Kirill Eremenko: You get like hundreds of questions. Like we both get

hundreds of questions.

Jose Portilla: Yeah. Well, there's certain questions that's just like ...

It's also a bit of a selection bias of the kind of person

that asks a question on forums or something. It is

usually a person who has not done a quick Stack

Overflow search or something.

Jose Portilla: But beyond that, beyond little silly questions like that,

maybe one of the most common questions I get is like,

"How do I choose a machine learning model?" One

thing I do is I point them straight to the ... You know

scikit-learn. They have their choosing an estimator

diagram. It's like this weird, ugly little bubble tree

chart. That's like, "Oh, if you have this many data

samples, choose this. If you're trying to do

unsupervised or supervised, do that."


Jose Portilla: So, I point at that chart, but then I also tell them that

realistically, for some of these models, it's difficult to

have an intuition for them. Once you deal with them a

lot, then you can be like, "Oh, I think you should do

this. Blah-blah." If you're about to do an SVM, there are

not that many people that would have a strong

intuition of what the exact C value or gamma value

should be, right? They pretty much always just do a

grid search. And the same for choosing a model. You

usually run a couple and see what performs best or

then make a combination of models.
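
The grid search Jose mentions can be sketched roughly as follows. This is only an illustration, assuming scikit-learn is installed; the iris dataset, the feature scaling step, and the candidate values for C and gamma are choices made for this example and were not discussed in the episode.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small toy dataset bundled with scikit-learn (assumed here for illustration).
X, y = load_iris(return_X_y=True)

# Scale the features, then fit an RBF-kernel SVM.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Candidate values are illustrative, not recommendations.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.001, 0.01, 0.1, 1],
}

# Try every (C, gamma) pair with 5-fold cross-validation.
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Rather than guessing C and gamma up front, the search scores all sixteen combinations and reports whichever cross-validated best on this particular dataset.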

Jose Portilla: And I think a lot of students sometimes go into it

thinking like, "By the end of this, I will know exactly

what model to choose in any situation." When

realistically, you're still going to have to test out

different models. And I think it's hard to convey to

students that even after you are extremely

knowledgeable on this topic, when it comes to a new

problem, you're still just going to have to do what

everyone else does, explore the unknown, not really

know what's the best model.
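
The "run a couple and see what performs best" workflow might look something like the sketch below. It uses only the Python standard library; the synthetic two-cluster dataset and the three deliberately simple candidate models are assumptions for illustration, not anything named in the conversation.

```python
import random
from collections import Counter
from math import dist

def make_blobs(n_per_class=60, seed=0):
    """Two overlapping 2-D Gaussian clusters labeled 0 and 1 (synthetic data)."""
    rng = random.Random(seed)
    data = []
    for label, center in enumerate([(0.0, 0.0), (2.0, 2.0)]):
        for _ in range(n_per_class):
            point = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data

def majority_class(train):
    """Baseline: always predict the most common training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda point: most_common

def nearest_centroid(train):
    """Predict the label whose mean point is closest."""
    centroids = {}
    for label in {lab for _, lab in train}:
        pts = [p for p, lab in train if lab == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return lambda point: min(centroids, key=lambda lab: dist(point, centroids[lab]))

def one_nearest_neighbor(train):
    """Predict the label of the single closest training point."""
    return lambda point: min(train, key=lambda item: dist(point, item[0]))[1]

def cross_val_accuracy(fit, data, k=5):
    """Mean accuracy over k folds; every point is held out exactly once."""
    scores = []
    for i in range(k):
        test = data[i::k]
        train = [item for j, item in enumerate(data) if j % k != i]
        predict = fit(train)
        scores.append(sum(predict(p) == lab for p, lab in test) / len(test))
    return sum(scores) / k

data = make_blobs()
candidates = {
    "majority_class": majority_class,
    "nearest_centroid": nearest_centroid,
    "one_nearest_neighbor": one_nearest_neighbor,
}
results = {name: cross_val_accuracy(fit, data) for name, fit in candidates.items()}
best = max(results, key=results.get)
for name, score in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

The point is the loop at the end: every candidate gets the same folds and the same metric, and the choice of model falls out of the measured scores instead of intuition.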

Jose Portilla: So, you can be the world's top expert. At the end of the

day, when it comes to a new problem, you're still going

to have to kind of guess and check almost. Which is

kind of, to bring it back, exactly what that neural

architecture search is doing, right? Keep guessing and

checking until you find the good fit for the good model.

Kirill Eremenko: Or what AutoML is designed to do.

Jose Portilla: Exactly. Yeah.

Kirill Eremenko: Do you think AutoML will replace data scientists?


Jose Portilla: That is such a good question, because I used to think

like, "Oh, crap. Maybe we're going to be out of a job."

Especially this robot building robots and models

building models. What's left for us?

Jose Portilla: I don't know. I think what is defined as a simple

problem keeps expanding as you go throughout time.

Because something like a linear regression task many,

many years ago, that's [goaltending 00:39:37]. It's just

beginning to figure it out. That's an extremely hard

problem. How do I find the line of best fit?

Jose Portilla: Now, that's an extremely easy problem. So, I don't

think it will replace data scientists or machine learning

practitioners. It will just basically push them to harder

problems and reclassify things as easily solvable

problems or easier problems for something to be

automated against.
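
To show how small the line-of-best-fit problem has become, here is a minimal sketch of ordinary least squares for a single input variable, written in plain Python. The function name and the sample points are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```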

Kirill Eremenko: Absolutely. And I think there's always going to be room

for human creativity in these aspects. At least for the

next 10 years.

Jose Portilla: Yeah. Then you see the neural networks that are

painting and the recurrent neural networks that are

doing text generation, like character. I'm sure you've

read that blog post of the unreasonable effectiveness of

recurrent neural networks.


Jose Portilla: [crosstalk 00:40:27]. And it's writing out Shakespeare.

Kirill Eremenko: Yeah, that's an old blog post, though.

Jose Portilla: That's a very old blog post.


Kirill Eremenko: Like 2015 or something. A really good one as well.

Jose Portilla: Fantastic. And it always blows my mind that the

network is doing it character by character, not word by

word. The fact that you can even read it blows my

mind.
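
The character-by-character idea can be illustrated with something far simpler than a recurrent neural network. The sketch below is a plain character-level Markov chain, swapped in as a stand-in because it fits in a few lines of standard-library Python; it is not the model from the blog post, and the tiny corpus and function names are assumptions for this example.

```python
import random
from collections import defaultdict

def train_char_model(text, order=3):
    """Map each `order`-character context to the characters that follow it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed, length=80, rng=None):
    """Extend `seed` one character at a time by sampling from the model.

    The seed must be exactly as long as the order the model was trained with.
    """
    rng = rng or random.Random(0)
    order = len(seed)
    out = seed
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:  # context never seen in training: stop early
            break
        out += rng.choice(followers)
    return out

corpus = (
    "to be or not to be that is the question "
    "whether tis nobler in the mind to suffer "
    "the slings and arrows of outrageous fortune "
)
model = train_char_model(corpus, order=3)
sample = generate(model, seed="to ", length=80)
print(sample)
```

Even this toy version only ever picks the next character, never a whole word, which is why readable output from a character-level network is so striking.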

Kirill Eremenko: Have you seen ... There's a movie that they filmed

based on the script created by [crosstalk 00:40:51].

Jose Portilla: I have heard of it. I definitely have not seen it.

Kirill Eremenko: What is it called? I forgot. Solar something. I'll link to

it in the show notes and I'll send it to you as well. It is

ridiculous. They got Middleton. So, the actor from

Silicon Valley. You know that TV show?

Jose Portilla: Oh, yeah.

Kirill Eremenko: Jeff Middleton or something.

Jose Portilla: I forget his name, but yeah, I know what you're talking

about.

Kirill Eremenko: Yeah. And then they got him to act the main role and

it's like a whole script written by this neural network

that even gave itself ... It's been a while. I forgot. It

called itself Barney. It called itself a name. It's like a

30 or 15 minute long short movie. It was on the

London Film Festival I think.

Kirill Eremenko: The sentences themselves make sense by what people

say in the movie, but overall it's complete nonsense,

but they still acted it out in a way that you get like

goosebumps down your spine like, "Wow. This is

a space saga with a love story in it." It's pretty funny.


Jose Portilla: Yeah, it's crazy, because it's clear that the networks

are able to easily conquer now like things like

grammar. It will just take a deeper network to conquer

something like plot, right? I don't know if you're a ...

This was maybe within the past year. OpenAI created

basically a model to produce text articles.

Kirill Eremenko: No, I don't know.

Jose Portilla: Yeah, that was really interesting because they did not

release the full model because they thought it was too

dangerous, because they basically ... With a seed

sentence of Syria blah-blah-blah.

Jose Portilla: Suddenly, this model could generate a full ... It was

essentially like fake news text article that read

perfectly. That really read like someone had written it

personally. And it was just completely made up by a

network. And they decided it was so good at generating

fake news style articles that they refused to release the

full network.

Kirill Eremenko: That is crazy.


Kirill Eremenko: This kind of reminds me of the story of CRISPR. The lady

that developed CRISPR for adjusting genes. As soon as

it came out of the lab was like ... If I'm not mistaken,

she was like, "This is very dangerous for the world.

What have we created?"

Jose Portilla: Yeah. It's almost like a milestone of you know a

technology is really good and really worth pursuing if

it's always like this double edged sword. Something

like atomic science, right? Like you have this really


interesting aspect of nuclear energy. And a certain

reactor, like a thorium reactor, whatever, has the

potential for very low nuclear waste and you're

conquering the atom itself, like what the universe is

built out of.

Jose Portilla: On the other hand, you also have the ability to create

a nuclear weapon. And I think it's like that for

anything. You have convolutional neural networks that

can detect cancer or skin cancer better than any

doctor could. But then, at the same time, you could

abuse these networks to then begin racial profiling

based off corrupt datasets.

Jose Portilla: Yeah, any technology I think has the ability to be

exploited for good or bad. But at least it's a good signal

that you're onto something. Like CRISPR, like you

were saying, if you see a child with a birth defect or

something, the fact that you could maybe fix it

preemptively is fantastic. But then, should you be able

to choose the color of your baby's eyes? Maybe not so

much? I don't know.

Jose Portilla: Then there's also the ethical questions. The ethics of it

is something that ... I don't know. That will take a long

time to catch up to the technology.

Kirill Eremenko: For sure. What do you think? How far are we from

AGI?

Jose Portilla: It's funny. I was just having a conversation with

someone about this here at Udemy LIVE. Every time I

get asked this question, my timeline becomes shorter.

I remember when I was first asked the question, I was


like, "Never." And then I start building out networks

myself.

Jose Portilla: The one that really convinced me was a

couple years ago, when I really built up my first good

text generation network. I was like, "Oh, this is way

more effective than I thought it was going to be." I felt

like I don't know what I'm doing, and I'm actually able

to do something that could fool a person.

Kirill Eremenko: Imagine someone who knows what they're doing and

then ...

Jose Portilla: That's exactly what ...

Kirill Eremenko: Just steal them.

Jose Portilla: Yeah. And there's people way smarter than me working

on stuff that's way harder than this. Will it be in my

lifetime? I don't know. I definitely now believe that it

will be reached ... It's inevitable now at a certain point

in humanity there will be general AI, the singularity or

whatever.

Jose Portilla: Will it happen in my lifetime? I don't know. Hopefully

when I'm an old man on my deathbed, maybe it will become

more clear like, "Oh, yeah, in a couple years, we'll

[crosstalk 00:45:55]."

Kirill Eremenko: Oh, man, I think 100% in our lifetime. What's his

name? Ray Kurzweil. 2025 or 2029, that's the year.

And then 2050 is when AI becomes super intelligent,

like surpasses humans, and so on. Petitions for its

independent rights, and things like that. A classic

example is I think why we mistake it is because we're


used to linear thinking, and this stuff is happening

exponentially ...

Jose Portilla: That's exponential. That's true. Yeah.

Kirill Eremenko: A great question I ask people. How far do you think

you'll get from here where we're sitting in 30 linear

steps? You know that one?

Jose Portilla: Yeah. Versus like the [crosstalk 00:46:37].

Kirill Eremenko: 30 exponential steps. Ridiculous. Ridiculous. I have

the same feeling. As every year passes, my timeline

gets closer, so like my expectation for this.
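
The 30-steps comparison is easy to work out. The sketch below assumes a first step of one meter and that "exponential" means each step is twice as long as the previous one; both are assumptions for illustration, since the conversation does not spell them out.

```python
STEP_METERS = 1.0  # assumed length of the first step
STEPS = 30

# Linear: every step is the same length.
linear_distance = STEP_METERS * STEPS

# Exponential: each step is twice as long as the one before (1, 2, 4, ...).
exponential_distance = sum(STEP_METERS * 2 ** i for i in range(STEPS))

EARTH_CIRCUMFERENCE_M = 40_075_000  # approximate equatorial circumference

print(f"linear:      {linear_distance:,.0f} m")
print(f"exponential: {exponential_distance:,.0f} m")
print(f"trips around the Earth: {exponential_distance / EARTH_CIRCUMFERENCE_M:.1f}")
```

Thirty equal steps cover 30 meters, while thirty doubling steps cover a little over a billion meters, which is the gap between linear intuition and exponential growth that Kirill is pointing at.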

Jose Portilla: I think it may also be out of selfishness that I hope it

doesn't happen in my lifetime where it's like ...

Because then, I think something that's going to

happen is it's going to be a real question of what does

it mean to have consciousness? And what does it

mean to actually be human?

Jose Portilla: Because when it's replicated completely artificially, it's

going to be something that humans are going to have

to grapple with, and that's a very tough thing to think

about of, "Now, what does it mean to be human, have

a fulfilled life, have consciousness when this computer

has essentially all the same things?" Right?

Kirill Eremenko: Yeah. What's the difference? How can we discriminate

against them? And now all of a sudden, they're also

conscious.


Kirill Eremenko: Why do we consider ourselves better?


Jose Portilla: Exactly. Will they be second tier citizens when they're

actually smarter than us only because we created

them? Suddenly have some sort of power over them. Will

they live with us at the same level? This is a question

for someone much smarter than me to answer or think

about.

Kirill Eremenko: Some of the AI scientists or futurists think that our

generation and the next generation are the final

generations of humans who are here. I think it was

Elon Musk saying that we are biological ... What is it

called? What's the thing that starts a computer? Boot

sequence or something. Like pre-boot, or whatever.

Kirill Eremenko: I forgot the word, but basically like this. When the

computer starts, there's a part that has to go first and

then boot up the rest of the computer on the

motherboard. That's like a biological way to boot AI, to

get it started. And then as soon as it's started, we're no

longer needed. We were just a phase in evolution that,

"Okay, now we've created AI." Boom, the end. And then

from then on, this new species, artificial intelligence,

robots, whatever, are going to take over the planet and

so on.

Jose Portilla: Yeah. And it makes you think if any civilization across

the cosmos, if it's some sort of inevitable conclusion

that once some organic system evolves enough, they

create artificial intelligence as the next step.

Kirill Eremenko: That's interesting. Quite possibly. What's his name?

And the interesting thing about AI, from listening to a

podcast with Ben Goertzel recently, is that it won't be

individual like us. We are always individuals, and we try


to ... We strive for the sense of community. We want to

be on our phones all the time, Instagram. We want to

think as a collective mind, but it's hard for us, because

the way we do it is through phones and that's very

inefficient, very slow. Whereas AI is going to be hooked

up to the Internet.

Kirill Eremenko: So, it's not going to be individual AI. They're all going

to be in one big mega mind. It's like whenever you see

an AI, whether it's robots or a program whatever else,

they're all going to be thinking the same thing and

exchanging knowledge. And so therefore, for them, for

us, one day is going to be one day. For them, it's going

to be like one day is like 100,000 years in their

collective mind. And so they're going to evolve super

fast.

Jose Portilla: Yeah. We're so limited by our monkey brains of what

consciousness even means, right? When in reality,

once general AI is achieved, it's ... Maybe superior is

not the right word, but whatever their consciousness

is will be a higher level than what we are able to

achieve as some organic system. I mean, it'll almost be

like godlike.

Kirill Eremenko: That's the thing. There's a great article on Wait But

Why where he explains the ladder of consciousness. Try

to explain to an ant in ant language what a monkey is.

Like no freaking way.

Kirill Eremenko: Try to explain to a monkey in monkey language what a

human is or like what these moving things in the sky

are, which are airplanes. It's going to think of them as


stars. And same thing for us. Why do we think we're

the ultimate pinnacle of consciousness?

Kirill Eremenko: There is a level above us, which we'll never be able to

comprehend just simply because of the nature of how

our brains work and limitations. There's no way we

can ever understand. I think AI, I really think that it

will get to that level where it will be looking at us as

ants.

Jose Portilla: As ants. Yeah. No, for sure. I don't know. It's a

testament to our ignorance. When we think of AI, we

think of it as a copy of a human. What it really will be

like is that we created some superior God that will hopefully

be benevolent to us.

Kirill Eremenko: Benevolent. Yeah.

Jose Portilla: I'm very glad to be part of this generation though that

still doesn't have it. The questions that come up

when general AI does exist are things that I'm glad

that I don't have to think about.

Kirill Eremenko: A very interesting time to be alive.

Jose Portilla: For sure. Yeah.

Kirill Eremenko: All right. Well, Jose, thank you so much for coming on

the show.

Jose Portilla: Thank you for having me.

Kirill Eremenko: What a pleasure. Where can our students or listeners

find you, connect with you, take your courses, follow

your career?


Jose Portilla: So, probably the easiest way is just if you Google

search my name, Jose Portilla, the first thing that pops

up is probably my Udemy page. So, you can always

check that out, my profile page in Udemy for different

courses. You can feel free to connect with me on

LinkedIn. Again, that's probably like the second link

on Google.

Jose Portilla: Or you can check out pieriandata.com. That's our little

company for data science stuff, but just Google Jose

Portilla and I'm maybe too accessible. You can easily

contact me either on LinkedIn or messaging on

Udemy.

Kirill Eremenko: Fantastic. And we'll definitely include those links in

the show notes. Pierian Data, by the way, we didn't

talk about this, but I want to give a shout out that you

do corporate trainings. So, if anybody is interested in

corporate trainings, check out Pierian Data. I heard

fantastic things about you.

Jose Portilla: Thank you very much.

Kirill Eremenko: Definitely have a look at that. On that note, we

probably better get back to the conference and great

chatting in person. I'm glad we did this.

Jose Portilla: Yeah, likewise. We'll have to do this again.

Kirill Eremenko: There you have it ladies and gentlemen. That was Jose

Portilla. I really hope you enjoyed this conversation as

much as I did, and we're super grateful for you being

part of this episode.

Kirill Eremenko: My favorite part by far was the conversation about

neural networks creating neural networks. That indeed


could be the future that we're heading towards where

AI builds AI, which builds AI, which builds AI, and so

on. And then we will live in a world that we probably

wouldn't even recognize today.

Kirill Eremenko: As always, the show notes for this episode are

available at superdatascience.com/309. That's

superdatascience.com/309. There, you can find the

transcript for this episode, any materials, research

papers, images that we mentioned on this episode.

And, of course, the URLs for Jose's LinkedIn, website,

and Udemy profile where you can find all of his

courses.

Kirill Eremenko: I highly encourage everybody to check out Jose's

courses on Udemy. And if you or your company are

interested in in-person corporate trainings, Jose is

doing a great job in that space. You can find him at his

website, pieriandata.com.

Kirill Eremenko: On that note, if you enjoyed this episode, forward it to

somebody you know, somebody who's passionate

about data science, analytics, AI, machine learning,

somebody who's learning these things online, or

maybe somebody who's already following Jose, and

you know that they love him and would love to hear

from him on this podcast.

Kirill Eremenko: It's very easy to share the episode. Just send the link,

superdatascience.com/309. On that note, thank you

so much for being here today. I really appreciate your


time, and I look forward to seeing you back here next

time. Until then, happy analyzing.

