Computing the Future: Setting New Directions (Part 2)

Hello. Thank you for coming today. My name is Antonio Torralba. I'm a professor of electrical
engineering and computer science at MIT. And I'm also the director of
the MIT-IBM Watson AI Lab, which is an amazing example
of fruitful collaboration between academia and industry. And I'm also the director of
the MIT Quest for Intelligence. The Quest for Intelligence
is an MIT-wide effort trying to understand,
what is intelligence, and answer fundamental questions
about human and machine intelligence. My job now is to introduce
the next three speakers– Josh Tenenbaum,
Dimitris Bertsimas, and Aleksander Madry.
All of them are my colleagues, and they are very involved with
the Quest for Intelligence. And they are all superstars in
their own areas of research, working with amazing students. So please join me in
welcoming to the stage, Josh Tenenbaum. [APPLAUSE]

So I want to ask this
question– why do we have today all these AI
technologies, but no real AI? So what I mean is,
we have machines that do things we used to
think only humans could do, but we don't have anything like
the flexible, general-purpose intelligence that
each of you can use to do every one of
these things for yourself. So why not? What's the gap? Well, the neural networks
and deep learning driving today's AI are based
on simple equations that were derived in the
1970s and 1980s, to capture the most basic
animal learning processes. Think Pavlov's dogs,
or a rat in a maze. They're finding associations,
recognizing patterns, but not understanding. And that means they're
at most one small step towards real machine
intelligence. Real intelligence is not
just recognizing patterns, but modeling the world– explaining and
understanding what you see, imagining things that you
could see but haven't seen yet. Solving problems, making plans– to make those things real. And then building new models as
you learn more about the world. Now, the goal of
my work is to try to write down these
kinds of equations– to capture this kind
of human learning, and then to use this to build
more human-like forms of AI. And I'm starting with trying
to understand how children learn and think, because
they are the original model builders. And we are still far from having
an AI with the intelligence of even a 1 and 1/2 year old. But imagine if we
could get there. Imagine if we could
build a machine that grows into intelligence
the way a person does– that starts like a baby
and learns like a child. This may not be our
only route to real AI, but it could be our best bet. Because think about
it– a human child is the only known scaling
route in the universe. It's the only system we know
that reliably, demonstrably grows into human intelligence
starting from much less. And we know that even
small steps towards this could be big. So we're starting with the
most basic common sense, that's in every 1 and 1/2
year old, but no AI. The intuitive physics that you
see in a child like this one– stacking up blocks, playing
with toys. Or the intuitive psychology that
you see in a kid like this one here– another 1
and 1/2 year old– that lets them figure out what
somebody else is doing and why. To read their mind,
in a sense, even for a complex action like this,
that you've never seen before. Think about it. Watch this kid. These kids are just 1 and 1/2. If we could build robots with– [APPLAUSE] –this kind of intelligence,
with this skill and helpfulness around the house, with
this kind of common sense, that would be amazing. So to do this
we've had to invent new kinds of AI-programming
languages, known as probabilistic programs. These build on but go far
beyond the deep learning that's driving today's AI. And they allow us to bring
together the best insights from multiple eras
of the field– ideas that may not
have a simple home in today's neural networks– all into a unified framework. So this means, for
example, symbolic languages for representing and reasoning
with abstract knowledge, or probabilistic
inference for reasoning about cause and effect
from sparse uncertain data. Probabilistic
programs may be what you're going to be hearing
about in the next few years, if you haven't yet. Just to give an
example from research– just in the last
year, in 2018, we and many of our collaborators
have used these tools to give that kind of
intuitive physics to robots– to allow robots to
stack up blocks, and even play the game "Jenga." To be able to imagine how
to use new tools, even ones they've never seen before. To plan complex
actions, to make sushi– or at least the rice–
to pour ice water, and even to learn to walk. The next step, the next
challenge is model learning. How could a child, or a robot,
build an intuitive physics model for themselves? Learning these
probabilistic programs means that your learning
algorithm has to be a program learning program– an algorithm that
writes algorithms. It's the child as coder. This is a much harder
form of learning than in today's neural networks. But children do it,
so machines can too. Now, we've made a small
step towards this, with a system that can
learn new programs, that can capture simple visual
concepts– such as a new tool, or a new handwritten character. So look at these here. You can learn thousands of new
characters and new alphabets from just a single example each. You don't need 1,000 examples
to learn a single new concept, like today's
deep-learning systems. Now, our Bayesian
program-learning system can learn like you do. It uses probabilistic programs
to model the causal processes that put ink on the page– programs for writing,
and action, and drawing. And then it runs these
programs backwards, to learn the program
most likely to have produced the character you see. This lets us generalize new
concepts from a single example, and even pass a simple kind
of Turing test– like this. We can show a new concept to
both humans and our machines, and ask them to
draw new instances, to imagine new examples. Now, can you tell which
are the people's and which are the machine's drawings
before I show you? Try here– see if you can. My bet is that most
of you couldn't. So we're passing this
very simple Turing test. It's a small step,
but it does scale. This idea lets us learn programs
that can describe the shape of a chair, that can answer
questions about pictures, and that can even learn a new
video game 1,000 times faster than today's deep-reinforcement
learning systems– and almost as fast
as a person can. Will this be the idea that
finally delivers on AI's dream, to build machines that
learn like children? Probably not yet. But it may be the
next small step. It may be the next form
of deeper learning. So stay tuned. Looking ahead, once we've
reached this first moonshot stage in our program here– the
18-month-old stage of common sense– stage two is to learn
the most important thing that every child learns
between age 1 and 1/2 and 3, and that's language. And then stage three
is using language to learn everything else– to access the full
sweep of human knowledge that builds culturally
across generations, and across societies. And that puts you in a
position to contribute new knowledge yourself. If we could build
AI like this, this would be an AI that
lives in a human world– that humans could talk to,
teach, and trust the way we've always done
with each other– even with people we're just
meeting for the first time. This could be AI that
makes us actually truly smarter and better off. Thanks. [APPLAUSE]

So in life there are some
very significant moments– birth of a child,
getting married. So in the life of
institutions there aren't too many such moments. I have been
at MIT for 34 years, and I believe today is such a moment. So Steve and Rafael, thank you. So I would like to talk to
you about interpretable AI, and to motivate it, suppose you consider
a driverless car. And this driverless car
is involved in an accident with loss of life. Who is at fault, the driver,
the passenger, or the algorithm? And most importantly,
can society tolerate not being able to answer
this question? For a similar question, closer to us– let's suppose
you have a student who is not selected for freshman
admission, even though he might be
the valedictorian of his high school. Is it an adequate
response to say the algorithm
made the decision? I suspect not. So in my view, and perhaps
in the view of others, interpretability matters. However, existing
methods achieve either high-quality performance
or interpretability, but not both. For example, neural networks,
which you have heard lots about, and other methods with
exotic names, like random forests
and boosted methods, have high-quality performance
but low interpretability. More classical methods,
like regression, developed 200 years ago, and classification
and regression trees, have high
interpretability but not as high-quality performance. In my group, we aspire,
and to some degree we succeeded, in
developing new methods that have both characteristics–
both interpretability and high-quality performance. One of my heroes, Leo
Breiman, who passed away about 13 years ago, developed
one of the most interpretable methods– classification and
regression trees– CART for short. And he commented that on
interpretability, trees rate an A plus. However, on performance,
they were not as strong. So we developed in
my group some methods that are like trees, except
we use the tremendous progress
that my field, optimization, has made, which allows both
interpretability and performance. And in addition, it allows us to
partition the space, let's say in a classification
setting, into regions in which we can make
a decision– classify appropriately. Just to illustrate an example– this is actually an example
developed with Dana-Farber researchers, on predicting the
mortality for a patient taking chemotherapy. I have a personal connection
to this application. My father was diagnosed about 12
years ago with gastric cancer. And at the very
end, the doctors, with very good intentions,
were prescribing chemotherapy– even though it was very
clear, to me at least, that the end was very near. So using this new method,
we have developed algorithms that are state of the art
in terms of performance, that predict mortality. But most importantly,
they are interpretable. And at least in my
experience in the last decade or so with medical doctors,
interpretability is a must. That is, if you say to a
patient, stop chemotherapy, a natural question is,
why do we believe that? And if you follow
the algorithm, it says, because at least
a doctor understands certain enzymes are elevated. The change of the
weight is elevated. You can explain the reasons why. And here is a chart,
just to illustrate the combination of
interpretability and performance. So on the horizontal axis
is the depth of the tree– the number of questions we ask. And on the vertical axis is
the accuracy of the method. The blue– this is like the tree
we have seen for chemotherapy. It has about the same
performance– slightly better, in fact– than a state-of-the-art
method called random forest. And the green– this is a tree
similar to that, except that instead of asking one question,
we ask multiple questions at a time. It has similar performance
to another state-of-the-art method, boosted trees. So this enables the
application of these methods for assessing
mortality, morbidity, and many other
questions in medicine. This is an application we
developed with Mass General Hospital– currently in use at
the trauma department there– in which, for an incoming patient
in the emergency department, before any surgery, just from data, we make
predictions about mortality, specific morbidities,
or any morbidity, in ways that are understandable. They are delivered like the
application on the right. And as I participate in
discussions with doctors in the morning rounds,
I have seen that it has changed the dialogue. It has discovered things that
humans typically don't fully understand, or expect. Because many human
doctors have, let's say, thousands, maybe 2,000
experiences in their lives. This is based on about
a million experiences, from hospitals all
over the country. Clearly, the [INAUDIBLE]
data, in this case, beats the human experience. Of course, this applies
not only to surgery. In many other applications,
we have seen similar results– for cancer, cardiovascular
disease, type 2 diabetes, liver and kidney transplantations. In pharmaceutical research,
the clinical trials typically analyze the mean
of the treatment effect. However, subgroups might exhibit
a very different response. So these trees that we have
developed– optimal trees– we have applied them. And they produce very
interpretable subgroups. And in this case,
identifying subgroups of exceptional responders
could guide the design and inclusion criteria, could
find [INAUDIBLE] in failed trials, and identify opportunities
to relabel existing drugs. In summary, the message I
would like you to remember– and I believe other
speakers have said it too– is that interpretability matters. And relative to our discussions
on ethics, I would say, interpretability helps
the ethics discussion, because at least we understand
what the algorithm is doing. It's not just a black box. On that note, thank you. [APPLAUSE]

Welcome, everyone. I am Aleksander Madry.
And I will talk about AI– surprise, surprise. But, actually, what I want to
talk about is the aspect of AI that I think we need to
get right, to make sure this is actually a
successful technology. So I guess it's fair to
say by now that we all are very excited about what
AI seems to be able to do. Like, things that we thought
were impossible to do merely 10 years ago now seem to be
completely within our grasp. And this is exciting. So we are thinking
of all the ways in which AI can change the way
we work, the way we travel, the way we play. So I think we can just say,
we are all ready for AI. We are all excited about this. I am excited. But as excited as I am, I
can't help but also be worried. Because I think that there
is a question that we should be asking ourselves– a
question that actually drives most of my
recent research. And this question is, even as
much as we are ready for AI, is AI ready for us? What do I mean by that? Let me demonstrate. So what you see on the
right is a beautiful pig. To us humans, it is a pig. And a state-of-the-art
classifier will view it as a pig as well. So far so good. What's interesting here? Well, it gets interesting
if I add to this image a little bit of a
carefully chosen noise. What I will get is the
picture on the right– which again, to us looks
like a beautiful piggy. However, to the state-of-the-art
classifier this is actually an airplane. So there are two lessons here. The first lesson is,
if you ever doubted, AI is a magical technology. It can make pigs fly. [LAUGHTER] That's important. The second thing
is, that is not what you would expect to happen. So what's going on? Well, your first reaction
might be, is this for real? Because, indeed, I told you
that the noise I needed to add has to be very carefully chosen. And maybe in the
real world, I can't have fine-grained enough control
over what the machine sees to really trigger it. And that's actually what
many people initially thought, until a bunch of MIT
undergrads proved them wrong. So what they did– they
3D printed a turtle that looks like a turtle to us. But, essentially, it
is classified as a rifle– from all angles and zooms. So if you ever doubted it,
this so-called adversarial perturbation is a real thing. So, well, should we
worry about that? And the answer is, yes. And, why? Well, many reasons. The first and the most obvious
one is the security context. If I can make your system see
something different than I see, that's how many
security breaches start. So one of the promising
uses of this technology could be in facial recognition. And here, we see a
state-of-the-art facial recognition system, that
recognizes correctly the person in the picture– despite having
these funny glasses. However, when this person
puts on these even funnier
glasses, something magical happens. The system believes this is a
completely different person. Think of the implications. But yeah, this is security. But what about safety? Sometimes we don't
really believe there are some bad guys
who want to get us. But still there are systems
like that, in which we really want to be sure that there
are no inputs that cause some undesirable behavior. And true enough, this system
has a very undesirable behavior, if you just perturb the picture
that you have in the right way. So this is about
security and safety. This is about getting the
decision right. But sometimes it's
not only about getting the correct decision. It's also about
getting the correct decision for the right reasons. So for instance, one of the
promising applications of AI was to use it in hiring– to make decisions about who
to hire and who not to hire. And the idea was
that, well, this would be purely data driven. So the outcomes will clearly
be impartial and optimal. So that was the dream. What happened in reality? Well, what happened in reality
is that using these solutions actually reinforced all the
biases and all the inequities that we wanted to avoid
in the first place. And my most amusing example
here is a resume screening tool that came to a conclusion that
the two most important factors predicting job performance
are being named Jared and playing lacrosse
in high school. [LAUGHTER] Yeah, so things are not great. And you know, definitely you
should be careful about that. But one more aspect
I want to talk about is about, who will this
kind of AI benefit– at least the AI technology
that we have now? So think about–
what do we need now to deploy an AI solution? We need a lot of data. We need a lot of compute. Also, we need an education. That is actually
quite hard to get. So MIT and other universities
are trying to do what we can. But still there are
not that many people who have the right
training to use [INAUDIBLE]. So is this something,
really, that an average man can use, understand, and apply? I don't think so. So I guess the conclusion here
is that the AI technology we have now has to change. It has to evolve. So on one hand, it
has to be secure. It should be tamper resistant. It should not be something
that is the weakest link in our current system
from a security point of view. It should be reliable, so
we can actually confidently deploy it in contexts
where human lives matter. It also should be equitable. It should be mindful
of the societal impact of the decision it
makes, and make sure that it's consistent
with our values. Finally, it needs
to be accessible. So, essentially, even people
without the specific training and these hard-to-get
resources can actually fully take advantage of this. So the way I like
to phrase it is that essentially what we
need is we need AI 2.0. Which means we need to look at
the AI 1.0 that we have now– the proof-of-concept AI– and think about all the ways it falls short– all these aspects here– and
figure out how to fix them. And only then can AI
2.0 actually be deployed. So essentially trying to
come up with this AI 2.0 is what my group spends
most of its time on. And there are two lessons
that we have so far. So one lesson is that
actually many people think of getting one of
these goals in isolation. But actually it turns out that
if you elect to go for all four of them, you actually have some
very nice synergies that you can, and should,
take advantage of. The second lesson
is that, actually, even if your goal is just
to achieve these four properties, you might end
up with models that are also better, from the point of view
of more traditional properties. In particular, the models
that you train to be robust actually also
turn out to come up
with better representations of the data. So let me demonstrate. So what you see on the left
is just a picture of a dog. That is correctly
classified as a dog. So far, so good. But it's sometimes useful
to look under the hood, and to see why the model
decided to call it a dog? And, essentially, one
natural way to do it is to create a heat map–
where you look at every pixel, and you see how much influence
that particular pixel had on the final decision. So if you do it to
a standard model– the AI 1.0 model– you will get a heat
map like this, which is only mildly informative. However, if you do
it to a robust model, you will get things like that. Where, essentially,
suddenly, what was driving the
decision of the model seems to be much more compatible
with what we humans would think as important. So the goal is AI 2.0. And this is, of
course, a quest that goes beyond any
single research group. You need to get expertise from
many different domains that also transcend CS. So in particular, areas
like [INAUDIBLE] computing, I think, are greatly poised to
tackle exactly this challenge. So there are a great many
of my colleagues here that think about these
questions and share this vision. These are just some of them
that I've talked to so far, but there are many, many more. But this is more than just
about faculty thinking about this stuff. I think the
most important mission here is to educate the next
generation of engineers and researchers, that
understand these issues and know how to deal with them. And that's what we are
doing at MIT as well. This is my group. And I believe that
in my group there will be many of the future
leaders in this field. But there will be
many, many more. And, essentially,
the goal here is to get AI that actually is ready
for real-world deployment– the AI that's
actually human ready. Thank you. [APPLAUSE]
