data-starved artificial intelligence



Page 1: Data-Starved Artificial Intelligence

Data-Starved Artificial Intelligence

Dr. Sanjeev Mohindra

MIT Lincoln Laboratory

5 March 2018

This material is based upon work supported by the Assistant Secretary of Defense for Research and Engineering under Air Force Contract No. FA8721-05-C-0002 and/or FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Assistant Secretary of Defense for Research and Engineering.

Distribution Statement A: Approved for public release: distribution unlimited.

© 2018 Massachusetts Institute of Technology.

Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

Page 2: Data-Starved Artificial Intelligence

Data-Starved AI - 2 SM 03/05/18

Examples of Artificial Intelligence Applications

2014

• Intelligent assistant capable of voice interaction

• Speech recognition is performed with deep neural networks trained on large datasets

2016

• Defeated top-ranked Go players

• AlphaGo’s supervised learning drew on 160,000 games containing 29.4 million positions. It then played itself millions of times to get better and better

2017 (Waymo)

• Testing autonomous cars without a driver

• Scene understanding is powered by deep neural networks learning on 2.5 million real-world miles and 1 billion virtual miles in 2016

Page 3: Data-Starved Artificial Intelligence

What Makes AlphaGo Go?

Computing Power

• Distributed version of AlphaGo used 40 search threads running on 1202 CPUs and 176 GPUs

– Google Tensor Processing Unit (TPU) used when playing Lee Sedol

Access to Data

• AlphaGo’s supervised learning drew on 160,000 games (played by 6–9 dan players) containing 29.4 million positions

• It then played itself millions of times to get better and better

Algorithm Advances

• Two deep neural networks

– Value: 13 layers, Policy: 15 layers

• Monte-Carlo tree search provided the means to heuristically prune the huge move space

Availability of data and advances in computing hardware and algorithms have led to machines approaching or exceeding human performance in some domains
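As a rough illustration of how Monte-Carlo tree search concentrates effort on promising moves instead of expanding the whole move space, the sketch below shows a UCB1-style selection step. This is a generic textbook rule, not AlphaGo's actual selection formula (which also uses priors from the policy network). The function names, the `(wins, visits)` statistics layout, and the exploration constant `c` are assumptions made for this example.

```python
import math


def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """Score one move: average reward plus an exploration bonus.

    Unvisited moves get an infinite score so each is tried at least once.
    """
    if visits == 0:
        return math.inf
    exploit = wins / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_move(stats, c=math.sqrt(2)):
    """Pick the move with the highest UCB1 score.

    `stats` maps a move name to a (wins, visits) pair (hypothetical layout).
    """
    parent_visits = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda m: ucb1(*stats[m], parent_visits, c))


# Hypothetical statistics gathered from earlier simulated games.
stats = {"a": (60, 100), "b": (5, 10), "c": (1, 50)}
print(select_move(stats))         # balances win rate against low visit count
print(select_move(stats, c=0.0))  # pure exploitation: highest win rate only
```

Moves that keep losing collect few further visits, which is the sense in which the search prunes heuristically: nothing is removed outright, but the simulation budget shifts toward moves that look good or are still under-explored.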

Page 4: Data-Starved Artificial Intelligence

Applying AI to National Security

Commercial Space is Data Rich

• Data is easy to collect

• Labels are free or crowdsourced

• Rich datasets like ImageNet, COCO, and others

[Figure: learning curve of capability versus amount of labeled data, with labels "Deep Learning Breakthroughs" and "Human-Level Performance"]

DoD – Department of Defense; IC – Intelligence Community

Page 5: Data-Starved Artificial Intelligence

Applying AI to National Security

DoD Problem Space is Data-Starved

• Data has not been labeled

• Data is difficult to collect because content of interest is rare or the adversary makes it hard

[Figure: learning curve of capability versus amount of labeled data, with labels "Deep Learning Breakthroughs", "Human-Level Performance", "Commercial AI Applications", and "DoD/IC Problem Space". Inset axis: number of examples, tick at 10^4]

DoD – Department of Defense; IC – Intelligence Community

Page 6: Data-Starved Artificial Intelligence

Data-Starved AI Challenges

National security interest is often in the tail of the distribution

Not Enough Labeled Data

[Figure: number of examples, axis tick at 10^4]

Not Enough Data

[Figure: number of examples versus objects / events of interest]

DIUx Challenge Dataset: Xviewdataset.org

Page 7: Data-Starved Artificial Intelligence

Applying AI to National Security

More sophisticated algorithms are needed in a data-starved environment

[Figure: chart with a data axis (More to Less) and an algorithms axis (Simpler to More Sophisticated). Domain labels: “Big-data” Domains, “Labeled” Data Domains*, “Low-resource” Domains**. Other labels: Strong Commercial Leverage, Recent Commercial / Academic Progress, Strong National Security Pull, Physics-Based AI, Generative / Model-Based AI, Simulation Capability]

Data Rich

• Data is easy to collect

• Labels are free or crowdsourced

Data-Starved

• Insufficient labeled data

Data-Starved

• Data is difficult to collect

• Content of interest is rare

Example Research Thrusts

1. Develop gold-standard datasets

2. Efficient data labeling at scale

3. Develop algorithms that require less training data

4. Pursue cognitive science research to inform machine learning

5. Hybrid learning that merges deep learning with model-based learning

* Vehicle detection in low-res FMV: an example of AI applied to a data-rich military domain

** Identification of camouflaged military targets: an example of a low-resourced and adversary-countered AI task

Page 8: Data-Starved Artificial Intelligence

Data-Starved AI Session Talks

• AI to Aid Rapid Response to Cyber Attacks

• AI for Imagery Analysis in Low Resource Domains

• Probabilistic Computing for Data-Starved AI

[Figure labels: Computer Vision, Inferencing, Cyber Warrior]

[Figure: CHARIOT, Detecting Online Cyber Discussions. Labels: "TF-IDF Features", "Logistic Regression Classifier", "1% Cyber", "80% Cyber". Plot of miss probability (%), 10 to 100, versus false alarm probability (%), 0.01 to 10]
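The CHARIOT figure on this slide names TF-IDF features, a logistic regression classifier, and a plot of miss probability against false alarm probability. A minimal sketch of that kind of pipeline follows, using only the Python standard library. It is not CHARIOT's implementation: the toy documents, the smoothed IDF formula, the learning rate, the epoch count, and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def fit_idf(docs):
    """Smoothed inverse document frequency for every word in the corpus."""
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc.split()))
    return {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}


def tfidf(doc, idf):
    """Sparse TF-IDF vector; words never seen during fitting are ignored."""
    words = doc.split()
    counts = Counter(w for w in words if w in idf)
    return {w: (c / len(words)) * idf[w] for w, c in counts.items()}


def predict(x, weights, bias):
    """Probability that a document is cyber-related."""
    z = bias + sum(weights.get(w, 0.0) * v for w, v in x.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=500, lr=1.0):
    """Logistic regression fitted by batch gradient descent."""
    weights, bias, n = {}, 0.0, len(features)
    for _ in range(epochs):
        grad_w, grad_b = Counter(), 0.0
        for x, y in zip(features, labels):
            err = predict(x, weights, bias) - y
            grad_b += err
            for w, v in x.items():
                grad_w[w] += err * v
        bias -= lr * grad_b / n
        for w, g in grad_w.items():
            weights[w] = weights.get(w, 0.0) - lr * g / n
    return weights, bias


def miss_and_false_alarm(scores, labels, threshold):
    """Miss rate over positives and false alarm rate over negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    miss = sum(s < threshold for s in pos) / len(pos)
    false_alarm = sum(s >= threshold for s in neg) / len(neg)
    return miss, false_alarm


# Toy corpus (invented for this example): 1 = cyber, 0 = not cyber.
docs = [
    "malware exploit found in server",
    "new exploit targets router firmware",
    "phishing malware steals password",
    "recipe for chicken soup",
    "football match ends in draw",
    "soup and bread for dinner",
]
labels = [1, 1, 1, 0, 0, 0]

idf = fit_idf(docs)
weights, bias = train([tfidf(d, idf) for d in docs], labels)
scores = [predict(tfidf(d, idf), weights, bias) for d in docs]
print(miss_and_false_alarm(scores, labels, threshold=0.5))
```

Sweeping `threshold` from 0 to 1 and plotting the two rates against each other gives the kind of miss versus false alarm trade-off curve shown on the slide; on real data the curve would be measured on held-out documents rather than the training set.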

[Figure: Active Learning Cycle for object detection. Labels: Labeled Data, Unlabeled Data, Model Trained with Labeled Data, Subset Prioritized by Uncertainty, Analyst Labels Subset]
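The active learning cycle named on this slide (train a model on labeled data, prioritize a subset of unlabeled data by uncertainty, have an analyst label that subset, repeat) can be sketched with a deliberately simple model. The example below is an assumption-laden toy, not the method from the talk: the data are one-dimensional, the "model" is a single threshold, the analyst is simulated by a hypothetical `oracle` function, and uncertainty is taken to be closeness to the decision threshold.

```python
def train_threshold(labeled):
    """Fit a 1-D classifier: the midpoint between the two classes.

    `labeled` is a list of (x, y) pairs and must contain both classes.
    """
    highest_neg = max(x for x, y in labeled if y == 0)
    lowest_pos = min(x for x, y in labeled if y == 1)
    return (highest_neg + lowest_pos) / 2.0


def most_uncertain(pool, threshold, k):
    """Subset prioritized by uncertainty: the k points nearest the threshold."""
    return sorted(pool, key=lambda x: abs(x - threshold))[:k]


def active_learning(pool, seed_labeled, oracle, rounds=6, k=1):
    """Run the cycle: train, prioritize, ask the analyst, repeat."""
    labeled = list(seed_labeled)
    already = {x for x, _ in labeled}
    pool = [x for x in pool if x not in already]
    for _ in range(rounds):
        if not pool:
            break
        threshold = train_threshold(labeled)       # model trained with labeled data
        for x in most_uncertain(pool, threshold, k):
            labeled.append((x, oracle(x)))         # analyst labels the subset
            pool.remove(x)
    return train_threshold(labeled), labeled


def oracle(x):
    """Hypothetical analyst: the true class boundary sits just above 0.62."""
    return 1 if x > 0.62 else 0


pool = [i / 100 for i in range(101)]               # 101 unlabeled points
seed = [(0.0, 0), (1.0, 1)]                        # one labeled example per class
threshold, labeled = active_learning(pool, seed, oracle)
print(threshold, len(labeled))
```

In this toy setting the learned threshold closes in on the true boundary after the analyst labels only a handful of the 101 points, because each query is spent where the current model is least sure. Labeling points at random would typically need many more labels for the same accuracy.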

Page 9: Data-Starved Artificial Intelligence

Data-Starved AI Session Posters

Computer Vision in Low Resource Environments

Teaming with the AI Cyber Warrior

Interpretable Machine Learning

Threat Network Detection: Countering Weaponization of Social Media

Mr. David Mascharka, MIT Lincoln Laboratory

Dr. William Streilein, MIT Lincoln Laboratory

Dr. Jonathan Su, MIT Lincoln Laboratory

Dr. Olga Simek, MIT Lincoln Laboratory

[Figure label: Estimation Problem: Influence]

Page 10: Data-Starved Artificial Intelligence

Keynote: Prof. Antonio Torralba

Professor

Dept. of Electrical Engineering and Computer Science

Massachusetts Institute of Technology

CSAIL

Research Interests

• Building systems that can perceive the world like humans do. A system able to perceive the world through multiple senses might be able to learn without requiring massive curated datasets.

MIT-IBM Watson Lab

• The Lab is focused on advancing four research pillars: AI Algorithms, the Physics of AI, the Application of AI to industries, and Advancing shared prosperity through AI

Page 11: Data-Starved Artificial Intelligence

Summary

• Recent advances in hardware, algorithms, and the availability of large training data have led to machines approaching or exceeding human performance in some domains

• Challenge in applying AI for national security: how do we gain understanding of the world to enable time-critical decisions in an environment that is adversarial and data-starved?

• Advances in data-starved AI are needed to meet national needs

– MIT Lincoln Laboratory is actively working in this area

– Looking forward to collaborating with you to improve the state of the art