intro to machine learning: thunderplains 2016

Upload: frank-evans

Post on 16-Apr-2017

Category:

Data & Analytics


TRANSCRIPT

Machine Learning

Frank D. Evans

Thunder Plains 2016

A Data Scientist is a data analyst that lives in California

A Data Scientist is a better mathematician than any of the programmers, and a better programmer than any of the mathematicians.

OSEMN

Obtain

Scrub

Explore

Model

Interpret

TOOLS

Data is huge. People are expensive. Computation is cheap.

Use applied statistics to let the computers program themselves.

Types

Supervised: "I have a set of examples with the right answers, I want to learn a pattern and use it on examples where I don't have the answers."

Unsupervised: "I have data with no answers, but I want to find a pattern that might lead me to an answer."

Reinforcement: "I want to start with what I know now, and be able to learn new things as new data comes along."

Regression vs Classification

Supervised Learning

Regression: Use continuous data to make a model that predicts where new data will fit.

Classification: Label data into "buckets", and make predictions on which bucket a new data point will fall into.
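The two flavors of supervised learning can be sketched side by side. The snippet below is only an illustration, not something shown in the talk: it assumes a least-squares line for regression and a nearest-centroid rule for classification, and the helper names (`fit_line`, `fit_centroids`, `classify`) and the toy data are made up for this example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Unnormalized covariance of x and y, and variance of x.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, y_bar - slope * x_bar


def fit_centroids(points, labels):
    """Classification: average the points of each label into one centroid."""
    grouped = {}
    for point, label in zip(points, labels):
        grouped.setdefault(label, []).append(point)
    return {
        label: tuple(mean(axis) for axis in zip(*members))
        for label, members in grouped.items()
    }


def classify(centroids, point):
    """Return the label (bucket) whose centroid is closest to the point."""
    def squared_distance(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], point))


# Regression: continuous answers in, a continuous prediction out.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted_y = slope * 5 + intercept

# Classification: labeled examples in, a bucket for a new point out.
centroids = fit_centroids(
    [(1, 1), (2, 1), (8, 9), (9, 8)],
    ["small", "small", "large", "large"],
)
predicted_bucket = classify(centroids, (7, 7))
```

Both halves follow the same supervised pattern: learn from examples that carry the right answer, then apply what was learned to a point that does not. Only the kind of answer differs, a number in one case and a bucket in the other.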

Unsupervised Learning
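One common way to picture unsupervised learning is clustering: no answers are given, and the algorithm groups the data by itself. The sketch below assumes a one-dimensional k-means with hand-picked starting centers. The technique, the function names, and the data are chosen here for illustration and do not come from the slides.

```python
from statistics import mean


def nearest_center(value, centers):
    """Index of the center closest to the value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def k_means_1d(values, initial_centers, max_iterations=10):
    """Alternate between assigning values to centers and re-averaging centers."""
    centers = list(initial_centers)
    for _ in range(max_iterations):
        clusters = [[] for _ in centers]
        for value in values:
            clusters[nearest_center(value, centers)].append(value)
        # An empty cluster keeps its previous center.
        updated = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break  # assignments stopped changing
        centers = updated
    return centers


values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
# Assumption: start from the smallest and largest value.
centers = k_means_1d(values, [min(values), max(values)])
assignments = [nearest_center(v, centers) for v in values]
```

The data never says which group a value belongs to. The two groups emerge from the values alone, which is the "find a pattern that might lead me to an answer" idea.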

Reinforcement Learning

Deep Learning

generalization, not memorization

Overfitting
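A small experiment can make "generalization, not memorization" concrete. The sketch below is an illustration under stated assumptions, not material from the talk: the true relationship is taken to be y = 2x, the training answers carry a fixed alternating noise of +1 and -1, and the held-out answers are noise-free for simplicity. It compares a model that memorizes the training set (a nearest-neighbor lookup) with a least-squares line. All names are hypothetical.

```python
from statistics import mean


def make_memorizer(xs, ys):
    """Memorization: answer with the stored y of the closest training x."""
    table = list(zip(xs, ys))

    def predict(x):
        _, nearest_y = min(table, key=lambda pair: abs(pair[0] - x))
        return nearest_y

    return predict


def make_line(xs, ys):
    """Generalization: least-squares line through the training data."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mean_squared_error(model, xs, ys):
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


# Training data: y = 2x plus alternating noise of +1 / -1 (assumed).
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 1, 5, 5, 9, 9]
# Held-out data: points the models never saw, on the true line y = 2x.
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x for x in test_x]

memorizer = make_memorizer(train_x, train_y)
line = make_line(train_x, train_y)

memorizer_train = mean_squared_error(memorizer, train_x, train_y)
memorizer_test = mean_squared_error(memorizer, test_x, test_y)
line_train = mean_squared_error(line, train_x, train_y)
line_test = mean_squared_error(line, test_x, test_y)
```

The memorizer scores perfectly on the data it has seen and does clearly worse on new points, because it has learned the noise along with the signal. The line is worse on the training set but better on the held-out set, which is the behavior one wants from a model.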

Domains

Tabular

Text

Graph

Viz

You’re not trying to learn about the data, you’re trying to use the data to learn about the world.

exaptive.com/blog

Frank D. Evans

@frankdevans

@exaptive

slideshare.net/frankdevans