Data Mining
Relation to course: Data Mining (Chap 28)
G15: Dillon Littlefield & Nathan Moeller
Zimmer, Carl. "Bacterial Ecosystems Divide People into 3 Groups, Scientists Say." The New York Times. The New York Times, 20 Apr. 2011. Web. 10 Apr. 2015.

Upload: douglas-mccarthy

Post on 23-Dec-2015




Classification vs Clustering

Criteria                    Classification                           Clustering
Prior knowledge of classes  Yes                                      No
Use case                    Classify new sample into known classes   Suggest groups based on patterns in data
Algorithms                  Decision trees, Bayesian classifiers     K-means, expectation maximization
Data needs                  Labeled samples from a set of classes    Unlabeled samples
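
To make the contrast in the table concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method used in the article: the data points and the "low"/"high" labels are invented, the classifier is a simple nearest-centroid rule (swapped in for brevity instead of the decision trees or Bayesian classifiers named in the table), and the k-means initialization is deliberately simplified.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

# Classification: the classes are known in advance and samples are labeled.
def fit_centroids(samples, labels):
    """Learn one centroid per known class from labeled samples."""
    by_class = {}
    for s, lab in zip(samples, labels):
        by_class.setdefault(lab, []).append(s)
    return {lab: mean(pts) for lab, pts in by_class.items()}

def classify(centroids, sample):
    """Assign a new sample to the known class with the nearest centroid."""
    return min(centroids, key=lambda lab: dist2(centroids[lab], sample))

# Clustering: no labels are given; k-means suggests groups from the data.
def kmeans(samples, k, iters=20):
    # Simplification: start from evenly spaced samples instead of the
    # random or k-means++ initialization a real implementation would use.
    centers = [samples[i * len(samples) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for s in samples:
            nearest = min(range(k), key=lambda j: dist2(centers[j], s))
            groups[nearest].append(s)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    assignment = [min(range(k), key=lambda j: dist2(centers[j], s))
                  for s in samples]
    return centers, assignment

# Invented 2-D measurements, purely for illustration.
samples = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

centroids = fit_centroids(samples, labels)    # needs the labels
predicted = classify(centroids, (4.5, 4.7))   # picks one of the known classes
centers, assignment = kmeans(samples, k=2)    # never sees the labels
```

The classifier can only answer with a class it was trained on, while k-means returns anonymous group indices whose meaning has to be interpreted afterwards.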


Bacterial Ecosystems

• Blood types fall into classes A, B, AB, and O. What about gut ecosystems?

• Each gut has a unique population of microbes.

• Research suggests there may be three distinct types of microbiomes, called enterotypes.


Medical Applications

• Tailor diets to specific enterotypes

• Tailor drug prescriptions to enterotypes

• Alternative to antibiotics: restore beneficial bacteria to the gut


Challenge

• Each person hosts roughly 100 trillion microbes

• Each enterotype is a balance of many bacterial species

• Debate not settled: UMN professor Dan Knights argues for a continuum rather than distinct types


Classification vs Clustering

Categorize each question as a classification or clustering problem:

• What is the blood type of the patient?

• Based on gut-bacteria ecosystems, do humans fall into a small number of distinct groups?

• How many natural groups do humans fall into based on their gut-bacteria ecosystem?
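
The "how many groups" question has no labels to learn from, so one common heuristic is to run a grouping algorithm for several candidate counts and compare a quality score. The sketch below does this with a tiny 1-D k-means and the average silhouette score. Everything in it is an assumption made for illustration: the values are synthetic (not microbiome data), the function names are hypothetical, and the silhouette score is just one of several possible criteria, not necessarily the one used in the research discussed here.

```python
def kmeans_1d(values, k, iters=50):
    """Tiny 1-D k-means returning a group index for each value.

    Simplification: centers start at evenly spaced sorted values.
    """
    ordered = sorted(values)
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)]
               for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]

def silhouette(values, assignment):
    """Average silhouette score; higher means tighter, better-separated groups."""
    clusters = {}
    for v, lab in zip(values, assignment):
        clusters.setdefault(lab, []).append(v)
    if len(clusters) < 2:
        return 0.0  # the score is undefined for a single group
    total = 0.0
    for v, lab in zip(values, assignment):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # Mean distance to the other members of the same group.
        a_i = sum(abs(v - u) for u in own) / (len(own) - 1)
        # Mean distance to the closest other group.
        b_i = min(sum(abs(v - u) for u in other) / len(other)
                  for other_lab, other in clusters.items()
                  if other_lab != lab)
        total += (b_i - a_i) / max(a_i, b_i)
    return total / len(values)

# Synthetic 1-D measurements built to contain three loose groups.
values = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8, 9.0, 9.1, 8.7]

scores = {k: silhouette(values, kmeans_1d(values, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
```

A score that peaks sharply at one count supports distinct groups, whereas scores that stay flat or low across counts are more consistent with a continuum.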