CSC411 - Machine Learning and Data Mining
Tutorial 10 – March 23rd, 2007
University of Toronto (Mississauga Campus)
Data Mining and Machine Learning Applications
Case 1: To improve its business, a national supermarket chain starts a project to keep track of its customers. Regular customers can collect points or receive discounts by using their store card on each purchase. Temporary customers, who are not store members, are all assigned the same temporary store card. The supermarket is now hiring a data mining analyst to help with this project.
Question: If you are the data mining analyst, how will you design the project, and what data will you need for it?
Case 2: Researchers have found that individuals respond differently to the same drug treatment or exposure. For example, two smokers have the same smoking history, yet one is diagnosed with lung cancer and the other is not. Single Nucleotide Polymorphisms (SNPs) are an important resource for explaining these phenomena. One possible project is to study the association between the SNPs and the DNA sequences.
Question: If you are the researcher, how will you design this project?
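One possible starting point for Case 2, offered only as a minimal sketch: tabulate whether each smoker carries a given SNP variant against whether they developed lung cancer, then compute an odds ratio. The slides do not prescribe this method. The function names `contingency` and `odds_ratio` are hypothetical, and the counts are invented purely to show the arithmetic.

```python
def contingency(records):
    """Count (carries_variant, has_cancer) pairs into a 2x2 table."""
    table = {(True, True): 0, (True, False): 0,
             (False, True): 0, (False, False): 0}
    for carries_variant, has_cancer in records:
        table[(carries_variant, has_cancer)] += 1
    return table


def odds_ratio(table):
    """Odds of cancer among carriers divided by odds among non-carriers."""
    a = table[(True, True)]    # carriers with cancer
    b = table[(True, False)]   # carriers without cancer
    c = table[(False, True)]   # non-carriers with cancer
    d = table[(False, False)]  # non-carriers without cancer
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Invented toy data (NOT real results): 100 smokers with the same history.
records = ([(True, True)] * 30 + [(True, False)] * 10 +
           [(False, True)] * 20 + [(False, False)] * 40)

table = contingency(records)
ratio = odds_ratio(table)
print(table)
print("odds ratio:", ratio)
```

An odds ratio well above 1 would suggest, in this toy setting, that carriers are over-represented among the cancer cases. A real study would also need a significance test and correction for the many SNPs examined.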
Cancer – Different Fates
This slide is copied from the National Cancer Institute, Understanding Cancer Series: Genetic Variation (SNPs): http://www.nci.nih.gov/cancertopics/understandingcancer/geneticvariation
SNPs May Be the Solution
[Figure: four panels labeled SNPs A, SNPs B, SNPs C, SNPs D]
What Is Variation in the Genome?
Common sequence variations: polymorphisms, deletions, translocations, insertions
[Figure label: Chromosome]
SNPs Are the Most Common Type of Variation
Common sequence: most of the population
Variant sequence: at least 1 percent of the population
SNP site: G to C
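The idea on this slide can be illustrated with a short sketch: compare an aligned variant sequence against the common sequence to locate the SNP site, then check whether the variant base reaches the 1 percent frequency mentioned above. The sequences, the population sample, and the helper names `snp_sites` and `variant_frequency` are all hypothetical, chosen only to reproduce a G to C substitution.

```python
def snp_sites(common, variant):
    """Return (position, common_base, variant_base) for each single-base difference.

    Assumes the two sequences are already aligned and of equal length.
    """
    if len(common) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, c, v)
            for i, (c, v) in enumerate(zip(common, variant))
            if c != v]


def variant_frequency(observed_bases, variant_base):
    """Fraction of sampled individuals carrying variant_base at one site."""
    return observed_bases.count(variant_base) / len(observed_bases)


# Made-up sequences: the variant differs from the common one by G to C.
common = "AAGGCTAA"
variant = "AAGCCTAA"
sites = snp_sites(common, variant)

# Made-up sample of 100 individuals at that site: 97 carry G, 3 carry C.
population = ["G"] * 97 + ["C"] * 3
freq = variant_frequency(population, "C")
qualifies = freq >= 0.01  # the "at least 1 percent" criterion

print(sites)
print(freq, qualifies)
```

Positions are zero-based here, so the G to C substitution is reported at index 3.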
The Genome Contains Genes
[Figure labels: Gene 1 (coding region, Protein 1); Gene 2 (coding region, Protein 2); two noncoding regions]
Variation in the Human Genome
Person 1 Person 2
= Variations in DNA
Variations Causing No Changes
= Variations in DNA that cause no changes
Variations Causing Harmless Changes
= Variations in DNA that cause harmless changes
Variations Causing Harmful Changes
= Variation in DNA that causes harmful change
[Figure labels: No Disease, No Disease, Hemophilia]
Variations Causing Latent Changes
Many years later
= Variations in DNA that cause latent effects