885 fall 2009
Slides on data mining
Table: People by Items (cells marked 1)
Items: Beer, Cheese, Diapers, Eggs
People (number of cells marked 1): Dan (2), Kathy (2), Chuck (3), Bob (2)
Copyright 2009, The Ohio State University
Step 2B: Algorithm Selection
Data-Oriented: Boolean vs. quantitative associations
  Association on discrete vs. continuous data
Result-Oriented: Single-level vs. multiple-level analysis
  E.g., [Coors, Huggies] or [Beer, Diapers]
Result-Oriented: Simple vs. constraint-based
  E.g., do small sales (sum < 100) trigger big buys (sum > 1,000)?
Performance-Oriented Selection
  Scalable parallel and sequential algorithms
  Sampling-based methods for fast approximate results
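The single-level vs. multiple-level distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the baskets, the brand-to-category taxonomy, and the helper names (`support`, `confidence`, `generalize`) are all hypothetical and do not come from the slides. The sketch measures a Boolean association at the brand level ([Coors, Huggies]) and again after rolling the same baskets up to the category level ([Beer, Diapers]).

```python
# Hypothetical baskets at the brand level (illustrative data, not from the slides).
baskets = [
    {"Coors", "Huggies"},
    {"Budweiser", "Pampers"},
    {"Coors", "Cheddar"},
    {"Huggies", "Eggs"},
]

# Hypothetical taxonomy mapping each brand-level item to a higher-level category.
taxonomy = {
    "Coors": "Beer",
    "Budweiser": "Beer",
    "Huggies": "Diapers",
    "Pampers": "Diapers",
    "Cheddar": "Cheese",
    "Eggs": "Eggs",
}


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) over the transactions."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)


def generalize(transactions, mapping):
    """Roll each transaction up one level of the taxonomy."""
    return [{mapping[item] for item in t} for t in transactions]


category_baskets = generalize(baskets, taxonomy)

brand_support = support({"Coors", "Huggies"}, baskets)
category_support = support({"Beer", "Diapers"}, category_baskets)
category_confidence = confidence({"Beer"}, {"Diapers"}, category_baskets)

print(brand_support, category_support, round(category_confidence, 3))
```

On this toy data the brand-level pair is rare while the category-level pair is common, which is the usual reason to consider multiple-level analysis: a pattern that is too sparse at one level may become visible after generalization.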
Step 3: Knowledge Interpretation
A. Post-processing of mining results
  When you have too many patterns, you need to:
    Order them using some interestingness metric
    Pass them to the visualization tool incrementally
B. Visualization
  Render the patterns in an easy-to-use, intuitive manner
  Highlight the most relevant patterns
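The post-processing step above can be sketched as follows. The slides do not say which interestingness metric is meant, so this example assumes lift as one common choice. The `Rule` type, the `ranked_batches` helper, and the rule statistics are hypothetical. Patterns are ordered by the metric and then handed over in small batches, as a visualization tool might consume them.

```python
from typing import Iterator, List, NamedTuple


class Rule(NamedTuple):
    antecedent: str
    consequent: str
    support: float             # P(antecedent and consequent)
    antecedent_support: float  # P(antecedent)
    consequent_support: float  # P(consequent)


def lift(rule: Rule) -> float:
    """Lift = P(A and B) / (P(A) * P(B)); values above 1 suggest a positive association."""
    return rule.support / (rule.antecedent_support * rule.consequent_support)


def ranked_batches(rules: List[Rule], batch_size: int) -> Iterator[List[Rule]]:
    """Yield rules in descending order of lift, batch_size rules at a time."""
    ordered = sorted(rules, key=lift, reverse=True)
    for start in range(0, len(ordered), batch_size):
        yield ordered[start:start + batch_size]


# Hypothetical mined rules (illustrative statistics, not from the slides).
rules = [
    Rule("Beer", "Diapers", 0.50, 0.75, 0.75),
    Rule("Cheese", "Eggs", 0.25, 0.50, 0.25),
    Rule("Beer", "Cheese", 0.25, 0.75, 0.50),
]

batches = list(ranked_batches(rules, batch_size=2))
for batch in batches:
    print([(r.antecedent, r.consequent, round(lift(r), 2)) for r in batch])
```

Using a generator means the consumer can stop after the first few batches, so the most interesting patterns are rendered first without passing the full result set at once.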
Step 3B: Visualization
T2: Classification
Data categorization based on a set of training objects.
Applications: credit approval, target marketing, medical diagnosis, treatment effectiveness analysis, automatic text categorization, etc.
Goal: Develop a description for each class, to support classification of future test data, better understanding of each class, and prediction of certain properties.
Engine data example horsepower 21
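As a rough illustration of developing a description for each class and then classifying future test data, here is a minimal nearest-mean sketch. It is not the method used in the slides. The horsepower values, the class labels, and the function names are hypothetical, and the per-class "description" is assumed to be simply the mean horsepower of that class's training objects.

```python
from collections import defaultdict

# Hypothetical training objects: (horsepower, class label). Not from the slides.
training = [
    (70, "economy"), (85, "economy"), (95, "economy"),
    (180, "performance"), (210, "performance"), (240, "performance"),
]


def describe_classes(examples):
    """Build a one-number description per class: its mean horsepower."""
    values = defaultdict(list)
    for horsepower, label in examples:
        values[label].append(horsepower)
    return {label: sum(v) / len(v) for label, v in values.items()}


def classify(horsepower, descriptions):
    """Assign the class whose mean horsepower is closest to the given value."""
    return min(descriptions, key=lambda label: abs(descriptions[label] - horsepower))


descriptions = describe_classes(training)

# Classify two unseen (hypothetical) test objects.
predictions = [classify(hp, descriptions) for hp in (90, 200)]
print(descriptions, predictions)
```

The same two-phase shape (learn a description from labeled training objects, then apply it to unlabeled data) carries over to richer models such as decision trees.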