![Page 1: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/1.jpg)
Department of Computer Science, Sir Syed University of Engineering & Technology, Karachi, Pakistan.
Presentation Title: DATA MINING
Submitted By:
Osama Ghulam Mohammad (2010-CS-20)
Noureen Chagani (2010-CS-11)
Naveed Usman (2010-CS-23)
![Page 2: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/2.jpg)
TABLE OF CONTENTS
• What is data mining?
• Data mining consists of five major elements
• Why Mine Data?
  - Commercial Viewpoint
  - Scientific Viewpoint
• Some of the techniques used for data mining
![Page 3: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/3.jpg)
What is data mining?
Data Mining, also known as Knowledge-Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns.
It is the process of extraction of knowledge from large datasets.
• Extremely large datasets.
• Useful knowledge that can improve processes.
![Page 4: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/4.jpg)
![Page 5: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/5.jpg)
Data mining consists of five major elements:
• Extract, transform, and load transaction data onto the data warehouse system.
• Store and manage the data in a multidimensional database system.
• Provide data access to business analysts and information technology professionals.
• Analyze the data by application software.
• Present the data in a useful format, such as a graph or table.
![Page 6: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/6.jpg)
Why Mine Data? Commercial Viewpoint
• Lots of data is being collected and warehoused
  - Web data, e-commerce
  - Purchases at department/grocery stores
  - Bank/credit card transactions
• Computers have become cheaper and more powerful
• Competitive pressure is strong
  - Provide better, customized services for an edge (e.g. in Customer Relationship Management)
![Page 7: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/7.jpg)
Why Mine Data? Scientific Viewpoint
• Data is collected and stored at enormous speeds (GB/hour)
  - Remote sensors on a satellite
  - Telescopes scanning the skies
  - Microarrays generating gene expression data
  - Scientific simulations generating terabytes of data
• Traditional techniques are infeasible for raw data
• Data mining may help scientists
  - In classifying and segmenting data
![Page 8: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/8.jpg)
Some of the techniques used for data mining are:
Artificial neural networks - Non-linear predictive models that learn through training and resemble biological neural networks in structure. They are useful for pattern recognition and data classification.
![Page 9: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/9.jpg)
Neural Network
• Neural networks map a set of input nodes to a set of output nodes
• The number of inputs/outputs is variable
• The network itself is composed of an arbitrary number of nodes with an arbitrary topology
[Figure: a "Neural Network" box mapping Input 0, Input 1, ..., Input n to Output 0, Output 1, ..., Output m]
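The mapping from input nodes to output nodes can be sketched as a small feed-forward pass. This is only an illustration: the slides do not specify a topology, activation function, or weights, so the fully connected layers, the sigmoid activation, and every number below are assumptions, and the network is untrained.

```python
import math

def sigmoid(x):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Compute one fully connected layer.

    weights[j][i] is the (hypothetical) weight from input i to node j.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Feed the inputs through each (weights, biases) layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations

# Illustrative, untrained parameters: 3 inputs -> 2 hidden nodes -> 2 outputs.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [-0.7, 0.4]], [0.0, 0.0]),
]
outputs = network_forward([1.0, 0.0, 1.0], layers)
print(outputs)
```

Changing the shapes of the weight lists changes the number of inputs and outputs, which is the sense in which both are variable. Training (adjusting the weights from examples) is not shown here.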
![Page 10: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/10.jpg)
Decision tree
Tree-shaped structures that represent sets of decisions. These decisions generate
rules for the classification of a dataset.
![Page 11: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/11.jpg)
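Decision tree (data)

| height | hair  | eyes  | class |
|--------|-------|-------|-------|
| short  | blond | blue  | A     |
| tall   | blond | brown | B     |
| tall   | red   | blue  | A     |
| short  | dark  | blue  | B     |
| tall   | dark  | blue  | B     |
| tall   | blond | blue  | A     |
| tall   | dark  | brown | B     |
| short  | blond | brown | B     |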
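One common way to decide which attribute a decision tree should test first is information gain (the reduction in class entropy after splitting). The slides do not say which criterion was used, so treating it as entropy-based is an assumption; the sketch below simply scores the three attributes of the eight records above.

```python
import math
from collections import Counter

COLUMNS = ("height", "hair", "eyes")
# The eight records from the slide; the last field is the class.
RECORDS = [
    ("short", "blond", "blue", "A"),
    ("tall", "blond", "brown", "B"),
    ("tall", "red", "blue", "A"),
    ("short", "dark", "blue", "B"),
    ("tall", "dark", "blue", "B"),
    ("tall", "blond", "blue", "A"),
    ("tall", "dark", "brown", "B"),
    ("short", "blond", "brown", "B"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(records, column_index):
    """Entropy of the classes minus the weighted entropy after the split."""
    labels = [record[-1] for record in records]
    groups = {}
    for record in records:
        groups.setdefault(record[column_index], []).append(record[-1])
    remainder = sum(len(group) / len(records) * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder

gains = {name: information_gain(RECORDS, i) for i, name in enumerate(COLUMNS)}
best = max(gains, key=gains.get)
for name, gain in gains.items():
    print(f"{name}: {gain:.3f}")
print("best first split:", best)
```

On this data hair scores highest and height contributes almost nothing, which is consistent with a tree that tests hair at the root and never uses height.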
![Page 12: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/12.jpg)
![Page 13: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/13.jpg)
[Figure: decision tree for the data above]
hair
• dark: class B
• red: class A
• blond: test eyes
  - blue: class A
  - brown: class B
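The tree for the eight records (hair at the root, eyes tested only under blond) can be read as nested rules. The function below is a hand-written sketch of those rules, not a learned model; the function name and the error handling for unseen values are assumptions.

```python
def classify(hair, eyes):
    """Apply the tree's rules: test hair first, then eyes for blond only."""
    if hair == "dark":
        return "B"
    if hair == "red":
        return "A"
    if hair == "blond":
        if eyes == "blue":
            return "A"
        if eyes == "brown":
            return "B"
    raise ValueError(f"no rule for hair={hair!r}, eyes={eyes!r}")

# The eight (height, hair, eyes, class) records from the data slide.
RECORDS = [
    ("short", "blond", "blue", "A"),
    ("tall", "blond", "brown", "B"),
    ("tall", "red", "blue", "A"),
    ("short", "dark", "blue", "B"),
    ("tall", "dark", "blue", "B"),
    ("tall", "blond", "blue", "A"),
    ("tall", "dark", "brown", "B"),
    ("short", "blond", "brown", "B"),
]

# Height is ignored: the tree never tests it.
predictions = [classify(hair, eyes) for _, hair, eyes, _ in RECORDS]
correct = sum(p == record[-1] for p, record in zip(predictions, RECORDS))
print(f"{correct} of {len(RECORDS)} records classified correctly")
```

These rules reproduce the class of every record in the table.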
![Page 14: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/14.jpg)
The nearest neighbor method
A classification technique that classifies each record based on the records most similar to it in a historical database.
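A minimal sketch of this idea is k-nearest-neighbor voting. The slide does not define "most similar", so Euclidean distance, the choice k = 3, and the toy two-feature records below are all assumptions made for illustration.

```python
import math
from collections import Counter

def k_nearest_neighbors(history, query, k=3):
    """Return the majority class among the k records closest to query.

    history is a list of (features, label) pairs, standing in for the
    historical database.
    """
    neighbors = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical historical records with two numeric features each.
history = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.3), "A"),
    ((5.0, 5.0), "B"), ((5.5, 4.8), "B"), ((4.7, 5.2), "B"),
]
print(k_nearest_neighbors(history, (1.1, 1.0)))
print(k_nearest_neighbors(history, (5.1, 5.0)))
```

No model is built in advance: all the work happens at query time, by comparing the new record against the stored ones.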
![Page 15: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/15.jpg)
An important technique for Data Mining is:
CLUSTERING
![Page 16: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/16.jpg)
Clustering: (Definition)
• Clustering can be considered the most important unsupervised learning technique; like every other problem of this kind, it deals with finding a structure in a collection of unlabeled data.
• Clustering is “the process of organizing objects into groups whose members are similar in some way”.
• A cluster is therefore a collection of objects which are “similar” to one another and “dissimilar” to the objects belonging to other clusters.
![Page 17: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/17.jpg)
Clustering
The greater the similarity (or homogeneity) within a group, and the greater the difference between groups, the “better” or more distinct the clustering.
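One way to make this statement concrete is to compare the average distance between points in the same group with the average distance between points in different groups. The slide names no specific measure, so these two averages, the Euclidean distance, and the toy points are assumptions.

```python
import math
from itertools import combinations, product

def mean_within_distance(clusters):
    """Average distance between pairs of points in the same cluster."""
    distances = [math.dist(p, q)
                 for cluster in clusters
                 for p, q in combinations(cluster, 2)]
    return sum(distances) / len(distances)

def mean_between_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    distances = [math.dist(p, q)
                 for a, b in combinations(clusters, 2)
                 for p, q in product(a, b)]
    return sum(distances) / len(distances)

# The same six points grouped two ways.
tight = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
mixed = [[(0, 0), (10, 10), (1, 0)], [(0, 1), (10, 11), (11, 10)]]

for name, clusters in (("tight", tight), ("mixed", mixed)):
    print(name,
          round(mean_within_distance(clusters), 2),
          round(mean_between_distance(clusters), 2))
```

The first grouping has a smaller within-group distance and a larger between-group distance than the second, so by the criterion on this slide it is the better clustering.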
![Page 18: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/18.jpg)
![Page 19: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/19.jpg)
Why clustering?
A few good reasons ...
• Simplifications
• Pattern detection
![Page 20: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/20.jpg)
The K-Means Clustering Method
Basic K-means algorithm for finding K clusters:
1. Select K points as the initial centroids.
2. Assign all points to the closest centroid.
3. Recompute the centroid of each cluster.
4. Repeat steps 2 and 3 until the centroids don’t change.
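The four steps can be sketched in plain Python as follows. Several details are assumptions not given on the slide: distance is Euclidean, the caller supplies the initial centroids (step 1), an empty cluster keeps its previous centroid, and a `max_iterations` cap guards against looping forever. The helper names and the four sample points are illustrative only.

```python
import math

def closest_centroid(point, centroids):
    """Index of the centroid nearest to point (ties go to the lowest index)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def k_means(points, initial_centroids, max_iterations=100):
    centroids = [tuple(c) for c in initial_centroids]  # step 1
    for _ in range(max_iterations):
        # Step 2: assign every point to its closest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[closest_centroid(point, centroids)].append(point)
        # Step 3: recompute centroids (an empty cluster keeps its old one).
        new_centroids = [mean_point(cluster) if cluster else old
                         for cluster, old in zip(clusters, centroids)]
        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

def sse(centroids, clusters):
    """Sum of squared distances from each point to its cluster centroid."""
    return sum(math.dist(point, centroid) ** 2
               for centroid, cluster in zip(centroids, clusters)
               for point in cluster)

points = [(0, 0), (0, 1), (4, 0), (4, 1)]
good = k_means(points, [(0, 0), (4, 0)])
poor = k_means(points, [(0, 0), (0, 1)])
print(good[0], sse(*good))
print(poor[0], sse(*poor))
```

Both runs stop because the centroids no longer change, but the first start ends with centroids (0, 0.5) and (4, 0.5) and a squared error of 1.0, while the second ends with (2, 0) and (2, 1) and a squared error of 16.0. The result of K-means therefore depends on the initial centroids.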
![Page 21: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/21.jpg)
Figure 10a shows the case when the cluster centers coincide with the circle centers. This is a global minimum. Figure 10b shows a local minimum.
![Page 22: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/22.jpg)
Cluster Example
![Page 23: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/23.jpg)
“The key in business is to know something that nobody else knows.”
- Aristotle Onassis
“To understand is to perceive patterns.”
- Sir Isaiah Berlin
![Page 24: Presentation Title: DATA MINING](https://reader036.vdocument.in/reader036/viewer/2022062520/56816320550346895dd39a31/html5/thumbnails/24.jpg)
Thank You