DATA MINING REPORT PHASE (1), Lamiya El_Saedi 220093158
Index
1.1: Introduction
1.2: Descriptions
1.2.1: White wine description
1.2.2: Breast tissue description
1.3: Conclusion
1.1: INTRODUCTION
In this phase we discuss the first step in data mining, PREPROCESSING, on two datasets. The first is a CSV file about White Wine, and the other is an XLS file about Breast Tissue. We work in the RapidMiner program. In this phase we plot the data to understand it and find the outliers as part of data cleaning. In data transformation, we remove attributes (columns) that are correlated with one another and set roles to convert the target class from regular to label. In data reduction, we take a sample from the large data.
1.2: DESCRIPTIONS
1.2.1: White wine description
Methods:
1- Discretize process: In this method we choose quality as the target class. It takes values from 0 to 10, representing the quality of white wine from bad to excellent, and we discretize it into a new classification.
We added four classes:
Bad: from -infinity to 3
Good: from 4 to 5
Very good: from 6 to 7
Excellent: from 8 to 10
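The mapping from quality score to the four classes can be sketched outside RapidMiner as follows. This is only an illustrative sketch in plain Python, not the operator used in the report; the names `UPPER_BOUNDS`, `LABELS` and `quality_class`, and the sample scores, are assumptions made for the example.

```python
from bisect import bisect_left

# Upper bound of each class, taken from the four classes listed above.
UPPER_BOUNDS = [3, 5, 7, 10]
LABELS = ["Bad", "Good", "Very good", "Excellent"]


def quality_class(score):
    """Map a quality score to its class (everything up to 3 is 'Bad')."""
    if score > UPPER_BOUNDS[-1]:
        raise ValueError("quality score above 10 is outside the defined classes")
    # bisect_left returns the index of the first upper bound >= score.
    return LABELS[bisect_left(UPPER_BOUNDS, score)]


# Made-up scores, used only to show the mapping.
scores = [0, 3, 4, 5, 6, 7, 8, 10]
classes = [quality_class(s) for s in scores]
```

Each boundary value (3, 5, 7, 10) falls into the lower class, which matches the ranges given above.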
Discretize process
Figure 1.2.1.1: the model of discretize process
Figure 1.2.1.2: the output of discretize method
Sample process and remove correlated attributes
Figure 1.2.1.3: Sample process and remove correlated attributes on the white wine dataset
Figure 1.2.1.5: Result of the sample process and removing correlated attributes on the white wine dataset
Filter process
Figure 1.2.1.6: Filter example process on the white wine dataset
Figure 1.2.1.7: Non-sweet white wine based on Syria measurements
Figure 1.2.1.8: Sweet white wine based on Syria measurements
1.2.2: Breast tissue description
1- Detect outlier:
Figure 1.2.2.1: Outlier process on the Breast tissue dataset
Figure 1.2.2.2: Plot of the outlier method on the Breast tissue dataset
Figure 1.2.2.3: The row of outlier data
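The report does not say which outlier criterion was configured, so the following is only a sketch of one common distance-based approach: each row is scored by the distance to its k-th nearest neighbour, and the rows with the largest scores are flagged. The function names, the values of `k` and `n`, and the sample points are all assumptions for illustration.

```python
import math


def knn_outlier_scores(rows, k=2):
    """Score each row by the Euclidean distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(rows):
        dists = sorted(math.dist(p, q) for j, q in enumerate(rows) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_outliers(rows, k=2, n=1):
    """Return the indices of the n rows with the largest outlier score."""
    scores = knn_outlier_scores(rows, k)
    ranked = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


# Made-up 2-D points: four close together and one far away.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.0), (8.0, 9.0)]
outlier_rows = top_outliers(points, k=2, n=1)
```

Here `outlier_rows` holds the index of the isolated point, which plays the same role as the flagged row shown in the figure.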
2- Remove correlated attributes:
Figure 1.2.2.4: Remove correlated attributes from the Breast tissue dataset
Figure 1.2.2.5: The remaining attributes after executing the remove correlation process on the Breast tissue dataset
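The idea behind removing correlated attributes can be sketched as below: compute the Pearson correlation between pairs of attributes and keep an attribute only if it is not strongly correlated with one already kept. This is a minimal sketch, not the exact procedure of the operator used in the report; the threshold of 0.95, the column names and the values are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def remove_correlated(columns, threshold=0.95):
    """Keep a column only if |r| <= threshold against every column kept so far."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical attributes: "b" is roughly 2 * "a", "c" is unrelated.
columns = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.1, 4.0, 6.2, 7.9, 10.1],
    "c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = remove_correlated(columns)
```

Which attribute of a correlated pair survives depends on the order in which the columns are visited, so a different order can leave a different set of remaining attributes.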
1.3: CONCLUSION
1. The preprocessing phase is very important: it prepares your data for the next phases and gives you confidence that your data are correct.
2. You must import your dataset according to its file extension type.
3. When importing the attributes, you must choose the correct data type for each one so that you can work with it more flexibly.
4. A method that suits one dataset may not suit another, because each dataset has its own characteristics.
5. If a model contains a sample process, you can get different results every time you run it.
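The last point comes from the randomness of sampling. A minimal sketch in plain Python shows how fixing a random seed makes a sample repeatable; the helper `sample_rows`, the seed value and the row list are assumptions made for the example, not part of the report's process.

```python
import random


def sample_rows(rows, size, seed=None):
    """Draw `size` rows without replacement; a fixed seed repeats the same draw."""
    rng = random.Random(seed)
    return rng.sample(rows, size)


# Stand-in for the rows of a dataset.
rows = list(range(100))

first = sample_rows(rows, 10, seed=42)
second = sample_rows(rows, 10, seed=42)   # same seed, same sample
unseeded = sample_rows(rows, 10)          # no seed, may differ on every run
```

With the same seed the two samples are identical, so any steps that follow the sampling see the same rows on every run.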