Upload: prasanta-paul

Post on 22-Jan-2018

Page 1: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

A Review of Hybrid Data Mining Algorithm for Big Data Mining

Presented By

PRASANTA KUMAR PAUL

RESEARCH SCHOLAR

AIIT

AMITY UNIVERSITY RAJASTHAN

First International Conference on Smart Technologies in Computer and Communication (SmartTech-2017)

Under the guidance of DR. SONALI VYAS

ASSISTANT PROFESSOR

AIIT

AMITY UNIVERSITY RAJASTHAN

Page 2: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

What is …… ?

• Hybrid Data Mining

‣ A hybrid data mining algorithm can be described as a combination of different classifiers. Individual data mining algorithms differ in their classification ability, which is why combining them may improve the accuracy of the overall system, provided the component classifiers are well chosen. More general ensemble approaches, such as Boosting and Bagging, also exist and can be efficient. One application in image processing is real-time face detection using AdaBoost.
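As a minimal sketch of what "combining classifiers" can mean, the snippet below uses simple majority voting. The three base classifiers, their feature names, and their thresholds are hypothetical and chosen only for illustration; they are not taken from the reviewed papers.

```python
from collections import Counter

def majority_vote(classifiers, sample):
    """Return the label predicted by most of the base classifiers."""
    votes = [clf(sample) for clf in classifiers]
    # most_common(1) returns [(label, count)]; ties go to the first label seen.
    return Counter(votes).most_common(1)[0][0]

# Three hypothetical base classifiers, each looking at one feature of a sample.
def by_width(s):
    return "face" if s["width"] > 20 else "background"

def by_contrast(s):
    return "face" if s["contrast"] > 0.5 else "background"

def by_symmetry(s):
    return "face" if s["symmetry"] > 0.7 else "background"

ensemble = [by_width, by_contrast, by_symmetry]
sample = {"width": 24, "contrast": 0.4, "symmetry": 0.9}
prediction = majority_vote(ensemble, sample)
print(prediction)  # two of the three classifiers vote "face"
```

Voting only helps when the base classifiers make different mistakes, which is one reading of the requirement that they be well chosen.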

Page 3: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

LITERATURE SURVEY

P. Thamilselvan: Image classification using hybrid data mining algorithms.

Deshmukh, A. P., & Pamu, K. S. (2012). Introduction to Hadoop distributed file system.

Feilong Cao proposed a new algorithm that combines Extreme K-Means (EKM) and Effective Extreme Learning Machine (EELM).

Alireza Taravat et al. introduced a new hybrid algorithm for automatic cloud detection in a complete-sky image.

M.R. et al. [10] presented a hybrid algorithm using Support Vector Machine (SVM) and K-nearest neighbor (KNN).

Page 4: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

RELATED HYBRID ALGORITHMS FOR BIG DATA MINING

Hybrid evolutionary clustering with empty clustering solution (H(EC)²S)

RC Part (Representative Construction)

EFC Part (Enhanced Fireworks algorithm for clustering)

CSC Part (Cuckoo search for clustering)

Hybrid evolutionary clustering with empty clustering solution (H(EC)²S) shows better precision than the other hybrid approaches.

Page 5: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

RELATED HYBRID ALGORITHMS FOR BIG DATA MINING

Hybrid Clustering Algorithm (HBCA) using BIRCH and K-Means

The Hybrid Clustering Algorithm (HBCA) combines BIRCH and K-Means. The proposed method performs better than K-Means and K-medoid, as evaluated using the WEKA data mining tool.
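The two-stage idea behind a BIRCH plus K-Means hybrid can be sketched as follows. This is a simplified illustration, not the HBCA implementation: stage 1 uses a flat threshold-based summarisation as a stand-in for BIRCH's CF-tree, and the function names, threshold, and sample points are assumptions.

```python
import math

def micro_clusters(points, threshold):
    """Stage 1 (BIRCH-like): absorb each point into the nearest summary whose
    centroid lies within `threshold`, otherwise start a new summary.
    Each summary keeps (count, linear sum), like a BIRCH clustering feature."""
    summaries = []  # each entry is [n, sum_x, sum_y]
    for x, y in points:
        best, best_d = None, threshold
        for s in summaries:
            d = math.dist((x, y), (s[1] / s[0], s[2] / s[0]))
            if d <= best_d:
                best, best_d = s, d
        if best is None:
            summaries.append([1, x, y])
        else:
            best[0] += 1
            best[1] += x
            best[2] += y
    return [(sx / n, sy / n, n) for n, sx, sy in summaries]

def kmeans(summaries, k, iters=10):
    """Stage 2: weighted k-means over the micro-cluster centroids."""
    if len(summaries) < k:
        raise ValueError("need at least k micro-clusters")
    centers = [(x, y) for x, y, _ in summaries[:k]]  # naive initialisation
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x, y, w in summaries:
            j = min(range(k), key=lambda i: math.dist((x, y), centers[i]))
            groups[j].append((x, y, w))
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster goes empty
                total = sum(w for _, _, w in g)
                centers[j] = (sum(x * w for x, _, w in g) / total,
                              sum(y * w for _, y, w in g) / total)
    return centers

points = [(0, 0), (0.2, 0.1), (0.1, 0.3), (5, 5), (5.2, 5.1), (4.9, 5.3)]
micro = micro_clusters(points, threshold=1.0)
centers = kmeans(micro, k=2)
print(len(micro), centers)
```

The point of the first stage is that K-Means then runs on a handful of summaries instead of every raw point, which is where the hybrid is expected to save time on large datasets.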

Page 6: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

RELATED HYBRID ALGORITHMS FOR BIG DATA MINING

GA/DT Hybrid data mining algorithm

The GA/DT hybrid data mining algorithm is reported to be 20% more effective than the decision tree or genetic programming used individually.

Page 7: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

RELATED HYBRID ALGORITHMS FOR BIG DATA MINING

VAMR Algorithm: Vertical-Apriori MapReduce algorithm

Initial scan

Producing frequent 1-item set and its TID set

Producing frequent (K+1) item set

More candidates?

END
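The steps above can be sketched on a single machine as follows. This is an illustrative reading of the vertical (TID-set) layout only: the MapReduce distribution that VAMR adds is omitted, and the function name, transactions, and support threshold are assumptions.

```python
from itertools import combinations

def vertical_apriori(transactions, min_support):
    """Frequent itemsets via TID-set intersection (vertical layout)."""
    # Initial scan: build the TID set of every single item.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(frozenset([item]), set()).add(tid)
    # Frequent 1-itemsets and their TID sets.
    level = {i: t for i, t in tidsets.items() if len(t) >= min_support}
    frequent = {}
    while level:  # loop ends when no candidates survive
        frequent.update({i: len(t) for i, t in level.items()})
        next_level = {}
        # Join pairs of frequent k-itemsets that differ in exactly one item.
        for (a, ta), (b, tb) in combinations(level.items(), 2):
            cand = a | b
            if len(cand) == len(a) + 1 and cand not in next_level:
                tids = ta & tb  # support comes from intersecting TID sets
                if len(tids) >= min_support:
                    next_level[cand] = tids
        level = next_level
    return frequent

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = vertical_apriori(transactions, min_support=2)
for itemset, support in frequent.items():
    print(sorted(itemset), support)
```

After the initial scan the transaction data is never read again; every later support count is a set intersection, which is the main attraction of the vertical format.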

Page 8: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

RELATED HYBRID ALGORITHMS FOR BIG DATA MINING

Apriori-MapReduce Algorithm

The Apriori algorithm is redesigned for the MapReduce platform, which increases efficiency by up to 15%.
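A rough sketch of one support-counting pass in map/reduce style is shown below. It runs in a single process and only mimics the mapper and reducer roles; the function names are hypothetical, and Apriori's candidate pruning between passes is left out for brevity.

```python
from collections import defaultdict
from itertools import combinations

def map_phase(transaction, k):
    """Mapper: emit (candidate k-itemset, 1) for one transaction."""
    for itemset in combinations(sorted(transaction), k):
        yield itemset, 1

def reduce_phase(pairs, min_support):
    """Reducer: sum the counts per itemset and keep the frequent ones."""
    counts = defaultdict(int)
    for itemset, one in pairs:
        counts[itemset] += one
    return {i: c for i, c in counts.items() if c >= min_support}

def count_pass(transactions, k, min_support):
    # On a real cluster the framework shuffles mapper output to reducers;
    # here the "shuffle" is just a flat generator over all emitted pairs.
    emitted = (p for t in transactions for p in map_phase(t, k))
    return reduce_phase(emitted, min_support)

transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
pairs = count_pass(transactions, k=2, min_support=2)
print(pairs)
```

Because each mapper only needs its own transactions, the counting work splits naturally across machines, and the reducers see only small (itemset, count) records.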

Page 9: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

RELATED HYBRID ALGORITHMS FOR BIG DATA MINING

Hybrid GA-SVM model

Page 10: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

COMPARISON OF DIFFERENT HYBRID DATA MINING ALGORITHMS BASED ON IMAGE CLASSIFICATION

Table 1: Narration of Hybrid Algorithms (Based on Image Classification)

S.No | Proposed hybrid approach | Purpose of development | Drawbacks

1 | Genetic Algorithm and Support Vector Machine | To reduce the dimensionality and optimize the classification process | Shows a high error rate.

2 | Decision Tree and Naive Bayes | To improve the classification accuracy of multi-class problems | Gives a less compact solution.

3 | Extreme K-Means and Effective Extreme Learning Machine | To improve the classification accuracy | Training is very slow.

4 | Naïve Bayes and Support Vector Machine | To improve the performance of specificity and sensitivity | Several key parameters are needed to achieve the best classification result.

5 | Support Vector Machine and Classification and Regression Tree | To identify the age band of a 2D face image | The regression step produces a high degree of confusion.

Page 11: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

CONCLUSION AND FUTURE WORK

The proposed methodology provides comprehensive knowledge of how to deal with large datasets. The methodology is simple but requires a good knowledge of data mining.

From this review, the hybrid method Hybrid evolutionary clustering with empty clustering solution (H(EC)²S) shows better precision than the other hybrid approaches.

In future, we intend to combine at least two data mining methods. By applying the proposed hybrid technique, we plan to achieve better classification precision and, in addition, lower computational time complexity than other hybrid methods.

Page 12: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

REFERENCES

Cui, X., Yang, S., & Wang, D. (2016, August). An algorithm of apriori based on medical big data and cloud computing. In Cloud Computing and Intelligence Systems (CCIS), 2016 4th International Conference on (pp. 361-365). IEEE.

Grami, M., Gheibi, R., & Rahimi, F. (2016, September). A novel association rule mining using genetic algorithm. In Information and Knowledge Technology (IKT), 2016 Eighth International Conference on (pp. 200-204). IEEE.

Afzali, M., Singh, N., & Kumar, S. (2016, March). Hadoop-MapReduce: A platform for mining large datasets. In Computing for Sustainable Global Development (INDIACom), 2016 3rd International Conference on (pp. 1856-1860). IEEE.

Azizi, N., Zemmal, N., Sellami, M., & Farah, N. (2014, April). A new hybrid method combining genetic algorithm and support vector machine classifier: Application to CAD system for mammogram images. In Multimedia Computing and Systems (ICMCS), 2014 International Conference on (pp. 415-420). IEEE.

Cao, F., Liu, B., & Park, D. S. (2013). Image classification based on effective extreme learning machine. Neurocomputing, 102, 90-97. Elsevier.

Yannick, L. L., Sebastien, P., & Djamel, M. (2013, September). Combining regression and classification methods for age band estimation from human faces. In 2013 8th International Symposium on Image and Signal Processing and Analysis (ISPA) (pp. 136-141). IEEE.

Taravat, A., Del Frate, F., Cornaro, C., & Vergari, S. (2015). Neural networks and support vector machine algorithms for automatic cloud classification of whole-sky ground-based images. IEEE Geoscience and remote sensing letters, 12(3), 666-670. IEEE.

Thamilselvan, P., & Sathiaseelan, J. G. R. (2015). A Comparative Study of Data Mining Algorithms for Image Classification. I.J. Education and Management Engineering, Modern Education and Computer Science Press (2), 1-9. IEEE.

Thamilselvan, P., & Sathiaseelan, J. G. R. (2015, March). Image classification using hybrid data mining algorithms-a review. In Innovations in Information, Embedded and Communication Systems (ICIIECS), 2015 International Conference on (pp. 1-6). IEEE.

Na, S., Xumin, L., & Yong, G. (2010, April). Research on k-means clustering algorithm: An improved k-means clustering algorithm. In Intelligent Information Technology and Security Informatics (IITSI), 2010 Third International Symposium on (pp. 63-67). IEEE.

Page 13: Ppt for paper id 696 a review of hybrid data mining algorithm for big data mining

REFERENCES

Joshi, R., Patidar, A., & Mishra, S. (2011, April). Scaling k-medoid algorithm for clustering large categorical dataset and its performance analysis. In Electronics Computer Technology (ICECT), 2011 3rd International Conference on (Vol. 2, pp. 117-121). IEEE.

Kaur, J., & Singh, H. (2015, December). Performance evaluation of a novel hybrid clustering algorithm using birch and K-means. In 2015 Annual IEEE India Conference (INDICON) (pp. 1-6). IEEE.

Deshmukh, A. P., & Pamu, K. S. (2012). Introduction to Hadoop distributed file system. IJEIR, 1(2), 230-236.

Woo, J. (2012, January). Apriori-Map/Reduce Algorithm. In Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA) (p. 1). The Steering Committee of The World Congress in Computer Science, Computer Engineering and Applied Computing (World Comp).

Karimov, J., & Ozbayoglu, M. (2015, October). High quality clustering of big data and solving empty-clustering problem with an evolutionary hybrid algorithm. In Big Data (Big Data), 2015 IEEE International Conference on (pp. 1473-1478). IEEE.

Carvalho, D. R., & Freitas, A. A. (2004). A hybrid decision tree/genetic algorithm method for data mining. Information Sciences, 163(1), 13-35.

Dhaka, V. S., & Vyas, S. (2014). Analysis of Server Performance with Different Techniques of Virtual Databases. Journal of Emerging Trends in Computing and Information Sciences, 5(10).

Vyas, S. (2015). Analyzing Performance of Virtual and Non-Virtual database. Journal of Global Research Computer Science & Technology, 3(8), 32-42.
