![Page 1: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/1.jpg)
Data Mining Technique For Classification and Feature Evaluation
Using Stream Mining
Ranjit R. Banshpal
![Page 2: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/2.jpg)
OUTLINE
• Introduction
• Data stream classification
• Decision Tree
• VFDT
• Challenges
• Applications
• Conclusion
• References
![Page 3: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/3.jpg)
Introduction
• What is data mining?
• Extracting knowledge from historical data.
• What is data stream mining?
• Extracting knowledge from real-time, high-speed data streams.
• Why do we use data stream mining?
![Page 4: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/4.jpg)
Introduction (Cont…)
Examples of continuous-flow data:
• Network traffic data
• Sensor data
• Call center data
![Page 5: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/5.jpg)
Data Stream Classification
• Uses past labeled data to build a classification model
• Predicts the labels of future instances using the model
• Helps decision making
Fig: Diagram with the labels network traffic, firewall, classification model, attack traffic (block and quarantine), benign traffic (server), expert analysis and labeling, and model update.
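The cycle on this slide (build a model from labeled data, predict labels for new instances, update the model as experts label more data) can be sketched as follows. This is a minimal illustration, not the presenter's method: the nearest-centroid model, the class name `CentroidStreamClassifier`, and the toy traffic features are all assumptions chosen to keep the example short.

```python
import math
from collections import defaultdict


class CentroidStreamClassifier:
    """Hypothetical incremental classifier: one running mean (centroid) per label."""

    def __init__(self):
        self.sums = {}                    # label -> per-feature running sums
        self.counts = defaultdict(int)    # label -> number of examples seen

    def update(self, features, label):
        """Model update: fold one expert-labeled instance into its class centroid."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
        self.sums[label] = [s + x for s, x in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        """Return the label whose centroid is closest to the instance."""
        best_label, best_dist = None, math.inf
        for label, sums in self.sums.items():
            centroid = [s / self.counts[label] for s in sums]
            dist = math.dist(features, centroid)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label


# Toy features (assumed): (packets per second, mean packet size in bytes)
model = CentroidStreamClassifier()
for features, label in [((900.0, 60.0), "attack"), ((20.0, 500.0), "benign"),
                        ((950.0, 64.0), "attack"), ((25.0, 480.0), "benign")]:
    model.update(features, label)

print(model.predict((880.0, 70.0)))
print(model.predict((30.0, 510.0)))
```

Because `update` only touches running sums and counts, the model never needs to keep past instances, which is the property a stream classifier depends on.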
![Page 6: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/6.jpg)
Decision Trees
• A decision tree is a classification model. Its structure resembles a general tree or a flow chart.
– Internal node: tests the value of an attribute.
– Leaf node: holds a class label.
Fig: Decision Tree of Weather
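The two node types can be illustrated with a small data structure. The sketch below is an assumption-laden example: the `Node` and `Leaf` classes and the `classify` helper are hypothetical names, and the weather tree is the classic textbook one, not necessarily the tree shown in the slide's figure.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    label: str                      # class label stored at a leaf node


@dataclass
class Node:
    attribute: str                  # attribute tested at this internal node
    children: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree, instance):
    """Walk from the root to a leaf, following the branch for each tested value."""
    while isinstance(tree, Node):
        tree = tree.children[instance[tree.attribute]]
    return tree.label


# Assumed weather tree (textbook example, not taken from the slide's figure)
weather_tree = Node("outlook", {
    "sunny": Node("humidity", {"high": Leaf("no"), "normal": Leaf("yes")}),
    "overcast": Leaf("yes"),
    "rainy": Node("windy", {"true": Leaf("no"), "false": Leaf("yes")}),
})

print(classify(weather_tree, {"outlook": "sunny", "humidity": "normal", "windy": "false"}))
```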
![Page 7: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/7.jpg)
Decision Tree (cont...)
• Limitations
– Classic decision tree learners assume that all training data can be stored in main memory at the same time.
– Disk-based decision tree learners repeatedly read the training data from disk sequentially.
![Page 8: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/8.jpg)
VFDT
• VFDT (Very Fast Decision Tree) takes less time than a classic decision tree learner.
• To find the best attribute at a node, it uses only a small subset of the training examples that pass through that node.
– Given a stream of examples, use the first ones to choose the root attribute.
– Once the root attribute is chosen, the successive examples are passed down to the corresponding leaves and used to choose the attributes there, and so on recursively.
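The slide does not say how VFDT decides that a subset of examples is large enough. In the published algorithm (Domingos and Hulten) that decision rests on the Hoeffding bound, ε = sqrt(R² ln(1/δ) / (2n)), so the sketch below is added context, not part of the slides. The function names, the default `delta`, and the gain values are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that the true mean is within epsilon of the observed mean
    of n samples with probability at least 1 - delta."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n_classes, n_seen, delta=1e-7):
    """Split once the best attribute beats the runner-up by more than epsilon."""
    value_range = math.log2(n_classes)   # range of information gain
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n_seen)


# Assumed gains: the same 0.05 lead is not enough after 200 examples,
# but is enough after 5000 because epsilon shrinks as n grows.
print(should_split(0.30, 0.25, n_classes=2, n_seen=200))
print(should_split(0.30, 0.25, n_classes=2, n_seen=5000))
```

With these assumed gains the first call returns `False` and the second returns `True`: the leaf keeps collecting examples until the observed lead exceeds ε.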
![Page 9: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/9.jpg)
VFDT (cont...)
Fig: Diagram with the labels Data Stream, Gender, Car_Type, Age<30?, Car Type=Sports Car?, and Car Type=normal, with Yes/No branches.
![Page 10: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/10.jpg)
Challenges
• Infinite length
• Concept-drift
• Concept-evolution
• Feature-evolution
![Page 11: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/11.jpg)
Labels in the workflow diagram:
• The data stream is divided into equal-sized chunks (input).
• Classifier ensemble M.
• Outlier detection module: buffers outlier instances.
• Instances in the buffer are clustered.
• Each cluster is transformed into a pseudopoint data structure (centroid, weight, radius).
• Set of pseudopoints H.
• A q-NSC value is calculated and assigned to every instance in a pseudopoint.
• If tp is greater than the threshold, the corresponding classifier votes in favor of another class.
• Another instance algorithm.
Fig: Workflow for identifying concept evolution.
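The pseudopoint summary (centroid, weight, radius) named in the workflow can be sketched as below. This is a hedged illustration: the `Pseudopoint` class, the `summarize` and `is_outlier` helpers, and the sample points are hypothetical, and the radius is assumed to be the distance from the centroid to the farthest instance in the cluster. The q-NSC calculation and the voting step are not covered.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Pseudopoint:
    centroid: List[float]   # mean of the cluster's instances
    weight: int             # number of instances in the cluster
    radius: float           # assumed: distance from centroid to farthest instance


def summarize(cluster: Sequence[Sequence[float]]) -> Pseudopoint:
    """Replace a cluster's raw instances with a compact summary."""
    n = len(cluster)
    centroid = [sum(col) / n for col in zip(*cluster)]
    radius = max(math.dist(centroid, point) for point in cluster)
    return Pseudopoint(centroid, n, radius)


def is_outlier(instance: Sequence[float], pseudopoints: List[Pseudopoint]) -> bool:
    """Treat an instance as an outlier if it lies outside every pseudopoint."""
    return all(math.dist(instance, p.centroid) > p.radius for p in pseudopoints)


# Set of pseudopoints H built from two assumed clusters
H = [summarize([(0.0, 0.0), (2.0, 0.0), (1.0, 1.0)]),
     summarize([(10.0, 10.0), (12.0, 10.0)])]

print(H[0])
print(is_outlier((1.0, 0.5), H))
print(is_outlier((5.0, 5.0), H))
```

Storing three values per cluster in place of the raw instances keeps memory bounded, which matters when the stream has infinite length.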
![Page 12: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/12.jpg)
Feature-Evolution
![Page 13: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/13.jpg)
Applications
Applicable to many domains, such as:
• Intrusion detection systems
• Share market data
• Security monitoring
• Network monitoring and traffic engineering
• Business: credit card transaction flows
• Telecommunication calling records
• Web logs and web page click streams
![Page 14: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/14.jpg)
Conclusion
• In data stream classification, the VFDT algorithm can efficiently classify high-dimensional data, including assigning instances to another class.
• The technique for detecting another class rests on two key mechanisms: outlier detection and multiple-class detection.
![Page 15: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/15.jpg)
References
[1] M.M. Masud, Q. Chen, L. Khan, C.C. Aggarwal, J. Gao, and J. Han, “Classification and Adaptive Novel Class Detection of Feature-Evolving Data Streams,” IEEE Trans. Knowledge and Data Eng., vol. 25, no. 7, July 2013.
[2] D. Toshniwal and Yogita K, “Clustering Techniques for Streaming Data - A Survey,” Proc. 3rd IEEE Int’l Advance Computing Conf. (IACC), 2013.
[3] S. Hashemi, Y. Yang, Z. Mirzamomen, and M. Kangavari, “Adapted One-versus-All Decision Trees for Data Stream Classification,” IEEE Trans. Knowledge and Data Eng., vol. 21, no. 5, pp. 624-637, May 2009.
[4] A. Bifet, G. Holmes, B. Pfahringer, R. Kirkby, and R. Gavalda, “New Ensemble Methods for Evolving Data Streams,” Proc. ACM SIGKDD 15th Int’l Conf. Knowledge Discovery and Data Mining, pp. 139-148, 2009.
![Page 16: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/16.jpg)
References
[5] C.C. Aggarwal, “On Classification and Segmentation of Massive Audio Data Streams,” Knowledge and Information Systems, vol. 20, pp. 137-156, July 2009.
[6] M.M. Masud, J. Gao, L. Khan, J. Han, and B.M. Thuraisingham, “Classification and Novel Class Detection in Concept-Drifting Data Streams under Time Constraints,” IEEE Trans. Knowledge and Data Eng., vol. 23, no. 6, pp. 859-874, June 2011.
[7] M.M. Masud, Q. Chen, L. Khan, C. Aggarwal, J. Gao, J. Han, and B.M. Thuraisingham, “Addressing Concept-Evolution in Concept-Drifting Data Streams,” Proc. IEEE Int’l Conf. Data Mining (ICDM), pp. 929-934, 2010.
[8] M.-Y. Yeh, B.-R. Dai, and M.-S. Chen, “Clustering over Multiple Evolving Streams by Events and Correlations,” IEEE Trans. Knowledge and Data Eng., vol. 19, no. 10, pp. 1349-1362, Oct. 2007.
![Page 17: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/17.jpg)
Any Questions?
![Page 18: Data mining technique for classification and feature evaluation using stream mining(ranjit banshpal)](https://reader034.vdocument.in/reader034/viewer/2022051323/54825bb95906b5e2048b4647/html5/thumbnails/18.jpg)
THANK YOU