Sample of Data Security and Knowledge Discovery Research at the University of Texas at Dallas Dr. Bhavani Thuraisingham Dr. Latifur Khan Dr. Murat Kantarcioglu Dr. Kevin Hamlen September 20, 2007


Sample of Data Security and Knowledge Discovery

Research at the University of Texas at Dallas

Dr. Bhavani Thuraisingham
Dr. Latifur Khan
Dr. Murat Kantarcioglu
Dr. Kevin Hamlen

September 20, 2007


Outline

- Data and Applications Security
  - Information sharing, Geospatial data management, Surveillance, Secure web services, Privacy, Dependable information management, Intrusion detection
- Data Mining and Knowledge Discovery
  - Data Mining for Security Applications, Data Mining for Bioinformatics, Data Mining for Data and Software Quality


Research Group: Data and Applications Security

- Core Group
  - Prof. Bhavani Thuraisingham (Professor & Director, Cyber Security Research Center)
  - Prof. Latifur Khan (Director, Data Mining Laboratory)
  - Prof. Murat Kantarcioglu (joined Fall 2005, PhD Purdue)
  - Prof. Kevin Hamlen (peer-to-peer systems security, joined 2006 from Cornell U.)
- Students and Funding
  - Over 20 PhD students, 40 MS students (combined)
  - Research grants: Air Force Office of Scientific Research, NSF, NGA, Raytheon, - - - -


Vision 1: Assured Information Sharing

[Figure: component data/policy for Agency A, Agency B, and Agency C, each with a "Publish Data/Policy" link to the data/policy for the coalition.]

1. Friendly partners

2. Semi-honest partners

3. Untrustworthy partners

Research funded by two grants from AFOSR


Vision 2: Secure Geospatial Data Management

[Figure: Data Sources A, B, and C; a SECURITY/QUALITY layer; Tools for Analysts.]

- Semantic Metadata Extraction
- Decision Centric Fusion
- Geospatial data interoperability through web services
- Geospatial data mining
- Geospatial semantic web

Research supported by Raytheon on one grant; working on robust prototypes on a second grant


Vision 3: Surveillance and Privacy

Raw video surveillance data

Face Detection and Face Derecognizing system

Suspicious Event Detection System

Manual Inspection of video data

Comprehensive security report listing suspicious events and people detected

Suspicious people found

Suspicious events found

Report of security personnel

Faces of trusted people derecognized to preserve privacy


Example Projects

- Assured Information Sharing
  - Secure Semantic Web Technologies
  - Social Networks and game playing
  - Privacy Preserving Data Mining
- Geospatial Data Management
  - Secure Geospatial semantic web
  - Geospatial data mining
- Surveillance
  - Suspicious Event Detection
  - Privacy preserving Surveillance
  - Automatic Face Detection, RFID technologies
- Cross Cutting Themes
  - Data Mining for Security Applications (e.g., Intrusion detection, Mining Arabic Documents); Dependable Information Management


Social Networks

- Individuals engaged in suspicious or undesirable behavior rarely act alone
- We can infer that those associated with a person positively identified as suspicious have a high probability of being either:
  - Accomplices (participants in suspicious activity)
  - Witnesses (observers of suspicious activity)
- Making these assumptions, we create a context of association between users of a communication network
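
The "context of association" idea can be sketched as a graph query. This is only an illustration under assumed inputs: the (sender, receiver) log format, the user names, and the function names are hypothetical and do not come from the deck.

```python
from collections import defaultdict

def build_association_graph(communications):
    """Build an undirected graph from (sender, receiver) pairs."""
    graph = defaultdict(set)
    for a, b in communications:
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def associates_of(graph, flagged):
    """Users directly connected to any flagged user, excluding the flagged users."""
    result = set()
    for person in flagged:
        result |= graph.get(person, set())
    return result - set(flagged)

# Hypothetical communication log, for illustration only.
communications = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("dave", "erin"),
    ("carol", "alice"),
]
graph = build_association_graph(communications)

# Candidates to review as possible accomplices or witnesses of "bob".
candidates = associates_of(graph, {"bob"})
print(sorted(candidates))
```

Direct neighbors are the simplest notion of association; a real system would likely weight edges by frequency or recency of contact.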


Privacy Preserving Data Mining

- Prevent useful results from mining
  - Introduce “cover stories” to give “false” results
  - Only make a sample of the data available so that an adversary is unable to come up with useful rules and predictive functions
- Randomization and Perturbation
  - Introduce random values into the data and/or results
  - The challenge is to introduce random values without significantly affecting the data mining results
  - Give a range of values for results instead of exact values
- Secure Multi-party Computation
  - Each party knows its own inputs; encryption techniques are used to compute the final results
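
A minimal sketch of the randomization idea, assuming zero-mean uniform noise and synthetic data (both are illustrative choices, not the group's method). It shows that individual values are hidden while an aggregate such as the mean stays close to the original.

```python
import random
import statistics

def perturb(values, scale, rng):
    """Add independent zero-mean uniform noise in [-scale, scale] to each value."""
    return [v + rng.uniform(-scale, scale) for v in values]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic "age" column; not real data.
ages = [rng.randint(20, 60) for _ in range(10_000)]
noisy = perturb(ages, scale=15.0, rng=rng)

true_mean = statistics.fmean(ages)
noisy_mean = statistics.fmean(noisy)
print(f"true mean {true_mean:.2f}, perturbed mean {noisy_mean:.2f}")
```

Because the noise has zero mean, its effect on the average shrinks as the number of records grows, while each single record can be off by up to `scale`.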


Framework for Geospatial Data Security

[Figure: framework diagram; labels listed in reading order]

DATA PRESENTATION COMPONENTS
Access Control Module
Geospatial Data Registration: spatial and temporal registration of geospatial data
Data Integration Services & Data Repository Access
DATA ACCESS LAYER
DAC/RBAC Policy Specification
Policy Reasoning Engine
Trust & Privacy Management
Authentic Data Publication
Auditing
Misuse Detection
SECURITY LAYER
Open Geospatial Consortium Framework
Core & Application Schemas
Geospatial Features
Geography Markup Language
Metadata
GIS Web Services
Traditional GIS
Wrapper
Geospatial Data Repositories


Data Mining for Surveillance

- We define an event representation measure based on low-level features
- This allows us to define “normal” and “suspicious” behavior and classify events in unlabeled video sequences appropriately
- A visualization tool can then be used to enable more efficient browsing of video data
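
One way to picture the classification step is a distance-to-normal rule. This is a sketch under stated assumptions: the two-element feature vectors, the centroid model of "normal", and the threshold rule are hypothetical stand-ins, not the event representation measure used in the research.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(event, normal_centroid, threshold):
    """Label an event by how far its features lie from the 'normal' model."""
    if distance(event, normal_centroid) > threshold:
        return "suspicious"
    return "normal"

# Hypothetical low-level features per event, e.g. [mean speed, motion area].
normal_events = [[1.0, 0.2], [1.2, 0.25], [0.9, 0.18], [1.1, 0.22]]
model = centroid(normal_events)

# Assumed rule: three times the largest distance seen among normal events.
threshold = 3 * max(distance(e, model) for e in normal_events)

print(classify([1.05, 0.2], model, threshold))
print(classify([4.0, 0.9], model, threshold))
```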


Data Mining for Intrusion Detection / Worm Detection

[Figure: flow diagram with boxes Training Data, Hierarchical Clustering (DGSOT), SVM Class Training, Classification, Testing Data, and Testing.]

DGSOT: Dynamically Growing Self-Organizing Tree
SVM: Support Vector Machine


Intrusion Detection: Results

Training Time, FP and FN Rates of Various Methods

Method                  Average Accuracy  Total Training Time  Average FP Rate (%)  Average FN Rate (%)
Random Selection        52%               0.44 hours           40                   47
Pure SVM                57.6%             17.34 hours          35.5                 42
SVM + Rocchio Bundling  51.6%             26.7 hours           44.2                 48
SVM + DGSOT             69.8%             13.18 hours          37.8                 29.8
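
For readers unfamiliar with the metrics in the table, the sketch below uses the common textbook definitions of accuracy, false positive rate, and false negative rate. The slides do not state how the rates were computed, so these formulas are an assumption, and the confusion counts are made up for illustration.

```python
def rates(tp, fp, tn, fn):
    """Return accuracy, FP rate and FN rate as percentages.

    FP rate = FP / (FP + TN): share of normal traffic flagged as an attack.
    FN rate = FN / (FN + TP): share of attacks that were missed.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": 100 * (tp + tn) / total,
        "fp_rate": 100 * fp / (fp + tn),
        "fn_rate": 100 * fn / (fn + tp),
    }

# Hypothetical counts; these are not the experiment's data.
r = rates(tp=70, fp=20, tn=80, fn=30)
print(r)
```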


Information Assurance Education

Current Courses
- Introduction to Information Security: Prof. Sha
- Trustworthy Computing: Prof. Sha
- Cryptography: Profs. Sudborough, Murat
- Information Assurance: Prof. Yen
- Data and Applications Security: Prof. Bhavani Thuraisingham
- Biometrics: Prof. Bhavani
- Privacy: Prof. Murat Kantarcioglu
- Secure Language: Prof. Kevin Hamlen
- Digital Forensics: Prof. Bhavani Thuraisingham

Future Courses
- Network Security: Profs. Venkatesan, Sarac
- Security Engineering: Profs. Bastani, Cooper
- Intrusion Detection: Profs. Khan, Thuraisingham
- Digital Watermarking: Prof. Prabhakaran

Courses at AFCEA and AF Bases
- Knowledge Management, Data Mining for Counter-terrorism, Data Security (preparing a course on SOA and NCES with Prof. Alex Levis, GMU, and Prof. Hal Sorenson, UCSD)


Knowledge Discovery in Images

- Goal: Find unusual changes
- Process:
  - Use data mining to model normal differences between images
  - Find places where differences don’t match the model
- Questions to be answered:
  - What are the right mining techniques?
  - Can we get useful results?


Change Detection:

- Trained a neural network to predict the “new” pixel from the “old” pixel
  - Neural networks are good for multidimensional continuous data
  - Multiple nets give a range of “expected values”
- Identified pixels where the actual value is substantially outside the range of expected values
  - Anomaly if three or more bands (of seven) are out of range
- Identified groups of anomalous pixels
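
The per-pixel rule above can be sketched as follows. The ensemble predictions are hard-coded stand-ins for the outputs of the trained networks, and the `margin` parameter is an assumed way to express "substantially outside"; neither is taken from the slides.

```python
def expected_range(predictions):
    """Range spanned by the ensemble's predictions for one band."""
    return min(predictions), max(predictions)

def bands_out_of_range(actual, ensemble_predictions, margin=0.0):
    """Count bands whose actual value falls outside the widened expected range."""
    count = 0
    for band_value, preds in zip(actual, ensemble_predictions):
        low, high = expected_range(preds)
        if band_value < low - margin or band_value > high + margin:
            count += 1
    return count

def is_anomalous(actual, ensemble_predictions, margin=0.0, min_bands=3):
    """Apply the slide's rule: anomaly if three or more bands are out of range."""
    return bands_out_of_range(actual, ensemble_predictions, margin) >= min_bands

# Seven bands, three hypothetical network predictions per band.
ensemble = [[100, 102, 101] for _ in range(7)]

unchanged = [101, 101, 101, 101, 101, 101, 101]
two_bands_off = [101, 130, 60, 101, 101, 101, 101]
three_bands_off = [101, 101, 130, 60, 101, 140, 101]

for pixel in (unchanged, two_bands_off, three_bands_off):
    print(is_anomalous(pixel, ensemble, margin=5))
```

Grouping the flagged pixels into regions, the last step on the slide, would follow as a separate pass (for example connected-component labeling) and is not shown here.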


Multimedia/Image Mining

[Figure: Images -> Segments -> Blob-tokens]

Automatically annotate images then retrieve based on the textual annotations.


Web Page Prediction: Problem Description

[Figure: a user's path through Office of admission (P1), VIP web page (P2), and Financial Aid Information (P3), followed by an unknown page "?".]

What page is next?
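
The prediction problem can be made concrete with a first-order Markov model over user sessions. This is a minimal sketch: the sessions and the extra page identifiers P4 and P5 are invented for illustration, and the deck's actual predictor combines several models.

```python
from collections import Counter, defaultdict

def train_first_order(sessions):
    """Count page-to-page transitions observed in the sessions."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            transitions[current][nxt] += 1
    return transitions

def predict_next(transitions, current):
    """Return (most likely next page, estimated probability), or None if unseen."""
    counts = transitions.get(current)
    if not counts:
        return None
    total = sum(counts.values())
    page, count = counts.most_common(1)[0]
    return page, count / total

# Hypothetical sessions; P4 and P5 are made-up follow-on pages.
sessions = [
    ["P1", "P2", "P3", "P4"],
    ["P1", "P2", "P3", "P5"],
    ["P2", "P3", "P4"],
    ["P1", "P3", "P4"],
]
model = train_first_order(sessions)
print(predict_next(model, "P3"))
```

A first-order model looks only at the current page; higher-order models condition on longer histories at the cost of sparser counts.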


Web Page Prediction: Architecture

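[Figure: architecture diagram. User sessions feed a Markov model and a feature extraction step. Extracted features go to an SVM and an ANN; the SVM output and ANN output each pass through a sigmoid mapping. The Markov prediction, SVM prediction, and ANN prediction are combined by fusion using Dempster’s Rule to give the final prediction.]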
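
The fusion step can be illustrated with Dempster's rule of combination for two sources. The mass values below are hand-picked for the example, not outputs of the sigmoid mappings, and the page names P4 and P5 are hypothetical.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions whose keys are frozensets of hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}

P4 = frozenset({"P4"})
P5 = frozenset({"P5"})
EITHER = frozenset({"P4", "P5"})  # mass left uncommitted

# Hypothetical beliefs of two predictors about the next page.
m_svm = {P4: 0.6, P5: 0.1, EITHER: 0.3}
m_ann = {P4: 0.5, P5: 0.3, EITHER: 0.2}

fused = dempster_combine(m_svm, m_ann)
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 3))
```

To fuse three predictors, as in the architecture, the rule can be applied pairwise, since Dempster's rule is associative.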


Misuse/Misinformation/ Insider threat

- 50% of corporate breaches or losses of information that were made public in the past year were insider attacks
- 50% of those insider attacks were thefts of information by employees
- It is hard to model individuals!
- Role-based access control provides tools to model given roles
- Challenge: How to develop models for predicting normal usage of a role vs. misuse?
- Challenge: How to integrate misuse, auditing and access control systems?
- Current Status: We are developing a misuse detection system based on clustering; risk-based analysis
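
A clustering-based misuse detector could look roughly like the sketch below. It uses a small k-means as a stand-in clustering method (the slides do not name the algorithm), and the usage features, data, and threshold rule are all hypothetical.

```python
import math

def dist(a, b):
    """Euclidean distance between two usage vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as deterministic initial centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids

def misuse_score(session, centroids):
    """Distance from a session to the closest cluster of normal role usage."""
    return min(dist(session, c) for c in centroids)

# Hypothetical sessions for one role: [records read/hour, records exported/hour].
normal_usage = [[10, 0], [50, 2], [12, 1], [48, 3], [11, 0], [52, 2]]
centroids = kmeans(normal_usage, k=2)

# Assumed rule: flag anything three times farther out than any normal session.
threshold = 3 * max(misuse_score(s, centroids) for s in normal_usage)

for session in ([11, 1], [30, 40]):
    label = "misuse?" if misuse_score(session, centroids) > threshold else "normal"
    print(session, label)
```

Modeling the role rather than the individual follows the slide's point that roles are easier to model than people.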


Time Constrained KDD: Proposal to AFOSR with UIUC

- The military must continually carry out the following operations:
  - Surveillance: monitor the behavior of people or objects to see if they are deviating from the norm
  - Maneuver: place the enemy in a position of disadvantage through the flexible application of combat power
  - Mass: concentrate the effects of overwhelming combat power at the decisive place and time
  - Attack: an attempt to actively strike at the enemy, as opposed to a defensive plan
- Track the enemy and DETER him during the surveillance and maneuver stages through:
  - Knowledge Discovery: extract concepts from the stream data arriving from the sensors
  - Time Constrained Activity Analysis: extract knowledge from the enemy activities arriving in the form of streams
  - Ontology Management: develop ontologies, then conduct multi-modal analysis of the captured multimedia data and resolve conflicts and uncertainty
  - Resource Allocation: use the knowledge discovered, apply decision theories, and determine resource allocation


Some Experiences with Tools

- Tools developed in-house
  - Image mining tool, Data Sharing Tool
  - Intrusion detection/Malicious code detection tools, Web page prediction tool
  - Multimedia mining/Image extraction including MPEG7 feature descriptors
  - Cluster visualization tool
- External tools
  - Oracle data mining product
  - IDIS data mining tool
  - WEKA data mining tool
  - XML SPIE and QUIP
  - INTEL OpenCV


Technical and Professional Accomplishments

Publications of research in top journals and conferences, books
- IEEE Transactions, ACM Transactions, 8 books published and 2 books in preparation, including one on UTD research (Data Mining Applications, Awad, Khan and Thuraisingham)

Member of Editorial Boards / Editor in Chief
- Journal of Computer Security, ACM Transactions on Information and Systems Security, IEEE Transactions on Dependable and Secure Computing, IEEE Transactions on Knowledge and Data Engineering, Computer Standards and Interfaces - - -

Advisory Boards / Memberships / Other
- Purdue University CS Department, invitations to write articles in Encyclopedia Britannica on data mining, keynote addresses, talks at DFW NAFTA and Chamber of Commerce, commercialization discussions of data mining tools for security

Awards and Fellowships
- IEEE Fellow, AAAS Fellow, BCS Fellow, IEEE Technical Achievement Award, IEEE Senior Member


Our Model: R&D, Technology Transfer, Standardization and Commercialization

Basic Research (6-1 Type)
- Funding agencies such as NSF, AFOSR, NGA, - - - -, etc.; publish our research in top journals (ACM and IEEE Transactions)

Applied Research
- Some federal funding (e.g., from government programs) and commercial corporations (e.g., Raytheon); our current collaboration with AFRL-ARL

Technology Transfer / Development
- Work with corporations such as Raytheon to showcase our research to sponsors (e.g., GEOINT) and transfer research to operational programs such as DCGS

Standardization
- Our collaborations with OGC, OASIS and standardization of our research (e.g., GRDF)

Commercialization
- Patents, work with VCs, corporations, SBIR, STTR for commercialization of our tools (e.g., our work on data mining tools)


Our Vision for Assured Information Sharing/KDD

[Figure: hub diagram with "Assured Information Sharing/KDD" at the center, surrounded by:]
- Time constrained KDD (Future)
- Link Analysis (AFOSR, Texas)
- Game Theory (AFOSR)
- Dependable Information Management (Texas)
- Misinformation/Misuse (AFOSR)
- Geospatial (NGA, Raytheon)
- Semantic Web (NSF, AFOSR)
- Incentive based Knowledge management (Future)
- Privacy Preserving data mining (Texas)

Technologies will contribute to Assured Information Sharing


Our Collaborations in Assured Information Sharing and KDD

[Figure: hub diagram with "Assured Information Sharing/KDD" at the center, surrounded by:]
- Time Constrained KDD (UIUC)
- Link Analysis (UGA, UAZ)
- Game Theory (UTD Management School)
- Dependable Information Management (UCR, UTSA)
- Misinformation/Misuse (Purdue)
- Geospatial (UMN, UCD, Purdue, WVU, UCF)
- Semantic Web (UMBC, UTSA)
- Knowledge management (SUNY Buffalo)
- Privacy Preserving data mining (Purdue)