University of Texas at Dallas
Systems Group
Department of Computer Science
Erik Jonsson School of Engineering and Computer Science
The University of Texas at Dallas
April 23, 2007
Information about the Group
• Over 10 members
• Members of editorial boards of IEEE and ACM Transactions
• Advisory boards (e.g., Purdue University CS Department)
• Funding from NSF (including CAREER awards), AFOSR, ARO, DoD, NASA and corporations
• PhDs from prestigious universities including Cornell, Princeton, USC, Purdue, UNC
• IEEE/AAAS Fellows, Senior Members, Awards
• Keynote addresses at major conferences (e.g., ACM SACMAT 04, PAKDD06, IEEE Policy 07)
• Collaboration with leading researchers
– Purdue, UMBC, U of VA, GMU, UIUC, U of MN, GATech etc.
Technology Themes
• Our research focuses on core systems areas such as
– Embedded Systems, Distributed Systems and Networks, Data Management Systems
• We are also conducting extensive research in systems applications including
– Data Mining, Visualization, Graphics, Bioinformatics, Multimedia and Animation, Geospatial Information Management, and Wireless Computing
• Security cuts across all areas
– Data and applications security, Network security, Data Mining for Security Applications, Privacy, Secure languages, Embedded systems security, Secure data grid
Vision of the Systems Group
• Five-Pronged Approach to R&D in Systems and Applications
– 1. Basic research in systems ranging from complexity results to systems design
• Funding from NSF, AFOSR, ARO, etc.
– 2. Applied research: Large scale design and implementation projects (Alcatel, Raytheon, Nokia, Rockwell, etc.)
– 3. Technology Transfer: work with corporations such as Raytheon to transfer the research to Operational programs
– 4. Standards – work with organizations such as OGC, W3C to transfer research to standards
– 5. Commercialization: Work with Office of Sponsored Research to commercialize our tools (e.g., Data Mining for security)
Embedded Systems & Security
Edwin Sha
Timing & Memory Optimization; HW/SW for Security
• Billions of units produced yearly, versus millions of desktop units
• Application-specific: more parallel, heterogeneous, networked
• Tightly constrained: low cost, low power, small memory
• Real-time & secure
• Need both hardware & software: need design automation and optimization across compiler, OS, and hardware
• Timing: all the instructions in a loop nest can be executed in parallel
• Power: switching activity is reduced by 42.8%
• Program size: code-size reduction technique reaches 50% reduction
• Security: Hardware/Software Defender protects systems from buffer-overflow attacks
http://www.utdallas.edu/~edsha
• Timing optimization for loops: Develop retiming and multi-dimensional (MD) retiming. All the instructions in a loop nest can be executed in parallel.
• Hiding memory latency: the CPU is fast; memory is slow. Prefetch data before they are required. Combine with partitioning and iterational retiming to completely hide memory latencies.
• Protection from buffer-overflow attacks
– Problems: protection capability, overhead
– Solution: Hardware/Software Defender (HSDefender).
• Intrusion detection for known worms & viruses
– Problems: performance
– Solution: very high-performance specialized parallel architectures.
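The retiming idea on this slide can be illustrated with a minimal sketch. The two-statement loop, the array names, and both function names are hypothetical and chosen only for illustration; this is a one-dimensional toy, not the group's MD retiming algorithm. Statement S1 is shifted one iteration ahead, so the two statements left in the loop body no longer depend on each other and could run in parallel.

```python
def original_loop(b):
    """Reference loop: S2 depends on S1 within the same iteration."""
    n = len(b)
    a = [0] * n
    c = [0] * n
    for i in range(n):
        a[i] = b[i] + 1      # S1
        c[i] = a[i] * 2      # S2 reads the value S1 just wrote
    return a, c


def retimed_loop(b):
    """Same computation after moving S1 one iteration ahead (retiming)."""
    n = len(b)
    a = [0] * n
    c = [0] * n
    if n == 0:
        return a, c
    a[0] = b[0] + 1                  # prologue: first instance of S1
    for i in range(n - 1):
        c[i] = a[i] * 2              # S2 of iteration i, uses a[i] from an earlier pass
        a[i + 1] = b[i + 1] + 1      # S1 of iteration i + 1, independent of the line above
    c[n - 1] = a[n - 1] * 2          # epilogue: last instance of S2
    return a, c
```

Both functions return the same arrays; only the placement of S1 relative to the iteration boundary changes, at the cost of a short prologue and epilogue.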
Visual Languages and Communications
Kang Zhang
Scientific/Technical Approaches
• Develop a spatial graph grammar formalism with efficient parsing
• Build a graph induction engine
• Add semantics to UML diagrams
• Design intuitive and effective graph visualization and navigation algorithms (e.g., graph labeling, mobile browsing)
• Learn from visual arts and design for aesthetic information visualization and user interfaces
Accomplishments
• Proposed a context-sensitive graph grammar formalism with polynomial-time parsing
• Applied graphical specification and reasoning to various application domains
• Developed a visual data clustering and noise removal system
Challenges
• Measurement/evaluation of aesthetics and visual effectiveness; Usability; Scalability
Objectives
• Build a Theoretical Foundation for Visual Specification and Reasoning
• Apply Visual Techniques to Data Engineering
• Enhance Information Access on Mobile Devices
• Promote Aesthetic Aspects of Visualization for High Usability
Funding: NSF ITR: 216K + proposal submitted; Scholarship grants: NSF CSEMS, DoEdu GAANN
Figure: Round-Trip Visual Engineering. Labels: Visual Languages (Graph Grammars), Model-Driven Engineering, Data Interoperation, Information Visualization, Mobile Display, Multimedia Authoring, Visual Arts & Design, Applications.
Next Generation Prolog Systems
Gopal Gupta
Rationale
• Research in logic programming is driven by the quest to find the optimal computation rule
– select clauses in optimal order
– select goals in optimal order
• Tabling/parallelism allows optimal clause order
• Deterministic coroutining/constraints allow optimal goal order
• Coinductive LP/ASP adds further power
Accomplishments
• Developed coinductive logic programming and efficient ways to implement it.
• Developed a scalable, easy-to-realize parallel implementation on the Beowulf architecture.
• Developed an easy-to-realize implementation for tabled logic programming
• Developed methods for goal-directed execution of answer set programs (non-monotonic reasoning).
Objectives
• Develop the next generation of Prolog system that integrates various recent advances:
– Finite Domain Constraints
– Tabled Logic Programming
– Coinductive Logic Programming
– Answer Set Programming (ASP)
– Deterministic coroutining
– Parallelism (via multicores)
Approach
• Develop simple-to-implement approaches (else the implementation becomes too complex).
• Use an existing Prolog engine (GNU Prolog)
• Exploit parallelism on multicore machines
Applications
• Model checking and verification
• Non-monotonic reasoning
• Semantic web reasoning engines
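To make the role of tabling concrete, here is a minimal Python sketch. It is a naive fixpoint iteration standing in for a real tabled (SLG) engine, and the function name and edge data are hypothetical. It answers the left-recursive program `path(X,Y) :- path(X,Z), edge(Z,Y).` / `path(X,Y) :- edge(X,Y).`, on which plain depth-first Prolog execution would loop, by recording answers in a table until no new ones appear.

```python
def tabled_path(edges, start):
    """Return all Y such that path(start, Y) holds.

    The table stores answers found so far for the call path(start, _).
    Each round re-evaluates both clauses against the current table and
    stops when a round adds nothing new (a fixpoint).
    """
    table = set()
    changed = True
    while changed:
        changed = False
        # Clause 2: path(X, Y) :- edge(X, Y).
        new = {y for (x, y) in edges if x == start}
        # Clause 1: path(X, Y) :- path(X, Z), edge(Z, Y), with Z taken from the table.
        new |= {y for z in table for (x, y) in edges if x == z}
        if not new <= table:
            table |= new
            changed = True
    return table
```

On a cyclic graph such as a → b → c → a the loop still terminates, because revisiting a node produces no answer that is not already in the table.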
Assured Information Sharing
Bhavani Thuraisingham, Latifur Khan, Murat Kantarcioglu
Scientific/Technical Approach
• Conduct experiments to determine how much information is lost as a result of enforcing security policies in the case of trustworthy partners
• Develop more sophisticated policies based on role-based and usage-control-based access control models
• Develop techniques based on game-theoretic strategies to handle partners who are semi-trustworthy
• Develop data mining techniques to carry out defensive and offensive information operations
Accomplishments
• Developed an experimental system for determining information loss due to security policy enforcement
• Developed a strategy for applying game theory to semi-trustworthy partners; simulation results
• Developed data mining techniques for conducting defensive operations for untrustworthy partners
Challenges
• Handling dynamically changing trust levels; Scalability
Objectives
• Develop a Framework for Secure and Timely Data Sharing across Infospheres.
• Investigate Access Control and Usage Control policies for Secure Data Sharing.
• Develop innovative techniques for extracting information from trustworthy, semi-trustworthy and untrustworthy partners.
Funding: AFOSR: 306K + 120K + proposal submitted; Matching funds from dean
Figure: Components holding Data/Policy for Agency A, Agency B and Agency C each publish data/policy to a shared Data/Policy for Coalition.
Malicious Code Detection using Data Mining
Latifur Khan and Bhavani Thuraisingham
Scientific/Technical Approach
• Develop a hybrid data mining approach to detect malicious executables. Important features of malicious and benign executables are identified and used to train classifiers
• Three sets of features are extracted: binary features are extracted from the binary executables; assembly features are extracted from disassembled executables; function call features are extracted from program headers.
Accomplishments
• Developed a tool that can detect malicious executables in near real time.
Future Work
• Detect malicious executables in real time with a very low false alarm rate
• Extend this work to detect buffer overflow by discriminating messages containing code (i.e., attack messages) from messages containing no code (i.e., non-attack messages)
Objectives
• Develop a framework for malicious code detection
• Overcome shortcomings of traditional approaches, which are signature-based and not effective against “zero day” attacks
• Proposed innovative framework will be deployed in untrustworthy partners
Funding: AFOSR: 306K + proposal submitted; Matching funds from dean
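As a rough illustration of what "binary features" might look like, the sketch below extracts byte n-grams from raw executable bytes and turns them into a presence/absence vector that a classifier could consume. The use of n-grams, the function names, and the tiny vocabulary are assumptions made for this example; the slide does not specify the actual feature definition.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 4) -> Counter:
    """Count every window of n consecutive bytes in the input."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def feature_vector(data: bytes, vocabulary, n: int = 4):
    """Binary feature vector: 1 if the n-gram occurs in data, else 0.

    `vocabulary` is a hypothetical, pre-selected list of n-grams, for
    example those that best separate malicious from benign training files.
    """
    grams = byte_ngrams(data, n)
    return [1 if g in grams else 0 for g in vocabulary]
```

A vector of this kind, possibly concatenated with assembly and function-call features, is what would be handed to the trained classifier.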
Geospatial Information Management for National Security
Latifur Khan and Bhavani Thuraisingham
Scientific/Technical Approach
• Develop Semantic Web Services: the conjunction of two powerful technologies, the Semantic Web and Web Services
• Semantic Web Services provide the richer semantics required for automation of service discovery, selection and execution tasks
• Develop Geo Service Discovery and dynamic compositions to integrate geospatial information services by exploiting OWL-S to describe Web services
Accomplishments
• Developed a tool that can handle certain types of queries with a limited number of geospatial and non-geospatial data sources
Future Work
• Complete a toolkit that can handle a complex query automatically and effectively on the fly from a significant number of geospatial and non-geospatial data sources
• Extend this for national security data analysis
Objectives
• Develop a framework for geospatial data integration to incorporate geospatial data sources and other sources
• Framework will facilitate standard metadata that describes geospatial repositories and a coherent mechanism to connect repositories
– Seamless integration of geospatial and non-geospatial information with minimal human intervention
– (a sample query: “Find movie theaters within 30 miles of 75080”)
• Funding: Raytheon: 200K + proposal submitted; Matching funds from dean
Figure: DAGIS service composition for the sample query. Step labels: 1. Query, 2. Service Discovery, 3. Compose, 4. Construct Sequence, 5. Return Dynamic Service URI (further labels: Profile, Selection). Components: Client, DAGIS Agent, Match-Maker, DAGIS Composer, Composer/Sequencer. Example service chain: Zipcode Finder (Richardson, TX), then Theater Finder (30 Miles), returning Theaters.
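The sample query can be read as a composition of two services, a zipcode finder followed by a theater finder. The sketch below mimics that chain with plain Python functions. The coordinates, theater names, and function names are all hypothetical placeholders, and the great-circle (haversine) distance is an assumed choice; the actual DAGIS services and their OWL-S descriptions are not shown in the slides.

```python
import math

# Hypothetical lookup tables standing in for remote geospatial services.
ZIP_CENTROIDS = {"75080": (32.97, -96.74)}          # approximate, for illustration only
THEATERS = [("Near Theater", 33.07, -96.74),
            ("Far Theater", 29.76, -95.37)]


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    earth_radius_miles = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_miles * math.asin(math.sqrt(a))


def zipcode_finder(zipcode, centroids=ZIP_CENTROIDS):
    """Service 1: map a ZIP code to a (latitude, longitude) pair."""
    return centroids[zipcode]


def theater_finder(center, radius_miles, theaters=THEATERS):
    """Service 2: names of theaters within radius_miles of center."""
    lat, lon = center
    return [name for name, t_lat, t_lon in theaters
            if haversine_miles(lat, lon, t_lat, t_lon) <= radius_miles]


def find_theaters(zipcode, radius_miles):
    """Composed sequence: the output of service 1 feeds service 2."""
    return theater_finder(zipcode_finder(zipcode), radius_miles)
```

In the framework described on the slide, the sequence would be discovered and composed automatically from service descriptions rather than hard-coded as it is here.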
Securing Critical Information
I-Ling Yen
Objectives
Many data-intensive applications hosting critical data
– Data grid
– Large-scale distributed database
How to secure these systems under a hostile Internet environment
– Secure storage
– Secure operations on the data
Problem Statements
No matter how good the intrusion detection systems are, adversaries always manage to penetrate the system
– Need to support intrusion tolerance
– Even if the system is compromised, critical information can still stay secure
Simple encryption won’t work
– In storage systems: key management issues
– In data applications: data need to be decrypted when operated on
Data Grid
Developed data grid storage systems
– Combine secure sharing and replication to achieve security, availability, and integrity
– Efficient data placement algorithms for allocating data shares and their replicas to achieve the best access performance
Operating on Encrypted Data
Developed search algorithm to support the processing of search queries on encrypted data
Developed new encryption algorithms to allow secure computation on secret data
Need to integrate these algorithms in systems while ensuring overall system security
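One way to picture "data shares" is secret sharing, where a record is split so that no single storage node holds usable data. The sketch below is a minimal n-of-n XOR scheme. It assumes that the slide's shares are of this general kind; the function names are hypothetical and this is not the group's actual storage design.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret: bytes, n: int):
    """Split secret into n shares; all n are needed to recover it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are uniformly random; the last one makes the XOR come out right.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = reduce(xor_bytes, shares, secret)
    return shares + [last]


def recover_secret(shares):
    """XOR all shares together to rebuild the original secret."""
    return reduce(xor_bytes, shares)
```

Because every share is required here, sharing alone gives confidentiality but not availability, which is why it would be paired with replication of individual shares as the Data Grid items describe. A threshold scheme would relax the all-shares requirement.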
Data Integrity, Quality and Provenance for Command and Control Applications
Murat Kantarcioglu and Bhavani Thuraisingham
Scientific/Technical Approach
• Develop an integrity and provenance policy language
• Develop a risk management based approach that considers risks due to data provenance
• Apply game theoretical and incentive based techniques to enforce honest behavior in policy enforcement
Accomplishments
• Developed a comprehensive architecture for an integrity control system
• Developed an integrity policy language
• Developed an initial approach to risk evaluation
Challenges
• Developing techniques against malicious behavior
Objectives
• Reduce the complexity of the data integrity assurance process
• Develop tools to decide whether to “admit” data into a database
• Develop techniques to analyze the confidence of query results based on data provenance
Funding: AFOSR: 300K; Matching funds from dean (Joint work with Elisa Bertino from Purdue University)
Figure: integrity control system architecture. Labels: Access Request, Access Controller (Conventional Access Controller), Integrity Controller, Integrity Validator, Integrity Policy Supplier, Integrity Metadata Repository, Integrity Policy Repository, Access Control Results.
Privacy-Preserving Data Mining
Murat Kantarcioglu and Bhavani Thuraisingham
Scientific/Technical Approach
• Develop secure multi-party computation based approaches for distributed data mining tasks under different adversarial assumptions
• Develop perturbation based approaches for individually adaptable privacy preservation
• Develop statistical methods to measure privacy loss due to data mining results
• Develop a cryptographic framework for using data mining results privately
Accomplishments
• Showed that various distributed data mining protocols could be implemented using a few specific secure protocols (see the figure)
• Developed a perturbation technique that allows individuals to choose their own privacy level
• Developed various secure tools for enabling privacy preserving data mining.
Challenges
• Relative inefficiency of cryptographic techniques, accuracy loss in perturbation based approaches
Objectives
• Learn data mining results without disclosing the private data
• Measure privacy loss due to data mining results
• Explore possible trade-offs between privacy, efficiency and accuracy
• Devise techniques to use data mining results privately
Figure: data mining on horizontally partitioned data, built from specific secure tools
Specific Secure Tools
• Secure Sum
• Secure Comparison
• Secure Union
• Secure Logarithm
• Secure Polynomial Evaluation
Data Mining on Horizontally Partitioned Data
• Association Rule Mining
• Decision Trees
• EM Clustering
• Naïve Bayes Classifier
• K-NN Classifier
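Of the secure tools listed, Secure Sum is the simplest to sketch. The simulation below follows the textbook ring protocol: the initiating site adds a random mask, every site adds its private value to the running total it receives, and the initiator removes the mask at the end. It assumes semi-honest, non-colluding sites and a true sum smaller than the modulus; the function name and default modulus are hypothetical, and the whole ring is simulated in one process rather than over a network.

```python
import secrets


def secure_sum(private_values, modulus=2**32):
    """Simulate a ring-based secure sum over the sites' private values.

    Each intermediate total is offset by a uniformly random mask, so a
    site that sees only the incoming total learns nothing about the
    values added by the sites before it.
    """
    mask = secrets.randbelow(modulus)      # known only to the initiating site
    running = mask
    for value in private_values:           # one hop around the ring per site
        running = (running + value) % modulus
    return (running - mask) % modulus      # initiator strips the mask
```

For example, three sites holding local counts 5, 17 and 20 would jointly obtain 42 without any site revealing its own count. Global support counts in association rule mining are one place such a sum is needed.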
Classification and Prediction Models for Mining Spatial Data
Weili Wu
• Motivation and Application
Historical Examples:
– London Asiatic Cholera 1854 (Griffith)
– Dental health and fluoride in water, Colorado early 1900s
Current Examples:
– Crime hotspots (NIJ CML, police patrol)
– Environmental justice (EPA), fair lending practices
– Location aware services (Defense: Sensor networks, Mobile ad-hoc networks)
– Ecology: Spatial habitat model
• Funding
– NSF 300K + Matching funds from dean
• Research Problem Formulation
Given:
1. Spatial framework: $S = \{s_1, \ldots, s_n\}$
2. Explanatory functions: $f_{X_k} : S \to \mathbb{R}$
3. A dependent class: $f_C : S \to C = \{c_1, \ldots, c_M\}$
4. A family of function mappings: $\mathbb{R} \times \cdots \times \mathbb{R} \to C$
Find: Classification model: $\hat{f}_C$
Objective: maximize classification_accuracy$(\hat{f}_C, f_C)$
Constraints: Spatial autocorrelation exists
• Accomplishments:
– Developed an efficient spatial-temporal model to analyze geospatial data.
– Developed a new spatial similarity measure to build a more advanced model.
– Developed a new efficient search algorithm.
Dependable Distributed Systems
Neeraj Mittal
Objectives
Develop novel algorithms for monitoring executions of distributed systems.
Develop new algorithms for effective sharing of resources.
Challenges
Asynchronous system with no global clock or shared memory.
Processes and channels may be unreliable.
Processes may join and leave the system at any time.
Scientific Accomplishments
Developed algorithms for detecting stable properties (e.g., termination) under a variety of conditions:
– processes may fail by crashing
– failed processes may recover
Developed efficient algorithms for group mutual exclusion.
Future Work
Monitoring algorithms when the system is dynamic.
Resource management algorithms when processes and/or channels may be unreliable.
Key Management in Sensor Networks
Neeraj Mittal
Objectives
Develop novel schemes for securing communication in sensor nodes deployed in hostile territory.
Communication between two sensor nodes may need to be protected against snooping by another node.
Challenges
Sensor nodes have limited resources.
Wireless communication is vulnerable to eavesdropping.
Sensor nodes are vulnerable to physical captures.
Scientific Accomplishments
Developed novel schemes for pre-distributing keys among sensor nodes under a variety of conditions:
– limited deployment knowledge is available
– some sensor nodes may be malicious
Future Work
Dynamically refresh the keys stored at uncompromised sensor nodes.
Protect against new malicious sensor nodes joining the network.
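For readers unfamiliar with key pre-distribution, the following sketch shows the basic random scheme from the literature: before deployment each node is loaded with a random subset (a key ring) of a large key pool, and two neighbors can secure their link only if their rings intersect. This generic scheme and all names and parameters below are illustrative assumptions, not the specific schemes developed by the group.

```python
import random


def predistribute(num_nodes, pool_size, ring_size, seed=None):
    """Give each node a random key ring of key identifiers drawn from the pool."""
    rng = random.Random(seed)
    pool = range(pool_size)                     # key identifiers 0 .. pool_size - 1
    return [set(rng.sample(pool, ring_size)) for _ in range(num_nodes)]


def shared_key(ring_a, ring_b):
    """Return a key identifier both nodes hold, or None if they share none."""
    common = ring_a & ring_b
    return min(common) if common else None
```

Larger rings raise the chance that two neighbors share a key, but also increase what an adversary learns from physically capturing a single node, which is the trade-off behind the challenges listed on this slide.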
Computational Systems Biology through Mining High Throughput Data
Ying Liu
Scientific/Technical Approaches
• Biological networks are modular
• Use random forests to integrate heterogeneous data
• New formulation for heavy sub-graph mining
• Design graph mining algorithms
• Propose new metrics to measure dense sub-graphs
Accomplishments
• Integrated 7 different types of data to construct protein-protein interaction networks
• Formulated the heavy sub-graph discovery problem as a quadratic function
• Proposed new graph mining algorithms based on evolutionary computing and neural networks
Challenges
• Large-scale data size; the heavy sub-graph discovery problem is NP-complete.
Objectives
• Design efficient algorithms for biological network inference
• Integrate heterogeneous biological data
• Decompose biological networks into functional modules
• Discover functional hierarchy from biological networks
Physically-Based Deformable Models
Xiaohu Guo
Scientific/Technical Approach
• Investigate the theoretical foundations for quasi-conformal surface mapping and harmonic volumetric mapping
• Based on the regular parametric domains induced by geometric mapping, develop a GPU-accelerated framework including a real-time PDE/ODE solver, collision detection, and volume rendering
• Given the regular parametric domains (i.e., geometry images), use image-based (2D/3D) compression and streaming techniques for efficient transmission of deformable models.
Potential Applications
• Surgical training and dynamic simulation of human tissues/muscles under interactive manipulation
• 3D model registration and target localization in medical imaging, based on deformable models
Challenges
• Collaborative manipulation by multiple users will result in data incoherency at different client sites; deformable model decomposition techniques can be further investigated
Objectives
• Develop a physically-based simulation and visualization platform for deformable models, which can perform dynamic simulation, collision detection, and material property visualization in real time.
• Investigate physically-based deformable models under a networked collaborative virtual environment.
Figure: Harmonic Surface and Volumetric Mapping; Geometry Images; GPU-Accelerated Deformable Models (GPU-Accelerated PDE/ODE Solver, GPU-Accelerated Collision Detection, GPU-Accelerated Volume Rendering); Deformable Models Compression and Network Streaming.
Language-based Software Security
Kevin W. Hamlen
Objectives
Develop systems for safe execution of mobile code from untrusted sources
Support low-level binary formats, legacy languages, etc.
Provide formally provable security guarantees (e.g., using type theory)
Challenges
When source is untrusted, code signing doesn’t help
Static analyses useful when possible, but interesting security properties are undecidable
In-lined Reference Monitors are sufficiently powerful, but need formal proof techniques to guarantee safety
Scientific Accomplishments
Developed the first certified In-lined Reference Monitoring system
– fully automatic program-rewriter for managed .NET
– all generated code has a machine-checkable soundness proof
Future Work
Support lower level binary formats (x86 machine code rather than .NET bytecode)
Reduce disconnect between theory & implementation by creating smaller verifiers (e.g., logic programming)
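The idea of an in-lined reference monitor can be sketched in a few lines, with the caveat that a real IRM rewrites binaries or bytecode, whereas this toy wraps Python callables. The policy ("no send after a read"), the class, and the function names are hypothetical and chosen only to show how a security automaton is consulted before each security-relevant operation.

```python
class PolicyViolation(Exception):
    """Raised when the untrusted code attempts a forbidden operation."""


class Monitor:
    """Security automaton for the policy: no 'send' once a 'read' has happened."""

    def __init__(self):
        self.has_read = False

    def check(self, event):
        if event == "read":
            self.has_read = True
        elif event == "send" and self.has_read:
            raise PolicyViolation("send attempted after read")


def inline_monitor(monitor, event):
    """Return a decorator that inserts a policy check before the wrapped operation."""
    def wrap(operation):
        def guarded(*args, **kwargs):
            monitor.check(event)           # the guard in-lined ahead of the call
            return operation(*args, **kwargs)
        return guarded
    return wrap
```

A rewriter would apply such guards to every security-relevant call site automatically; certifying the result then amounts to verifying that no such call remains unguarded.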
Multimedia Systems and Networking
• 3D Motions: Motion capture and gesture sensor data
• 3D Models: Educational instructions, Role playing games
• For delivery (streaming): focus on wireless networks
• Biomedical Applications
– Physical Medicine & Rehabilitation
– Parkinson’s and other Neurological Diseases
– Study Dynamics of Human Anticipation
• Security Applications
– Emergency Handling: Streaming Animated Instructions over PDAs, Laptops on Wireless Ad-hoc Networks
– Optimal Sensor Placement, Suspicious activity identification.
• Arts and Technology
– Copyright Protection: Watermarking of 3D Models and Captured Motions
– Reusability of Models and Motions
• Funding from NSF CAREER and ARO
B. Prabhakaran
Our Directions and Plans
• Each technology area is making very good technical progress
• Will continue to enhance our research and follow the five pronged model
• Also plan on developing interdisciplinary projects
– within the Group
– Across the Groups
– Across UTD and Partners (e.g., School of Management, UT Southwestern Medical Center)
• Continue to increase the number of Fellows, Board members, Keynote talks etc.
• Center Scale Project is our major goal