the evolution of dependable computing at the university of ...the evolution of dependable computing...
TRANSCRIPT
-
The Evolution of Dependable Computing at the University of Illinois
Dedicated toProfessor Algirdas Avizienis
R. Iyer, W. Sanders, J. Patel, Z. KalbarczykCenter for Reliable and High-Performance Computing
Department of Electrical and Computer Engineering andCoordinated Science Laboratory
University of Illinois at Urbana-Champaign
-
Long Association with FTCS (now DSN) Al Aviziennis
Ph.D. Illinois, Founder FTCS and IFIP WG 10.4
Program Chair or General ChairGary Metze, faculty, FTCS-2, FTCS-4John Hayes, Ph.D. Illinois, FTCS-7Jacob Abraham, faculty, FTCS-11, FTCS-19, IEEE TC on Fault-Tolerant Computing chair
Ravi Iyer, faculty, FTCS-19, FTCS-25, IEEE TC on Fault-Tolerant Computing chairKent Fuchs, faculty, FTCS-27IEEE TC on Fault-Tolerant Computing chairRam Chillarege, Ph.D. Illinois, FTCS-28Bill Sanders, faculty, FTCS-29 IEEE TC on Fault-Tolerant Computing chairZbigniew Kalbarczyk, Ph.D., DSN-02Tim Tsai, Ph.D. Illinois, DSN-04
-
Early Developments –Illiac-I and Illiac-II
During 1950’s and 1960’s Illiac-I and Illiac-II
Frequent failures of vacuum tubes jump started the early work in fault diagnosis
A subtle design bug in the arithmetic unit, (-2) * (-2) giving (-4), escaped the tests
caught after about nine months by a numerical double-check built into a user’s program
While no attempt was made to model faults systematically, the handshake mechanism used in the ALU control exhibited the basic idea of what is now called self-checking operation.
Avizienis conducts early research on parallel computer arithmetic
-
Pioneering Work of Seshu and Metzein Test and Diagnosis
Sequential Analyzer (Seshu), which included a set of programs for
automatic generation of fault simulation data (stuck-line faults) for a given logic circuit and test sequence
automatic generation of test sequences for combinational and sequential circuits.
studying computer self-diagnosis
Sequential analyzer was applied in at Bell Telephone Laboratories to improve diagnosis procedures for ESS-1
Chang, Manning, and Metze produced, as a tribute to Seshu; probably the first book devoted entirely to digital fault diagnosis.
PMC (Preparata, Metze, Chien) model for computer self-diagnosis
-
Automatic Test Pattern Generation (ATPG) – from Research to Product
First commercial ATPG in 70’s (Marlett) and then several updated versions through 80’s and 90’s
Founding ATPG company Sunrise Test Systems, now a part of Synopsis (Niermann and Patel)
Commercialization of PROOFS, the fastest, most memory efficient fault simulation algorithm, in 1990
PROOFS now used in all commercial ATPG tools
-
Testing of multi-Million gates –Scan Design
Early Scan Design
NEC, IBM and William-Angel of Stanford, all claim to be the first to propose Scan
Others claim that the first proposal of the scan idea was in the book by Chang, Manning and Metze of Illinois!
Today large circuits with close to a million flip-flops in scan chain requires enormous test time and data
Illinois Scan Architecture (1999-2002)
Reduces test time and test data by factors of 100 or more with no logic overhead!
Already in use at IBM, Intel, Syntest and others
-
Illinois Scan Architecture
internal scan chains
externalscan-in
pins
outputcompactor
-
Functional-Level Test GenerationMemory testing using functional-level fault model (i.e., without the availability of information about their internal structure)
the initial fault model for memories included stuck bits in the memory and coupling between cells in a memory.
Test generation procedures for microprocessors based on an extrapolation of the approach to testing memories
A general graph-theoretic model at the register-transfer level to model microprocessors using only information about its instruction set and the functions performed. A fault simulation study on a real microprocessor showed extremely good fault coverage for tests developed using these procedures.
-
Detection & Recovery
-
Self-Checking and Time Redundancy
Pioneering work of Carter led Anderson and Metze (at Illinois) to formulate Totally Self Checking (TSC) circuits; subsequent work led to introduction of Strongly Fault Secureproperty
Triggered widespread research in academia and industry in the 70’s and 80’s
Metze proposed time redundancy techniques for checking errors based on Alternating Logic; subsequently enhanced as RESO (Recomputing with Shifted Operands)
-
Algorithm-based Fault Tolerance –ABFT and Checkpointing
ABFTMatrix encoding schemes for detecting and correcting errors when matrix operations are performed by processor arrays
Generalized to linear arrays, Laplace equation solvers, and FFT networks
Ideal for low-cost fault tolerance for special-purpose computations, including signal-processing applications
Explored by a large number of researchers,
more recently as part of the REE program at JPL
CheckpointingAsynchronous checkpointing for distributed systems optimized for space overhead and performance
Compiler-assisted multiple instructions rollback scheme to aid in speculative execution repair in microprocessors
Low-overhead coordinated checkpointing for long-running parallel and high-availability applications
RENEW toolset for rapid development and testing of checkpoint protocols
-
Current Directions
-
Hierarchical Application Aware Error Detection and Recovery
Parity, ECC at processor level Control logic protection
Reliable ultra-fast interconnect servicesApplication-aware programmable
checks
Application Support
Compiler Support
Operating System Support
Hardware Support
Robust (self-checking) middlewareAPI for High-Dependability support
Control flow checking and data audit
Application and/or OS instrumentation
to customize and invoke FT mechanisms
Kernel Health Monitor,Application transparent checkpointing
Security services, e.g., runtime memory randomization
Compiler generated assertions
HW broadcast
Randomizing memory layout
Instructions to invoke hardware-
levelerror checks
-
Middleware and Hardware Frameworks for Fault Tolerance and Security
Hardware Platform
ARMORs
Application
Operating System
ExecutionARMOR
ARMORInterface
ApplicationHeartbeatARMOR
DaemonARMOR
Node 1
Hardware Platform
ARMORs
Application
Operating System
ExecutionARMOR
ARMORInterface
ApplicationHeartbeatARMOR
DaemonARMOR
Node 1
ARMOR are multithreaded processes composed of replaceable building blocks – elements, which implement error detection, recovery policies, security services, and management of runtime environment.
ARMOR – High Availability and Security Middleware
RSE – Reliability and Security EngineInstruction-
Cache Instruction-FetchBranch
Predictor
Instruction-AddressTranslation Buffer
4-entryFetch Buffer
Dispatcher
InstructionDecoder
RegisterFile
Load / StoreUnit
IntegerUnit
Multiply / Divide Unit
Branch Resolve Unit
ReorderBuffer
Write-Buffer
Data-Cache
Commit-UnitData-AddressTranslation Buffer Arbiter
Instruction-Cache Instruction-Fetch
BranchPredictor
Instruction-AddressTranslation Buffer
4-entryFetch Buffer
Dispatcher
InstructionDecoder
Dispatcher
InstructionDecoder
RegisterFile
Load / StoreUnit
IntegerUnit
Multiply / Divide Unit
Branch Resolve Unit
ReorderBuffer
Write-Buffer
Data-Cache
Commit-UnitData-AddressTranslation Buffer Arbiter
ROBAllocPtr1 ( i )ROBAllocPtr2 ( i )
InstrReg1InstrReg2
MUX3
MDUDataOutALUDataOutLSUEffAddr
LoadFromALU ( i )LoadFromMDU ( i )LoadFromLSU ( i )
LSUDataOut
LoadFromLSU ( i )
MUX5
LSUSrc2ALUSrc2MDUSrc2
IssueLSU ( i )IssueALU ( i )IssueMDU ( i )
MUX2
InstructionCheckerModule
MemoryLayout
Randomization
DataDependency
Tracker
MemoryAccessUnit
InstructionOutputQueue
Memory Access Request Mem_Rdy
Memory Access Request
MemoryData
ModuleOutputs
RSE Framework
CUCommitInstr1
InstrToCommit ( i )
MUX4CUCommitInstr2
Mem
AdaptiveHeartbeatMonitor
RegFile_DataEntry i
Execute_OutEntry i
Fetch_Out Entry i
Memory_OutEntry i
Commit_OutEntry i
ModuleEnable/Disable
InstrReg3InstrReg4
ROBAllocPtr3 ( i )ROBAllocPtr4 ( i )
MUX1 Hardware Modules
Input interface
Fetch / Dispatch Width 4 instructionsIssue width 4 instructionsRUU / LSQ size 16/8 entriesInstruction L1 cache Size: 8 KB, 1-way associativeData L1 cache Size: 8 KB, 1-way associativeInstruction L2 cache Size: 64 KB, 2-way associative
Data L2 Cache Size: 128 KB, 2-way associative
checkcheckValid
Bus-InterfaceUnit
ExternalBus
Framework to provide application-aware techniques for error-detection, masking of security vulnerabilities and recovery in a uniform, low overhead manner
Algorithms and architectures for building dependable, object-oriented, distributed computer/communication systems
AQuA – Adaptive Quality of Service for Availability
ITUA - Intrusion Tolerance by Unpredictable Adaptation
Algorithms and architectures for building intrusion-tolerant distributed systems
GossipNameServer Gateway
Object QuO
ObjectFactory
Gateway
ProteusDependability
Manager
GatewayGateway
Object QuO
Underlying Group Communication System
GossipNameServer Gateway
Object QuO
Gateway
Object QuO
ObjectFactory
Gateway
ObjectFactory
Gateway
ProteusDependability
Manager
GatewayGateway
Object QuO
Gateway
Object QuO
Underlying Group Communication System
-
Modeling and Simulation
Componentlibraries
Fault injector
Simulationengines
Faultdictionaries
Other facilities
- hardware componentsand their functional behavior
- logical (architecture) components and their functionality
- hardware componentsand their functional behavior
- logical (architecture) components and their functionality
- hardware componentsand their functional behavior
- logical (architecture) components and their functionality
- effect of faults on higher level in terms ofa low level fault model
- effect of faults on higher level in terms ofa low level fault model
- effect of faults on higher level in terms ofa low level fault model
- random numbergenerators
- recording, statisticsreporting
- random numbergenerators
- recording, statisticsreporting
VHDLbehavioral
model
VHDLstructural
model
C++basedmodel
Blockdiagram
VHDLbehavioral
model
VHDLstructural
model
C++basedmodel
Blockdiagram
Task & Environment Manager
DEPENDDEPENDMenupagesMenupagesTool-kit
Graphical editorComp-lib builderVHDL translator
A simulation framework to support the design of systems for fault tolerance and high availability. It takes as inputs both VHDL and C++ system description and produces as output dependability characteristics including fault coverage, availability, and performance. (License to several companies and employed to simulate a number of industrial Systems, e.g. Integrity S2 from Tandem)
DEPEND
MöbiusAn Extensible Modeling Tool for Quantifying Dependability, Security, and Performance Features:• Multiple model representation techniques facilitate modeling of hardware, software, protocols, and application in a unified manner• Unified representation of dependability, security and performance attributes• Integrated analytical/numerical and simulation-based solution(Licensed by over 190 institutions)
Framework Component
Atomic Model
Composed Model
Solvable Model
Connected Model
Study Specifier(generates multiplemodels)
-
Experimental System Evaluation and Benchmarking – Fault Injection
CampaignScript
CampaignScript
LogLog
ControlHost
ControlHost
ProcessManagerProcessManager
InjectorProcessInjectorProcess
ApplicationProcess
ApplicationProcess
ProcessManagerProcessManager
InjectorProcessInjectorProcess
ApplicationProcess
ApplicationProcess
LAN
Error Injection Targets
Control Host
CampaignScript
CampaignScript
LogLog
ControlHost
ControlHost
ProcessManagerProcessManager
InjectorProcessInjectorProcess
ApplicationProcess
ApplicationProcess
ProcessManagerProcessManager
InjectorProcessInjectorProcess
ApplicationProcess
ApplicationProcess
LAN
Error Injection Targets
Control HostNFTAPE A software framework for conducting automated
fault/error injection-based dependability characterization;Evaluation/benchmarking of:- fault-tolerant systems, e.g., Tandem FT-platforms;- operating systems, e.g., Linux kernel on Pentium and PowerPC- applications, e.g., call processing, space-borne software- security violations due to errors, e.g., FTP, firewall facilities(Licensed to multiple institutions, including JPL-NASA)
LokiA software fault injector for distributed systems in which the introduction of faults is triggered based on the global state of the system. Evaluation of large-scale distributed systems, e.g., a group membership protocol in Ensemble, and correlated network partitions in Coda distributed file system.
. . .Loki RuntimeSystemUnderTest Loki RuntimeSystemUnderTest Loki RuntimeSystemUnderTest
Loki Runtime
SystemUnderTest
Probe
Inject Fault
StateMachine
Notifications
Recorder
StateMachineTransport
FaultParser
LAN1
LAN2
. . .Local Timelines
Offline ClockSynchronization
Uses timestamp datacollected before andafter the experiment
Single Global Timeline
To LAN2
MeasureAnalysis
Measures3711
1
N
2
N21
-
Operational System Data Analysis
LAN of Windows NT Machines Internet Data type and number of machines
System logs from 503 servers
System logs from 68 mail servers
Web site access success/failure logs from 97 most popular Web sites
Period for data collecting 4 months 6 months 40 weeks Failure context
Machine reboots logged in a server system log.
Machine reboots logged in a server system log.
Inability to contact a host and fetch an HTML file from it.
Availability
System perspective User perspective (user gets expected service)
99% N/A
99% 92%
N/A 99%
Downtime (median)
N/A
0.2h
1h (45%) 4.5h (49%) 53h (6%)
Comments
Indication of error propagation across the network
A few major network-related failures made nearly 70% of the hosts inaccessible
Failure Data Analysis
Impact of workload on hardware and software fault toleranceCharacterization of error latencyFailure characterization of UNIX and Windows NT servers Reliability of Internet hosts
Security Vulnerability Analysis
Size(PostD
ata)bk=C
B->fd=&addr_free-(offset of field bk)B->bk=Mcode♦-
B->fd=&addr_free-(offset of field bk)B->bk=Mcode
-♦ When buf is freed, execute B->fd->bk = B->bkB->fd and B->bk
unchanged ♦-
.GOT entry of function free points to MCode
addr_free changed♦-
addr_freeunchanged ♦- -♦ Execute addr_free when
function free is called
Mcode is executed
Note: addr_free is the .GOT entry of function free
X?
Calloc is called
-♦ Load addr_freeto the memory during program initializationManipulate the
.GOT entry of function free(i.e., addr_free)
pFSM1pFSM2
pFSM3
pFSM4
Heap Layout
Operation 1:
Operation 2:
Operation 3:
Size(PostD
ata)bk=C
B->fd=&addr_free-(offset of field bk)B->
Size(PostD
ata)bk=C
B->fd=&addr_free-(offset of field bk)B->bk=Mcode♦-
B->fd=&addr_free-(offset of field bk)B->bk=Mcode
-♦ When buf is freed, execute B->fd->bk = B->bkB->fd and B->bk
unchanged ♦-
.GOT entry of function free points to MCode
addr_free changed♦-
addr_freeunchanged ♦- -♦ Execute addr_free when
function free is called
Mcode is executed
Note: addr_free is the .GOT entry of function free
X?
Calloc is called
-♦ Load addr_freeto the memory during program initializationManipulate the
.GOT entry of function free(i.e., addr_free)
pFSM1pFSM2
pFSM3
pFSM4
Heap Layout
Operation 1:
Operation 2:
Operation 3:
bk=Mcode♦-
B->fd=&addr_free-(offset of field bk)B->bk=Mcode
-♦ When buf is freed, execute B->fd->bk = B->bkB->fd and B->bk
unchanged ♦-
.GOT entry of function free points to MCode
addr_free changed♦-
addr_freeunchanged ♦- -♦ Execute addr_free when
function free is called
Mcode is executed
Note: addr_free is the .GOT entry of function free
X?
Calloc is called
-♦ Load addr_freeto the memory during program initializationManipulate the
.GOT entry of function free(i.e., addr_free)
pFSM1pFSM2
pFSM3
pFSM4
Heap Layout
Operation 1:
Operation 2:
Operation 3:
1: PostData = calloc(contentLen +1024,sizeof(char));x=0; rc=0;
2: pPostData= PostData;3: do {4: rc=recv(sock, pPostData,
1024, 0);5: if (rc==-1) { 6: closeconnect(sid,1);7: return; 8: }9: pPostData+=rc;10: x+=rc;11: } while ((rc==1024) ||
(x
-
Security Vulnerability Avoidance and Runtime Protection
CERT advisories indicateCERT advisories indicate≥≥ 66% 66% vulnerabilitiesvulnerabilities due to pointer due to pointer taintednesstaintedness≥≥ 33% 33% due to errors in library functions due to errors in library functions or incorrect invocations of library or incorrect invocations of library functionsfunctions
Pointer Pointer TaintednessTaintedness –– a unified basis a unified basis for reasoning about security for reasoning about security vulnerabilitiesvulnerabilities
A pointer is tainted if a user value, A pointer is tainted if a user value, including a return address, is derived including a return address, is derived directly or indirectly from the user input. directly or indirectly from the user input.
Pointer Pointer taintednesstaintedness semanticssemanticsapplied to formally reason and extract applied to formally reason and extract security specifications of library security specifications of library functions functions
Exposes (and removes) security Exposes (and removes) security vulnerabilities vulnerabilities
Vulnerability Avoidance Transparent Runtime Randomization
Protects against about 60% of security attack reported by CERTBreaks the attacker’s assumptions on the fixed memory layout of the target systemMakes it impossible to correctly determine location of critical application associated memory regions essential in designing and launching a successful attackConverts an attack into application crashProgram initialization overhead – 1% to 6%NO runtime overhead Memory cost –extra copy of GOT(global offset table) 200 Byte to 3.5 KB
use code
Kernel space0xC0000000
0xFFFFFFFF
0x08048000static data
bss
shared libraries0x40000000 ± Rand
0xBFFFFFFC - Randuser stack
user heap End_of_bss + Rand
-
DPASA – Designing Protection & Adaptation Into a Survivability Architecture
Client Zone
JBI Management Staff
ExecutiveZone
CrumpleZone
OperationsZone
JBI Core
Quad 1 Quad 2 Quad 3 Quad 4
Network
Access Proxy (Isolated Process Domains in SE-Linux)Domain6
First Restart Domains Eventually Restart HostLocal Controller
RMI
STCPTCP
PS Sensor Rpts
TCP UDP
IIOP
PSQImplPSQImpl
IIOP
TCP
DC
Eascii
Domain1 Domain2 Domain3 Domain4 Domain5Forward/Ratelimit
Proxy LogicInspect / Forward / Rate Limit
Client Zone
JBI Management Staff
ExecutiveZone
CrumpleZone
OperationsZone
JBI Core
Quad 1 Quad 2 Quad 3 Quad 4
Network
Access Proxy (Isolated Process Domains in SE-Linux)Domain6
First Restart Domains Eventually Restart HostLocal Controller
RMI
STCPTCP
PS Sensor Rpts
TCP UDP
IIOP
PSQImplPSQImpl
IIOP
TCP
DC
Eascii
Domain1 Domain2 Domain3 Domain4 Domain5Forward/Ratelimit
Proxy LogicInspect / Forward / Rate Limit
Access Proxy (Isolated Process Domains in SE-Linux)Domain6
First Restart Domains Eventually Restart HostLocal Controller
RMI
STCPTCP
PS Sensor Rpts
TCP UDP
IIOP
PSQImplPSQImpl
IIOP
TCP
DC
Eascii
Domain1 Domain2 Domain3 Domain4 Domain5Forward/Ratelimit
Proxy LogicInspect / Forward / Rate Limit
JBI critical mission objectives
JBI critical functionality
Initialized JBI provides essential services
Authorized publish processed successfully
ConfidentialityDataflow Timeliness Integrity
(from functional model execution)
Functional model assumptions hold
JBI mission awareness
CA1: Origin of Attacks on Clients
CA2: Attack Propagation from Clients
CA3: Client Process Corruption
PA1: Client-Core
Communication I & C
PA2: Alternate Path
Availability
QA1: QIS Incorruptibility
QA2: QIS Communication
Cutoff
QA3: QIS Input
Integrity
QA4: QIS Function
Correctness
AA1: AP Function
Correctness
AA2: AP Application-layer Integrity
AA3: AP Application-layer Confidentiality
AA4: Origin of Attacks on
Access Proxy
AA5: Attacks from AP
AA6: DoSfrom Compromised
Core
AA7: AP Process Corruption
AA8: DoSPrevention by Access Proxy
DA1: DC Communications
GA1: Process Corruption on Guardian
DA3: Process
Corruption on DC
GA2: Attacks from Guardian
DA2: Origin of Attacks on DC
SA1: Origin of Attacks on PSQ Server
SA2: Attacks from PSQ Server
SA3: IO Integrity in PSQ Server
SA4: Client Confidentiality in PSQ Server
SA5: IO Authenticity
SA6: Network-layer I & C
SA7: Process Corruption in PSQ Server
SeA1: Attacks from IDS Sensor
SeA2: Sensor False Alarm
Rate
SeA3: Sensor Detection Delay
SeA4: Sensor Detection Probability
SeA5: Process
Corruption in Sensor
AcA1: Process Corruption in Actuator
AcA2: Attacks from Actuator
LA1: Process Corruption in
Local Controller
LA2: Attacks from Local Controller
CoA1: CorrleatorFalse Alarm
Rate
CoA2: Origin of Attacks on Correlator
CoA3: Attacks from Correlator
CoA4: Alert IntegrityMA1: SM Byzantine Agreement
MA2: Origin of Attacks on
SM
MA3: Attacks from SM
PsA1: ADF Policy Server
Input Correctness
PsA1: ADF Policy Server
Synchronization
ScA1: Process Corruption in Subscribed
Client
System Connectivity
Physical Topology
Network Topology Restricted Routing No Tunneling Attacks
Process Isolation
SELinux Trusted Solaris Windows 2000
Type Enforcement Hardened Kernel Hardened Kernel Kernel Loadable Wrappers
VMWareover SELinux
Platform Mechanisms Component-specific policy
Private Key Confidentiality
No Unauthorized Direct Access
Keys Protected from Theft
DoDCommon Access Card (CAC)
PKCS #11 Tamperproof
Keys Not Guessable
Algorithmic Framework
Key Length Key Lifetime
No Unauthorized Indirect Access
Physical Protection of CAC device
Protection of CAC Authentication Data
No Compromise of Authorized Process Accessing CAC
No Cryptography in Access Proxy
Not Preconfigured
Not Reconfigurable
ADF NIC services protected
ADF Correctness
ADF NIC Physical Security
ADF NIC Firmware Initialization
ADF Key Initialization
ADF Agent Initialization
ADF Protocol Correctness
ADF Host Independence
ADF Agent Correctness
VPG Integrity VPG Confidentiality
Policy Server Integrity
ADF Policy Correctness
Correctness of Registration Protocol
Correctness of Reattachment
Protocol
Hard-wired Configuration
Electrically Isolated
Physically Protected
Connectivity Physical Integrity
Electrical Integrity
Gate Configuration and
Truth Table
Proxy Protocol Configuration
Can Identify Malformed Traffic
Correctness of Flow Control Mechanisms
Bidirectional Flow Control
Correctness of Certificate Exchange
IDS Experimental Evaluation
Correctness of Modified ITUA Protocols
Functional model faithful to design
IDS objectives
Notification Confidentiality
IO Confidentiality (end-to-end)
Confidential info not exposed
Unauthorized activity properly rejected
Authorized join/leave processed successfully
Authorized query processed successfully
Authorized subscribe processed successfully
JBI properly initialized
Design Team Review
JBI critical mission objectives
JBI critical functionality
Initialized JBI provides essential services
Authorized publish processed successfully
ConfidentialityDataflow Timeliness Integrity
(from functional model execution)
Functional model assumptions hold
JBI mission awareness
CA1: Origin of Attacks on Clients
CA2: Attack Propagation from Clients
CA3: Client Process Corruption
PA1: Client-Core
Communication I & C
PA2: Alternate Path
Availability
QA1: QIS Incorruptibility
QA2: QIS Communication
Cutoff
QA3: QIS Input
Integrity
QA4: QIS Function
Correctness
AA1: AP Function
Correctness
AA2: AP Application-layer Integrity
AA3: AP Application-layer Confidentiality
AA4: Origin of Attacks on
Access Proxy
AA5: Attacks from AP
AA6: DoSfrom Compromised
Core
AA7: AP Process Corruption
AA8: DoSPrevention by Access Proxy
DA1: DC Communications
GA1: Process Corruption on Guardian
DA3: Process
Corruption on DC
GA2: Attacks from Guardian
DA2: Origin of Attacks on DC
SA1: Origin of Attacks on PSQ Server
SA2: Attacks from PSQ Server
SA3: IO Integrity in PSQ Server
SA4: Client Confidentiality in PSQ Server
SA5: IO Authenticity
SA6: Network-layer I & C
SA7: Process Corruption in PSQ Server
SeA1: Attacks from IDS Sensor
SeA2: Sensor False Alarm
Rate
SeA3: Sensor Detection Delay
SeA4: Sensor Detection Probability
SeA5: Process
Corruption in Sensor
AcA1: Process Corruption in Actuator
AcA2: Attacks from Actuator
LA1: Process Corruption in
Local Controller
LA2: Attacks from Local Controller
CoA1: CorrleatorFalse Alarm
Rate
CoA2: Origin of Attacks on Correlator
CoA3: Attacks from Correlator
CoA4: Alert IntegrityMA1: SM Byzantine Agreement
MA2: Origin of Attacks on
SM
MA3: Attacks from SM
PsA1: ADF Policy Server
Input Correctness
PsA1: ADF Policy Server
Synchronization
ScA1: Process Corruption in Subscribed
Client
System Connectivity
Physical Topology
Network Topology Restricted Routing No Tunneling Attacks
Process Isolation
SELinux Trusted Solaris Windows 2000
Type Enforcement Hardened Kernel Hardened Kernel Kernel Loadable Wrappers
VMWareover SELinux
Platform Mechanisms Component-specific policy
Private Key Confidentiality
No Unauthorized Direct Access
Keys Protected from Theft
DoDCommon Access Card (CAC)
PKCS #11 Tamperproof
Keys Not Guessable
Algorithmic Framework
Key Length Key Lifetime
No Unauthorized Indirect Access
Physical Protection of CAC device
Protection of CAC Authentication Data
No Compromise of Authorized Process Accessing CAC
No Cryptography in Access Proxy
Not Preconfigured
Not Reconfigurable
ADF NIC services protected
ADF Correctness
ADF NIC Physical Security
ADF NIC Firmware Initialization
ADF Key Initialization
ADF Agent Initialization
ADF Protocol Correctness
ADF Host Independence
ADF Agent Correctness
VPG Integrity VPG Confidentiality
Policy Server Integrity
ADF Policy Correctness
Correctness of Registration Protocol
Correctness of Reattachment
Protocol
Hard-wired Configuration
Electrically Isolated
Physically Protected
Connectivity Physical Integrity
Electrical Integrity
Gate Configuration and
Truth Table
Proxy Protocol Configuration
Can Identify Malformed Traffic
Correctness of Flow Control Mechanisms
Bidirectional Flow Control
Correctness of Certificate Exchange
IDS Experimental Evaluation
Correctness of Modified ITUA Protocols
Functional model faithful to design
IDS objectivesIDS objectives
Notification ConfidentialityNotification
ConfidentialityIO Confidentiality
(end-to-end)IO Confidentiality
(end-to-end)
Confidential info not exposed
Confidential info not exposed
Unauthorized activity properly rejected
Unauthorized activity properly rejected
Authorized join/leave processed successfully
Authorized join/leave processed successfully
Authorized query processed successfully
Authorized query processed successfully
Authorized subscribe processed successfullyAuthorized subscribe processed successfully
JBI properly initializedJBI properly initialized
Design Team Review
JBI critical mission objectives
JBI critical functionality
Initialized JBI provides essential services
Authorized publish processed successfully
ConfidentialityDataflow Timeliness Integrity
(from functional model execution)
Functional model assumptions hold
JBI mission awareness
CA1: Origin of Attacks on Clients
CA2: Attack Propagation from Clients
CA3: Client Process Corruption
PA1: Client-Core
Communication I & C
PA2: Alternate Path
Availability
QA1: QIS Incorruptibility
QA2: QIS Communication
Cutoff
QA3: QIS Input
Integrity
QA4: QIS Function
Correctness
AA1: AP Function
Correctness
AA2: AP Application-layer Integrity
AA3: AP Application-layer Confidentiality
AA4: Origin of Attacks on
Access Proxy
AA5: Attacks from AP
AA6: DoSfrom Compromised
Core
AA7: AP Process Corruption
AA8: DoSPrevention by Access Proxy
DA1: DC Communications
GA1: Process Corruption on Guardian
DA3: Process
Corruption on DC
GA2: Attacks from Guardian
DA2: Origin of Attacks on DC
SA1: Origin of Attacks on PSQ Server
SA2: Attacks from PSQ Server
SA3: IO Integrity in PSQ Server
SA4: Client Confidentiality in PSQ Server
SA5: IO Authenticity
SA6: Network-layer I & C
SA7: Process Corruption in PSQ Server
SeA1: Attacks from IDS Sensor
SeA2: Sensor False Alarm
Rate
SeA3: Sensor Detection Delay
SeA4: Sensor Detection Probability
SeA5: Process
Corruption in Sensor
AcA1: Process Corruption in Actuator
AcA2: Attacks from Actuator
LA1: Process Corruption in
Local Controller
LA2: Attacks from Local Controller
CoA1: CorrleatorFalse Alarm
Rate
CoA2: Origin of Attacks on Correlator
CoA3: Attacks from Correlator
CoA4: Alert IntegrityMA1: SM Byzantine Agreement
MA2: Origin of Attacks on
SM
MA3: Attacks from SM
PsA1: ADF Policy Server
Input Correctness
PsA1: ADF Policy Server
Synchronization
ScA1: Process Corruption in Subscribed
Client
System Connectivity
Physical Topology
Network Topology Restricted Routing No Tunneling Attacks
Process Isolation
SELinux Trusted Solaris Windows 2000
Type Enforcement Hardened Kernel Hardened Kernel Kernel Loadable Wrappers
VMWareover SELinux
Platform Mechanisms Component-specific policy
Private Key Confidentiality
No Unauthorized Direct Access
Keys Protected from Theft
DoDCommon Access Card (CAC)
PKCS #11 Tamperproof
Keys Not Guessable
Algorithmic Framework
Key Length Key Lifetime
No Unauthorized Indirect Access
Physical Protection of CAC device
Protection of CAC Authentication Data
No Compromise of Authorized Process Accessing CAC
No Cryptography in Access Proxy
Not Preconfigured
Not R e c o n f i g u r a b l e
ADF NIC services protected
ADF Correctness
ADF NIC Physical Security
ADF NIC Firmware Initialization
ADF Key Initialization
ADF Agent Initialization
ADF Protocol Correctness
ADF Host Independence
ADF Agent Correctness
VPG Integrity VPG Confidentiality
Policy Server Integrity
ADF Policy Correctness
Correctness of Registration Protocol
Correctness of Reattachment
Protocol
Hard-wired Configuration
Electrically Isolated
Physically Protected
Connectivity Physical Integrity
Electrical Integrity
Gate Configuration and
Truth Table
Proxy Protocol Configuration
Can Identify Malformed Traffic
Correctness of Flow Control Mechanisms
Bidirectional Flow Control
Correctness of Certificate Exchange
IDS Experimental Evaluation
Correctness of Modified ITUA Protocols
Functional model faithful to design
IDS objectives
Notification Confidentiality
IO Confidentiality (end-to-end)
Confidential info not exposed
Unauthorized activity properly rejected
Authorized join/leave processed successfully
Authorized query processed successfully
Authorized subscribe processed successfully
JBI properly initialized
Design Team Review
JBI critical mission objectives
JBI critical functionality
Initialized JBI provides essential services
Authorized publish processed successfully
ConfidentialityDataflow Timeliness Integrity
(from functional model execution)
Functional model assumptions hold
JBI mission awareness
CA1: Origin of Attacks on Clients
CA2: Attack Propagation from Clients
CA3: Client Process Corruption
PA1: Client-Core
Communication I & C
PA2: Alternate Path
Availability
QA1: QIS Incorruptibility
QA2: QIS Communication
Cutoff
QA3: QIS Input
Integrity
QA4: QIS Function
Correctness
AA1: AP Function
Correctness
AA2: AP Application-layer Integrity
AA3: AP Application-layer Confidentiality
AA4: Origin of Attacks on
Access Proxy
AA5: Attacks from AP
AA6: DoSfrom Compromised
Core
AA7: AP Process Corruption
AA8: DoSPrevention by Access Proxy
DA1: DC Communications
GA1: Process Corruption on Guardian
DA3: Process
Corruption on DC
GA2: Attacks from Guardian
DA2: Origin of Attacks on DC
SA1: Origin of Attacks on PSQ Server
SA2: Attacks from PSQ Server
SA3: IO Integrity in PSQ Server
SA4: Client Confidentiality in PSQ Server
SA5: IO Authenticity
SA6: Network-layer I & C
SA7: Process Corruption in PSQ Server
SeA1: Attacks from IDS Sensor
SeA2: Sensor False Alarm
Rate
SeA3: Sensor Detection Delay
SeA4: Sensor Detection Probability
SeA5: Process
Corruption in Sensor
AcA1: Process Corruption in Actuator
AcA2: Attacks from Actuator
LA1: Process Corruption in
Local Controller
LA2: Attacks from Local Controller
CoA1: CorrleatorFalse Alarm
Rate
CoA2: Origin of Attacks on Correlator
CoA3: Attacks from Correlator
CoA4: Alert IntegrityMA1: SM Byzantine Agreement
MA2: Origin of Attacks on
SM
MA3: Attacks from SM
PsA1: ADF Policy Server
Input Correctness
PsA1: ADF Policy Server
Synchronization
ScA1: Process Corruption in Subscribed
Client
System Connectivity
Physical Topology
Network Topology Restricted Routing No Tunneling Attacks
Process Isolation
SELinux Trusted Solaris Windows 2000
Type Enforcement Hardened Kernel Hardened Kernel Kernel Loadable Wrappers
VMWareover SELinux
Platform Mechanisms Component-specific policy
Private Key Confidentiality
No Unauthorized Direct Access
Keys Protected from Theft
DoDCommon Access Card (CAC)
PKCS #11 Tamperproof
Keys Not Guessable
Algorithmic Framework
Key Length Key Lifetime
No Unauthorized Indirect Access
Physical Protection of CAC device
Protection of CAC Authentication Data
No Compromise of Authorized Process Accessing CAC
No Cryptography in Access Proxy
Not Preconfigured
Not Reconfigurable
ADF NIC services protected
ADF Correctness
ADF NIC Physical Security
ADF NIC Firmware Initialization
ADF Key Initialization
ADF Agent Initialization
ADF Protocol Correctness
ADF Host Independence
ADF Agent Correctness
V P G I n t e g r i t y VPG Confidentiality
Policy Server Integrity
ADF Policy Correctness
Correctness of Registration Protocol
Correctness of Reattachment
Protocol
Hard-wired Configuration
Electrically Isolated
Physically Protected
Connectivity Physical Integrity
Electrical Integrity
Gate Configuration and
Truth Table
Proxy Protocol Configuration
Can Identify Malformed Traffic
Correctness of Flow Control Mechanisms
Bidirectional Flow Control
Correctness of Certificate Exchange
IDS Experimental Evaluation
Correctness of Modified ITUA Protocols
Functional model faithful to design
IDS objectivesIDS objectives
Notification ConfidentialityNotification
ConfidentialityIO Confidentiality
(end-to-end)IO Confidentiality
(end-to-end)
Confidential info not exposed
Confidential info not exposed
Unauthorized activity properly rejected
Unauthorized activity properly rejected
Authorized join/leave processed successfully
Authorized join/leave processed successfully
Authorized query processed successfully
Authorized query processed successfully
Authorized subscribe processed successfullyAuthorized subscribe processed successfully
JBI properly initializedJBI properly initialized
Design Team Review
DemonstratesADF based protectionDJM: signing and signature checking, authentication, RBAC, semantic and behavioral checks, FT protocolsAdaptive response: rapid reaction, IO rejection, quad isolation, dynamic clients, fall back
R o u t e r 1
C A FC l i e n t
A O D BC l i e n t
W e a t h e rC l i e n t
W 0W 01 0 M b p s R o u t e r 2 A t t a c k e r
Q u a d 2 Q u a d 3 Q u a d 4
H u b
M a n a g e d S w i t c h
N I D S H u b
Q u a d 1
1 0 0 M b p s
1 0 0 M b p s
R o u t e r 1
C A FC l i e n t
A O D BC l i e n t
W e a t h e rC l i e n t
W 0W 01 0 M b p s R o u t e r 2 A t t a c k e r
Q u a d 2 Q u a d 3 Q u a d 4
H u b
M a n a g e d S w i t c h
N I D S H u b
Q u a d 1
1 0 0 M b p s
1 0 0 M b p s
Enough proof of concept implementations developed to show 3 AFRL clients running a simplified scenario over 4 quad core
DPASA W/O Signature Verification
0
200
400
600
800
1000
1200
0 100 200 300 400 500 600 700
Publications
Late
ncy(
ms)
Series1
Testbed configuration
Screenshots and performance graph
The DPASA IT-JBI design provides critical functionality with high probability even when under heavy successful attack.
Total number of intrusions versus MTTD_A (min)
0
100
200
300
400
500
600
700
800
900
10 100 1000 10000 100000
MTTD_A (min)
Tota
l Num
ber
of In
trusi
ons
12 hour mission 24 hour mission 48 hour mission
98% of all publishes successful when new vulnerabilities are discovered, on the average, once a day or less often during a 12-hour mission. (An extremely high new vulnerability discovery rate; CERT data suggest MTTDA ~ 6000 min.)
At this new vulnerability discovery rate, system provides correct functionality even when about 10 intrusions occur during a 12-hour mission.
Fraction of successful publishes versus MTTD_A (min)
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
10 100 1000 10000 100000
MTTD_A (min)
Frac
tion
of S
ucce
ssfu
l Pub
lishe
s
12 hour mission 24 hour mission 48 hour mission
-
Secure ARMORs – Seamless Dependability & Security Testbed
MultithreadedLSA, PDS Algorithms
iMac
iMac
iMac
Ad Hoc Wireless NetworkSimulator (NS/JavaSim)
Highly ReliableWireline Network
Errorspropagation
Higher ErrorRate
Physical Wireless Testbed
Wireless Network Emulator
ErrorInjection
iMac
•Frequent disconnections• Mobility• Fading
• Noisy channels• Battery exhaustion• HW transients• SW bugs
• HW transients• SW bugs
High performance/ bandwidth
Limited performance/ bandwidth
Error Injection
Secure ARMORscustomizable software and/or
hardware to provide reliability and security
services
Well established techniques: • Replication• Checkpointing
• Seamless interaction across domains• Limit error propagation• Reduce dependability bottleneck dueto wireless network
-
The Information Trust Institute (ITI)Trust is about public confidence –
It is about security, but also correctness, reliability, availability, and survivability
ITI: Integrated private/public R&D and workforce development to foster technology transfer, new industry, products, and services in order to provide design and validation tools and architectural constructs needed to ensure and justify public trust in critical applications and systems
Ensuring Public Confidence
Providing designers and agencies with the
tools to measure and validate trust in IT-dependent critical
applications and systems
PublicEducation &
Workforce DevelopmentAddressing the nation’s
cybertrust workforce needs:identifying and encouraging
talent in K-12,university & college programs,
professional programs,public awareness
Designing inTrust
Providing system design& validation tools, and
architectural constructs todeploy and validate
trustworthiness in critical applications & systems
The loss of public confidence in or access to critical systems can be economically devastating:Two weeks after 9/11: >$10B and 100,000 jobs lost in airline industryDowntime costs per hour: brokerage operations: $6.45 M, credit card authorization: $2.6 M
-
Research in Dependable and Secure Systems at Illinois
FacultyMicroprocessor architecture – S. Patel, N. Carter Hardware checkpointing – J. TorrellasFormal methods – J. MeseguerTest and VLSI design – J. PatelStorage systems – Y. ZhouMobile computing – N. VaidyaSimulation – D. NicolModeling and design – W. SandersOptical networks – S. LumettaMeasurement and design – R. IyerBenchmarking and design – Z. Kalbarczyk
Major PartnersIBM – benchmarkingSUN – next generation supercomputerBBN – Intrusion-Tolerant Computing and Adaptive SystemsMotorola – wireless communication and computingHP – utility computingBoeing – trusted softwareJPL – next generation space borne computing