Computational Resiliency

Steve J. Chapin, Susan Older

Center for Systems Assurance

Syracuse University

Gregg Irvin

Mobium Enterprises

24 July 2001 Not for Public Release

Computational Resiliency – CSA Not for Public Release

Recap: What is Computational Resiliency?

The ability to sustain application operation and dynamically restore the level of assurance during an attack.

Application-centric self defense, built on replication, migration, functionality mutation, and camouflage.

Computational Resiliency

[Figure: a Mission Critical Application comes under Attack. Result of Attack: Degraded Application trying to perform Mission Critical Function. Computational Resiliency: Techniques applied to correct situation. Outcome: Degraded Application sufficiently Improved by Resiliency to perform Mission Critical Function.]

Multi-Faceted Approach

Theoretical framework
  reason about conformance to policy
Computational resiliency library
  dynamic application management
System software support
  scheduling/policy frameworks

Computational Resiliency Library

Dynamic multithreading
Migration
Replication
Camouflage
Functionality reconfiguration
Policy-based management

Example of CRLib

[Figure: testbed network diagram. Labels: 16 2x Pentium (three times), 16 Alpha, Intel 8x SMP (twice), SGI Origin, 3Com Superstack 3300 (four times), Firewall, "The Net", “Safe Zone” (OASIS protection), “The Wild” (limited protection).]

The Benign State

[Figure: the same testbed diagram as in Example of CRLib. Annotations: Dudley’s job (low priority); Bullwinkle’s job; Rocky’s job.]

The Attacks

[Figure sequence: five frames over the same testbed diagram, captioned in order.]

1. Snidely attacks: blocked at firewall. Dudley does nothing.
2. Natasha attacks Rocky; caught by IDS.
3. Rocky’s job migrates back into safe zone; Dudley must give up resources.
4. Boris attacks Bullwinkle’s job. Some attacks succeed.
5. Bullwinkle’s job employs camouflage, decoys, and migration.

Groups and Replication

[Figure labels: Group, Processor]

One group per computational task
User selects replication level, other policies
Group mapped across processors
Periodic liveness checks
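
The four points above can be illustrated with a minimal sketch. It is not the CRLib API, which the slides do not show: the `Group` class, its method names, the round-robin mapping, and the least-loaded replacement rule are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """One group per computational task (hypothetical model, not CRLib)."""
    task: str
    replication_level: int                        # chosen by the user
    replicas: list = field(default_factory=list)  # one processor name per replica

    def map_across(self, processors):
        """Spread the replicas round-robin over the given processors."""
        self.replicas = [processors[i % len(processors)]
                         for i in range(self.replication_level)]

    def liveness_check(self, alive):
        """Drop replicas on dead processors and restart them on live ones.

        Returns the number of replicas that had to be restarted.
        """
        alive = sorted(alive)
        if not alive:
            raise RuntimeError("no live processors left")
        survivors = [p for p in self.replicas if p in alive]
        lost = self.replication_level - len(survivors)
        for _ in range(lost):
            # Assumed rule: restart on the live processor hosting the fewest replicas.
            survivors.append(min(alive, key=survivors.count))
        self.replicas = survivors
        return lost


group = Group(task="A", replication_level=4)
group.map_across(["P1", "P2", "P3"])
restarted = group.liveness_check({"P2", "P3"})   # P1 has failed
```

In a running system the check would be driven by a timer; here it is called once by hand to keep the sketch deterministic.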

Theory Framework: Goals

Understand the interplay among core aspects of CRLib
  Groups, locations, resources, schedules, …
Reason about effects of configuration and policy choices
Reason about applications’ conformance to desired behavior

Framework Basics

Build on existing mobile calculi
  π-Calculus, Mobile Ambients, Join-Calculus
Capture essential features of CRLib
  Replication
  Migration
  Reconfiguration
  Camouflage

A π-Calculus Primer

Collection of names
  Represent information: values, communication links (channels), code
  Have scope
Message-based communication
  $x(y)$: receipt of a value on x
  $\overline{x}\langle y\rangle$: transmission of y along x
Information mobility: information can be passed beyond original scope
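
As a reminder of how the receive and send prefixes interact, the standard π-calculus communication rule and the scope-extrusion case behind "information mobility" can be written as follows. The processes P and Q, the bound name z, and the restriction operator ν are generic textbook notation introduced here for illustration; they do not appear on the slide.

```latex
% Communication: the receiver's placeholder z is replaced by the sent name y.
\overline{x}\langle y\rangle.P \;\mid\; x(z).Q
  \;\longrightarrow\; P \;\mid\; Q\{y/z\}

% Scope extrusion: a private name y is sent outside its original scope,
% which then widens to include the receiver (assuming y is not free in Q).
(\nu y)\bigl(\overline{x}\langle y\rangle.P\bigr) \;\mid\; x(z).Q
  \;\longrightarrow\; (\nu y)\bigl(P \;\mid\; Q\{y/z\}\bigr)
```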

Finding a Service Provider

Client wants to find a service provider:

1. Query the Service Directory, include a SASE.
2. Wait for response.
3. Upon receipt, submit request.

$\overline{query}\langle addr\rangle.\; addr(sp).\; \overline{sp}\langle req\rangle.\; 0$

Handling Service Requests

Service Directory repeatedly responds to queries, arbitrarily choosing provider.

$!\,query(ra).\,(\overline{ra}\langle a\rangle + \overline{ra}\langle b\rangle + \overline{ra}\langle c\rangle)$

Service providers wait for requests.

$!\,a(job).\,DO\langle job\rangle$
$!\,b(job).\,DO\langle job\rangle$
$!\,c(job).\,DO\langle job\rangle$

[Animation: four frames tracing one client request. The replicated Service Directory and the providers labeled a, b, c appear unchanged in every frame:]

$!\,query(ra).\,(\overline{ra}\langle a\rangle + \overline{ra}\langle b\rangle + \overline{ra}\langle c\rangle)$
$!\,a(job).\,DO\langle job\rangle$
$!\,b(job).\,DO\langle job\rangle$
$!\,c(job).\,DO\langle job\rangle$

1. The client $\overline{query}\langle addr\rangle.\; addr(sp).\; \overline{sp}\langle req\rangle.\; 0$ sends addr along query.
2. The client is now $addr(sp).\; \overline{sp}\langle req\rangle.\; 0$. The directory instance is $\overline{addr}\langle a\rangle + \overline{addr}\langle b\rangle + \overline{addr}\langle c\rangle$ and replies along addr.
3. The directory chose b. The client is now $\overline{b}\langle req\rangle.\; 0$ and the directory instance is $0$. The client sends req along b.
4. Provider b runs $DO\langle req\rangle$. The client is $0$.
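
The directory lookup traced above can be mimicked with ordinary message queues. This is only a sketch under stated assumptions: Python `queue.Queue` objects stand in for π-calculus channels, threads stand in for replicated processes, and the function names, the `None` shutdown signal, and the seeded random choice are illustrative additions that are not part of the slides or of CRLib.

```python
import queue
import random
import threading


def directory(query, providers, rng):
    """Service Directory: answer each query by sending one provider channel,
    chosen arbitrarily, back along the reply channel carried in the query."""
    while True:
        reply_addr = query.get()
        if reply_addr is None:          # shutdown signal (not in the calculus)
            return
        reply_addr.put(rng.choice(providers))


def provider(name, chan, done):
    """Service provider: wait for jobs on its own channel and 'DO' each one."""
    while True:
        job = chan.get()
        if job is None:
            return
        done.put((name, job))           # record which provider did which job


def client(query, req):
    addr = queue.Queue()                # fresh private channel: the "SASE"
    query.put(addr)                     # 1. query the directory, enclosing addr
    sp = addr.get()                     # 2. wait for a provider channel
    sp.put(req)                         # 3. submit the request on it


def run(seed=0):
    rng = random.Random(seed)
    query, done = queue.Queue(), queue.Queue()
    chans = {name: queue.Queue() for name in "abc"}
    threads = [threading.Thread(target=directory,
                                args=(query, list(chans.values()), rng),
                                daemon=True)]
    threads += [threading.Thread(target=provider, args=(n, c, done), daemon=True)
                for n, c in chans.items()]
    for t in threads:
        t.start()
    client(query, "req")
    result = done.get(timeout=5)        # (provider name, job)
    query.put(None)
    for c in chans.values():
        c.put(None)
    for t in threads:
        t.join()
    return result


served_by, job = run()
```

Passing the queue `addr` inside a message is the point of the example: the client's private reply channel leaves its original scope, which is the information mobility the primer describes.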

Initial Questions

What are the primary entities, as well as the relationships among them?
  Groups, locations, failures
  External events: DEFCON changes
  Scheduling policies
  Application policies
What is the most appropriate way to integrate those components?
  And at what abstraction level?

In Progress: Two Calculi

Higher-level calculus that incorporates the CRLib API
  Captures groups, policies, etc.
Lower-level calculus that provides semantics for higher-level calculus
  Captures abstract implementation details.
Soundness of the translation will provide validation.

A Thought Experiment

Suppose there are two tasks, A and B, working in parallel:
  A’s replication level: 4
  B’s replication level: 2
  Three processors: P1, P2, P3

Resulting behavior (modulo robustness) should be similar to system with single copies of A and B.
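
One way to make the thought experiment concrete is sketched below. The slides give neither a placement rule nor a way of combining replica outputs, so both the round-robin placement and the majority vote are assumptions chosen only to show what "similar to a single copy" could mean; the function names are hypothetical.

```python
from collections import Counter


def place(levels, processors):
    """Assumed round-robin placement; returns {processor: [task, ...]}."""
    placement = {p: [] for p in processors}
    i = 0
    for task, level in levels.items():
        for _ in range(level):
            placement[processors[i % len(processors)]].append(task)
            i += 1
    return placement


def vote(outputs):
    """Assumed combination rule: return the most common replica output."""
    value, _count = Counter(outputs).most_common(1)[0]
    return value


placement = place({"A": 4, "B": 2}, ["P1", "P2", "P3"])
# Six replicas land on three processors, two per processor.

single_copy_result = 42
# One of A's four replicas returns a corrupted value; the vote masks it.
replicated_result = vote([42, 42, 0, 42])
```

With only two replicas, B can detect a disagreement under this rule but cannot outvote it, which is one reason the replication level matters for the robustness question.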

Open Questions

How do we define “similar”, much less prove it?
  Correctness
  Performance
  Robustness
What are sufficiently high-level yet informative performance measures?
How to model camouflage?

Back to CRLib: Status

Multiple platforms
  Windows NT/2000, Linux, SGI IRIX, Solaris
Heterogeneous resource management methods
  Load-balancing across heterogeneous networks
  Performance improvement by factor of 3
Demo this evening

In Progress

Adding support for Byzantine failures
  User-level option for authenticated messages
  Based on Lamport-Shostak-Pease algorithms
  Greater resiliency needed for nonauthenticated messages
Evaluating cost of replication
  Compare to standard checkpointing
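
The remark that nonauthenticated messages need greater resiliency matches the classical Lamport-Shostak-Pease bounds: with unsigned ("oral") messages, agreement in the presence of m faulty processes needs at least 3m + 1 processes, while with signed messages m + 2 suffice. The helper below simply encodes those two bounds. The function name is hypothetical, and the slides do not say how CRLib maps the bounds onto replication levels.

```python
def min_replicas(faults, authenticated):
    """Smallest number of processes that can reach Byzantine agreement
    with `faults` faulty ones, per the Lamport-Shostak-Pease results."""
    if faults < 0:
        raise ValueError("faults must be non-negative")
    if authenticated:
        return faults + 2        # signed messages: at least two loyal processes
    return 3 * faults + 1        # oral messages: more than three times the faults


unsigned_needed = min_replicas(1, authenticated=False)
signed_needed = min_replicas(1, authenticated=True)
```

For a single faulty process the gap is already visible, and it grows linearly with the number of faults to be tolerated.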

Next Steps for Project

Tool for user policy expression
  Choices for replication/recovery methods, agreement protocols, message-passing schemes
  State-dependent policy specified via “Chinese menu” approach
Scheduling framework
  Schedulers that understand CR policies, resulting resource demands, user/process priorities
  Build on previous MESSIAHS and Legion work
Finalize core CR calculi; turn to analysis techniques
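
A menu-style, state-dependent policy might look like the sketch below: for each system state the user picks one option from each fixed menu. Everything here is hypothetical. The menu names follow the three categories on the slide, but the option names, the state names, and the `validate` function are invented for illustration and are not taken from the planned tool.

```python
# Hypothetical menus: one choice per menu, per state.
MENUS = {
    "recovery": {"active-replication", "checkpointing"},
    "agreement": {"signed-messages", "oral-messages"},
    "messaging": {"unicast", "multicast"},
}


def validate(policy):
    """Check that every state picks exactly one listed option from every menu."""
    for state, choices in policy.items():
        if set(choices) != set(MENUS):
            raise ValueError(f"{state}: must choose from every menu")
        for menu, option in choices.items():
            if option not in MENUS[menu]:
                raise ValueError(f"{state}: {option!r} is not on the {menu} menu")
    return True


policy = {
    "benign": {"recovery": "checkpointing",
               "agreement": "signed-messages",
               "messaging": "unicast"},
    "under-attack": {"recovery": "active-replication",
                     "agreement": "oral-messages",
                     "messaging": "multicast"},
}
policy_ok = validate(policy)
```

Restricting users to listed options keeps every policy within combinations that a scheduler or an analysis tool can enumerate ahead of time.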

Open Issues

Cost/benefit analysis of CR
  How much protection do we provide if the attacker knows what we’re trying to do?
  How much is performance affected by message load, active replication, etc.?
Potential integration with other OASIS projects
