KIT – The cooperation of Forschungszentrum Karlsruhe GmbH and Universität Karlsruhe (TH)
Digital On-Demand Computing Organism - DodOrg
SPP OC KolloquiumDFG SPP 1183 “Organic Computing”Augsburg, September 21/22, 2009
Talk Overview
►Project Motivation and Overview►DodOrg: Plasticity and Dynamics ►Results of third year:
Organic MonitoringOrganic MiddlewareOrganic Low Power ManagementOrganic Hardware
►Conclusion Phase II►Research Goals for Phase III
StabilityRobustness
►Phase III Summary
21.09.20092 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
DodOrg Motivation
Classic Scenario:►Only those scenarios can be handled
that were considered in advance,where the cause can be detected,where the corresponding reaction had been explicitly programmed.
►Lack of adaptation leads to insufficient reactions (e.g. shutdown …)
DodOrg Scenario:►System reaction based on indications (higher level of abstraction)
e.g. CRC/bit error rate, network bottleneck, environmental change or change on application level
►Proper reaction possible even ifScenario was not considered in advance.Cause was not detected, Reaction was not explicitly programmed.
►Flexible response to changed environmental situation
Scenario detection: recognize that something is differentAdapt to changed requirements either by known path or gradual process of rearrangement (optimization, healing)Plasticity: Stabilization but not “petrification’’
21.09.20093 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Phase II: Refined Layer Model
Brain
Lev
elOr
gan
Leve
lCe
ll Lev
el
Myo-cardial
Cell
Nervous System
Application API
Application Monitoring
HardwareMonitoring
MiddlewareMonitoring,Feedback
Application Testbed(all groups)
Organic Middleware(Brinkschulte)
Organic Processing Cells (Becker)
Distributed Low Power Management (Henkel)
Biol
ogica
l Con
sider
atio
ns (B
ränd
le)
Orga
nic M
onito
ring
Syst
em (K
arl)
HeartHormone Level Computation
Dynamic Power ManagementReal-time considerations
Temperature,Local Traffic
Proa
ctiv
e In
telli
gent
Dat
a A
naly
sis
Self-AdaptationSelf-Optimization
Self-Healing
Stable HormoneInteraction
Stable EnergyDistribution
OPC Interaction,Metrics
Plasticity Aspects
21.09.20094 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Phase II: Project Objectives
PlasticityNarrowing the dynamic properties.The ability of the system to leave a stable state in order to adapt to larger environmental changes.
DynamicsThe ability of the system to react according different parameter changes (external, internal).
+ Decreased power consumption+ Oscillation avoidance+ Overhead reduction + Reduced complexity- Delayed reaction
+ Improved performance+ Immediate reaction- Danger of oscillation- Unstable behavior
21.09.20095 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Example of Plasticity
Scenario►Link Failure Packet Deadlock►Recovery Broadcast to deliver stuck packet►Implications
Hormon cycleNetwork topology because of broken link neighbor relationship changes
►MW redistribution of task►Different Link Workload►Power Management Regulation of new Power Situation
21.09.20096 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
OPC-Swap
Scenario
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC
OPC
OPC
OPC
OPC OPC OPC OPC OPC
OPCOPCOPC OPC OPC
OPC
OPC
OPC
OPC
OPC
OPC
OPC OPC OPC OPCOPCOPC
OPC
OPC
OPC
OPC
OPC
OPC
OPC
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC
OPC
OPC
OPC
OPC OPC OPC OPC OPC
OPCOPCOPC OPC OPC
OPC
OPC
OPC
OPC
OPC
OPC
OPC OPC OPC OPCOPCOPC
OPC
OPC
OPC
OPC
OPC
OPC
OPC
No failure
Aver
age
pack
et la
tenc
y (b
lue
orga
n)
Optimization
time
Related tast group= organ
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC OPC OPC OPC
OPC
OPC
OPC
OPC
OPC OPC OPC OPC OPC
OPCOPCOPC OPC OPC
OPC
OPC
OPC
OPC
OPC
OPC
OPC OPC OPC OPCOPCOPC
OPC
OPC
OPC
OPC
OPC
OPC
OPC
Detour Routing
Link error
Recovery
OPC
OPC
OPC
21.09.20097 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Monitoring: Overview(Prof. Karl)
►AimEnable and support Self-X capabilitiesFocus on increased self-awareness
►RequirementsSustained system monitoringReal-time analysis and evaluation
Correlation of (many) eventsIdentification of problems/causes
Semantic data compressionProcessing at the source of dataGeneration of meta-data
Adaptivity (reconfiguration)InterfacingScalability
Application API
Application Monitoring
HardwareMonitoring
MiddlewareMonitoring,Feedback
Application Testbed(all groups)
Organic Middleware(Brinkschulte)
Organic Processing Cells(Becker)
Distributed Low Power Management (Henkel)
Biological Considerations (Brändle)
Organic Monitoring System (Karl)
Hormone Level Computation
Dynamic Power ManagementReal-time considerations
Temperature,Local Traffic
Proactive Intelligent Data A
nalysis
Self-Adaptation
Self-Optimization
Self-Healing
Stable Hormone
Interaction
Stable Energy
Distribution
OPC Interaction,
Metrics
Plasticity Aspects
Application Hardware
Monitoring
Raw
& C
ooked
Data
Feedback
Configuration
Requirem
ents
Status
Status
Middleware
Low-Power Planning
21.09.20098 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Monitoring: Event Monitoring and Aggregation(Prof. Karl) Hormone-Inspired
Event Specification and Transmission
Meta-Events
►Decentral Event Monitoring and AggregationBased on the hormone concept, Organic Monitoring Modules (MM) monitor the hormone concentration in each OPCEach hormone is treated as an individual eventEvents are aggregated in an Associative Counter Array (ACA)
Association of events to countersCache principleEvent transmission upon overflow or replacement
Events transmitted are forming so-called Second Messenger or Meta-EventsMeta-Events are then transmitted to High-Level instances for data analysis
Event Counter Modulo
Hormones
EventAggregation
21.09.20099 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Monitoring: Intelligent Data Analysis Techniques(Prof. Karl)
►State Classification in High-Level Monitoring
Several High-Level Monitoring Units (HLM) operate on Meta-Events generated by several Organic Monitoring modulesIncoming Meta-Events are stored in so-called Event Lists which are sorted by hormone typeState Classification algorithms are using one or more Event Lists to classify the state of the (sub-) system as good, neutral or badIf necessary, appropriate actions are triggered by HLM instances
►Application Scenario – Power ConsumptionThe power consumption of OPCs is monitored by dedicated hardware sensorsScenario: encoding of a wav-file using lameA hormone is generated each time 10 mW are consumed by the CPUState Classification every 16 Meta-EventsSimulation time: ~ 4 Billion Cycles
~ 300 Mio. HormonesTraining Phase: 1 000 000 CyclesResults: 282,292 state classifications
Used for Training: 76Good: 242,590 Neutral: 35,096Bad: 4,530
21.09.200910 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Middleware: Re-Intruduction of the Artificial Hormone System(Prof. Brinkschulte)►Aim:
Mapping tasks on Organic Processing Elements (OPC)Providing the system with self-X properties on the middleware layer:
Self-ConfigurationSelf-HealingSelf-Optimization
Achieving a good mapping inregards to
Requirements of each taskRelationships of the tasksCondition of each cell and it’s neighborhood
Reacting and Adapting to changes (plasticity)
e.g. increased bit-rate errorsReaching stable mapping conditions
R
OPCR
OPC
R
OPCR
OPC
R
OPCR
OPC
R
OPCR
OPC
Organic Middleware: New Challenges of the Second Phase(Prof. Brinkschulte)
Hormone Concept - Evaluation and Refinement:Analyzing the network load in the systemGenerating hormone configurations with evolutionary algorithms
Stability:Stability of the system of individual OPCs with Hormone Cycles
a bounded input (configuration + measured data) results in an bounded outputno oscillation of tasks and no unnecessary task allocations
Examination of Stability & Robustness: Finding limits of systems:
How many tasks/jobs are suitable for a given number of OPCsCalculating minimum requirements of the system:
How many OPCs will be needed for a given scenario (at a minimum)
21.09.200912 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Middleware: Phases of the Configuration(Prof. Brinkschulte)
Phase 1:Strong Eagervalues, weak Suppressors and weak Accelerators every cell wants to execute tasksSuppressors rise when tasks are mapped
Phase 2:Strong Suppressors, Eagervalues to zero System will remain in a stable stateReconfiguration and Self-Optimization (fluctuation of the hormones)
21.09.200913 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Low Power Management: DodOrg Interfaces(Prof. Henkel)
Cost Function
Organic MiddlewareInfluencing
Hormone Expression
Power / RTManager
Organic M
onitoring
Consume
Fade
Trade &Negotiate
Policy
Energy BudgetManager
Local EnergyBudget P4
P3
P2
P1
Fill
EnergyInput
EfficiencyRT criteria
Temperature
Local traffic
Power sourceP
eers
(in
neig
hbor
ed O
PC
s)
OPC
Voltage / frequencysetting
PowerStates
AssignedTasks
ScheduledTasks
data / informationactions
Legend:
Future energy levelActual energy level
Energy level
Actual power state
► Organic MiddlewareCost Function
Used for computaion of local eager values
► Organic MonitoringExternal Cost Function parameters
Used to select apt power management policy
► OPCConfigure Power State
21.09.200914 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Low Power Management: Managing Energy-Distribution(Prof. Henkel)
Cost Function
Organic MiddlewareInfluencing
Hormone Expression
Power / RTManager
Organic M
onitoring
Consume
Fade
Trade &Negotiate
Policy
Energy BudgetManager
Local EnergyBudget P4
P3
P2
P1
Fill
EnergyInput
EfficiencyRT criteria
Temperature
Local traffic
Power sourceP
eers
(in
neig
hbor
ed O
PC
s)
OPC
Voltage / frequencysetting
PowerStates
AssignedTasks
ScheduledTasks
data / informationactions
Legend:
Future energy levelActual energy level
Energy level
Actual power state
► Energy Distribution: goalsLow energy consumptionAvoidance of local thermal hot-spotsConvergent system behavior (plasticity)
► Energy Distribution: main conceptEach OPC has a Local Energy Budget
Determines the local available energyGlobal Power Source
Assigns energy budgets to OPCs (pulse-based)
Energy Budget ManagerAgent controlling Local Energy BudgetNegotiates & Trades energy budget with neighboring OPCsInfluences Power Manager policies
21.09.200915 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Low Power Management: Agent Negotiation(Prof. Henkel)
OPC requiresenergy to rundemanding task
OPC negotiates energybudget with neighborsusing cost function basedon supply and demand
OPC
Local EnergyBudget
Energy Flow
Energy Cost
0/3
0/3 0/3
0/3 0/30/3
0/3 0/3
0/3
Energy units used/free
0/3
0/1 0/3
7/7 0/30/3
0/1 0/3
0/3
Energy units used/free
0/2
0/2 0/3
7/0 0/30/3
0/2 0/3
0/2
21.09.200916 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Low Power Management: Power Trading – Getting buyand sell Values(Prof. Henkel)
Sell value:
Buy value:
Sell value must also consider running tasks:
Sell value
Buy value
Wan
ts to
buy
Wan
ts to
sel
l
Free units (used units constant)
1 2 3 4 5
Threshold atstable-state
Weights and γ are dependant on OPC type, total amount of energy units and number OPCs
pi is priority of task i
sell
-buy
sell
-buy
Agent of OPC n sells to neighbor i if:(selln – buyn) – (selli – buyi) > threshold
21.09.200917 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Low Power Management: Power Trading – Agent Communication(Prof. Henkel)
0
10
20
30
40
50
60
70
80
90
100
1 5 9 13 17 21 25
time (pulses)
Tota
l age
nt c
omm
unic
atio
n
1 taskmapped
2 tasksmapped
3 tasksmapped
zero communication indicates stable state is reached
21.09.200918 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Processing Cells Work plan: Close Control Loop Effects, Metrics, Cost Functions (Prof. Becker)
Foundation laid byDNA-configuration controlAdaptive routingAutomated test systemHardware prototype
ChallengesDynamics of cell interactionInterference with Middleware/Low-Power Management
Research GoalsMetricsOptimizationPlasticity
Optimization & Evaluation• HW Structure• Cell Interaction• Guarantees, Stability
Comparison to classical approaches
Metrics• Power• Availability• Performance• …
Abstract OC-HW Model
Library ofOC-CellsLibrary ofOC-CellsLibrary ofOC-Cells
Input: Organic Hardware Architecture (Phase I)
Determine level ofsystem abstraction
21.09.200919 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Processing Cells: Close Control Loop Effects: Example Link Fault Detection (Prof. Becker)
Analysis of neighbor broadcast Broadcast uses flooding Assumption on fault free link: receive neighbor BC over adjacent link
BC-T
artNoC 1 artNoC 2
artNoC 3 artNoC 4Link Fault None
Routing Status Fault free
Link Fault None
Routing Status Fault free
C4
C4
C4
C4
South
C2
Detour S
C2
OPC-Status Interface
OPC-Status Interface
Broadcast-TokenBroadcast messagefrom cell 4
BC-T
Broken link
include statusBC-Round 1
C4
BC-Round 2
C2
21.09.200920 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Accu
mulat
ed ad
dition
al ha
rdwa
re co
st pe
r OPC
[TSM
C 13
0]
Poten
tial fl
exibi
lity/ fa
ult to
leran
ce
Organic Processing Cells: OPC-Hardware Cost –Synergetic Effects (Prof. Becker)
Basic OPC;µC-DP, xy-routing artNoC
Broadcast and Multicast
Error recovery;deadlock, livelock, dead end
Full adaptive routing
Fault aware routing
Observer; Link failure detection,deadlock detection
Status interface;OPC node and link status,OPC heartbeat
Clock and Power Management
Observer ;temperaturemonitoring
One cycle clock switch
Adaptive routing; West First
0,024
mm2
0,007
mm2
0,016
mm2
0,01 m
m2
0,005
mm2
0,02 m
m2
0,04 m
m2
FPGA
Impl.
FPGA
Impl.
FPGA
Impl.
21.09.200921 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Processing Cells : Xilinx-Virtex-II Pro Hardware Prototype (Prof. Becker)
Power Measurement Setup
►2D-Partial and dynamic reconfiguration of FPGA-datapath
►OPC-based distributed clk-management
►Core OPC functionality
►Derive cost and performance figures for OPC HW-model
21.09.200922 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Processing Cells : OPC Power Model (Prof. Becker)
Power Down
Change Power Mode
Unspecified/Empty
ReconfiguringPower Up
Specified/Programmed
Idle
Processing
UnconnectedConnected
OPC- Model
Cp = 0 mW
Cp = 76 mW
Cp 100 MHz = 100 mW
Cp100 MHz = 120 mW
Cp = 10 mW
Cp = 86 mW(Power Up/Down)
Change Power Mode(Clock Scaling)
Cp = 82 mW
Cp 100 MHz = 92 mW
Cp 100 MHz = 100 mW
40µs40µs
880 µs/Col
21.09.200923 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Conclusion and Plans for Phase III
►Outcome of Phase IIConcepts individually tested and applicability provenMonitoring: hormone-inspired associative event coding and use of associative countersMiddleware: reaching stable hormone and mapping situations while still being able to react to changes (plasticity)Low-Power-Processing: local agent-based energy budget distributionProcessing Cells: abstract OC-hardware model and evaluation of cell interaction metrics and cost-functions
21.09.200924 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Outlook: Stability Phase III
►Phase IIISelf-optimization scenariosConflict avoidance throughproactivityCell-level low-power issues and interactionOff-chip communicationRobustness during Development/Processing PhasePrototype implementation: application phase detection and fault-tolerance
21.09.200925 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Monitoring: Outlook(Prof. Karl)
►Trend DetectionPrediction of future system statesIdentification of potentially harmful system states in advance
►Avoiding Potential Conflicts through ProactivityBased upon Trend Detection and Event CorrelationInitiating required system chances to avoid bad or harmful system states (e.g. high temperature)Proactive self-healing and self-optimization
Past System States CurrentSystem State
PredictedSystem States
Avoid this state
21.09.200926 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Middleware: Outlook of the Third Phase(Prof. Brinkschulte)
► Robustness and StabilityExamination of dynamic aspects of task (re-)allocation and operation during normal conditions (also in the presence of internal or external disturbances)System changes due to Self-Adaptation and Self-Optimization must lead to new stable conditionsRobustness against mal-behaving internal/external components (comparable to illness in a biological system)
Being able to react to „ill“ OPCsCounter-measures against malicious attacks
Immune mechanisms for advanced Self-Healing and Self-Protecting Aspects to increase Robustness (together with the Organic Processing Cell Capabilities and the Organic Monitoring)Examination and Evaluation of different Self-optimization ScenariosQuality Analysis of the Artificial Hormone System
21.09.200927 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Low Power Management: Outlook(Prof. Henkel)
Thermal test simulation
Thermal issues are a major concernfor the robustness of the DodOrg System and are directly influenced bythe power distribution
Thermal models need to be developed andincorporated in the power negotiation processImprovement of negotiation process using econ-omic learning
Thermal-aware power negotiation21.09.200928 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Processing Cells: Outlook Phase III Stability and Robustness (Prof. Becker)
OPC-Lifecycle
Development Phase:
Reconfiguration
Processing Phase:
Calculation
Global View: Phases are concurrent within DodOrg Cell-Array
Ongoing Change
Adaptation Mechanisms
Resulting Configurations
21.09.200929 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Organic Processing Cells: Outlook Phase III Goals (Prof. Becker)
►Chip To Chip CommunicationSeamless and transparent expansion of the on chip communication servicesDynamic OPC resource pool physical growth of the whole DodOrg organism New challenges for all subprojects
►Robustness on hardware-level during development phaseScope
Loading new ConfigurationEstablish-Inter-Cell-DatapathPower Up/Down Cells
Goal : Reach Fail Safe State after development phase ►Robustness on hardware-level during processing phase
ScopeOPC-Datapath (packet sender)OPC to OPC communication Path ( artNoC-Network)
Goal: Cell immune System with Cell-Input/Output-Guidance Mechanism
21.09.200930 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Questions?
Thank you for your attention!
21.09.200931 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
Application Testbed(all groups)
Organic Middleware(Brinkschulte)
Organic Processing Cells(Becker)
Organic Low Power Management (Henkel)
Orga
nic M
onito
ring S
ystem
(Kar
l)
List of Publications:
21.09.200932 SPP 1183 OC Kolloquim – Augsburg, 21.-22. September 2009
• D. Kramer, R. Buchty, and W. Karl, “A Scalable and Decentral Approach to Sustained System Monitoring“, ACACES,2009
• R. Buchty and W. Karl, “Design Aspects for Self-Organizing Heterogeneous Multi-Core Architectures“, IT - Information Technology Journal 5/08, 2008
• R. Buchty, D. Kramer, and W. Karl, “An Organic Computing Approach to Sustained Real-time Monitoring“, BICC08, 2008
• R. Buchty, O. Mattes, and W. Karl, “Self-aware Memory: Managing Distributed Memory in an Autonomous Multi-master Environment,“ ARCS, 2008
• R. Buchty and W. Karl, A Monitoring) “Infrastructure for the Digital on-demand Computing Organism (DodOrg)“, IWSOS, 2006
• Hans-Peter Löb, Rainer Buchty, Wolfgang Karl, “A Network Agent for Diagnosis and Analysis of Real-time Ethernet Networks“, CASES, 2006
• U. Brinkschulte and A. von Renteln, “Analyzing the Behavior of an Artificial Hormone System for Task Allocation”, ICATC, 2009
• U. Brinkschulte , A. von Renteln, and M. Weiss, “Examining Task Distribution by an artificial hormone system based middleware”, ISORC, 2008
• U. Brinkschulte, M. Pacher and A. von Renteln, “An Artificial Hormone System for Self-Organizing Real-Time Task Allocation”, in Organic Computing, 2007
• U. Brinkschulte, A. von Renteln, and M. Pacher, “Reliability of an Artificial Hormone System with Self-X Properties”, PDCS, 2007
• T. Ebi, M. A. Al Faruque, and J. Henkel, “TAPE: Thermal-aware Agent-based Power Economy for Multi/Many-Core Architectures”, ICCAD 2009 (accepted)
• M. Shafique, L. Bauer, and J. Henkel, “REMiS: Run-time Energy Minimization Scheme in a Reconfigurable Processor with Dynamic Power-Gated Instruction Set” , ICCAD 2009 (accepted)
• M. A. Al Faruque, R. Krist, J. Henkel: ”ADAM: Run-time Agent-based Distributed Application Mapping for on-chip Communication", DAC 2008
• C. Schuck, B. Haetzer, and J. Becker, “An Interface for a Decentralized 2d-Reconfiguration on Xilinx Virtex-FPGAs for Organic Computing“, ReCoSoC, 2008
• C. Schuck, M. Kuehnle, M. Huebner, and J. Becker, “A framework for dynamic 2D placement on FPGAs“ , IPDPS, 2008
• C. Schuck, S. Lamparth, J. and Becker, ”artNoC - A Novel Multi-Functional Router Architecture for Organic Computing”, FPL, 2007