simulation, emulation sathish vadhiyar sources / credits: microgrid, simgrid


Upload: alban-cox

Post on 02-Jan-2016


Page 1: Simulation, Emulation Sathish Vadhiyar Sources / Credits: Microgrid, Simgrid

Simulation, Emulation

Sathish Vadhiyar

Sources / Credits: MicroGrid, SimGrid

Page 2

Importance

Needed for characterizing behavior of Grid systems in the future

During development period, to test methodologies under repeatable conditions

For simulating “what if” scenarios

Needed when there is no real grid. Needed in India

Page 3

MicroGrid

Enables systematic design and evaluation of middleware, applications, and network services for computational Grid.

Provides an environment for scientific and repeatable experiments.

MicroGrid can also predict performance on futuristic and fictional topologies

Features
  Enables use of Globus applications without change by virtualizing execution environment providing the illusion of virtual Grid.
  Uses global virtual time to preserve simulation accuracy
  Provides basic resource simulation models for computing, memory and networking

Page 4

Virtualizing resources

Uses mapping table for mapping from virtual IP address to physical IP address

Intercepts relevant library calls
  gethostbyname
  bind, send, receive
  Process creation – process created through Globus resource management functions

User will be logged in directly to a physical host and submit jobs to virtual hosts

Globus gatekeeper, job managers and client hosts run on virtual hosts

All socket interfaces and information services are also virtualized
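
The mapping-table idea above can be sketched as follows. This is only an illustration, not MicroGrid code: the host names, addresses, and the function names virtual_gethostbyname and to_physical are all hypothetical, and a real interposition layer would wrap the C library calls rather than live in Python.

```python
import socket

# Hypothetical mapping table: virtual host name -> (virtual IP, physical IP).
# Every name and address below is made up for illustration.
MAPPING_TABLE = {
    "vhost1.grid": ("10.0.0.1", "192.168.1.11"),
    "vhost2.grid": ("10.0.0.2", "192.168.1.11"),  # two virtual hosts on one physical host
    "vhost3.grid": ("10.0.0.3", "192.168.1.12"),
}

# Reverse index used when a virtual address must be turned into a real one.
VIRTUAL_TO_PHYSICAL = {vip: pip for vip, pip in MAPPING_TABLE.values()}


def virtual_gethostbyname(name):
    """Stand-in for an intercepted gethostbyname: virtual hosts resolve to virtual IPs."""
    if name in MAPPING_TABLE:
        return MAPPING_TABLE[name][0]
    return socket.gethostbyname(name)  # anything else goes to the real resolver


def to_physical(virtual_ip):
    """Translate a virtual IP into the physical IP that actually hosts it."""
    return VIRTUAL_TO_PHYSICAL[virtual_ip]
```

The application only ever sees virtual addresses; the translation to a physical address happens inside the intercepted bind/send/receive path.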

Page 5

Global Coordination

Simulation Rate (SR) – rate at which simulator runs. How much of real CPU is simulator using.

Minimum feasible simulation rate depending on desired virtual resources and actual capacities of physical resources

Minimum value of SR over all resources – fastest rate at which simulation can be run in a functionally correct manner

Page 6

Simulation Rate Examples

Given physical = 1 GHz, virtual = 2 GHz, simulation rate cannot be less than 2. Otherwise you will be guaranteeing more than 100% CPU usage!

Given physical = 2 GHz, virtual = 1 GHz, simulation rate cannot be less than 0.5. Same argument.
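
The two examples follow one rule: a resource needs SR >= virtual / physical. A small sketch of that arithmetic is below. The function names are hypothetical, and the global step assumes that "minimum value of SR over all resources" means the smallest SR that every resource can sustain, which is the largest of the per-resource lower bounds.

```python
def min_feasible_sr(physical_ghz, virtual_ghz):
    """Lower bound on the simulation rate for one resource.

    Below this value the host would have to deliver more than 100% of its CPU.
    """
    return virtual_ghz / physical_ghz


def global_sr(resources):
    """Smallest SR usable by every (physical, virtual) pair.

    Assumption: the whole simulation shares one SR, so the most demanding
    resource sets the bound.
    """
    return max(min_feasible_sr(phys, virt) for phys, virt in resources)


# The two slide examples, then both resources in one simulation.
print(min_feasible_sr(1.0, 2.0))
print(min_feasible_sr(2.0, 1.0))
print(global_sr([(1.0, 2.0), (2.0, 1.0)]))
```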

Page 7

More

Another parameter (say x) that determines how fast time progresses in the application

Greater the value, faster the time progresses in the application

Calls like gettimeofday and select use these parameters to return appropriate adjusted times

Thus virtual cpu twice the speed of real cpu, simulation rate = 2, and x = 2 will give ½ the time for a code fragment

Page 8

Resource Simulation

Page 9

Resource Simulation

Simulation rate is divided equally across all processes executing on the physical host

The resulting fractions are then enforced by local MicroGrid CPU scheduler

It is a scheduler daemon using signals to allocate local physical CPU capacity to local MicroGrid tasks

Page 10

How to ensure CPU usage

Naïve strategy - Calculate usage for procs. on virtual machine. Give all procs. the same usage.

E.g. if (virtual / physical) is 25% and 2 procs. running on virtual machine, assign each process 10 milliseconds every 80 milliseconds.

Not good
  An application process should always be ready to run if it has not used its available CPU slots
  A computation intensive process should be able to fully utilize the quota for virtual machine
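
The arithmetic behind the naïve example can be checked with a short sketch. The function name and parameters are hypothetical; the 80 ms period is the one used on the slide.

```python
def naive_slice_ms(virtual_over_physical, num_procs, period_ms):
    """CPU time per process per period under the naive equal-split strategy."""
    per_process_fraction = virtual_over_physical / num_procs
    return per_process_fraction * period_ms


# 25% of the physical CPU, 2 processes, 80 ms period.
print(naive_slice_ms(0.25, 2, 80))
```

Each process is capped at 12.5% of the CPU, so a lone compute-bound process can never use the full 25% quota of its virtual machine, which is the second objection listed above.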

Page 11

MicroGrid CPU Controller

One CPU controller runs on each physical host

Uses SIGSTOP and SIGCONT to stop and continue processes

Consists of 3 parts
  Live process interception – whenever a virtual process is created or destroyed on MicroGrid using main() or exit(), the CPU controller traps it and updates its process table
  CPU usage monitoring – every sliding window, the controller reads CPU usage from /proc of processes in its process table
  Process scheduling – the controller calculates CPU usage of each virtual host in a time window. If the amount of effective cycles exceeds the speed of the virtual host, the controller sends SIGSTOP to all processes of that virtual host; otherwise, it wakes up the processes and lets them proceed
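
The process scheduling step can be sketched as below. This is a minimal illustration and not the MicroGrid controller: the function name, its parameters, and the idea of passing usage and quota as fractions of the physical CPU are assumptions. It uses only os.kill and the standard POSIX signals named on the slide, so it is Unix-only.

```python
import os
import signal


def schedule_virtual_host(pids, used_fraction, quota_fraction, send=os.kill):
    """Stop or resume every process of one virtual host for the next window.

    used_fraction  : CPU share the host's processes consumed in the window
                     (in the real controller this would be read from /proc)
    quota_fraction : CPU share the virtual host is entitled to
    send           : injectable for testing; defaults to os.kill
    """
    over_quota = used_fraction > quota_fraction
    sig = signal.SIGSTOP if over_quota else signal.SIGCONT
    for pid in pids:
        send(pid, sig)
    return sig
```

Making the signal sender a parameter keeps the decision logic testable without touching real processes.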

Page 12

CPU Controller

Page 13

Determining sliding window size

E - design accuracy error

p - scaled virtual machine speed (fraction of physical CPU)

w - the sliding window size in jiffies

n - the available jiffies in a sliding window

n should satisfy: w = round(n/p) and | 1 - n/(p*w) | < E

Find the smallest n that satisfies equation | 1 - (n/p)/round(n/p) | < E, then find w.

Page 14

Example

Real machine – 1 GHz

Virtual machine – 600 MHz

Simulation rate – 2

E – 0.05

p = 600/1000 = 60%, with simulation rate 2, it is 30% real cpu

Smallest n that satisfies | 1 - (10n/3) / round(10n/3) | < 0.05

Try n = 1, 2, 3…

Here, n = 2

w = 7
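
The search for n and w can be written out directly. The sketch below is illustrative: the function name is hypothetical, exact fractions are used to avoid floating-point surprises, and round() is assumed to mean round-half-up (Python's built-in round uses round-half-to-even, so it is not used here).

```python
from fractions import Fraction
from math import floor


def sliding_window(p, error):
    """Return the smallest n, and w = round(n/p), with |1 - (n/p)/w| < error.

    p     : fraction of the physical CPU given to the virtual machine, 0 < p <= 1
    error : design accuracy error E, must be positive
    """
    p, error = Fraction(p), Fraction(error)
    if not 0 < p <= 1 or error <= 0:
        raise ValueError("need 0 < p <= 1 and error > 0")
    n = 1
    while True:
        ratio = Fraction(n) / p              # ideal, possibly fractional, window size
        w = floor(ratio + Fraction(1, 2))    # round half up to whole jiffies
        if abs(1 - ratio / w) < error:
            return n, w
        n += 1


# Slide example: p = 30% of the real CPU, E = 0.05.
print(sliding_window(Fraction(3, 10), Fraction(1, 20)))
```

For n = 1 the error is |1 - (10/3)/3| = 1/9, which is too large; for n = 2 it is |1 - (20/3)/7| = 1/21, about 0.048, which is below 0.05.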

Page 15

Network Simulation

Based on MaSSF – a scalable packet-level network simulator that supports direct execution of unmodified applications

Uses a distributed simulation engine

Can model many kinds of network protocols including TCP/IP, UDP, user-defined protocols etc.

Intercepts live network streams at the socket level using a wrapper library called WrapSocket

Page 16

Live traffic interception

Page 17

Scalability

Given a network topology and available cluster nodes, MaSSF partitions the virtual network into multiple blocks and assigns each block to a cluster node

Every cluster node runs a discrete event simulation engine

Events are exchanged among simulation engines. Cluster nodes also need to synchronize periodically. Involves traffic

Page 18

Scalability

Hence network mapping has to be done carefully to minimize communication of simulation events between simulation engine nodes and to achieve load balance across partitions

Network mapping problem modeled as graph partitioning problem – can estimate the number of simulation events on each single link and use it to calculate edge weight.

Page 19

Improving scalability

Graph partitioning for network mapping problem
  Input graph – traffic information (defines edge weights), network structure
  Constraints – weighted sum of computation and memory requirement on each simulation engine node (vertex weight) to be balanced among multiple vertices
  Objectives – communication across partitions (edge-cut) to be minimized
  Partitioned network defines the mapping of simulated network nodes to physical resources
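
The two quantities a partitioner trades off, edge-cut and per-partition load, can be computed for a candidate mapping as sketched below. This only evaluates a given partition; it is not the partitioning algorithm MaSSF uses, and the graph, weights, and function names are made up for illustration.

```python
def edge_cut(edge_weight, part):
    """Total weight of edges whose endpoints are mapped to different partitions."""
    return sum(w for (u, v), w in edge_weight.items() if part[u] != part[v])


def partition_loads(vertex_weight, part):
    """Sum of vertex weights (computation + memory estimate) per partition."""
    loads = {}
    for vertex, weight in vertex_weight.items():
        loads[part[vertex]] = loads.get(part[vertex], 0) + weight
    return loads


# Hypothetical 4-router virtual network: edge weight = estimated events on the link.
edge_weight = {("a", "b"): 5, ("b", "c"): 1, ("c", "d"): 4}
vertex_weight = {"a": 2, "b": 1, "c": 1, "d": 2}
part = {"a": 0, "b": 0, "c": 1, "d": 1}   # simulated node -> simulation engine node

print(edge_cut(edge_weight, part))           # only the light b-c link is cut
print(partition_loads(vertex_weight, part))  # both engines carry the same load
```

Cutting the lightly loaded b-c link keeps the heavy a-b and c-d traffic local to one engine each while the vertex weights stay balanced.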

Page 20

Real applications on MicroGrid - Lot more to do…

Page 21

SimGrid

You know it

Page 22

References / Sources / Credits

Validating and Scaling the MicroGrid: A Scientific Instrument for Grid Dynamics, Xin Liu, Huaxia Xia, and Andrew Chien, to appear in the Journal of Grid Computing.

The MicroGrid: a Scientific Tool for Modeling Computational Grids, in Proceedings of SC2000 (Song, Liu, Jakobsen, Bhagwan, Zhang, Taura and Chien)

Simgrid: A Toolkit for the Simulation of Application Scheduling. CCGrid 01

Page 23

JUNK!

Page 24

Calls

Setting up the simulated application and computation environment

Simulating the application execution once the tasks have been assigned to resources – SG_simulate

Scheduling algorithms
  Based on performance prediction – SG_getPrediction
  Implementation of scheduling decision – SG_scheduleTaskOnResource
  Also supports runtime scheduling algorithms. Control must be returned from SG_simulate to the scheduling algorithm itself. For a work queue, control is returned after each task completes. For others, user can specify how long a simulation should run before control is returned. SG_unscheduleTask can be used to modify scheduling decisions for tasks. Many API calls help the user to keep track of past scheduling decisions.

Page 25

SG_getclock returns virtual global time

Can do post mortem analysis with the help of resource usage and start and end times and compute various metrics and how the simulation behaved

Page 26

SimGrid-2 paper

Simulations allow
  Repeatable experiments
  To explore wide range of application and resource scenarios

Simgrid
  For developing and evaluating scheduling algorithms
  Objectives – good usability, fast simulations, configurable, tunable and extensible simulations, scalable
  Aim towards simulation standardization

Page 27

Simgrid components

Agent – implements scheduling algorithm, contains code, private data and location

Location – where agent runs, defined by location, mail boxes for communicating with other agents and private data

Task – defined by amount of computing, data size, private data

Path – routing abstractions

Channel – abstraction representing communication between agents

Page 28

Simulation program steps

Definition of code for each agent
  Modeling application
  Done with MSG_Task_Get, MSG_Task_Put, MSG_Task_Execute

Creation of resources
  Modeling the physical platform
  Hosts, links, routing table paths
  MSG_host_create, MSG_link_create, MSG_routing_table_set

Creation and allocation of agents to locations
  Application deployment
  MSG_process_create

Starting simulation
  MSG_main

Page 29

Resource sharing is supported by SimGrid by supporting different models
  FIFO
  FRFO
  SHARED – fair sharing or priority-based sharing

Challenges
  Users to construct large simulated platforms
  To simulate the complex network contention behaviors of applications executing on these platforms

Page 30

Modeling grid topologies

Simgrid allows users to import platform descriptions obtained with Effective Network View (ENV).

Thus SimGrid uses ENV and NWS to instantiate platform models which represent realistic platforms both in terms of topology and in terms of traffic.

Page 31

Bandwidth sharing models

Algorithm first considers all bottleneck links and flows on these links

Assigns a bandwidth to flows on these links inversely proportional to their RTTs.

Algorithm reduces bandwidths on the links traversed by these flows

Process repeated until bandwidths are assigned to all flows

Simgrid makes it possible to define two types of links: those where bandwidth is shared and those where bandwidth is not shared

Good for modeling grid computing topology where local networks are connected by a shared backbone
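
The sharing rule for one bottleneck link can be sketched as below. This covers only the single-link step (shares inversely proportional to RTT); the iteration over remaining links described above is not implemented, and the function name, units, and sample values are hypothetical.

```python
def share_bottleneck(capacity, rtts):
    """Split one link's capacity among flows, inversely proportional to each RTT.

    capacity : link bandwidth (any unit, e.g. Mb/s)
    rtts     : round-trip time of each flow crossing the link (any unit, all > 0)
    """
    weights = [1.0 / rtt for rtt in rtts]
    total = sum(weights)
    return [capacity * w / total for w in weights]


# Equal RTTs give equal shares; a flow with 3x the RTT gets 1/3 the share of the other.
print(share_bottleneck(100.0, [1.0, 1.0]))
print(share_bottleneck(100.0, [1.0, 3.0]))
```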

Page 32

GridSim

Individual resource brokers and central schedulers

Page 33

Simjava

Simulations in Simjava contain a number of entities, each running in its own thread

Entities call simulation functions (sim_schedule, sim_hold, sim_wait) and events are generated.

Page 34

Every event has source entity and destination entity

Page 35

NPB with MicroGrid

Page 36

Scheduling quanta length and Modeling Accuracy

Page 37

Internal Performance

NPB run on real Alpha cluster of 4 machines and on MicroGrid with CPU fraction 4%

The periodic execution times obtained every 1 second for Alpha cluster and _? second(s) for MicroGrid

Close match with root mean square percentage difference to be 3.08%
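
One common way to compute a root mean square percentage difference between two series is sketched below. The slides do not give the exact formula used, so this definition (relative difference against the real-cluster value, squared, averaged, square-rooted, times 100) is an assumption, and the sample numbers are hypothetical, not the NPB measurements.

```python
from math import sqrt


def rms_percentage_difference(reference, measured):
    """RMS of the relative differences between two equal-length series, in percent."""
    if not reference or len(reference) != len(measured):
        raise ValueError("series must be non-empty and of equal length")
    squares = [((m - r) / r) ** 2 for r, m in zip(reference, measured)]
    return 100.0 * sqrt(sum(squares) / len(squares))


# Hypothetical progress samples: real cluster vs. emulated run.
print(rms_percentage_difference([10.0, 20.0, 30.0], [10.2, 19.5, 30.6]))
```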