ECAI 2014 Tutorial on a statistical analysis tool for agent-based simulations (MEME)

Upload: tamasmahr

Post on 20-Aug-2015


TRANSCRIPT

  1. Discussion - Matthias Meyer. DESIGNING AND EXECUTING COMPUTATIONAL EXPERIMENTS WITH MEME. Dr. László Gulyás, ELTE & AITIA ([email protected]); Dept. of History and Philosophy of Science, Loránd Eötvös University, Budapest; Intelligent Applications and Web Services, AITIA International, Inc.
  2. Computational Experiments with MEME, Dr. László Gulyás. Aims of this talk: provide practical help with designing and executing computational experiments (i.e., simulations); show how to carry out Design of Experiments (DoE) in practice; introduce the Model Exploration Module (MEME). Questions addressed: What are the practical challenges of designing and executing computational experiments? How to implement and execute the DoE approach in practice?
  3. Overview: ABM research process
  4. Overview: ABM research process
  5. Motivation (Lorscheid, Heine, Meyer)
  6. Motivation (cont'd): Help to make simulations an open box. Help to understand the complex behavior of social simulations. Increase the methodological standards of social simulations, e.g., in comparison with more traditionally experimental sciences.
  7. The difficulties of Agent-based Simulations: Modeling social systems is a great challenge, theoretically, methodologically, and computationally. Grasping what is important in a complex social system and understanding what drives its dynamics is hard, but it also makes the endeavor interesting and inspiring. Turning the developed model into a simulation (i.e., implementing the abstract roles and rules in a computer program) requires skills and hard work. Luckily, this is helped today by several software tools and packages: Repast, Repast Simphony, NetLogo, MASON.
  8. The difficulties of Agent-based Simulations (cont'd): Modeling social systems is a great challenge, theoretically, methodologically, and computationally. Understanding the complex behavior of the created simulation, with all its sensitivities and non-linearities, is another important challenge in the research process of modeling social systems. Facing this challenge requires methodological knowledge and tools that help apply it.
  9. Agenda (Lorscheid, Heine, Meyer)
  10. The General Approach: Computer simulations are experiments in which the experimenter tries to determine how the system's response (output) depends on controllable factors (parameters). One may also want to do replicates (cf. RNG seeds). Diagram: parameters (p1, p2, p3, p4, ...) -> System -> responses (r1, r2, r3, r4, ...).
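
The experimental view above can be sketched in a few lines of Python. This is only an illustration: the toy `run_model` function, its factors `p1` and `p2`, and the seed range are hypothetical stand-ins and are not part of MEME.

```python
import random
import statistics

def run_model(p1, p2, seed):
    """Hypothetical stochastic model: the response depends on two factors plus noise."""
    rng = random.Random(seed)  # a dedicated RNG makes each replicate reproducible
    return 2.0 * p1 + p2 + rng.gauss(0.0, 0.1)

def replicate(p1, p2, seeds):
    """Run one parameter combination once per seed and summarize the responses."""
    responses = [run_model(p1, p2, s) for s in seeds]
    return statistics.mean(responses), statistics.stdev(responses)

mean, sd = replicate(p1=1.0, p2=0.5, seeds=range(10))
print(f"mean response = {mean:.3f}, std dev = {sd:.3f}")
```

Fixing the seed per replicate means that any single run can be repeated exactly, while the spread across seeds shows how much of the response is noise.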
  11. Agenda (Lorscheid, Heine, Meyer)
  12. Agenda (Lorscheid, Heine, Meyer)
  13. Practical Steps of (6) Performing a Simulation Experiment: Set the parameters (factor values): combinations or levels. WHAT to record: variables (time series), agent variables (changing length!), derived values (statistics). WHEN to record: @end, @timestep, @N timesteps, @condition. WHEN to STOP the simulation: fixed number of steps, condition reached, etc. WHERE to execute: local computer, local cluster, grid, cloud (comfort, pricing).
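
The recording and stopping choices listed above could look roughly like this in a hand-written run loop. The function `run_experiment`, its arguments, and the decaying toy model are assumptions made for illustration; MEME configures these choices through its own interface, which is not shown in the slides.

```python
def run_experiment(step, state, record_every=10, max_steps=1000, stop=lambda s: False):
    """Advance a model, recording every `record_every` ticks until a stop rule fires."""
    log = []
    tick = 0
    while tick < max_steps:              # WHEN to stop: fixed number of steps
        state = step(state)
        tick += 1
        if tick % record_every == 0:     # WHEN to record: every N time steps
            log.append((tick, state))    # WHAT to record: here, the scalar state
        if stop(state):                  # WHEN to stop: condition reached
            break
    return tick, state, log

# Hypothetical toy model: a quantity that decays by 10% per tick.
ticks, final, log = run_experiment(lambda s: 0.9 * s, 1.0, stop=lambda s: s < 0.01)
print(ticks, final, len(log))
```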
  14. Practical Steps of (6) Performing a Simulation Experiment (cont'd): Collect the results: when using more than a single core/computer, result files end up dispersed. Assemble the result set: the ordering of the records (table rows) could be arbitrary, and the number of columns may vary between rows (e.g., when recording raw agent variables). Often one also needs to pre-process the result set: aggregating, splitting/slicing (see example). Archive the experiment: keep a 'logbook' of your experiments, noting what results came from what experiment, when, and with what settings.
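
A minimal sketch of the assembly step, assuming each worker returns a list of records as dictionaries. The field names `run`, `tick`, and `wealth_*` are hypothetical. The sketch sorts the rows, because arrival order is arbitrary, and takes the union of all keys as the header, because rows may have different columns.

```python
import csv
import io

def assemble(partial_results):
    """Merge record lists from several workers into one ordered table."""
    rows = [row for part in partial_results for row in part]
    rows.sort(key=lambda r: (r["run"], r["tick"]))
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:   # union of keys, in order of first appearance
                columns.append(key)
    return columns, rows

def to_csv(columns, rows):
    """Write the table as CSV, leaving missing cells empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

worker_a = [{"run": 2, "tick": 1, "wealth_0": 5.0}]
worker_b = [{"run": 1, "tick": 1, "wealth_0": 3.0, "wealth_1": 4.0}]
columns, rows = assemble([worker_a, worker_b])
print(to_csv(columns, rows))
```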
  15. Agenda: How to implement and execute computational experiments? Practicalities. Advanced designs (beyond factorials): Central Composite, Box-Behnken, Latin Hypercubes. IntelliSweep tools: iterative methods, self-guided searches.
  16. MEME: The Model Exploration Module
  17. MEME Functions: Assists the research process from the point when the implementation of the model (or a version of it) is complete until the publication of the collected results. Helps configure the simulation to record the proper variables: data series from program variables or specific statistics of them. Offers wizards for a variety of experimental designs, including fractional factorials and more. Orchestrates the execution of the experiment on a single computer, on a cluster, or in the cloud. Collects the recorded data in standard data tables.
  18. MEME Functions (cont'd): Functions to preview the results: perform exploratory analysis and preliminary charting. Export options and interfaces to standard popular statistical packages like R, SPSS, STATA. Optionally, a personal 'laboratory logbook': archiving and documenting the computational experiments performed by the modeler.
  19. MEME, The Model Exploration Module: Part of the Multi-Agent Simulation Suite (MASS): Repast J, Repast Sym*, NetLogo, MASON, EMIL-S, FABLES. GOAL: a user-friendly toolset for ABM, hiding coding/implementation difficulties as much as possible; ease of use for non-technical people. MEME is responsible for the design, execution, and data collection of computational experiments (agent-based simulations).
  20. MEME History (since 2005): 2005: tool to administer and process simulation results (Repast J & FABLES); 2006: setting up config files for parameter sweeps (Repast J); 2007: distributed execution on local clusters and grids (QosCosGrid); 2007: Design of Experiments (classic designs); 2008: multi-platform support (EMIL-S, NetLogo, Custom Java, Repast Sym*); 2008: advanced statistics in recording; 2009: standard interface for results processing; 2009: advanced DoE designs (freely extensible architecture); 2010: IntelliSweep plugins (iterative, self-guiding exploration of the parameter space); 2011: execution in the cloud (http://modelexploration.aitia.ai/); 2012: support for the MASON simulation package; 2013: MEME goes open source.
  21. Design of Experiments in MEME: Classic simulation experiments with parameter files. Classic DoE tables. DoE wizards: factorials, fractional factorials, and more. IntelliSweep experiments: iterative methods, self-guided searching of the parameter space. Optimization, e.g., genetic algorithms.
  22. Full Factorial Designs: Classic parameter sweeps as we know them.
  23. Full Factorial Design: A design in which every setting of every factor appears with every setting of every other factor. A specialized version of the brute-force strategy. Uses the same number of values (levels) for each parameter (factor).
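
A full factorial design is simply the Cartesian product of the factor levels. The sketch below uses hypothetical factor names and levels (`density`, `vision`, `noise`) to show how quickly the number of runs grows.

```python
from itertools import product

# Hypothetical factors, each with the same number of levels (three).
factors = {
    "density": [0.1, 0.5, 0.9],
    "vision": [1, 2, 3],
    "noise": [0.0, 0.05, 0.1],
}

names = list(factors)
# Every level of every factor is combined with every level of every other factor.
design = [dict(zip(names, combo)) for combo in product(*factors.values())]

print(len(design))   # 3 levels ** 3 factors
print(design[0])
```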
  24. Fractional Factorial Designs: Full factorial designs may be demanding even with only two levels (k = 10, $2^k = 1024$). A fractional factorial experiment is one in which only an adequately chosen fraction of the parameter combinations required for the complete factorial experiment is selected to be run.
  25. What Fraction To Run? Typically, we pick 1/2, 1/4, etc. of the full factorial. Properly chosen fractional factorial designs for 2-level experiments have the desirable properties of being both orthogonal and balanced.
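
To make "orthogonal and balanced" concrete, here is a small sketch of a half-fraction design. It assumes one common textbook construction (the last factor is set to the product of the others); the slides do not say which generators MEME's wizard uses.

```python
from itertools import combinations, product

def half_fraction(k):
    """2^(k-1) design: full factorial in the first k-1 factors, with the last
    factor set to the product of the others (an assumed, common generator)."""
    runs = []
    for base in product((-1, 1), repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs

design = half_fraction(4)          # 8 runs instead of the 16 of the full factorial
columns = list(zip(*design))

# Balanced: each factor is at its low and high level equally often.
balanced = all(sum(col) == 0 for col in columns)
# Orthogonal: the dot product of every pair of factor columns is zero.
orthogonal = all(
    sum(a * b for a, b in zip(c1, c2)) == 0
    for c1, c2 in combinations(columns, 2)
)
print(len(design), balanced, orthogonal)
```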
  26. Example: 2-Level Fractional Factorial Experiments with MEME
  27. Central Composite Design 1: The linear fit provided by the 2-level factorial methods may not be enough. To build quadratic or other higher-order models, we need new designs.
  28. Central Composite Design 2: A factorial design with added star points on the axes of the parameters plus a center point. The star points can introduce new extreme values for the parameters (both min and max). The newly added points help to estimate the curvature.
  29. Central Composite Design 3: There are three different types of CCDs: Circumscribed (CCC), Face-centered (CCF), Inscribed (CCI). CCC and CCI are rotatable designs, because every design point is at equal distance from the center. The variance of the predicted response of a model based on a rotatable design depends only on the distance from the center point.
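
The three CCD variants can be sketched in coded units as follows. The star distance `alpha = (2^k)^(1/4)` is the usual textbook rotatability value for a full factorial core and is an assumption here; the slides do not state which value MEME uses. The function name `central_composite` is likewise illustrative.

```python
from itertools import product

def central_composite(k, kind="CCC"):
    """Return CCD points in coded units: factorial corners, star points, center."""
    alpha = (2 ** k) ** 0.25                 # assumed rotatability value
    if kind == "CCF":
        alpha = 1.0                          # star points sit on the cube faces
    scale = 1.0 / alpha if kind == "CCI" else 1.0   # CCI shrinks the design into the cube
    corners = [tuple(scale * v for v in p) for p in product((-1.0, 1.0), repeat=k)]
    stars = []
    for axis in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[axis] = sign * alpha * scale
            stars.append(tuple(point))
    return corners + stars + [tuple([0.0] * k)]

for kind in ("CCC", "CCF", "CCI"):
    points = central_composite(2, kind)
    print(kind, len(points), max(abs(v) for p in points for v in p))
```

With two factors the corner and star points of the CCC variant lie on one circle; the CCC star points extend beyond the original low/high range, while CCI stays inside it.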
  31. An Alternative Choice to Fit Quadratic Responses: The Box-Behnken design is an independent quadratic design. It does not contain an embedded factorial or fractional factorial design. Treatment combinations are at the midpoints of the edges and at the center. The geometry suggests a sphere that protrudes through each face.
  32. Properties of the Box-Behnken Design: Rotatable (or near rotatable). Requires 3 levels of each factor. Has limited capability for orthogonal blocking compared to the central composite designs.
  33. Goals and Details of the Box-Behnken Design: The design should be sufficient to fit a quadratic model. The ratio of the number of experimental points to the number of coefficients in the quadratic model should be reasonable; in fact, Box and Behnken's designs kept it in the range of 1.5 to 2.6. The estimation variance should more or less depend only on the distance from the centre (this is achieved exactly for the designs with 4 and 7 factors) and should not vary too much inside the smallest (hyper)cube containing the experimental points.
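
For three factors, the Box-Behnken points can be generated directly: each pair of factors is varied over its low and high levels while the third stays at its middle level. The sketch below is restricted to three factors and assumes three center points, a common choice; designs for more factors follow different block structures that are not covered here.

```python
from itertools import combinations, product

def box_behnken_3(center_points=3):
    """Box-Behnken design for 3 factors in coded units (-1, 0, +1)."""
    k = 3
    runs = []
    for i, j in combinations(range(k), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * k
            point[i], point[j] = a, b      # an edge midpoint of the cube
            runs.append(tuple(point))
    runs += [(0, 0, 0)] * center_points    # replicated center point
    return runs

design = box_behnken_3()
# Quadratic model in 3 factors: intercept + 3 linear + 3 interaction + 3 squared terms.
n_coefficients = 1 + 3 + 3 + 3
print(len(design), len(design) / n_coefficients)
```

Under these assumptions the ratio of runs to coefficients falls at the lower end of the 1.5 to 2.6 range quoted above.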
  34. Latin Hypercube Designs, 1: A screening method that easily handles more than 2 levels and uses far fewer runs than the factorial design. LHDs operate on a subset of the parameter space defined by a single contiguous interval for each dimension (parameter): a hypercube. The subset is defined by giving the low and high values for each tested factor (parameter).
  35. Latin Hypercube Designs, 2: A criterion: non-collapsing design. If one of the parameters has (almost) no influence, then two experiments that differ only in this parameter collapse: they are like measuring the same point twice, which is a waste of resources (in deterministic cases). Therefore, if it is not known a priori which dimensions are important, two design points should not share any coordinate values.
  36. Latin Hypercube Designs, 3: Definition: a design on a d-dimensional grid of n levels in every dimension, in which each level occurs only once in each dimension; hence a non-collapsing design.
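
The definition above translates into a short sketch: draw one random permutation of the n levels per dimension and read the design points off row by row. The helper names `latin_hypercube` and `to_parameters`, and the example ranges, are illustrative assumptions.

```python
import random

def latin_hypercube(n, d, seed=0):
    """Random Latin hypercube: n points on an n-level grid in d dimensions,
    with every level used exactly once per dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(d):
        levels = list(range(n))
        rng.shuffle(levels)
        columns.append(levels)
    return list(zip(*columns))

def to_parameters(design, lows, highs):
    """Map grid levels 0..n-1 onto the [low, high] interval of each factor (n > 1)."""
    n = len(design)
    return [
        tuple(lo + (hi - lo) * level / (n - 1) for level, lo, hi in zip(point, lows, highs))
        for point in design
    ]

design = latin_hypercube(n=5, d=3)
scaled = to_parameters(design, lows=[0.0, 10.0, -1.0], highs=[1.0, 20.0, 1.0])
for point in scaled:
    print(point)
```

Five runs cover five levels of each of the three factors, where a full factorial on the same grid would need 125.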
  37. Latin Hypercube Designs, 4: A desirable property: space filling. When no details on the functional behavior of the response parameters are available, it is important to obtain information from the entire design space. The points of the design should be evenly spread over the entire hypercube.
  38. Latin Hypercube Designs, 5: A MAXIMIN design is a set of points such that the separation distance, i.e., the minimal distance among pairs of points, is maximal. Assuming that the samples represent their surroundings, one wants to make sure that the sample points are used efficiently. We maximize the common radius r of spheres around the design points so that they do not intersect. Any distance metric can be used, but L2 (Euclidean) is a common choice.
  39. A Note on MAXIMIN LHDs: Finding a MAXIMIN, non-collapsing design for many dimensions and a high number of levels is very hard. Therefore, pre-calculated designs are often used, and/or the MAXIMIN property is only approximated.
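
One very simple way to approximate the MAXIMIN property is to generate many random Latin hypercubes and keep the one with the largest separation distance. This best-of-many search is a stand-in chosen for brevity; it is not the heuristic behind the pre-calculated designs mentioned above, and all names are illustrative.

```python
import math
import random
from itertools import combinations

def random_lhd(n, d, rng):
    """One random Latin hypercube on an n-level grid in d dimensions."""
    columns = []
    for _ in range(d):
        levels = list(range(n))
        rng.shuffle(levels)
        columns.append(levels)
    return list(zip(*columns))

def separation(design):
    """Separation distance: the minimal Euclidean (L2) distance over all point pairs."""
    return min(math.dist(p, q) for p, q in combinations(design, 2))

def approx_maximin(n, d, tries=200, seed=0):
    """Keep the candidate with the largest separation distance."""
    rng = random.Random(seed)
    return max((random_lhd(n, d, rng) for _ in range(tries)), key=separation)

best = approx_maximin(n=6, d=2)
print(best, separation(best))
```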
  40. Latin Hypercube Designs in MEME: The LHD plugin in MEME supports up to 100 levels and up to 10 dimensions. It uses predefined experiment designs, calculated by heuristic methods to approximate MAXIMIN LHDs: http://www.spacefillingdesigns.nl/maximin/info.html
  41. Latin Hypercube Designs in MEME
  42. Dynamic IntelliSweep methods: So far, the entire design was fixed before starting the experiment; there was no feedback from the measured responses to the design. Various (optimization) methods exist that use a different strategy: hill climbing, simulated annealing, genetic algorithms, etc.
  43. Iterative Uniform Interpolation 1: IUI is a response analysis method. It refines the parameter domain between iterations to achieve better interpolation (of the response value). It examines interesting subintervals by dividing them further. Deviation from the previously observed (assumed) gradient spawns new measurements.
  44. Iterative Uniform Interpolation 2
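
The slides give no algorithmic detail for IUI, so the following is only one possible reading of the idea for a single parameter: start from a uniform grid, measure the midpoint of each interval, and subdivide further only where the measurement deviates from the linear interpolation of its neighbours. The function `refine`, its tolerance, and the test responses are all assumptions.

```python
import math

def refine(f, low, high, initial=5, tol=0.05, iterations=4):
    """Iteratively add samples where the response deviates from linear interpolation."""
    xs = [low + (high - low) * i / (initial - 1) for i in range(initial)]
    samples = {x: f(x) for x in xs}
    intervals = list(zip(xs, xs[1:]))
    for _ in range(iterations):
        next_intervals = []
        for a, b in intervals:
            mid = (a + b) / 2
            samples[mid] = f(mid)                      # new measurement
            predicted = (samples[a] + samples[b]) / 2  # what a straight line would give
            if abs(samples[mid] - predicted) > tol:    # "interesting" subinterval
                next_intervals += [(a, mid), (mid, b)]
        intervals = next_intervals
    return dict(sorted(samples.items()))

flat = refine(lambda x: 2 * x + 1, 0.0, 1.0)                    # linear response
sharp = refine(lambda x: math.tanh(20 * (x - 0.5)), 0.0, 1.0)   # steep transition at 0.5
print(len(flat), len(sharp))
```

A linear response needs no refinement after the first pass, while the steep transition attracts most of the additional measurements.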
  45. Genetic Algorithm Driven Methods, Optimization: The genetic algorithm (GA) is a heuristic optimization method: $F(o_1, \dots, o_n) \to \max$. It can be directly used for response analysis if we are not interested in the entire response surface, but only in high/low response values.
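
A bare-bones genetic algorithm for maximizing a response might look like the sketch below. The response surface, the population size, and the choice of truncation selection, uniform crossover, and Gaussian mutation are illustrative assumptions, not a description of MEME's GA plugin.

```python
import random

def response(params):
    """Hypothetical response surface with a single peak at (0.3, 0.7)."""
    x, y = params
    return -((x - 0.3) ** 2 + (y - 0.7) ** 2)

def genetic_search(fitness, dims=2, pop_size=30, generations=40, seed=1):
    """Maximize `fitness` over the unit hypercube with a minimal GA."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(dims)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]        # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mother, father)]   # uniform crossover
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.05))) for g in child]  # mutation
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

best = genetic_search(response)
print(best, response(best))
```

Each fitness evaluation corresponds to one simulation run, so the search spends its budget near high responses rather than mapping the whole surface.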
  46. Genetic Algorithm Driven Methods, Active Non-linear Tests, 1: Active Non-linear Tests (ANTs) were proposed by John H. Miller of CMU and SFI. A response analysis method that uses a GA to disprove (prove) a user-defined thesis. Thesis: P {(o1, ..., on) | }. Sample by GA: P {(o1, ..., on) | (p1, ..., pm) Pm}.
  47. Genetic Algorithm Driven Methods, Active Non-linear Tests, 2: Measure the fitness of the sample to the thesis. Breed parameter combinations that are farthest from the thesis. In the end, the farthest sample provides a level of falsification. An effective non-linear optimization method is being used to falsify a thesis!
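
The falsification idea can be illustrated with a toy example. Here the thesis is "the output never exceeds 1.0", and the search looks for the parameter combination that violates it most. For brevity the sketch swaps the GA for a simple mutation-only hill climber; the model, the bound, and all names are hypothetical.

```python
import random

def model_output(p):
    """Hypothetical model: the output rises sharply in one corner of the parameter space."""
    x, y = p
    return x * y + 5.0 * max(0.0, x - 0.8) * max(0.0, y - 0.8)

def violation(p, bound=1.0):
    """Distance from the thesis 'output <= bound'; a positive value falsifies it."""
    return model_output(p) - bound

def search_counterexample(steps=500, offspring=10, seed=0):
    """Keep mutating the current point and accept any mutant that is farther from the thesis."""
    rng = random.Random(seed)
    best = [rng.random(), rng.random()]
    for _ in range(steps):
        for _ in range(offspring):
            candidate = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1))) for g in best]
            if violation(candidate) > violation(best):
                best = candidate
    return best, violation(best)

worst_case, worst_violation = search_counterexample()
print(worst_case, worst_violation)
```

A positive final violation is a concrete counterexample to the thesis; a search that never finds one is weaker evidence, since the optimizer may simply have missed it.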
  48. MEME as a Personal Laboratory Logbook: The more one gets immersed in (computational) experiments, the more results fill the hard drive. Nothing can be as alien as your own code after two months. The same applies to experimental results and settings, let alone charts: "Hey, this is a nice chart. I wish I remembered what exact parameter settings I used to run the model with!" Being disciplined always helps, but tools may help being disciplined. MEME stores all result sets in a DB (together with settings), grouped by model, version and batch (enforced). You can add comments, remarks, descriptions to them.
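
The logbook idea, results stored together with their settings and grouped by model, version, and batch, can be imitated with a few lines of SQLite. The table layout and the `log_run` helper below are assumptions made for illustration; they do not reflect MEME's actual database schema.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE results (
           model TEXT NOT NULL, version TEXT NOT NULL, batch INTEGER NOT NULL,
           settings TEXT NOT NULL, response REAL, comment TEXT)"""
)

def log_run(model, version, batch, settings, response, comment=""):
    """Store one result row together with the exact settings that produced it."""
    db.execute(
        "INSERT INTO results VALUES (?, ?, ?, ?, ?, ?)",
        (model, version, batch, json.dumps(settings, sort_keys=True), response, comment),
    )

log_run("segregation", "1.2", 1, {"density": 0.5, "seed": 42}, 0.81, "baseline sweep")

# Two months later: which settings produced the results of this batch?
row = db.execute(
    "SELECT settings, comment FROM results WHERE model = ? AND batch = ?",
    ("segregation", 1),
).fetchone()
print(json.loads(row[0]), row[1])
```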
  49. [Diagram: several parameter sweeps feed the Results DB, which supports charts; versioning and merging; filtering, processing, restructuring; views; export (Excel, SPSS, etc.); import (txt, csv, Excel, etc.).]
  50. Summary: Discussed the technical issues and challenges of executing computational experiments. Explained the challenges of applying the DoE approach in practice. Introduced MEME as a tool to assist from the point when the implementation of the simulation is complete. Discussed the usage of MEME, together with advanced designs for computational experiments.
  52. Links to Software: http://mass.aitia.ai/documentation/tutorials/201-ecai-2014-mass-tutorial http://meme.aitia.ai/ http://modelexploration.aitia.ai/ http://mass.aitia.ai http://pet.aitia.ai/
  53. Thank you! Questions? [email protected]
  54. Backup