Optimization based Simulation Model Development: Solving Robustness Issues

Justin Debuse, Shah Jahan Miah, IEEE Member

Business Faculty

University of the Sunshine Coast

Maroochydore DC, Australia

[email protected]; [email protected]

Abstract: Mathematical models are increasingly popular for representing biological systems. A mathematical model can be based upon existing knowledge from scientific literature, expert opinion, and field and laboratory studies. However, there are significant issues in model development, including robustness. This study therefore examines how model quality can be improved automatically using optimization approaches. Specifically, we examine how a recently developed robust model of a forest pest species, with potential application in areas such as risk prediction [3], may have its robustness further increased using optimization. Digital ecosystems provide a powerful and broader methodological foundation and support for the implementation of optimization through application of the design science method (footnote 1).

Keywords: mathematical model; simulation; optimization

I. INTRODUCTION

Systems biology, within which entire biological systems are modeled [1], is based upon mathematical modeling [2]. For instance, in forest pest management, mathematical modeling has produced a model of insect pest species population development [3]. Such models are mainly based upon, and tested using, a wide range of sources, such as scientific literature, expert opinion, and field and laboratory studies [3, 4, 5]. For example, Nahrung et al.’s [3] model uses a combination of general scientific knowledge and laboratory and field data.

Mathematical simulation models clearly need to give an accurate representation of problem reality if they are to be useful. Even when models are found to be robust, there may be opportunities for improvement; for example, Nahrung et al. [3] suggest that the predictive power of their simulation could be improved by the incorporation of additional parameters such as tree growth rates. An even more promising approach to improving a simulation is to adjust its parameters to optimize its fit to the data; this has the potential to improve its predictive capability without requiring any additional data or significant modifications to the model. This study therefore examines how model quality can be improved automatically using optimization approaches. Specifically, we examine how a recently developed robust model of a forest pest species, with potential application in areas such as risk prediction [3], may have its robustness further increased using optimization. Forest insect pests represent a highly important issue for the forestry industry, with several species having the potential to prevent plantations from being viable [6].

1 The study intends to create an initial basis for industry-oriented application design research, in which Dr Miah contributes by establishing a methodological foundation and Dr Debuse contributes by outlining a problem-solving approach for a target industry.

The robustness of a simulation can be optimized by modifying its parameters so that the difference between the data produced by the simulation model and real data is minimized [7]. Many simulation optimization approaches exist, and a classification scheme for them has been developed [8]. However, these approaches typically do not focus upon optimizing the simulation robustness, but instead optimize the performance of the simulation; for example, the number of production lines, staff and machines in a factory can be optimized to maximize profit. In this context, the objectives of this study are:

1. Identify issues with current modeling approaches.

2. Describe approaches based on heuristics to improve modeling outcomes.

3. Propose an improved model for the target problem based on design science methodology.

A digital ecosystem is an interdisciplinary platform that facilitates self-organizing digital system solutions aimed at creating a digital environment for agents in organizations [9]. This paradigm can support development of technologies for knowledge interactions between software solutions and users [10]. Methodologies for design science research address the focus of this study, namely solution development research. Our study therefore advances current digital ecosystem theories as it applies design science methods to innovating and improving existing solution approaches in the problem space. We address this through an investigation of how model quality can be improved automatically for the target user groups. Our work has some similarity with a recent study [11], where digital ecosystems theory was utilized for the implementation of electricity market bidding optimization, using bi-level optimization with an evolutionary (particle swarm) approach. The goal was to optimize bidding behavior modeling accuracy and therefore increase model robustness; this parallels the goal proposed within this study, which is the optimization of forest pest species model robustness, although this study proposes single rather than bi-level optimization, and applies this to simulation parameters rather than simulation entity behaviors.

5th IEEE International Conference on Digital Ecosystems and Technologies (IEEE DEST 2011), 31 May - 3 June 2011, Daejeon, Korea

ISBN: 978-1-4577-0872-5 (c) 2011 IEEE

This paper proceeds with the background to the problem. The following section describes the problem solution approach that is proposed. Next, the approach used in designing the solution is given. The paper ends with a discussion and presentation of conclusions.

II. BACKGROUND

The scientific and practical usefulness of a mathematical population simulation is dependent primarily upon how closely its results match problem reality; a model with a poor fit to data gathered from the field is clearly of little use in explaining the population scientifically or making predictions of how this will vary in the future. This robustness can be measured in many different ways, such as comparing the number of generations and the timing of peaks in each generation with real field data [3]. Robustness may potentially be improved by increasing the functionality of the model to incorporate new factors such as rainfall [3]. The robustness of a model may also potentially be improved without adding any new factors; instead, attributes of the existing model may be modified to improve its fit [7].

The mathematical model examined within this study simulates a forest insect pest population, using eight separate life stages [3]. For example, at the egg stage the model uses a function to determine how temperature affects the development rate of eggs; a second function determines how many eggs die before reaching the next stage. Each function has a number of parameters that may be modified; for example, the number of individuals that die (mortality rate) may be represented as a simple multiplier [3]. The targeted goal of model robustness is therefore to modify each of the function parameters so that the fit of the model is maximized. This is important scientifically, since improving model fit means it is likely to explain the pest population dynamics more accurately and thus yield higher quality scientific knowledge. More accurate models are also more likely to prove useful in practical applications such as predicting the extent to which forest plantations are affected by pests and in developing management strategies.

A key challenge of model robustness optimization is the number of potential solutions that must be evaluated before the best possible example is found. For example, consider a model where the only aspects to be optimized are eight separate mortality rates. If each mortality rate is constrained to be an integer percentage between 1 and 100 then there will be 100^8 = 10,000,000,000,000,000 possible solutions. Optimization approaches that are capable of finding high quality solutions without evaluating all possible examples are therefore required within this application domain.
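The size of this search space, and the reason exhaustive enumeration is impractical, can be checked with a few lines of arithmetic. The evaluation rate used below is a purely hypothetical assumption chosen for illustration; it is not a figure from this study.

```python
# Size of the search space for eight mortality rates, each an integer
# percentage from 1 to 100 (the example given in the text).
values_per_rate = 100
number_of_rates = 8
search_space = values_per_rate ** number_of_rates

# Hypothetical assumption: one million simulation runs per second.
runs_per_second = 1_000_000
seconds_per_year = 60 * 60 * 24 * 365
years_to_enumerate = search_space / (runs_per_second * seconds_per_year)

print(f"{search_space:,} candidate solutions")
print(f"about {years_to_enumerate:.0f} years to evaluate them all")
```

Even at this optimistic assumed rate, evaluating every candidate would take roughly three centuries, which is why a heuristic search is needed.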

A number of general purpose optimization approaches, such as genetic algorithms and simulated annealing, form part of the classification scheme for simulation performance optimization [8]. These approaches work by navigating through the space of possible solutions to search for the best possible answer; for example, genetic algorithms use a simplified form of Darwinian evolution, with survival of the fittest encouraging the population to converge upon a high quality solution. A recently developed and highly promising optimization technique that could be employed is attribute based hill climbing [12]. The approach has given comparable results to the best existing simulation approaches in an example problem, and also has the advantage of being extremely simple [13]. Attribute based hill climbing could be implemented within EvA2 (an Evolutionary Algorithms framework, revised version 2), a framework within which optimization problems may be implemented together with algorithms to solve them [14]. The framework supports algorithms such as simulated annealing and genetic algorithms, and would allow comparisons to be drawn between the performance of optimization approaches within this problem area [14].

The optimization approach that we will use initially is genetic algorithms, a widely used technique with applications such as optimizing the weights of neural network representations of simulation models [15] and simulation optimization [16]. Our proposed approach thus takes an existing mathematical simulation model and improves its robustness by modifying its parameters, with the goal of maximizing how well the model fits the data.

III. PROPOSED APPROACH

Our proposed approach is illustrated in Figure 1; the process begins with an existing mathematical simulation model with proven robustness [3]. The parameters within the model that are to be optimized are then identified; these include what percent of individuals die before reaching the next stage, how quickly individuals develop and so on. For example, the function to determine the number of eggs laid per adult is set so that every 7 days of the simulation 32.5 eggs are produced [3]. The optimization could therefore be set up to determine the value for the number of days and number of eggs resulting in the most robust model; this may result in the simulation producing 40 eggs per adult every 10 days for example.

Figure 1. The Simulation Optimization Approach

[Figure: three boxes. Initial Simulation Model Parameters: egg mortality 0.75; larva 1 & 2 mortality 0.67. Optimisation Algorithm, Parameters Optimised: egg mortality (0.01-1.0); larva 1 & 2 mortality (0.01-1.0). Optimised Simulation Model Parameters: egg mortality 0.83; larva 1 & 2 mortality 0.39; ...]

Each model parameter is represented within the optimization as a number. For example, the number of eggs laid per adult could be represented as a number ranging from 1 to 100; such numbers are constrained to be within ranges that are biologically realistic. Each solution within the genetic algorithm would therefore be represented as a series of numbers, representing the simulation parameters to be optimized.
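This encoding can be sketched as a list of numbers paired with allowed ranges. The parameter names and ranges below are illustrative assumptions (the mortality range follows Figure 1 and the egg range follows the 1 to 100 example above); the biologically realistic ranges for the real model would need to come from the domain experts.

```python
import random

# Hypothetical parameter ranges, keyed by parameter name.
PARAMETER_BOUNDS = {
    "egg_mortality": (0.01, 1.0),
    "larva_1_2_mortality": (0.01, 1.0),
    "eggs_per_adult": (1.0, 100.0),
}


def random_solution(rng: random.Random) -> list[float]:
    """Draw one candidate solution uniformly within the allowed ranges."""
    return [rng.uniform(low, high) for low, high in PARAMETER_BOUNDS.values()]


def is_feasible(solution: list[float]) -> bool:
    """Check that every parameter lies inside its allowed range."""
    return len(solution) == len(PARAMETER_BOUNDS) and all(
        low <= value <= high
        for value, (low, high) in zip(solution, PARAMETER_BOUNDS.values())
    )


def decode(solution: list[float]) -> dict[str, float]:
    """Map the series of numbers back to named simulation parameters."""
    return dict(zip(PARAMETER_BOUNDS, solution))


candidate = random_solution(random.Random(1))
print(decode(candidate), is_feasible(candidate))
```

Keeping the ranges alongside the encoding means that any operator which generates or alters a solution can check feasibility in one place.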

The genetic algorithm requires an evaluation function, which takes a solution and returns a number corresponding to its quality. The evaluation function would determine the robustness of the simulation in terms of the difference between the results of the simulation and real field data. The genetic algorithm would therefore search for a minimal solution in terms of its evaluation function score.
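One possible form for such an evaluation function is a sum of squared differences between simulated and observed population counts. The sketch below is an illustration only: the choice of robustness measure is left to the domain experts, and `toy_simulation`, its parameters and the field counts are hypothetical stand-ins rather than the model of [3] or real data.

```python
from typing import Callable, Sequence


def evaluate(
    parameters: Sequence[float],
    simulate: Callable[[Sequence[float]], Sequence[float]],
    field_counts: Sequence[float],
) -> float:
    """Return the sum of squared errors; lower scores mean a better fit."""
    simulated = simulate(parameters)
    if len(simulated) != len(field_counts):
        raise ValueError("simulated and field series must have equal length")
    return sum((s - f) ** 2 for s, f in zip(simulated, field_counts))


def toy_simulation(parameters: Sequence[float]) -> list[float]:
    """Hypothetical stand-in: survivors after egg and larval mortality."""
    egg_mortality, larva_mortality = parameters
    eggs = 1000.0
    larvae = eggs * (1.0 - egg_mortality)
    pupae = larvae * (1.0 - larva_mortality)
    return [eggs, larvae, pupae]


# Made-up "field" counts, used only to exercise the function.
field_counts = [1000.0, 200.0, 100.0]
initial_score = evaluate([0.75, 0.67], toy_simulation, field_counts)
better_score = evaluate([0.80, 0.50], toy_simulation, field_counts)
print(initial_score, better_score)
```

Because a lower score indicates a closer match to the field data, an optimizer using this function would be set to minimize it.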

The genetic algorithm works by creating and evolving a population of solutions, each containing simulation parameter values. New generations of the population are created using operations such as mutation (making small changes to the simulation parameter values) or crossover (creating new members by combining parts of members of the previous generation). Survival of the fittest encourages the population to move towards solutions of higher quality, as measured by the evaluation function. A halting condition is required to determine when the process of evolution stops; this may for example be implemented as a limit on the number of generations that are evolved.
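The steps described above can be sketched as a small genetic algorithm. This is a minimal illustration under stated assumptions, not the implementation planned for this study: tournament selection, one-point crossover, Gaussian mutation, elitism and all default settings are choices made here for brevity, and `toy_error` is a hypothetical objective standing in for the simulation error.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def genetic_algorithm(
    evaluate: Callable[[Sequence[float]], float],
    bounds: Bounds,
    population_size: int = 30,
    generations: int = 50,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise `evaluate` over parameter vectors constrained to `bounds`."""
    rng = random.Random(seed)

    def random_solution() -> List[float]:
        return [rng.uniform(low, high) for low, high in bounds]

    def tournament(scored):
        # Survival of the fittest: the better of two random members breeds.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] <= b[0] else b[1]

    def crossover(mother, father):
        # One-point crossover: combine parts of two parents.
        if len(mother) < 2:
            return list(mother)
        point = rng.randrange(1, len(mother))
        return mother[:point] + father[point:]

    def mutate(solution):
        # Small Gaussian changes, clipped to the allowed range.
        child = []
        for value, (low, high) in zip(solution, bounds):
            if rng.random() < mutation_rate:
                value += rng.gauss(0.0, 0.1 * (high - low))
            child.append(min(high, max(low, value)))
        return child

    population = [random_solution() for _ in range(population_size)]
    scored = [(evaluate(s), s) for s in population]

    # Halting condition: a fixed limit on the number of generations.
    for _ in range(generations):
        elite = min(scored, key=lambda pair: pair[0])
        children = [
            mutate(crossover(tournament(scored), tournament(scored)))
            for _ in range(population_size - 1)
        ]
        # Elitism: the best member survives unchanged into the next generation.
        population = [elite[1]] + children
        scored = [(evaluate(s), s) for s in population]

    best_score, best = min(scored, key=lambda pair: pair[0])
    return best, best_score


# Toy objective with a known optimum, standing in for the simulation error.
target = [0.8, 0.4]


def toy_error(parameters: Sequence[float]) -> float:
    return sum((p - t) ** 2 for p, t in zip(parameters, target))


bounds = [(0.01, 1.0), (0.01, 1.0)]
best, best_score = genetic_algorithm(toy_error, bounds)
print(best, best_score)
```

In practice the evaluation function would run the pest simulation with the candidate parameters, so each call would be far more expensive than this toy objective; the generation limit and population size would then need to be chosen with that cost in mind.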

The best solution found by the genetic algorithm is thus an optimized simulation model; this is identical to the original model, except that its simulation parameters have been modified to improve its fit to the field data. Both the original and optimized model would therefore share characteristics such as the number of life stages in the simulation and the routes that individuals may take when graduating from one life stage to the next. Their relative quality may also be measured using the evaluation function, to determine whether the optimized solution improves upon the original, and if so by how much.

IV. DESIGN APPROACH

The optimization approach proposed within the previous section will be produced using a design science based methodology. The simulation optimization software that will be produced is, from a digital ecosystems viewpoint, an IT artifact whose primary role is to mediate interactions of users and systems [17]. Dini [17] describes computer science as “(being concerned) with the construction of abstract machines (and their) performance, self-optimization, self-healing capacities” (p.27). This philosophical standpoint motivates rectifying or redesigning a model or system that can operate both within objective or subjective reality in a flexible way. Also, Dini [17] states that activities could be improved if user behavior in software could be mapped, although Ciborra and Hanseth [18] argue that it is not easy to make a clear division between the objective view of the technology we build and our subjective human experience. Design science research methods can address the above requirements in designing artifacts.

March and Smith [19] define the IT artifact as system architecture, system models, systems designs or software prototypes that are designed to demonstrate the applicability of the outlined solution model. Gregor and Jones [20] describe the design science paradigm as a problem-solving oriented area that brings innovations (e.g. flexibility) to artifact design. Design science research helps produce solution-relevant and flexible designs through a rigorous and iterative process that includes evaluations with target users and communication of the solution to both operational and management users [21]. This view of technical design focuses more on technical system design, and is flexible in terms of how it measures quality specifically. This theoretical lens can contribute to the classical method of digital ecosystem research, in that the approach goes beyond the diversity of social implications for flexible technology design. This is because digital ecosystems promise to describe more precisely the interdependent view of technical system design within socio-technical phenomena, where we build or design a service system by converting natural settings, such as science-based knowledge, to enhance operations within our social life [17].

A. Methodology

Hevner, March, Park and Ram [21] propose a design science methodology in which seven steps are used for conducting an artifact design study. It is suggested that the design science method helps acquire knowledge and understanding of a problem domain and its solution, which are achieved through the building and application of the designed artifact [21]. We adapted this methodology to our combined goals in requirements capture, as follows:

Design as an artifact. This involves identifying problem tasks for automation; in our problem domain the optimization of forest pest simulation model parameters is a task that has previously been undertaken manually and has been identified for automation.

Problem relevance. Here, the problem is decomposed into sub-problems, which must be business relevant and important. The forest pest simulation model optimization decomposes into two key sub-problems: the construction of the forest pest population model in a format that can be optimized; and running an optimization algorithm on this population model. The problem is important since its solution can contribute to improving the viability of forest plantations and the quality of scientific knowledge.

Design evaluation. Every sub-process is examined in terms of cost, sources, scope and information. The cost of both the above sub-processes does not exceed the resources available for the research, and the scope is restricted to a single pest species. The simulation model sources information from laboratory and field data, and was constructed by scientists specializing in this research area. Similarly, the optimization approach uses high quality source data, being based upon a well established algorithm with a wide range of successful applications.

Research contributions. Design science requires contributions to be made in the design methodology, foundations and/or artifact; in our study the artifact is the research contribution that will be made. This will be in the form of a forest pest simulation model, and it can be verified by comparing its robustness mathematically with that of the manually produced model upon which it was based.

Research rigor. The requirement within this phase is to ensure rigor in both the artifact design and evaluation. This is achieved in our study by using scientific domain experts to construct the simulation model; the optimization will be conducted by an appropriately qualified researcher in this field. The scientific domain experts will also select an appropriate mathematical measure of model robustness, to ensure rigorous artifact evaluation occurs.

Design as a search process. Searching is part of the design process, constrained by the problem environment laws. In our problem domain, the search is constrained primarily by the availability of the data, which is used to construct and test the models.

Communication of Research. The results of this research will be of relevance to a number of different audiences, and may be communicated to them through channels such as conferences and journals. If the simulation model proves to be more robust than the manually produced original then the study will be of interest to the scientific community. Technological contributions such as the relative performance of algorithms and the application of design science methodologies are of relevance to the IT research community.

V. DISCUSSION AND CONCLUSION

This study discussed issues with the modeling approaches used in forestry pest management and proposed a modified approach through the use of heuristics in improving modeling outcomes for the target problem area. The proposed optimization approach offers the advantage of producing models that are potentially more robust than those produced manually, in terms of how closely they represent real world data. Such models are potentially more useful to scientists, since they paint a more accurate picture of the phenomena under observation, along with practitioners, for whom model accuracy is crucial in accurately predicting future pest infestations.

It is evident that the paradigm of digital ecosystems has facilitated a three-way relationship between the forestry industry, human research and ICT based optimization, resulting in multi-disciplinary research. This paradigm accommodates value creation by making connections in different domains through the adoption of new technology supporting digital system solutions, services and collaborative technology frameworks. We address this need through the utilization of a design science method in a specific problem space (forestry pest management). A key component of our research is the close collaboration between the authors of this paper, who have an IT research background, and the scientists who create and test the simulation models. This allows the optimization approach to be undertaken within a design science framework to solve problems of importance and relevance to the forest pest management problem domain. Moreover, the scientists are able to provide rigorous evaluation of the quality of artifacts produced, and ensure that the optimized simulations keep within established scientific paradigms. This will prevent the production of models that, whilst accurate, do not make sense scientifically, such as using negative growth rates to represent mortality. At the same time, natural phenomena will be transformed into computer based systems designed to strengthen scientific problem solving. Existing research into simulation model robustness optimization [11] can be used to validate our approach.

Future research is likely to involve the application of optimization to increasingly complex simulation models, as new data becomes available with which to build and test simulations. Ontologies may need to be developed, within which such models can be represented and referred to. This approach may result in categorizations of models by species, complexity and construction approach, so that comparisons between models and their relative robustness can be easily made.

REFERENCES

[1] A. Finkelstein, J. Hetherington, L. Li, O. Margoninski, P. Saffrey, R. Seymour, and A. Warner, “Computational challenges of systems biology,” IEEE Computer, vol. 37, pp. 26-33, 2004.

[2] C. Priami, “Algorithmic systems biology,” Communications of the ACM, vol. 52, pp. 80-88, 2009.

[3] H. F. Nahrung, M. K. Schutze, A. R. Clarke, M. P. Duffy, E. A. Dunlop, and S. A. Lawson, “Thermal requirements, field mortality and population phenology modelling of Paropsis atomaria Olivier, an emergent pest in subtropical hardwood plantations,” Forest Ecology and Management, vol. 255, pp. 3515-3523, 2008.

[4] W. Xi, R. Coulson, J. D. Waldron, M. Tchakeria, C. W. Lafon, D. M. Cairns, A. Birt, and K. D. Klepzig, “Landscape modeling for forest restoration, planning and assessment - lessons from the southern Appalachians,” Journal of Forestry, vol. 106:4, pp. 191-7, 2008.

[5] T. G. Martin, P. M. Kuhnert, K. Mengersen, and H. P. Possingham, “The power of expert opinion in ecological models using Bayesian methods: impact of grazing on birds,” Ecological Applications, vol. 15:1, pp. 266-80, 2005.

[6] A. J. Carnegie, C. Stone, S. A. Lawson, and M. Matsuki, “Can we grow certified eucalypt plantations in subtropical Australia? an insect pest management perspective,” New Zealand Journal of Forestry Science, vol. 35, pp. 223-245, 2005.

[7] Z. Zhu, J. Chen, Q. Qin, J. Li, and L. Wang, “Optimization of ecosystem model parameters using spatio-temporal soil moisture information,” Ecological Modelling, vol. 220, pp. 2121-2136, 2009.

[8] A. Ammeri, W. Hachicha, H. Chabchoub, and F. Masmoudi, “A comprehensive literature classification of simulation-optimization methods,” Proceedings of the 9th International Conference on Multiple Objective Programming and Goal Programming, Sousse, Tunisia, 2010.

[9] M. Le Louarn, “The technologies for digital ecosystems: cluster of FP6 projects,” Digital Business ecosystems, European Commission, Information Society and Media, 2007.

[10] S. J. Miah, “A new semantic knowledge sharing approach for e-government systems,” 4th IEEE International Conference on Digital Ecosystems, Dubai, UAE, pp. 457-462, 2010.

[11] G. Zhang, G. Zhang, Y. Gao, and J. Lu, “Competitive strategic bidding optimization in electricity markets using bi-level programming and swarm technique,” IEEE Transactions on Industrial Electronics, issue 99, 2010.

[12] I. M. Whittley, and G. D. Smith, “The attribute based hill climber,” Journal of Mathematical Modelling and Algorithms, vol. 3:2, pp. 167-78, 2004.

[13] U. Derigs, and R. Kaiser, “Applying the attribute based hill climber heuristic to the vehicle routing problem,” European Journal of Operational Research, vol.177: 2, pp. 719-32, 2007.

[14] M. Kronfeld, H. Planatscher, and A. Zell, “The EvA2 optimization framework,” Learning and Intelligent Optimization, vol. 60:73, pp. 247-50, 2010.

[15] B. Jeng, J. Chen, and T. Liang, “Applying data mining to learn system dynamics in a biological model,” Expert Systems with Applications, vol.30, pp. 50-58, 2006.

[16] L. Wang, “A hybrid genetic algorithm-neural network strategy for simulation optimization,” Applied Mathematics and Computation, vol.170, pp. 1329-1343, 2005.

[17] P. Dini, “A scientific foundation for digital ecosystems,” Digital Business Ecosystems, European Commission, Luxembourg, 2008.

[18] C. Ciborra, and O. Hanseth, “From tool to gestell: agendas for managing the information infrastructure,” Information Technology & People, vol. 11:4, pp. 305-327, 1998.

[19] S. T. March, and G. Smith, “Design and natural science research on information technology,” Decision Support Systems, vol. 15:4, pp. 251-266, 1995.

[20] S. Gregor, and D. Jones, “The anatomy of a design theory,” Journal of the Association for Information Systems, vol. 8:5, pp. 312, 2007.

[21] A. R. Hevner, S. T. March, J. Park, and S. Ram, “Design science in information systems research,” MIS Quarterly, vol. 28:1, pp. 75-105, 2004.
