More information about this series at http://www.springer.com/series/6917
Lance Fiondella • Antonio Puliafito
Editors
Principles of Performance and Reliability Modeling and Evaluation
Essays in Honor of Kishor Trivedi on his 70th Birthday
Editors
Lance Fiondella
University of Massachusetts Dartmouth
Dartmouth, MA
USA
Antonio Puliafito
Università degli studi di Messina
Messina
Italy
ISSN 1614-7839 ISSN 2196-999X (electronic)
Springer Series in Reliability Engineering
ISBN 978-3-319-30597-4 ISBN 978-3-319-30599-8 (eBook)
DOI 10.1007/978-3-319-30599-8
Library of Congress Control Number: 2016933802
© Springer International Publishing Switzerland 2016
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland
Non soffocare la tua ispirazione e la tua immaginazione, non diventare lo schiavo del tuo modello.
(Do not stifle your inspiration and your imagination, do not become the slave of your model.)
—Vincent Van Gogh
Sbagliando si impara - proverbio italiano
(You learn by making mistakes - Italian proverb)
To my wife Maria and my beloved children, Carlo and Andrea, for their unconditioned love. To my friends, colleagues and students representing the second half of my existence.
—Antonio Puliafito
—Lance Fiondella
Contents
Part I Phase Type Distributions, Expectation Maximization Algorithms, and Probabilistic Graphical Models
Phase Type and Matrix Exponential Distributions in Stochastic Modeling
  Andras Horvath, Marco Scarpa and Miklos Telek
An Analytical Framework to Deal with Changing Points and Variable Distributions in Quality Assessment
  Dario Bruneo, Salvatore Distefano, Francesco Longo and Marco Scarpa
Fitting Phase-Type Distributions and Markovian Arrival Processes: Algorithms and Tools
  Hiroyuki Okamura and Tadashi Dohi
Constant-Stress Accelerated Life-Test Models and Data Analysis for One-Shot Devices
  Narayanaswamy Balakrishnan, Man Ho Ling and Hon Yiu So
Probabilistic Graphical Models for Fault Diagnosis in Complex Systems
  Ali Abdollahi, Krishna R. Pattipati, Anuradha Kodali, Satnam Singh, Shigang Zhang and Peter B. Luh
Part II Principles of Performance and Reliability Modeling and Evaluation
From Performability to Uncertainty
  Raymond A. Marie
Sojourn Times in Dependability Modeling
  Gerardo Rubino and Bruno Sericola
Managed Dependability in Interacting Systems
  Poul E. Heegaard, Bjarne E. Helvik, Gianfranco Nencioni and Jonas Wäfler
30 Years of GreatSPN
  Elvio Gilberto Amparore, Gianfranco Balbo, Marco Beccuti, Susanna Donatelli and Giuliana Franceschinis
WebSPN: A Flexible Tool for the Analysis of Non-Markovian Stochastic Petri Nets
  Francesco Longo, Marco Scarpa and Antonio Puliafito
Modeling Availability Impact in Cloud Computing
  Paulo Romero Martins Maciel
Scalable Assessment and Optimization of Power Distribution Automation Networks
  Alberto Avritzer, Lucia Happe, Anne Koziolek, Daniel Sadoc Menasche, Sindhu Suresh and Jose Yallouz
Model Checking Two Layers of Mean-Field Models
  Anna Kolesnichenko, Anne Remke, Pieter-Tjerk de Boer and Boudewijn R. Haverkort
Part III Checkpointing and Queueing
Standby Systems with Backups
  Gregory Levitin and Liudong Xing
Reliability Analysis of a Cloud Computing System with Replication: Using Markov Renewal Processes
  Mitsutaka Kimura, Xufeng Zhao and Toshio Nakagawa
Service Reliability Enhancement in Cloud by Checkpointing and Replication
  Subrota K. Mondal, Fumio Machida and Jogesh K. Muppala
Linear Algebraic Methods in RESTART Problems in Markovian Systems
  Stephen Thompson, Lester Lipsky and Søren Asmussen
Vacation Queueing Models of Service Systems Subject to Failure and Repair
  Oliver C. Ibe
Part IV Software Simulation, Testing, Workloads, Aging, Reliability, and Resilience
Combined Simulation and Testing Based on Standard UML Models
  Vitali Schneider, Anna Deitsch, Winfried Dulz and Reinhard German
Workloads in the Clouds
  Maria Carla Calzarossa, Marco L. Della Vedova, Luisa Massari, Dana Petcu, Momin I.M. Tabash and Daniele Tessera
Reproducibility of Software Bugs
  Flavio Frattini, Roberto Pietrantuono and Stefano Russo
Constraint-Based Virtualization of Industrial Networks
  Waseem Mandarawi, Andreas Fischer, Amine Mohamed Houyou, Hans-Peter Huth and Hermann de Meer
Component-Oriented Reliability Assessment Approach Based on Decision-Making Frameworks for Open Source Software
  Shigeru Yamada and Yoshinobu Tamura
Measuring the Resiliency of Extreme-Scale Computing Environments
  Catello Di Martino, Zbigniew Kalbarczyk and Ravishankar Iyer
Contributors
Ali Abdollahi Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, USA
Elvio Gilberto Amparore Dipartimento di Informatica, Università di Torino, Turin, Italy
Søren Asmussen Department of Mathematics, Aarhus University, Aarhus C, Denmark
Alberto Avritzer Siemens Corporation, Corporate Technology, Princeton, NJ, USA
Narayanaswamy Balakrishnan Department of Mathematics and Statistics, McMaster University, Hamilton, ON, Canada
Gianfranco Balbo Dipartimento di Informatica, Università di Torino, Turin, Italy
Marco Beccuti Dipartimento di Informatica, Università di Torino, Turin, Italy
Dario Bruneo Dipartimento di Ingegneria, Università degli Studi di Messina, Messina, Italy
Maria Carla Calzarossa Dipartimento di Ingegneria Industriale e dell’Informazione, Università degli Studi di Pavia, Pavia, Italy
Pieter-Tjerk de Boer Department of Computer Science, University of Twente, Enschede, The Netherlands
Hermann de Meer Chair of Computer Networks and Computer Communications, University of Passau, Passau, Germany
Anna Deitsch Computer Science 7, Friedrich-Alexander-University of Erlangen-Nürnberg, Erlangen, Germany
Marco L. Della Vedova Dipartimento di Matematica e Fisica, Università Cattolica del Sacro Cuore, Brescia, Italy
Catello Di Martino Coordinated Science Laboratory, University of Illinois at Urbana-Champaign, Urbana, IL, USA
Salvatore Distefano Higher Institute for Information Technology and Information Systems, Kazan Federal University, Kazan, Russia; Dipartimento di Scienze Matematiche e Informatiche, Scienze Fisiche e Scienze della Terra, Università degli Studi di Messina, Messina, Italy
Tadashi Dohi Department of Information Engineering, Graduate School of Engineering, Hiroshima University, Higashihiroshima, Japan
Susanna Donatelli Dipartimento di Informatica, Università di Torino, Turin, Italy
Winfried Dulz Computer Science 7, Friedrich-Alexander-University of Erlangen-Nürnberg, Erlangen, Germany
Andreas Fischer Chair of Computer Networks and Computer Communications, University of Passau, Passau, Germany
Giuliana Franceschinis Dipartimento di Scienze e Innovazione Tecnologica, Università del Piemonte Orientale, Alessandria, Italy
Flavio Frattini Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione, Università degli Studi di Napoli Federico II, Napoli, Italy
Reinhard German Computer Science 7, Friedrich-Alexander-University of Erlangen-Nürnberg, Erlangen, Germany
Lucia Happe Institute for Program Structures and Data Organization, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Boudewijn R. Haverkort Department of Computer Science, University of Twente, Enschede, The Netherlands
Poul E. Heegaard Department of Telematics, Norwegian University of Science and Technology, Trondheim, Norway
Bjarne E. Helvik Department of Telematics, Norwegian University of Science and Technology, Trondheim, Norway
Andras Horvath Dipartimento di Informatica, Università degli Studi di Torino, Turin, Italy
Amine Mohamed Houyou Siemens AG, Corporate Technology, Munich, Germany
Hans-Peter Huth Siemens AG, Corporate Technology, Munich, Germany
Oliver C. Ibe Department of Electrical and Computer Engineering, University of Massachusetts, Lowell, MA, USA
Ravishankar Iyer Bell Labs—Nokia, New Providence, NJ, USA
Zbigniew Kalbarczyk Bell Labs—Nokia, New Providence, NJ, USA
Mitsutaka Kimura Department of Cross Cultural Studies, Gifu City Women’s College, Gifu, Japan
Anuradha Kodali University of California Santa Cruz, NASA Ames Research Center, Moffett Field, CA, USA
Anna Kolesnichenko UL Transaction Security Division, Leiden, The Netherlands
Anne Koziolek Institute for Program Structures and Data Organization, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Gregory Levitin The Israel Electric Corporation, Haifa, Israel
Man Ho Ling Department of Mathematics and Information Technology, The Hong Kong Institute of Education, Hong Kong SAR, China
Lester Lipsky Department of Computer Science and Engineering, 371 Fairfield Way, Unit 4155, University of Connecticut, Storrs, CT, USA
Francesco Longo Dipartimento di Ingegneria, Università degli Studi di Messina, Messina, Italy
Peter B. Luh Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, USA
Fumio Machida Laboratory for Analysis of System Dependability, Kawasaki, Japan
Waseem Mandarawi Chair of Computer Networks and Computer Communications, University of Passau, Passau, Germany
Raymond A. Marie IRISA Campus de Beaulieu, Rennes University, Rennes Cedex, France
Paulo Romero Martins Maciel Centro de Informática, Universidade Federal de Pernambuco, Av. Jornalista Anibal Fernandes, s/n - Cidade Universitária, Recife, Brazil
Luisa Massari Dipartimento di Ingegneria Industriale e dell’Informazione, Università degli Studi di Pavia, Pavia, Italy
Daniel Sadoc Menasche Department of Computer Science, Federal University of Rio de Janeiro, Rio de Janeiro (UFRJ), RJ, Brazil
Subrota K. Mondal The Hong Kong University of Science and Technology, Kowloon, Hong Kong
Jogesh K. Muppala The Hong Kong University of Science and Technology, Kowloon, Hong Kong
Toshio Nakagawa Department of Business Administration, Aichi Institute of Technology, Toyota, Japan
Gianfranco Nencioni Department of Telematics, Norwegian University of Science and Technology, Trondheim, Norway
Hiroyuki Okamura Department of Information Engineering, Graduate School of Engineering, Hiroshima University, Higashihiroshima, Japan
Krishna R. Pattipati Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, USA
Dana Petcu Departament Informatica, Universitatea de Vest din Timişoara, Timişoara, Romania
Roberto Pietrantuono Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione, Università degli Studi di Napoli Federico II, Napoli, Italy
Antonio Puliafito Dipartimento di Ingegneria, Università degli Studi di Messina, Messina, Italy
Anne Remke Department of Computer Science, University of Münster, Münster, Germany
Gerardo Rubino Inria Rennes—Bretagne Atlantique, Campus de Beaulieu, Rennes Cedex, France
Stefano Russo Dipartimento di Ingegneria Elettrica e delle Tecnologie dell’Informazione, Università degli Studi di Napoli Federico II, Napoli, Italy
Marco Scarpa Dipartimento di Ingegneria, Università degli Studi di Messina, Messina, Italy
Vitali Schneider Computer Science 7, Friedrich-Alexander-University of Erlangen-Nürnberg, Erlangen, Germany
Bruno Sericola Inria Rennes—Bretagne Atlantique, Campus de Beaulieu, Rennes Cedex, France
Satnam Singh Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, USA
Hon Yiu So Department of Mathematics and Statistics, McMaster University, Hamilton, ON, Canada
Sindhu Suresh Siemens Corporation, Corporate Technology, Princeton, NJ, USA
Momin I.M. Tabash Dipartimento di Ingegneria Industriale e dell’Informazione, Università degli Studi di Pavia, Pavia, Italy
Yoshinobu Tamura Department of Electronic and Information System Engineering, Yamaguchi University, Ube-shi, Yamaguchi, Japan
Miklos Telek MTA-BME Information Systems Research Group, Department of Networked Systems and Services, Budapest University of Technology and Economics, Budapest, Hungary
Daniele Tessera Dipartimento di Matematica e Fisica, Università Cattolica del Sacro Cuore, Brescia, Italy
Stephen Thompson Department of Computer Science and Engineering, 371 Fairfield Way, Unit 4155, University of Connecticut, Storrs, CT, USA
Jonas Wäfler Department of Telematics, Norwegian University of Science and Technology, Trondheim, Norway
Liudong Xing University of Massachusetts Dartmouth, Dartmouth, MA, USA
Jose Yallouz Department of Electrical Engineering, Israel Institute of Technology (Technion), Haifa, Israel
Shigeru Yamada Department of Social Management Engineering, Tottori University, Tottori-shi, Japan
Shigang Zhang Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, USA
Xufeng Zhao Department of Mechanical and Industrial Engineering, Qatar University, Doha, Qatar
Review of Prof. Trivedi’s Contributions to the Field
Kishor S. Trivedi holds the Hudson Chair in the Department of Electrical and Computer Engineering at Duke University, Durham, NC. He has been on the Duke faculty since 1975.
Professor Trivedi is a leading international expert in the domain of reliability and performability evaluation of fault-tolerant systems. He has made seminal contributions to stochastic modeling formalisms and their efficient solution. He has written influential textbooks that contain not only tutorial material but also the latest advances. He has encapsulated the algorithms he developed in usable and well-circulated software packages. He has made key contributions by applying the research results to practical problems, working directly with industry, and has been able not only to solve difficult real-world problems but also to develop new research results based on these problems.
Professor Trivedi is the author of the well-known text, Probability and Statistics with Reliability, Queuing and Computer Science Applications, published by Prentice-Hall, which has allowed hundreds of students, researchers, and practitioners to learn analytical modeling techniques. A thoroughly revised second edition (including its Indian edition) of this book was published by John Wiley in 2001. A comprehensive solution manual for the second edition containing more than 300 problem solutions is available from the publisher. This book is the first of its kind to present a balanced treatment of reliability and performance evaluation, while introducing the basic concepts of stochastic models. This book has made a unique contribution to the field of reliability and performance evaluation.
Professor Trivedi has also authored two further books: Performance and Reliability Analysis of Computer Systems, published by Kluwer Academic Publishers in 1995, and Queueing Networks and Markov Chains, published by John Wiley in 1998. Both are well known and focus on performance and reliability evaluation techniques and methodologies. His second book has already made inroads for self-study with practicing engineers and has been adopted by several universities. The third book is considered to be an important book in queueing theory with several interesting example applications. The second edition of this book was published in 2006. He has also edited two books, Advanced Computer System Design, published by Gordon and Breach Science Publishers, and Performability Modeling Tools and Techniques, published by John Wiley & Sons.
He is a Fellow of the Institute of Electrical and Electronics Engineers and a Golden Core Member of the IEEE Computer Society. He has published over 500 articles and has supervised 44 Ph.D. dissertations and 30 postdoctoral associates. He is on the editorial boards of IEEE Transactions on Dependable and Secure Computing, Journal of Risk and Reliability, International Journal of Performability Engineering, and International Journal of Quality, Reliability and Safety Engineering.
Professor Trivedi has taken significant steps to implement his modeling techniques in tools, which is necessary for the transition from state-of-the-art research to best practices for industry. These tools are used extensively by practicing engineers, researchers, and instructors. They include HARP (Hybrid Automated Reliability Predictor), which has been installed at nearly 120 sites; SHARPE (Symbolic Hierarchical Automated Reliability and Performance Evaluator), which is used at over 550 sites; and SPNP (Stochastic Petri Net Package), which has been installed at over 400 sites. Trivedi also helped design and develop IBM’s SAVE (System Availability Estimator), Software Productivity Consortium’s DAT (Dynamic Assessment Toolset), Boeing’s IRAP (Integrated Reliability Analysis Package), and SoHar’s SDDS. Graphical user interfaces for these tools have been developed. These packages have been widely circulated and represent a reference point for all researchers in the field of performance evaluation. Most researchers in the field are probably aware of such tools and have very likely used one or more of them.
Trivedi has helped several companies carry out reliability and availability prediction for existing products as well as the ones undergoing design. A partial list of companies for which he has provided consulting services includes 3Com, Avaya, Boeing, DEC, EMC, GE, HP, IBM, Lucent, NEC, TCS, and Wipro. Notable among these is his help to model the reliability of the current return network subsystem of the Boeing 787 for FAA certification. The algorithm he developed for this problem has been jointly patented by Boeing. He led the development of the reliability and availability model of SIP on IBM WebSphere, a model that facilitated the sale of the system to a telco giant.
Kishor has developed polynomial time algorithms for performability analysis, numerical solution techniques for completion time problems, algorithms for the numerical solution of the response time distribution in a closed queueing network, techniques to solve large and stiff Markov chains, and algorithms for the automated generation and solution of stochastic reward nets, including sensitivity and transient analysis. His contributions to numerical solution and decomposition techniques for large and stiff stochastic models have considerably relaxed the limits on the size and stiffness of the problems that could be solved by alternative contemporary techniques. He has also developed fast algorithms for the solution of large fault trees and reliability graphs, including multistate components and phased-mission systems analysis. He has defined several new paradigms of stochastic Petri nets (Markov regenerative stochastic Petri nets and fluid stochastic Petri nets), which further extend the power of performance and reliability modeling into non-Markovian and hybrid (mixed, continuous-state, and discrete-state) domains.
His recent work on architecture-based software reliability is very well cited. He and his group have pioneered the areas of software aging and rejuvenation. His group has not only made many theoretical contributions to the understanding of software aging and rejuvenation scheduling but has also been the first to collect data and develop algorithms for the prediction of time to resource exhaustion, leading to adaptive and on-line control of rejuvenation. His methods of software rejuvenation were implemented in the IBM X-series servers three years after the research was first done, a record time in technology transfer. In recognition of his pioneering work on the aging and rejuvenation of software, Kishor was awarded the title of Doctor Honoris Causa from the Universidad de San Martín de Porres, Lima, Peru, on August 12, 2012. He has now moved on to experimental research into software reliability during operation, where he is studying affordable software fault tolerance through environmental diversity.
Professor Trivedi developed solution methods for Markov regenerative processes and used them for performance and reliability analysis. He has applied his modeling techniques to a variety of real-world applications, including performance analysis of polling systems and client–server systems, wireless handoff, connection admission control in CDMA systems, reliability analysis of RAID and FDDI token rings, availability analysis of VAXcluster systems, transient performance analysis of the leaky bucket rate control scheme, and the analysis of real-time systems.
The contributions of Trivedi’s work are fundamental to the design of reliable digital systems. His papers on modeling fault coverage are recognized as milestones. His papers on hierarchical modeling and fixed-point iteration represent breakthroughs in solving large reliability models. His advances in transient analysis are recognized by computer scientists as leading the state of the art. His contributions to the field of stochastic Petri nets and his work on performability modeling appear as benchmark references on these topics.
Among Prof. Trivedi’s enormous scientific output, the following five papers are representative of his contribution to the scientific community:
[1] Gianfranco Ciardo and Kishor S. Trivedi. “A decomposition approach for stochastic reward net models.” Performance Evaluation 18(1) (1993): 37–59. Develops a decomposition approach to large Markovian Petri nets; both theoretical and practical aspects are considered.
[2] Hoon Choi, Vidyadhar G. Kulkarni, and Kishor S. Trivedi. “Markov regenerative stochastic Petri nets.” Performance Evaluation 20(1) (1994): 337–357. Develops a new formalism of MRSPN that allows generally distributed firing times concurrently with exponentially distributed firing times.
[3] Vittorio Castelli, Richard E. Harper, Philip Heidelberger, Steven W. Hunter, Kishor S. Trivedi, Kalyanaraman Vaidyanathan, and William P. Zeggert. “Proactive management of software aging.” IBM Journal of Research and Development 45(2) (2001): 311–332. Details the implementation of software rejuvenation in IBM X-series as an example of tech transfer from academia to industry.
[4] Sachin Garg, Aad van Moorsel, Kalyanaraman Vaidyanathan, and Kishor S. Trivedi. “A methodology for detection and estimation of software aging.” Proc. Ninth International Symposium on Software Reliability Engineering (1998): 283–292. First experiment to demonstrate the existence of software aging and prediction of time to resource exhaustion.
[5] Andrew Reibman and Kishor S. Trivedi. “Numerical transient analysis of Markov models.” Computers & Operations Research 15(1) (1988): 19–36. Benchmark paper on numerical transient analysis of Markov models.