Corporate Technology

Approaches to Ultra Long-Term System Maintenance

Embedded Linux Conference Europe 2016

Prof. Dr. Wolfgang Mauerer
Siemens AG, Corporate Research and Technologies
Smart Embedded Systems
Corporate Competence Centre Embedded Linux

Copyright © 2016, Siemens AG. All rights reserved.

11. Oct. 2016, W. Mauerer, Siemens Corporate Technology

Overview

1 Introduction

2 Aspects of Long-Term Maintenance
  Architectural Characteristics
  Threats and Risks

3 Technical Aspects
  Payload Software
  Developing, Building and Testing
  In-Field Strategy

4 Backporting & Processes
  Backporting: Conceptual and Technical Issues

Introduction 0

Disclaimer

Many statements: Extremely obvious
Realisation: Quite remote for many problematic appliances
Quantification: Astonishingly hard...

Introduction I

Consumer Electronics

Mobile phones, notebooks, tablets, ...
Entertainment systems (radio, TV, DVD/Blu-ray, ...)
Ovens, washing machines, home control/automation

Industrial Systems

Medical devices
  Computed tomography, X-ray imaging, ultrasound, ...
Infrastructure
  Gas, power, water supply
  Power stations and transformers
  Traffic lights, parking space management
Mobility
  Planes, trains, automobiles, Mars rovers, space stations
...

Introduction II

Fundamental questions

Is long-term maintenance (LTM) reasonable/doable?
System architecture for LTM?

Innovation Cycles I

Lifespans

Consumer devices: 2-5 years
Mobility: 5-20 years
Industrial: 10-30 years
Infrastructure: 30-80 years (and up!)

All domains: Linux, of course!

Long-life requirements not restricted to industrial appliances!
IoT, smart home, connected devices: Longevity requirements pervade everyday devices
Short lifespans: Exception, not rule!

Innovation Cycles II

Innovation Cycles III

Fundamental Questions

Risks and benefits of updates?
How to restrict updates to (isolated) areas?
How to avoid updates?

Beyond components

Questions not addressed by simply using LTS components/distros
LTM: Architectural issue
LTM: Mindset issue

Some field observations/bogus assumptions

All components can be upgraded in-field ✗
Updates fix more problems than they create ✗
Upstream integration always reduces maintenance effort ✗
Long-term component versions solve maintenance problems ✗

Appliance Architecture

Long-term maintenance vs. periodic re-building

Fixed (trusted) ↔ arbitrary payload software
Isolated ↔ universally accessible
Hardware stability ↔ variance
Fixed hardware ↔ extensibility (e.g., USB)
Verification and safety requirements
Cost sensitivity (core payload inside virtual environments?)

Software aspects

System base software: Little/no control
Payload software + architecture: Full control ⇒ LTM Focus!

Threats and Risks

What should LTM prevent in your case?

Device stops working
Device faults cannot be repaired/debugged
Device can be influenced from outside
Device does not meet changed expectations (functionality, interoperability, ...)

Response catalogue

Ignore issues (can be reasonable, on rare occasions)
Replace device (HW + SW; component)
Modify SW ⇐ case of interest

Payload Software

Software Engineering Considerations

Deliver maintainable software/architecture in the first place
Minimise cross-cutting issues
Harmonise technical and social organisation
Meaningful and reproducible history

Think three times before connecting systems to networks; then think three more times
"Translator" with domain and (base component) community knowledge
Make components (run-time) replaceable; prefer userland to kernel

Development & Building I

Reproducible Builds

Produce binaries... 20 years after initial launch
Payload application + modifiable system components
Preserve base component binaries

Documentation of (seemingly trivial) details essential
Documentation availability (hardcopy is a serious alternative)
Avoid custom build systems

Source Code

Availability of source code + history (e.g., BitKeeper...)
Component states + local provision of dependencies
Includes build infrastructure!

Tool Chain

Cross-building: (subtle) dependencies!
Isolate build environment in VM (strict freeze!)
Bugs in ancient toolchain: Payload SW workarounds
Eclipse etc.: harder... ✗
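A minimal sketch of how a frozen build environment could be checked for drift, assuming the sources, dependencies and toolchain files live below one directory. The function names and the manifest layout are illustrative assumptions, not part of the talk: a manifest of SHA-256 digests is recorded at release time and compared against the archive years later.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file below root (relative POSIX path) to its digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_entries(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List paths that were added, removed, or modified between manifests."""
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )
```

An empty result from `changed_entries` suggests that the archived inputs still match the state recorded at release; it says nothing about whether the host running the build VM has changed.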

Development & Building II

Component Selection and Integration

Consider cost of libraries
  Dynamically changing dependencies (version requirement specs often unreliable)
  Changes in components ⇒ (silent) breakage in library

Distinguish between prototype and deliverable
  Experiment with 17 machine learning algorithms
  Deploy one (+ rewrite)
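One way to reduce exposure to dynamically changing dependencies is to reject any requirement that is not pinned to a single version. The sketch below assumes a simple `name==version` line format (as used by some package managers); the format and the function name are assumptions for illustration only.

```python
import re

# Assumption: one requirement per line, written as "name==1.2.3".
EXACT_PIN = re.compile(r"^[A-Za-z0-9_.\-]+==[A-Za-z0-9_.\-+!]+$")


def unpinned(requirements: list[str]) -> list[str]:
    """Return requirement lines that do not pin one exact version."""
    flagged = []
    for line in requirements:
        entry = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if entry and not EXACT_PIN.match(entry):
            flagged.append(entry)
    return flagged
```

Exact pins only help if the pinned artefacts are also archived locally, since an upstream index may drop or replace old versions.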

Development prior to market release

Develop against latest mainline state (rebasing preferred)
Avoid vendor BSPs. Board support essential, not BSPs!
  Only chance: Prior to purchasing 1.8 × 10^23 units
System changes: Upstream first policy
Avoid component modifications (socio-technical congruence)
  Especially for features useless for upstream
Minimise divergence between upstream state and product at release time

Development & Building III

Five Recommendations

1 Avoid complex development environments and generated code
2 Avoid web technologies
3 Use convenience libraries judiciously
4 Avoid integration/consolidation; delegate communication/networking to separate entities
5 Document and automate excessively

Options for post-release development

Rolling Development

Continuous updates of (selected) base components
Uncouple progress from distribution (e.g., after EOL)
Detect issues early, re-invent distribution wheel

Distribution schedule

sudo apt-get upgrade
Requires support by distribution!

System schedule

Update in appliance-specific intervals (periodic or irregular)
Combine disadvantages of distribution and continuous updates

Invariant base system

Don't update base system
Payload application development only
Requires (extremely) small attack surface/virtualised base system

Backporting I

Leverage LTSI kernel

LTSI support period: Comprehensive coverage
Post-LTSI: Backport only
  Critical issues regarding attack surfaces
  Orthogonal drivers/components
Feature not required during first 5 years ⇒ unlikely required in next decade(s)
Major change required (debugging, tracing etc.): Time for new release...

Backport patch stack

Organise backports in proper orthogonal patch stack
Rebase! Living organism, not code dump
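One possible reading of "orthogonal" is that no two patches in the stack modify the same file, so each can be rebased or dropped independently. The sketch below checks that property; the interpretation, the function name and the input structure (patch name mapped to the set of files it touches) are assumptions.

```python
from itertools import combinations


def overlapping_patches(stack: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return pairs of patches that modify at least one common file."""
    conflicts = []
    for (name_a, files_a), (name_b, files_b) in combinations(stack.items(), 2):
        shared = files_a & files_b
        if shared:
            conflicts.append((name_a, name_b, shared))
    return conflicts
```

Pairs reported here are candidates for merging into one patch or for explicit ordering; file-level overlap is only a coarse proxy for real interdependence.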

Backporting II: What and when to backport

1.) What to backport

Most upstream changes do not require backporting
Selection crucial
Selection criteria differ depending on use case

Approaches

Keyword filtering (possible for well-tended projects)
Content/file/path based filtering (tremendous volume reduction)
Manual review + long-term expert involvement necessary
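The filtering approaches above can be sketched as a two-stage pre-selection that feeds manual review. The `Commit` structure, the keyword list and the path prefixes are hypothetical and would be chosen per appliance; combining both filters with a logical AND is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Minimal stand-in for an upstream commit (hypothetical structure)."""
    commit_id: str
    subject: str
    paths: tuple[str, ...]


# Assumptions: keywords and path prefixes are selected for one specific product.
KEYWORDS = ("cve", "security", "overflow", "use-after-free")
RELEVANT_PREFIXES = ("net/", "drivers/net/ethernet/", "kernel/")


def touches_relevant_path(commit: Commit) -> bool:
    """Path-based filter: keep commits touching code the appliance uses."""
    return any(path.startswith(RELEVANT_PREFIXES) for path in commit.paths)


def mentions_keyword(commit: Commit) -> bool:
    """Keyword filter: relies on descriptive, well-tended commit subjects."""
    subject = commit.subject.lower()
    return any(keyword in subject for keyword in KEYWORDS)


def review_queue(commits: list[Commit]) -> list[Commit]:
    """Return the commits that remain for manual expert review."""
    return [c for c in commits if touches_relevant_path(c) and mentions_keyword(c)]
```

The output is a review queue rather than a backport list: a keyword filter misses fixes with uninformative subjects, which is why expert review stays in the loop.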

2.) When to backport

Proactively
After incidents/bugs

Simple criterion

# incidents > # backport regressions
Historical data: No conclusive evidence
Expert assessment required
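One possible reading of the simple criterion, as a sketch: proactive backporting is favoured when the incidents observed without it outnumber the regressions that backports introduced. The function name and the way both counts are obtained are assumptions; the criterion does not replace expert assessment.

```python
def proactive_backporting_favoured(incidents: int, backport_regressions: int) -> bool:
    """Apply the criterion '# incidents > # backport regressions'."""
    if incidents < 0 or backport_regressions < 0:
        raise ValueError("counts must be non-negative")
    return incidents > backport_regressions
```

Both counts should cover the same observation period and product line, otherwise the comparison is meaningless.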

Backporting III

The human touch

Determine when no action is required
Notify users/customers

Backporting IV

Goal: Simplify patch selection for everyone

Fully automatic approach: unrealistic
Avoid duplicated manual efforts

Wishlist: Improvements

Maintenance classes (for patches), consistent across projects
  Current schemes dating back to the 1970s ⇒ survey!
Applicability range (releases) annotation
Wide-spread use of automated approaches (backwards integration, testing)
Extend use of semantic vs. text-based modifications
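To illustrate the applicability-range item of the wishlist, the sketch below shows what a machine-readable annotation could look like. The field names and the dotted release format are hypothetical; no existing project convention is implied.

```python
from dataclasses import dataclass


def parse_release(text: str) -> tuple[int, ...]:
    """Turn '4.4.12' into (4, 4, 12) so releases compare numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class AnnotatedPatch:
    """A patch carrying the range of releases it applies to (hypothetical)."""
    subject: str
    first_affected: str  # oldest release the fix applies to (inclusive)
    last_affected: str   # newest release the fix applies to (inclusive)

    def applies_to(self, release: str) -> bool:
        # Bounds are compared component-wise as written, so an upper bound
        # of '4.8' does not cover '4.8.1'; spell out the full point release.
        return (
            parse_release(self.first_affected)
            <= parse_release(release)
            <= parse_release(self.last_affected)
        )
```

With such an annotation, a maintainer of a product based on release 4.4.12 could discard patches whose range ends before or starts after that release without reading them.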

Summary

LTM best practices: Similar to proper (OSS-style) software development
System and application architecture: crucial
LTM: not rocket science, but still more art than science; quantitative data required!

Thanks for your interest!
