Sustainable Scientific Software Development Europython 2017 Alice Harpole


Page 1: Sustainable Scientific Software Development · Continuous integration tools regularly run tests for you & report back results & Find out when bugs occur much sooner - much easier

Sustainable Scientific Software Development

Europython 2017

Alice Harpole


Motivation
I model 'explosions in space'
or: the effects of including general relativity in models of Type I X-ray bursts in neutron star oceans


Motivation
Fed up of reading about exciting codes, only to find
- they're not open source
- they have next to no documentation
- questionable approaches to testing

This is not good science!


Overview
- What is software sustainability (and why should I care)?
- Why scientific software is different
- Scientific software development workflow
  - Version control
  - Testing
  - Continuous integration & code coverage
  - Documentation
  - Distribution
- Conclusions


What is software sustainability (and why should I care)?
- Will my code still work in 5/10/20 years' time?
- Can it be found?
- Can it be run?

If not, harms future scientific progress


What makes scientific software different?

- Built to investigate complex, unknown phenomena
- Often developed over long periods of time
- Can involve lots of collaboration
- Built by scientists, not software engineers

Turbulence modelled by Dedalus


The Scientific Method
In experimental science, results are not trusted unless they follow the scientific method:

- testing of apparatus
- documentation of method

Demonstrate experiment's results are accurate, reproducible and reliable


The Scientific Method
In computational science, we are doing experiments with the computer as our apparatus
We should also follow the scientific method and not trust results from codes without proper testing or documentation


PhD Comics


Development workflow
Goal: implement sustainable practices throughout development
Fortunately, there are lots of tools that will help us automate things!


Version control
- Keeps a log of all changes to code
- Computational science version of a lab book


Version control
- Aids collaboration - no overwriting each other's changes
- Can hack without fear - develop on a branch, so no danger of irreversibly breaking everything


Testing
Should not trust results unless
- apparatus & method (i.e. the software) that produced them has been demonstrated to work
- any limitations (e.g. numerical error, algorithm choice) are understood and quantified


Testing
Scientific codes can be hard to test as they
- are often complex
- investigate unknowns

Does not mean we should give up!


Testing: Step 1
Break it down with unit tests
- Can't trust the sum if the parts don't work
- Makes testing complex codes more manageable
- Make sure these cover entire parameter space and check code breaks when it should


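    import unittest

    def squared(x):
        return x*x

    class test_units(unittest.TestCase):

        def test_squared(self):
            # negative and large inputs give the expected square
            self.assertTrue(squared(-5) == 25)
            self.assertTrue(squared(1e5) == 1e10)
            # code should break when it should: non-numeric input raises
            self.assertRaises(TypeError, squared, "A string")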


Testing: Step 2
Build it back up with integration tests
- Need to check all parts work together
- Can get more difficult here
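
As a rough illustration of an integration test, the sketch below reuses the `squared` function from the unit test example and checks that a second routine built on top of it gives the right combined answer. The `sum_of_squares` function and the `TestIntegration` class are hypothetical names invented for this example; they are not from the talk.

```python
import unittest


def squared(x):
    """Unit already covered by its own unit test."""
    return x * x


def sum_of_squares(values):
    """Hypothetical higher-level routine that relies on squared()."""
    return sum(squared(v) for v in values)


class TestIntegration(unittest.TestCase):

    def test_sum_of_squares(self):
        # 3-4-5 triangle: 3**2 + 4**2 must equal 5**2
        self.assertEqual(sum_of_squares([3, 4]), squared(5))
        # empty input should sum to zero
        self.assertEqual(sum_of_squares([]), 0)
```

Running `python -m unittest` in the directory containing this file would discover and run the test alongside the unit tests.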


Testing: Step 3
Monitor development with regression tests
- Check versions against each other
- Performance should improve (or at least not get worse)
- Bonus! Helps enforce backwards compatibility for users
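
One possible shape for a regression test is sketched below: the output of the current version is compared against reference output produced by an earlier, trusted version. The `simulate` function is a hypothetical stand-in for a real code, and in practice the reference array would be loaded from a file saved by the earlier version rather than generated in the same script.

```python
import numpy


def simulate(n_steps):
    """Hypothetical stand-in for a real simulation."""
    t = numpy.linspace(0., 1., n_steps)
    return numpy.cumsum(numpy.sin(t)) / n_steps


def regression_test(reference, n_steps, rtol=1.e-10):
    """Return True if the current output matches the stored reference."""
    current = simulate(n_steps)
    return (current.shape == reference.shape
            and numpy.allclose(current, reference, rtol=rtol))


# assumption: here the reference is generated in place for illustration only
reference = simulate(100)
print(regression_test(reference, 100))
```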


Science-specific issues
Unknown behaviour
- Use controls - simple input data with known solution

Randomness
- isolate random parts
- test averages, check limits, conservation of physical quantities


    import numpy
    from numpy.random import rand
    import matplotlib.pyplot as plt

    data = rand(80,80) # declare some random data

    def func(a): # function to apply to data
        return a**2 * numpy.sin(a)

    output = func(data) # calculate & plot some function of random data
    plt.imshow(output); plt.colorbar(); plt.show()


Input is $0 \le x \le 1$, so output must satisfy

$0 \le f(x) \le \sin(1) \simeq 0.841$

and have mean

$\overline{f(x)} = \int_0^1 f(x)\,dx \simeq 0.223$

    def test_limits(a):
        if numpy.all(a >= 0.) and numpy.all(a <= 0.842):
            return True
        return False

    def test_average(a):
        if numpy.isclose(numpy.average(a), 0.223, rtol=5.e-2):
            return True
        return False

    if test_limits(output):
        print('Function output within correct limits')
    else:
        print('Function output is not within correct limits')
    if test_average(output):
        print('Function output has correct average')
    else:
        print('Function output does not have correct average')

    Function output within correct limits
    Function output has correct average


Science-specific issues
Simulations
- convergence tests - does accuracy of solution improve with order of algorithm used?
- if not, algorithm may not be implemented correctly

Numerical error
- use numpy.isclose & numpy.allclose
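
A small sketch of why tolerance-based comparison matters: floating point rounding means an exact `==` can fail even when the arithmetic is mathematically correct. The values below are chosen only for illustration.

```python
import numpy

a = 0.1 + 0.2

# exact comparison fails because of rounding error in binary floating point
print(a == 0.3)

# comparison within a tolerance succeeds
print(numpy.isclose(a, 0.3))

# allclose checks every element of an array at once,
# here the identity sin^2(x) + cos^2(x) = 1
xs = numpy.linspace(0., 1., 11)
identity = numpy.sin(xs)**2 + numpy.cos(xs)**2
print(numpy.allclose(identity, 1.))
```

Both functions take `rtol` and `atol` arguments, so the tolerance can be matched to the numerical error expected from the algorithm.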


    import numpy
    import matplotlib.pyplot as plt

    # use trapezium rule to find integral of sin x between 0,1
    hs = numpy.array([1. / (4. * 2.**n) for n in range(8)])
    errors = numpy.zeros_like(hs)

    for i, h in enumerate(hs):
        xs = numpy.arange(0., 1.+h, h)
        ys = numpy.sin(xs)

        # use trapezium rule to approximate integral of sin(x)
        integral_approx = sum((xs[1:] - xs[:-1]) * 0.5 * (ys[1:] + ys[:-1]))
        # error = exact integral (cos(0) - cos(1)) minus approximation
        errors[i] = -numpy.cos(1) + numpy.cos(0) - integral_approx

    plt.loglog(hs, errors, 'x', label='Error')
    plt.plot(hs, 0.1*hs**2, label=r'$h^2$')
    plt.xlabel(r'$h$'); plt.ylabel('error')
    plt.legend(); plt.show()
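
The plot above checks convergence by eye. A possible way to turn it into an automated test is to fit the slope of log(error) against log(h) and assert that it is close to the expected order of 2. The helper name `trapezium_error` and the tolerance of 0.1 are assumptions made for this sketch.

```python
import numpy


def trapezium_error(h):
    """Absolute error of the trapezium rule for the integral of sin(x) on [0, 1]."""
    n = int(round(1. / h))
    xs = numpy.linspace(0., 1., n + 1)
    ys = numpy.sin(xs)
    approx = numpy.sum((xs[1:] - xs[:-1]) * 0.5 * (ys[1:] + ys[:-1]))
    exact = numpy.cos(0.) - numpy.cos(1.)
    return abs(exact - approx)


hs = numpy.array([1. / (4. * 2.**n) for n in range(8)])
errors = numpy.array([trapezium_error(h) for h in hs])

# slope of log(error) against log(h) estimates the order of convergence
order = numpy.polyfit(numpy.log(hs), numpy.log(errors), 1)[0]
print(order)

# trapezium rule should be second order accurate
assert numpy.isclose(order, 2., atol=0.1)
```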


Continuous integration & code coverage

Continuous integration tools regularly run tests for you & report back results
- Find out when bugs occur much sooner - much easier to fix!
- Danger: outdated tests almost as useless as no tests
- If tests only cover 20% of code, why should you trust the other 80%?

Code coverage!

Continuous integration: Travis CI, CircleCI

Code coverage: Codecov


Documentation
Ideal: someone else in your field should be able to set up and use your code without extra help from you
- Include comprehensive installation instructions
- Document the code itself (sensible function & variable names, comments)
- User guide with examples to demonstrate usage
  - jupyter notebooks great for this

Automate with Sphinx, host at Read the Docs
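
As a sketch of documenting the code itself, the function below uses a sensible name, units in the parameter descriptions and a usage example in its docstring. The function is hypothetical and the NumPy docstring layout is just one common convention; Sphinx can collect docstrings like this with its autodoc extension (and the napoleon extension for this layout).

```python
def kinetic_energy(mass, velocity):
    """Return the Newtonian kinetic energy of a body.

    Parameters
    ----------
    mass : float
        Mass in kilograms.
    velocity : float
        Speed in metres per second.

    Returns
    -------
    float
        Kinetic energy in joules.

    Examples
    --------
    >>> kinetic_energy(2.0, 3.0)
    9.0
    """
    return 0.5 * mass * velocity**2


print(kinetic_energy(2.0, 3.0))
```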


Distribution
Make it findable
- Open source! (where possible)
- DOI e.g. from Zenodo

Reproducible results require a reproducible runtime environment
- package code in e.g. Docker container, conda environment, PyPI

Installation should be as painless as possible
- makefiles, try to limit reliance on non-open source libraries/material


Conclusions
- We need to future-proof our software
- Apply the scientific method to software development
- Only trust results from codes that are
  - reproducible (open source!)
  - tested
  - documented

Check out the SSI website www.software.ac.uk for more