ship happens: a better firefox build and release pipeline

Kim Moir (kmoir), Mozilla Release Engineering Ship Happens: A better Firefox build & release pipeline

Upload: kim-moir

Post on 28-Jan-2018


TRANSCRIPT

Page 1: Ship happens:  A better firefox build and release pipeline

Kim Moir (kmoir), Mozilla Release Engineering

Ship Happens: A better Firefox build & release pipeline

Page 3: Ship happens:  A better firefox build and release pipeline

Today’s agenda

● Faster pipelines and what they mean for you

● How to try it yourself!

● Lessons learned and what’s next

Page 6: Ship happens:  A better firefox build and release pipeline

Mozilla Releng live here

Page 10: Ship happens:  A better firefox build and release pipeline

Release times

● 2013 - 11 hours

● 2017 - 4-5 hours

Page 11: Ship happens:  A better firefox build and release pipeline

Continuous integration

● Land code

● Unit tests

● Decision graph

● Builds x N platforms

● Performance tests

● Sign Builds

Page 12: Ship happens:  A better firefox build and release pipeline

Nightlies

● Land code

● Unit tests

● Decision graph

● Builds x N platforms

● Performance tests

● Sign Builds

● Generate updates

● L10n

Page 13: Ship happens:  A better firefox build and release pipeline

Release process using release promotion

● Use existing build artifacts

● Generate updates

● L10n

● Unit tests

● Decision graph

● Sign Builds

● Performance tests

● Repackage Builds

+

● Move artifacts

● Refresh update db rules

● Update websites with release

Page 14: Ship happens:  A better firefox build and release pipeline

About:Taskcluster

● Taskcluster is a task execution framework that supports Mozilla’s continuous integration farm + release pipeline

It is a set of components that manages task queuing, scheduling, execution and provisioning of resources.

Page 16: Ship happens:  A better firefox build and release pipeline

Why: In-tree and Decision Graph

● Build and test configs are all in tree

○ Good news: Developer autonomy

○ Bad news: Developer autonomy

● Decision graph upon push identifies failures more quickly

● Changes can be tested locally and on try
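
As a rough illustration of what a decision graph provides, the sketch below models tasks and their dependencies as a plain dictionary and computes an order in which they could be scheduled. The task labels and the `schedule_order` helper are hypothetical; the real graph is generated from the in-tree Taskcluster configuration, not by code like this.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each label maps to the set of labels it depends on.
TASKS = {
    "build-linux64": set(),
    "build-macosx64": set(),
    "test-linux64-mochitest": {"build-linux64"},
    "test-macosx64-reftest": {"build-macosx64"},
    "sign-linux64": {"build-linux64"},
}


def schedule_order(tasks):
    """Return task labels in an order that respects every dependency."""
    return list(TopologicalSorter(tasks).static_order())


if __name__ == "__main__":
    for label in schedule_order(TASKS):
        print(label)
```

Because the whole graph is known up front, a missing or circular dependency surfaces when the graph is built rather than hours later when a downstream job fails to start.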

Page 17: Ship happens:  A better firefox build and release pipeline

Testing the graph locally

● Generate the full taskgraph

○ ./mach taskgraph full > full.txt

● Generate an optimized taskgraph

○ ./mach taskgraph optimized > optimized.txt

● Generate a target taskgraph

○ ./mach taskgraph target -p parameters.yml > target.txt

● Generate a target taskgraph as JSON to inspect the content of the graph

○ ./mach taskgraph target --json -p parameters.yml > target.json
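
Once the graph has been dumped as JSON, a few lines of Python are enough to summarize it. This is a minimal sketch that assumes the JSON is an object mapping each task label to a record with a "kind" field; that layout is an assumption, so check it against the actual output before relying on it. The helper names and the sample data are hypothetical.

```python
import json
from collections import Counter


def load_graph(path):
    """Load a task graph dumped as JSON (assumed shape: label -> task record)."""
    with open(path) as handle:
        return json.load(handle)


def count_tasks_by_kind(graph):
    """Count tasks per kind; records without a "kind" key count as "unknown"."""
    return Counter(task.get("kind", "unknown") for task in graph.values())


# Tiny in-memory example with made-up labels, standing in for a real dump.
sample = {
    "build-linux64/opt": {"kind": "build"},
    "test-linux64/opt-mochitest-1": {"kind": "test"},
    "test-linux64/opt-reftest-1": {"kind": "test"},
}
for kind, count in count_tasks_by_kind(sample).most_common():
    print(f"{kind}: {count}")
```

For a real graph, pass the path of the file that the `--json` output was redirected to into `load_graph` and feed the result to `count_tasks_by_kind`.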

Page 20: Ship happens:  A better firefox build and release pipeline

● Taskcluster config files are under taskcluster/ in tree

○ Example: taskcluster/ci/build/macosx.yml defines mac builds (which actually run on Linux)

Page 21: Ship happens:  A better firefox build and release pipeline

Changing tests

● YAML files in taskcluster/ci/test/ define test groups by suite name, e.g. mochitest, reftest, talos, etc.

Page 24: Ship happens:  A better firefox build and release pipeline

Why: Docker Containers

● Docker containers for test and build images (not all platforms)

○ Consistent environment to debug build and test failures via one click loaners

○ More self-serve developer loaners

Page 29: Ship happens:  A better firefox build and release pipeline

Why: More autoscaling

● Moved more platforms to AWS to enable autoscaling in response to bursty load

○ Moved Mac OS X builds to Linux cross-compiles on AWS

○ Moved many Windows builds/tests to AWS
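
To make "autoscaling in response to bursty load" concrete, here is a toy sizing rule: ask for enough workers to cover pending plus running tasks, clamped between a floor and a ceiling. The function, its parameters, and the default limits are all illustrative assumptions and do not describe how Mozilla's provisioning actually decides.

```python
import math


def desired_workers(pending_tasks, running_tasks, tasks_per_worker=1,
                    min_workers=0, max_workers=100):
    """Return a worker count that covers the current load, within limits.

    All parameters are hypothetical knobs for this sketch only.
    """
    needed = math.ceil((pending_tasks + running_tasks) / tasks_per_worker)
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    print(desired_workers(pending_tasks=250, running_tasks=20))  # burst after a large push
    print(desired_workers(pending_tasks=0, running_tasks=3))     # quiet period
```

A fixed on-premises pool has to be sized for the peak; a rule like this lets the pool shrink when the tree is quiet and grow when many pushes land at once.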

Page 30: Ship happens:  A better firefox build and release pipeline

Why: More security

● Better security - Chain of Trust (CoT) between artifacts as they are built, signed and moved to AWS S3/CDNs for download on releases/nightlies

● CoT is the security model for releases

● Task execution is restricted by taskcluster scopes, but that is only one type of authentication

● CoT allows us to trace requests back to the tree and verify each previous task in the chain.

● If CoT fails, the task is marked as invalid
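
The idea of verifying each previous task in the chain can be sketched with plain digests. The real Chain of Trust is considerably more involved than this; the record layout, the `verify_chain` helper, and the use of bare SHA-256 hashes are simplifying assumptions made only to show the walk back through upstream tasks.

```python
import hashlib


def sha256(data: bytes) -> str:
    """Hex SHA-256 digest of some artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_chain(task_id, tasks, artifacts):
    """Walk upstream from task_id and check every recorded artifact digest.

    tasks:     task_id -> {"upstream": [task_id, ...], "artifact_sha256": str}
    artifacts: task_id -> artifact bytes as actually downloaded
    Returns True only if every task in the chain matches its record.
    (Hypothetical data model, not the real CoT format.)
    """
    seen = set()
    stack = [task_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        record = tasks.get(current)
        if record is None or current not in artifacts:
            return False  # unknown task or missing artifact breaks the chain
        if sha256(artifacts[current]) != record["artifact_sha256"]:
            return False  # artifact does not match what the task recorded
        stack.extend(record["upstream"])
    return True
```

A single mismatch anywhere upstream makes the whole chain fail, which mirrors the rule that a task is marked invalid when CoT verification fails.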

Page 31: Ship happens:  A better firefox build and release pipeline

Why+?

● Team learned new things - Docker, transforms, migration strategies, microservices, monitoring

● Future efficiencies - allow us to continue to scale

● Migrate off technologies that did not scale to our needs

● Re-evaluate existing jobs: Are they still needed? Could they be improved?

Page 32: Ship happens:  A better firefox build and release pipeline

Timeline for migration (2017)

● Jan 20 - Linux Desktop and Android Firefox nightly builds from Taskcluster

● Mar 13 - Mobile beta in Taskcluster

● July 2 - Mac Nightlies in Taskcluster

● Aug 30 - Windows nightlies in Taskcluster

● Nov 14 - Shipped Firefox Quantum in Taskcluster

Page 33: Ship happens:  A better firefox build and release pipeline

Approach to migration

● Incremental portions of pool

● Communication

● Checklist

● Monitor capacity and wait times

● Monitor state after migration

● Rollback plan

● Decommission old

● Migrate more

Page 35: Ship happens:  A better firefox build and release pipeline

Strangler Application - Martin Fowler

Page 36: Ship happens:  A better firefox build and release pipeline

56 was a rough release

● We had many automation changes

○ New compression format for updates

○ Watersheds for win32->win64 migration for people on 64-bit hardware

○ Win32/Win64 on Taskcluster

Page 38: Ship happens:  A better firefox build and release pipeline

Operation: Don’t F*ck up 57

● Implement missing release automation

● Fix our staging environment

● Smooth our merge day process

● Train team members on merges and staging releases

● Run staging releases and merges to iron out any issues before 57 releases

● Write tests to validate update rules for 57

● Spreadsheet to coordinate update rules with relman
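
A test for update rules might look roughly like the sketch below. It models only the watershed idea (a client below a watershed version is sent there first, then onward); the rule representation, the helper names, and the version numbers are invented for illustration and do not reflect the schema of Mozilla's update server.

```python
def next_update(current, watersheds, latest):
    """Return the version a client on `current` should be offered next.

    watersheds: ascending list of versions that must not be skipped.
    Returns None when the client is already on the latest version.
    """
    for stop in watersheds:
        if current < stop <= latest:
            return stop
    return latest if current < latest else None


def update_path(current, watersheds, latest):
    """List every version a client passes through to reach `latest`."""
    path = []
    while (target := next_update(current, watersheds, latest)) is not None:
        path.append(target)
        current = target
    return path


def test_watershed_is_not_skipped():
    # Hypothetical numbers: 56 is a watershed, 57 is the release being shipped.
    assert update_path(52, [56], 57) == [56, 57]
    assert update_path(56, [56], 57) == [57]
    assert update_path(57, [56], 57) == []
```

Checks like this can run against staging rules before release day, so a mistake in a rule shows up as a failed test rather than as clients receiving the wrong update.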

Page 39: Ship happens:  A better firefox build and release pipeline

What have we learned?

● Incrementalism - change one thing, evaluate, then change another

● Expectations change. The faster we build, the faster other groups expect to be able to ship

● Staging environment is important to test new automation

● Communication

● Organizational changes

● Consider the operational side, not just landing code

Page 40: Ship happens:  A better firefox build and release pipeline

Upcoming work

● In-tree release promotion for beta and release builds

● Release process optimizations: measure our release end-to-end times and common failure points, with the aim of providing more predictable and stable releases

● Staging releases on try

● More incremental fixes to make things faster

Page 41: Ship happens:  A better firefox build and release pipeline

I embrace mistakes, they make you who you are

―Beyoncé

Page 42: Ship happens:  A better firefox build and release pipeline

Questions?

Page 43: Ship happens:  A better firefox build and release pipeline

Additional Reading

● Justin Wood’s (Callek’s) talks on transforms

https://gitpitch.com/Callek/slideshows/transforms_2017

● All your nightlies are belong to Taskcluster

https://atlee.ca/blog/posts/migration-status.html

● Nightly builds from Taskcluster

https://atlee.ca/blog/posts/nightly-builds-from-taskcluster.html

● 2016 retrospective https://atlee.ca/blog/posts/2016-releng-retrospective.html

● What's So Special About "In-Tree?"

http://code.v.igoro.us/posts/2016/08/whats-so-special-about-in-tree.html

Page 44: Ship happens:  A better firefox build and release pipeline

Additional Reading

● Chris Cooper: Nightlies in Taskcluster

http://coopcoopbware.tumblr.com/post/156133487075/nightlies-in-taskcluster-go-team

● Chris Cooper: Mobile Betas in TC

http://coopcoopbware.tumblr.com/post/158362146735/shameless-self-release-promotion-firefox-530b1

● So you want to rewrite that - Camille Fournier, GOTO conference, Chicago, 2014

https://www.youtube.com/watch?v=PhYUvtifJXk