parallelizing ci using docker swarm-mode · 2017. 12. 14. · solution: parallelize ci across...


Parallelizing CI using Docker Swarm-Mode

Akihiro Suda <[email protected]>

NTT Software Innovation Center

Copyright©2017 NTT Corp. All Rights Reserved.


Who am I

https://github.com/AkihiroSuda

• Software Engineer at NTT Corporation

• Docker core maintainer

• Previous talks at FLOSS community events

    • FOSDEM 2016

    • ApacheCon Core North America 2016

    • ...


A problem in the Docker project: CI is slow

[Figure: Jenkins build time trend for Docker PRs, annotated "120 min"; captured March 3, 2017]

https://jenkins.dockerproject.org/job/Docker-PRs/buildTimeTrend


How about other FLOSS projects?

[Figure: Jenkins build time trends of four other projects, annotated "200 min", "260 min", "240 min", and "550 min"; captured March 3, 2017]

• https://builds.apache.org/view/All/job/PreCommit-HDFS-Build/buildTimeTrend

• https://builds.apache.org/view/All/job/PreCommit-YARN-Build/buildTimeTrend

• https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/buildTimeTrend

• https://grpc-testing.appspot.com/job/gRPC_pull_requests_linux/buildTimeTrend

(For ease of visualization, we picked some projects that use Jenkins from various categories)


Why does slow CI matter?

• Blocker for reviewing/merging patches

• Discourages developers from writing tests

• Discourages developers from enabling additional testing features (e.g. race detector)

Poor implementation quality & slow development cycle


Solution: Parallelize CI?

e.g.

• `go test -parallel N`

• `mvn --threads N`

• `parallel`

• ...

But parallelization is not enough

• No isolation

    • Concurrent test tasks may race for certain shared resources (e.g. files under `/tmp`, TCP ports, ...)

• Poor scalability

    • CPU/RAM resource limitation

    • I/O parallelism limitation


Solution: Parallelize CI across Docker Swarm-mode

Distribute across Swarm-mode

Parallelize & isolate using Docker containers

✓ Isolation ✓ Scalability

• Docker provides isolation

• Docker Swarm-mode provides scalability, plus allows you to manage multiple Docker nodes as if they were a single node


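Chunking

• For ideal isolation, each test function should be encapsulated in its own container

• But this is not optimal in practice, because the setup/teardown code is then executed redundantly

    func TestMain(m *testing.M) {
        setUp()
        code := m.Run() // runs TestFoo, TestBar, ...
        tearDown()
        os.Exit(code) // report the test result as the exit status
    }

    func TestFoo(t *testing.T)

    func TestBar(t *testing.T)

[Diagram: one container per test function; setUp() and tearDown() are redundantly executed]

    container: setUp() → testFoo.Run() → tearDown()

    container: setUp() → testBar.Run() → tearDown()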


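Chunking

• So we execute a chunk of multiple test functions sequentially in a single container

[Diagram: the two single-test containers are merged into one container that runs a chunk]

    Before:

    container: setUp() → testFoo.Run() → tearDown()

    container: setUp() → testBar.Run() → tearDown()

    After:

    container: setUp() → testFoo.Run() → testBar.Run() → tearDown()    (chunk: testFoo, testBar)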
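The chunking step can be sketched in Go as below. This is a minimal illustration, not the code from the Docker repository: `chunkTests` is a hypothetical helper, and splitting into contiguous, nearly equal-sized chunks is an assumption.

```go
package main

import "fmt"

// chunkTests (hypothetical helper) splits tests into at most n contiguous
// chunks of nearly equal size. Each chunk is meant to run sequentially in
// one container, so setUp/tearDown run once per chunk, not once per test.
func chunkTests(tests []string, n int) [][]string {
	if n <= 0 || len(tests) == 0 {
		return nil
	}
	size := (len(tests) + n - 1) / n // ceiling division
	var chunks [][]string
	for start := 0; start < len(tests); start += size {
		end := start + size
		if end > len(tests) {
			end = len(tests)
		}
		chunks = append(chunks, tests[start:end])
	}
	return chunks
}

func main() {
	tests := []string{"TestFoo", "TestBar", "TestBaz", "TestQux", "TestQuux"}
	for i, c := range chunkTests(tests, 2) {
		fmt.Printf("container %d runs %v\n", i+1, c)
	}
}
```

With five tests and two containers, the first container gets three tests and the second gets two.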


Shuffling

• Makespans of the chunks are not uniform

• So we shuffle the chunks in the "hope" of a better schedule

• No guarantee of an optimal schedule

[Diagram: Test1 to Test4 before and after shuffling. Before the shuffle, the makespans are nonuniform and one chunk has the longest makespan; after the shuffle, the longest makespan decreases]
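The effect of shuffling can be sketched in Go as follows. The test names, durations, seed, and the helpers `makespan` and `shuffled` are all made up for illustration; the sketch assumes tests are cut into contiguous chunks, so slow tests that sit next to each other in source order land in the same chunk unless the order is shuffled first.

```go
package main

import (
	"fmt"
	"math/rand"
)

// testCase pairs a test name with a made-up duration in seconds.
type testCase struct {
	name    string
	seconds int
}

// makespan cuts tests into at most n contiguous chunks and returns the
// total duration of the slowest chunk.
func makespan(tests []testCase, n int) int {
	if n <= 0 || len(tests) == 0 {
		return 0
	}
	size := (len(tests) + n - 1) / n
	longest := 0
	for start := 0; start < len(tests); start += size {
		end := start + size
		if end > len(tests) {
			end = len(tests)
		}
		total := 0
		for _, tc := range tests[start:end] {
			total += tc.seconds
		}
		if total > longest {
			longest = total
		}
	}
	return longest
}

// shuffled returns a shuffled copy of tests; the input is left untouched.
func shuffled(tests []testCase, seed int64) []testCase {
	out := append([]testCase(nil), tests...)
	r := rand.New(rand.NewSource(seed))
	r.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
	return out
}

func main() {
	// Both slow tests are adjacent in source order.
	tests := []testCase{
		{"TestSlowA", 60}, {"TestSlowB", 60}, {"TestFastA", 5}, {"TestFastB", 5},
	}
	fmt.Println("source order makespan:", makespan(tests, 2), "s")
	fmt.Println("shuffled makespan:", makespan(shuffled(tests, 1), 2), "s")
}
```

In source order both slow tests share a chunk, giving a 120 s makespan. A shuffle may separate them (65 s) or may not (still 120 s), which is why the result is only a "hope".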


Implementation

• We use Funker (github.com/bfirsh/funker)

    • FaaS-like architecture

    • Loads are automatically balanced via Docker's built-in LB

    • No explicit task queue

    • No `docker exec` hack

    • Could easily be ported to other container orchestrators

[Diagram: a client (typically on a laptop) talks to the master in a Docker Swarm-mode cluster (typically on cloud); the master calls worker.1, worker.2, and worker.3 through the built-in LB using Funker]


Experimental result

• Evaluated my hack against the CI of Docker itself

    • Of course, this hack is applicable to the CI of other software as well

• Target: Docker 16.04-dev (git:7fb83eb7)

    • Contains 1,648 test functions

• Machine: Google Compute Engine n1-standard-4 instances (4 vCPUs, 15 GB RAM)

[Chart: total test time, average of 5 runs]

    1 node (traditional testing): 1h22m7s

    10 nodes: 4m10s

    Cost: x10

    Effectiveness: x20


Detailed result

Even with a single node, the result is significantly better than the traditional testing (1h22m7s)

Columns: number of containers running in parallel (chunk size is inversely proportional to this)

    Nodes    10        30        50        70
    1        15m3s     N/A (more than 30m)
    2        12m7s     10m12s    11m25s    13m57s
    5        10m16s    6m18s     5m46s     6m28s
    10       8m26s     4m31s     4m10s     4m20s

Fastest configuration: 10 nodes, 50 containers (4m10s)
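To make the inverse relation between container count and chunk size concrete, the sketch below computes the largest chunk for each column. It assumes the 1,648 test functions are divided as evenly as possible, which the slides do not state explicitly; `chunkSize` is a hypothetical helper.

```go
package main

import "fmt"

// chunkSize returns the largest number of test functions one container
// receives when total tests are divided as evenly as possible (assumption).
func chunkSize(total, containers int) int {
	return (total + containers - 1) / containers // ceiling division
}

func main() {
	const totalTests = 1648 // test functions in the evaluated Docker tree
	for _, n := range []int{10, 30, 50, 70} {
		fmt.Printf("%d containers: up to %d tests per chunk\n", n, chunkSize(totalTests, n))
	}
}
```

Under that assumption the chunks hold up to 165, 55, 33, and 24 tests respectively.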


The code is available!

PR (merged): https://github.com/docker/docker/pull/29775

    $ cd $GOPATH/src/github.com/docker/docker
    $ make build-integration-cli-on-swarm
    $ ./hack/integration-cli-on-swarm/integration-cli-on-swarm \
        -replicas 50 \
        -shuffle \
        -push-worker-image your-docker-registry.example.com/worker


Is it applicable to other software as well?

• Yes, with just a few lines of modification

• Planning to split this hack into an independent generic test tool


Future work

• "Shuffling" does not always result in an optimal schedule

• Future work: learn from the past execution history to optimize the schedule
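One way history-based scheduling could look is the classic longest-processing-time-first (LPT) greedy heuristic, sketched below in Go. This is not the author's planned design: the durations are made up, `scheduleLPT` is a hypothetical helper, and LPT itself only approximates the optimal schedule.

```go
package main

import (
	"fmt"
	"sort"
)

// pastRun records how long a test took in an earlier CI run (made-up data).
type pastRun struct {
	name    string
	seconds int
}

// scheduleLPT assigns tests to n workers, longest test first, always to the
// currently least-loaded worker. It returns the plan and its makespan.
func scheduleLPT(history []pastRun, n int) ([][]string, int) {
	if n <= 0 {
		return nil, 0
	}
	sorted := append([]pastRun(nil), history...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].seconds > sorted[j].seconds
	})
	plan := make([][]string, n)
	load := make([]int, n)
	for _, run := range sorted {
		least := 0
		for w := 1; w < n; w++ {
			if load[w] < load[least] {
				least = w
			}
		}
		plan[least] = append(plan[least], run.name)
		load[least] += run.seconds
	}
	makespan := 0
	for _, l := range load {
		if l > makespan {
			makespan = l
		}
	}
	return plan, makespan
}

func main() {
	history := []pastRun{{"Test1", 60}, {"Test2", 50}, {"Test3", 20}, {"Test4", 10}}
	plan, makespan := scheduleLPT(history, 2)
	fmt.Println(plan, makespan) // prints "[[Test1 Test4] [Test2 Test3]] 70"
}
```

Unlike shuffling, the result is deterministic for a given history, at the cost of having to collect and store per-test durations.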


Recap

• Docker Swarm-mode is effective for parallelizing CI jobs

• Some hacks for optimizing the schedule

    1 node (traditional testing): 1h22m7s

    10 nodes: 4m10s

    Cost: x10

    Effectiveness: x20