OSDC 2014: Test Driven Infrastructure

Test Driven Infrastructure
www.immobilienscout24.de
Berlin | 09.04.2014 | Schlomo Schapiro, Systems Architect, Open Source Evangelist
License: http://creativecommons.org/licenses/by-nc-nd/3.0/
DevOps Risk Mitigation
@schlomoschapiro [email protected]

Upload: schlomo-schapiro

Post on 23-Aug-2014


DESCRIPTION

Common wisdom has it that the test effort should be related to the risk of a change. The reality is different: developers build elaborate automated test chains to test every single commit of their application, while admins regularly “test” changes on the live platform in production. But which change carries a higher risk of taking the live platform down?

What about the software that runs at the “lower levels” of your platform, e.g. systems automation, provisioning, proxy configuration, mail server configuration, database systems etc.? An outage of any of those systems can have a financial impact that is as severe as a bug in the “main” software!

One of the most valuable things an Ops person can learn from a Dev person is Test Driven Development. In my personal experience, the TDD challenge is easy to state and difficult to apply.

This talk throws some light on recent developments at ImmobilienScout24 that help us to develop the core of our infrastructure services with a test driven approach:

- How to do unit tests, integration tests and system tests for infrastructure services?
- How to automatically verify Proxy, DNS, Postfix configurations before deploying them on live servers?
- How to test “dangerous” services like our PXE boot environment or the automated SAN mounting scripts?
- How to add a little bit of test coverage to everything we do.
- Test Driven: First write a failing test and then the code that fixes it.

The tools that we use are Bash, Python, unit test frameworks and TeamCity for build and test automation.

See http://blog.schlomo.schapiro.org/2013/12/test-driven-infrastructure.html for more about this topic.

TRANSCRIPT

Page 1: OSDC 2014 Test Driven Infrastructure

Test Driven Infrastructure
www.immobilienscout24.de

Berlin | 09.04.2014 | Schlomo Schapiro, Systems Architect, Open Source Evangelist

License: http://creativecommons.org/licenses/by-nc-nd/3.0/

DevOps Risk Mitigation

@schlomoschapiro [email protected]

Page 2: OSDC 2014 Test Driven Infrastructure

Slide 2 | Test Driven Infrastructure | Schlomo Schapiro

www.ImmobilienScout24.de

>2 billion page impressions (PI) per month

2 data centers with ~1600 VMs

~2.5 million outgoing emails/day

total of ~600 employees

~30 cross-functional IT teams

~160 in IT

15 years in business

part of Deutsche Telekom

Page 3: OSDC 2014 Test Driven Infrastructure

Costs Of Finding Bugs In Production

[Figure: Costs Of Finding Bugs In Production; labels: Cheap Fix, Expensive Fix]

Page 4: OSDC 2014 Test Driven Infrastructure

[Figure: two timelines along a TIME axis, labelled DEV and OPS, each showing the phases Plan / Design / Budget, Develop, Test, RUN]

Page 5: OSDC 2014 Test Driven Infrastructure

[Figure: the Plan / Design / Budget, Develop, Test, RUN timelines, annotated with examples of failures:]

● Proxy config
● Database borked
● „Buy Now“ button broken
● MTA drops all mail
● Load Balancer Configuration
● Everything costs only 0 €
● Login possible without password
● NFS not available
● DB Replication stopped
● No ads shown
● Broken CSS / JS
● Tomcat won't start/stop
● Service user not defined
● sudoers invalid

Page 6: OSDC 2014 Test Driven Infrastructure

DevOps: Respect & Learning

DEV learn from OPS to think about:

● Resources (CPU, RAM, Disk)
● Services (Start, Stop, Status)
● Dependencies (Start DB before App)
● Logfiles (Rotate, Remove)
● Disk Space
● Monitoring and Alarming
● ...

OPS learn from DEV to think about:

● Incremental Improvement
● Infrastructure as Code
● Version Control System
● Coding (OO, Functions, Libraries …)
● Code Quality
● Unit & Integration Tests
● Test Automation
● ...

Page 7: OSDC 2014 Test Driven Infrastructure

Untested = Broken

Page 8: OSDC 2014 Test Driven Infrastructure

Unit Tests: Test the smallest possible components in an artificial environment.

System Tests: Test the entire application in a real(istic) environment together with other applications.

Page 9: OSDC 2014 Test Driven Infrastructure

Unit Tests

● Part of build process
● Syntax checks
    ● Scripts
    ● Config Files
    ● Data Files
● Unit tests for functions/libs
● Run program with test data
    ● Check result
    ● Check program behaviour with wrong/broken test data
● Also run on Developer desktop
● Quick feedback (~ seconds)

System Tests

● Install on test server
● Run tests from outside
    ● HTTP calls
    ● Send emails
    ● Try to login
● Run tests from inside
    ● Remote Exec (rsh, ssh …) http://go.schapiro.org/rshpitfall
    ● Service Start, Stop & Status
    ● Modify server to create good & bad test scenarios
● Reboot
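The unit-test steps listed on this slide (run the program with test data, check the result, check the behaviour with wrong or broken test data) could look roughly like the following Python sketch. The program under test is a hypothetical one-liner that prints the "name" field of a JSON document; `PROGRAM`, `run_program` and `ProgramTest` are illustrative names, not something taken from the talk.

```python
import json
import subprocess
import sys
import unittest

# Hypothetical program under test: reads JSON from stdin, prints its "name" field.
PROGRAM = [sys.executable, "-c",
           "import json, sys; print(json.load(sys.stdin)['name'])"]


def run_program(test_data):
    """Run the program with test data on stdin; return (exit code, stdout)."""
    result = subprocess.run(PROGRAM, input=test_data, capture_output=True,
                            text=True, timeout=30)
    return result.returncode, result.stdout.strip()


class ProgramTest(unittest.TestCase):
    def test_good_data_gives_expected_result(self):
        code, output = run_program(json.dumps({"name": "web01"}))
        self.assertEqual((code, output), (0, "web01"))

    def test_broken_data_is_rejected(self):
        # Broken input must lead to a non-zero exit code, not to silent success.
        code, _ = run_program("{ not json")
        self.assertNotEqual(code, 0)


def run_suite():
    """Run both tests and report whether all of them passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ProgramTest)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```

Because such a test only needs the program and some test data, it can run both in the build job and on a developer desktop within seconds.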

Page 10: OSDC 2014 Test Driven Infrastructure

Unit Tests

Page 11: OSDC 2014 Test Driven Infrastructure

%prep
%setup -q

%install
install … %{buildroot}/…
install … %{buildroot}/…

%files
%defattr(-,root,root,-)
/...

Page 12: OSDC 2014 Test Driven Infrastructure

BuildRequires: sudo

%build
set -e
visudo -c -f sudoers

%install
install -m 0440 sudoers -D \
    %{buildroot}/etc/sudoers.d/%{name}

%files
%defattr(-,root,root,-)
/etc/sudoers.d/%{name}

Page 13: OSDC 2014 Test Driven Infrastructure

BuildRequires: PyYAML, pylint

%build
set -e
# syntax checks
bash -n my_script.sh
# Should be valid python code
pylint -E yum-repo-propagate
# should be valid YAML file
python -c "
import yaml
yaml.safe_load(open('config.yaml'))
"
...
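The same idea, rejecting files that do not even parse before they are packaged, can also be written as a small Python helper. The sketch below is an assumption-laden stand-in rather than the talk's tooling: it uses only the standard library, so `ast.parse` replaces `pylint -E` (it checks syntax only) and JSON replaces YAML. All function names are made up for illustration.

```python
import ast
import json


def python_syntax_error(source, filename="<script>"):
    """Return None if source parses as Python, else a short error message."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as error:
        return "%s:%s: %s" % (filename, error.lineno, error.msg)
    return None


def json_syntax_error(text, filename="<data>"):
    """Return None if text is valid JSON, else a short error message."""
    try:
        json.loads(text)
    except ValueError as error:  # json.JSONDecodeError is a ValueError
        return "%s: %s" % (filename, error)
    return None


def check_all(files):
    """files maps a file name to its content; returns all error messages."""
    checkers = {".py": python_syntax_error, ".json": json_syntax_error}
    errors = []
    for name, content in sorted(files.items()):
        for suffix, checker in checkers.items():
            if name.endswith(suffix):
                message = checker(content, name)
                if message:
                    errors.append(message)
    return errors
```

A build step would fail whenever `check_all` returns a non-empty list.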

Page 14: OSDC 2014 Test Driven Infrastructure

BuildRequires: python-unittest2, python-teamcity-messages, ...

%build
set +x +e # for teamcity

exec 1>&2 # join stdout to stderr to synchronize between them for teamcity output

FAILED=0
TEAMCITY_PROJECT_NAME=1 python unit_tests.py || let FAILED++
bash -n oldhomeinfo.sh || let FAILED++

(( FAILED == 0 )) || exit 1
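The pattern in this build step is to run every check even if an earlier one failed, count the failures, and fail the build only at the end so that the build server reports all problems at once. A minimal Python sketch of that pattern follows; `run_checks`, `exit_code` and the two stand-in commands are hypothetical and not part of the original spec file.

```python
import subprocess
import sys


def run_checks(commands):
    """Run every command, even after a failure; return those that failed."""
    failed = []
    for command in commands:
        result = subprocess.run(command, capture_output=True, text=True)
        # Forward all child output to stderr so it stays in one ordered stream.
        sys.stderr.write(result.stdout + result.stderr)
        if result.returncode != 0:
            failed.append(command)
    return failed


def exit_code(commands):
    """0 if all checks passed, 1 otherwise (what the build step would exit with)."""
    failed = run_checks(commands)
    for command in failed:
        sys.stderr.write("FAILED: %s\n" % " ".join(command))
    return 1 if failed else 0


# Stand-ins for real checks such as a unit test run or a shell syntax check.
PASSING = [sys.executable, "-c", "print('ok')"]
FAILING = [sys.executable, "-c", "raise SystemExit(3)"]
```

Calling `sys.exit(exit_code([...]))` at the end of a script would mirror the final `(( FAILED == 0 )) || exit 1` line.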

Page 15: OSDC 2014 Test Driven Infrastructure

More Examples for Unit Tests

● Patching nsswitch.conf and PAM files
● Syntax checking HTTPD, DNS, DHCP ... configuration files
● Checking SSH Server & Client configurations http://go.schapiro.org/sshconfigtest
● ...
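To illustrate the first example, patching nsswitch.conf becomes unit-testable once the patch logic is a pure function from old file content to new file content. The sketch below is a guess at how that could look, not the code used at ImmobilienScout24; `add_nss_source` is a hypothetical helper, and inline comments on database lines are not handled.

```python
def add_nss_source(conf_text, database, source):
    """Append `source` to the `database:` line of nsswitch.conf content.

    Lines for other databases, comments and blank lines stay untouched,
    and a source that is already listed is not added a second time.
    """
    patched = []
    for line in conf_text.splitlines():
        name, colon, sources = line.partition(":")
        if colon and name.strip() == database and source not in sources.split():
            line = "%s:%s %s" % (name, sources.rstrip(), source)
        patched.append(line)
    return "\n".join(patched) + "\n"


ORIGINAL = "# test data\npasswd: files\ngroup:  files\nhosts: files dns\n"
EXPECTED = "# test data\npasswd: files db\ngroup:  files\nhosts: files dns\n"
```

A unit test then compares `add_nss_source(ORIGINAL, "passwd", "db")` with `EXPECTED` and also checks that applying the patch twice changes nothing further.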

Page 16: OSDC 2014 Test Driven Infrastructure

System Tests

http://impreza-gt-club.ch/V2.0/Tests/WRX08/Koch2.jpg

Page 17: OSDC 2014 Test Driven Infrastructure

[Figure: build automation pipeline. Labels: Build Automation; Source Code (SVN); Monitor Changes; Build Server; Run Build Job (Check out source, Run Unit Tests, Create RPM); Source Code; DEV YUM Repo; Upload; PRO YUM Repo; Propagate RPM; Test Server (Deploy and Run Test Job, via yum); Prod Server (Deploy to PROD, via yum)]

Page 18: OSDC 2014 Test Driven Infrastructure

[Figure: same build automation pipeline as on slide 17]

Page 19: OSDC 2014 Test Driven Infrastructure

SAN mount service

Test via rsh

Mock SAN devices with losetup

service start, stop mounts/umounts

Error handling
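Mocking a SAN device with losetup means backing a loop device with an ordinary file, so that the mount service sees a block device without any real storage behind it. The sketch below shows how a test fixture might prepare such a device. The helper names, the file name and the 64 MiB size are assumptions, and attaching the device for real needs root, so the demo only builds the commands.

```python
import os
import subprocess
import tempfile


def create_backing_file(directory, name, size_bytes):
    """Create a sparse file that a loop device can expose as a fake SAN disk."""
    path = os.path.join(directory, name)
    with open(path, "wb") as backing:
        backing.truncate(size_bytes)  # sparse: uses almost no real disk space
    return path


def attach_command(backing_file):
    """losetup call that attaches the file to the next free loop device."""
    return ["losetup", "--find", "--show", backing_file]


def detach_command(loop_device):
    """losetup call that releases the loop device again."""
    return ["losetup", "--detach", loop_device]


def attach(backing_file):
    """Attach for real (needs root); returns the device path printed by losetup."""
    result = subprocess.run(attach_command(backing_file), check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()


def demo():
    """Build the commands for one fake 64 MiB LUN without running them."""
    directory = tempfile.mkdtemp(prefix="fake-san-")
    backing = create_backing_file(directory, "lun0.img", 64 * 1024 * 1024)
    return backing, attach_command(backing)
```

On the test server, the test would attach the device, start and stop the service, check that the mounts appear and disappear, and detach the device afterwards.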

Page 20: OSDC 2014 Test Driven Infrastructure

HTTP Proxy Configuration

X-Forwarded-For header spoofs source

Check result for ERR_ACCESS_DENIED

Run tests for all function groups!

[Figure: Test Server sending two requests through the proxy]

GET http://external.com/ with X-Forwarded-For: 10.11.12.01 → 502 Bad Gateway ✔

GET http://external.com/ with X-Forwarded-For: 10.34.56.01 → 403 Forbidden, ERR_ACCESS_DENIED by proxy server ✘
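A test of this kind can be sketched in Python as follows. To keep the example self-contained, a tiny fake proxy stands in for the real one: it treats 10.11.* as an allowed function group and answers 502 for it (access rule passed, no upstream reachable from the test server) and 403 with ERR_ACCESS_DENIED for everything else. That rule, the reading of 502 as "allowed", and the names `FakeProxy`, `proxy_denies` and `run_demo` are assumptions made for illustration.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeProxy(BaseHTTPRequestHandler):
    """Stand-in for the real proxy: only 10.11.* sources may go out."""

    def do_GET(self):
        source = self.headers.get("X-Forwarded-For", "")
        if source.startswith("10.11."):
            status, body = 502, b"Bad Gateway"  # allowed, but no upstream here
        else:
            status, body = 403, b"ERR_ACCESS_DENIED by proxy server"
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the test output quiet
        pass


def proxy_denies(proxy_url, target_url, spoofed_source):
    """Return True if the proxy answers ERR_ACCESS_DENIED for this source."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url}))
    request = urllib.request.Request(
        target_url, headers={"X-Forwarded-For": spoofed_source})
    try:
        with opener.open(request, timeout=5) as response:
            body = response.read()
    except urllib.error.HTTPError as error:  # raised for 403 and 502 answers
        body = error.read()
    return b"ERR_ACCESS_DENIED" in body


def run_demo():
    """Start the fake proxy and probe it with both spoofed sources."""
    server = HTTPServer(("127.0.0.1", 0), FakeProxy)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    proxy_url = "http://127.0.0.1:%d" % server.server_port
    try:
        return (proxy_denies(proxy_url, "http://external.com/", "10.11.12.01"),
                proxy_denies(proxy_url, "http://external.com/", "10.34.56.01"))
    finally:
        server.shutdown()
        server.server_close()
```

Against a real proxy on a test server, only `proxy_denies` would be needed, called once per function group with a source address from that group.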

Page 21: OSDC 2014 Test Driven Infrastructure

Subversion Server Configuration

Tests Config RPM Maker
https://github.com/yadt/yadt-config-rpm-maker

Tests 2 servers: Master & Slave

Replication

Failure and Recovery

Backup and Restore

[Figure: Master SVN and Slave SVN connected via svnsync, each with a Backup]

Page 22: OSDC 2014 Test Driven Infrastructure


PAM & NSS Configuration

PAM & nsswitch.conf patching

Mock setup via rsh

Mock AD groups and users with nss_db

Check service status

Test login via ssh

Thorben Wengert / pixelio.de

Page 23: OSDC 2014 Test Driven Infrastructure


VM Provisioning & Kickstart Installation

Test via HTTP API

Create broken VMs and check error reports

Create valid VM and install Linux OS

Scrape VM screen via OCR

http://github.com/Immobilienscout24/lab-manager-light

Page 24: OSDC 2014 Test Driven Infrastructure

Continuous Live Deployment

[Figure: sequences of numbers (34 35 36; 2 3; 53; 87 to 97) along a TIME axis]

Deploy every application when it is ready.
Automate the delivery chain from source to production.

Page 25: OSDC 2014 Test Driven Infrastructure

Low Risk – Lots of Fun

http://go.schapiro.org/slides

Page 26: OSDC 2014 Test Driven Infrastructure

Contact:
Immobilien Scout GmbH
Andreasstraße 10
10243 Berlin

Phone: +49 30 243 01-1229
Email: [email protected]
www.immobilienscout24.de

Thank you very much!
Please contact me for further questions and discussions.