mincs - containers in the shell script (eng. ver.)

Post on 24-Jan-2017

1

MINCS – containers in the shell script

@mhiramat

github.com/mhiramat/

2

Who

@mhiramat

A Linux kernel hacker, but with few chances to write code these days (><)o

Maintainer of perf-probe and kprobes

3

At First

This presentation is almost 100% about shell scripts.

It is not about the kernel's C source code.

4

What is a container?

Container == Docker?

There are other OSS implementations!

LXC

runc

OpenVZ

etc…

So what is a container...?

5

What is Docker?

Docker provides many container-related features:

Containerization

Packaging software

Managing layers and their catalog

REST API

etc…

How does it work??

6

Docker is Great, but...

It seems a bit... too BIG

All the features are hidden in one binary

It is hard to know how it works

Remember the Unix philosophy:

Keep It Simple, Stupid

We can do it with existing tools

7

Let's mimic it!

Let's try to make a minimal container:

How to use the namespaces

How to bind the devices

How to change the rootfs with chroot/pivot_root

How to use capabilities, CPUSET, etc.

Let's try to overlay the layers:

Now we have overlayfs!

How to manage layers

8

MINCS

Minimum Container Shell-scripts

https://github.com/mhiramat/mincs

Basic functions:

Use PID/Net/UTS/Mount namespaces

Layering with overlayfs

Capabilities, CPUSET and more

POSIX shell script (not bash script)

This can work with the busybox shell or dash

9

The MINCS

Frontend:

minc

marten

polecat

Backend:

minc-exec

minc-coat

minc-leash

minc-farm

minc-trapper

10

Frontend Scripts

Frontends of MINCS:

minc: run a command in a container

marten: manage layered container images

polecat: make a self-executable containerized command

Frontend == option parsing

Each frontend stores the options in environment variables and calls the backend scripts

The marten/minc-farm pair is an exception

11

minc

The main tool of MINCS

Runs a command in a container

Works like chroot

(Or docker run? :)

Sets up namespaces and a workspace with overlayfs

Does not need container images the way Docker does

Does not need a rootfs directory the way chroot does (we can reuse the current rootfs)

Netns is not enabled by default

    [mhiramat@localhost mincs]$ sudo ./minc ps -ef
    UID        PID  PPID  C STIME TTY          TIME CMD
    root         1     0  0 10:58 ?        00:00:00 ps -ef

12

minc: Usage

Usage: minc [options] [command]

Options:

    -r/--root ROOTDIR   Specify a directory as a rootfs. If omitted, use “/”.
    -t/--temp TEMPDIR   Specify a working directory. If omitted, use a tmpdir by mkdir.
    -k                  Do not remove working directory
    --name UTSNAME      Specify the host name in the container
    --debug             Show the debug log

If the command is omitted, run $SHELL.

13

Dive into the shell script

Let's look into the minc command

Phase 1: Parse the command line and set up env-vars.

Phase 2: Invoke minc-exec, which does the following:

Set up netns and cpumask (if needed)

Move to the new namespaces

Get the correct PID and set up UTSNAME

Set up the rootfs for the container

Bind device files

Unmount the original mounts

Chroot to the new rootfs and set up capabilities

14

minc: command line parsing

Case and while loop:

getopts is not used (not flexible enough)

while { case & shift } loop

Mainly sets environment variables

After the loop:

Call minc-farm to get the image based on its UUID

Register post-processing with the trap command

Call minc-exec
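The "while { case & shift }" loop can be sketched as follows. This is only an illustration of the pattern, not the actual minc source: the function name parse_args and the MINC_* variable names are assumptions, and only the options listed on the usage slide are handled.

```shell
#!/bin/sh
# Hypothetical option parser in the "while { case & shift }" style.
# Variable names are made up for this sketch.
parse_args() {
  MINC_ROOTFS=/ MINC_KEEPDIR=0 MINC_DEBUG=0 MINC_UTSNAME=
  while [ $# -gt 0 ]; do
    case "$1" in
      -r|--root) MINC_ROOTFS=$2; shift 2 ;;   # option with an argument
      -k)        MINC_KEEPDIR=1; shift ;;     # plain flag
      --name)    MINC_UTSNAME=$2; shift 2 ;;
      --debug)   MINC_DEBUG=1; shift ;;
      --)        shift; break ;;
      -*)        echo "unknown option: $1" >&2; return 1 ;;
      *)         break ;;                     # first non-option starts the command
    esac
  done
  # Whatever is left is the command; fall back to $SHELL as the usage text says
  [ $# -gt 0 ] || set -- "${SHELL:-/bin/sh}"
  MINC_COMMAND=$*
  # Export so that a backend script started later can read the settings
  export MINC_ROOTFS MINC_KEEPDIR MINC_DEBUG MINC_UTSNAME MINC_COMMAND
}

parse_args -r /srv/rootfs --name demo -k ps -ef
echo "$MINC_ROOTFS $MINC_UTSNAME $MINC_KEEPDIR $MINC_COMMAND"
# prints: /srv/rootfs demo 1 ps -ef
```

Compared with getopts, this style handles long options such as --root without extra work, which is probably the flexibility the slide refers to.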

15

minc-exec (1): Overview

A self-executing shell script:

unshare needs some other command to execute, so the script calls itself

This is for historical reasons: minc-exec used to be a single script (chns)

The first execution is outside the container:

Set up netns and cpuset

Call unshare to make a container (namespace)

The second execution is inside the container:

The script switches its behavior by checking PID == 1

Hide some things from the program running in the container:

Device files / unused mount points
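The PID == 1 switch might look like the sketch below. It is an assumption-laden illustration: phase_for_pid is a hypothetical helper, and the PID is passed as an argument so the logic can be tried without root or a real namespace. A real script would pass "$$" and actually run unshare.

```shell
#!/bin/sh
# Decide which half of a self-re-executing script should run.
# phase_for_pid is a hypothetical name used only in this sketch.
phase_for_pid() {
  if [ "$1" -eq 1 ]; then
    echo inside     # first process of a new PID namespace
  else
    echo outside    # still on the host side
  fi
}

case "$(phase_for_pid "$$")" in
  outside) echo "would set up netns/cpuset, then run: unshare -iumpf $0 ..." ;;
  inside)  echo "would set up the rootfs and run the command" ;;
esac
```

The check works because unshare with -p and -f forks a child that becomes PID 1 of the new PID namespace, so the same file can serve as both the launcher and the in-container setup script.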

16

minc-exec (2): netns/cpuset

netns:

Use “ip netns” to create a new network namespace if needed

Use the trap command to remove it when the shell exits

Just create a virtual Ethernet pair across the namespaces

No IP address is assigned

We can use “pipework” for more networking options

CPUSET:

Just set a CPU affinity bitmask with taskset

cgroups are not used yet

17

Intermission: Trap command

Trap is great :)

We can handle signals, interrupts and exit

It can call shell script functions

minc usually uses trap to:

Remove temporary files and the PID file

Show information messages on exit

Suppress ^C
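A minimal sketch of these uses of trap, assuming mktemp is available. The helper names with_tmpdir and job are made up for the example. The work runs in a subshell so that the EXIT trap fires as soon as the job is done.

```shell
#!/bin/sh
# Run a command with a scratch directory that is removed on exit.
# with_tmpdir and job are hypothetical names for this sketch.
with_tmpdir() {
  (
    tmpdir=$(mktemp -d) || exit 1
    # EXIT fires when this subshell ends; report on stderr, then clean up
    trap 'rm -rf "$tmpdir"; echo "removed $tmpdir" >&2' EXIT
    # An empty action makes the wrapper ignore ^C
    trap '' INT
    "$@" "$tmpdir"
    exit $?
  )
}

# Example job: create a file in the scratch directory and report its path
job() { touch "$1/scratch" && echo "$1"; }

used=$(with_tmpdir job)
[ -e "$used" ] || echo "temporary directory is gone"
```

The trap action is single-quoted, so "$tmpdir" is expanded when the trap runs rather than when it is registered.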

18

minc-exec (3): Change namespace

Use unshare to change namespaces:

Run unshare, passing $0 as the command

The PID, mount, IPC and UTS namespaces are unshared:

    unshare -iumpf $0 "$@"

For the netns, we use ip netns exec:

    ip netns exec $MINC_NETNS unshare -iumpf $0 "$@"

19

minc-exec (4): Set up PID and utsname

Get the original PID (the PID in the parent namespace):

The PID outside the container is useful for sending signals

Since the unshare command forks, we already know the PID inside the container.

Even after the mount namespace is separated, /proc stays the same until it is remounted.

This means /proc/self still shows us as the parent namespace sees us.

Set up utsname:

Use the hostname command to set the utsname
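The /proc trick can be sketched like this. It is an illustration under assumptions, not necessarily how minc-exec does it: the variable name outer_pid is made up, and the code relies on Linux's /proc/self/stat, whose first field is the PID of the reading process as seen from the PID namespace in which that procfs was mounted.

```shell
#!/bin/sh
# Read our own PID as procfs reports it. Before /proc is remounted inside a
# new PID namespace this is the PID seen by the parent namespace, while $$
# is already 1 in the first process of the container.
if [ -r /proc/self/stat ]; then
  # The redirection is opened by the shell itself, so "self" is this shell;
  # field 1 of the stat line is the PID.
  read -r outer_pid _rest < /proc/self/stat
else
  outer_pid=$$      # fallback for systems without a Linux-style /proc
fi
echo "PID from /proc: $outer_pid, PID from the shell: $$"
```

Run on the host, both numbers normally match. The difference only shows up between unshare and the remount of /proc.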

20

minc-exec (5): Mount namespace

Set up the mount namespace:

In some environments (with systemd?), mount information propagates to other namespaces

    mount --make-rprivate /

This stops all mount operations from propagating

Overlay the workspace via minc-coat:

The minc-coat backend puts an overlay on the rootfs image.

The rootfs itself is not changed afterwards.

If the rootfs may be changed, use the --direct option

21

minc-coat: Implement overlays

Make root/, storage/ and work/ under the tempdir:

root/: the mountpoint for overlayfs → $RD

storage/: the overlayfs upper directory → $UD

work/: a workdir for overlayfs → $WD

Build a new rootfs via overlayfs:

Not only mount namespaces but also layering is used, for storage isolation

There are some differences depending on the version:

Overlayfs in the upstream kernel:

    mount -t overlay -o upperdir=$UD,lowerdir=$BASEDIR,workdir=$WD overlayfs $RD

Overlayfs in Ubuntu 14.10 (out-of-tree):

    mount -t overlayfs -o upperdir=$UD,lowerdir=$BASEDIR overlayfs $RD
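The directory preparation can be tried without root with a sketch like the one below. It creates the three directories and only prints the upstream-style mount command instead of running it. The function name prepare_overlay is hypothetical, and using / as the base directory is just an example.

```shell
#!/bin/sh
# Create root/, storage/ and work/ under a temp dir and print the matching
# overlay mount command. prepare_overlay is a made-up name for this sketch.
prepare_overlay() {
  basedir=$1
  dir=$2
  RD=$dir/root      # mountpoint of the merged view
  UD=$dir/storage   # upper (writable) layer
  WD=$dir/work      # scratch area required by upstream overlayfs
  mkdir -p "$RD" "$UD" "$WD" || return 1
  echo "mount -t overlay -o upperdir=$UD,lowerdir=$basedir,workdir=$WD overlayfs $RD"
}

tmp=$(mktemp -d)
cmd=$(prepare_overlay / "$tmp")
echo "$cmd"         # run the printed command as root to actually mount
```

All writes made through $RD land in storage/, which is why the base rootfs stays untouched.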

22

minc-exec (6): Special files

Special files and directories:

Make /etc, /dev, /sys and /proc on the new rootfs

Bind mounts under /dev:

Touch dummy files and bind-mount onto them (like a symlink)

/dev/console, /dev/null, /dev/zero, /dev/random, /dev/urandom, /dev/mqueue

(and others, if you need them)

/dev/pts is mounted with newinstance

Mount /proc for the new PID namespace:

The old /proc should be remounted read-only.

Files that must be read-only (/proc/sys etc.) should be bind-mounted from the read-only /proc.

Bind-mount /sys:

This can be skipped or made read-only

23

Intermission: Debug

How to debug it?

To just check the commands, run it with --debug

This option enables “set -x”

If you want to break into it, write “bash” (or any other shell you like) into the script

You can do anything :)

Or write in any command you want to run

MINCS is just a set of shell scripts

You can change it as you want.

24

minc-exec (7): Post-process mountpoints

Remove old mountpoints:

If we keep them, they can still be visible after chroot

Use pivot_root to unmount them

Let's monitor it with “df -h”

At last, call minc-leash to chroot.

25

Before minc:

    Filesystem      Size  Used Avail Use% Mounted on
    devtmpfs        740M     0  740M   0% /dev
    tmpfs           748M     0  748M   0% /dev/shm
    tmpfs           748M  8.5M  740M   2% /run
    tmpfs           748M     0  748M   0% /sys/fs/cgroup
    /dev/sda2        15G  8.6G  6.5G  58% /

26

After setting up the special files:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2        15G  8.6G  6.5G  58% /
    devtmpfs        740M     0  740M   0% /dev
    tmpfs           748M     0  748M   0% /dev/shm
    tmpfs           748M     0  748M   0% /sys/fs/cgroup
    tmpfs           748M  8.5M  740M   2% /run
    overlayfs        15G  8.6G  6.5G  58% /tmp/minc1012-NpuyIA/root
    tmpfs           748M     0  748M   0% /tmp/minc1012-NpuyIA/root/dev
    devtmpfs        740M     0  740M   0% /tmp/minc1012-NpuyIA/root/dev/console
    devtmpfs        740M     0  740M   0% /tmp/minc1012-NpuyIA/root/dev/null
    devtmpfs        740M     0  740M   0% /tmp/minc1012-NpuyIA/root/dev/zero
    devtmpfs        740M     0  740M   0% /tmp/minc1012-NpuyIA/root/dev/random
    devtmpfs        740M     0  740M   0% /tmp/minc1012-NpuyIA/root/dev/urandom

27

After the first pivot_root:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2        15G  8.6G  6.5G  58% /.orig
    devtmpfs        740M     0  740M   0% /.orig/dev
    tmpfs           748M     0  748M   0% /.orig/dev/shm
    tmpfs           748M     0  748M   0% /.orig/sys/fs/cgroup
    tmpfs           748M  8.5M  740M   2% /.orig/run
    overlayfs        15G  8.6G  6.5G  58% /
    tmpfs           748M     0  748M   0% /dev
    devtmpfs        740M     0  740M   0% /dev/console
    devtmpfs        740M     0  740M   0% /dev/null
    devtmpfs        740M     0  740M   0% /dev/zero
    devtmpfs        740M     0  740M   0% /dev/random
    devtmpfs        740M     0  740M   0% /dev/urandom

28

After removing the old procfs, etc.:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sda2        15G  8.6G  6.5G  58% /.orig
    overlayfs        15G  8.6G  6.5G  58% /
    tmpfs           748M     0  748M   0% /dev
    devtmpfs        740M     0  740M   0% /dev/console
    devtmpfs        740M     0  740M   0% /dev/null
    devtmpfs        740M     0  740M   0% /dev/zero
    devtmpfs        740M     0  740M   0% /dev/random
    devtmpfs        740M     0  740M   0% /dev/urandom

29

After the 2nd pivot_root and chroot to the new rootfs:

    Filesystem      Size  Used Avail Use% Mounted on
    overlayfs        15G  8.6G  6.5G  58% /
    tmpfs           748M     0  748M   0% /dev
    devtmpfs        740M     0  740M   0% /dev/console
    devtmpfs        740M     0  740M   0% /dev/null
    devtmpfs        740M     0  740M   0% /dev/zero
    devtmpfs        740M     0  740M   0% /dev/random
    devtmpfs        740M     0  740M   0% /dev/urandom

30

minc-leash: capabilities and chroot

Leash() = “Least capabilities shell”

Limits capabilities and does the chroot by using capsh (libcap)

Changes UID/GID too

If the capabilities setting is skipped, it just does chroot

Wash() = “Wash out the environment variables”

MINCS uses environment variables internally, so clean them up

Unset all the variables whose names start with MINC_
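The wash step could be written roughly as below. This is a sketch, not the actual minc-leash code: it assumes the internal variables are exported (so that env lists them), and the example variable names are made up.

```shell
#!/bin/sh
# Unset every exported variable whose name starts with MINC_.
wash() {
  # env prints NAME=value lines; keep only the names with the MINC_ prefix
  for name in $(env | sed -n 's/^\(MINC_[A-Za-z0-9_]*\)=.*/\1/p'); do
    unset "$name"
  done
}

export MINC_DEBUG=1 MINC_ROOTFS=/srv/rootfs   # hypothetical internal settings
export KEEP_ME=yes                            # unrelated variable, must survive
wash
echo "MINC_DEBUG=${MINC_DEBUG-<unset>} KEEP_ME=$KEEP_ME"
# prints: MINC_DEBUG=<unset> KEEP_ME=yes
```

Doing this right before the final chroot keeps the containerized program from seeing how the container was configured.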

31

Use cases of MINCS

Good learning material for containers:

If you hit a limitation in Docker, you can try it here and understand it.

Prototyping new features

Containers for embedded devices:

Is it wrong to want to run applications in containers on embedded devices? :)

Docker (>14MB, docker only) vs MINCS+Busybox (<4MB, including shell and tools) → Boot2MINC

32

Boot2minc

Minimal ISO image + MINCS

https://github.com/mhiramat/boot2minc

Forked from Minimal Linux Live (https://github.com/ivandavidov/minimal)

Including:

Linux kernel

Busybox (+unshare patch)

MINCS

8MB image including the kernel (can run on Qemu-kvm)

The size can be reduced further if we optimize the configuration

33

marten: Manage container images

minc provides only the container feature:

Should we prepare the rootfs via debootstrap?

How do we get the rootfs of Fedora/CentOS etc.?

We want to reuse the result of a previous container easily

An overlayfs-based container image manager:

Identifies container images by Docker-like UUIDs

Tracks the dependencies between images

Imports images made by docker export/save

34

Demonstration

Minc

Marten

Boot2minc

35

TODO

minc:

Work with pipework

Correct TTY support via tmux/screen

Use cgroups to limit cpu/memory/io usage (minc-cage?)

Plugin support for btrfs and dm-thin

marten:

Container execution command (like docker run)

Support OCI-compatible container export/import and signing

36

Known Issues

Testcases:

Well, we can write them in shell script too :)

capsh:

capsh only accepts an “sh -c” type command

It doesn't accept escape characters…

37

Conclusion

What I'd like to say is

“We can run a container by combining commands”

Docker etc. are not special; we already have the fundamental tools.

And

“Shell script is great!”

38

END

Thank you very much! :)

https://github.com/mhiramat/mincs

39

Example: Import image from Docker

    # docker save centos | gzip - > centos.tar.gz
    # marten import centos.tar.gz
    Importing image: centos
    511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158
    5b12ef8fd57065237a6833039acc0e7f68e363c15d8abb5cacce7143a1f7de8a
    8efe422e6104930bd0975c199faa15da985b6694513d2e873aa2da9ee402174c
    # marten images
    ID              SIZE    NAME
    511136ea3c5a    4.0K    (noname)
    5b12ef8fd570    4.0K    (noname)
    8efe422e6104    224M    centos
    # minc -r centos /bin/bash