Container Creation Using Namespaces and Bash
A few weeks ago, I saw a great video explaining how Docker works under the hood (see video below). The video ends with a demo where Jérôme Petazzoni creates a container using nothing but bash. I found many of the commands that he used pretty cryptic, so I decided to explain what he did and the purpose of each command.
Video
(The demo starts around minute 41)
Terminology
Before we dive into the demo, let’s get some terminology out of the way.
Container
A container is a combination of a few technologies including namespaces, cgroups, and capabilities. In this post, we’re going to focus on namespaces.
Namespaces
From Namespaces in operation:
The purpose of each namespace is to wrap a particular global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource.
Namespaces in operation describes six types of namespaces in Linux (newer kernels have added more, such as the cgroup namespace):
- Pid: Isolates process identifiers. For example, two processes in different namespaces can have the same PID.
- User: Isolates user ids and group ids. Two users in two different user namespaces can have the same user ids. This is pretty useful because it allows mapping an unprivileged user id outside of the namespace to be root inside of the namespace.
- Net: This namespace provides network isolation. Processes running in a separate net namespace don’t see the network interfaces of other namespaces.
- Mnt: The mount namespace allows containers to have their own mount points without polluting the global namespace. It also provides a way to hide the global mount points from the container.
- Uts: This allows a container to have its own hostname for the processes running in the container.
- Ipc: Gives containers their own inter-process communication namespace.
Note: There is a lot more to namespaces than this. If you want to learn more, take a look at Namespaces in operation.
Btrfs
From the Btrfs wiki
Btrfs is a modern copy on write (CoW) filesystem for Linux aimed at implementing advanced features while also focusing on fault tolerance, repair, and easy administration.
Setup
System
For this demo, I used a computer running Ubuntu 16.04. To be more specific:
nmesa@desktop-nicolas:~/demos/containers$ uname -a
Linux desktop-nicolas 4.13.0-39-generic #44~16.04.1-Ubuntu SMP Thu Apr 5 16:43:10 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
We also need docker for two things:
- To extract the alpine docker image.
- To use the bridge provided by docker for networking.
This is the version of docker that I had installed:
root@desktop-nicolas:/btrfs# docker version
Client:
Version: 18.06.0-ce
API version: 1.38
Go version: go1.10.3
Git commit: 0ffa825
Built: Wed Jul 18 19:11:02 2018
OS/Arch: linux/amd64
Experimental: false
Server:
Engine:
Version: 18.06.0-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: 0ffa825
Built: Wed Jul 18 19:09:05 2018
OS/Arch: linux/amd64
Experimental: false
System setup
Before we start to follow along with the video, we need to set up our environment so that it matches the state of Jérôme’s computer. We need to have a btrfs filesystem mounted at /btrfs.
Create the disk image
We start by creating an empty disk image (from the btrfs wiki, the image should be at least 1GB in size).
nmesa@desktop-nicolas:~/demos/containers$ dd if=/dev/zero of=disk.img bs=512 count=4194304
4194304+0 records in
4194304+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 4.60753 s, 466 MB/s
nmesa@desktop-nicolas:~/demos/containers$ ls -alh disk.img
-rw-rw-r-- 1 nmesa nmesa 2.0G Aug 23 22:06 disk.img
We use the dd command to create a 2GB empty image. The dd command receives four arguments:
- if: The file/device to use as the input. In this case, /dev/zero, which outputs a bunch of zeroes.
- of: The output file/device. In this case, disk.img.
- bs: This is the block size in bytes (512 bytes in this case).
- count: The number of bs blocks to read from /dev/zero and write to disk.img. In this case, 4194304 (2 * 1024 * 1024 * 1024 / 512) to make the image 2GB.
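The count arithmetic can be checked directly in the shell. The sketch below is only an illustration: blocks_for is a hypothetical helper name that is not part of the demo, and it assumes the image size is an exact multiple of the block size.

```shell
# Hypothetical helper (not from the video): how many blocks of a given
# size are needed for an image of a given size, both in bytes.
blocks_for() {
    echo $(( $1 / $2 ))
}

image_bytes=$(( 2 * 1024 * 1024 * 1024 ))   # 2GB, as used in the demo
count=$(blocks_for "$image_bytes" 512)

# Print the dd invocation instead of running it.
echo "dd if=/dev/zero of=disk.img bs=512 count=$count"
```

A larger block size with a proportionally smaller count (for example bs=1048576 count=2048) describes an image of the same size.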
Format disk image
Let’s format our new image with the btrfs filesystem by using the mkfs.btrfs command.
nmesa@desktop-nicolas:~/demos/containers$ mkfs.btrfs disk.img
btrfs-progs v4.4
See http://btrfs.wiki.kernel.org for more information.
Label: (null)
UUID: 772f0676-ec96-448d-b58c-31d4c10b5c3a
Node size: 16384
Sector size: 4096
Filesystem size: 2.00GiB
Block group profiles:
Data: single 8.00MiB
Metadata: DUP 110.38MiB
System: DUP 12.00MiB
SSD detected: no
Incompat features: extref, skinny-metadata
Number of devices: 1
Devices:
ID SIZE PATH
1 2.00GiB disk.img
Mount the disk image
Great! We have formatted the filesystem! We need to mount the image at /btrfs using the mount command (note that you have to be root).
root@desktop-nicolas:/home/nmesa/demos/containers# mkdir /btrfs
root@desktop-nicolas:/home/nmesa/demos/containers# mount -t btrfs disk.img /btrfs
We are ready to follow along with the video!
Video breakdown
In this section, we follow along with the video, and I describe what each command does. Most commands require that you run them as root.
Make mount private
root@desktop-nicolas:/# mount --make-rprivate /
This command makes every mount point private. Mount and unmount events no longer propagate between mount namespaces, so the mounts that our container makes later are not visible to the host system. The r in --make-rprivate stands for recursive: the change applies to / and to every mount point below it.
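One way to see whether a mount is still shared is to look at /proc/self/mountinfo, where shared mounts carry a shared:N tag. The following is a minimal sketch, assuming the standard mountinfo layout; propagation_of is a hypothetical helper, the two sample lines are made up for illustration, and the helper only reports whether the shared tag is present.

```shell
# Hypothetical helper: classify one line of /proc/self/mountinfo.
# Fields 1-6 are fixed; optional tags such as "shared:N" follow and are
# terminated by a lone "-". Field 5 is the mount point.
propagation_of() {
    printf '%s\n' "$1" | awk '{
        state = "not-shared"
        for (i = 7; i <= NF && $i != "-"; i++)
            if ($i ~ /^shared:/) state = "shared"
        print $5, state
    }'
}

# Illustrative sample lines (made up for this example).
propagation_of '25 1 8:2 / / rw,relatime shared:1 - ext4 /dev/sda2 rw'
propagation_of '25 1 8:2 / / rw,relatime - ext4 /dev/sda2 rw'

# On a Linux system, inspect the real root mount.
if [ -r /proc/self/mountinfo ]; then
    while read -r line; do
        propagation_of "$line"
    done < /proc/self/mountinfo | grep '^/ ' || true
fi
```

Running the last part before and after mount --make-rprivate / should show the root mount changing from shared to not-shared on systems where it starts out shared.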
Create container image
Let’s create two directories, one to hold the source images and one to hold the container images.
root@desktop-nicolas:/btrfs# mkdir images containers
Then we use the btrfs subvol create command to create a new image called alpine.
root@desktop-nicolas:/btrfs# btrfs subvol create images/alpine
Create subvolume 'images/alpine'
root@desktop-nicolas:/btrfs# ls images/alpine/
root@desktop-nicolas:/btrfs#
The btrfs command is used to control btrfs filesystems. The subvolume subcommand (abbreviated here as subvol) is used to manage btrfs subvolumes. We use the create subcommand to create a subvolume named alpine in the images directory.
We have an image, but it’s empty. We use docker to get the alpine image.
root@desktop-nicolas:/btrfs# CID=$(docker run -d alpine true)
root@desktop-nicolas:/btrfs# echo $CID
9b290758c1734811c85445640c985fed9803121891d6604b4260e985c2647404
root@desktop-nicolas:/btrfs# docker export $CID | tar -C images/alpine/ -xf-
root@desktop-nicolas:/btrfs# ls images/alpine/
bin dev etc home lib media mnt proc root run sbin srv sys tmp usr var
We run docker run -d alpine true and assign its output (the container id) to the CID variable. Here’s a breakdown of the docker command:
- run: Runs a container.
- -d: Detaches the container and prints the container id. Our terminal doesn’t wait for the container to finish its execution.
- alpine: The image that the container should use.
- true: The command to run in the container.
We use the docker export command to export the container’s filesystem as a tar archive. We pipe that archive to the tar command, which extracts it into images/alpine. Then we run ls images/alpine to see the contents of the image.
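The pipe into tar can be tried without docker. In this sketch, a second tar process stands in for docker export (an assumption made only so the example runs anywhere); the directory names and the greeting file are hypothetical.

```shell
# Scratch area with a tiny "source filesystem" and an empty target.
work=$(mktemp -d)
mkdir -p "$work/src/bin" "$work/rootfs"
echo 'hello' > "$work/src/bin/greeting"

# The left side writes a tar archive to stdout, as docker export does.
# The right side reads the archive from stdin ("-f -") and extracts it
# into the directory given with -C.
tar -C "$work/src" -cf - . | tar -C "$work/rootfs" -xf -

ls "$work/rootfs"
```

The -xf- spelling used in the walkthrough is the same as -x -f -: the file to extract is standard input.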
The next step is to take a snapshot of this image and put it in our containers folder. The snapshot’s name is tupperware.
root@desktop-nicolas:/btrfs# btrfs subvol snapshot images/alpine/ containers/tupperware
Create a snapshot of 'images/alpine/' in 'containers/tupperware'
root@desktop-nicolas:/btrfs# ls containers/tupperware/
bin dev etc home lib media mnt proc root run sbin srv sys tmp usr var
We used the snapshot subcommand to create a snapshot of the subvolume alpine with the name tupperware in the containers directory. Snapshots are very efficient. They don’t copy the data from the source subvolume, which makes them space efficient and very fast to create. If a snapshot is modified, the original subvolume won’t be affected. We can see this in the following example:
root@desktop-nicolas:/btrfs# touch containers/tupperware/NICK_WAS_HERE
root@desktop-nicolas:/btrfs# ls containers/tupperware/
bin dev etc home lib media mnt NICK_WAS_HERE proc root run sbin srv sys tmp usr var
root@desktop-nicolas:/btrfs# ls images/alpine/
bin dev etc home lib media mnt proc root run sbin srv sys tmp usr var
In this example, we add a file called NICK_WAS_HERE to the snapshot. Then we prove that it is visible in the snapshot but not in the original alpine subvolume. More information about the btrfs command can be found in this post.
Testing using chroot
Let’s use the chroot command to run sh inside the tupperware snapshot.
root@desktop-nicolas:/btrfs# chroot containers/tupperware/ sh
/ # ls /
NICK_WAS_HERE dev home media proc run srv tmp var
bin etc lib mnt root sbin sys usr
/ # apk
apk-tools 2.10.0, compiled for x86_64.
Installing and removing packages:
add Add PACKAGEs to 'world' and install (or upgrade) them, while ensuring that all dependencies are met
del Remove PACKAGEs from 'world' and uninstall them
System maintenance:
fix Repair package or upgrade it without modifying main dependencies
update Update repository indexes from all remote repositories
upgrade Upgrade currently installed packages to match repositories
cache Download missing PACKAGEs to cache and/or delete unneeded files from cache
Querying information about packages:
info Give detailed information about PACKAGEs or repositories
list List packages by PATTERN and other criteria
dot Generate graphviz graphs
policy Show repository policy for packages
Repository maintenance:
index Create repository index file from FILEs
fetch Download PACKAGEs from global repositories to a local directory
verify Verify package integrity and signature
manifest Show checksums of package contents
Use apk <command> --help for command-specific help.
Use apk --help --verbose for a full command listing.
This apk has coffee making abilities.
/ # exit
root@desktop-nicolas:/btrfs#
chroot is used to change the root directory of a program. In the example above, the /btrfs/containers/tupperware directory becomes the root of the filesystem. We execute apk, which is only available in alpine, and then we exit the container.
Note that we didn’t use namespaces in this “container.” Let’s change that!
Using namespaces
root@desktop-nicolas:/btrfs# unshare --mount --uts --ipc --net --pid --fork bash
root@desktop-nicolas:/btrfs#
We use the unshare command to run a program (bash) in different mount, uts, ipc, net and pid namespaces. We also pass in the --fork flag to create a new process as a child of the unshare command. It may seem like nothing happened from the command output, but we’re using namespaces and are somewhat isolated from the global namespace. Let’s change the hostname to prove that we are in an isolated uts namespace.
root@desktop-nicolas:/btrfs# hostname tupperware
root@desktop-nicolas:/btrfs# exec bash
root@tupperware:/btrfs#
Note that after executing bash, our hostname switched to tupperware. Running hostname from another terminal shows that the global namespace still has the same hostname (desktop-nicolas in this case).
# from a different terminal
nmesa@desktop-nicolas:~/demos/containers$ hostname
desktop-nicolas
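Another way to confirm the isolation is to compare namespace identifiers, which the kernel exposes as symbolic links under /proc/<pid>/ns. This is a small sketch, assuming a Linux /proc; ns_id and ns_inode are hypothetical helper names, and the identifier in the comment is only an example value.

```shell
# Hypothetical helper: print the namespace identifier of a process.
# $1 is the namespace type (uts, pid, net, mnt, ipc, user);
# $2 is a PID and defaults to the current process.
ns_id() {
    readlink "/proc/${2:-self}/ns/$1"
}

# Hypothetical helper: extract the number from a link target such as
# "uts:[4026531838]" (example value).
ns_inode() {
    printf '%s\n' "$1" | sed 's/^[a-z_]*:\[\([0-9]*\)]$/\1/'
}

if [ -L /proc/self/ns/uts ]; then
    echo "uts namespace of this shell: $(ns_id uts)"
fi
```

Running ns_id uts in the unshared shell and in a regular host terminal should print two different identifiers, while two ordinary host terminals print the same one.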
Let’s do some experiments with the pid namespace.
root@tupperware:/btrfs# ps
PID TTY TIME CMD
24506 pts/26 00:00:00 bash
24551 pts/26 00:00:00 unshare
24552 pts/26 00:00:00 bash
24961 pts/26 00:00:00 ps
28780 pts/26 00:00:00 sudo
28781 pts/26 00:00:00 su
28782 pts/26 00:00:00 bash
root@tupperware:/btrfs# pidof unshare
24551
root@tupperware:/btrfs# kill $(pidof unshare)
bash: kill: (24551) - No such process
We see a lot more processes than expected. The PIDs are also those of the main system. What is going on? It turns out that the /proc of the global namespace is still mounted, and the ps command uses /proc to get some of its information. (You can verify this by running strace ps.)
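To make the dependency on /proc concrete, here is a rough sketch of how a tool can discover processes: by listing the numeric directories of whatever proc filesystem is mounted. list_pids is a hypothetical helper, and the fake directory below only mimics the layout of /proc.

```shell
# Hypothetical helper: list the numeric entries of a /proc-style directory,
# which is roughly how tools like ps discover processes.
list_pids() {
    proc_dir=${1:-/proc}
    for entry in "$proc_dir"/*; do
        name=${entry##*/}
        case $name in
            *[!0-9]*) ;;                          # not a PID, e.g. "meminfo"
            *) [ -d "$entry" ] && echo "$name" ;;
        esac
    done
    return 0
}

# A fake /proc that mimics what the container sees once its own proc is mounted.
fake_proc=$(mktemp -d)
mkdir "$fake_proc/1" "$fake_proc/30" "$fake_proc/self"
touch "$fake_proc/meminfo"
list_pids "$fake_proc"
```

Called without an argument, list_pids reads the real /proc, so inside the new pid namespace it keeps reporting host PIDs until a fresh proc filesystem is mounted.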
Mount proc
Let’s mount /proc.
root@tupperware:/btrfs# mount -t proc proc /proc
root@tupperware:/btrfs# ps
PID TTY TIME CMD
1 pts/26 00:00:00 bash
30 pts/26 00:00:00 ps
root@tupperware:/btrfs# umount /proc
root@tupperware:/btrfs#
After we mount proc, ps only shows the processes running in our container with the correct PIDs. We unmount /proc again for now.
We’ve used mount before. Let’s go over what it does.
The mount command is used to mount filesystems and usually has the following syntax:
mount -t <type> <device> <directory>
From the mount man page:
this tells the kernel to attach the filesystem found on <device> (which is of type <type>) at the directory <directory>.
Here’s the argument breakdown for our specific case:
- -t proc: The type of filesystem to use is proc.
- proc: The /proc filesystem is not attached to a device. Any keyword here would work.
  - Note that in the video Jérôme uses none instead of proc for this value. I changed this to be proc because the mount man page discourages the use of none for this (“The customary choice none is less fortunate: the error message ‘none busy’ from umount can be confusing.”).
- /proc: The directory where we want to mount the proc filesystem.
Pivot root
Let’s make /btrfs/containers/tupperware the root of our filesystem by using mount and pivot_root.
root@tupperware:/btrfs# mkdir /btrfs/containers/tupperware/oldroot
root@tupperware:/btrfs# mount --bind /btrfs/containers/tupperware /btrfs
root@tupperware:/btrfs# cd /btrfs/
root@tupperware:/btrfs# ls
bin dev etc home lib media mnt NICK_WAS_HERE oldroot proc root run sbin srv sys tmp usr var
root@tupperware:/btrfs# pivot_root . oldroot
root@tupperware:/btrfs# cd /
root@tupperware:/# ls
NICK_WAS_HERE dev home media oldroot root sbin sys usr
bin etc lib mnt proc run srv tmp var
root@tupperware:/# ls oldroot/
bin cdrom hello lib media root srv usr
boot core home lib32 mnt run sys var
bt dev initrd.img lib64 opt sbin tes vmlinuz
btrfs etc initrd.img.old lost+found proc snap tmp vmlinuz.old
Let’s break this down:
- We create a directory called oldroot where our current root filesystem will be mounted. As Jérôme explained in the video, to be able to run pivot_root successfully, you need to be close to the top of the filesystem (/).
- We use mount --bind to get the container filesystem to /btrfs. (Jérôme splits this into two steps, and I didn’t understand why. I decided to try it this way to see if it would work, and it did.) The mount --bind command is useful to mount a directory somewhere else (in our case we’re mounting /btrfs/containers/tupperware to /btrfs).
- We go to /btrfs and run ls to make sure that our container filesystem is mounted there.
- We execute pivot_root, which switches the current directory (/btrfs) to be the new root and mounts the current root in oldroot.
- We move to the new root and check that it is mounted correctly by running ls.
- We verify that oldroot has the old root mount point.
Note: I didn’t understand why we needed to do a pivot_root instead of doing a bind mount to /. This email by Linus Torvalds himself helped me understand. In a nutshell, / points to the process’ root of the filesystem. By doing a bind mount to root (/), the process would still point to the old version of /.
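The pivot steps above can be collected into one function. This is a sketch under stated assumptions, not a replacement for the walkthrough: run and enter_rootfs are hypothetical names, and DRY_RUN defaults to 1 so that the commands are only printed. Running it for real requires root inside the unshared shell.

```shell
# Assumption: DRY_RUN defaults to 1 so the sketch only prints the commands.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print a command in dry-run mode, run it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical function: $1 is the container filesystem and $2 is the
# directory to bind-mount it on (both paths come from the walkthrough).
enter_rootfs() {
    run mkdir "$1/oldroot"
    run mount --bind "$1" "$2"
    run cd "$2"
    run pivot_root . oldroot
    run cd /
}

enter_rootfs /btrfs/containers/tupperware /btrfs
```

With DRY_RUN=0 the function does not stop on a failed step, so it would make sense to add error handling (for example set -e) before using it for real.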
Fixing the mount points
Let’s take a look at our mount points.
root@tupperware:/# mount -t proc proc /proc
root@tupperware:/# mount | head
/dev/sda2 on /oldroot type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/sda2 on /oldroot type ext4 (rw,relatime,errors=remount-ro,data=ordered)
udev on /oldroot/dev type devtmpfs (rw,nosuid,relatime,size=16404692k,nr_inodes=4101173,mode=755)
devpts on /oldroot/dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /oldroot/dev/shm type tmpfs (rw,nosuid,nodev)
mqueue on /oldroot/dev/mqueue type mqueue (rw,relatime)
hugetlbfs on /oldroot/dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
tmpfs on /oldroot/run type tmpfs (rw,nosuid,noexec,relatime,size=3286976k,mode=755)
tmpfs on /oldroot/run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
cgmfs on /oldroot/run/cgmanager/fs type tmpfs (rw,relatime,size=100k,mode=755)
We mount /proc first since mount relies on it. (You can verify this by running strace mount.)
Note: The mount point list was pretty big, so I piped it to head to show the first 10 lines.
Most of these mounts shouldn’t be part of our container so let’s unmount everything and mount /proc again.
root@tupperware:/# umount -a
umount: can't unmount /oldroot/btrfs: Resource busy
umount: can't unmount /oldroot: Resource busy
umount: can't unmount /oldroot: Resource busy
root@tupperware:/# mount -t proc proc /proc
root@tupperware:/# mount
/dev/sda2 on /oldroot type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/sda2 on /oldroot type ext4 (rw,relatime,errors=remount-ro,data=ordered)
/dev/loop0 on /oldroot/btrfs type btrfs (ro,relatime,space_cache,subvolid=5,subvol=/)
/dev/loop0 on / type btrfs (ro,relatime,space_cache,subvolid=258,subvol=/containers/tupperware)
proc on /proc type proc (rw,relatime)
umount failed for /oldroot because the resource was busy. This is why /oldroot still shows up when we run mount. Let’s fix that.
root@tupperware:/# umount -l /oldroot
root@tupperware:/# mount
/dev/loop0 on / type btrfs (ro,relatime,space_cache,subvolid=258,subvol=/containers/tupperware)
/dev/loop0 on / type btrfs (ro,relatime,space_cache,subvolid=258,subvol=/containers/tupperware)
proc on /proc type proc (rw,relatime)
The -l flag in the umount command stands for lazy. According to the umount man page, this flag will “detach the filesystem from the file hierarchy now, and clean up all references to this filesystem as soon as it is not busy anymore.” As a result, we don’t see any references to /oldroot when we rerun mount.
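Instead of reading the mount list by eye, the check for leftover /oldroot entries can be scripted. The sketch below assumes a mount table in /proc/mounts format (device, mount point, type, options); mounts_under is a hypothetical helper, and the sample table is modelled on the output shown before the lazy unmount.

```shell
# Hypothetical helper: read a mount table in /proc/mounts format on stdin
# and print the mount points at or below the given directory.
mounts_under() {
    awk -v dir="$1" '$2 == dir || index($2, dir "/") == 1 { print $2 }'
}

# Sample table modelled on the state before "umount -l /oldroot".
sample='/dev/sda2 /oldroot ext4 rw,relatime 0 0
/dev/loop0 /oldroot/btrfs btrfs ro,relatime 0 0
/dev/loop0 / btrfs ro,relatime 0 0
proc /proc proc rw,relatime 0 0'

printf '%s\n' "$sample" | mounts_under /oldroot
```

Inside the container, mounts_under /oldroot < /proc/mounts should print nothing once the lazy unmount has taken effect.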
Networking
Let’s verify that we don’t have access to the network.
root@tupperware:/# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
ping: sendto: Network unreachable
root@tupperware:/# ifconfig -a
lo Link encap:Local Loopback
LOOPBACK MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Good! We don’t have network access, and we only have the loopback interface. Let’s fix that.
Note: We run the following commands from a separate terminal (not in our container).
We need the process id of our container (the PID of unshare).
root@desktop-nicolas:/home/nmesa# CPID=$(pidof unshare)
root@desktop-nicolas:/home/nmesa# echo $CPID
24551
We assign the PID to the CPID variable to make it easier to copy/paste commands if you are following along. Let’s create a pair of virtual network interfaces.
root@desktop-nicolas:/home/nmesa# ip link add name h$CPID type veth peer name c$CPID
The ip link add command creates a pair of connected interfaces of type veth (virtual ethernet). One is named h24551 and the other c24551: h stands for host, c stands for container, and 24551 is the PID of our container.
Let’s move the c24551 interface to our container (note this command is also executed from the host).
root@desktop-nicolas:/home/nmesa# ip link set c$CPID netns $CPID
The ip link set command changes attributes of an interface. In this case, the netns argument moves c24551 into the network namespace of the process whose PID is stored in $CPID (24551 in our case).
Let’s get back to our container terminal to verify that it got the new interface.
root@tupperware:/# ifconfig -a
c24551 Link encap:Ethernet HWaddr 96:18:F1:61:12:A2
BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
LOOPBACK MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Great! We have the interface inside our container! Let’s get back to our host terminal and attach the h24551 interface to the docker bridge (docker0).
root@desktop-nicolas:/home/nmesa# ip link set h$CPID master docker0 up
root@desktop-nicolas:/home/nmesa# ifconfig docker0
docker0 Link encap:Ethernet HWaddr 02:42:ac:03:c8:91
inet addr:172.17.0.1 Bcast:172.17.255.255 Mask:255.255.0.0
inet6 addr: fe80::42:acff:fe03:c891/64 Scope:Link
UP BROADCAST MULTICAST MTU:1500 Metric:1
RX packets:77485 errors:0 dropped:0 overruns:0 frame:0
TX packets:77624 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:6199951 (6.1 MB) TX bytes:164681776 (164.6 MB)
We use the ip link set command to attach the interface to the docker0 bridge. We also grab the ip address of our docker0 interface: 172.17.0.1 (write it down as you’ll need this later).
Note: Jérôme didn’t do this last step, and I suspect this is why the demo didn’t work as expected for him.
Let’s go back to our container terminal and configure the network interface.
root@tupperware:/# ip link set lo up
root@tupperware:/# ip link set c24551 name eth0 up
root@tupperware:/# ip addr add 172.17.42.3/16 dev eth0
root@tupperware:/# ip route add default via 172.17.0.1
root@tupperware:/# ifconfig
eth0 Link encap:Ethernet HWaddr 96:18:F1:61:12:A2
inet addr:172.17.42.3 Bcast:0.0.0.0 Mask:255.255.0.0
inet6 addr: fe80::9418:f1ff:fe61:12a2/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:35 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3789 (3.7 KiB) TX bytes:936 (936.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Let’s break this down:
- We bring up the loopback interface.
- We bring up the c24551 interface and rename it to eth0. Note that this command can’t be copied and pasted since your interface probably has a different name.
- We assign the ip address of 172.17.42.3 to our eth0 interface. Note that this ip address has to be within the range that we got back from the docker0 interface (172.17.0.0/16 in our case).
- We set the default gateway of our container to be 172.17.0.1 (this should be the ip address that we got from docker0).
- We run a quick ifconfig to make sure everything is set up correctly.
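The requirement that the container address falls inside the docker0 range can be verified with a little integer arithmetic. This is a minimal sketch, assuming dotted-quad IPv4 addresses without leading zeros; ip_to_int and in_subnet are hypothetical helpers.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed if address $1 lies inside network $2,
# where $2 is written as address/prefix-length.
in_subnet() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

if in_subnet 172.17.42.3 172.17.0.0/16; then
    echo "172.17.42.3 is inside 172.17.0.0/16"
fi
```

An address such as 172.18.0.1 fails the same check, which is why it could not reach the 172.17.0.1 gateway directly.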
Now, the moment of truth. Let’s ping 8.8.8.8.
root@tupperware:/# ping -c 3 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=121 time=11.045 ms
64 bytes from 8.8.8.8: seq=1 ttl=121 time=10.981 ms
64 bytes from 8.8.8.8: seq=2 ttl=121 time=10.508 ms
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 10.508/10.844/11.045 ms
Our container has network connectivity!
Let’s finish the job
As Jérôme noted in the video, we are still running bash inside the container, but the alpine image doesn’t have bash. Up until this point, we’ve been doing the container runtime’s job. To finish the container handoff, we run the following command:
root@tupperware:/# exec chroot / sh
/ #
The exec command is a built-in utility in bash (who knew?). It replaces the shell with the command (chroot in our case) without starting a new process. Once this command is executed, we’re officially in a container. Let’s run a few commands.
/ # apk
apk-tools 2.10.0, compiled for x86_64.
Installing and removing packages:
add Add PACKAGEs to 'world' and install (or upgrade) them, while ensuring that all dependencies are met
del Remove PACKAGEs from 'world' and uninstall them
System maintenance:
fix Repair package or upgrade it without modifying main dependencies
update Update repository indexes from all remote repositories
upgrade Upgrade currently installed packages to match repositories
cache Download missing PACKAGEs to cache and/or delete unneeded files from cache
Querying information about packages:
info Give detailed information about PACKAGEs or repositories
list List packages by PATTERN and other criteria
dot Generate graphviz graphs
policy Show repository policy for packages
Repository maintenance:
index Create repository index file from FILEs
fetch Download PACKAGEs from global repositories to a local directory
verify Verify package integrity and signature
manifest Show checksums of package contents
Use apk <command> --help for command-specific help.
Use apk --help --verbose for a full command listing.
This apk has coffee making abilities.
/ # ls
NICK_WAS_HERE dev home media oldroot root sbin sys usr
bin etc lib mnt proc run srv tmp var
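The claim that exec replaces the shell without starting a new process can be checked by comparing PIDs. The following sketch is independent of the container and runs in any POSIX shell; the variable names are hypothetical.

```shell
# With exec: the outer sh prints its PID, then replaces itself with the
# inner sh, which prints the same PID.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"')

# Without exec: the inner sh runs as a child and gets its own PID.
# The trailing ":" keeps the inner sh from being the last command, which
# some shells would otherwise exec directly as an optimization.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; :')

echo "with exec:    $(echo $with_exec)"
echo "without exec: $(echo $without_exec)"
```

The first line shows the same PID twice, and the second line shows two different PIDs. In the container, this is why sh takes over as PID 1 after exec chroot / sh.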
Cleanup
Here are the cleanup commands.
/ # exit
root@desktop-nicolas:/btrfs# exit
root@desktop-nicolas:/btrfs# cd /
root@desktop-nicolas:/# umount /btrfs
root@desktop-nicolas:/# exit
exit
nmesa@desktop-nicolas:~/demos/containers$ rm -f disk.img
nmesa@desktop-nicolas:~/demos/containers$
Note: I didn’t have to delete my veth interfaces. I’m not sure if they are automatically removed once the network namespace dies.
Conclusion
A lot goes on under the hood to provision containers! This post only covered the namespaces part. There are also cgroups, capabilities, devices, etc. As Jérôme suggests, don’t create your own container runtime; use docker instead.
Links / Further reading
- BTRFS
- Getting started with BTRFS
- BTRFS tutorial
- mount man page
- umount man page
- docker
- Building containers in pure bash and C
- Namespaces in operation