chroot()
/ for the process
chroot()
cd into directory that is then moved outside of chroot dir
chroot() without cd /, which defeats the purpose
chroot()
INADDR_ANY
uts, pid, net, etc. (see later)
(CLONE_FILES)
(CLONE_VM)
(CLONE_SIGHAND)
(CLONE_FS)
uid 0 are actually process capabilities
CAP_CHOWN, make arbitrary changes to file IDs
CAP_KILL, send signals to arbitrary processes
CAP_SYS_ADMIN: “Note: this capability is overloaded” ☺
capabilities(7)
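As a rough illustration of capabilities being per-process bits rather than a property of uid 0, the sketch below decodes an effective-capability mask such as the one on the CapEff line of /proc/self/status. The helper names (decode_caps, effective_caps) are made up for this example, and only a handful of bit numbers from <linux/capability.h> are listed.

```python
# Bit numbers from <linux/capability.h>; only a small subset is listed here.
CAP_BITS = {
    0: "CAP_CHOWN",
    5: "CAP_KILL",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    12: "CAP_NET_ADMIN",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(mask):
    """Return the names of the known capabilities set in mask, in bit order."""
    return [name for bit, name in sorted(CAP_BITS.items()) if mask & (1 << bit)]


def effective_caps(status_path="/proc/self/status"):
    """Read the CapEff mask from a status file; None if it cannot be read."""
    try:
        with open(status_path) as f:
            for line in f:
                if line.startswith("CapEff:"):
                    return int(line.split()[1], 16)
    except OSError:
        pass
    return None


if __name__ == "__main__":
    mask = effective_caps()
    if mask is not None:
        print(hex(mask), decode_caps(mask))
```

Run as an ordinary user this typically prints an empty list. Run as root, or as the mapped root of a user namespace, it should list every capability from the table.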
CONFIG_UTS_NS
CAP_SYS_ADMIN
CONFIG_IPC_NS
CAP_SYS_ADMIN
ipcs|wc -l → 48 entries on my machine
cgroups(7) - resource isolation/monitoring
CAP_SYS_ADMIN
CLONE_NEWNS
chroot() effect: isolates the directory hierarchy.
(MS_RDONLY, MS_NOSUID, MS_NOEXEC) and the “atime” flags become locked and cannot be changed anymore
CAP_SYS_ADMIN
CONFIG_NET_NS, but completed in 2.6.29
CAP_SYS_ADMIN
# ip -o l
1: lo: …
2: eth0: …
# unshare --net
# ip -o l
1: lo: …
#
(veth(4))
# ip link add veth2-left type veth peer veth2-right
# ip link set veth2-right netns ns-right
CONFIG_PID_NS
CAP_SYS_ADMIN
getpid() should never change, a process cannot change PID namespaces (compared to all other namespace types)
setns() only changes the namespace for future children of this process, not for the process itself
getppid() returns 0 in such cases
setns(): downwards, not upwards
SIGKILL
SIGKILL/SIGSTOP
reboot() in this namespace works (and terminates it)
test@debian:~$ id
uid=1001(test) gid=1001(test) groups=1001(test)
test@debian:~$ unshare --user
nobody@debian:~$ id; exit
uid=65534(nobody) gid=65534(nogroup) groups=65534(nogroup)
test@debian:~$ unshare --user --map-root-user
root@debian:~# id
uid=0(root) gid=0(root) groups=0(root)
echo 1 > /proc/sys/kernel/unprivileged_userns_clone
test@debian:~$ unshare --user --map-root-user --mount
root@debian:~# df -h|grep /mnt
root@debian:~# mount -t tmpfs none /mnt/
root@debian:~# df -h|grep /mnt
none 998M 0 998M 0% /mnt
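The unshare(1) call in the session above asks for a new user namespace and a new mount namespace at once. At the system call level this is a bitwise OR of CLONE_NEW* flags. A minimal sketch of that combination follows; the short kind names and the unshare_flags helper are made up for this example, while the numeric values are those from <linux/sched.h>.

```python
# Flag values as defined in <linux/sched.h>.
CLONE_FLAGS = {
    "mnt": 0x00020000,   # CLONE_NEWNS
    "uts": 0x04000000,   # CLONE_NEWUTS
    "ipc": 0x08000000,   # CLONE_NEWIPC
    "user": 0x10000000,  # CLONE_NEWUSER
    "pid": 0x20000000,   # CLONE_NEWPID
    "net": 0x40000000,   # CLONE_NEWNET
}


def unshare_flags(*kinds):
    """Combine namespace kinds into the flag word an unshare(2) call would take."""
    flags = 0
    for kind in kinds:
        if kind not in CLONE_FLAGS:
            raise ValueError(f"unknown namespace kind: {kind!r}")
        flags |= CLONE_FLAGS[kind]
    return flags


if __name__ == "__main__":
    # Roughly what `unshare --user --mount` requests.
    print(hex(unshare_flags("user", "mnt")))  # 0x10020000
```

This only computes the flag word. Performing the call would additionally need a binding for unshare(2), for example through ctypes.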
CAP_SYS_ADMIN in a (non-initial) user namespace is not quite the real thing
(mknod)
(procfs, sysfs, devpts, tmpfs, ramfs, mqueue, bpf)
dmesg to read the kernel logs, if they were originally disallowed
(CLONE_NEWUSER flag) in 2.6.23, semantics changed to current ones in 3.5, and the final bits were added to make it fully usable in 3.8; set CONFIG_USER_NS
bpf mounting appeared in 4.4, cgroup configuration introduced in 4.6, etc.
CAP_SYS_ADMIN in the target namespace
CAP_NET_ADMIN
test@debian:~$ unshare --user --map-root-user
root@debian:~# iptables -L
iptables: Permission denied (you must be root).
root@debian:~# id
uid=0(root) gid=0(root) groups=0(root)
test@debian:~$ unshare --user --map-root-user
root@debian:~# date > /tmp/foo; exit
test@debian:~$ ls -l /tmp/foo
-rw-r--r-- 1 test test 29 Mar 30 15:29 /tmp/foo
test@debian:~$ unshare --user --map-root-user
root@debian:~# su - more-test
su: Authentication failure
test@debian:~$ ls -l /tmp/foo
-rw-r--r-- 1 more-test test 32 Mar 30 15:39 /tmp/foo
test@debian:~$ unshare --user --map-root-user
root@debian:~# ls -l /tmp/foo
-rw-r--r-- 1 nobody root 32 Mar 30 15:39 /tmp/foo
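The ownership shown in the sessions above follows from the ID mapping: with --map-root-user, the single outside ID 1001 is mapped to 0 inside the namespace, and every unmapped ID is displayed as the overflow ID. A small sketch of that translation, using the "inside outside length" line format of uid_map and gid_map, might look like this. The function names are made up for this example, and 65534 is only the default overflow ID.

```python
OVERFLOW_ID = 65534  # default overflow ID, shown as nobody/nogroup


def parse_id_map(text):
    """Parse uid_map/gid_map content: one 'inside outside length' triple per line."""
    return [
        tuple(int(field) for field in line.split())
        for line in text.splitlines()
        if line.strip()
    ]


def to_inside(mappings, outside_id):
    """ID seen inside the namespace for an outside ID; overflow ID if unmapped."""
    for inside, outside, length in mappings:
        if outside <= outside_id < outside + length:
            return inside + (outside_id - outside)
    return OVERFLOW_ID


def to_outside(mappings, inside_id):
    """Outside ID behind an inside ID, or None if the inside ID is unmapped."""
    for inside, outside, length in mappings:
        if inside <= inside_id < inside + length:
            return outside + (inside_id - inside)
    return None


if __name__ == "__main__":
    mapping = parse_id_map("0 1001 1\n")  # as set up for uid 1001 by --map-root-user
    print(to_inside(mapping, 1001))   # file owned by test: appears as root
    print(to_inside(mapping, 1002))   # any other owner: appears as nobody
    print(to_outside(mapping, 0))     # files created as "root" belong to 1001
```

The same translation applies to groups through gid_map, which is why the group test shows up as root in the last listing.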
CAP_SETUID/CAP_SETGID can set arbitrary mappings
/proc/$pid/uid_map (and gid_map), and read user_namespaces(7)
(65534, nobody/nogroup)
(stat, getuid, chown, etc.) as appropriate, both inside and outside of the namespace
setuid/setgid programs work as expected if there is a mapping!
setns(2): switches one or more namespaces:
ls -l /proc/self/ns/
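For readers who want the same information programmatically, here is a minimal sketch that reads the symlinks under /proc/<pid>/ns and extracts the namespace type and inode number from targets of the form "net:[4026531992]". The helper names are made up for this example, and the function simply returns an empty dict where /proc is not available.

```python
import os
import re

_NS_LINK = re.compile(r"^([a-z_]+):\[(\d+)\]$")


def parse_ns_link(target):
    """Split a namespace symlink target such as 'net:[4026531992]'."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def list_namespaces(pid="self"):
    """Map each entry of /proc/<pid>/ns to its namespace inode number."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # no procfs, or no permission to look
    for entry in entries:
        try:
            _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        except (OSError, ValueError):
            continue
        result[entry] = inode
    return result


if __name__ == "__main__":
    for name, inode in list_namespaces().items():
        print(f"{name}: {inode}")
```

Two processes are in the same namespace of a given type exactly when these inode numbers match, which gives a quick way to check what an unshare call changed.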
setns(2) takes as argument a file descriptor to one entry of that directory
unshare(2): unshares parts of the execution context
CLONE_* flags
clone(2) is very much worth reading, to understand how complex process relationships are
ioctl_ns(2) allows discovering some of the relationships between namespaces
unshare(1), newuidmap(1), newgidmap(1), etc.
(CLONE_NEWUSER)
(CLONE_NEWUTS)
(CLONE_NEWNS)
(CLONE_NEWPID)
(CLONE_NEWNET)
namespaces(7), and all the “SEE ALSO” pages