Service Sandboxing using systemd
This is a simple howto for package maintainers to implement sandboxing around services.
systemd service files provide directives to restrict capabilities (systemd.exec(5) § CAPABILITIES), filter system calls using seccomp (systemd.exec(5) § SYSTEM CALL FILTERING), apply cgroups and namespaces, and limit filesystem access (systemd.exec(5) § SANDBOXING). See also systemd.exec(5) § SECURITY, JoinsNamespaceOf= in systemd.unit(5) § [UNIT] SECTION OPTIONS, and ExecStart= in systemd.service(5) § OPTIONS.
Note: Sandboxing helps protect the system, other services, and users' home directories from a compromised service. It often provides no hardening for the service itself.
Check the sandboxing status of all services:
sudo systemd-analyze security
Detailed report on the service you maintain:
sudo systemd-analyze security mydaemon.service --no-pager
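To track progress while tightening a unit, it can help to pull just the overall score out of the detailed report. The following is a minimal sketch, not part of systemd: `exposure_score` is a hypothetical helper, and it assumes the report contains a summary line of the form "Overall exposure level for UNIT: N.N LABEL", which may differ between systemd versions.

```shell
# Hypothetical helper: read a "systemd-analyze security UNIT" report on stdin
# and print only the numeric overall exposure score.
# Assumption: the report has a summary line such as
#   Overall exposure level for mydaemon.service: 9.6 UNSAFE
exposure_score() {
    sed -n 's/.*Overall exposure level for [^:]*: \([0-9.]*\).*/\1/p'
}

# Example use (needs a running systemd, so it is left commented out):
#   systemd-analyze security mydaemon.service --no-pager | exposure_score
```

Running it before and after each change to the unit file gives a quick indication of whether the change reduced the reported exposure.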
Update your mydaemon.service file. This is a working example for a service "myserv" with typical settings (see systemd.directives(7) if you are unsure which man page describes a particular option):
[Service]
PermissionsStartOnly=true

# Unit files do not support trailing comments: a comment must be on its own line.

# Filter directory access
ReadOnlyDirectories=/
# /run/myserv; the innermost subdirectories are removed when the unit is stopped
#RuntimeDirectory=myserv
# /var/lib/<name>
StateDirectory=myserv
# /var/cache/<name>
CacheDirectory=myserv
# /var/log/<name>
LogsDirectory=myserv
# /etc/myserv
ConfigurationDirectory=myserv
# These *Directory settings change behavior with DynamicUser - see docs

# Prevent acquiring new privileges. Warning: breaks execution of SUID binaries
NoNewPrivileges=yes
# Use dedicated /tmp
PrivateTmp=yes
# Hide system users
PrivateUsers=yes
# Prevent modification of the control group file system
ProtectControlGroups=yes
# Hide user homes
ProtectHome=yes
# Prevent access to physical devices in /dev
PrivateDevices=yes
# Prevent loading or reading kernel modules
ProtectKernelModules=yes
# Prevent altering kernel tunables
ProtectKernelTunables=yes
# strict or full, see docs
ProtectSystem=strict

# Further options, disabled here. The text after each option is descriptive only;
# remove it when enabling the option.
#SystemCallFilter=            # Filter system calls, recommended
#   ~@clock ~@cpu-emulation ~@debug ~@module ~@mount ~@obsolete ~@privileged ~@raw-io ~@reboot ~@resources ~@swap
#AmbientCapabilities=         # Service process does not receive ambient capabilities
#CapabilityBoundingSet=       # Restrict capabilities
#   CAP_AUDIT_*                          # Service has audit subsystem access
#   CAP_BLOCK_SUSPEND                    # Service may establish wake locks
#   CAP_(CHOWN|FSETID|SETFCAP)           # Service may change file ownership/access mode/capabilities unrestricted
#   CAP_(DAC_*|FOWNER|IPC_OWNER)         # Service may override UNIX file/IPC permission checks
#   CAP_IPC_LOCK                         # Service may lock memory into RAM
#   CAP_KILL                             # Service may send UNIX signals to arbitrary processes
#   CAP_LEASE                            # Service may create file leases
#   CAP_LINUX_IMMUTABLE                  # Service may mark files immutable
#   CAP_MAC_*                            # Service may adjust SMACK MAC
#   CAP_MKNOD                            # Service may create device nodes
#   CAP_NET_ADMIN                        # Service has network configuration privileges
#   CAP_NET_(BIND_SERVICE|BROADCAST|RAW) # Service has elevated networking privileges
#   CAP_SYS_RAWIO                        # Service has raw I/O access
#   CAP_SET(UID|GID|PCAP)                # Service may change UID/GID identities/capabilities
#   CAP_SYS_ADMIN                        # Service has administrator privileges
#   CAP_SYS_BOOT                         # Service may issue reboot()
#   CAP_SYS_CHROOT                       # Service may issue chroot()
#   CAP_SYSLOG                           # Service has access to kernel logging
#   CAP_SYS_MODULE                       # Service may load kernel modules
#   CAP_SYS_(NICE|RESOURCE)              # Service has privileges to change resource use parameters
#   CAP_SYS_PACCT                        # Service may use acct()
#   CAP_SYS_PTRACE                       # Service has ptrace() debugging abilities
#   CAP_SYS_TIME                         # Service processes may change the system clock
#   CAP_SYS_TTY_CONFIG                   # Service may issue vhangup()
#   CAP_WAKE_ALARM                       # Service may program timers that wake up the system
#Delegate=                    # Service does not maintain its own delegated control group subtree
#DeviceAllow=                 # Service has no device ACL
#IPAddressDeny=               # Service does not define an IP address whitelist
#KeyringMode=                 # Service doesn't share key material with other services
#LockPersonality=             # Service may change ABI personality
#MemoryDenyWriteExecute=      # Service may create writable executable memory mappings
#NotifyAccess=                # Service child processes cannot alter service state
#PrivateMounts=               # Service may install system mounts
#PrivateNetwork=              # Service has access to the host's network
#ProtectHostname=             # Service may change system host/domainname
#ProtectKernelLogs=           # Service may read from or write to the kernel log ring buffer
#RestrictAddressFamilies=~AF_(INET|INET6) ~AF_NETLINK ~AF_PACKET ~AF_UNIX   # Filter socket type
#RestrictNamespaces=          # Filter namespace creation:
#   ~CLONE_NEWCGROUP ~CLONE_NEWIPC ~CLONE_NEWNET ~CLONE_NEWNS ~CLONE_NEWPID ~CLONE_NEWUSER ~CLONE_NEWUTS
#RestrictRealtime=            # Service may acquire realtime scheduling
#RestrictSUIDSGID=            # Service may create SUID/SGID files
#RootDirectory=/RootImage=    # Service runs within the host's root directory
#SupplementaryGroups=         # Service runs as root, option does not matter
#SystemCallArchitectures=     # Service may execute system calls with all ABIs
#UMask=                       # Files created by service are world-readable by default
#User=/DynamicUser=           # Service runs as root user
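While experimenting, sandboxing settings can be tried out in a drop-in file instead of the packaged unit file. The sketch below is only an illustration: `write_hardening_dropin` and the file name `hardening.conf` are hypothetical, the chosen directives are a small subset of the example above, and the second argument exists only so the function can be exercised outside /etc/systemd/system.

```shell
# Hypothetical helper: write a small hardening drop-in for a unit.
#   $1  unit name, e.g. mydaemon.service
#   $2  base directory (assumed default: /etc/systemd/system)
# Prints the path of the drop-in file that was written.
write_hardening_dropin() {
    dropin_dir="${2:-/etc/systemd/system}/$1.d"
    mkdir -p "$dropin_dir" || return 1
    # Comments in unit files must be on their own line, so the settings
    # are written without trailing descriptions.
    cat > "$dropin_dir/hardening.conf" <<'EOF'
[Service]
NoNewPrivileges=yes
PrivateTmp=yes
ProtectHome=yes
ProtectSystem=strict
EOF
    printf '%s\n' "$dropin_dir/hardening.conf"
}
```

After writing the drop-in as root, `sudo systemctl daemon-reload` followed by `sudo systemctl restart mydaemon.service` makes systemd pick it up; rerunning `systemd-analyze security mydaemon.service` then shows the effect. Once the settings are known to work, they can be moved into the unit file shipped by the package.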