Kernel Todo

stephensmalley edited this page Nov 17, 2016 · 75 revisions

This is a list of potential work items that relate to the SELinux kernel component. If you are interested in helping out with one of these items, or have something you would like to add, please contact the SELinux development team using the mailing list below.

SELinux Developer's Mailing List Subscription Page
SELinux Developer's Mailing List Archive

At the moment most of the work items have been taken from an older SELinux todo list which was not well maintained, and as a result many of the items are no longer applicable. We will be going through this older list to capture items that are still relevant, moving them to the "New Items" list below, but this will likely take some time.

## New Items

  • Split the open permission
  • Split the open permission into open_read and open_write so that we can better distinguish them in policy. Presently we rely upon the fact that we already check read and write permissions in addition to open; however, this is not sufficient because we sometimes have to allow read or write permission for a descriptor inherited across execve or received over IPC, but still do not want to allow direct open(2) with those permissions.
  • Fix CAP_DAC_OVERRIDE checking
  • At present, CAP_DAC_OVERRIDE is checked by the kernel first even if only read/search access is requested, and then CAP_DAC_READ_SEARCH is checked if CAP_DAC_OVERRIDE is not allowed. This causes SELinux to audit dac_override denials in many cases where dac_override is not truly required, which leads to overly liberal policy. Also, since capable() does not provide the inode information, dac_override and dac_read_search denials do not provide information about the relevant path unless system call auditing is enabled and a filter is defined. However, the kernel now calls capable_wrt_inode_uidgid() for these checks, so we could pass down the inode to the security hook in those cases and allow auditing of the file with the avc denial itself.
  • Generalize filesystem labeling behavior logic
  • Generalize the current hardcoded tests of specific filesystem type names used to determine (a) whether to support setting per-file security contexts via setxattr on a genfscon-labeled filesystem and (b) whether to initially label the files from policy based on pathname from the root of the filesystem. The former is only safe if the filesystem either implements its own setxattr handler for security labels or the filesystem pins its inodes in memory, as otherwise the label may not be preserved for the lifetime of the file. The latter is only safe if the filesystem does not permit userspace to modify the directory tree (i.e. no .create/.link/.rename methods or filesystem is not mountable by userspace), as otherwise userspace can potentially cause files to move in and out of a given label or to be accessible under different labels depending on which path is first looked up. We currently permit the former for sysfs (implements its own handler that saves/restores the value when the inode is evicted and later re-created from a backing data structure), and for pstore, debugfs, and rootfs (all of which pin their inodes in memory). We currently permit the latter for debugfs, sysfs, and pstore, as the first two do not permit any userspace manipulation of directories and the last only permits unlink, which causes no issues by itself. We need either a way for the kernel to detect which filesystems are safe or a way to specify the whitelists of filesystem type names in the policy.
  • Extend SELinux /proc/pid labeling
  • Extend SELinux /proc/pid labeling to support derived types on specific /proc/pid files based on both the associated task context and the file name, e.g. via name-based type transitions. This would allow applying different restrictions to different /proc/pid files of the same process via SELinux.
  • Mark the LSM hook structure as read only post-init
  • See the Openwall Kernel Hardening archive patchset for details.
  • Allow a domain to have a bounded relationship with multiple domains
  • The bounded domain transition restriction that applies when NNP (no_new_privs) is enabled has increased requests for bounded domain transitions in policy, and the current 1-to-1 bounding relationship is starting to be a limiting factor.
  • Improved coverage in the SELinux testsuite
  • https://github.com/SELinuxProject/selinux-testsuite
  • Among other things, the SELinux testsuite is used as a regression test and having full/improved coverage of the SELinux access controls would be very helpful.
  • Introduce btrfs subvolume label assignment support
  • This blocks using btrfs as the backend storage for Docker when SELinux is enabled.
  • Improve overlayfs/LSM integration
  • This blocks using overlayfs as the backend storage for Docker when SELinux is enabled.
  • Labeling and access controls for namespace operations
  • Support for RFC5570, aka CALIPSO
  • https://tools.ietf.org/html/rfc5570
  • Proper support for SCTP
  • https://en.wikipedia.org/wiki/Stream_Control_Transmission_Protocol
  • RFC patchset from Richard Haines
  • Access controls for generic AF_INET/AF_INET6 traffic and SOCK_RAW sockets
  • Access controls for AF_VSOCK sockets
  • https://lxr.free-electrons.com/source/net/vmw_vsock/af_vsock.c
  • Add netmasks to the SELinux network node cache
  • Add addresses to the SELinux network port object to match the normal triple (addr/proto/port) to improve the connect() and bind() access controls
  • Display bad/deferred file labels in AVC audit records
  • Add the FD number in AVC records generated by flush_unauthorized_files()/file_has_perm() to make life easier for the policy developers
  • Dynamic discovery of initial SIDs
  • Similar to dynamic discovery of classes/perms, map kernel initial SIDs to policy initial SIDs by string name rather than requiring identical index values, handle unknown initial SIDs cleanly (map to unlabeled), and allow future extensibility without causing problems (start regular SIDs at some fixed offset, e.g. 100, or start from the highest legal value and decrement, so that policy reload that changes the number of initial SIDs won't affect them).
  • Improve the scripts/selinux/mdp script to generate a more useful and flexible minimal policy
  • Develop a mechanism to automatically detect new syscalls in RC kernels and determine SELinux coverage
  • APIs for getting and setting security contexts of sockets and IPC objects
  • Ensure that socket context is kept consistent on socket inode and sock structures
  • Increased granularity for Generic Netlink access controls
  • Investigate SELinux security policy for cgroups
  • https://en.wikipedia.org/wiki/Cgroups
  • Improve support for the different network address families with more socket classes
  • Extend SELinux to support distinctions among more (all?) address families by defining new socket security classes in policy and updating the kernel logic to map them correctly. In the kernel, add the classes to security/selinux/include/classmap.h and update security/selinux/hooks.c:socket_type_to_security_class() to map the socket domain to its class. In the policy, add the classes to security_classes and access_vectors and add allow rules as appropriate. Otherwise, many sockets get mapped to the generic socket class and are indistinguishable in policy. Example: bluetooth sockets.
  • Investigate supporting both SOL_IP/IP_PASSEC and SOL_SOCKET/SO_PASSEC on both datagram and stream sockets
  • Remove the SECURITY_SELINUX_POLICYDB_VERSION_MAX Kconfig option
  • This was only necessary for Fedora 3 and 4; it now just causes problems.
  • LSM/SELinux hooks for kdbus
  • This will also require matching userspace and policy changes; Paul is currently working on an updated version of the LSM/SELinux kernel hooks. PM: The kdbus code is currently being reworked and we have little idea what the new version will look like. Pause this effort until we better understand the status of kdbus.
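
To illustrate the "Fix CAP_DAC_OVERRIDE checking" item in the list above, here is a minimal userspace sketch in C of one possible ordering: when only read/search access is requested, consult the narrower CAP_DAC_READ_SEARCH first, so that a dac_override denial is audited only when the broader capability is actually needed. The function name `first_cap_to_check` and the `dac_cap` enum are hypothetical and exist only for this illustration; the MAY_* values mirror the kernel's definitions. This is a sketch of the idea, not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Access mask bits, mirroring the kernel's MAY_* values. */
#define MAY_EXEC  0x1
#define MAY_WRITE 0x2
#define MAY_READ  0x4

/* Hypothetical stand-ins for the two capabilities discussed above. */
enum dac_cap { DAC_READ_SEARCH, DAC_OVERRIDE };

/*
 * Return the capability that would be checked (and therefore audited)
 * first for a request that failed the ordinary DAC mode-bit check.
 */
enum dac_cap first_cap_to_check(int mask, bool is_dir)
{
    /* Only the three basic access bits are modeled in this sketch. */
    assert(mask != 0);
    assert((mask & ~(MAY_EXEC | MAY_WRITE | MAY_READ)) == 0);

    if (is_dir) {
        /* Read and search (exec) on a directory need only the narrow cap. */
        if (!(mask & MAY_WRITE))
            return DAC_READ_SEARCH;
    } else if (mask == MAY_READ) {
        /* Plain read of a non-directory needs only the narrow cap. */
        return DAC_READ_SEARCH;
    }

    /* Anything involving write, or exec on a non-directory. */
    return DAC_OVERRIDE;
}
```

With this ordering, a read of a file or a read/search of a directory would consult `DAC_READ_SEARCH` first, and `DAC_OVERRIDE` would be the first check only for requests that genuinely need it.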

## Carryover Items from the Old List

  • Fix signal inheritance controls (possibly drop some or all, or only enforce in policy for certain domains).

  • Ensure that all filesystems of interest that support security.* xattrs also call security_inode_init_security() to initialize newly allocated inodes with security labels. Otherwise, newly created files will not be assigned a security context automatically. Same should be true of POSIX ACLs.

  • Add a 'map' check on mmap so that we can distinguish memory-mapped access (since it has different implications for revocation). When a file is opened and then read or written via syscalls like read(2)/write(2), we revalidate access on each read/write operation via selinux_file_permission() and therefore can revoke access if the process context, the file context, or the policy changes in such a manner that access is no longer allowed. When a file is opened, memory mapped via mmap(2), and then read or written directly in memory, we presently have no way to revalidate or revoke access. The purpose of a separate map permission check on mmap(2) is to permit policy to prohibit memory mapping of specific files for which we need to ensure that every access is revalidated. This is particularly useful for scenarios where we expect the file to be relabeled at runtime in order to reflect state changes (e.g. cross-domain solution, assured pipeline without data copying).
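
A minimal userspace sketch in C of how such a check might compose with the existing read/write checks is shown here. The `PERM_*` bits, `perms_required_for_mmap()`, and `mmap_permitted()` are hypothetical names invented for this illustration. The sketch assumes that read permission is always required for a mapping and that write permission is required only for shared writable mappings; treat those rules as assumptions of the sketch rather than a statement of the kernel's exact logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/* Hypothetical permission bits for this sketch. */
#define PERM_READ  0x1u
#define PERM_WRITE 0x2u
#define PERM_MAP   0x4u   /* the proposed new permission */

/* Compute the permissions an mmap(2) request would need on the file. */
unsigned int perms_required_for_mmap(int prot, int flags)
{
    /* Assumption: every file mapping needs map and read. */
    unsigned int required = PERM_MAP | PERM_READ;

    /* Assumption: only shared writable mappings can modify the file. */
    if ((flags & MAP_SHARED) && (prot & PROT_WRITE))
        required |= PERM_WRITE;

    return required;
}

/* 'allowed' is the set of permissions policy grants on this file. */
bool mmap_permitted(unsigned int allowed, int prot, int flags)
{
    unsigned int required = perms_required_for_mmap(prot, flags);

    assert((allowed & ~(PERM_READ | PERM_WRITE | PERM_MAP)) == 0);
    return (allowed & required) == required;
}
```

Under this model, a policy that grants read and write but withholds map on a file type would still permit read(2)/write(2), where every access is revalidated, while refusing mmap(2) outright.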

  • Revoke memory-mapped file access upon policy change or setxattr.

PM: How feasible is this?

SDS: revoke(2) seems to be an ever-recurring topic on lkml. However, even with a revoke(2) implementation on regular files, it isn't quite what we want, as it would revoke all open references to the file, whereas ideally we'd like it to revalidate all open references to the file under the new policy or file security context and only revoke the ones which no longer are allowed by policy. However, for cross-domain solutions / assured pipelines without data copying, a simple revoke(2) would be better than nothing; the program would know that it needed to close and re-open its references to the file upon calling revoke(2) and any other process that is trying to access the file would correctly be deprived of its access.

  • Real device labeling and access control (i.e. bind a label to a device in the kernel irrespective of what device node is used to access it so that a process that can create any device nodes at all can't effectively bypass all device access controls just by creating an arbitrary node to any device in a type accessible to it).

PM: Labeling a device seems like it might not be too bad, implementing reasonable access controls seems like it might be a nightmare depending on your security goals. What exactly would we want to control? Just device creation/management?

SDS: The problem today is that any process that can create device nodes (i.e. has mknod capability and create to chr_file or blk_file) is free to create an alias to any device in a type for which it has create permission, and thus can effectively gain read/write to any device. If only udev ever created device nodes, then this would be a relatively minor concern; we would just need to verify that udev always creates the device node with the correct type and then everything else follows. But in practice, many programs create device nodes, and currently all such programs have to be fully trusted not to misuse that power to gain access to arbitrary devices. What we want is a tighter binding or validation between the label used for checking read/write access and the underlying device. That could be implemented as some kind of associate check between the device node security context and some security context associated with the driver or major/minor or some other identifier in the kernel, or as a separate device access control layer. Not sure of the best approach.

  • Crypto policy for domains & object handling.

PM: We need more detail on this item.

  • Expand selinux-testsuite into a full regression testsuite for every permission & class.

  • Update performance testing & profiling, try to characterize SELinux performance overhead and identify performance hotspots for improvement.

PM: It would be nice if we could document or script some sort of relatively simple performance measurement as a first step towards regular performance testing.

  • Better support for filesystems whose labelling behaviour is not specified in policy. If policy specifies nothing, just test for xattr support and use it if it is there (in progress at RH, patch reverted due to fuse deadlocks).

PM: Is this inherently not possible or just something that is problematic with FUSE? If the latter, why can't we simply blacklist FUSE?

SDS: FUSE is used widely in distributions (and Android), and the current lack of support for passing through file security contexts prevents granular control of files backed by FUSE filesystems. Also, the fact that the kernel fuse implementation does not convey the process security context to the FUSE userspace daemon makes it impossible to implement any SELinux userspace access controls in the FUSE userspace daemon based on the calling process, an issue in e.g. the Android sdcard daemon.

  • Better validation of classes/perms on policy reload. Warn if any permissions are defined in a kernel class in the policy that are not defined in the kernel's classmap.

PM: Is this still a problem? I haven't checked lately.

SDS: We warn on classes/perms defined in the kernel classmap that are not defined in the policy being loaded (and handle them in accordance with handle_unknown=allow|deny|reject). But we don't do the reverse. So you can silently load a policy that defines and uses some new kernel class or permission and there won't be any indication that the kernel doesn't know about it. OTOH, we rarely will want to fail in that scenario as it is common for policy to pick up new classes/perms before they show up in the currently used kernel.
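
The reverse check described above could look roughly like the following userspace sketch in C. The `class_def` structure and `warn_unknown_to_kernel()` are hypothetical and are not the kernel's classmap types; the sketch only warns and never fails, in line with the concern that policy often picks up new classes/perms before the running kernel does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified class definition: at most 7 permissions. */
struct class_def {
    const char *name;       /* NULL name terminates an array of classes */
    const char *perms[8];   /* NULL-terminated list of permission names */
};

static const struct class_def *find_class(const struct class_def *map,
                                          const char *name)
{
    for (; map->name; map++)
        if (strcmp(map->name, name) == 0)
            return map;
    return NULL;
}

static int has_perm(const struct class_def *c, const char *perm)
{
    for (const char *const *p = c->perms; *p; p++)
        if (strcmp(*p, perm) == 0)
            return 1;
    return 0;
}

/*
 * Warn about every class or permission that the policy defines but the
 * kernel's class map does not know. Returns the number of warnings.
 */
int warn_unknown_to_kernel(const struct class_def *kernel,
                           const struct class_def *policy)
{
    int warnings = 0;

    assert(kernel && policy);
    for (; policy->name; policy++) {
        const struct class_def *k = find_class(kernel, policy->name);

        if (!k) {
            fprintf(stderr, "class %s in policy is unknown to the kernel\n",
                    policy->name);
            warnings++;
            continue;
        }
        for (const char *const *p = policy->perms; *p; p++) {
            if (!has_perm(k, *p)) {
                fprintf(stderr, "permission %s in class %s is unknown to the kernel\n",
                        *p, policy->name);
                warnings++;
            }
        }
    }
    return warnings;
}
```

Returning a count rather than an error leaves it to the caller to decide whether the warnings are informational only, which matches the point that failing the policy load would rarely be desirable.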