mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
synced 2025-04-19 20:58:31 +09:00

Patch series "mseal system mappings", v9.

As discussed during mseal() upstream process [1], mseal() protects the
VMAs of a given virtual memory range against modifications, such as the
read/write (RW) and no-execute (NX) bits. For complete descriptions of
memory sealing, please see mseal.rst [2].

mseal() is useful to mitigate memory corruption issues where a corrupted
pointer is passed to a memory management system. For example, such an
attacker primitive can break control-flow integrity guarantees since
read-only memory that is supposed to be trusted can become writable or
.text pages can get remapped.

The system mappings are read-only; memory sealing can protect them from
ever becoming writable or being unmapped/remapped with different
attributes.

System mappings such as vdso, vvar, vvar_vclock, vectors (arm
compat-mode), sigpage (arm compat-mode), are created by the kernel during
program initialization, and could be sealed after creation.

Unlike the aforementioned mappings, the uprobe mapping is not established
during program startup. However, its lifetime is the same as the
process's lifetime [3]. It could be sealed from creation.

The vsyscall on x86-64 uses a special address (0xffffffffff600000), which
is outside the mm managed range. This means mprotect, munmap, and mremap
won't work on the vsyscall. Since sealing doesn't enhance the vsyscall's
security, it is skipped in this patch. If we ever seal the vsyscall, it
is probably only for decorative purposes, i.e. showing the 'sl' flag in
/proc/pid/smaps. For this patch, it is ignored.

It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may
alter the system mappings during restore operations. UML (User Mode
Linux), gVisor, and rr are also known to change the vdso/vvar mappings.
Consequently, this feature cannot be universally enabled across all
systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default.

To support mseal of system mappings, architectures must define
CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special
mappings calls to pass the mseal flag. Additionally, architectures must
confirm they do not unmap/remap system mappings during the process
lifetime. The existence of this flag for an architecture implies that it
does not require the remapping of these system mappings during process
lifetime, so sealing these mappings is safe from a kernel perspective.

This version covers the x86-64 and arm64 architectures as a minimum
viable feature.

While no specific CPU hardware features are required to enable this
feature on an architecture, memory sealing requires a 64-bit kernel.
Other architectures can choose whether or not to adopt this feature.
Currently, I'm not aware of any instances in the kernel code that
actively munmap/mremap a system mapping without a request from userspace.
The PPC does call munmap when _install_special_mapping fails for vdso;
however, it's uncertain if this will ever fail for PPC - this needs to be
investigated by PPC in the future [4]. The UML kernel can add this
support when KUnit tests require it [5].

In this version, we've improved the handling of system mapping sealing
compared with previous versions: instead of modifying the
_install_special_mapping function itself, which would affect all
architectures, we now call _install_special_mapping with a sealing flag
only within the specific architecture that requires it. This targeted
approach offers two key advantages: 1) It limits the code change's impact
to the necessary architectures, and 2) It aligns with the software
architecture by keeping the core memory management within the mm layer,
while delegating the decision of sealing system mappings to the
individual architecture, which is particularly relevant since 32-bit
architectures never require sealing.

Prior to this patch series, we explored sealing special mappings from
userspace using glibc's dynamic linker. This approach revealed several
issues:

- The PT_LOAD header may report an incorrect length for vdso (smaller
  than its actual size). The dynamic linker, which relies on PT_LOAD
  information to determine mapping size, would then split and partially
  seal the vdso mapping. Since each architecture has its own vdso/vvar
  code, fixing this in the kernel would require going through each
  architecture. Our initial goal was to enable sealing readonly
  mappings, e.g. .text, across all architectures; sealing vdso from the
  kernel at creation appears to be simpler than sealing vdso in glibc.

- The [vvar] mapping header only contains address information, not
  length information. Similar issues might exist for other special
  mappings.

- Mappings like uprobe are not covered by the dynamic linker, and there
  is no effective solution for them.

This feature's security enhancements will benefit ChromeOS, Android, and
other high security systems.

Testing:
This feature was tested on ChromeOS and Android for both x86-64 and
ARM64.
- Enable sealing and verify vdso/vvar, sigpage, vectors are sealed
  properly, i.e. "sl" shown in the smaps for those mappings, and mremap
  is blocked.
- Pass various automation tests (e.g. pre-checkin) on ChromeOS and
  Android to ensure the sealing doesn't affect the functionality of
  Chromebook and Android phone.

I also tested the feature on Ubuntu on x86-64:
- With config disabled, vdso/vvar is not sealed.
- With config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK;
  normal operations such as browsing the web and opening/editing docs
  are OK.

Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1]
Link: Documentation/userspace-api/mseal.rst [2]
Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3]
Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQSt+HA@mail.gmail.com/ [4]
Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5]

This patch (of 7):

Provide infrastructure to mseal system mappings. Establish two kernel
configs (CONFIG_MSEAL_SYSTEM_MAPPINGS,
ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS) and VM_SEALED_SYSMAP macro for
future patches.

Link: https://lkml.kernel.org/r/20250305021711.3867874-1-jeffxu@google.com
Link: https://lkml.kernel.org/r/20250305021711.3867874-2-jeffxu@google.com
Signed-off-by: Jeff Xu <jeffxu@chromium.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrei Vagin <avagin@gmail.com>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Benjamin Berg <benjamin@sipsolutions.net>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Elliot Hughes <enh@google.com>
Cc: Florian Faineli <f.fainelli@gmail.com>
Cc: Greg Ungerer <gerg@kernel.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Jorge Lucangeli Obes <jorgelo@chromium.org>
Cc: Linus Waleij <linus.walleij@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Rapoport <mike.rapoport@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Pedro Falcato <pedro.falcato@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Stephen Röttger <sroettger@google.com>
Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

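
The testing notes in the commit message verify sealing by looking for the
"sl" flag in smaps. A minimal sketch of that check, assuming the usual
/proc/<pid>/smaps layout (a header line per mapping followed by a
"VmFlags:" line), might look like the following. The function name
sealed_status and the SAMPLE_SMAPS text are hypothetical and only for
illustration; the addresses and flags in the sample are made up, not
captured output.

```python
import re

# Header line of an smaps entry: "start-end perms offset dev inode [name]".
_HEADER = re.compile(
    r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+[0-9a-f]+\s+\S+\s+\d+\s*(?P<name>.*)$"
)

# Illustrative input only (hypothetical addresses and flags).
SAMPLE_SMAPS = """\
7ffc8a5f2000-7ffc8a5f6000 r--p 00000000 00:00 0                          [vvar]
Size:                 16 kB
VmFlags: rd mr pf io de dd sl sd
7ffc8a5f6000-7ffc8a5f8000 r-xp 00000000 00:00 0                          [vdso]
Size:                  8 kB
VmFlags: rd ex mr mw me de sl sd
7ffc8a400000-7ffc8a421000 rw-p 00000000 00:00 0                          [stack]
Size:                132 kB
VmFlags: rd wr mr mw me gd ac
"""


def sealed_status(smaps_text):
    """Map each bracketed mapping name (e.g. "[vdso]") to True if every
    entry with that name carries the 'sl' flag on its VmFlags line."""
    status = {}
    current = None
    for line in smaps_text.splitlines():
        header = _HEADER.match(line)
        if header:
            name = header.group("name").strip()
            # Only track special mappings such as [vdso] or [vvar];
            # file-backed and anonymous mappings are skipped.
            current = name if name.startswith("[") else None
            continue
        if current and line.startswith("VmFlags:"):
            flags = line.split(":", 1)[1].split()
            status[current] = status.get(current, True) and ("sl" in flags)
    return status
```

On a live system the same function could be fed the contents of
/proc/self/smaps; with CONFIG_MSEAL_SYSTEM_MAPPINGS enabled one would
expect the vdso/vvar entries to report True, and False otherwise.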
290 lines
10 KiB
Plaintext
# SPDX-License-Identifier: GPL-2.0-only
#
# Security configuration
#

menu "Security options"

source "security/keys/Kconfig"

config SECURITY_DMESG_RESTRICT
	bool "Restrict unprivileged access to the kernel syslog"
	default n
	help
	  This enforces restrictions on unprivileged users reading the kernel
	  syslog via dmesg(8).

	  If this option is not selected, no restrictions will be enforced
	  unless the dmesg_restrict sysctl is explicitly set to (1).

	  If you are unsure how to answer this question, answer N.

choice
	prompt "Allow /proc/pid/mem access override"
	default PROC_MEM_ALWAYS_FORCE
	help
	  Traditionally /proc/pid/mem allows users to override memory
	  permissions for users like ptrace, assuming they have ptrace
	  capability.

	  This allows people to limit that - either never override, or
	  require actual active ptrace attachment.

	  Defaults to the traditional behavior (for now)

config PROC_MEM_ALWAYS_FORCE
	bool "Traditional /proc/pid/mem behavior"
	help
	  This allows /proc/pid/mem accesses to override memory mapping
	  permissions if you have ptrace access rights.

config PROC_MEM_FORCE_PTRACE
	bool "Require active ptrace() use for access override"
	help
	  This allows /proc/pid/mem accesses to override memory mapping
	  permissions for active ptracers like gdb.

config PROC_MEM_NO_FORCE
	bool "Never"
	help
	  Never override memory mapping permissions

endchoice

config MSEAL_SYSTEM_MAPPINGS
	bool "mseal system mappings"
	depends on 64BIT
	depends on ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS
	depends on !CHECKPOINT_RESTORE
	help
	  Apply mseal on system mappings.
	  The system mappings include vdso, vvar, vvar_vclock,
	  vectors (arm compat-mode), sigpage (arm compat-mode), uprobes.

	  A 64-bit kernel is required for the memory sealing feature.
	  No specific hardware features from the CPU are needed.

	  WARNING: This feature breaks programs which rely on relocating
	  or unmapping system mappings. Known broken software at the time
	  of writing includes CHECKPOINT_RESTORE, UML, gVisor, rr. Therefore
	  this config can't be enabled universally.

	  For complete descriptions of memory sealing, please see
	  Documentation/userspace-api/mseal.rst

config SECURITY
	bool "Enable different security models"
	depends on SYSFS
	depends on MULTIUSER
	help
	  This allows you to choose different security modules to be
	  configured into your kernel.

	  If this option is not selected, the default Linux security
	  model will be used.

	  If you are unsure how to answer this question, answer N.

config HAS_SECURITY_AUDIT
	def_bool y
	depends on AUDIT
	depends on SECURITY

config SECURITYFS
	bool "Enable the securityfs filesystem"
	help
	  This will build the securityfs filesystem. It is currently used by
	  various security modules (AppArmor, IMA, SafeSetID, TOMOYO, TPM).

	  If you are unsure how to answer this question, answer N.

config SECURITY_NETWORK
	bool "Socket and Networking Security Hooks"
	depends on SECURITY
	help
	  This enables the socket and networking security hooks.
	  If enabled, a security module can use these hooks to
	  implement socket and networking access controls.
	  If you are unsure how to answer this question, answer N.

config SECURITY_INFINIBAND
	bool "Infiniband Security Hooks"
	depends on SECURITY && INFINIBAND
	help
	  This enables the Infiniband security hooks.
	  If enabled, a security module can use these hooks to
	  implement Infiniband access controls.
	  If you are unsure how to answer this question, answer N.

config SECURITY_NETWORK_XFRM
	bool "XFRM (IPSec) Networking Security Hooks"
	depends on XFRM && SECURITY_NETWORK
	help
	  This enables the XFRM (IPSec) networking security hooks.
	  If enabled, a security module can use these hooks to
	  implement per-packet access controls based on labels
	  derived from IPSec policy. Non-IPSec communications are
	  designated as unlabelled, and only sockets authorized
	  to communicate unlabelled data can send without using
	  IPSec.
	  If you are unsure how to answer this question, answer N.

config SECURITY_PATH
	bool "Security hooks for pathname based access control"
	depends on SECURITY
	help
	  This enables the security hooks for pathname based access control.
	  If enabled, a security module can use these hooks to
	  implement pathname based access controls.
	  If you are unsure how to answer this question, answer N.

config INTEL_TXT
	bool "Enable Intel(R) Trusted Execution Technology (Intel(R) TXT)"
	depends on HAVE_INTEL_TXT
	help
	  This option enables support for booting the kernel with the
	  Trusted Boot (tboot) module. This will utilize
	  Intel(R) Trusted Execution Technology to perform a measured launch
	  of the kernel. If the system does not support Intel(R) TXT, this
	  will have no effect.

	  Intel TXT will provide higher assurance of system configuration and
	  initial state as well as data reset protection. This is used to
	  create a robust initial kernel measurement and verification, which
	  helps to ensure that kernel security mechanisms are functioning
	  correctly. This level of protection requires a root of trust outside
	  of the kernel itself.

	  Intel TXT also helps solve real end user concerns about having
	  confidence that their hardware is running the VMM or kernel that
	  it was configured with, especially since they may be responsible for
	  providing such assurances to VMs and services running on it.

	  See <https://www.intel.com/technology/security/> for more information
	  about Intel(R) TXT.
	  See <http://tboot.sourceforge.net> for more information about tboot.
	  See Documentation/arch/x86/intel_txt.rst for a description of how to enable
	  Intel TXT support in a kernel boot.

	  If you are unsure as to whether this is required, answer N.

config LSM_MMAP_MIN_ADDR
	int "Low address space for LSM to protect from user allocation"
	depends on SECURITY && SECURITY_SELINUX
	default 32768 if ARM || (ARM64 && COMPAT)
	default 65536
	help
	  This is the portion of low virtual memory which should be protected
	  from userspace allocation. Keeping a user from writing to low pages
	  can help reduce the impact of kernel NULL pointer bugs.

	  For most ia64, ppc64 and x86 users with lots of address space
	  a value of 65536 is reasonable and should cause no problems.
	  On arm and other archs it should not be higher than 32768.
	  Programs which use vm86 functionality or have some need to map
	  this low address space will need the permission specific to the
	  systems running LSM.

config STATIC_USERMODEHELPER
	bool "Force all usermode helper calls through a single binary"
	help
	  By default, the kernel can call many different userspace
	  binary programs through the "usermode helper" kernel
	  interface. Some of these binaries are statically defined
	  either in the kernel code itself, or as a kernel configuration
	  option. However, some of these are dynamically created at
	  runtime, or can be modified after the kernel has started up.
	  To provide an additional layer of security, route all of these
	  calls through a single executable that can not have its name
	  changed.

	  Note, it is up to this single binary to then call the relevant
	  "real" usermode helper binary, based on the first argument
	  passed to it. If desired, this program can filter and pick
	  and choose what real programs are called.

	  If you wish for all usermode helper programs to be
	  disabled, choose this option and then set
	  STATIC_USERMODEHELPER_PATH to an empty string.

config STATIC_USERMODEHELPER_PATH
	string "Path to the static usermode helper binary"
	depends on STATIC_USERMODEHELPER
	default "/sbin/usermode-helper"
	help
	  The binary called by the kernel when any usermode helper
	  program is to be run. The "real" application's name will
	  be in the first argument passed to this program on the command
	  line.

	  If you wish for all usermode helper programs to be disabled,
	  specify an empty string here (i.e. "").

source "security/selinux/Kconfig"
source "security/smack/Kconfig"
source "security/tomoyo/Kconfig"
source "security/apparmor/Kconfig"
source "security/loadpin/Kconfig"
source "security/yama/Kconfig"
source "security/safesetid/Kconfig"
source "security/lockdown/Kconfig"
source "security/landlock/Kconfig"
source "security/ipe/Kconfig"

source "security/integrity/Kconfig"

choice
	prompt "First legacy 'major LSM' to be initialized"
	default DEFAULT_SECURITY_SELINUX if SECURITY_SELINUX
	default DEFAULT_SECURITY_SMACK if SECURITY_SMACK
	default DEFAULT_SECURITY_TOMOYO if SECURITY_TOMOYO
	default DEFAULT_SECURITY_APPARMOR if SECURITY_APPARMOR
	default DEFAULT_SECURITY_DAC

	help
	  This choice is there only for converting CONFIG_DEFAULT_SECURITY
	  in old kernel configs to CONFIG_LSM in new kernel configs. Don't
	  change this choice unless you are creating a fresh kernel config,
	  for this choice will be ignored after CONFIG_LSM has been set.

	  Selects the legacy "major security module" that will be
	  initialized first. Overridden by non-default CONFIG_LSM.

config DEFAULT_SECURITY_SELINUX
	bool "SELinux" if SECURITY_SELINUX=y

config DEFAULT_SECURITY_SMACK
	bool "Simplified Mandatory Access Control" if SECURITY_SMACK=y

config DEFAULT_SECURITY_TOMOYO
	bool "TOMOYO" if SECURITY_TOMOYO=y

config DEFAULT_SECURITY_APPARMOR
	bool "AppArmor" if SECURITY_APPARMOR=y

config DEFAULT_SECURITY_DAC
	bool "Unix Discretionary Access Controls"

endchoice

config LSM
	string "Ordered list of enabled LSMs"
	default "landlock,lockdown,yama,loadpin,safesetid,smack,selinux,tomoyo,apparmor,ipe,bpf" if DEFAULT_SECURITY_SMACK
	default "landlock,lockdown,yama,loadpin,safesetid,apparmor,selinux,smack,tomoyo,ipe,bpf" if DEFAULT_SECURITY_APPARMOR
	default "landlock,lockdown,yama,loadpin,safesetid,tomoyo,ipe,bpf" if DEFAULT_SECURITY_TOMOYO
	default "landlock,lockdown,yama,loadpin,safesetid,ipe,bpf" if DEFAULT_SECURITY_DAC
	default "landlock,lockdown,yama,loadpin,safesetid,selinux,smack,tomoyo,apparmor,ipe,bpf"
	help
	  A comma-separated list of LSMs, in initialization order.
	  Any LSMs left off this list, except for those with order
	  LSM_ORDER_FIRST and LSM_ORDER_LAST, which are always enabled
	  if selected in the kernel configuration, will be ignored.
	  This can be controlled at boot with the "lsm=" parameter.

	  If unsure, leave this as the default.

source "security/Kconfig.hardening"

endmenu