320 Commits

Author SHA1 Message Date
Linus Torvalds
f90f2145b2 s390 updates for 6.15 merge window
- Add sorting of mcount locations at build time
 
 - Rework uaccess functions with C exception handling to shorten inline
   assembly size and enable full inlining. This yields near-optimal code
   for small constant copies with a ~40kb kernel size increase
 
 - Add support for a configurable STRICT_MM_TYPECHECKS which allows to
   generate better code, but also allows to have type checking for
   debug builds
 
 - Optimize get_lowcore() for common callers with alternatives that
   nearly revert to the pre-relocated lowcore code, while also slightly
   reducing syscall entry and exit time
 
 - Convert MACHINE_HAS_* checks for single facility tests into cpu_has_*
   style macros that call test_facility(), and for features with additional
   conditions, add a new ALT_TYPE_FEATURE alternative to provide a static
   branch via alternative patching. Also, move machine feature detection
   to the decompressor for early patching and add debugging functionality
   to easily show which alternatives are patched
 
 - Add exception table support to early boot / startup code to get rid
   of the open coded exception handling
 
 - Use asm_inline for all inline assemblies with EX_TABLE or ALTERNATIVE
   to ensure correct inlining and unrolling decisions
 
 - Remove 2k page table leftovers now that s390 has been switched to
   always allocate 4k page tables
 
 - Split kfence pool into 4k mappings in arch_kfence_init_pool() and
   remove the architecture-specific kfence_split_mapping()
 
 - Use READ_ONCE_NOCHECK() in regs_get_kernel_stack_nth() to silence
   spurious KASAN warnings from opportunistic ftrace argument tracing
 
 - Force __atomic_add_const() variants on s390 to always return void,
   ensuring compile errors for improper usage
 
 - Remove s390's ioremap_wt() and pgprot_writethrough() due to mismatched
   semantics and lack of known users, relying on asm-generic fallbacks
 
 - Signal eventfd in vfio-ap to notify userspace when the guest AP
   configuration changes, including during mdev removal
 
 - Convert mdev_types from an array to a pointer in vfio-ccw and vfio-ap
   drivers to avoid fake flex array confusion
 
 - Cleanup trap code
 
 - Remove references to the outdated linux390@de.ibm.com address
 
 - Other various small fixes and improvements all over the code
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmfmuPwACgkQjYWKoQLX
 FBgTDAgAjKmZ5OYjACRfYepTvKk9SDqa2CBlQZ+BhbAXEVIrxKnv8OkImAXoWNsM
 mFxiCxAHWdcD+nqTrxFsXhkNLsndijlwnj/IqZgvy6R/3yNtBlAYRPLujOmVrsQB
 dWB8Dl38p63Ip1JfAqyabiAOUjfhrclRcM5FX5tgciXA6N/vhY3OM6k0+k7wN4Nj
 Dei/rCrnYRXTrFQgtM4w8JTIrwdnXjeKvaTYCflh4Q5ISJ7TceSF7cqq8HOs5hhK
 o2ciaoTdx212522CIsxeN3Ls3jrn8bCOCoOeSCysc5RL84grAuFnmjSajo1LFide
 S/TQtHXYy78Wuei9xvHi561ogiv/ww==
 =Kxgc
 -----END PGP SIGNATURE-----

Merge tag 's390-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add sorting of mcount locations at build time

 - Rework uaccess functions with C exception handling to shorten inline
   assembly size and enable full inlining. This yields near-optimal code
   for small constant copies with a ~40kb kernel size increase

 - Add support for a configurable STRICT_MM_TYPECHECKS which allows to
   generate better code, but also allows to have type checking for debug
   builds

 - Optimize get_lowcore() for common callers with alternatives that
   nearly revert to the pre-relocated lowcore code, while also slightly
   reducing syscall entry and exit time

 - Convert MACHINE_HAS_* checks for single facility tests into cpu_has_*
   style macros that call test_facility(), and for features with
   additional conditions, add a new ALT_TYPE_FEATURE alternative to
   provide a static branch via alternative patching. Also, move machine
   feature detection to the decompressor for early patching and add
   debugging functionality to easily show which alternatives are patched

 - Add exception table support to early boot / startup code to get rid
   of the open coded exception handling

 - Use asm_inline for all inline assemblies with EX_TABLE or ALTERNATIVE
   to ensure correct inlining and unrolling decisions

 - Remove 2k page table leftovers now that s390 has been switched to
   always allocate 4k page tables

 - Split kfence pool into 4k mappings in arch_kfence_init_pool() and
   remove the architecture-specific kfence_split_mapping()

 - Use READ_ONCE_NOCHECK() in regs_get_kernel_stack_nth() to silence
   spurious KASAN warnings from opportunistic ftrace argument tracing

 - Force __atomic_add_const() variants on s390 to always return void,
   ensuring compile errors for improper usage

 - Remove s390's ioremap_wt() and pgprot_writethrough() due to
   mismatched semantics and lack of known users, relying on asm-generic
   fallbacks

 - Signal eventfd in vfio-ap to notify userspace when the guest AP
   configuration changes, including during mdev removal

 - Convert mdev_types from an array to a pointer in vfio-ccw and vfio-ap
   drivers to avoid fake flex array confusion

 - Cleanup trap code

 - Remove references to the outdated linux390@de.ibm.com address

 - Other various small fixes and improvements all over the code

* tag 's390-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (78 commits)
  s390: Use inline qualifier for all EX_TABLE and ALTERNATIVE inline assemblies
  s390/kfence: Split kfence pool into 4k mappings in arch_kfence_init_pool()
  s390/ptrace: Avoid KASAN false positives in regs_get_kernel_stack_nth()
  s390/boot: Ignore vmlinux.map
  s390/sysctl: Remove "vm/allocate_pgste" sysctl
  s390: Remove 2k vs 4k page table leftovers
  s390/tlb: Use mm_has_pgste() instead of mm_alloc_pgste()
  s390/lowcore: Use lghi instead llilh to clear register
  s390/syscall: Merge __do_syscall() and do_syscall()
  s390/spinlock: Implement SPINLOCK_LOCKVAL with inline assembly
  s390/smp: Implement raw_smp_processor_id() with inline assembly
  s390/current: Implement current with inline assembly
  s390/lowcore: Use inline qualifier for get_lowcore() inline assembly
  s390: Move s390 sysctls into their own file under arch/s390
  s390/syscall: Simplify syscall_get_arguments()
  s390/vfio-ap: Notify userspace that guest's AP config changed when mdev removed
  s390: Remove ioremap_wt() and pgprot_writethrough()
  s390/mm: Add configurable STRICT_MM_TYPECHECKS
  s390/mm: Convert pgste_val() into function
  s390/mm: Convert pgprot_val() into function
  ...
2025-03-29 11:59:43 -07:00
Heiko Carstens
b46525437e s390/spinlock: Implement SPINLOCK_LOCKVAL with inline assembly
Implement SPINLOCK_LOCKVAL with an inline assembly, which makes use of the
ALTERNATIVE macro, to read spinlock_lockval from lowcore. Provide an
alternative instruction with a different offset in case lowcore is
relocated.

This replaces sequences of two instructions with one instruction.

Before:
  10602a:       a7 78 00 00             lhi     %r7,0
  10602e:       a5 8e 00 00             llilh   %r8,0
  106032:       58 d0 83 ac             l       %r13,940(%r8)
  106036:       ba 7d b5 80             cs      %r7,%r13,1408(%r11)

After:
  10602a:       a7 88 00 00             lhi     %r8,0
  10602e:       e3 70 03 ac 00 58       ly      %r7,940
  106034:       ba 87 b5 80             cs      %r8,%r7,1408(%r11)

Kernel image size change:
add/remove: 756/750 grow/shrink: 646/3435 up/down: 30778/-46326 (-15548)

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-18 17:13:04 +01:00
joel granados
20de8f8d31 s390: Move s390 sysctls into their own file under arch/s390
Move s390 sysctls (spin_retry and userprocess_debug) into their own
files under arch/s390. Create two new sysctl tables
(2390_{fault,spin}_sysctl_table) which will be initialized with
arch_initcall placing them after their original place in proc_root_init.

This is part of a greater effort to move ctl tables into their
respective subsystems which will reduce the merge conflicts in
kernel/sysctl.c.

Signed-off-by: joel granados <joel.granados@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lore.kernel.org/r/20250306-jag-mv_ctltables-v2-6-71b243c8d3f8@kernel.org
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-18 17:13:04 +01:00
Heiko Carstens
52109a067a s390: Convert MACHINE_IS_[LPAR|VM|KVM], etc, machine_is_[lpar|vm|kvm]()
Move machine type detection to the decompressor and use static branches
to implement and use machine_is_[lpar|vm|kvm]() instead of a runtime check
via MACHINE_IS_[LPAR|VM|KVM].

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04 17:18:07 +01:00
Heiko Carstens
ee487b0120 s390/uaccess: Inline __clear_user()
Rework __clear_user() similar to raw_copy_from_user() / raw_copy_to_user()
and inline the function saving the overhead of branches.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04 17:18:04 +01:00
Heiko Carstens
bc6029239c s390/uaccess: Separate key uaccess functions
Implement separate raw_copy_to_user_key() and raw_copy_from_user_key()
functions, which allows to remove the open-coded operand access control
handling from the normal raw_copy_to_user() / raw_copy_from_user()
functions - they are simplified to use immediate instructions to load
hard-coded operand access control values into register zero, which saves
one instruction.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2025-03-04 17:18:03 +01:00
Eric Biggers
68ea3c2ae0 lib/crc32: remove "_le" from crc32c base and arch functions
Following the standardization on crc32c() as the lib entry point for the
Castagnoli CRC32 instead of the previous mix of crc32c(), crc32c_le(),
and __crc32c_le(), make the same change to the underlying base and arch
functions that implement it.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250208024911.14936-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-02-08 20:06:30 -08:00
Linus Torvalds
b731bc5f49 more s390 updates for 6.14 merge window
- The rework that uncoupled physical and virtual address spaces
   inadvertently prevented KASAN shadow mappings from using large
   pages. Restore large page mappings for KASAN shadows
 
 - Add decompressor routine physmem_alloc() that may fail,
   unlike physmem_alloc_or_die(). This allows callers to
   implement fallback paths
 
 - Allow falling back from large pages to smaller pages (1MB or
   4KB) if the allocation of 2GB pages  in the decompressor can
   not be fulfilled
 
 - Add to the decompressor boot print support of "%%" format
   string, width and padding hadnling, length modifiers and
   decimal conversion specifiers
 
 - Add to the decompressor message severity levels similar to
   kernel ones. Support command-line options that control
   console output verbosity
 
 - Replaces boot_printk() calls with appropriate loglevel-
   specific helpers such as boot_emerg(), boot_warn(), and
   boot_debug().
 
 - Collect all boot messages into a ring buffer independent
   of the current log level. This is particularly useful for
   early crash analysis
 
 - If 'earlyprintk' command line parameter is not specified, store
   decompressor boot messages in a ring buffer to be printed later
   by the kernel, once the console driver is registered
 
 - Add 'bootdebug' command line parameter to enable printing of
   decompressor debug messages when needed. That parameters allows
   message supressing and filtering
 
 - Dump boot messages on a decompressor crash, but only if
   'bootdebug' command line parameter is enabled
 
 - When CONFIG_PRINTK_TIME is enabled, add timestamps to boot
   messages in the same format as regular printk()
 
 - Dump physical memory tracking information on boot:
   online ranges, reserved areas and vmem allocations
 
 - Dump virtual memory layout and randomization details
 
 - Improve decompression error reporting and dump the message
   ring buffer in case the boot failed and system halted
 
 - Add an exception handler which handles exceptions when FPU
   control register is attempted to be set to an invalid value.
   Remove '.fixup' section as result of this change
 
 - Use 'A', 'O', and 'R' inline assembly format flags, which
   allows recent Clang compilers to generate better FPU code
 
 - Rework uaccess code so it reads better and generates more
   efficient code
 
 - Cleanup futex inline assembly code
 
 - Disable KMSAN instrumention for futex inline assemblies, which
   contain dereferenced user pointers. Otherwise, shadows for the
   user pointers would be accessed
 
 - PFs which are not initially configured but in standby create
   only a single-function PCI domain. If they are configured later
   on, sibling PFs and their child VFs will not be added to their
   PCI domain breaking SR-IOV expectations. Fix that by allowing
   initially configured but in standby PFs create multi-function
   PCI domains
 
 - Add '-std=gnu11' to decompressor and purgatory CFLAGS to avoid
   compile errors caused by kernel's own definitions of 'bool',
   'false', and 'true' conflicting with the C23 reserved keywords
 
 - Fix sclp subsystem failure when a sclp console is not present
 
 - Fix misuse of non-NULL terminated strings in vmlogrdr driver
 
 - Various other small improvements, cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYKADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCZ5t0jhccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8A4wAP91BcHtV4OTmoMsG4JTwiy9wTzI
 zro9IrBSCLPLED7kfQEAzjEspd5gWMFDnIM/KxXmCNHA/nBTXVh/1F+b1F0z+Q8=
 =JfEG
 -----END PGP SIGNATURE-----

Merge tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Alexander Gordeev:

 - The rework that uncoupled physical and virtual address spaces
   inadvertently prevented KASAN shadow mappings from using large pages.
   Restore large page mappings for KASAN shadows

 - Add decompressor routine physmem_alloc() that may fail, unlike
   physmem_alloc_or_die(). This allows callers to implement fallback
   paths

 - Allow falling back from large pages to smaller pages (1MB or 4KB) if
   the allocation of 2GB pages in the decompressor can not be fulfilled

 - Add to the decompressor boot print support of "%%" format string,
   width and padding hadnling, length modifiers and decimal conversion
   specifiers

 - Add to the decompressor message severity levels similar to kernel
   ones. Support command-line options that control console output
   verbosity

 - Replaces boot_printk() calls with appropriate loglevel- specific
   helpers such as boot_emerg(), boot_warn(), and boot_debug().

 - Collect all boot messages into a ring buffer independent of the
   current log level. This is particularly useful for early crash
   analysis

 - If 'earlyprintk' command line parameter is not specified, store
   decompressor boot messages in a ring buffer to be printed later by
   the kernel, once the console driver is registered

 - Add 'bootdebug' command line parameter to enable printing of
   decompressor debug messages when needed. That parameters allows
   message suppressing and filtering

 - Dump boot messages on a decompressor crash, but only if 'bootdebug'
   command line parameter is enabled

 - When CONFIG_PRINTK_TIME is enabled, add timestamps to boot messages
   in the same format as regular printk()

 - Dump physical memory tracking information on boot: online ranges,
   reserved areas and vmem allocations

 - Dump virtual memory layout and randomization details

 - Improve decompression error reporting and dump the message ring
   buffer in case the boot failed and system halted

 - Add an exception handler which handles exceptions when FPU control
   register is attempted to be set to an invalid value. Remove '.fixup'
   section as result of this change

 - Use 'A', 'O', and 'R' inline assembly format flags, which allows
   recent Clang compilers to generate better FPU code

 - Rework uaccess code so it reads better and generates more efficient
   code

 - Cleanup futex inline assembly code

 - Disable KMSAN instrumention for futex inline assemblies, which
   contain dereferenced user pointers. Otherwise, shadows for the user
   pointers would be accessed

 - PFs which are not initially configured but in standby create only a
   single-function PCI domain. If they are configured later on, sibling
   PFs and their child VFs will not be added to their PCI domain
   breaking SR-IOV expectations.

   Fix that by allowing initially configured but in standby PFs create
   multi-function PCI domains

 - Add '-std=gnu11' to decompressor and purgatory CFLAGS to avoid
   compile errors caused by kernel's own definitions of 'bool', 'false',
   and 'true' conflicting with the C23 reserved keywords

 - Fix sclp subsystem failure when a sclp console is not present

 - Fix misuse of non-NULL terminated strings in vmlogrdr driver

 - Various other small improvements, cleanups and fixes

* tag 's390-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (53 commits)
  s390/vmlogrdr: Use array instead of string initializer
  s390/vmlogrdr: Use internal_name for error messages
  s390/sclp: Initialize sclp subsystem via arch_cpu_finalize_init()
  s390/tools: Use array instead of string initializer
  s390/vmem: Fix null-pointer-arithmetic warning in vmem_map_init()
  s390: Add '-std=gnu11' to decompressor and purgatory CFLAGS
  s390/bitops: Use correct constraint for arch_test_bit() inline assembly
  s390/pci: Fix SR-IOV for PFs initially in standby
  s390/futex: Avoid KMSAN instrumention for user pointers
  s390/uaccess: Rename get_put_user_noinstr_attributes to uaccess_kmsan_or_inline
  s390/futex: Cleanup futex_atomic_cmpxchg_inatomic()
  s390/futex: Generate futex atomic op functions
  s390/uaccess: Remove INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER
  s390/uaccess: Use asm goto for put_user()/get_user()
  s390/uaccess: Remove usage of the oac specifier
  s390/uaccess: Replace EX_TABLE_UA_LOAD_MEM exception handling
  s390/uaccess: Cleanup noinstr __put_user()/__get_user() inline assembly constraints
  s390/uaccess: Remove __put_user_fn()/__get_user_fn() wrappers
  s390/uaccess: Move put_user() / __put_user() close to put_user() asm code
  s390/uaccess: Use asm goto for __mvc_kernel_nofault()
  ...
2025-01-30 10:48:17 -08:00
Heiko Carstens
884f0582b2 s390/uaccess: Remove INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER
The s390 implementations of raw_copy_from_user() and raw_copy_to_user() are
never inlined. However INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER are
still set. This leads to the odd situation that only the error handling
(memset to zero of the not copied bytes) of copy_from_user() is inlined,
while the actual fast path code is out-of-line.

This would make sense if raw_copy_from_user() and raw_copy_to_user() were
implemented in assembler files, where inlining is not possible. But the
current s390 setup does not make any sense.

Address this by moving the raw uaccess copy inline assemblies to the
uaccess header file, and remove INLINE_COPY_FROM_USER and
INLINE_COPY_TO_USER definitions. This way the uaccess code, but now
including error handling, is still out-of-line with the common code
_copy_from_user() and _copy_to_user() variants, which inline the raw
uaccess functions via _inline_copy_from_user() and _inline_copy_to_user().

This reduces the size of the kernel image by ~17kb.
(defconfig, gcc 14.2.0)

Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-26 17:24:07 +01:00
Linus Torvalds
37b33c68b0 CRC updates for 6.14
- Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
   directly accessible via the library API, instead of requiring the
   crypto API.  This is much simpler and more efficient.
 
 - Convert some users such as ext4 to use the CRC32 library API instead
   of the crypto API.  More conversions like this will come later.
 
 - Add a KUnit test that tests and benchmarks multiple CRC variants.
   Remove older, less-comprehensive tests that are made redundant by
   this.
 
 - Add an entry to MAINTAINERS for the kernel's CRC library code.  I'm
   volunteering to maintain it.  I have additional cleanups and
   optimizations planned for future cycles.
 
 These patches have been in linux-next since -rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ418ZRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyJYAP9kBlpm8W9/XY6N8SpjKaXE/vKQYHQl
 Nobhak06Us8uJwEAkcUTymWP4IwQj5A9jgBAPRw53FQcNVKIc+01C7gRHw0=
 =mqSH
 -----END PGP SIGNATURE-----

Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull CRC updates from Eric Biggers:

 - Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
   directly accessible via the library API, instead of requiring the
   crypto API. This is much simpler and more efficient.

 - Convert some users such as ext4 to use the CRC32 library API instead
   of the crypto API. More conversions like this will come later.

 - Add a KUnit test that tests and benchmarks multiple CRC variants.
   Remove older, less-comprehensive tests that are made redundant by
   this.

 - Add an entry to MAINTAINERS for the kernel's CRC library code. I'm
   volunteering to maintain it. I have additional cleanups and
   optimizations planned for future cycles.

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits)
  MAINTAINERS: add entry for CRC library
  powerpc/crc: delete obsolete crc-vpmsum_test.c
  lib/crc32test: delete obsolete crc32test.c
  lib/crc16_kunit: delete obsolete crc16_kunit.c
  lib/crc_kunit.c: add KUnit test suite for CRC library functions
  powerpc/crc-t10dif: expose CRC-T10DIF function through lib
  arm64/crc-t10dif: expose CRC-T10DIF function through lib
  arm/crc-t10dif: expose CRC-T10DIF function through lib
  x86/crc-t10dif: expose CRC-T10DIF function through lib
  crypto: crct10dif - expose arch-optimized lib function
  lib/crc-t10dif: add support for arch overrides
  lib/crc-t10dif: stop wrapping the crypto API
  scsi: target: iscsi: switch to using the crc32c library
  f2fs: switch to using the crc32 library
  jbd2: switch to using the crc32c library
  ext4: switch to using the crc32c library
  lib/crc32: make crc32c() go directly to lib
  bcachefs: Explicitly select CRYPTO from BCACHEFS_FS
  x86/crc32: expose CRC32 functions through lib
  x86/crc32: update prototype for crc32_pclmul_le_16()
  ...
2025-01-22 19:55:08 -08:00
Sven Schnelle
745600ed69 s390/lib: Use exrl instead of ex in xor functions
exrl is present in all machines currently supported, therefore prefer
it over ex. This saves one instruction and doesn't need an additional
register to hold the address of the target instruction.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-13 09:50:17 +01:00
Sven Schnelle
807e39ed4d s390/lib: Use exrl instead of ex in string functions
exrl is present in all machines currently supported in the linux
kernel, therefore prefer it over ex. This saves one instruction
and doesn't need an additional register to hold the address of the
target instruction.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-12-17 12:46:13 +01:00
Eric Biggers
008071917d s390/crc32: expose CRC32 functions through lib
Move the s390 CRC32 assembly code into the lib directory and wire it up
to the library interface.  This allows it to be used without going
through the crypto API.  It remains usable via the crypto API too via
the shash algorithms that use the library interface.  Thus all the
arch-specific "shash" code becomes unnecessary and is removed.

Note: to see the diff from arch/s390/crypto/crc32-vx.c to
arch/s390/lib/crc32-glue.c, view this commit with 'git show -M10'.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-10-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:01 -08:00
Linus Torvalds
509f806f7f more s390 updates for 6.13 merge window
- Add swap entry for hugetlbfs support
 
 - Add PTE_MARKER support for hugetlbs mappings; this fixes a regression
   (possible page fault loop) which was introduced when support for
   UFFDIO_POISON for hugetlbfs was added
 
 - Add ARCH_HAS_PREEMPT_LAZY and PREEMPT_DYNAMIC support
 
 - Mark IRQ entries in entry code, so that stack tracers can filter out the
   non-IRQ parts of stack traces. This fixes stack depot capacity limit
   warnings, since without filtering the number of unique stack traces is
   huge
 
 - In PCI code fix leak of struct zpci_dev object, and fix potential double
   remove of hotplug slot
 
 - Fix pagefault_disable() / pagefault_enable() unbalance in
   arch_stack_user_walk_common()
 
 - A couple of inline assembly optimizations, more cmpxchg() to
   try_cmpxchg() conversions, and removal of usages of xchg() and cmpxchg()
   on one and two byte memory areas
 
 - Various other small improvements and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmdJ3WoACgkQIg7DeRsp
 bsJt1RAAtlkbeN4+eVeYM4vBwHvgfAY/5Ii2wdHO2qwPHqBVkRtsqrmyewE/tVCF
 PZsYBXDDrzyAtLMqjlNGDQ1QexNLn4BELgSIysr45mxwMq1W33BiXvb8I5uK/V/7
 /TcW2s1daJKKrbk+HBA8ZTwna5SeUSoZuh9y/n9SKVC4rRkWdeL7G1RRNQtafDlg
 aELCo17iHDZNoHeoRStOimZqVBwko6IQqQH4DCx2S4+J6nKQBGRyzGWIkLRoUxr6
 MgNLrxekWjkoqAnXM0Ztb7LYg6AS/iOuGbqg/xLi1VJSWNCIf9zLpDs++SdFHoTU
 n4Cj07IHR4OLQ1YB+EX2uPY7rJw0tPt0g/dgmYYi3uP88hJ7VYFOtfJx/UGlid2q
 3l7wXNwtg+CJtw0Ey+21cMdmnOffxH9c3nBPahe7zK5k1GKjXDOfWEcmucG0zW5K
 qYI5m7vAZAX4ve1362DOgJei/1uxGuMQQZsobHpwfhcGXzLZ2AZY45Ls86nQzHua
 KpupybWQe70hQYk9hUw+M/ShChuH8dhnPjx51T0r/0E0BdU6Q20xLPLWx/2jRzUb
 FlFg7WtVw2y45eQCFPbtVsoVzDCpfpfgTw5rrDsjFf/twS0E3ubmTC1rLr4YB+5m
 5cjPys/SYpQWUi3wQFTQ6dL3w0+vWXlQmTi5ChcxTZF2ytwP+yg=
 =cfmM
 -----END PGP SIGNATURE-----

Merge tag 's390-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:

 - Add swap entry for hugetlbfs support

 - Add PTE_MARKER support for hugetlbs mappings; this fixes a regression
   (possible page fault loop) which was introduced when support for
   UFFDIO_POISON for hugetlbfs was added

 - Add ARCH_HAS_PREEMPT_LAZY and PREEMPT_DYNAMIC support

 - Mark IRQ entries in entry code, so that stack tracers can filter out
   the non-IRQ parts of stack traces. This fixes stack depot capacity
   limit warnings, since without filtering the number of unique stack
   traces is huge

 - In PCI code fix leak of struct zpci_dev object, and fix potential
   double remove of hotplug slot

 - Fix pagefault_disable() / pagefault_enable() unbalance in
   arch_stack_user_walk_common()

 - A couple of inline assembly optimizations, more cmpxchg() to
   try_cmpxchg() conversions, and removal of usages of xchg() and
   cmpxchg() on one and two byte memory areas

 - Various other small improvements and cleanups

* tag 's390-6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (27 commits)
  Revert "s390/mm: Allow large pages for KASAN shadow mapping"
  s390/spinlock: Use flag output constraint for arch_cmpxchg_niai8()
  s390/spinlock: Use R constraint for arch_load_niai4()
  s390/spinlock: Generate shorter code for arch_spin_unlock()
  s390/spinlock: Remove condition code clobber from arch_spin_unlock()
  s390/spinlock: Use symbolic names in inline assemblies
  s390: Support PREEMPT_DYNAMIC
  s390/pci: Fix potential double remove of hotplug slot
  s390/pci: Fix leak of struct zpci_dev when zpci_add_device() fails
  s390/mm/hugetlbfs: Add missing includes
  s390/mm: Add PTE_MARKER support for hugetlbfs mappings
  s390/mm: Introduce region-third and segment table swap entries
  s390/mm: Introduce region-third and segment table entry present bits
  s390/mm: Rearrange region-third and segment table entry SW bits
  KVM: s390: Increase size of union sca_utility to four bytes
  KVM: s390: Remove one byte cmpxchg() usage
  KVM: s390: Use try_cmpxchg() instead of cmpxchg() loops
  s390/ap: Replace xchg() with WRITE_ONCE()
  s390/mm: Allow large pages for KASAN shadow mapping
  s390: Add ARCH_HAS_PREEMPT_LAZY support
  ...
2024-11-29 10:40:52 -08:00
Heiko Carstens
889221c4d7 s390/spinlock: Use flag output constraint for arch_cmpxchg_niai8()
Add a new variant of arch_cmpxchg_niai8() which makes use of the flag
output constraint, which allows the compiler to generate slightly better
code. Also rename arch_cmpxchg_niai8() to arch_try_cmpxchg_niai8() which
reflects the purpose of the function and makes it consistent with other
"try" variants.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-28 14:14:34 +01:00
Heiko Carstens
84ac96587b s390/spinlock: Use R constraint for arch_load_niai4()
The load instruction used within arch_load_niai4() has a short displacement
and index register. Therefore use the R constraint to reflect this.
The used Q constraint does consider an index register.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-28 14:12:05 +01:00
Heiko Carstens
78486ed9e7 s390/spinlock: Use symbolic names in inline assemblies
Improve readability and use symbolic names.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-28 14:12:04 +01:00
Linus Torvalds
aad3a0d084 ftrace updates for v6.13:
- Merged tag ftrace-v6.12-rc4
 
   There was a fix to locking in register_ftrace_graph() for shadow stacks
   that was sent upstream. But this code was also being rewritten, and the
   locking fix was needed. Merging this fix was required to continue the
   work.
 
 - Restructure the function graph shadow stack to prepare it for use with
   kretprobes
 
   With the goal of merging the shadow stack logic of function graph and
   kretprobes, some more restructuring of the function shadow stack is
   required.
 
   Move out function graph specific fields from the fgraph infrastructure and
   store it on the new stack variables that can pass data from the entry
   callback to the exit callback.
 
   Hopefully, with this change, the merge of kretprobes to use fgraph shadow
   stacks will be ready by the next merge window.
 
 - Make shadow stack 4k instead of using PAGE_SIZE.
 
   Some architectures have very large PAGE_SIZE values which make its use for
   shadow stacks waste a lot of memory.
 
 - Give shadow stacks its own kmem cache.
 
   When function graph is started, every task on the system gets a shadow
   stack. In the future, shadow stacks may not be 4K in size. Have it have
   its own kmem cache so that whatever size it becomes will still be
   efficient in allocations.
 
 - Initialize profiler graph ops as it will be needed for new updates to fgraph
 
 - Convert to use guard(mutex) for several ftrace and fgraph functions
 
 - Add more comments and documentation
 
 - Show function return address in function graph tracer
 
   Add an option to show the caller of a function at each entry of the
   function graph tracer, similar to what the function tracer does.
 
 - Abstract out ftrace_regs from being used directly like pt_regs
 
   ftrace_regs was created to store a partial pt_regs. It holds only the
   registers and stack information to get to the function arguments and
   return values. On several archs, it is simply a wrapper around pt_regs.
   But some users would access ftrace_regs directly to get the pt_regs which
   will not work on all archs. Make ftrace_regs an abstract structure that
   requires all access to its fields be through accessor functions.
 
 - Show how long it takes to do function code modifications
 
   When code modification for function hooks happen, it always had the time
   recorded in how long it took to do the conversion. But this value was
   never exported. Recently the code was touched due to new ROX modification
   handling that caused a large slow down in doing the modifications and
   had a significant impact on boot times.
 
   Expose the timings in the dyn_ftrace_total_info file. This file was
   created a while ago to show information about memory usage and such to
   implement dynamic function tracing. It's also an appropriate file to store
   the timings of this modification as well. This will make it easier to see
   the impact of changes to code modification on boot up timings.
 
 - Other clean ups and small fixes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZztrUxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qnnNAQD6w4q9VQ7oOE2qKLqtnj87h4c1GqKn
 SPkpEfC3n/ATEAD/fnYjT/eOSlHiGHuD/aTA+U/bETrT99bozGM/4mFKEgY=
 =6nCa
 -----END PGP SIGNATURE-----

Merge tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ftrace updates from Steven Rostedt:

 - Restructure the function graph shadow stack to prepare it for use
   with kretprobes

   With the goal of merging the shadow stack logic of function graph and
   kretprobes, some more restructuring of the function shadow stack is
   required.

   Move out function graph specific fields from the fgraph
   infrastructure and store it on the new stack variables that can pass
   data from the entry callback to the exit callback.

   Hopefully, with this change, the merge of kretprobes to use fgraph
   shadow stacks will be ready by the next merge window.

 - Make shadow stack 4k instead of using PAGE_SIZE.

   Some architectures have very large PAGE_SIZE values which make its
   use for shadow stacks waste a lot of memory.

 - Give shadow stacks its own kmem cache.

   When function graph is started, every task on the system gets a
   shadow stack. In the future, shadow stacks may not be 4K in size.
   Have it have its own kmem cache so that whatever size it becomes will
   still be efficient in allocations.

 - Initialize profiler graph ops as it will be needed for new updates to
   fgraph

 - Convert to use guard(mutex) for several ftrace and fgraph functions

 - Add more comments and documentation

 - Show function return address in function graph tracer

   Add an option to show the caller of a function at each entry of the
   function graph tracer, similar to what the function tracer does.

 - Abstract out ftrace_regs from being used directly like pt_regs

   ftrace_regs was created to store a partial pt_regs. It holds only the
   registers and stack information to get to the function arguments and
   return values. On several archs, it is simply a wrapper around
   pt_regs. But some users would access ftrace_regs directly to get the
   pt_regs which will not work on all archs. Make ftrace_regs an
   abstract structure that requires all access to its fields be through
   accessor functions.

 - Show how long it takes to do function code modifications

   When code modification for function hooks happen, it always had the
   time recorded in how long it took to do the conversion. But this
   value was never exported. Recently the code was touched due to new
   ROX modification handling that caused a large slow down in doing the
   modifications and had a significant impact on boot times.

   Expose the timings in the dyn_ftrace_total_info file. This file was
   created a while ago to show information about memory usage and such
   to implement dynamic function tracing. It's also an appropriate file
   to store the timings of this modification as well. This will make it
   easier to see the impact of changes to code modification on boot up
   timings.

 - Other clean ups and small fixes

* tag 'ftrace-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (22 commits)
  ftrace: Show timings of how long nop patching took
  ftrace: Use guard to take ftrace_lock in ftrace_graph_set_hash()
  ftrace: Use guard to take the ftrace_lock in release_probe()
  ftrace: Use guard to lock ftrace_lock in cache_mod()
  ftrace: Use guard for match_records()
  fgraph: Use guard(mutex)(&ftrace_lock) for unregister_ftrace_graph()
  fgraph: Give ret_stack its own kmem cache
  fgraph: Separate size of ret_stack from PAGE_SIZE
  ftrace: Rename ftrace_regs_return_value to ftrace_regs_get_return_value
  selftests/ftrace: Fix check of return value in fgraph-retval.tc test
  ftrace: Use arch_ftrace_regs() for ftrace_regs_*() macros
  ftrace: Consolidate ftrace_regs accessor functions for archs using pt_regs
  ftrace: Make ftrace_regs abstract from direct use
  fgragh: No need to invoke the function call_filter_check_discard()
  fgraph: Simplify return address printing in function graph tracer
  function_graph: Remove unnecessary initialization in ftrace_graph_ret_addr()
  function_graph: Support recording and printing the function return address
  ftrace: Have calltime be saved in the fgraph storage
  ftrace: Use a running sleeptime instead of saving on shadow stack
  fgraph: Use fgraph data to store subtime for profiler
  ...
2024-11-20 11:34:10 -08:00
Heiko Carstens
0caf91f6d6 s390/string: Convert to use flag output macros
Use flag output macros in inline asm to allow for better code generation if
the compiler has support for the flag output constraint.

Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-13 14:31:33 +01:00
Heiko Carstens
d5fd93629a s390/locking: Use arch_try_cmpxchg() instead of __atomic_cmpxchg_bool()
Use arch_try_cmpxchg() instead of __atomic_cmpxchg_bool() everywhere.
This generates the same code like before, but uses the standard
cmpxchg() implementation instead of a custom __atomic_cmpxchg_bool().

Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-11-12 14:01:29 +01:00
Steven Rostedt
7888af4166 ftrace: Make ftrace_regs abstract from direct use
ftrace_regs was created to hold registers that store information to save
function parameters, return value and stack. Since it is a subset of
pt_regs, it should only be used by its accessor functions. But because
pt_regs can easily be taken from ftrace_regs (on most archs), it is
tempting to use it directly. But when running on other architectures, it
may fail to build or worse, build but crash the kernel!

Instead, make struct ftrace_regs an empty structure and have the
architectures define __arch_ftrace_regs and all the accessor functions
will typecast to it to get to the actual fields. This will help avoid
usage of ftrace_regs directly.

Link: https://lore.kernel.org/all/20241007171027.629bdafd@gandalf.local.home/

Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
Cc: "x86@kernel.org" <x86@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Naveen N Rao <naveen@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Paul  Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas  Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Borislav  Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/20241008230628.958778821@goodmis.org
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-10-10 20:18:01 -04:00
Heiko Carstens
b3e0c5f734 s390/alternatives: Rework to allow for callbacks
Rework alternatives to allow for callbacks. With this every
alternative entry has additional data encoded:

- When (aka context) an alternative is supposed to be applied

- The type of an alternative, which allows for type specific handling
  and callbacks

- Extra type specific payload (patch information), which can be passed
  to callbacks in order to decide if an alternative should be applied
  or not

With this only the "late" context is implemented, which means there is
no change to the previous behaviour. All code is just converted to the
more generic new infrastructure.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Tested-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-07-23 16:02:31 +02:00
Jeff Johnson
4657a8a1c0 s390/lib: Add missing MODULE_DESCRIPTION() macros
With ARCH=s390, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/s390/lib/test_kprobes_s390.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/s390/lib/test_unwind.o
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/s390/lib/test_modules.o

Add the missing invocations of the MODULE_DESCRIPTION() macro.

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240615-md-s390-arch-s390-lib-v1-1-d7424b943973@quicinc.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-06-28 14:52:30 +02:00
Sven Schnelle
208da1d5fc s390: Replace S390_lowcore by get_lowcore()
Replace all S390_lowcore usages in arch/s390/ by get_lowcore().

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2024-06-18 17:01:33 +02:00
Vasily Gorbik
ba05b39d54 s390/expoline: Make modules use kernel expolines
Currently, kernel modules contain their own set of expoline thunks. In
the case of EXPOLINE_EXTERN, this involves postlinking of precompiled
expoline.o. expoline.o is also necessary for out-of-source tree module
builds.

Now that the kernel modules area is less than 4 GB away from
kernel expoline thunks, make modules use kernel expolines. Also make
EXPOLINE_EXTERN the default if the compiler supports it. This simplifies
build and aligns with the approach adopted by other architectures.

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17 13:38:03 +02:00
Heiko Carstens
dcd3e1de9d s390/checksum: provide csum_partial_copy_nocheck()
With csum_partial(), which reads all bytes into registers it is easy to
also implement csum_partial_copy_nocheck() which copies the buffer while
calculating its checksum.

For a 512 byte buffer this reduces the runtime by 19%. Compared to the old
generic variant (memcpy() + cksm instruction) runtime is reduced by 42%).

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 14:30:17 +01:00
Heiko Carstens
cb2a1dd589 s390/checksum: provide vector register variant of csum_partial()
Provide a faster variant of csum_partial() which uses vector registers
instead of the cksm instruction.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16 14:30:17 +01:00
Heiko Carstens
1c8b8cf28f s390/nmi: implement and use local_mcck_save() / local_mcck_restore()
Instead of using local_mcck_disable() / local_mcck_enable() implement and
use local_mcck_save() / local_mcck_restore() to disable machine checks, and
restoring the previous state.

The problem with using local_mcck_disable() / local_mcck_enable() is that
there is an assumption that machine checks are always enabled. While this
is currently the case the code still looks quite odd, readers need to
double check if the code is correct.

In order to increase readability save and then restore the old machine
check mask bit, instead of assuming that it must have been enabled.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-12-11 14:33:05 +01:00
Heiko Carstens
527618abb9 s390/ctlreg: add struct ctlreg
Add struct ctlreg to enforce strict type checking / usage for control
register functions.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
2372d39142 s390/ctlreg: use local_ctl_load() and local_ctl_store() where possible
Convert all single control register usages of __local_ctl_load() and
__local_ctl_store() to local_ctl_load() and local_ctl_store().

This also requires to change the type of some struct lowcore members
from __u64 to unsigned long.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
8d5e98f8d6 s390/ctlreg: add local and system prefix to some functions
Add local and system prefix to some functions to clarify they change
control register contents on either the local CPU or the on all CPUs.

This results in the following API:

Two defines which load and save multiple control registers.
The defines correlate with the following C prototypes:

void __local_ctl_load(unsigned long *, unsigned int cr_low, unsigned int cr_high);
void __local_ctl_store(unsigned long *, unsigned int cr_low, unsigned int cr_high);

Two functions which locally set or clear one bit for a specified
control register:

void local_ctl_set_bit(unsigned int cr, unsigned int bit);
void local_ctl_clear_bit(unsigned int cr, unsigned int bit);

Two functions which set or clear one bit for a specified control
register on all CPUs:

void system_ctl_set_bit(unsigned int cr, unsigned int bit);
void system_ctl_clear_bit(unsigend int cr, unsigned int bit);

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
ebe1cd530f s390/ctlreg: rename ctl_reg.h to ctlreg.h
Rename ctl_reg.h to ctlreg.h so it matches not only ctlreg.c but also
other control register related function, union, and structure names,
which all come with a ctlreg prefix.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Heiko Carstens
0c4d01f395 s390/ctlreg: move control register code to separate file
Control register handling has nothing to do with low level SMP code.
Move it to a separate file.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19 13:26:56 +02:00
Masahiro Yamada
b8c723f1e6 s390: replace #include <asm/export.h> with #include <linux/export.h>
Commit ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
deprecated <asm/export.h>, which is now a wrapper of <linux/export.h>.

Replace #include <asm/export.h> with #include <linux/export.h>.

After all the <asm/export.h> lines are converted, <asm/export.h> and
<asm-generic/export.h> will be removed.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20230806151641.394720-2-masahiroy@kernel.org
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-09 15:20:50 +02:00
Heiko Carstens
b378a98261 s390: include linux/io.h instead of asm/io.h
Include linux/io.h instead of asm/io.h everywhere. linux/io.h includes
asm/io.h, so this shouldn't cause any problems. Instead this might help for
some randconfig build errors which were reported due to some undefined io
related functions.

Also move the changed include so it stays grouped together with other
includes from the same directory.

For ctcm_mpc.c also remove not needed comments (actually questions).

Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03 11:19:40 +02:00
Heiko Carstens
fbac266f09 s390: select ARCH_SUPPORTS_INT128
s390 has instructions to support 128 bit arithmetics, e.g. a 64 bit
multiply instruction with a 128 bit result. Also 128 bit integer
artithmetics are already used in s390 specific architecture code (see
e.g. read_persistent_clock64()).

Therefore select ARCH_SUPPORTS_INT128.

However limit this to clang for now, since gcc generates inefficient code,
which may lead to stack overflows, when compiling
lib/crypto/curve25519-hacl64.c which depends on ARCH_SUPPORTS_INT128. The
gcc generated functions have 6kb stack frames, compared to only 1kb of the
code generated with clang.

If the kernel is compiled with -Os library calls for __ashlti3(),
__ashrti3(), and __lshrti3() may be generated. Similar to arm64
and riscv provide assembler implementations for these functions.

Acked-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-05-15 14:12:14 +02:00
Heiko Carstens
45769052ae s390/lib: use SYM* macros instead of ENTRY(), etc.
Consistently use the SYM* family of macros instead of the
deprecated ENTRY(), ENDPROC(), etc. family of macros.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-19 17:24:16 +02:00
Heiko Carstens
e48b6853d8 s390/kasan: remove override of mem*() functions
The kasan mem*() functions are not used anymore since s390 has switched
to GENERIC_ENTRY and commit 69d4c0d32186 ("entry, kasan, x86: Disallow
overriding mem*() functions").

Therefore remove the now dead code, similar to x86.
While at it also use the SYM* macros in mem.S.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-19 17:24:16 +02:00
Heiko Carstens
49d6e68f66 s390/uaccess: remove extra blank line
In order to get uaccess.c (nearly) checkpatch warning free remove an
extra blank line:

CHECK: Blank lines aren't necessary before a close brace '}'
+
+}

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:27:24 +02:00
Heiko Carstens
c3bd834328 s390/uaccess: get rid of not needed local variable
Get rid of the not needed val local variable and pass the constant
value directly as operand value. In addition this turns the val
operand into an input operand, since it is not changed within the
inline assemblies.

This in turn requires also to add the earlyclobber contraint modifier
to all output operands, since the (former) val operand is used after
all output variants have been modified.

The usercopy kunit tests still pass after this change.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:27:24 +02:00
Heiko Carstens
7f65d18329 s390/uaccess: rename tmp1 and tmp2 variables
Rename tmp1 and tmp2 variables to more meaningful val (for value) and rem
(for remainder).

Except for debug sections the output of "objdump -Dr" of the uaccess object
file is identical before/after this change.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:27:24 +02:00
Heiko Carstens
afdcc2ce39 s390/uaccess: sort EX_TABLE list for inline assemblies
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:27:24 +02:00
Heiko Carstens
4e0b0ad45c s390/uaccess: rename/sort labels in inline assemblies
Rename and sort labels in uaccess inline assemblies to increase
readability. In addition have only one EX_TABLE entry per line - also to
increase readability.

Except for debug sections the output of "objdump -Dr" of the uaccess object
file is identical before/after this change.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:27:24 +02:00
Heiko Carstens
b96adf0d03 s390/uaccess: remove unused label in inline assemblies
Remove an unused label in all three uaccess inline assemblies.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:27:24 +02:00
Heiko Carstens
10679e4d98 s390/uaccess: use symbolic names for inline assembly operands
Improve readability of the uaccess inline assemblies by using symbolic
names for all input and output operands.

Except for debug sections the output of "objdump -Dr" of the uaccess object
file is identical before/after this change.

Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:27:24 +02:00
Heiko Carstens
89aba4c26f s390/uaccess: add missing earlyclobber annotations to __clear_user()
Add missing earlyclobber annotation to size, to, and tmp2 operands of the
__clear_user() inline assembly since they are modified or written to before
the last usage of all input operands. This can lead to incorrect register
allocation for the inline assembly.

Fixes: 6c2a9e6df604 ("[S390] Use alternative user-copy operations for new hardware.")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/all/20230321122514.1743889-3-mark.rutland@arm.com/
Cc: stable@vger.kernel.org
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-27 17:23:08 +02:00
Vasily Gorbik
1a280f48c0 s390/kprobes: replace kretprobe with rethook
That's an adaptation of commit f3a112c0c40d ("x86,rethook,kprobes:
Replace kretprobe with rethook on x86") to s390.

Replaces the kretprobe code with rethook on s390. With this patch,
kretprobe on s390 uses the rethook instead of kretprobe specific
trampoline code.

Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22 18:42:35 +01:00
Heiko Carstens
4e1b5a86a5 s390/uaccess: add missing EX_TABLE entries to __clear_user()
For some exception types the instruction address points behind the
instruction that caused the exception. Take that into account and add
the missing exception table entries.

Cc: <stable@vger.kernel.org>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:47:30 +02:00
Linus Torvalds
27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Alexander Potapenko
33b75c1d88 instrumented.h: allow instrumenting both sides of copy_from_user()
Introduce instrument_copy_from_user_before() and
instrument_copy_from_user_after() hooks to be invoked before and after the
call to copy_from_user().

KASAN and KCSAN will be only using instrument_copy_from_user_before(), but
for KMSAN we'll need to insert code after copy_from_user().

Link: https://lkml.kernel.org/r/20220915150417.722975-4-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-03 14:03:18 -07:00