- Add support for Global Persistent Flush (GPF)
- Cleanup of DPA partition metadata handling
- Remove the CXL_DECODER_MIXED enum that's not needed anymore
- Introduce helpers to access resource and perf meta data
- Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info'
- Make cxl_dpa_alloc() DPA partition number agnostic
- Remove cxl_decoder_mode
- Cleanup partition size and perf helpers
- Remove unused CXL partition values
- Add logging support for CXL CPER endpoint and port protocol errors
- Prefix protocol error struct and function names with cxl_
- Move protocol error definitions and structures to a common location
- Remove drivers/firmware/efi/cper_cxl.h to include/linux/cper.h
- Add support in GHES to process CXL CPER protocol errors
- Process CXL CPER protocol errors
- Add trace logging for CXL PCIe port RAS errors
- Remove redundant gp_port init
- Add validation of cxl device serial number
- CXL ABI documentation updates/fixups
- A series that uses guard() to clean up open coded mutex lockings and remove gotos for error
handling.
- Some followup patches to support dirty shutdown accounting
- Add helper to retrieve DVSEC offset for dirty shutdown registers
- Rename cxl_get_dirty_shutdown() to cxl_arm_dirty_shutdown()
- Add support for dirty shutdown count via sysfs
- cxl_test support for dirty shutdown
- A series to support CXL mailbox Features commands. Mostly in preparation for CXL EDAC
code to utilize the Features commands. It's also in preparation for CXL fwctl support
to utilize the CXL Features. The commands include "Get Supported Features", "Get Feature",
and "Set Feature".
- A series to support extended linear cache support described by the ACPI HMAT table. The
addition helps enumerate the cache and also provides additional RAS reporting support for
configuration with extended linear cache. (and related fixes for the
series).
- An update to cxl_test to support a 3-way capable CFMWS.
- A documentation fix to remove unused "mixed mode".
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmfqtP4ACgkQYGjFFmlT
OEqx9A//UsCWf1CH8bvjKXxSTlQmtPlNpcXe+gVR0sc5cL2VFxKf93AY8Zo1Br5A
b40gtZJz9QwjwGwIvDiki9U2bopOyX3aMOyBJMYmLuL/irY8ENx2ra7ODbxe7uGn
oZwpwG2sEGQxIAG2bCpVuCDIt8JjNvsTJo45TICs07w9TWTmH4Swpbz1g8VGpDz/
kCQcXXHSHZleR5BzqVRKxjjqGEUFj2xDMzAI8VSL+7izMMoPLbjwnl2c1fwaLBPd
iJTMboTXDj7eVMta/qqGkG7pshM81SnkSzy8cxImj3r4SRgRTZg9U8vhrR3K1kdH
F05Ozd12tljtNXLWthENZPUbfcovy9oTxzMt/gVut7j6C7H3s3KCSbV7zhz5BmfD
XcapOX4Cu7ptn88KLqE5a98oLuq2DXrLOcX5vKPYBfAO+68rC+gSAPSbzfZlSHa0
1/TsxVvzDQUBVZWL94DeHvemyQb58GQBOypeNZbH8P4gAhWJqk3hZEO+wlSxpfd+
R7wgabfKJUJ82KusCZHIW1Wg3/IrXb4yC+UyiObS5RgIJWpRmOkuJEHDvEUje+Dj
aOWw/H3vZgeZnpW87FRxzvDJx1/0jZI1vsxH65m2wrvz6n5aGIA/Q6pgqCdU/m6c
I231bl1bmZzJ8u3+vOZL4tFHcYHh4XCwQp+ZQt1uDa0fA5LbLhc=
=ZME1
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull Compute Express Link (CXL) updates from Dave Jiang:
- Add support for Global Persistent Flush (GPF)
- Cleanup of DPA partition metadata handling:
- Remove the CXL_DECODER_MIXED enum that's not needed anymore
- Introduce helpers to access resource and perf meta data
- Introduce 'struct cxl_dpa_partition' and 'struct cxl_range_info'
- Make cxl_dpa_alloc() DPA partition number agnostic
- Remove cxl_decoder_mode
- Cleanup partition size and perf helpers
- Remove unused CXL partition values
- Add logging support for CXL CPER endpoint and port protocol errors:
- Prefix protocol error struct and function names with cxl_
- Move protocol error definitions and structures to a common location
- Remove drivers/firmware/efi/cper_cxl.h to include/linux/cper.h
- Add support in GHES to process CXL CPER protocol errors
- Process CXL CPER protocol errors
- Add trace logging for CXL PCIe port RAS errors
- Remove redundant gp_port init
- Add validation of cxl device serial number
- CXL ABI documentation updates/fixups
- A series that uses guard() to clean up open coded mutex lockings and
remove gotos for error handling.
- Some followup patches to support dirty shutdown accounting:
- Add helper to retrieve DVSEC offset for dirty shutdown registers
- Rename cxl_get_dirty_shutdown() to cxl_arm_dirty_shutdown()
- Add support for dirty shutdown count via sysfs
- cxl_test support for dirty shutdown
- A series to support CXL mailbox Features commands.
Mostly in preparation for CXL EDAC code to utilize the Features
commands. It's also in preparation for CXL fwctl support to utilize
the CXL Features. The commands include "Get Supported Features", "Get
Feature", and "Set Feature".
- A series to support extended linear cache support described by the
ACPI HMAT table.
The addition helps enumerate the cache and also provides additional
RAS reporting support for configuration with extended linear cache.
(and related fixes for the series).
- An update to cxl_test to support a 3-way capable CFMWS
- A documentation fix to remove unused "mixed mode"
* tag 'cxl-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (39 commits)
cxl/region: Fix the first aliased address miscalculation
cxl/region: Quiet some dev_warn()s in extended linear cache setup
cxl/Documentation: Remove 'mixed' from sysfs mode doc
cxl: Fix warning from emitting resource_size_t as long long int on 32bit systems
cxl/test: Define a CFMWS capable of a 3 way HB interleave
cxl/mem: Do not return error if CONFIG_CXL_MCE unset
tools/testing/cxl: Set Shutdown State support
cxl/pmem: Export dirty shutdown count via sysfs
cxl/pmem: Rename cxl_dirty_shutdown_state()
cxl/pci: Introduce cxl_gpf_get_dvsec()
cxl/pci: Support Global Persistent Flush (GPF)
cxl: Document missing sysfs files
cxl: Plug typos in ABI doc
cxl/pmem: debug invalid serial number data
cxl/cdat: Remove redundant gp_port initialization
cxl/memdev: Remove unused partition values
cxl/region: Drop goto pattern of construct_region()
cxl/region: Drop goto pattern in cxl_dax_region_alloc()
cxl/core: Use guard() to drop goto pattern of cxl_dpa_alloc()
cxl/core: Use guard() to drop the goto pattern of cxl_dpa_free()
...
In preparation for cxl fwctl enabling, move data structures related to
cxl feature commands to a user header file.
Reviewed-by; Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/r/20250307205648.1021626-3-dave.jiang@intel.com
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add fwctl support code to allow sending of CXL feature commands from
userspace through as ioctls via FWCTL. Provide initial setup bits. The
CXL PCI probe function will call devm_cxl_setup_fwctl() after the
cxl_memdev has been enumerated in order to setup FWCTL char device under
the cxl_memdev like the existing memdev char device for issuing CXL raw
mailbox commands from userspace via ioctls.
Link: https://patch.msgid.link/r/20250307205648.1021626-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add CXL Features support. Setup code for enabling in kernel usage of CXL
Features. Expecting EDAC/RAS to utilize CXL Features in kernel for
things such as memory sparing. Also prepartion for enabling of CXL FWCTL
support to issue allowed Features from user space.
When PCIe AER is in FW-First, OS should process CXL Protocol errors from
CPER records. Introduce support for handling and logging CXL Protocol
errors.
The defined trace events cxl_aer_uncorrectable_error and
cxl_aer_correctable_error trace native CXL AER endpoint errors. Reuse them
to trace FW-First Protocol errors.
Since the CXL code is required to be called from process context and
GHES is in interrupt context, use workqueues for processing.
Similar to CXL CPER event handling, use kfifo to handle errors as it
simplifies queue processing by providing lock free fifo operations.
Add the ability for the CXL sub-system to register a workqueue to
process CXL CPER protocol errors.
[DJ: return cxl_cper_register_prot_err_work() directly in cxl_ras_init()]
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://patch.msgid.link/20250310223839.31342-2-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Certain features will be exclusively used by components such as in
kernel RAS driver. Setup an exclusion list that can be used to detect
if a feature is exclusive to the kernel.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250220194438.2281088-7-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Add support for SET_FEATURE mailbox command.
CXL spec r3.2 section 8.2.9.6 describes optional device specific features.
CXL devices supports features with changeable attributes.
The settings of a feature can be optionally modified using Set Feature
command.
CXL spec r3.2 section 8.2.9.6.3 describes Set Feature command.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250220194438.2281088-6-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Add support for GET_FEATURE mailbox command.
CXL spec r3.2 section 8.2.9.6 describes optional device specific features.
The settings of a feature can be retrieved using Get Feature command.
CXL spec r3.2 section 8.2.9.6.2 describes Get Feature command.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250220194438.2281088-5-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
CXL spec r3.2 8.2.9.6.1 Get Supported Features (Opcode 0500h)
The command retrieve the list of supported device-specific features
(identified by UUID) and general information about each Feature.
The driver will retrieve the Feature entries in order to make checks and
provide information for the Get Feature and Set Feature command. One of
the main piece of information retrieved are the effects a Set Feature
command would have for a particular feature. The retrieved Feature
entries are stored in the cxl_mailbox context.
The setup of Features is initiated via devm_cxl_setup_features() during the
pci probe function before the cxl_memdev is enumerated.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250220194438.2281088-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Add feature commands enumeration code in order to detect and enumerate
the 3 feature related commands "get supported features", "get feature",
and "set feature". The enumeration will help determine whether the driver
can issue any of the 3 commands to the device.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250220194438.2281088-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
With 'struct cxl_mailbox' context introduced, the helper functions
cxl_query_cmd() and cxl_send_cmd() can take a cxl_mailbox directly
rather than a cxl_memdev parameter. Refactor to use cxl_mailbox
directly.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250204220430.4146187-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Add support in GHES to detect and process CXL CPER Protocol errors, as
defined in UEFI v2.10, section N.2.13.
Define struct cxl_cper_prot_err_work_data to cache CXL protocol error
information, including RAS capabilities and severity, for further
handling.
These cached CXL CPER records will later be processed by workqueues
within the CXL subsystem.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250123084421.127697-5-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
In preparation to add tracepoint support, move protocol error UUID
definition to a common location, Also, make struct CXL RAS capability,
cxl_cper_sec_prot_err and CPER validation flags global for use across
different modules.
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250123084421.127697-3-Smita.KoralahalliChannabasappa@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
CXL spec 3.1 section 8.2.9.2.1.3 Table 8-47, Memory Module Event Record
has updated with following new fields and new info for Device Event Type
and Device Health Information fields.
1. Validity Flags
2. Component Identifier
3. Device Event Sub-Type
Update the Memory Module event record and Memory Module trace event for
the above spec changes. The new fields are inserted in logical places.
Example trace print of cxl_memory_module trace event,
cxl_memory_module: memdev=mem3 host=0000:0f:00.0 serial=3 log=Fatal : \
time=371709344709 uuid=fe927475-dd59-4339-a586-79bab113b774 len=128 \
flags='0x1' handle=2 related_handle=0 maint_op_class=0 \
maint_op_sub_class=0 : event_type='Temperature Change' \
event_sub_type='Unsupported Config Data' \
health_status='MAINTENANCE_NEEDED|REPLACEMENT_NEEDED' \
media_status='All Data Loss in Event of Power Loss' as_life_used=0x3 \
as_dev_temp=Normal as_cor_vol_err_cnt=Normal as_cor_per_err_cnt=Normal \
life_used=8 device_temp=3 dirty_shutdown_cnt=33 cor_vol_err_cnt=25 \
cor_per_err_cnt=45 validity_flags='COMPONENT|COMPONENT PLDM FORMAT' \
comp_id=03 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \
comp_id_pldm_valid_flags='Resource ID' \
pldm_entity_id=0x00 pldm_resource_id=fc d2 7e 2f
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250111091756.1682-6-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
CXL spec rev 3.1 section 8.2.9.2.1.1 Table 8-45, General Media Event
Record has updated with following new fields and new types for Memory
Event Type and Transaction Type fields.
1. Advanced Programmable Corrected Memory Error Threshold Event Flags
2. Corrected Memory Error Count at Event
3. Memory Event Sub-Type
The format of component identifier has changed (CXL spec 3.1 section
8.2.9.2.1 Table 8-44).
Update the general media event record and general media trace event for
the above spec changes. The new fields are inserted in logical places.
Example trace log of cxl_general_media trace event,
cxl_general_media: memdev=mem0 host=0000:0f:00.0 serial=3 log=Fatal : \
time=156831237413 uuid=fbcd0a77-c260-417f-85a9-088b1621eba6 len=128 \
flags='0x1' handle=1 related_handle=0 maint_op_class=2 \
maint_op_sub_class=4 : dpa=30d40 dpa_flags='' \
descriptor='UNCORRECTABLE_EVENT|THRESHOLD_EVENT|POISON_LIST_OVERFLOW' \
type='TE State Violation' sub_type='Media Link Command Training Error' \
transaction_type='Host Inject Poison' channel=3 rank=33 device=5 \
validity_flags='CHANNEL|RANK|DEVICE|COMPONENT|COMPONENT PLDM FORMAT' \
comp_id=03 74 c5 08 9a 1a 0b fc d2 7e 2f 31 9b 3c 81 4d \
comp_id_pldm_valid_flags='PLDM Entity ID | Resource ID' \
pldm_entity_id=74 c5 08 9a 1a 0b pldm_resource_id=fc d2 7e 2f \
hpa=ffffffffffffffff region= \
region_uuid=00000000-0000-0000-0000-000000000000 \
cme_threshold_ev_flags='Corrected Memory Errors in Multiple Media \
Components|Exceeded Programmable Threshold' cme_count=120
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Link: https://patch.msgid.link/20250111091756.1682-4-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
CXL spec 3.1 section 8.2.9.2.1 Table 8-42, Common Event Record format has
updated with Maintenance Operation Subclass information.
Add updates for the above spec change in the CXL events record and CXL
common trace event implementations.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Link: https://patch.msgid.link/20250111091756.1682-2-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Create a new 'struct cxl_mailbox' and move all mailbox related bits to
it. This allows isolation of all CXL mailbox data in order to export
some of the calls to external kernel callers and avoid exporting of CXL
driver specific bits such has device states. The allocation of
'struct cxl_mailbox' is also split out with cxl_mailbox_init() so the
mailbox can be created independently.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20240905223711.1990186-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Group all cxl related kernel headers into include/cxl/ directory.
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20240905223711.1990186-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>