474 Commits

Author SHA1 Message Date
Damodharam Ammepalli
f3fdd4fba1 ethtool: cmis_cdb: use correct rpl size in ethtool_cmis_module_poll()
rpl is passed as a pointer to ethtool_cmis_module_poll(), so the correct
size of rpl is sizeof(*rpl) which should be just 1 byte.  Using the
pointer size instead can cause stack corruption:

Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ethtool_cmis_wait_for_cond+0xf4/0x100
CPU: 72 UID: 0 PID: 4440 Comm: kworker/72:2 Kdump: loaded Tainted: G           OE      6.11.0 #24
Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Hardware name: Dell Inc. PowerEdge R760/04GWWM, BIOS 1.6.6 09/20/2023
Workqueue: events module_flash_fw_work
Call Trace:
 <TASK>
 panic+0x339/0x360
 ? ethtool_cmis_wait_for_cond+0xf4/0x100
 ? __pfx_status_success+0x10/0x10
 ? __pfx_status_fail+0x10/0x10
 __stack_chk_fail+0x10/0x10
 ethtool_cmis_wait_for_cond+0xf4/0x100
 ethtool_cmis_cdb_execute_cmd+0x1fc/0x330
 ? __pfx_status_fail+0x10/0x10
 cmis_cdb_module_features_get+0x6d/0xd0
 ethtool_cmis_cdb_init+0x8a/0xd0
 ethtool_cmis_fw_update+0x46/0x1d0
 module_flash_fw_work+0x17/0xa0
 process_one_work+0x179/0x390
 worker_thread+0x239/0x340
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xcc/0x100
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x2d/0x50
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Fixes: a39c84d79625 ("ethtool: cmis_cdb: Add a layer for supporting CDB commands")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250409173312.733012-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:41:19 -07:00
Ido Schimmel
eaa517b77e ethtool: cmis_cdb: Fix incorrect read / write length extension
The 'read_write_len_ext' field in 'struct ethtool_cmis_cdb_cmd_args'
stores the maximum number of bytes that can be read from or written to
the Local Payload (LPL) page in a single multi-byte access.

Cited commit started overwriting this field with the maximum number of
bytes that can be read from or written to the Extended Payload (LPL)
pages in a single multi-byte access. Transceiver modules that support
auto paging can advertise a number larger than 255 which is problematic
as 'read_write_len_ext' is a 'u8', resulting in the number getting
truncated and firmware flashing failing [1].

Fix by ignoring the maximum EPL access size as the kernel does not
currently support auto paging (even if the transceiver module does) and
will not try to read / write more than 128 bytes at once.

[1]
Transceiver module firmware flashing started for device enp177s0np0
Transceiver module firmware flashing in progress for device enp177s0np0
Progress: 0%
Transceiver module firmware flashing encountered an error for device enp177s0np0
Status message: Write FW block EPL command failed, LPL length is longer
	than CDB read write length extension allows.

Fixes: 9a3b0d078bd8 ("net: ethtool: Add support for writing firmware blocks using EPL payload")
Reported-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Closes: https://lore.kernel.org/netdev/20250402183123.321036-3-michael.chan@broadcom.com/
Tested-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250409112440.365672-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-10 14:32:43 +02:00
Maxime Chevallier
4f038a6a02 net: ethtool: Don't call .cleanup_data when prepare_data fails
There's a consistent pattern where the .cleanup_data() callback is
called when .prepare_data() fails, when it should really be called to
clean after a successful .prepare_data() as per the documentation.

Rewrite the error-handling paths to make sure we don't cleanup
un-prepared data.

Fixes: c781ff12a2f3 ("ethtool: Allow network drivers to dump arbitrary EEPROM data")
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250407130511.75621-1-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-08 15:34:15 +02:00
Stanislav Fomichev
04efcee6ef net: hold instance lock during NETDEV_CHANGE
Cosmin reports an issue with ipv6_add_dev being called from
NETDEV_CHANGE notifier:

[ 3455.008776]  ? ipv6_add_dev+0x370/0x620
[ 3455.010097]  ipv6_find_idev+0x96/0xe0
[ 3455.010725]  addrconf_add_dev+0x1e/0xa0
[ 3455.011382]  addrconf_init_auto_addrs+0xb0/0x720
[ 3455.013537]  addrconf_notify+0x35f/0x8d0
[ 3455.014214]  notifier_call_chain+0x38/0xf0
[ 3455.014903]  netdev_state_change+0x65/0x90
[ 3455.015586]  linkwatch_do_dev+0x5a/0x70
[ 3455.016238]  rtnl_getlink+0x241/0x3e0
[ 3455.019046]  rtnetlink_rcv_msg+0x177/0x5e0

Similarly, linkwatch might get to ipv6_add_dev without ops lock:
[ 3456.656261]  ? ipv6_add_dev+0x370/0x620
[ 3456.660039]  ipv6_find_idev+0x96/0xe0
[ 3456.660445]  addrconf_add_dev+0x1e/0xa0
[ 3456.660861]  addrconf_init_auto_addrs+0xb0/0x720
[ 3456.661803]  addrconf_notify+0x35f/0x8d0
[ 3456.662236]  notifier_call_chain+0x38/0xf0
[ 3456.662676]  netdev_state_change+0x65/0x90
[ 3456.663112]  linkwatch_do_dev+0x5a/0x70
[ 3456.663529]  __linkwatch_run_queue+0xeb/0x200
[ 3456.663990]  linkwatch_event+0x21/0x30
[ 3456.664399]  process_one_work+0x211/0x610
[ 3456.664828]  worker_thread+0x1cc/0x380
[ 3456.665691]  kthread+0xf4/0x210

Reclassify NETDEV_CHANGE as a notifier that consistently runs under the
instance lock.

Link: https://lore.kernel.org/netdev/aac073de8beec3e531c86c101b274d434741c28e.camel@nvidia.com/
Reported-by: Cosmin Ratiu <cratiu@nvidia.com>
Tested-by: Cosmin Ratiu <cratiu@nvidia.com>
Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250404161122.3907628-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-07 11:13:39 -07:00
Taehee Yoo
216a61d33c net: ethtool: fix ethtool_ringparam_get_cfg() returns a hds_thresh value always as 0.
When hds-thresh is configured, ethnl_set_rings() is called, and it calls
ethtool_ringparam_get_cfg() to get ringparameters from .get_ringparam()
callback and dev->cfg.
Both hds_config and hds_thresh values should be set from dev->cfg, not
from .get_ringparam().
But ethtool_ringparam_get_cfg() sets only hds_config from dev->cfg.
So, ethtool_ringparam_get_cfg() returns always a hds_thresh as 0.

If an input value of hds-thresh is 0, a hds_thresh value from
ethtool_ringparam_get_cfg() are same. So ethnl_set_rings() does
nothing and returns immediately.
It causes a bug that setting a hds-thresh value to 0 is not working.

Reproducer:
    modprobe netdevsim
    echo 1 > /sys/bus/netdevsim/new_device
    ethtool -G eth0 hds-thresh 100
    ethtool -G eth0 hds-thresh 0
    ethtool -g eth0
    #hds-thresh value should be 0, but it shows 100.

The tools/testing/selftests/drivers/net/hds.py can test it too with
applying a following patch for hds.py.

Fixes: 928459bbda19 ("net: ethtool: populate the default HDS params in the core")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250404122126.1555648-2-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-07 11:00:00 -07:00
Pauli Virtanen
983e0e4e87 net-timestamp: COMPLETION timestamp on packet tx completion
Add SOF_TIMESTAMPING_TX_COMPLETION, for requesting a software timestamp
when hardware reports a packet completed.

Completion tstamp is useful for Bluetooth, as hardware timestamps do not
exist in the HCI specification except for ISO packets, and the hardware
has a queue where packets may wait.  In this case the software SND
timestamp only reflects the kernel-side part of the total latency
(usually small) and queue length (usually 0 unless HW buffers
congested), whereas the completion report time is more informative of
the true latency.

It may also be useful in other cases where HW TX timestamps cannot be
obtained and user wants to estimate an upper bound to when the TX
probably happened.

Signed-off-by: Pauli Virtanen <pav@iki.fi>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-03-25 12:48:05 -04:00
Maxime Chevallier
79f88a584e net: ethtool: Export the link_mode_params definitions
link_mode_params contains a lookup table of all 802.3 link modes that
are currently supported with structured data about each mode's speed,
duplex, number of lanes and mediums.

As a preparation for a port representation, export that table for the
rest of the net stack to use.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-2-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18 09:03:11 +01:00
Ilpo Järvinen
023af5a72a gso: AccECN support
Handling the CWR flag differs between RFC 3168 ECN and AccECN.
With RFC 3168 ECN aware TSO (NETIF_F_TSO_ECN) CWR flag is cleared
starting from 2nd segment which is incompatible how AccECN handles
the CWR flag. Such super-segments are indicated by SKB_GSO_TCP_ECN.
With AccECN, CWR flag (or more accurately, the ACE field that also
includes ECE & AE flags) changes only when new packet(s) with CE
mark arrives so the flag should not be changed within a super-skb.
The new skb/feature flags are necessary to prevent such TSO engines
corrupting AccECN ACE counters by clearing the CWR flag (if the
CWR handling feature cannot be turned off).

If NIC is completely unaware of RFC3168 ECN (doesn't support
NETIF_F_TSO_ECN) or its TSO engine can be set to not touch CWR flag
despite supporting also NETIF_F_TSO_ECN, TSO could be safely used
with AccECN on such NIC. This should be evaluated per NIC basis
(not done in this patch series for any NICs).

For the cases, where TSO cannot keep its hands off the CWR flag,
a GSO fallback is provided by this patch.

Signed-off-by: Ilpo Järvinen <ij@kernel.org>
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-03-17 13:54:50 +00:00
Paolo Abeni
941defcea7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc6).

Conflicts:

tools/testing/selftests/drivers/net/ping.py
  75cc19c8ff89 ("selftests: drv-net: add xdp cases for ping.py")
  de94e8697405 ("selftests: drv-net: store addresses in dict indexed by ipver")
https://lore.kernel.org/netdev/20250311115758.17a1d414@canb.auug.org.au/

net/core/devmem.c
  a70f891e0fa0 ("net: devmem: do not WARN conditionally after netdev_rx_queue_restart()")
  1d22d3060b9b ("net: drop rtnl_lock for queue_mgmt operations")
https://lore.kernel.org/netdev/20250313114929.43744df1@canb.auug.org.au/

Adjacent changes:

tools/testing/selftests/net/Makefile
  6f50175ccad4 ("selftests: Add IPv6 link-local address generation tests for GRE devices.")
  2e5584e0f913 ("selftests/net: expand cmsg_ipv6.sh with ipv4")

drivers/net/ethernet/broadcom/bnxt/bnxt.c
  661958552eda ("eth: bnxt: do not use BNXT_VNIC_NTUPLE unconditionally in queue restart logic")
  fe96d717d38e ("bnxt_en: Extend queue stop/start for TX rings")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-13 23:08:11 +01:00
Kory Maincent
d0a4a1b36d net: ethtool: tsinfo: Fix dump command
Fix missing initialization of ts_info->phc_index in the dump command,
which could cause a netdev interface to incorrectly display a PTP provider
at index 0 instead of "none".
Fix it by initializing the phc_index to -1.

In the same time, restore missing initialization of ts_info.cmd for the
IOCTL case, as it was before the transition from ethnl_default_dumpit to
custom ethnl_tsinfo_dumpit.

Also, remove unnecessary zeroing of ts_info, as it is embedded within
reply_data, which is fully zeroed two lines earlier.

Fixes: b9e3f7dc9ed95 ("net: ethtool: tsinfo: Enhance tsinfo to support several hwtstamp by net topology")
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250307091255.463559-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-10 13:14:25 -07:00
Jakub Kicinski
8ef890df40 net: move misc netdev_lock flavors to a separate header
Move the more esoteric helpers for netdev instance lock to
a dedicated header. This avoids growing netdevice.h to infinity
and makes rebuilding the kernel much faster (after touching
the header with the helpers).

The main netdev_lock() / netdev_unlock() functions are used
in static inlines in netdevice.h and will probably be used
most commonly, so keep them in netdevice.h.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250307183006.2312761-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-08 09:06:50 -08:00
Eric Dumazet
f36a928582 net: ethtool: use correct device pointer in ethnl_default_dump_one()
ethnl_default_dump_one() operates on the device provided in its @dev
parameter, not from ctx->req_info->dev.

syzbot reported:

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000197: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000cb8-0x0000000000000cbf]
 RIP: 0010:netdev_need_ops_lock include/linux/netdevice.h:2792 [inline]
 RIP: 0010:netdev_lock_ops include/linux/netdevice.h:2803 [inline]
 RIP: 0010:ethnl_default_dump_one net/ethtool/netlink.c:557 [inline]
 RIP: 0010:ethnl_default_dumpit+0x447/0xd40 net/ethtool/netlink.c:593
Call Trace:
 <TASK>
  genl_dumpit+0x10d/0x1b0 net/netlink/genetlink.c:1027
  netlink_dump+0x64d/0xe10 net/netlink/af_netlink.c:2309
  __netlink_dump_start+0x5a2/0x790 net/netlink/af_netlink.c:2424
  genl_family_rcv_msg_dumpit net/netlink/genetlink.c:1076 [inline]
  genl_family_rcv_msg net/netlink/genetlink.c:1192 [inline]
  genl_rcv_msg+0x894/0xec0 net/netlink/genetlink.c:1210
  netlink_rcv_skb+0x206/0x480 net/netlink/af_netlink.c:2534
  genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
  netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
  netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1339
  netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1883
  sock_sendmsg_nosec net/socket.c:709 [inline]
  __sock_sendmsg+0x221/0x270 net/socket.c:724
  ____sys_sendmsg+0x53a/0x860 net/socket.c:2564
  ___sys_sendmsg net/socket.c:2618 [inline]
  __sys_sendmsg+0x269/0x350 net/socket.c:2650

Fixes: 2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock")
Reported-by: syzbot+3da2442641f0c6a705a2@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/67caaf5e.050a0220.15b4b9.007a.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250307083544.1659135-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07 19:10:39 -08:00
Jakub Kicinski
2525e16a2b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc6).

Conflicts:

net/ethtool/cabletest.c
  2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock")
  637399bf7e77 ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device")

No Adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06 13:03:35 -08:00
Jakub Kicinski
2bcf4772e4 net: ethtool: try to protect all callback with netdev instance lock
Protect all ethtool callbacks and PHY related state with the netdev
instance lock, for drivers which want / need to have their ops
instance-locked. Basically take the lock everywhere we take rtnl_lock.
It was tempting to take the lock in ethnl_ops_begin(), but turns
out we actually nest those calls (when generating notifications).

Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250305163732.2766420-11-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06 12:59:44 -08:00
Maxime Chevallier
637399bf7e net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device
ethnl_req_get_phydev() is used to lookup a phy_device, in the case an
ethtool netlink command targets a specific phydev within a netdev's
topology.

It takes as a parameter a const struct nlattr *header that's used for
error handling :

       if (!phydev) {
               NL_SET_ERR_MSG_ATTR(extack, header,
                                   "no phy matching phyindex");
               return ERR_PTR(-ENODEV);
       }

In the notify path after a ->set operation however, there's no request
attributes available.

The typical callsite for the above function looks like:

	phydev = ethnl_req_get_phydev(req_base, tb[ETHTOOL_A_XXX_HEADER],
				      info->extack);

So, when tb is NULL (such as in the ethnl notify path), we have a nice
crash.

It turns out that there's only the PLCA command that is in that case, as
the other phydev-specific commands don't have a notification.

This commit fixes the crash by passing the cmd index and the nlattr
array separately, allowing NULL-checking it directly inside the helper.

Fixes: c15e065b46dc ("net: ethtool: Allow passing a phy index for some commands")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Reported-by: Parthiban Veerasooran <parthiban.veerasooran@microchip.com>
Link: https://patch.msgid.link/20250301141114.97204-1-maxime.chevallier@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-04 17:12:01 -08:00
Jakub Kicinski
357660d759 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc5).

Conflicts:

drivers/net/ethernet/cadence/macb_main.c
  fa52f15c745c ("net: cadence: macb: Synchronize stats calculations")
  75696dd0fd72 ("net: cadence: macb: Convert to get_stats64")
https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/intel/ice/ice_sriov.c
  79990cf5e7ad ("ice: Fix deinitializing VF in error path")
  a203163274a4 ("ice: simplify VF MSI-X managing")

net/ipv4/tcp.c
  18912c520674 ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace")
  297d389e9e5b ("net: prefix devmem specific helpers")

net/mptcp/subflow.c
  8668860b0ad3 ("mptcp: reset when MPTCP opts are dropped after join")
  c3349a22c200 ("mptcp: consolidate subflow cleanup")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27 10:20:58 -08:00
Gal Pressman
ecdff89338 ethtool: Symmetric OR-XOR RSS hash
Add an additional type of symmetric RSS hash type: OR-XOR.
The "Symmetric-OR-XOR" algorithm transforms the input as follows:

(SRC_IP | DST_IP, SRC_IP ^ DST_IP, SRC_PORT | DST_PORT, SRC_PORT ^ DST_PORT)

Change 'cap_rss_sym_xor_supported' to 'supported_input_xfrm', a bitmap
of supported RXH_XFRM_* types.

Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20250224174416.499070-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25 18:31:04 -08:00
Jakub Kicinski
db10fde5c4 net: ethtool: fix ioctl confusing drivers about desired HDS user config
The legacy ioctl path does not have support for extended attributes.
So we issue a GET to fetch the current settings from the driver,
in an attempt to keep them unchanged. HDS is a bit "special" as
the GET only returns on/off while the SET takes a "ternary" argument
(on/off/default). If the driver was in the "default" setting -
executing the ioctl path binds it to on or off, even tho the user
did not intend to change HDS config.

Factor the relevant logic out of the netlink code and reuse it.

Fixes: 87c8f8496a05 ("bnxt_en: add support for tcp-data-split ethtool command")
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Tested-by: Daniel Xu <dxu@dxuuu.xyz>
Tested-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250221025141.1132944-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-24 14:15:42 -08:00
Jakub Kicinski
637026e591 net: move stale comment about ntuple validation
Gal points out that the comment now belongs further down, since
the original if condition was split into two in
commit de7f7582dff2 ("net: ethtool: prevent flow steering to RSS contexts which don't exist")

Link: https://lore.kernel.org/de4a2a8a-1eb9-4fa8-af87-7526e58218e9@nvidia.com
Reviewed-by: Gal Pressman <gal@nvidia.com>
Link: https://patch.msgid.link/20250214224340.2268691-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-17 16:47:01 -08:00
Jakub Kicinski
7a7e019713 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc3).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13 12:43:30 -08:00
Jakub Kicinski
de7f7582df net: ethtool: prevent flow steering to RSS contexts which don't exist
Since commit 42dc431f5d0e ("ethtool: rss: prevent rss ctx deletion
when in use") we prevent removal of RSS contexts pointed to by
existing flow rules. Core should also prevent creation of rules
which point to RSS context which don't exist in the first place.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250206235334.1425329-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-10 08:26:50 -08:00
Kory Maincent
6a774228e8 net: ethtool: tsconfig: Fix netlink type of hwtstamp flags
Fix the netlink type for hardware timestamp flags, which are represented
as a bitset of flags. Although only one flag is supported currently, the
correct netlink bitset type should be used instead of u32 to keep
consistency with other fields. Address this by adding a new named string
set description for the hwtstamp flag structure.

The code has been introduced in the current release so the uAPI change is
still okay.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Fixes: 6e9e2eed4f39 ("net: ethtool: Add support for tsconfig command to get/set hwtstamp config")
Link: https://patch.msgid.link/20250205110304.375086-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06 16:35:21 -08:00
Jakub Kicinski
ba6ec09911 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc2).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06 15:19:00 -08:00
Jianbo Liu
4897f9b7f8 ethtool: Add support for 200Gbps per lane link modes
Define 200G, 400G and 800G link modes using 200Gbps per lane.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-06 10:14:01 +01:00
Jakub Kicinski
2b91cc1214 ethtool: ntuple: fix rss + ring_cookie check
The info.flow_type is for RXFH commands, ntuple flow_type is inside
the flow spec. The check currently does nothing, as info.flow_type
is 0 (or even uninitialized by user space) for ETHTOOL_SRXCLSRLINS.

Fixes: 9e43ad7a1ede ("net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 18:38:58 -08:00
Jakub Kicinski
244f8aa46f ethtool: rss: fix hiding unsupported fields in dumps
Commit ec6e57beaf8b ("ethtool: rss: don't report key if device
doesn't support it") intended to stop reporting key fields for
additional rss contexts if device has a global hashing key.

Later we added dump support and the filtering wasn't properly
added there. So we end up reporting the key fields in dumps
but not in dos:

  # ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --do rss-get \
		--json '{"header": {"dev-index":2}, "context": 1 }'
  {
     "header": { ... },
     "context": 1,
     "indir": [0, 1, 2, 3, ...]]
  }

  # ./pyynl/cli.py --spec netlink/specs/ethtool.yaml --dump rss-get
  [
     ... snip context 0 ...
     { "header": { ... },
       "context": 1,
       "indir": [0, 1, 2, 3, ...],
 ->    "input_xfrm": 255,
 ->    "hfunc": 1,
 ->    "hkey": "000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000"
     }
  ]

Hide these fields correctly.

The drivers/net/hw/rss_ctx.py selftest catches this when run on
a device with single key, already:

  # Check| At /root/./ksft-net-drv/drivers/net/hw/rss_ctx.py, line 381, in test_rss_context_dump:
  # Check|     ksft_ne(set(data.get('hkey', [1])), {0}, "key is all zero")
  # Check failed {0} == {0} key is all zero
  not ok 8 rss_ctx.test_rss_context_dump

Fixes: f6122900f4e2 ("ethtool: rss: support dumping RSS contexts")
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250201013040.725123-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-03 18:38:52 -08:00
Gal Pressman
4f5a52adeb ethtool: Fix set RXNFC command with symmetric RSS hash
The sanity check that both source and destination are set when symmetric
RSS hash is requested is only relevant for ETHTOOL_SRXFH (rx-flow-hash),
it should not be performed on any other commands (e.g.
ETHTOOL_SRXCLSRLINS/ETHTOOL_SRXCLSRLDEL).

This resolves accessing uninitialized 'info.data' field, and fixes false
errors in rule insertion:
  # ethtool --config-ntuple eth2 flow-type ip4 dst-ip 255.255.255.255 action -1 loc 0
  rmgr: Cannot insert RX class rule: Invalid argument
  Cannot insert classification rule

Fixes: 13e59344fb9d ("net: ethtool: add support for symmetric-xor RSS hash")
Cc: Ahmed Zaki <ahmed.zaki@intel.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://patch.msgid.link/20250126191845.316589-1-gal@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-28 12:25:42 +01:00
Paolo Abeni
cf33d96f50 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts and no adjacent changes.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-21 10:24:33 +01:00
Jakub Kicinski
928459bbda net: ethtool: populate the default HDS params in the core
The core has the current HDS config, it can pre-populate the values
for the drivers. While at it, remove the zero-setting in netdevsim.
Zero are the default values since the config is zalloc'ed.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:58 -08:00
Jakub Kicinski
32ad1f7a05 net: provide pending ring configuration in net_device
Record the pending configuration in net_device struct.
ethtool core duplicates the current config and the specific
handlers (for now just ringparam) can modify it.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:57 -08:00
Jakub Kicinski
743dea746e net: ethtool: store netdev in a temp variable in ethnl_default_set_doit()
For ease of review of the next patch store the dev pointer
on the stack, instead of referring to req_info.dev every time.

No functional changes.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:57 -08:00
Jakub Kicinski
3c836451ca net: move HDS config from ethtool state
Separate the HDS config from the ethtool state struct.
The HDS config contains just simple parameters, not state.
Having it as a separate struct will make it easier to clone / copy
and also long term potentially make it per-queue.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250119020518.1962249-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-20 11:44:57 -08:00
Antoine Tenart
12e070eb69 net: avoid race between device unregistration and ethnl ops
The following trace can be seen if a device is being unregistered while
its number of channels are being modified.

  DEBUG_LOCKS_WARN_ON(lock->magic != lock)
  WARNING: CPU: 3 PID: 3754 at kernel/locking/mutex.c:564 __mutex_lock+0xc8a/0x1120
  CPU: 3 UID: 0 PID: 3754 Comm: ethtool Not tainted 6.13.0-rc6+ #771
  RIP: 0010:__mutex_lock+0xc8a/0x1120
  Call Trace:
   <TASK>
   ethtool_check_max_channel+0x1ea/0x880
   ethnl_set_channels+0x3c3/0xb10
   ethnl_default_set_doit+0x306/0x650
   genl_family_rcv_msg_doit+0x1e3/0x2c0
   genl_rcv_msg+0x432/0x6f0
   netlink_rcv_skb+0x13d/0x3b0
   genl_rcv+0x28/0x40
   netlink_unicast+0x42e/0x720
   netlink_sendmsg+0x765/0xc20
   __sys_sendto+0x3ac/0x420
   __x64_sys_sendto+0xe0/0x1c0
   do_syscall_64+0x95/0x180
   entry_SYSCALL_64_after_hwframe+0x76/0x7e

This is because unregister_netdevice_many_notify might run before the
rtnl lock section of ethnl operations, eg. set_channels in the above
example. In this example the rss lock would be destroyed by the device
unregistration path before being used again, but in general running
ethnl operations while dismantle has started is not a good idea.

Fix this by denying any operation on devices being unregistered. A check
was already there in ethnl_ops_begin, but not wide enough.

Note that the same issue cannot be seen on the ioctl version
(__dev_ethtool) because the device reference is retrieved from within
the rtnl lock section there. Once dismantle started, the net device is
unlisted and no reference will be found.

Fixes: dde91ccfa25f ("ethtool: do not perform operations on net devices being unregistered")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20250116092159.50890-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:37:31 -08:00
Vladimir Oltean
6a128cdf19 net: ethtool: ts: add separate counter for unconfirmed one-step TX timestamps
For packets with two-step timestamp requests, the hardware timestamp
comes back to the driver through a confirmation mechanism of sorts,
which allows the driver to confidently bump the successful "pkts"
counter.

For one-step PTP, the NIC is supposed to autonomously insert its
hardware TX timestamp in the packet headers while simultaneously
transmitting it. There may be a confirmation that this was done
successfully, or there may not.

None of the current drivers which implement ethtool_ops :: get_ts_stats()
also support HWTSTAMP_TX_ONESTEP_SYNC or HWTSTAMP_TX_ONESTEP_SYNC, so it
is a bit unclear which model to follow. But there are NICs, such as DSA,
where there is no transmit confirmation at all. Here, it would be wrong /
misleading to increment the successful "pkts" counter, because one-step
PTP packets can be dropped on TX just like any other packets.

So introduce a special counter which signifies "yes, an attempt was made,
but we don't know whether it also exited the port or not". I expect that
for one-step PTP packets where a confirmation is available, the "pkts"
counter would be bumped.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250116104628.123555-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 20:01:09 -08:00
Taehee Yoo
e61779015c net: ethtool: add ring parameter filtering
While the devmem is running, the tcp-data-split and
hds-thresh configuration should not be changed.
If user tries to change tcp-data-split and threshold value while the
devmem is running, it fails and shows extack message.

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250114142852.3364986-5-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15 14:42:11 -08:00
Taehee Yoo
eec8359f07 net: ethtool: add support for configuring hds-thresh
The hds-thresh option configures the threshold value of
the header-data-split.
If a received packet size is larger than this threshold value, a packet
will be split into header and payload.
The header indicates TCP and UDP header, but it depends on driver spec.
The bnxt_en driver supports HDS(Header-Data-Split) configuration at
FW level, affecting TCP and UDP too.
So, If hds-thresh is set, it affects UDP and TCP packets.

Example:
   # ethtool -G <interface name> hds-thresh <value>

   # ethtool -G enp14s0f0np0 tcp-data-split on hds-thresh 256
   # ethtool -g enp14s0f0np0
   Ring parameters for enp14s0f0np0:
   Pre-set maximums:
   ...
   HDS thresh:  1023
   Current hardware settings:
   ...
   TCP data split:         on
   HDS thresh:  256

The default/min/max values are not defined in the ethtool so the drivers
should define themself.
The 0 value means that all TCP/UDP packets' header and payload
will be split.

Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250114142852.3364986-3-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15 14:42:11 -08:00
Taehee Yoo
197258f0ef net: ethtool: add hds_config member in ethtool_netdev_state
When tcp-data-split is UNKNOWN mode, drivers arbitrarily handle it.
For example, bnxt_en driver automatically enables if at least one of
LRO/GRO/JUMBO is enabled.
If tcp-data-split is UNKNOWN and LRO is enabled, a driver returns
ENABLES of tcp-data-split, not UNKNOWN.
So, `ethtool -g eth0` shows tcp-data-split is enabled.

The problem is in the setting situation.
In the ethnl_set_rings(), it first calls get_ringparam() to get the
current driver's config.
At that moment, if driver's tcp-data-split config is UNKNOWN, it returns
ENABLE if LRO/GRO/JUMBO is enabled.
Then, it sets values from the user and driver's current config to
kernel_ethtool_ringparam.
Last it calls .set_ringparam().
The driver, especially bnxt_en driver receives
ETHTOOL_TCP_DATA_SPLIT_ENABLED.
But it can't distinguish whether it is set by the user or just the
current config.

When user updates ring parameter, the new hds_config value is updated
and current hds_config value is stored to old_hdsconfig.
Driver's .set_ringparam() callback can distinguish a passed
tcp-data-split value is came from user explicitly.
If .set_ringparam() is failed, hds_config is rollbacked immediately.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250114142852.3364986-2-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15 14:42:11 -08:00
Kory Maincent
3e9dbfec49 net: pse-pd: Split ethtool_get_status into multiple callbacks
The ethtool_get_status callback currently handles all status and PSE
information within a single function. This approach has two key
drawbacks:

1. If the core requires some information for purposes other than
   ethtool_get_status, redundant code will be needed to fetch the same
   data from the driver (like is_enabled).

2. Drivers currently have access to all information passed to ethtool.
   New variables will soon be added to ethtool status, such as PSE ID,
   power domain IDs, and budget evaluation strategies, which are meant
   to be managed solely by the core. Drivers should not have the ability
   to modify these variables.

To resolve these issues, ethtool_get_status has been split into multiple
callbacks, with each handling a specific piece of information required
by ethtool or the core.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14 13:56:32 +01:00
Jakub Kicinski
6167c0b6e8 net: ethtool: add support for structured PHY statistics
Introduce a new way to report PHY statistics in a structured and
standardized format using the netlink API. This new method does not
replace the old driver-specific stats, which can still be accessed with
`ethtool -S <eth name>`. The structured stats are available with
`ethtool -S <eth name> --all-groups`.

This new method makes it easier to diagnose problems by organizing stats
in a consistent and documented way.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14 11:44:19 +01:00
Jakub Kicinski
b7a2c1fe6b net: ethtool: plumb PHY stats to PHY drivers
Introduce support for standardized PHY statistics reporting in ethtool
by extending the PHYLIB framework. Add the functions
phy_ethtool_get_phy_stats() and phy_ethtool_get_link_ext_stats() to
provide a consistent interface for retrieving PHY-level and
link-specific statistics. These functions are used within the ethtool
implementation to avoid direct access to the phy_device structure
outside of the PHYLIB framework.

A new structure, ethtool_phy_stats, is introduced to standardize PHY
statistics such as packet counts, byte counts, and error counters.
Drivers are updated to include callbacks for retrieving PHY and
link-specific statistics, ensuring values are explicitly set only for
supported fields, initialized with ETHTOOL_STAT_NOT_SET to avoid
ambiguity.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14 11:44:19 +01:00
Oleksij Rempel
fe55b1d401 ethtool: linkstate: migrate linkstate functions to support multi-PHY setups
Adapt linkstate_get_sqi() and linkstate_get_sqi_max() to take a
phy_device argument directly, enabling support for setups with
multiple PHYs. The previous assumption of a single PHY attached to
a net_device no longer holds.

Use ethnl_req_get_phydev() to identify the appropriate PHY device
for the operation. Update linkstate_prepare_data() and related
logic to accommodate this change, ensuring compatibility with
multi-PHY configurations.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14 11:44:19 +01:00
Li RongQing
b493f881aa net: ethtool: Use hwprov under rcu_read_lock
hwprov should be protected by rcu_read_lock to prevent possible UAF

Fixes: 4c61d809cf60 ("net: ethtool: Fix suspicious rcu_dereference usage")
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Acked-by: Kory Maincent <kory.maincent@bootlin.com>

diff with v1: move and use err varialbe, instead of define a new variable

 net/ethtool/common.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Link: https://patch.msgid.link/20250109111057.4746-1-lirongqing@baidu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-10 18:14:24 -08:00
Kory Maincent
4c61d809cf net: ethtool: Fix suspicious rcu_dereference usage
The __ethtool_get_ts_info function can be called with or without the
rtnl lock held. When the rtnl lock is not held, using rtnl_dereference()
triggers a warning due to the lack of lock context.

Add an rcu_read_lock() to ensure the lock is acquired and to maintain
synchronization.

Reported-by: syzbot+a344326c05c98ba19682@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/676147f8.050a0220.37aaf.0154.GAE@google.com/
Fixes: b9e3f7dc9ed9 ("net: ethtool: tsinfo: Enhance tsinfo to support several hwtstamp by net topology")
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241220083741.175329-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-23 10:38:46 -08:00
Kory Maincent
6e9e2eed4f net: ethtool: Add support for tsconfig command to get/set hwtstamp config
Introduce support for ETHTOOL_MSG_TSCONFIG_GET/SET ethtool netlink socket
to read and configure hwtstamp configuration of a PHC provider. Note that
simultaneous hwtstamp isn't supported; configuring a new one disables the
previous setting.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16 12:51:41 +00:00
Kory Maincent
b9e3f7dc9e net: ethtool: tsinfo: Enhance tsinfo to support several hwtstamp by net topology
Either the MAC or the PHY can provide hwtstamp, so we should be able to
read the tsinfo for any hwtstamp provider.

Enhance 'get' command to retrieve tsinfo of hwtstamp providers within a
network topology.

Add support for a specific dump command to retrieve all hwtstamp
providers within the network topology, with added functionality for
filtered dump to target a single interface.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16 12:51:41 +00:00
Kory Maincent
910c4788d6 ethtool: Fix wrong mod state in case of verbose and no_mask bitset
A bitset without mask in a _SET request means we want exactly the bits in
the bitset to be set. This works correctly for compact format but when
verbose format is parsed, ethnl_update_bitset32_verbose() only sets the
bits present in the request bitset but does not clear the rest. The commit
6699170376ab ("ethtool: fix application of verbose no_mask bitset") fixes
this issue by clearing the whole target bitmap before we start iterating.
The solution proposed brought an issue with the behavior of the mod
variable. As the bitset is always cleared the old value will always
differ to the new value.

Fix it by adding a new function to compare bitmaps and a temporary variable
which save the state of the old bitmap.

Fixes: 6699170376ab ("ethtool: fix application of verbose no_mask bitset")
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Link: https://patch.msgid.link/20241202153358.1142095-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 18:54:43 -08:00
Gal Pressman
9407190947 ethtool: Fix access to uninitialized fields in set RXNFC command
The check for non-zero ring with RSS is only relevant for
ETHTOOL_SRXCLSRLINS command, in other cases the check tries to access
memory which was not initialized by the userspace tool. Only perform the
check in case of ETHTOOL_SRXCLSRLINS.

Without this patch, filter deletion (for example) could statistically
result in a false error:
  # ethtool --config-ntuple eth3 delete 484
  rmgr: Cannot delete RX class rule: Invalid argument
  Cannot delete classification rule

Fixes: 9e43ad7a1ede ("net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in")
Link: https://lore.kernel.org/netdev/871a9ecf-1e14-40dd-bbd7-e90c92f89d47@nvidia.com/
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20241202164805.1637093-1-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-03 18:42:28 -08:00
Kees Cook
1cfb5e5788 Revert "net: ethtool: Avoid thousands of -Wflex-array-member-not-at-end warnings"
This reverts commit 3bd9b9abdf1563a22041b7255baea6d449902f1a. We cannot
use the new tagged struct group because it throws C++ errors even under
"extern C".

Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20241115204308.3821419-1-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18 18:52:11 -08:00
Edward Cree
a64499f618 net: ethtool: account for RSS+RXNFC add semantics when checking channel count
In ethtool_check_max_channel(), the new RX count must not only cover the
 max queue indices in RSS indirection tables and RXNFC destinations
 separately, but must also, for RXNFC rules with FLOW_RSS, cover the sum
 of the destination queue and the maximum index in the associated RSS
 context's indirection table, since that is the highest queue that the
 rule can actually deliver traffic to.
It could be argued that the max queue across all custom RSS contexts
 (ethtool_get_max_rss_ctx_channel()) need no longer be considered, since
 any context to which packets can actually be delivered will be targeted
 by some RXNFC rule and its max will thus be allowed for by
 ethtool_get_max_rxnfc_channel().  For simplicity we keep both checks, so
 even RSS contexts unused by any RXNFC rule must fit the channel count.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/43257d375434bef388e36181492aa4c458b88336.1731499022.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14 19:53:42 -08:00
Edward Cree
9e43ad7a1e net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in
Ethtool ntuple filters with FLOW_RSS were originally defined as adding
 the base queue ID (ring_cookie) to the value from the indirection table,
 so that the same table could distribute over more than one set of queues
 when used by different filters.
However, some drivers / hardware ignore the ring_cookie, and simply use
 the indirection table entries as queue IDs directly.  Thus, for drivers
 which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to
 declare that they support the original (addition) semantics, reject in
 ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring.
(For a ring_cookie of zero, both behaviours are equivalent.)
Set the cap bit in sfc, as it is known to support this feature.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/cc3da0844083b0e301a33092a6299e4042b65221.1731499022.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14 19:53:41 -08:00