commit 57619f3cdeb5ae9f4252833b0ed600e9f81da722
Author: Bart Van Assche
Date: Thu Jun 13 14:18:27 2024 -0700

scsi: usb: uas: Do not query the IO Advice Hints Grouping mode page for USB/UAS devices

Recently it was reported that the following USB storage devices are unusable with Linux kernel 6.9:
* Kingston DataTraveler G2
* Garmin FR35

This is because attempting to read the IO Advice Hints Grouping mode page causes these devices to reset. Hence do not read the IO Advice Hints Grouping mode page from USB/UAS storage devices.

Acked-by: Alan Stern
Cc: stable@vger.kernel.org
Fixes: 4f53138fffc2 ("scsi: sd: Translate data lifetime information")
Reported-by: Joao Machado
Closes: https://lore.kernel.org/linux-scsi/20240130214911.1863909-1-bvanassche@acm.org/T/#mf4e3410d8f210454d7e4c3d1fb5c0f41e651b85f
Tested-by: Andy Shevchenko
Bisected-by: Christian Heusel
Reported-by: Andy Shevchenko
Closes: https://lore.kernel.org/linux-scsi/CACLx9VdpUanftfPo2jVAqXdcWe8Y43MsDeZmMPooTzVaVJAh2w@mail.gmail.com/
Signed-off-by: Bart Van Assche
Link: https://lore.kernel.org/r/20240613211828.2077477-3-bvanassche@acm.org
Signed-off-by: Martin K. Petersen

commit 633aeefafc9c2a07a76a62be6aac1d73c3e3defa
Author: Bart Van Assche
Date: Thu Jun 13 14:18:26 2024 -0700

scsi: core: Introduce the BLIST_SKIP_IO_HINTS flag

Prepare for skipping the IO Advice Hints Grouping mode page for USB storage devices.

Cc: Alan Stern
Cc: Joao Machado
Cc: Andy Shevchenko
Cc: Christian Heusel
Cc: stable@vger.kernel.org
Fixes: 4f53138fffc2 ("scsi: sd: Translate data lifetime information")
Signed-off-by: Bart Van Assche
Link: https://lore.kernel.org/r/20240613211828.2077477-2-bvanassche@acm.org
Signed-off-by: Martin K. Petersen

commit 135c6eb27a85c8b261a2cc1f5093abcda6ee9010
Author: Joel Slebodnick
Date: Thu Jun 13 14:27:28 2024 -0400

scsi: ufs: core: Free memory allocated for model before reinit

Under the conditions that a device is to be reinitialized within ufshcd_probe_hba(), the device must first be fully reset.
Resetting the device should include freeing U8 model (member of dev_info) but does not, and this causes a memory leak. ufs_put_device_desc() is responsible for freeing model.

unreferenced object 0xffff3f63008bee60 (size 32):
  comm "kworker/u33:1", pid 60, jiffies 4294892642
  hex dump (first 32 bytes):
    54 48 47 4a 46 47 54 30 54 32 35 42 41 5a 5a 41  THGJFGT0T25BAZZA
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace (crc ed7ff1a9):
    [] kmemleak_alloc+0x34/0x40
    [] __kmalloc_noprof+0x1e4/0x2fc
    [] ufshcd_read_string_desc+0x94/0x190
    [] ufshcd_device_init+0x480/0xdf8
    [] ufshcd_probe_hba+0x3c/0x404
    [] ufshcd_async_scan+0x40/0x370
    [] async_run_entry_fn+0x34/0xe0
    [] process_one_work+0x154/0x298
    [] worker_thread+0x2f8/0x408
    [] kthread+0x114/0x118
    [] ret_from_fork+0x10/0x20

Fixes: 96a7141da332 ("scsi: ufs: core: Add support for reinitializing the UFS device")
Cc:
Reviewed-by: Andrew Halaney
Reviewed-by: Bart Van Assche
Signed-off-by: Joel Slebodnick
Link: https://lore.kernel.org/r/20240613200202.2524194-1-jslebodn@redhat.com
Signed-off-by: Martin K. Petersen

commit 90e6f08915ec6efe46570420412a65050ec826b2
Author: Damien Le Moal
Date: Tue Jun 11 17:34:35 2024 +0900

scsi: mpi3mr: Fix ATA NCQ priority support

The function mpi3mr_qcmd() of the mpi3mr driver is able to indicate to the HBA if a read or write command directed at an ATA device should be translated to an NCQ read/write command with the high priority bit set when the request uses the RT priority class and the user has enabled NCQ priority through sysfs. However, unlike the mpt3sas driver, the mpi3mr driver does not define the sas_ncq_prio_supported and sas_ncq_prio_enable sysfs attributes, so the ncq_prio_enable field of struct mpi3mr_sdev_priv_data is never actually set and NCQ Priority cannot ever be used.

Fix this by defining these missing attributes to allow a user to check if an ATA device supports NCQ priority and to enable/disable the use of NCQ priority.
To do this, lift the function scsih_ncq_prio_supp() out of the mpt3sas driver and make it the generic SCSI SAS transport function sas_ata_ncq_prio_supported(). Nothing in that function is hardware specific, so this function can be used in both the mpt3sas driver and the mpi3mr driver.

Reported-by: Scott McCoy
Fixes: 023ab2a9b4ed ("scsi: mpi3mr: Add support for queue command processing")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal
Link: https://lore.kernel.org/r/20240611083435.92961-1-dlemoal@kernel.org
Reviewed-by: Niklas Cassel
Signed-off-by: Martin K. Petersen

commit 95f8bf932b46cd5c17c681d67be9234551234eac
Author: Jeff Johnson
Date: Mon Jun 10 09:16:15 2024 -0700

scsi: Add missing MODULE_DESCRIPTION() macros

On x86, make allmodconfig && make W=1 C=1 reports:

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/scsi_common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/advansys.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/BusLogic.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/aha1740.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/isci/isci.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/elx/efct.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/atp870u.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/ppa.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/scsi/imm.o

Add all missing invocations of the MODULE_DESCRIPTION() macro.

This updates all files which have a MODULE_LICENSE() but which do not have a MODULE_DESCRIPTION(), even ones which did not produce the x86 allmodconfig warnings.

Acked-by: Finn Thain
Signed-off-by: Jeff Johnson
Link: https://lore.kernel.org/r/20240610-md-drivers-scsi-v3-1-055da78d66b2@quicinc.com
Signed-off-by: Martin K.
Petersen

commit 77691af484e28af7a692e511b9ed5ca63012ec6e
Author: Ziqi Chen
Date: Fri Jun 7 18:06:23 2024 +0800

scsi: ufs: core: Quiesce request queues before checking pending cmds

In ufshcd_clock_scaling_prepare(), after the SCSI layer is blocked, ufshcd_pending_cmds() is called to check whether there are pending transactions or not. Only if there are no pending transactions can we proceed to kickstart the clock scaling sequence.

ufshcd_pending_cmds() traverses all SCSI devices and calls sbitmap_weight() on their budget_map. sbitmap_weight() can be broken down into three steps:

1. Calculate the number of outstanding bits set in the 'word' bitmap.
2. Calculate the number of outstanding bits set in the 'cleared' bitmap.
3. Subtract the result of step 2 from the result of step 1.

This can lead to a race condition as outlined below:

Assume there is one pending transaction in the request queue of one SCSI device, say sda, and the budget token of this request is 0, the 'word' is 0x1 and the 'cleared' is 0x0.

1. When step 1 executes, it gets the result as 1.
2. Before step 2 executes, the block layer tries to dispatch a new request to sda. Since the SCSI layer is blocked, the request cannot pass through SCSI but the block layer would do budget_get() and budget_put() to sda's budget map regardless, so the 'word' has become 0x3 and 'cleared' has become 0x2 (assume the new request got budget token 1).
3. When step 2 executes, it gets the result as 1.
4. When step 3 executes, it gets the result as 0, meaning there are no pending transactions, which is wrong.

Thread A                        Thread B
ufshcd_pending_cmds()           __blk_mq_sched_dispatch_requests()
|                               |
sbitmap_weight(word)            |
|                               scsi_mq_get_budget()
|                               |
|                               scsi_mq_put_budget()
|                               |
sbitmap_weight(cleared)
...

When this race condition happens, the clock scaling sequence is started with transactions still in flight, leading to subsequent hibernate enter failure, broken link, task abort and back to back error recovery.
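The interleaving described above can be replayed deterministically in user space. The following is a minimal sketch and not the kernel's sbitmap code: struct toy_sbitmap, racy_weight() and quiesced_weight() are hypothetical names, and a single unsigned long stands in for sda's budget map.

```c
/* Hypothetical stand-in for one sbitmap word; not the kernel structure. */
struct toy_sbitmap {
	unsigned long word;    /* bits handed out as budget tokens */
	unsigned long cleared; /* bits released but not yet folded back */
};

/* Count the bits set in v. */
static unsigned int popcount_ul(unsigned long v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/*
 * Replays the scenario from the commit message: token 0 is in flight and a
 * get/put pair for token 1 lands between the two reads.
 */
unsigned int racy_weight(void)
{
	struct toy_sbitmap sda = { .word = 0x1, .cleared = 0x0 };
	unsigned int set = popcount_ul(sda.word);        /* step 1: 1 */

	sda.word |= 0x2;     /* budget get for token 1 */
	sda.cleared |= 0x2;  /* budget put for token 1 */

	unsigned int cleared = popcount_ul(sda.cleared); /* step 2: 1 */

	return set - cleared; /* step 3: 0, although token 0 is still pending */
}

/* Same calculation with nothing touching the map between the two reads. */
unsigned int quiesced_weight(void)
{
	struct toy_sbitmap sda = { .word = 0x1, .cleared = 0x0 };

	return popcount_ul(sda.word) - popcount_ul(sda.cleared);
}
```

In this toy model racy_weight() reports zero pending commands while quiesced_weight() reports one, which is the discrepancy the message attributes to the budget map changing between the two reads.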
Fix this race condition by quiescing the request queues before calling ufshcd_pending_cmds() so that the block layer won't touch the budget map when ufshcd_pending_cmds() is working on it. In addition, remove the SCSI layer blocking/unblocking to reduce redundancies and latencies.

Fixes: 8d077ede48c1 ("scsi: ufs: Optimize the command queueing code")
Co-developed-by: Can Guo
Signed-off-by: Can Guo
Signed-off-by: Ziqi Chen
Link: https://lore.kernel.org/r/1717754818-39863-1-git-send-email-quic_ziqichen@quicinc.com
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen

commit 52912ca87e2b810e5acdcdc452593d30c9187d8f
Author: Damien Le Moal
Date: Fri Jun 7 10:25:07 2024 +0900

scsi: core: Disable CDL by default

For SCSI devices supporting the Command Duration Limits feature set, the user can enable or disable use of this feature through the sysfs device attribute "cdl_enable". Modifying this attribute triggers a call to scsi_cdl_enable() to enable or disable the feature for ATA devices and set the scsi device cdl_enable field to the user provided bool value. For SCSI devices supporting CDL, the feature set is always enabled and scsi_cdl_enable() is reduced to setting the cdl_enable field.

However, for ATA devices, a drive may spin up with the CDL feature enabled by default. But the SCSI device cdl_enable field is always initialized to false (CDL disabled), regardless of the actual device CDL feature state. For ATA devices managed by libata (or libsas), libata-core always disables the CDL feature set when the device is attached, thus syncing the state of the CDL feature on the device and of the SCSI device cdl_enable field. However, for ATA devices connected to a SAS HBA, the CDL feature is not disabled on scan for ATA devices that have this feature enabled by default, leading to an inconsistent state of the feature on the device with the SCSI device cdl_enable field.
Avoid this inconsistency by adding a call to scsi_cdl_enable() in scsi_cdl_check() to make sure that the device-side state of the CDL feature set always matches the scsi device cdl_enable field state. This implies that CDL will always be disabled for ATA devices connected to SAS HBAs, which is consistent with libata/libsas initialization of the device.

Reported-by: Scott McCoy
Fixes: 1b22cfb14142 ("scsi: core: Allow enabling and disabling command duration limits")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal
Link: https://lore.kernel.org/r/20240607012507.111488-1-dlemoal@kernel.org
Reviewed-by: Niklas Cassel
Reviewed-by: Igor Pylypiv
Reviewed-by: Hannes Reinecke
Reviewed-by: Christoph Hellwig
Signed-off-by: Martin K. Petersen

commit 4254dfeda82f20844299dca6c38cbffcfd499f41
Author: Breno Leitao
Date: Wed Jun 5 01:55:29 2024 -0700

scsi: mpt3sas: Avoid test/set_bit() operating in non-allocated memory

There is a potential out-of-bounds access when using test_bit() on a single word. The test_bit() and set_bit() functions operate on long values, and when testing or setting a single word, they can exceed the word boundary. KASAN detects this issue and produces a dump:

BUG: KASAN: slab-out-of-bounds in _scsih_add_device.constprop.0 (./arch/x86/include/asm/bitops.h:60 ./include/asm-generic/bitops/instrumented-atomic.h:29 drivers/scsi/mpt3sas/mpt3sas_scsih.c:7331) mpt3sas
Write of size 8 at addr ffff8881d26e3c60 by task kworker/u1536:2/2965

For full log, please look at [1].

Make the allocation at least the size of sizeof(unsigned long) so that set_bit() and test_bit() have sufficient room for read/write operations without overwriting unallocated memory.
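The sizing rule can be sketched in plain C. This is an illustration under assumptions, not the mpt3sas patch: bitmap_bytes() is a hypothetical helper, and it rounds up to whole unsigned long words, which also satisfies the "at least sizeof(unsigned long)" rule. Whether the driver rounds up or only enforces the minimum is not stated in this log.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: bytes to allocate for a bitmap of nbits bits that is
 * accessed through unsigned long sized loads and stores.
 */
size_t bitmap_bytes(size_t nbits)
{
	const size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
	size_t words = (nbits + bits_per_long - 1) / bits_per_long;

	if (words == 0)	/* never allocate less than one long */
		words = 1;
	return words * sizeof(unsigned long);
}
```

With this helper an 8-bit bitmap still gets a full unsigned long, so a long-sized write to it stays inside the allocation.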
[1] Link: https://lore.kernel.org/all/ZkNcALr3W3KGYYJG@gmail.com/

Fixes: c696f7b83ede ("scsi: mpt3sas: Implement device_remove_in_progress check in IOCTL path")
Cc: stable@vger.kernel.org
Suggested-by: Keith Busch
Signed-off-by: Breno Leitao
Link: https://lore.kernel.org/r/20240605085530.499432-1-leitao@debian.org
Reviewed-by: Keith Busch
Signed-off-by: Martin K. Petersen

commit 7926d51f73e0434a6250c2fd1a0555f98d9a62da
Author: Martin K. Petersen
Date: Tue Jun 4 22:25:21 2024 -0400

scsi: sd: Use READ(16) when reading block zero on large capacity disks

Commit 321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior to querying device properties") triggered a read to LBA 0 before attempting to inquire about device characteristics. This was done because some protocol bridge devices will return generic values until an attached storage device's media has been accessed.

Pierre Tomon reported that this change caused problems on a large capacity external drive connected via a bridge device. The bridge in question does not appear to implement the READ(10) command.

Issue a READ(16) instead of READ(10) when a device has been identified as preferring 16-byte commands (use_16_for_rw heuristic).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218890
Link: https://lore.kernel.org/r/70dd7ae0-b6b1-48e1-bb59-53b7c7f18274@rowland.harvard.edu
Link: https://lore.kernel.org/r/20240605022521.3960956-1-martin.petersen@oracle.com
Fixes: 321da3dc1f3c ("scsi: sd: usb_storage: uas: Access media prior to querying device properties")
Cc: stable@vger.kernel.org
Reported-by: Pierre Tomon
Suggested-by: Alan Stern
Tested-by: Pierre Tomon
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K.
Petersen

commit daf613331c9388dec1b8c56565583afcdf87a053
Author: Bart Van Assche
Date: Mon Jun 3 10:23:11 2024 -0700

scsi: powertec: Declare local function static

Signed-off-by: Bart Van Assche
Link: https://lore.kernel.org/r/20240603172311.1587589-5-bvanassche@acm.org
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen

commit 1dc98be418149feb79779af45abe0fe6243243e9
Author: Bart Van Assche
Date: Mon Jun 3 10:23:10 2024 -0700

scsi: eesox: Declare local function static

Signed-off-by: Bart Van Assche
Link: https://lore.kernel.org/r/20240603172311.1587589-4-bvanassche@acm.org
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen

commit 1414045725a00f40d52ffb4c866d6efeab02c37a
Author: Bart Van Assche
Date: Mon Jun 3 10:23:09 2024 -0700

scsi: cumana: Declare local function static

Signed-off-by: Bart Van Assche
Link: https://lore.kernel.org/r/20240603172311.1587589-3-bvanassche@acm.org
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen

commit f5a954bbf2f4309a222f56162f2cd576b7b27f48
Author: Bart Van Assche
Date: Mon Jun 3 10:23:08 2024 -0700

scsi: acornscsi: Declare local functions static

Signed-off-by: Bart Van Assche
Link: https://lore.kernel.org/r/20240603172311.1587589-2-bvanassche@acm.org
Reviewed-by: Johannes Thumshirn
Signed-off-by: Martin K. Petersen

commit a420a8ed0a92488a04b34dfc262101c87940c800
Author: Minwoo Im
Date: Sat Jun 1 06:22:44 2024 +0900

scsi: ufs: mcq: Prevent no I/O queue case for MCQ

If hba_maxq equals poll_queues, which means there are no I/O queues (HCTX_TYPE_DEFAULT, HCTX_TYPE_READ), the very first hw queue will be allocated as HCTX_TYPE_POLL and it will be used as the dev_cmd_queue. In this case, device commands such as QUERY cannot be properly handled.

This patch prevents the initialization of MCQ when the number of I/O queues is not set and only the number of POLL queues is set.
Signed-off-by: Minwoo Im
Link: https://lore.kernel.org/r/20240531212244.1593535-3-minwoo.im@samsung.com
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen

commit 175d1825ca4d2288fee734ada0955a1e36dd50e6
Author: Minwoo Im
Date: Sat Jun 1 06:22:43 2024 +0900

scsi: ufs: pci: Add support MCQ for QEMU-based UFS

Recently, the ufs-mcq feature was introduced to the QEMU hw/ufs device [1]. This patch adds MCQ support for the upstream QEMU UFS PCI controller.

This patch provides the mandatory vops callbacks to make the UFS controller work properly in MCQ mode. The Operation and Runtime Config register stride is fixed to 48 bytes, which is what QEMU implements.

[1] https://lore.kernel.org/qemu-devel/cover.1716876237.git.jeuk20.kim@samsung.com/

Signed-off-by: Minwoo Im
Link: https://lore.kernel.org/r/20240531212244.1593535-2-minwoo.im@samsung.com
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen

commit e8a1d87b7983b461d1d625e2973cdaadc0bd8ff5
Author: Minwoo Im
Date: Mon May 20 07:14:57 2024 +0900

scsi: ufs: mcq: Convert MCQ_CFG_n to an inline function

Inline functions are preferred over macros. Convert the MCQ_CFG_n macro to an inline function.

Signed-off-by: Minwoo Im
Link: https://lore.kernel.org/r/20240519221457.772346-3-minwoo.im@samsung.com
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen

commit 2fc39848952dfb91a9233563cc1444669b8e79c3
Author: Minwoo Im
Date: Mon May 20 07:14:56 2024 +0900

scsi: ufs: mcq: Fix missing argument 'hba' in MCQ_OPR_OFFSET_n

The MCQ_OPR_OFFSET_n macro takes 'hba' from the caller's context without receiving the 'hba' instance as an argument. To prevent potential bugs in future use cases, add an argument 'hba'.

Fixes: 2468da61ea09 ("scsi: ufs: core: mcq: Configure operation and runtime interface")
Cc: Asutosh Das
Signed-off-by: Minwoo Im
Link: https://lore.kernel.org/r/20240519221457.772346-2-minwoo.im@samsung.com
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K.
Petersen

commit d53b681ce9ca7db5ef4ecb8d2cf465ae4a031264
Author: Chanwoo Lee
Date: Fri May 24 10:59:04 2024 +0900

scsi: ufs: mcq: Fix error output and clean up ufshcd_mcq_abort()

An error unrelated to ufshcd_try_to_abort_task is being logged and can cause confusion. Modify ufshcd_mcq_abort() to print the result of the abort failure. For readability, return immediately instead of 'goto'.

Fixes: f1304d442077 ("scsi: ufs: mcq: Added ufshcd_mcq_abort()")
Signed-off-by: Chanwoo Lee
Link: https://lore.kernel.org/r/20240524015904.1116005-1-cw9316.lee@samsung.com
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen

commit 600edc6620a4380b9f6027f293dac09eb0f22048
Author: Avri Altman
Date: Thu May 30 17:25:09 2024 +0300

scsi: ufs: sysfs: Make max_number_of_rtt read-write

Given the importance of the RTT parameter, we want to be able to configure it via sysfs. This is because UFS users should be discouraged from changing UFS device parameters without the UFSHCI driver being aware of these changes.

Signed-off-by: Avri Altman
Link: https://lore.kernel.org/r/20240530142510.734-4-avri.altman@wdc.com
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen

commit e75ff63300c5e8fac31649f438ebad6af88e0032
Author: Avri Altman
Date: Thu May 30 17:25:08 2024 +0300

scsi: ufs: core: Maximum RTT supported by the host driver

Allow platform vendors to take precedence by providing their own maximum RTT support. This makes sense because the host controller's nortt characteristic may vary among vendors.

While at it, set this value for Mediatek, as requested by Peter - https://lore.kernel.org/all/0a57d6bab739d6a10584f2baba115d00dfc9c94c.camel@mediatek.com/

Signed-off-by: Avri Altman
Link: https://lore.kernel.org/r/20240530142510.734-3-avri.altman@wdc.com
Reviewed-by: Peter Wang
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K.
Petersen

commit 9ec54934ce857065e38523a2010e20182e76f515
Author: Avri Altman
Date: Thu May 30 17:25:07 2024 +0300

scsi: ufs: core: Allow RTT negotiation

The rtt-upiu packets precede any data-out upiu packets, thus synchronizing the data input to the device: this mostly applies to write operations, but there are other operations that require rtt as well.

There are several rules binding this rtt - data-out dialog; specifically, there can be at most bMaxNumOfRTT such packets outstanding. This might have an effect on write performance (sequential write in particular), as each data-out upiu must wait for its rtt sibling.

UFSHCI expects bMaxNumOfRTT to be min(bDeviceRTTCap, NORTT). However, as of today, there does not appear to be anyone who sets it: neither the host controller nor the driver. It wasn't an issue up to now: bMaxNumOfRTT is set to 2 after manufacturing, and wasn't limiting the write performance.

UFS4.0, and specifically gear 5, changes this, and requires the device to be more attentive. This doesn't come free - the device has to allocate more resources to that end, but the sequential write performance improvement is significant. Early measurements show a 25% gain when moving from rtt 2 to 9. Therefore, set bMaxNumOfRTT to be min(bDeviceRTTCap, NORTT) as UFSHCI expects.

Signed-off-by: Avri Altman
Link: https://lore.kernel.org/r/20240530142510.734-2-avri.altman@wdc.com
Reviewed-by: Bean Huo
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen

commit 96281dfa266d333522c004205acc5ff1e9e3a337
Author: Dr. David Alan Gilbert
Date: Tue May 28 22:56:40 2024 +0100

scsi: qla2xxx: Remove unused struct 'scsi_dif_tuple'

'scsi_dif_tuple' is unused since commit 8cb2049c7448 ("[SCSI] qla2xxx: T10 DIF - Handle uninitalized sectors.").

Remove it.

Signed-off-by: Dr. David Alan Gilbert
Link: https://lore.kernel.org/r/20240528215640.91771-1-linux@treblig.org
Reviewed-by: Himanshu Madhani
Signed-off-by: Martin K.
Petersen

commit 41b757425203a73ba5aa401cf00feeccc1555f0c
Author: John Garry
Date: Fri May 24 08:48:29 2024 +0000

scsi: bsg: Pass dev to blk_mq_alloc_queue()

When calling bsg_setup_queue() -> blk_mq_alloc_queue(), we don't pass the dev as the queuedata, but rather manually set it afterwards. Just pass dev to blk_mq_alloc_queue() to have it set automatically.

Signed-off-by: John Garry
Link: https://lore.kernel.org/r/20240524084829.2132555-3-john.g.garry@oracle.com
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Tested-by: Himanshu Madhani
Reviewed-by: Himanshu Madhani
Signed-off-by: Martin K. Petersen

commit e7c09df178f740b74a077bbc16ed0bd872ad0581
Author: John Garry
Date: Fri May 24 08:48:28 2024 +0000

scsi: core: Pass sdev to blk_mq_alloc_queue()

When calling scsi_alloc_sdev() -> blk_mq_alloc_queue(), we don't pass the sdev as the queuedata, but rather manually set it afterwards. Just pass sdev to blk_mq_alloc_queue() to have it set automatically.

Signed-off-by: John Garry
Link: https://lore.kernel.org/r/20240524084829.2132555-2-john.g.garry@oracle.com
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Tested-by: Himanshu Madhani
Reviewed-by: Himanshu Madhani
Signed-off-by: Martin K. Petersen

commit d09c05aa35909adb7d29f92f0cd79fdcd1338ef0
Author: Martin K. Petersen
Date: Mon May 20 22:30:40 2024 -0400

scsi: core: Handle devices which return an unusually large VPD page count

Peter Schneider reported that a system would no longer boot after updating to 6.8.4. Peter bisected the issue and identified commit b5fc07a5fb56 ("scsi: core: Consult supported VPD page list prior to fetching page") as being the culprit.

Turns out the enclosure device in Peter's system reports a byteswapped page length for VPD page 0. It reports "02 00" as page length instead of "00 02". This causes us to attempt to access 516 bytes (page length + header) of information despite only 2 pages being present.
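The 516-byte figure follows from reading the two length bytes as a big-endian 16-bit value and adding the header. A small sketch of that arithmetic, assuming a 4-byte header (inferred from 516 = 512 + 4); vpd_len() is a hypothetical helper and not a kernel function.

```c
#include <stdint.h>

#define VPD_HEADER_LEN 4u	/* assumed header size, inferred from 516 = 512 + 4 */

/* Hypothetical helper: total bytes implied by the two page length bytes. */
unsigned int vpd_len(uint8_t len_hi, uint8_t len_lo)
{
	/* The page length is a big-endian 16-bit field. */
	return (((unsigned int)len_hi << 8) | len_lo) + VPD_HEADER_LEN;
}
```

Here vpd_len(0x02, 0x00) gives 516 for the byteswapped "02 00", while the intended "00 02" gives 6.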
Limit the page search scope to the size of our VPD buffer to guard against devices returning a larger page count than requested.

Link: https://lore.kernel.org/r/20240521023040.2703884-1-martin.petersen@oracle.com
Fixes: b5fc07a5fb56 ("scsi: core: Consult supported VPD page list prior to fetching page")
Cc: stable@vger.kernel.org
Reported-by: Peter Schneider
Closes: https://lore.kernel.org/all/eec6ebbf-061b-4a7b-96dc-ea748aa4d035@googlemail.com/
Tested-by: Peter Schneider
Reviewed-by: Bart Van Assche
Signed-off-by: Martin K. Petersen

commit e4f5f8298cf6ddae43210d236ad65ac2c6379559
Author: Deming Wang
Date: Mon May 13 07:59:56 2024 -0400

scsi: mpt3sas: Add missing kerneldoc parameter descriptions

Add missing kerneldoc parameter descriptions to _scsih_set_debug_level().

Signed-off-by: Deming Wang
Link: https://lore.kernel.org/r/20240513115956.1576-1-wangdeming@inspur.com
Signed-off-by: Martin K. Petersen

commit 6c3bb589debd763dc4b94803ddf3c13b4fcca776
Author: Saurav Kashyap
Date: Wed May 15 14:41:01 2024 +0530

scsi: qedf: Set qed_slowpath_params to zero before use

Zero qed_slowpath_params before use.

Signed-off-by: Saurav Kashyap
Signed-off-by: Nilesh Javali
Link: https://lore.kernel.org/r/20240515091101.18754-4-skashyap@marvell.com
Signed-off-by: Martin K. Petersen

commit 78e88472b60936025b83eba57cffa59d3501dc07
Author: Saurav Kashyap
Date: Wed May 15 14:41:00 2024 +0530

scsi: qedf: Wait for stag work during unload

If stag work is already scheduled and unload is called, it can lead to issues as unload cleans up the work element. Wait for stag work to get completed before cleanup during unload.

Signed-off-by: Saurav Kashyap
Signed-off-by: Nilesh Javali
Link: https://lore.kernel.org/r/20240515091101.18754-3-skashyap@marvell.com
Signed-off-by: Martin K.
Petersen

commit 51071f0831ea975fc045526dd7e17efe669dc6e1
Author: Saurav Kashyap
Date: Wed May 15 14:40:59 2024 +0530

scsi: qedf: Don't process stag work during unload and recovery

Stag work can cause issues during unload and recovery, hence don't process it.

Signed-off-by: Saurav Kashyap
Signed-off-by: Nilesh Javali
Link: https://lore.kernel.org/r/20240515091101.18754-2-skashyap@marvell.com
Signed-off-by: Martin K. Petersen

commit 9fad9d560af5c654bb38e0b07ee54a4e9acdc5cd
Author: Justin Stitt
Date: Wed May 8 17:22:51 2024 +0000

scsi: sr: Fix unintentional arithmetic wraparound

Running syzkaller with the newly reintroduced signed integer overflow sanitizer produces this report:

[ 65.194362] ------------[ cut here ]------------
[ 65.197752] UBSAN: signed-integer-overflow in ../drivers/scsi/sr_ioctl.c:436:9
[ 65.203607] -2147483648 * 177 cannot be represented in type 'int'
[ 65.207911] CPU: 2 PID: 10416 Comm: syz-executor.1 Not tainted 6.8.0-rc2-00035-gb3ef86b5a957 #1
[ 65.213585] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[ 65.219923] Call Trace:
[ 65.221556]
[ 65.223029] dump_stack_lvl+0x93/0xd0
[ 65.225573] handle_overflow+0x171/0x1b0
[ 65.228219] sr_select_speed+0xeb/0xf0
[ 65.230786] ? __pm_runtime_resume+0xe6/0x130
[ 65.233606] sr_block_ioctl+0x15d/0x1d0
...

Historically, the signed integer overflow sanitizer did not work in the kernel due to its interaction with `-fwrapv` but this has since been changed [1] in the newest version of Clang. It was re-enabled in the kernel with Commit 557f8c582a9b ("ubsan: Reintroduce signed overflow sanitizer").

Firstly, let's change the type of "speed" to unsigned long as sr_select_speed()'s only caller passes in an unsigned long anyways.

$ git grep '\.select_speed'
| drivers/scsi/sr.c: .select_speed = sr_select_speed,
...
| static int cdrom_ioctl_select_speed(struct cdrom_device_info *cdi,
|                                     unsigned long arg)
| {
|     ...
|     return cdi->ops->select_speed(cdi, arg);
| }

Next, let's add an extra check to make sure we don't exceed 0xffff/177 (350) since 0xffff is the max speed. This has two benefits: 1) we deal with integer overflow before it happens and 2) we properly respect the max speed of 0xffff. There are some "magic" numbers here but I did not want to change more than what was necessary.

Link: https://github.com/llvm/llvm-project/pull/82432 [1]
Closes: https://github.com/KSPP/linux/issues/357
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt
Link: https://lore.kernel.org/r/20240508-b4-b4-sio-sr_select_speed-v2-1-00b68f724290@google.com
Reviewed-by: Kees Cook
Signed-off-by: Martin K. Petersen

commit 10157b1fc1a762293381e9145041253420dfc6ad
Author: Martin Wilck
Date: Tue May 14 16:03:44 2024 +0200

scsi: core: alua: I/O errors for ALUA state transitions

When a host is configured with a few LUNs and I/O is running, injecting FC faults repeatedly leads to path recovery problems. The LUNs have 4 paths each and 3 of them come back active after, say, an FC fault which makes 2 of the paths go down, instead of all 4. This happens after several iterations of continuous FC faults.

The reason is that we return an I/O error whenever we encounter sense code 06/04/0a (LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION) instead of retrying.

[mwilck: The original patch was developed by Rajashekhar M A and Hannes Reinecke. I moved the code to alua_check_sense() as suggested by Mike Christie [1]. Evan Milne had raised the question whether pg->state should be set to transitioning in the UA case [2]. I believe that doing this is correct. SCSI_ACCESS_STATE_TRANSITIONING by itself doesn't cause I/O errors. Our handler schedules an RTPG, which will only result in an I/O error condition if the transitioning timeout expires.]
[1] https://lore.kernel.org/all/0bc96e82-fdda-4187-148d-5b34f81d4942@oracle.com/
[2] https://lore.kernel.org/all/CAGtn9r=kicnTDE2o7Gt5Y=yoidHYD7tG8XdMHEBJTBraVEoOCw@mail.gmail.com/

Co-developed-by: Rajashekhar M A
Co-developed-by: Hannes Reinecke
Signed-off-by: Hannes Reinecke
Signed-off-by: Martin Wilck
Link: https://lore.kernel.org/r/20240514140344.19538-1-mwilck@suse.com
Reviewed-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
Reviewed-by: Mike Christie
Signed-off-by: Martin K. Petersen

commit 9f365cb8bbd0162963d6852651d7c9e30adcb7b5
Author: Nathan Chancellor
Date: Tue May 14 13:47:23 2024 -0700

scsi: mpi3mr: Use proper format specifier in mpi3mr_sas_port_add()

When building for a 32-bit platform such as ARM or i386, for which size_t is unsigned int, there is a warning due to using an unsigned long format specifier:

drivers/scsi/mpi3mr/mpi3mr_transport.c:1370:11: error: format specifies type 'unsigned long' but the argument has type 'unsigned int' [-Werror,-Wformat]
 1369 |   ioc_warn(mrioc, "skipping port %u, max allowed value is %lu\n",
      |                                                           ~~~
      |                                                           %u
 1370 |       i, sizeof(mr_sas_port->phy_mask) * 8);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use the proper format specifier for size_t, %zu, to resolve the warning for all platforms.

Fixes: 3668651def2c ("scsi: mpi3mr: Sanitise num_phys")
Signed-off-by: Nathan Chancellor
Link: https://lore.kernel.org/r/20240514-mpi3mr-fix-wformat-v1-1-f1ad49217e5e@kernel.org
Signed-off-by: Martin K. Petersen
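
The effect of the %zu change in the last commit can be tried in user space. This is a minimal sketch using snprintf() in place of ioc_warn(); struct toy_sas_port and format_port_warning() are hypothetical, and the 64-bit width of phy_mask is an assumption made for the example, not something this log states.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's port structure; width assumed. */
struct toy_sas_port {
	uint64_t phy_mask;
};

/*
 * Formats the warning with %zu, which matches size_t on both 32-bit and
 * 64-bit targets. Returns the number of characters snprintf() produced.
 */
int format_port_warning(char *buf, size_t len, unsigned int port)
{
	struct toy_sas_port p = { 0 };

	return snprintf(buf, len, "skipping port %u, max allowed value is %zu\n",
			port, sizeof(p.phy_mask) * 8);
}
```

Because sizeof yields a size_t, %zu is correct regardless of whether size_t is unsigned int or unsigned long on the target, which is why the same source builds cleanly on ARM, i386 and x86_64.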