Marvell Driver Release Notes
Marvell, Inc.
All rights reserved

Table of Contents

1. Change History
2. Known Issues
3. Notices
4. Contacting Support

1. Change History

Current version: 3.1.87.0 - August 30, 2021

This section contains:
* 1.1 Hardware Support
* 1.2 Software Components
* 1.3 Bug Fixes

1.1 Hardware Support

Initial Drop 3.1.1.0:
* Support for 2400/2500/2600/2700/2800 Series Fibre Channel adapters
* Support for 8100/8200/8300 Series Converged Network Adapters
* Mt. Rainier support.

Between versions 3.1.1.0 and 3.1.9.0:
* None

Between versions 3.1.9.0 and 3.1.10.0:
* Remove support for Fabric Cache (Mt. Rainier) adapter

Between versions 3.1.10.0 and 3.1.87.0:
* None

1.2 Software Components

Between versions 3.1.1.0 and 3.1.3.0:
* None

Between versions 3.1.3.0 and 3.1.4.0:
* End-to-End QoS fabric priority support
* Update ISP25XX fw version to 8.08.01.

Between versions 3.1.4.0 and 3.1.6.0:
* None

Between versions 3.1.6.0 and 3.1.7.0:
* Revert update of ISP25XX fw version to 8.08.01. FW version is now back to 8.07.00

Between versions 3.1.7.0 and 3.1.10.0:
* None

Between versions 3.1.10.0 and 3.1.11.0:
* Update ISP25XX fw version to 8.08.01.
* Add support for Mach Auxiliary Image Status and supporting flash-related functionality improvements

Between versions 3.1.11.0 and 3.1.12.0:
* SmartSAN features disabled by default

Between versions 3.1.12.0 and 3.1.16.0:
* None

Between versions 3.1.16.0 and 3.1.17.0:
* Secure flash update support for ISP28XX

Between versions 3.1.17.0 and 3.1.18.0:
* None

Between versions 3.1.18.0 and 3.1.19.0:
* Simplified Fabric Discovery support

Between versions 3.1.19.0 and 3.1.27.0:
* None

Between versions 3.1.27.0 and 3.1.28.0:
* Update ISP25XX fw to version 8.08.207

Between versions 3.1.28.0 and 3.1.29.0:
* Revert ISP25XX fw to version 8.08.206

Between versions 3.1.29.0 and 3.1.31.0:
* None

Between versions 3.1.31.0 and 3.1.32.0:
* SAN Congestion Management support

Between versions 3.1.32.0 and 3.1.33.0:
* Update ISP25XX fw to version 8.08.207

Between versions 3.1.33.0 and 3.1.39.0:
* None

Between versions 3.1.39.0 and 3.1.40.0:
* Statistic collection for error detection
* Disable FW command timer by default.

Between versions 3.1.40.0 and 3.1.41.0:

ERXXXXXX: SAN Congestion Management support, phase 2
Change: Enhancements to SAN Congestion Management feature
Relevance: 27XX and 28XX adapters

Between versions 3.1.41.0 and 3.1.42.0:

ERXXXXXX: Lockdown support
Change: Added support for management security lockdown feature
Relevance: 27XX and 28XX adapters

Between versions 3.1.42.0 and 3.1.74.0:
* None

Between versions 3.1.74.0 and 3.1.82.0:
* Encryption support

Between versions 3.1.82.0 and 3.1.83.0:
* API support for encryption

Between versions 3.1.83.0 and 3.1.87.0:
* None

1.3 Bug Fixes

Between versions 3.1.1.0 and 3.1.2.0:

Problem Description: Improvements
Solution: Various Mach-related fixes uncovered during testing.

Problem Description: Failing during unload testing for ESX 6.7
Solution: Proper cleanup during detach function.
Between versions 3.1.2.0 and 3.1.3.0:

Problem Description: Issue with Mach adapter fw dump
Solution: Fix firmware dump template entry to not insert extra dword of info.

Between versions 3.1.3.0 and 3.1.4.0:

Problem Description: Secure flash requires that the driver only writes to flash through MB Cmds for Qlipper and newer adapters.
Solution: Patch up holes in write flash routine that could cause the driver to use register access.

Problem Description: Driver was advertising Application Services support in the RFT_ID command regardless of whether it was enabled.
Solution: Only enable Application Services if ql2xvmidsupport is turned on.

Problem Description: Target devices are temporarily not accessible when a link toggle occurs on one of the target device paths.
Solution: Clear the "login needed" bit when the initial login attempt fails. This login will get retried in the relogin dpc call. This prevents target ports remaining in a "login needed" state despite disappearing from the fabric entirely.

Problem Description: Driver was not handling ABTS received IOCB in response to a PUREX IOCB failure.
Solution: Code was implemented to handle ABTS received.

Problem Description: Driver would not send the full RDP response with a switch port that was not in the logged in state.
Solution: Remove the limitation, as the new firmware now supports splitting up the RDP response payload over multiple frames. This is based on a check of the supported firmware versions.

Problem Description: Code improvement.
Solution: Move ql2xvmidsupport check to module_init routine

Problem Description: VMID information was not getting displayed for the NPIV key value in the vmkmgmt interface.
Solution: Added VMID information to NPIV vmkmgmt output

Problem Description: Code improvement
Solution: DPORT improvement - return mb1 and mb2 contents after execution. Also added more info in DPORT AEN messages.

Problem Description: ESX 6.7 IOVP DDV testing was failing.
Solution: Use proper cleanup of driver resources in error paths

Between versions 3.1.4.0 and 3.1.5.0:

Problem Description: Flash writes were failing.
Solution: Ensure dword count and byte count were used correctly in flash write routine.

Between versions 3.1.5.0 and 3.1.6.0:

Problem Description: Driver read incorrectly from flash for E2E QoS values, leading to invalid table readings and disabling of feature.
Solution: Pass in byte count instead of dword count to flash read function.

Problem Description: RDP response payload was not formatted correctly, leading to missing contents or response frame not transmitted.
Solution: Properly populate link services attribute and implement correct check of firmware version and login state when building payload.

Problem Description: Driver was advertising 1G speed support for 8G adapters.
Solution: Remove 1G speed support for 8G adapters. Also, cleaned up supported speed code for various features and adapters.

Problem Description: FDMI info showing incorrect supported speeds for 16G mezz adapters.
Solution: Remove 4G supported speed if NVRAM parameter indicates to do so.

Between versions 3.1.6.0 and 3.1.7.0:

None

Between versions 3.1.7.0 and 3.1.8.0:

Problem Description: When E2E QoS solution was disabled on a LUN, driver was still populating the priority field in the CS_CTL for I/O to that LUN.
Solution: Don't fall back to standard fabric priority QoS unless the fabric priority bit in the NVRAM is set.

Between versions 3.1.8.0 and 3.1.9.0:

Problem Description: qlnativefc driver would fail to load on a system with NPAR enabled on NIC adapter.
Solution: Dump callback registration has a hard limit of 40 (ESX limit), so on a system with many adapters/functions registering a callback, this call could fail. Solution was to treat a failed dump callback registration as non-fatal and continue with driver load.
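The dump callback fix above (3.1.8.0 to 3.1.9.0) can be pictured with a small model. This is only a sketch in Python, not driver code: `DumpRegistry`, `load_adapter`, and the function names `fn0`..`fn41` are hypothetical; only the limit of 40 and the "non-fatal" handling are taken from the entry.

```python
MAX_DUMP_CALLBACKS = 40  # hard limit quoted in the entry above


class DumpRegistry:
    """Hypothetical stand-in for the host's dump callback table."""

    def __init__(self, limit=MAX_DUMP_CALLBACKS):
        self.limit = limit
        self.callbacks = []

    def register(self, name):
        # Registration fails once the table is full.
        if len(self.callbacks) >= self.limit:
            return False
        self.callbacks.append(name)
        return True


def load_adapter(registry, name):
    """Load one adapter function; a failed registration is not fatal."""
    dump_enabled = registry.register(name)
    # Previously a failure here would have aborted the load.
    return {"name": name, "loaded": True, "dump_enabled": dump_enabled}


registry = DumpRegistry()
results = [load_adapter(registry, f"fn{i}") for i in range(42)]
```

In this model every function loads, and only the functions registered after the table fills up lose the dump callback.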
Between versions 3.1.9.0 and 3.1.10.0:

Problem Description: New compiler in ESX engineering drop uncovered coding issues
Solution: Fix the existing coding issues that were uncovered.

Problem Description: Code inspection - driver was ignoring potential retry delay returned in FCP Response frame from target.
Solution: Handle retry delay by returning VMK_SCSI_HOST_RETRY for specified time.

Problem Description: FDMI info registered on switch was inconsistent across different OSes.
Solution: Sync up information registered to switch for FDMI across all drivers.

Problem Description: Errors encountered during DDV testing.
Solution: Fix issues uncovered during DDV inbox testing at VMware.

Improvement: Cleanup various log messages

Problem Description: Set Params mailbox command was failing during initialization.
Solution: Only send CS_CTL map info using Set Params MB Cmd when GFO successful.

Problem Description: Flash access was failing during Mach adapter bring-up
Solution: Flash access routines were converting flash addresses incorrectly.

Problem Description: Problems with Mach bring-up with new flash signature.
Solution: Add check for new Mach flash signature during initialization

Problem Description: Targets would fail to get logged into if a target on the fabric list sent a REJECT for a given login.
Solution: Continue with the logins in the scan loop despite seeing a login failure

Between versions 3.1.10.0 and 3.1.11.0:

Problem Description: The full marketing name of certain HBAs was being truncated when executing the "esxcli storage san fc list" command.
Solution: Copy over full contents of VPD product description field to API.

Between versions 3.1.11.0 and 3.1.12.0:

Problem Description: Fabric Discovery could take a long time, which could result in path loss or, in the case of NPIV, vports could not get created.
Solution: GFO sent to an unsupported switch would take 20 seconds to time out. Fix was to send the GFO command in a separate thread from the fabric discovery.
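The GFO fix above moves a slow command off the discovery path. A minimal Python sketch of that idea follows; `discover_fabric` and `unanswered_gfo` are hypothetical names, and the event-based wait is only a stand-in for the 20-second switch timeout described in the entry.

```python
import threading


def discover_fabric(send_gfo):
    """Start GFO on its own thread, then carry on with discovery."""
    log = []

    def gfo_worker():
        send_gfo()                  # may block for a long time
        log.append("gfo finished")

    worker = threading.Thread(target=gfo_worker)
    worker.start()
    # Discovery no longer waits for the GFO reply.
    log.append("discovery finished")
    return worker, log


switch_timeout = threading.Event()


def unanswered_gfo():
    # Stand-in for a switch that never answers the GFO command.
    switch_timeout.wait(timeout=5)


worker, log = discover_fabric(unanswered_gfo)
order_seen_by_discovery = list(log)   # snapshot taken while GFO is still pending
switch_timeout.set()                  # let the simulated timeout expire
worker.join()
```

Discovery completes while the GFO is still outstanding, so paths and vports are not held up by an unsupported switch.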
Problem Description: A management application doing periodic small reads of flash would cause the system to experience I/O failures with "RETRY" status.
Solution: Eliminate the code to block I/O during small read operations of the flash. This code is no longer needed.

Between versions 3.1.12.0 and 3.1.13.0:

Code improvement: Turn on ql2xattemptdumponpanic module parameter by default.

Code improvement: Remove unnecessary RISC parity disable/enable sequence during ISP abort - only needed on obsoleted adapters.

Problem Description: Incorrect supported speeds were being reported by driver on Mach adapter.
Solution: Correctly interpret supported speeds returned by FW after execution.

Problem Description: FW dump template entry T262 was showing "bad range" in the logs.
Solution: If start and end address are equal, this is a valid dump entry.

Problem Description: NPIV port creation would fail with "Physical uid does not match VPORT uid, NPIV Disabled for this VM"
Solution: Ensure the target ID assigned to the WWPN on the physical port is the same on the NPIV port.

Between versions 3.1.13.0 and 3.1.14.0:

Code improvement: Various log message improvements.

Problem Description: PSOD observed while running DDV test
Solution: Check for NULL pointer during memory alloc call before using buffer.

Between versions 3.1.14.0 and 3.1.15.0:

Problem Description: Fabric registrations were not taking place with Fabric Priority support enabled.
Solution: Don't unnecessarily clear REGISTER_FC4_NEEDED flag in fabric scan code

Problem Description: Hazard C16 test was failing with input/output error.
Solution: Virtual Reset code needs to abort all I/Os on a given world, whereas the old code would stop searching for outstanding I/O in the event of one abort failure. Fix was to not exit I/O loop until all outstanding I/O have had an abort attempt.
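The Virtual Reset fix above comes down to not leaving the loop on the first failed abort. A rough Python model, with hypothetical names (`virtual_reset`, `abort_io`) and a made-up list of outstanding I/Os:

```python
def virtual_reset(outstanding, world_id, abort_io):
    """Attempt an abort on every outstanding I/O that belongs to world_id."""
    attempted, failed = [], []
    for io in outstanding:
        if io["world"] != world_id:
            continue
        attempted.append(io["tag"])
        if not abort_io(io):
            # Record the failure and keep going; the old logic stopped here.
            failed.append(io["tag"])
    return attempted, failed


# Hypothetical outstanding I/Os; the abort of tag 2 is made to fail.
ios = [
    {"tag": 1, "world": 7},
    {"tag": 2, "world": 7},
    {"tag": 3, "world": 9},
    {"tag": 4, "world": 7},
]
attempted, failed = virtual_reset(ios, 7, lambda io: io["tag"] != 2)
```

Tag 4 still gets its abort attempt even though tag 2 failed, which is the behavior the fix describes.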
Problem Description: Dell-EMC VMAX/PMAX setup would not report any LUNs
Solution: Omit check for same domain and area as adapter's during "local loop" device discovery

Between versions 3.1.15.0 and 3.1.16.0:

Code Review: Add brackets missing for secondary VPD location assignment.

Problem Description: If ql2xattemptdumponpanic is set and dump partition is on SAN LUN, saving off of coredump during PSOD will fail
Solution: Reset and re-init adapter after FW dump attempt so paths to LUNs come back online. Set ql2xattemptdumponpanic to default to 0 as well.

Between versions 3.1.16.0 and 3.1.17.0:

Problem Description: PSOD observed during NPIV failover testing.
Solution: scsiDoneList is only initialized and used in the base vha, so fix was to ensure the cmd done code path used the base vha scsiDoneList

Problem Description: Incorrect supported speed values were being reported in management app.
Solution: Return correct supported speed bit map values in vmkmgmt interface

Problem Description: FW dump callback was only saving off one 27xx/28xx dump
Solution: Improve dump callback logic to save off and/or take fw dumps on all adapters available.

Between versions 3.1.17.0 and 3.1.18.0:

Problem Description: Flash update failing on Mach adapter
Solution: Fix non-secure flash updates on Mach. Also, improved the code to handle multiple chunks for a FW update. Added support for MPI and PEP FW updates.

Problem Description: Loss of paths when target sends ASC/ASCQ of 0x3F/0x3 (Inquiry Data has changed)
Solution: Vendor unique T10 DIF code existed to intercept this status and handle the change internally; however, it was not within a check to see if LUN supported this feature or not. Fix was to put this handling within the T10 DIF supported code path only.
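The ASC/ASCQ fix above is a gating change: the internal handling should run only for LUNs on the T10 DIF path. A small Python sketch under that reading; `route_sense` and its return strings are hypothetical, and only the 0x3F/0x03 pair comes from the entry.

```python
INQUIRY_DATA_CHANGED = (0x3F, 0x03)  # ASC/ASCQ named in the entry above


def route_sense(lun_supports_dif_vendor, asc, ascq):
    """Decide who handles a check condition for one LUN."""
    # Intercept only when the LUN actually uses the vendor T10 DIF path.
    if lun_supports_dif_vendor and (asc, ascq) == INQUIRY_DATA_CHANGED:
        return "handled in driver"
    return "returned to storage stack"
```

For a LUN without the feature, the status now reaches the storage stack instead of being intercepted by the driver.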
Problem Description: Sending SCSI pass thru commands to an NPIV port was failing
Solution: Correctly pass in NPIV vha structure to pass thru function, instead of the base vha which would not have the correct vport index to pass to firmware.

Between versions 3.1.18.0 and 3.1.19.0:

Problem Description: Secure Flash Update MB Cmd was failing during testing with operational firmware.
Solution: Driver was passing down incorrect SFUB length in bytes - should be in DWORD count.

Problem Description: Code improvement
Solution: Secure adapter and fw support displayed in vmkmgmt interface

Problem Description: Code improvement
Solution: Correctly indicate default values for module parameter

Between versions 3.1.19.0 and 3.1.20.0:

Problem Description: DDV testing for vSphere next inbox was failing.
Solution: Re-work some of the memory cleanup in different failure paths to avoid driver reload issues during DDV testing.

Problem Description: Driver vmkmgmt shows the VMID info under the NPIV section, when NPIV is not enabled.
Solution: This info was reprinted in the NPIV section unnecessarily, already produced in the HBA section, so it was removed.

Problem Description: Various issues introduced with Simplified Fabric Discovery support.
Solution: Fixed VMID, application and unload issues introduced with this new feature.

Between versions 3.1.20.0 and 3.1.21.0:

Problem Description: ql2xt10difvendor was disabled and associated with ql2xenablesmartsan
Solution: Re-enable ql2xt10difvendor and separate the two features out.

Problem Description: During ESX 6.7U2 beta testing, a PSOD was encountered during driver load.
Solution: Recent changes for Simplified Fabric Discovery introduced deferred work threads to handle further IOCB processing after interrupts. workLock protected the deferred work list and posting; however, the hardwareLock with the same lock rank could be held at the same time. Solution was to lower the workLock rank below that of the hardwareLock.
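The lock rank fix above can be illustrated with a toy rank checker. This sketch rests on stated assumptions, none of which the entry confirms: that a lock may only be taken if its rank is strictly greater than every lock already held, that workLock is taken before hardwareLock, and the numeric rank values themselves. `RankChecker` and `can_nest` are hypothetical names.

```python
class RankError(Exception):
    """Raised when a lock is taken out of rank order."""


class RankChecker:
    """Toy model: a new lock must outrank every lock already held (assumed rule)."""

    def __init__(self):
        self.held = []  # list of (name, rank)

    def acquire(self, name, rank):
        for held_name, held_rank in self.held:
            if rank <= held_rank:
                raise RankError(f"{name} (rank {rank}) taken while holding "
                                f"{held_name} (rank {held_rank})")
        self.held.append((name, rank))


def can_nest(work_rank, hardware_rank):
    """Try workLock then hardwareLock; report whether the checker allows it."""
    checker = RankChecker()
    try:
        checker.acquire("workLock", work_rank)
        checker.acquire("hardwareLock", hardware_rank)
    except RankError:
        return False
    return True


same_rank_ok = can_nest(work_rank=5, hardware_rank=5)     # both locks share a rank
lowered_rank_ok = can_nest(work_rank=4, hardware_rank=5)  # workLock rank lowered
```

With equal ranks the model rejects holding both locks; once workLock sits below hardwareLock the nesting is accepted.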
Between versions 3.1.21.0 and 3.1.22.0:

Problem Description: PSODs during ESX 6.5 U2 BETA testing.
Solution: Adjust spinlock usage and ranking for locks introduced for Simplified Fabric Discovery feature.

Between versions 3.1.22.0 and 3.1.23.0:

Problem Description: zdump fails to get saved off on BFS configuration
Solution: Properly initialize code to execute I/O for dump callback - this was missing due to new logic introduced with Simplified Fabric Discovery

Problem Description: PSODs encountered during open zone testing and utilizing NPIV.
Solution: Various improvements and fixes to Simplified Fabric Discovery handling when NPIV is enabled.

Problem Description: DDV testing failure encountered during VMware inbox driver testing
Solution: Memory cleanup fix in the IOCTL memory alloc logic

Problem Description: The "show fdmi" command does not list the "OS Name and Version" and "CT Payload Length" fields on Cisco switches
Solution: Added "OS Name and Version" and "CT Payload Length" attributes to original FDMI implementation.

Between versions 3.1.23.0 and 3.1.24.0:

Problem Description: System would PSOD during driver unload/load testing.
Solution: Release hardware lock before vmk_WorldYield calls in fw dump routines.

Problem Description: I/Os were not getting completed properly when FW would fail an IOCB with invalid entry status. This led to continuous aborts being observed.
Solution: Populate handle field in srb when command is submitted to firmware.

Problem Description: ISL cable pulls were resulting in path loss and PSODs.
Solution: RSCN handling logic made the same regardless of type. The code to handle domain change was old and removed with this fix.

Between versions 3.1.24.0 and 3.1.25.0:

Problem Description: During ISL disable/enable testing, some target ports were not coming back online
Solution: Handle Port Down I/O status so that the discovery state gets set to DELETED.

Problem Description: PSOD occurs during ISL disable/enable testing.
Solution: When cleaning up I/O, don't decrement refcount twice.

Between versions 3.1.25.0 and 3.1.26.0:

Problem Description: Code improvement
Solution: Unnecessary sp free call made in GPNFT done path

Problem Description: Code improvement
Solution: Print out all mailboxes up to 7 for AEN 8002 per FW team's request

Problem Description: Code Improvement
Solution: Cleanup pDIF log messages

Problem Description: Used fc-esxcli version 1.0.20 to check negotiated PCI width and speed for the FC adapters but it is not displayed. Same is true with VI plugin.
Solution: Return 16-bit value for PCIe Link Status register. Previous code was returning a 32-bit value and shifting the PCIe width and speed values up 16 bits.

Problem Description: PSOD encountered during open zone testing with cable pulls and I/O.
Solution: Initialization of work item for simplified fabric discovery needs to be done at the time of allocation, not each time a work item is queued. This could lead to linked lists being broken.

Problem Description: Driver lock-ups could occur with MPI heartbeat stop async event
Solution: For the MPI heartbeat stop async event, capture an MPI FW dump and perform a chip reset. FW indicates which function should capture the FW dump. Also, added handling for a timeout occurring during fw dump, in which case the dump procedure is shortened.

Problem Description: Flash update failures can occur on single port 28XX adapters.
Solution: On a single port 28XX adapter, the second core can still run in the background, and the flash semaphore can be held by the non-active core. This patch tells MPI FW to check for this case and clear the semaphore from the non-active core.

Problem Description: Code Improvement
Solution: Added MPI dump handling when firmware panic occurs

Problem Description: FDMI 2 RHBA command was getting rejected by the switch.
Solution: Correctly format version strings in attribute list. Also, bring formatting in line with other OS drivers.
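The PCIe Link Status entry earlier in this version range (3.1.25.0 to 3.1.26.0) is easier to follow with the register layout in hand. In the 16-bit Link Status register, bits 3:0 carry the current link speed and bits 9:4 the negotiated link width. The Python sketch below decodes a sample value; the assumption that the management application only looked at the low 16 bits is ours, not the entry's, and `decode_link_status` is a hypothetical helper.

```python
def decode_link_status(lnksta):
    """Split a PCIe Link Status value into (speed code, lane width)."""
    lnksta &= 0xFFFF                # the register itself is 16 bits wide
    speed = lnksta & 0xF            # bits 3:0, current link speed
    width = (lnksta >> 4) & 0x3F    # bits 9:4, negotiated link width
    return speed, width


sample = 0x0083                     # speed code 3 (8 GT/s), x8 link
fixed = decode_link_status(sample)

# Model of the old behavior: the same fields shifted up 16 bits in a 32-bit value.
old_value = sample << 16
old = decode_link_status(old_value)
```

With the fields shifted into the upper half, a consumer reading a 16-bit register sees zero for both speed and width, which matches the "not displayed" symptom.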
Problem Description: GetActiveRegions API not supported on Baker/Qlipper
Solution: Added support for this API call

Between versions 3.1.26.0 and 3.1.27.0:

Problem Description: PSOD occurs during flash update with multi-queue disabled.
Solution: Only execute qpair code logic in ISP abort handler if mq is enabled.

Problem Description: PSOD occurs during Boot From SAN (BFS) and RSCN storm testing
Solution: SRB timer code was not being canceled correctly if still active during driver shutdown

Problem Description: PSOD occurs during VMID and VVOL testing
Solution: gblDsdList for Command Type 6 IOCBs was not fully protected during command processing

Between versions 3.1.27.0 and 3.1.28.0:

Problem Description: DDV tests during IOVP were failing
Solution: Various fixes and improvements to the driver detach routine as well as memory allocation failure paths.

Problem Description: FW Dump on PSOD was failing when ql2xattemptdumponpanic was set.
Solution: Don't do an ISP abort after fw dump.

Between versions 3.1.28.0 and 3.1.29.0:

Problem Description: On systems with ql2xmqcpuaffinity disabled OR if the number of NUMA nodes is 1, a PSOD could occur during command processing.
Solution: Properly initialize qpair pointer when ql2xcpuaffinity is not enabled in the command submission path.

Between versions 3.1.29.0 and 3.1.30.0:

Problem Description: Target remained OFFLINE during port toggle testing after a FW panic (8002).
Solution: During target session deletion, set FCF_ASYNC_SENT flag to prevent other session management commands from being executed. Also, other improvements to the fcport deletion state handling.
ER146424

Problem Description: I/O errors were seen when driver encountered a FW panic.
Solution: Properly clean up outstanding I/O and populate with correct host status when aborted during 8002 Async Event processing.
ER146425

Problem Description: vmkmgmt output was showing incorrect number of response queues when multi-queue was disabled
Solution: Calculate proper response queue count to display; also show CPU affinity disabled/enabled.
ER146367

Problem Description: Hilda mezz adapters showed incorrect supported speeds
Solution: Do runtime check for Synergy Quartz mezz card - show only 16G supported speed.
ER146323

Problem Description: Mach adapter driver strings showed 64G FC in qlnativefc_devices.py file.
Solution: Base string off of subsystem device ID for better clarity and improve generic Mach string to show 32G/64G support.
ER146372

Between versions 3.1.30.0 and 3.1.31.0:

Problem Description: PSOD was seen during testing on blade servers with 8G and Hilda adapters
Solution: Removed reads from request queue in pointers during normal operation. Also, correctly initialize request queue in pointer address during non multi-queue operation.
ER146525

Problem Description: PSOD during driver load on one NUMA node server.
Solution: Do not attempt to set fw_started bit on qpairs using the queue pair map if it hasn't been allocated.
ER146639

Between versions 3.1.31.0 and 3.1.32.0:

Problem Description: MPI fw dump and reset was done with ISP fw dump and reset, which was unnecessary.
Solution: Improve MPI and ISP fw dump and reset handling to the agreed upon specification. Allow for either dump to be performed separately.
ER146608

Problem Description: Application Services, when enabled (ql2xvmidsupport), was not getting registered on the Brocade switch.
Solution: Enable Application Services FC4 type when sending RFT_ID command.
ER146749

Between versions 3.1.32.0 and 3.1.33.0:

Problem Description: A PSOD would occur when encountering an MPI pause during MPI fw dump handling
Solution: Defer MPI dump collection to DPC thread.
ER146769

Between versions 3.1.33.0 and 3.1.34.0:

Problem Description: FCP targets don't come back online after first cable pull with NVMe enabled.
Solution: Don't re-use IFWCB buffer for N2N PLOGI template. Also, some improvements to N2N handling with NVMe enabled.
ER146763

Problem Description: Prevent panic in vmkmgmt I/O timeout path with verbose logging turned on.
Solution: Check cmd pointer in qlnativefcPrintScsiCmd function before printing to log.

Problem Description: Namespace discovery and path claiming for NVMe storage was not completing and so paths would always show up dead.
Solution: Data underruns from FW with "good" NVMe status should be returned with "good" status to the NVMe layer.
ER146708

Problem Description: I/O timeouts and aborts observed when 8G HBA FW loaded from flash that did not support multiqueue.
Solution: When CPU affinity setup fails, queue pointers need to be re-initialized to correct non-MQ location on 8G adapters.

Problem Description: FCP_RSP frame status qualifier field not supported.
Solution: FCP-4 (referred FCP-4 rev-2b) identifies the earlier known "retry delay timer" field as "status qualifier", which is described in SAM-5 and later specs. This fix makes appropriate driver side modifications to honor the new definition. The SAM document referred to was SAM-6 rev-5.

Problem Description: MPI dump execution taking a long time.
Solution: Check MPI dump flag in timer function when attempting to wake up DPC thread. Also, made improvements to the functions involved in collecting the MPI dump.
ER146608

Problem Description: vmkmgmt key-value info wasn't showing pDIF support on LUNs
Solution: Added pDIF info to the vmkmgmt output for debugging purposes.

Problem Description: It may take more than twenty minutes to unload the driver - ESX7.0 inbox testing.
Solution: Check WaitForHbaReady every 10ms as opposed to every sec (10ms).
DCPN58478

Between versions 3.1.34.0 and 3.1.35.0:

Problem Description: Support for NVMe target reporting to apps in ESX 7.0
Solution: Create functions out of code that assigns target IDs

Problem Description: DDV test failing with PSOD when forcing slab alloc errors on RFT_ID command
Solution: Retries done on CT passthru commands need timer to be restarted. Also, attempt retry even if timer not active.

Problem Description: "Mid-layer underflow" error messages seen in logs as a result of pDIF I/O sent to a non-pDIF LUN.
Solution: Properly protect the data buffer(s) used to parse the Inquiry response for each LUN before determining if pDIF is supported or not. This buffer was not protected, which could cause incorrect Inquiry data to be read, leading to pDIF being enabled incorrectly.
ER146982

Problem Description: Code review - improvement needed with debugging issues when handling errors in response queue processing.
Solution: Force reset and fw dump collection on various response queue error conditions

Between versions 3.1.35.0 and 3.1.36.0:

Problem Description: DDV failures in VMware inbox testing
Solution: Various fixes in DMA alloc failure path and driver detach routine.

Problem Description: SCM support was not getting enabled in FW on Qlipper/Baker adapters.
Solution: FW team had defined bit 10 in FW extended attributes lower to indicate SCM support, which conflicted with secure flash support in Mach adapters. New bit 12 is now used to indicate SCM support.
ER147063

Problem Description: Fabric priority QoS was not working with ql2xfabricpriority module parameter
Solution: Fix initialization of GFO work thread when ql2xfabricpriority is turned on.
ER147121

Between versions 3.1.36.0 and 3.1.37.0:

Problem Description: DDV EILoading test was failing with non-zero heap.
Solution: During driver unload, short circuit RFT_ID/RFF_ID/RNN_ID/RSNN_ID retries in the "done" handler and just clean up the DMA buffers.
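The unload fix above amounts to one extra condition in the completion handler for the name server registration commands. A hedged Python sketch follows; `ct_done`, the `cmd` dictionary fields, and the return strings are all hypothetical and only model the decision, not the driver's data structures.

```python
def ct_done(cmd, unloading):
    """Completion handler model for a registration command (RFT_ID and friends)."""
    if cmd["ok"] or unloading or cmd["retries_left"] == 0:
        # No retry: release the buffer so nothing is left on the heap.
        cmd["dma_buffer"] = None
        return "cleaned up"
    # Normal operation: a failed command is re-issued.
    cmd["retries_left"] -= 1
    return "retry"


failed_cmd = {"ok": False, "retries_left": 3, "dma_buffer": bytearray(64)}
during_run = ct_done(dict(failed_cmd), unloading=False)

unload_cmd = dict(failed_cmd)
during_unload = ct_done(unload_cmd, unloading=True)
```

During unload the failed command is cleaned up at once rather than retried, so its buffer does not outlive the driver.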
Problem Description: Debug print statement in LUN reset path was printing incorrect target and LUN IDs
Solution: Print correct values. No impact for LUN reset handling - cosmetic.

Problem Description: Peg core dump would fail with invalid address error.
Solution: Populate RAM ID field in MB Register 10

Between versions 3.1.37.0 and 3.1.39.0:

Code improvement: Various code improvements after code review.
Resolution: Code improvement when setting up qpair in I/O path.
Relevance: All adapters

Between versions 3.1.39.0 and 3.1.40.0:

DCPN22882: Use function number for logical device creation
Resolution: This will ensure persistent logical device names across driver reload and system reboot.
Relevance: All adapters

Between versions 3.1.40.0 and 3.1.41.0:

ER147415: Driver not relogging in during storage bounce test
Resolution: When the driver receives a login status of "Port ID used", logout the nport handle before attempting a relogin.
Relevance: All adapters

ERXXXXXX: MPI and FW dump processing improvements
Resolution: Improvements to dump processing to conform with additions to specification
Relevance: 27XX adapters and later

ERXXXXXX: Code review - vmkmgmt calls were missing cleanup function
Resolution: Ensure all exit paths call the qlnativefcExitApi routine
Relevance: All adapters

ERXXXXXX: Improvements to enhanced error detection stats collection feature
Resolution: Additions required by customer; also put API into separate source and header files
Relevance: 27XX adapters and later

Between versions 3.1.41.0 and 3.1.42.0:

ER147525: vmkmgmt interface for NVMe hosts shows FCP targets in EED and SCM stats
Resolution: Display the NVMe or FCP target lists based on host vmhba type.
Relevance: 27xx adapters and later

ER147526: ql2xenhancedabort support shows available on ESX 6.7 driver
Resolution: Enhanced abort only supported with NVMe supported drivers - move ql2xenhancedabortsupport to NVME compile switch.
Relevance: 27XX adapters and later

ER147524: PSOD seen when handling SCM peer congestion event
Resolution: System did not have multi-queue enabled, so the check for using slow queue to throttle I/O needs to have a check for multi-queue as well.
Relevance: 27XX adapters and later

ER147606: CPU affinity multi-queue disabled with SCM phase 2 support
Resolution: Slow queue pair implementation was improved to use unique MSI-X vector when initializing
Relevance: 27XX adapters and later

ER147415: Driver not relogging in during storage bounce test, part 2
Resolution: When the driver receives a port logout 8014 Async Event, logout the nport handle before attempting a relogin.
Relevance: All adapters

ERXXXXXX: Use MPI hang trigger to do PEGTUNE halt.
Resolution: Utilize vmkmgmt interface for MPI pause to do a PEGTUNE halt on 83XX adapters
Relevance: 83XX adapters

DCPN65467: PSOD occurs during DDV testing when forcing queue creation failures in CPU affinity setup.
Resolution: Don't disable mqenable flag when CPU affinity qpair creation fails.
Relevance: All adapters

ER147603: Enhanced Error Detection API would return 0 for entry count when retrieving host stats.
Resolution: Populate entry count correctly before returning stat values.
Relevance: 27XX adapters and later

ERXXXXXX: Code review - enhanced abort MB Cmd timeout value.
Resolution: Increase MB Cmd 0x54 timeout value to be greater than FW timeout value.
Relevance: 27XX adapters and later

Between versions 3.1.42.0 and 3.1.43.0:

ER147561: Host side stats were not being updated for FPIN warnings.
Resolution: FPIN descriptor structure had been modified incorrectly for phase 2 of SCM feature. Fix was to revert to correct endianness of original structure
Relevance: 27XX adapters and later

ERXXXXXX: Code review - vmkmgmt MPI flags need updating after MPI reset
Resolution: Capture MPI flags for display after MPI reset. Also, include MPI Reporting support in output.
Relevance: 27XX adapters and later

ER147621: PSOD when NVMe I/O is returned with UNDERRUN.
Resolution: Ensure good NVMe completion status from target before handling UNDERRUN. Also, only return SUCCESS status if data was transferred.
Relevance: 27XX adapters and later

ERXXXXXX: Enhanced abort logging improvements for NVMe.
Resolution: When enhanced abort is used, print Abort IOCB completion and NVMe completion status by default
Relevance: 27XX adapters and later

ER147533: ESX CLI doesn't show UCSCM stats.
Resolution: SCM Phase 2 broke backward compatibility with the Phase 1 interface; bring back old interface and create new API calls for Phase 2
Relevance: 27XX adapters and later

ER146879: Support to clear SCM/SCMR stats.
Resolution: Added an interface in the driver to clear stats via qaucli.
Relevance: 27XX adapters and later

ER147630: SCM throttling increase did not happen as expected after congestion event.
Resolution: Ensure throttle change value is at least 1 after calculating the new value.
Relevance: 27XX adapters and later

ERXXXXXX: Various SCM improvements
Resolution:
* API structure additions to bring in line with latest spec
* SCMR algorithm improvements
* Clear peer congestion after removal from fabric
* SCMR support for NVMe
* Log message improvements
Relevance: 27XX adapters and later

Between versions 3.1.43.0 and 3.1.44.0:

ER147576: PSOD and data inconsistencies preceded by numerous invalid parameter errors in response IOCBs - "Process error entry. type/count/sys/status/comp = 18:2:0:8:1000"
Resolution: Prevent data issues by passing back correct error status when invalid entry error seen on response queue.
Relevance: All adapters

ER147689: MPI reset was timing out and status not showing up with link down.
Resolution: Driver was only grabbing MPI Status info if FW state was ready, i.e. link up. Change was to get this info regardless of FW state.
Relevance: 27xx adapters and later

ERXXXXXX: SCM: Remove call to notify FW of slow device
Resolution: Prevent driver from sending MB Cmd 0x1A (Set Port Params) to notify FW of slow device until this feature is fully supported
Relevance: 27xx adapters and later

ER147733: Enabling ql2xvmidsupport and ql2xenablesmartsan at the same time on a boot from SAN setup caused driver load failures.
Resolution: Prevent both module parameters from being set at the same time - disable ql2xvmidsupport in that event.
Relevance: 27xx adapters and later

Between versions 3.1.44.0 and 3.1.45.0:

ER147768: SCMR: When ql2x_scmr_flow_ctl_tgt and ql2x_scmr_flow_ctl_host are set to 0, congestion state for both target and host would not get cleared.
Resolution: Allow congestion clearing to be called even when not actively throttling.
Relevance: 27xx adapters and later

ERXXXXXX: SCMR: Remove remaining code to notify FW of slow device
Resolution: Some code had been left in from previous attempt. Also, various log message improvements.
Relevance: 27xx adapters and later

Between versions 3.1.45.0 and 3.1.46.0:

ER146879: Alarm and warning counters were not getting cleared in QL_SET_PORT_SCM vmkmgmt callback.
Resolution: Clear counters correctly in callback.
Relevance: 27xx adapters and later

ER147795: Host side congestion was not honoring event period sent in FPIN for when to clear congestion state.
Resolution: Copy over correct FPIN field when determining event period for a host congestion scenario. Also print out correct value of event period.
Relevance: 27xx adapters and later.

Between versions 3.1.45.0 and 3.1.61.0:

ERXXXXXX: Various improvements to Enhanced Error Detection support
Resolution: Short link down counter fix, logging improvements and API callback wait time improvements
Relevance: 27XX adapters and later

ER148040: Performance regression with ESX 7.0 inbox testing (DPCN64008)
Resolution: GPSC failures were leading to IIDMA for target being set to 1GB/s.
Fix was to initialize target port speed to "unknown" so GPSC failure path would not lead to IIDMA being incorrectly set.
Relevance: All adapters

ER147303: Fix response queue handler reading stale packets
Resolution: Two module parameters are introduced (qlx2rsq_follow_inptr and ql2xrspq_follow_inptr_legacy) to control response queue processing logic to follow in pointer rather than signature (on by default)
Relevance: All adapters

ERXXXXXX: Improvement to SAN Congestion Management algorithm
Resolution: Added queue depth based throttling and turned on by default (ql2x_scmr_throttle_mode = 2)
Relevance: 27xx adapters and later

ER148010: Output of "esxcli qlfc qcc port scmchk get" for "Seconds Since Last Event" always shows 0.
Resolution: Keep track of and return elapsed time for each congestion event.
Relevance: 27xx adapters and later

ER147961: Target VP failover/failback shows "Process error entry" messages
Resolution: Mark target port as "OFFLINE" immediately before modifying port ID in driver database
Relevance: All adapters

Between versions 3.1.61.0 and 3.1.62.0:

ER148066: Host side throttling - qdepth value shows negative
Resolution: Call qdepth increment function regardless of whether a throttle attempt was made.
Relevance: 27xx adapters and later

ER148045: CPU lockup occurs during FPIN processing
Resolution: Problem was due to the FPIN processing loop potentially spinning indefinitely. Fix was to wait only for 2 seconds for extra FPIN packets before failing.
Relevance: 27xx adapters and later

ER148082: Host side throttling: seeing continuous FPIN error messages in the logs
Resolution: A new snapshot of the response queue in pointer needed to be read after processing FPIN entries to avoid getting out of sync with firmware and reading stale entries.
Relevance: 27xx adapters and later

ER148097: Host stays in congested state with SCM signals
Resolution: Set correct event and throttle period values when receiving signals.
Also, check for cleared congestion only after handling throttling state.
Relevance: 28xx adapters

Between versions 3.1.62.0 and 3.1.63.0:

ER148066: Q depth throttling messages show negative values
Resolution: Manage qdepth using SRB flag to account for commands that are not counted towards qdepth calculation.
Relevance: 27xx adapters and later

ER148138: Congestion Severity shows None for Signals in esxcli
Resolution: Populate congestion severity when receiving signals based on warnings or alarms.
Relevance: 27xx adapters and later

ER148140: Congestion Severity doesn't reset after clear event.
Resolution: Clear congestion severity either when congestion clears naturally or through an FPIN event.
Relevance: 27xx adapters and later

ERXXXXXX: Certain Enhanced Error Detection API calls need to be allowed to execute when the port is isolated.
Resolution: Changed API interface to allow or not allow EED calls to go through based on requirements. Also, increment target short link down timeout when target logs out from the initiator.
Relevance: 27xx adapters and later

ERXXXXXX: ISP Abort was taking a long time to trigger when MB Cmds were active
Resolution: Reduce MB Cmd timeout to 5 seconds when an ISP abort is needed
Relevance: All adapters

Between versions 3.1.63.0 and 3.1.64.0:

ERXXXXXX: MB Cmd timeouts seen during Enhanced Error Detection port disable/enable testing
Resolution: Don't return from the enable API call until FW is initialized. Also, clear initDone flag with port disabled to allow MB Cmd polling instead of waiting for interrupt.
Relevance: 27xx adapters and later

ER148213: With Host and Peer throttling disabled via module parameter, the system doesn't show congested state.
Resolution: Make calls to set/clear congestion separate from the throttling functions and put in appropriate module parameter check only for throttling
Relevance: 27xx adapters and later

ER148079: DDV EILoading runs for an indefinite amount of time if the datastores are mapped
Resolution: Don't destroy DPC world until after the priority workqueues have been destroyed
Relevance: All adapters

ER148145: Cable unplugged messages are not seen on vmkernel when SFP is present but no cable
Resolution: In timer task, print cable unplugged after loop down time if FW never became ready
Relevance: All adapters

ERXXXXXX: Adapters older than 27xx do not have shadow registers so need to directly read chip registers to determine response in pointer
Resolution: Turn off ql2xrspq_follow_inptr_legacy by default for now to avoid reading response queue in register during interrupt processing on older chips.
Relevance: 25xx and Hilda adapters

Between versions 3.1.64.0 and 3.1.65.0:

ERXXXXXX: Failures seen during DDV testing with 7.0 inbox driver
Resolution: Improvements to handle DDV testing at VMware
Relevance: All adapters

ERXXXXXX: Fabric zone deletion with 1 target did not increment target short link down counter
Resolution: Log out all target ports when GPN_FT fails after 5 attempts.
Relevance: All adapters

ER148223: Logs are continuously flooded with ADISC failure messages during IOM disruptive activities
Resolution: Clear logoutCompleted flag before sending logout after target is removed from fabric - this prevents subsequent logout and ADISC failures in the event of target logging into adapter first.
Relevance: All adapters

Between versions 3.1.65.0 and 3.1.66.0:

ERXXXXXX: The NVMe event list spinlock was at a lower rank than the hardware lock which was used when assigning IDs from the targetIds array, resulting in PSOD during inbox testing at VMware.
Resolution: Create new target ids spinlock to handle target ID assignment with a lower rank than the event list lock.
Relevance: 27xx adapters and later

ERXXXXXX: Enhanced Error Detection port isolation operation could be delayed and/or return busy when driver is performing an ISP abort until after link up occurs
Resolution: Allow disable port API call to be made regardless of DPC thread activity but wait for activity to quiesce. Set port isolated flags early in the call so ISP abort handler can be short-circuited.
Relevance: 27xx adapters and later

Between versions 3.1.66.0 and 3.1.67.0:

DCPN73306: DDV::EILoading test hit a PSOD - ASSERT bora/vmkernel/main/fastslab.c:2230
Resolution: Prevent various work threads from running while the driver is unloading, so that heap memory is completely freed.
Relevance: All adapters

ERXXXXXX: DDV testing resulted in PSOD with driver worlds still active
Resolution: Check status on vmk_WorldWaitForDeath to prevent driver unload with world still active
Relevance: All adapters

ERXXXXXX: ql2xuseshadowregisters wasn't used consistently throughout the code.
Resolution: Use qpair->useShadowReg as the main flag to check if the adapter is shadow register capable AND module parameter is enabled.
Relevance: 27xx adapters and later

ERXXXXXX: Driver was not using proper Slab ID check before destroying slabs during driver unload - found during code review.
Resolution: Add proper check to Slab ID before freeing up slab.
Relevance: All adapters

Between versions 3.1.67.0 and 3.1.68.0:

ERXXXXXX: USCM: Firmware will fail to go ready while waiting for EDC and RDF ELS commands to the switch to complete.
Resolution: Added a feature that enables the driver to control the sending of the EDC and RDF commands. Controlled by module parameter ql2xcontrol_edc_rdf, enabled by default.
Relevance: 27xx adapters and later

Between versions 3.1.68.0 and 3.1.69.0:

FCD-92: USCM: ESX CLI and VI-plug scmstats, scmchk failing.
Resolution: When driver controlled RDF completes successfully, set SCM_FLAG_RDF_COMPLETED in scmFabricConnectionFlags.
Relevance: 27xx adapters and later

FCD-93: USCM: Driver controlled EDC commands failing on Qlipper adapters
Resolution: Driver was populating TX buffer length in EDC ELS passthrough IOCB incorrectly when only CSC descriptor was present. Initialize buffer length to include 8 bytes of opcode and descriptor length regardless of which descriptors are present.
Relevance: 27xx adapters and later

Between versions 3.1.69.0 and 3.1.70.0:

ER148141: USCM: Driver vmkmgmt interface doesn't differentiate FPIN vs signal alarms/warnings in the Host Stats counters display.
Resolution: Introduced separate internal counters to track FPIN vs signal alarms/warnings.
Relevance: 27xx adapters and later

FCD-109: USCM: EDC commands failing on Qlipper adapters
Resolution: Buffer byte count was incorrectly set for adapters that supported LFC descriptor in EDC command. Also, made some improvements to the error handling code.
Relevance: 27xx adapters

FCD-XX: USCM: ql2xcontrol_edc_rdf was missing from vmkmgmt list
Resolution: Add ql2xcontrol_edc_rdf to vmkmgmt module parameter list
Relevance: 27xx adapters and later

FCD-XX: USCM: Log messages show throttling state being cleared whenever target relogin occurred regardless of throttling state
Resolution: Avoid clearing throttling state when already cleared during device resync
Relevance: 27xx adapters and later

DCPN73768: qpair spinlock consumes more CPU cycles on VVOL workload compared to VMFS
Resolution: Various improvements made in the performance path to optimize usage of qpair lock as well as avoiding non-slab memory allocs.
Relevance: All adapters

ER148590: PSOD while powering on VM with NPIV enabled
Resolution: Check vmk_WorldWaitForDeath status on scsiScan world before continuing to execute vport deletion
Relevance: All adapters

Between versions 3.1.70.0 and 3.1.71.0:

FCD-XXX: USCM: Capability to get accurate signal count not utilized.
Resolution: Retrieve signal count from FW using MB Cmd; display in logs for now - later will be incorporated in USCM main scheme.
Relevance: 27xx adapters and later

FCD-153: USCM: Internal congestion alarm/warning counters displayed in vmkmgmt don't get cleared
Resolution: Clear internal alarm and warning counters when the application requests it.
Relevance: 27xx adapters and later

FCD-137/FCD-117: PSOD during DDV testing
Resolution: Check memory buffer allocation was successful before calling qlnativefcGetParamsMbx during driver initialization.
Relevance: 27xx adapters and later

DCPN80597: DDV eiloading test hanging for a long time
Resolution: A new flag is introduced that will trigger a world exit when a world is killed during driver unload. This is done to prevent missed VMK_DEATH_PENDING status when the world function contains multiple wait related system calls.
Relevance: All adapters

FCD-143: PSOD when powering on VMs
Resolution: Removed one aspect of changes made for DCPN73768 as this was leading to two threads accessing the outstanding command array.
Relevance: All adapters

Between versions 3.1.71.0 and 3.1.72.0:

FCD-181: 8G adapter failing login when fw loaded without multi-queue support
Resolution: Error path when driver encounters non-MQ fw did not handle resetting request and response queue pointers to point to proper address. Fix was to ensure the pointer registers are re-initialized properly if firmware does not support multi-queue.
Relevance: 25xx adapters

DCPN78849: PSOD encountered during DDV testing in qlnativefcDmaUnmap call
Resolution: Perform DMA buffer NULL check before deallocating the buffer.
Relevance: All adapters

DCPN78850: PSOD in qlnativefcFreeSpPool during DDV eiloading testing
Resolution: Ensure all SRBs queued up for deferred work are freed before destroying the memory slab
Relevance: All adapters

FCD-XXX: USCM: Remove signal count message in log added in last driver version.
Resolution: Deferred until next phase of USCM support
Relevance: 27xx adapters and later

FCD-243: USCM: EDC/RDF reject response code shows 1 for all error conditions
Resolution: Print out the correct field in the response frame. Also, made some log message improvements.
Relevance: 27xx adapters and later

FCD-236: vmhba info not in stats section of vmkmgmt output
Resolution: Added vmhba info to SCM_STATS and EED_STATS output
Relevance: All adapters

ER148589: PSOD seen during loopback testing
Resolution: Buffer used for command passthrough and loopback testing was shared without protection from multiple threads making API calls. Fix was to protect the buffer using a semaphore.
Relevance: All adapters

Between versions 3.1.72.0 and 3.1.73.0:

FCD-XXX: Debug code caused 8G adapter to fail setting up CPU affinity qpairs
Resolution: Remove debug code in driver
Relevance: 25xx adapters

Between versions 3.1.73.0 and 3.1.74.0:

FCD-278: PSOD observed while running DDV EILoading
Resolution: Semaphore to protect vmk API buffer needs to be checked for NULL before destroying in detach routine.
Relevance: All adapters

FCD-277: PSOD observed while running DeviceStateChange IOVP test
Resolution: Ensure all DMA buffers used in work objects get cleaned up in detach function
Relevance: All adapters

FCD-298: PSOD observed while running DDV EILoading
Resolution: Ensure all DMA buffers used in work objects get cleaned up in detach function; key off the type field instead of buffer pointer values
Relevance: All adapters

Between versions 3.1.74.0 and 3.1.83.0:

None.

Between versions 3.1.83.0 and 3.1.84.0:

FCD-396: Spinlock rank PSOD with ESX beta build
Resolution: Ensure proper lock rank when creating various spinlocks
Relevance: All adapters

FCD-439: No info about encrypted target ports in FC Target list in vmkmgmt output.
Resolution: Added StorCryption target port info in list
Relevance: 28xx adapters and later

FCD-438: StorCryption management port shows up in vmkmgmt output in Non-Target FC Port list
Resolution: Only print out whether StorCryption management port is enabled/disabled; remove from non-target list
Relevance: 28xx adapters and later

FCD-440: StorCryption debug logging messages showing up in standard trace
Resolution: Remove StorCryption debug messages - no longer needed with stable driver
Relevance: 28xx adapters and later

FCD-425: Unable to disable StorCryption support in driver from module parameter
Resolution: Added ql2xstorcryption module parameter (enabled by default)
Relevance: 28xx adapters and later

FCD-432: Not able to see any encrypted LUN/LUN information from esxcli qlfc
Resolution: Allow SCSI Test Unit Ready commands from application to execute without enabling encryption in IOCB.
Relevance: All adapters

FCD-251: Loopback failures with NVMe targets connected
Resolution: Due to NVMe exchanges being persistently active for Async Event Requests, the FW requires the driver to build the Echo payload itself for the Diagnostic Echo MB Cmd rather than having the FW send a vendor-unique Echo command.
Relevance: All adapters

Between versions 3.1.84.0 and 3.1.85.0:

FCD-236: MBI version info missing from vmkmgmt output
Resolution: Driver records MBI version during start of day and prints MBI version in vmkmgmt output
Relevance: 26xx adapters and later

FCD-421: PSOD while running Driver Load Unload IOVP Test
Resolution: Properly clean up buffers used for StorCryption in driver unload and buffer allocation failure paths.
Relevance: 27xx adapters and later

Between versions 3.1.85.0 and 3.1.86.0:

FCD-XXX: "-debug" string shown in driver version
Resolution: ql2xextended_error_logging was inadvertently enabled by default.
Relevance: All adapters

FCD-612: PSOD during driver unload with ESX 6.7 beta build.
Resolution: Remove extra deallocation of StorCryption bit vectors.
Relevance: All adapters

Between versions 3.1.86.0 and 3.1.87.0:

FCD-641: Debug messages printed in FCP-SCSI I/O path for non-standard SCSI read/write opcodes.
Resolution: Remove debug messages
Relevance: All adapters.

2. Known Issues

* Setting ql2xattemptdumponpanic=1 not supported if dump partition is on SAN

3. Notices

Information furnished in this document is believed to be accurate and reliable. However, Marvell, Inc. assumes no responsibility for its use, nor for any infringements of patents or other rights of third parties which may result from its use. Marvell, Inc. reserves the right to change product specifications at any time without notice. Applications described in this document for any of these products are only for illustrative purposes. Marvell, Inc. makes no representation nor warranty that such applications are suitable for the specified use without further testing or modification. Marvell, Inc. assumes no responsibility for any errors that may appear in this document.

4. Contacting Support

For further assistance, contact Marvell Technical Support at:
http://support.marvell.com

(c) Copyright 2021. All rights reserved worldwide. Marvell, Inc, the Marvell logo, and the Powered by Marvell logo are registered trademarks of Marvell, Inc. All other brand and product names are trademarks or registered trademarks of their respective owners.