E-Series volume failure on StorageGRID system due to DCM reset on drawer
Applies to
- NetApp E-Series
- SANtricity OS 11.90R3
- StorageGRID SG6060 appliance
Issue
- An AutoSupport-generated case is automatically created for E-Series Notification (Volume failure) CRITICAL
- Recovery Guru reports multiple volumes in a FAILED state:
Failure Entry 2: FAILED_VOLUME-Recovery Failure Type Code: 17
Storage array: array_name
Disk pool: StorageGRID-disk-pool99
Status: Failed
Shelf: Controller/Drive shelf 99, Drawer 2
Affected drive bay(s): 9, 4, 2, 6, 3, 1, 7, 0, 10, 8, 11, 5
Service action (removal) allowed: No
Service action LED on component: No
Volumes: StorageGRID-PGE-Backup, StorageGRID-SG-OS, StorageGRID-obj-00, StorageGRID-obj-01, StorageGRID-obj-02, StorageGRID-obj-03, StorageGRID-obj-04, StorageGRID-obj-05, StorageGRID-obj-06, StorageGRID-obj-07, StorageGRID-obj-08, StorageGRID-obj-09, StorageGRID-obj-10, StorageGRID-obj-11, StorageGRID-obj-12, StorageGRID-obj-13, StorageGRID-obj-14, StorageGRID-obj-15
- STORAGE-ARRAY-PROFILE.txt > SHELVES reports all drives in the affected drawer as BY-PASSED
- The Major Event Log reports Destination Driver errors on Channel 1 and Channel 2 for all drives in the affected drawer:
B:1/1/26, 6:13:44 PM (18:13:44) 25255 1012 Destination driver error - Shelf 99, Drawer 2, Bay 1> Fail Reason: Last Error, Hid: Device fail timeout (all ITNs to device have been disconnected for too long) - Channel 1-> Error#1: Hid: Device fail timeout (all ITNs to device have been disconnected for too long), Ch:1, Next:Fail command;
A:1/1/26, 6:13:43 PM (18:13:43) 25335 1012 Destination driver error - Shelf 99, Drawer 2, Bay 11> Fail Reason: Last Error, Hid: Device fail timeout (all ITNs to device have been disconnected for too long) - Channel 2-> Error#3: Hid: Device fail timeout (all ITNs to device have been disconnected for too long), Ch:2, Next:Fail command;
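To check whether such errors cluster on one drawer rather than on individual drives, the Major Event Log entries can be grouped by shelf and drawer. The following is a minimal sketch, not a NetApp tool: the regular expression is inferred only from the two sample entries shown here, and the names `MEL_LINES`, `PATTERN`, and `summarize` are hypothetical.

```python
import re
from collections import defaultdict

# Sample entries copied from the Major Event Log excerpt in this article.
MEL_LINES = [
    "B:1/1/26, 6:13:44 PM (18:13:44) 25255 1012 Destination driver error - "
    "Shelf 99, Drawer 2, Bay 1> Fail Reason: Last Error, Hid: Device fail timeout "
    "(all ITNs to device have been disconnected for too long) - Channel 1-> "
    "Error#1: Hid: Device fail timeout (all ITNs to device have been disconnected "
    "for too long), Ch:1, Next:Fail command;",
    "A:1/1/26, 6:13:43 PM (18:13:43) 25335 1012 Destination driver error - "
    "Shelf 99, Drawer 2, Bay 11> Fail Reason: Last Error, Hid: Device fail timeout "
    "(all ITNs to device have been disconnected for too long) - Channel 2-> "
    "Error#3: Hid: Device fail timeout (all ITNs to device have been disconnected "
    "for too long), Ch:2, Next:Fail command;",
]

# Assumed layout (based only on the samples above):
# "<controller>:<timestamp> ... Destination driver error - Shelf N, Drawer N, Bay N ... Channel N"
PATTERN = re.compile(
    r"^(?P<controller>[AB]):.*?Destination driver error - "
    r"Shelf (?P<shelf>\d+), Drawer (?P<drawer>\d+), Bay (?P<bay>\d+)"
    r".*?Channel (?P<channel>\d+)"
)


def summarize(lines):
    """Group destination driver errors by (shelf, drawer)."""
    summary = defaultdict(lambda: {"bays": set(), "channels": set()})
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not a destination driver error in the assumed format
        key = (int(match["shelf"]), int(match["drawer"]))
        summary[key]["bays"].add(int(match["bay"]))
        summary[key]["channels"].add(int(match["channel"]))
    return dict(summary)


if __name__ == "__main__":
    for (shelf, drawer), info in summarize(MEL_LINES).items():
        print(f"Shelf {shelf}, Drawer {drawer}: "
              f"bays {sorted(info['bays'])}, channels {sorted(info['channels'])}")
```

When every bay of a single drawer appears and both channels are represented, the pattern is consistent with a drawer-level component problem rather than individual drive failures.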
- TRAY-COMPONENT-STATE-CAPTURE.7z reports a DCM reset on the affected drawer:
Debug data for RBOD DCM-A drawer 1 (Drawer 2)
1|00-00:00:00.00 BS: I2C READY
1|00-00:00:00.00 BS: GPIO READY
1|01-00:00:00.00 I2C: i2cWritePCA9570Bit writing 0xFB
1|01-00:00:00.00 I2C: i2cWritePCA9570Bit writing 0xFB
1|01-00:00:00.00 I2C: i2cInitMAX7310 4/4 devices initialized (failed bus mask 0x00)
1|01-00:00:00.00 NOTE: DCM was reset by: Power On
17|01-00:00:09.52 SES: WARNING: DCM_PATH_FAIL Error Fault is ACTIVE!
17|01-00:01:10.87 SES: NOTE: DCM_PATH_FAIL Error Fault is no longer ACTIVE!
17|01-00:01:10.91 SES: NOTE: DCM ATTN LED is OFF!
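The DCM debug data can be scanned for the reset reason and for how long the DCM_PATH_FAIL fault stayed active. The sketch below is illustrative only: it assumes the timestamp has the form `<prefix>-HH:MM:SS.ss` as in the excerpt, treats the prefix as opaque (its meaning is not documented here), and compares two entries only when their prefixes match. The names `DCM_LOG` and `analyze` are hypothetical.

```python
import re

# Excerpt copied from the DCM debug data in this article.
DCM_LOG = """\
Debug data for RBOD DCM-A drawer 1 (Drawer 2)
1|00-00:00:00.00 BS: I2C READY
1|00-00:00:00.00 BS: GPIO READY
1|01-00:00:00.00 I2C: i2cWritePCA9570Bit writing 0xFB
1|01-00:00:00.00 I2C: i2cWritePCA9570Bit writing 0xFB
1|01-00:00:00.00 I2C: i2cInitMAX7310 4/4 devices initialized (failed bus mask 0x00)
1|01-00:00:00.00 NOTE: DCM was reset by: Power On
17|01-00:00:09.52 SES: WARNING: DCM_PATH_FAIL Error Fault is ACTIVE!
17|01-00:01:10.87 SES: NOTE: DCM_PATH_FAIL Error Fault is no longer ACTIVE!
17|01-00:01:10.91 SES: NOTE: DCM ATTN LED is OFF!
"""

# Assumed timestamp layout: <prefix>-HH:MM:SS.ss (prefix meaning unknown).
TS = r"(\d{2})-(\d{2}):(\d{2}):(\d{2}\.\d{2})"


def _seconds(match):
    """Convert the HH:MM:SS.ss part of a timestamp match to seconds."""
    hours, minutes, secs = match.group(2, 3, 4)
    return int(hours) * 3600 + int(minutes) * 60 + float(secs)


def analyze(text):
    """Return the DCM reset reason and the DCM_PATH_FAIL active window."""
    reset = re.search(r"DCM was reset by: ([A-Za-z ]+)", text)
    active = re.search(
        TS + r" SES: WARNING: DCM_PATH_FAIL Error Fault is ACTIVE!", text)
    cleared = re.search(
        TS + r" SES: NOTE: DCM_PATH_FAIL Error Fault is no longer ACTIVE!", text)

    duration = None
    # Only compare timestamps that share the same (opaque) prefix.
    if active and cleared and active.group(1) == cleared.group(1):
        duration = round(_seconds(cleared) - _seconds(active), 2)

    return {
        "reset_reason": reset.group(1).strip() if reset else None,
        "path_fail_seconds": duration,
    }


if __name__ == "__main__":
    print(analyze(DCM_LOG))
```

For this excerpt the sketch reports a "Power On" reset and a path-fail window of roughly 61 seconds, which may help when correlating the DCM reset with the device fail timeouts in the Major Event Log.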
