NetApp Stage KB

AUTO-ARCHIVED: Disks fail aggressively in a single disk stack

Visibility:
Internal
Category:
metrocluster
Specialty:
metrocluster
Applies to

  • ONTAP 9
  • Fabric MetroCluster
  • ATTO FB7500N

Issue

Multiple disks in a single disk stack fail within a short period of time.
Example:
ClusterA::> storage disk show -broken
Original Owner: ClusterB-01
  Checksum Compatibility: block
                                                Drawer                            Usable Physical
    Disk            Outage Reason  HA Shelf Bay  /Slot Chan   Pool  Type    RPM     Size     Size
    --------------- ------------- --- ----- --- ------ ---- ------ ----- ------ -------- --------
    1.51.22         failed        11b    51  22   -/-     A FAILED   SSD      -        -   6.99TB
Original Owner: ClusterB-01
  Checksum Compatibility: block
                                                Drawer                            Usable Physical
    Disk            Outage Reason  HA Shelf Bay  /Slot Chan   Pool  Type    RPM     Size     Size
    --------------- ------------- --- ----- --- ------ ---- ------ ----- ------ -------- --------
    1.51.15         failed        11b    51  15   -/-     A FAILED   SSD      -        -  894.3GB
Original Owner: ClusterA-01
  Checksum Compatibility: block
                                                Drawer                            Usable Physical
    Disk            Outage Reason  HA Shelf Bay  /Slot Chan   Pool  Type    RPM     Size     Size
    --------------- ------------- --- ----- --- ------ ---- ------ ----- ------ -------- --------
    1.51.6          failed         1d    51   6   -/-     B FAILED   SSD      -  894.0GB  894.3GB
    1.51.16         failed         1d    51  16   -/-     B FAILED   SSD      -   6.99TB   6.99TB
    1.51.17         failed         1d    51  17   -/-     B FAILED   SSD      -   6.99TB   6.99TB
    1.51.18         failed         1d    51  18   -/-     B FAILED   SSD      -   6.99TB   6.99TB
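
To confirm that the failures are concentrated in one stack, the broken-disk listing can be grouped by shelf ID. The following is a minimal sketch, assuming the output of `storage disk show -broken` has been saved as text in the column layout shown above. The function name `broken_disks_by_shelf` and the regular expression are illustrative assumptions, not part of any NetApp tooling.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Matches a disk row such as:
#   1.51.22   failed   11b   51  22   -/-   A FAILED   SSD ...
# Capture groups: disk name, outage reason, HA path, shelf, bay.
DISK_ROW = re.compile(
    r"^\s*(?P<disk>\d+\.\d+\.\d+)\s+(?P<reason>\S+)\s+(?P<ha>\S+)"
    r"\s+(?P<shelf>\d+)\s+(?P<bay>\d+)\s"
)


def broken_disks_by_shelf(cli_output: str) -> Dict[str, List[str]]:
    """Group broken disk names by shelf ID (hypothetical helper)."""
    by_shelf = defaultdict(list)
    for line in cli_output.splitlines():
        match = DISK_ROW.match(line)
        if match:  # header, separator, and owner lines do not match
            by_shelf[match.group("shelf")].append(match.group("disk"))
    return dict(by_shelf)


# Rows copied from the example output above.
SAMPLE = """\
Original Owner: ClusterB-01
  Checksum Compatibility: block
    1.51.22         failed        11b    51  22   -/-     A FAILED   SSD      -        -   6.99TB
Original Owner: ClusterA-01
  Checksum Compatibility: block
    1.51.6          failed         1d    51   6   -/-     B FAILED   SSD      -  894.0GB  894.3GB
    1.51.16         failed         1d    51  16   -/-     B FAILED   SSD      -   6.99TB   6.99TB
"""

if __name__ == "__main__":
    for shelf, disks in broken_disks_by_shelf(SAMPLE).items():
        print(f"shelf {shelf}: {len(disks)} broken disk(s): {', '.join(disks)}")
```

If every broken disk lands under the same shelf key, as it does for the sample rows, the failures are confined to one stack rather than spread across the cluster.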
 
 
A significant number of errors are found in the event log of one of the ATTO bridges.
Example:
INFO FC TM Cmd Rcvd: Abort Task Set to LUN:27 on FC Port 1
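
To gauge how many such task-management events a bridge has logged, the event log can be tallied per FC port. This is a minimal sketch, assuming the log has been exported as plain text and that each event follows the wording of the example line above. The function name `count_abort_task_sets` is a hypothetical helper, and other event formats in a real log are simply ignored.

```python
import re
from collections import Counter

# Matches the event wording shown in the example above, e.g.
#   INFO FC TM Cmd Rcvd: Abort Task Set to LUN:27 on FC Port 1
ABORT_EVENT = re.compile(r"Abort Task Set to LUN:(?P<lun>\d+) on FC Port (?P<port>\d+)")


def count_abort_task_sets(log_text: str) -> Counter:
    """Count 'Abort Task Set' events per FC port (hypothetical helper)."""
    per_port = Counter()
    for line in log_text.splitlines():
        match = ABORT_EVENT.search(line)
        if match:
            per_port[int(match.group("port"))] += 1
    return per_port


if __name__ == "__main__":
    # The single event line from the example above.
    log = "INFO FC TM Cmd Rcvd: Abort Task Set to LUN:27 on FC Port 1\n"
    for port, total in sorted(count_abort_task_sets(log).items()):
        print(f"FC port {port}: {total} Abort Task Set event(s)")
```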
 
Fibre Channel error counters on the ATTO bridge are increasing.
Example:
; Fibre Channel Error Counts
; Port | Link Failures | Sync Loss | Signal Loss | Invalid Tx | Invalid CRC
;==========================================================================
   1                 1           2             0           16          4796
   2                 1           1             0            4             0
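
Whether the counters are increasing can only be judged by comparing two captures taken some time apart. The following is a minimal sketch, assuming each capture is saved as text in the layout shown above. The function names `parse_fc_error_counts` and `counter_deltas` are hypothetical helpers, not bridge or ONTAP commands.

```python
from typing import Dict

Counts = Dict[int, Dict[str, int]]


def parse_fc_error_counts(text: str) -> Counts:
    """Parse the error-count table into {port: {counter name: value}}."""
    columns = []
    counts = {}
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(";"):
            body = stripped.lstrip(";").strip()
            if "|" in body:  # the header row carries the column names
                columns = [name.strip() for name in body.split("|")]
            continue
        fields = stripped.split()
        # A data row has one integer per header column; first column is the port.
        if columns and len(fields) == len(columns) and all(f.isdigit() for f in fields):
            values = [int(f) for f in fields]
            counts[values[0]] = dict(zip(columns[1:], values[1:]))
    return counts


def counter_deltas(before: Counts, after: Counts) -> Counts:
    """Return the per-port change in each counter between two captures."""
    deltas = {}
    for port, row in after.items():
        previous = before.get(port, {})
        deltas[port] = {name: value - previous.get(name, 0) for name, value in row.items()}
    return deltas


# Capture copied from the example above.
SAMPLE = """\
; Fibre Channel Error Counts
; Port | Link Failures | Sync Loss | Signal Loss | Invalid Tx | Invalid CRC
;==========================================================================
   1                 1           2             0           16          4796
   2                 1           1             0            4             0
"""

if __name__ == "__main__":
    for port, row in parse_fc_error_counts(SAMPLE).items():
        print(f"port {port}: {row}")
```

Passing an earlier and a later capture to `counter_deltas` gives the growth per counter; in the sample capture, the Invalid CRC count on port 1 is the value that stands out.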

