AWS network latency causes a disk to stop responding and takes the aggregate offline
Applies to
- Amazon Web Services (AWS)
- Cloud Volumes ONTAP (CVO)
Issue
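Cloud Volumes ONTAP reports an aggregate as failed because one of its disks has failed. The volumes on that aggregate are offline and their data is not being served. In the example output, aggr3 is in the failed state and its RAID group is missing one disk.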
cluster1::*> aggr show
Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr1      42.52TB   27.71TB   35% online      14 cluster1-01      raid0, normal
aggr2      42.52TB   12.25TB   71% online      26 cluster1-01      raid0, normal
aggr3           0B        0B    0% failed       0 cluster1-01      raid0, partial
cluster1::*> aggr show-status aggr3
Aggregate aggr3 (failed, raid0, partial) (advanced_zoned checksums)
Plex /aggr3/plex0 (offline, failed, inactive)
RAID group /aggr3/plex0/rg0 (partial, advanced_zoned checksums)
RAID Disk Device  HA  SHELF BAY CHAN Pool Type   RPM   Used (MB/blks)      Phys (MB/blks)
--------- ------  ------------- ---- ---- ----   ----- --------------      --------------
data      0b.2    0b    -   -   SA:A -    VMDISK N/A   8257528/16911419264 8388607/17179869176
data      FAILED                                 N/A   8257528/ -
data      0b.1    0b    -   -   SA:A -    VMDISK N/A   8257528/16911419264 8388607/17179869176
data      0b.10   0b    -   -   SA:A -    VMDISK N/A   8257528/16911419264 8388607/17179869176
data      0b.14   0b    -   -   SA:A -    VMDISK N/A   8257528/16911419264 8388607/17179869176
data      0b.12   0b    -   -   SA:A -    VMDISK N/A   8257528/16911419264 8388607/17179869176
Raid group is missing 1 disk.
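To spot this symptom in captured CLI output, the checks above can be sketched in Python. This is a minimal illustration, not a NetApp tool: it assumes the output of `aggr show` and `aggr show-status` has already been saved as text and that the column layout matches the example above. The function names `find_unhealthy_aggregates` and `count_failed_disks` are hypothetical.

```python
def find_unhealthy_aggregates(aggr_show_output):
    """Return (name, state) pairs for aggregates whose State column is not 'online'.

    Assumes the column order shown in this article:
    Aggregate, Size, Available, Used%, State, #Vols, Nodes, RAID Status.
    """
    unhealthy = []
    in_table = False
    for line in aggr_show_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The dashed separator row marks the start of the data rows.
        if set(fields[0]) == {"-"}:
            in_table = True
            continue
        if not in_table or len(fields) < 7:
            continue
        name, state = fields[0], fields[4]
        if state != "online":
            unhealthy.append((name, state))
    return unhealthy


def count_failed_disks(show_status_output):
    """Count RAID group rows whose Device column reads FAILED."""
    count = 0
    for line in show_status_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "FAILED":
            count += 1
    return count


# Sample text copied from the output in this article (whitespace is not significant).
SAMPLE_AGGR_SHOW = """\
cluster1::*> aggr show
Aggregate Size Available Used% State #Vols Nodes RAID Status
--------- -------- --------- ----- ------- ------ ---------------- ------------
aggr1 42.52TB 27.71TB 35% online 14 cluster1-01 raid0, normal
aggr2 42.52TB 12.25TB 71% online 26 cluster1-01 raid0, normal
aggr3 0B 0B 0% failed 0 cluster1-01 raid0, partial
"""

SAMPLE_SHOW_STATUS = """\
data 0b.2 0b - - SA:A - VMDISK N/A 8257528/16911419264 8388607/17179869176
data FAILED N/A 8257528/ -
data 0b.1 0b - - SA:A - VMDISK N/A 8257528/16911419264 8388607/17179869176
"""

if __name__ == "__main__":
    print(find_unhealthy_aggregates(SAMPLE_AGGR_SHOW))
    print(count_failed_disks(SAMPLE_SHOW_STATUS))
```

Splitting on whitespace keeps the sketch tolerant of column padding, but it would misread an aggregate row that wraps onto two lines; treat the result as a hint and confirm it against the full CLI output.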