NetApp Stage KB

Multi-disk/plex failure in a Fabric MetroCluster

Visibility: Public
Category: metrocluster
Specialty: MetroCluster

Applies to

  • Fabric MetroCluster
  • ONTAP 9

Issue

Both sites in a MetroCluster experience multiple remote disk failures, eventually followed by plex failures.

Example:

Mon May 17 13:40:32 UTC [SiteA-02: cfdisk_config: cf.disk.skipped:notice]: The disk FC_switch_B_1:9.126L514 was skipped because it reported the status adapter error prevents command from being sent to device.

Mon May 17 14:00:00 UTC [SiteA-02: config_thread: raid.config.check.failedPlex:error]: Plex /SiteA-aggr-02/plex1 has failed.

Mon May 17 14:03:32 UTC [SiteA-02: mgwd: callhome.hm.alert.major:alert]: Call home for Health Monitor process schm: RaidDegradedMirrorAggrAlert[00000000-0000-0000-0000-000000000000].
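
When scanning a large EMS log for the events shown above, it can help to split each line into its fields. The following is a minimal sketch, not a NetApp tool: it assumes every line follows the layout seen in this example (timestamp, then `[node: process: event.name:severity]:`, then the message), and the names `EMS_LINE`, `EmsEvent`, and `parse_ems_line` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, taken only from the example lines in this article:
#   <timestamp> [<node>: <process>: <event.name>:<severity>]: <message>
EMS_LINE = re.compile(
    r"^(?P<timestamp>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \w+) "
    r"\[(?P<node>[^:\]]+): (?P<process>[^:\]]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)


class EmsEvent(NamedTuple):
    timestamp: str
    node: str
    process: str
    event: str
    severity: str
    message: str


def parse_ems_line(line: str) -> Optional[EmsEvent]:
    """Return the parsed fields, or None if the line does not match the assumed layout."""
    match = EMS_LINE.match(line.strip())
    return EmsEvent(**match.groupdict()) if match else None


if __name__ == "__main__":
    sample = (
        "Mon May 17 14:00:00 UTC [SiteA-02: config_thread: "
        "raid.config.check.failedPlex:error]: Plex /SiteA-aggr-02/plex1 has failed."
    )
    parsed = parse_ems_line(sample)
    if parsed is not None:
        print(parsed.node, parsed.event, parsed.severity)
```

Filtering the parsed events on names such as `cf.disk.skipped` or `raid.config.check.failedPlex` is one way to check whether a log shows the same pattern as this example.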

FC-VI resets are also observed.

Example:

Wed May 19 13:11:48 UTC [SiteA-02: ispfcvi2500_port2: fcvi.qlgc.rmt.link.down:notice]: FC-VI adapter: Link to partner node over port 0f is down. Partner port id = 0x60400, partner node's system id = 000000000. 
Wed May 19 13:11:49 UTC [SiteA-02: ispfcvi2500_main2: pfo.failover.start.error:debug]: params: {'partner': 'DR Primary Partner', 'reason': 'Failover not enabled', 'client': 'WAFL', 'port': '1', 'trigger': 'IPFO_TRIG_BAD_COMPL'}
Wed May 19 13:11:49 UTC [SiteA-02: wafl_exempt03: fcvi.qlgc.ioErr:error]: FC-VI adapter: FCVI driver on port 0f received IO error. Status = Invalid VI state(status code = 0x10c), FCVI opcode = Write Request(0x1), QP name = WAFL, QP index = 3, Remote node's system id = 000000000.
Wed May 19 13:11:49 UTC [SiteA-02: wafl_exempt03: mirror.stream.qp.error:debug]: params: {'error': 'NVMM_ERR_POLL', 'qp_name': 'WAFL', 'mirror': 'DR PARTNER'}
Wed May 19 13:11:49 UTC [SiteA-02: wafl_exempt03: nvmm.mirror.aborting:debug]: mirror of sysid 2, partner_type DR PARTNER and mirror state NVMM_MIRROR_ONLINE is aborted because of reason NVMM_ERR_POLL.
Wed May 19 13:11:49 UTC [SiteA-02: fcvi_cm: ems.engine.suppressed:debug]: Event 'ic.rdma.qpDisconnected' suppressed 5 times in last 171775 seconds.
Wed May 19 13:11:49 UTC [SiteA-02: fcvi_cm: ic.rdma.qpDisconnected:debug]: WAFL is disconnected.
Wed May 19 13:11:49 UTC [SiteA-02: fcvi_cm: ems.engine.suppressed:debug]: Event 'ic.rdma.qpConnected' suppressed 11 times in last 171181 seconds.
Wed May 19 13:11:49 UTC [SiteA-02: fcvi_cm: ic.rdma.qpConnected:debug]: WAFL is connected.
Wed May 19 13:11:49 UTC [SiteA-02: fcvi_cm: rdma.rlib.connected:debug]: WAFL QP is now connected.
Wed May 19 13:11:49 UTC [SiteA-02: nvmm_helper: nvpm.state.changed:debug]: Node 2's NVPM state changed from "2" to "2".
Wed May 19 13:11:49 UTC [SiteA-02: wafl_exempt02: mirror.stream.qp.error:debug]: params: {'error': 'NVMM_ERR_POLL', 'qp_name': 'WAFL', 'mirror': 'HA Partner'}
Wed May 19 13:11:49 UTC [SiteA-02: wafl_exempt02: nvmm.mirror.aborting:debug]: mirror of sysid 1, partner_type HA Partner and mirror state NVMM_MIRROR_ONLINE is aborted because of reason NVMM_ERR_POLL.
Wed May 19 13:11:49 UTC [SiteA-02: mcc_cfd_rnic: nvmm.mirror.aborting:debug]: mirror of sysid 3, partner_type AUXDR PARTNER and mirror state NVMM_MIRROR_LAYOUT_SYNCED is aborted because of reason NVMM_ERR_STREAM.
Wed May 19 13:11:49 UTC [SiteA-02: mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: {'error': 'NVMM_ERR_STREAM', 'qp_name': 'RAID', 'mirror': 'HA Partner'}
Wed May 19 13:11:49 UTC [SiteA-02: mcc_cfd_rnic: mirror.stream.qp.error:debug]: params: {'error': 'NVMM_ERR_STREAM', 'qp_name': 'MISC', 'mirror': 'HA Partner'}
Wed May 19 13:11:49 UTC [SiteA-02: MCC_DR_Callout: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 0f. QP name = RAID, QP index = 4, Remote node's system id = 000000000. 
Wed May 19 13:11:49 UTC [SiteA-02: MCC_DR_Callout: fcvi.qlgc.sent.disconnect:notice]: FC-VI adapter: Disconnect request sent on port 0f. QP name = MISC, QP index = 10, Remote node's system id = 000000000. 
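
Because an FC-VI reset produces many lines within a second or two, a per-event tally can make the sequence easier to read. The sketch below is illustrative only: the function name `count_events` and the default prefix list are assumptions chosen to match the event names in the excerpt above, and it assumes the same bracketed line layout.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Matches the bracketed header "[node: process: event.name:severity]:" (assumed layout).
EVENT_HEADER = re.compile(
    r"\[[^:\]]+: [^:\]]+: (?P<event>[\w.]+):(?P<severity>\w+)\]:"
)


def count_events(
    lines: Iterable[str],
    prefixes: Tuple[str, ...] = ("fcvi.", "nvmm.", "mirror.", "ic.rdma."),
) -> Counter:
    """Count event names that start with one of the given prefixes."""
    counts: Counter = Counter()
    for line in lines:
        match = EVENT_HEADER.search(line)
        if match and match.group("event").startswith(prefixes):
            counts[match.group("event")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_events.py < ems.log
    for event, total in count_events(sys.stdin).most_common():
        print(f"{total:6d}  {event}")
```

Lines such as `ems.engine.suppressed` are skipped by the default prefixes even though their message text mentions an `ic.rdma` event, because only the event name in the bracketed header is inspected.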

