NetApp Stage KB

Rebooting SP due to loss of ACP comms

Visibility: Public
Category: fas-systems
Specialty: HW

Applies to

  • ONTAP 9
  • Service Processor (SP)

Issue

  • The SP reboots after an ACP alert clears. Example EMS log:

[node_name-01: dsa_worker1: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 2: critical status    ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top right, on shelf module B.
[node_name-01: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[node_name-01: monitor: monitor.globalStatus.critical:EMERGENCY]: Disk shelf fault.
[node_name-01: dsa_worker2: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 1: critical status    ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top left, on shelf module A.
[node_name-01: dsa_worker2: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status.
[node_name-01: splog_main: splog.running.normally:info]: Process splogd is operating normally.
[node_name-01: dsa_worker1: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status.
[node_name-01: statd: monitor.shelf.fault.ok:notice]: Fault previously reported on disk storage shelf attached to channel 0a has been corrected.
[node_name-01: monitor: monitor.globalStatus.ok:notice]: The system's global status is normal.

[node_name-02: statd: monitor.shelf.fault:alert]: Critical fault reported on disk storage shelf attached to channel 0a. Check fans, power supplies, disks, and temperature sensors.
[node_name-02: monitor: monitor.globalStatus.critical:EMERGENCY]: Disk shelf fault.
[node_name-02: dsa_worker1: ses.status.ACPError:alert]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor error for SAS shelf ACP processor 1: critical status    ; Alternate Control Path hardware failed This module is on the rear of the shelf at the top left, on shelf module A.
[node_name-02: splog_main: splog.running.normally:info]: Process splogd is operating normally.
[node_name-02: dsa_worker3: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 2: normal status.
[node_name-02: dsa_worker2: ses.status.ACPInfo:info]: DS2246 (S/N SHFHU0123456789) shelf 0 on channel 0a ACP Processor information for SAS shelf ACP processor 1: normal status.
[node_name-02: statd: monitor.shelf.fault.ok:notice]: Fault previously reported on disk storage shelf attached to channel 0a has been corrected.
[node_name-02: monitor: monitor.globalStatus.ok:notice]: The system's global status is normal.
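
The EMS excerpt above follows a regular `[node: process: event:severity]: message` layout, so the ACP error and recovery events can be picked out with a script. The following is a minimal sketch that assumes that layout holds for every line. The names `parse_ems_line` and `acp_events` are hypothetical helpers, not part of ONTAP.

```python
import re

# Bracketed header as it appears in the sample log above:
# [node: process: event.name:severity]: message
EMS_LINE = re.compile(
    r"^\[(?P<node>[^:]+): (?P<process>[^:]+): "
    r"(?P<event>[\w.]+):(?P<severity>\w+)\]: (?P<message>.*)$"
)

def parse_ems_line(line):
    """Return a dict of fields for one EMS line, or None if it does not match."""
    match = EMS_LINE.match(line.strip())
    return match.groupdict() if match else None

def acp_events(lines):
    """Yield (node, event, acp_processor) for ses.status.ACP* lines only."""
    for line in lines:
        fields = parse_ems_line(line)
        if fields is None or not fields["event"].startswith("ses.status.ACP"):
            continue
        # The processor number follows the lowercase phrase "ACP processor".
        proc = re.search(r"ACP processor (\d+)", fields["message"])
        yield fields["node"], fields["event"], int(proc.group(1)) if proc else None
```

Feeding the log lines through `acp_events` gives, per node, the sequence of `ses.status.ACPError` and `ses.status.ACPInfo` events for each ACP processor, which makes it easier to confirm that every error was followed by a return to normal status.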

  • The SP reboots automatically and logs an event. Example SP event message:

Record 833: Tue Oct 13 18:20:19 2020 [SP.critical]: Rebooting SP due to loss of ACP comms
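
To correlate the SP reboot with the EMS timeline, the SP event record above can be split into its record number, timestamp, tag, and message. This is a rough sketch that assumes every record matches the single sample line shown and that the timestamp uses English day and month abbreviations. `parse_sp_record` and `is_acp_comms_reboot` are hypothetical names.

```python
import re
from datetime import datetime

# Layout taken from the one sample record above:
# Record <n>: <timestamp> [<tag>]: <message>
SP_RECORD = re.compile(
    r"^Record (?P<record>\d+): (?P<timestamp>.+?) "
    r"\[(?P<tag>[\w.]+)\]: (?P<message>.*)$"
)

def parse_sp_record(line):
    """Parse one SP event record into a dict, or return None if it does not match."""
    match = SP_RECORD.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["record"] = int(fields["record"])
    # Example timestamp: "Tue Oct 13 18:20:19 2020"
    fields["timestamp"] = datetime.strptime(
        fields["timestamp"], "%a %b %d %H:%M:%S %Y"
    )
    return fields

def is_acp_comms_reboot(fields):
    """True when the record reports an SP reboot caused by loss of ACP comms."""
    return fields is not None and "loss of ACP comms" in fields["message"]
```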

  • ACP status is OK and ACP is functioning normally.
  • A high rate of transmitted frames and bytes per second on the e0M management port:

    -- interface  e0M  (30 days, 20 hours, 46 minutes, 42 seconds) --
    RECEIVE
    …
    TRANSMIT
>>> Total frames:     2992m | Frames/second:    1122  | Total bytes:      4523g
    Bytes/second:     1696k | Total errors:        0  | Errors/minute:       0
    Total discards:      0  | Queue overflow:      0  | Multi/broadcast: 90594
    …
 
    -- interface  e0M  (30 days, 20 hours, 44 minutes, 31 seconds) --
    RECEIVE
    …
    TRANSMIT
>>> Total frames:      216m | Frames/second:      81  | Total bytes:       322g
    Bytes/second:      120k | Total errors:        0  | Errors/minute:       0
    Total discards:      0  | Queue overflow:      0  | Multi/broadcast: 90526
    …
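
As a sanity check on the two samples above, the per-second figures can be reproduced from the totals and the interface uptime. The sketch below assumes that the `k`, `m`, and `g` suffixes are decimal multipliers (thousand, million, billion). That is an assumption, not something the output states. The helper names are hypothetical.

```python
# Decimal scaling for the unit suffixes is an assumption.
SUFFIX = {"k": 1e3, "m": 1e6, "g": 1e9}

def scaled(value):
    """Convert a counter such as '2992m' or '1122' to a float."""
    value = value.strip()
    if value and value[-1] in SUFFIX:
        return float(value[:-1]) * SUFFIX[value[-1]]
    return float(value)

def uptime_seconds(days, hours, minutes, seconds):
    """Convert the uptime shown in the interface header to seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

def average_rate(total, uptime):
    """Average per-second rate implied by a total counter and an uptime."""
    return scaled(total) / uptime

# First sample: 30 days, 20 hours, 46 minutes, 42 seconds; 2992m frames sent.
first_fps = average_rate("2992m", uptime_seconds(30, 20, 46, 42))
# Second sample: 30 days, 20 hours, 44 minutes, 31 seconds; 216m frames sent.
second_fps = average_rate("216m", uptime_seconds(30, 20, 44, 31))

ratio = first_fps / second_fps
print(f"first: {first_fps:.0f} fps, second: {second_fps:.0f} fps, ratio: {ratio:.1f}x")
```

Under that assumption the computed averages land close to the reported 1122 and 81 frames per second. This suggests the rates shown are averages over the whole uptime, and that the first interface has transmitted roughly 14 times as many frames as the second.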

  • The node management LIF and the intercluster LIF share the same broadcast domain.

