Show Faulted Hardware in ILOM


This page shows how to display faulted hardware in an ILOM (Integrated Lights Out Manager), using a chassis fan module error as the example. If you follow my notes and the error clears, then you didn’t have a real issue. On the other hand, if you can’t clear the error after following my notes, then you have a real hardware issue. You can’t clear an error while the underlying problem still exists.

This is how you log in to the command line interface for the ILOM.

 

man@earth> ssh root@ilom


The command below is one way to show system faults. The only target you should see is shell. If you see anything other than shell, it is a fault. In the example below, the ILOM shows a bad system fan, listed as 0 (/SYS/FM0).

 

-> show /SP/faultmgmt

/SP/faultmgmt
Targets:
shell
0 (/SYS/FM0)

Properties:

Commands:
cd
show
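
If you save this output to a file or capture it from a script, the fault targets can be picked out automatically. The snippet below is only a minimal sketch in Python: the function name fault_targets is my own, and it assumes the output looks exactly like the listing above, with each fault printed as an index followed by the FRU path in parentheses.

```python
import re

def fault_targets(show_output):
    """Map fault index -> FRU path from the Targets: section.

    Assumes the layout shown in the listing above; 'shell' is the one
    target that is always present and is not a fault.
    """
    faults = {}
    in_targets = False
    for raw in show_output.splitlines():
        line = raw.strip()
        if line == "Targets:":
            in_targets = True
            continue
        if line in ("Properties:", "Commands:"):
            in_targets = False
            continue
        if not in_targets or not line or line == "shell":
            continue
        # Expected form: "0 (/SYS/FM0)"
        match = re.fullmatch(r"(\S+)\s*\((\S+)\)", line)
        if match:
            faults[match.group(1)] = match.group(2)
        else:
            faults[line] = ""  # unknown form, keep it so nothing is hidden
    return faults

SAMPLE = """\
/SP/faultmgmt
Targets:
shell
0 (/SYS/FM0)

Properties:

Commands:
cd
show
"""

print(fault_targets(SAMPLE))
```

An empty result means shell was the only target, so no faults are listed.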


Using the show faulty command is another way to see the system faults. This command shows a lot more detail. If you have a support contract with Oracle, you will want to paste the output of this command into the ticket you submit to MOS (My Oracle Support). The show faulty command can be used without any paths, which is extra useful if you are coming in from a chassis ILOM.

 

-> show faulty
Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/FM0
/SP/faultmgmt/0/    | class                  | fault.chassis.device.fan.fail
faults/0            |                        |
/SP/faultmgmt/0/    | sunw-msg-id            | SPX86-8X00-33
faults/0            |                        |
/SP/faultmgmt/0/    | component              | /SYS/FM0
faults/0            |                        |
/SP/faultmgmt/0/    | uuid                   | 8692c3e4-G481-635e-f8e2-f3f215d1
faults/0            |                        | 13f0
/SP/faultmgmt/0/    | timestamp              | 2013-10-02/12:10:43
faults/0            |                        |
/SP/faultmgmt/0/    | detector               | /SYS/FM0/ERR
faults/0            |                        |
/SP/faultmgmt/0/    | product_serial_number  | 1203FMM107
faults/0            |                        |
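
Because long targets and values wrap onto a second row, this table is awkward to read by eye or to grep. Here is a rough Python sketch that stitches the wrapped rows back together. The function name parse_show_faulty is hypothetical, and the sketch assumes a continuation row always has an empty Property column and that wrapped pieces join with no separator, which is what the uuid row above suggests.

```python
def parse_show_faulty(text):
    """Return {target: {property: value}} from `show faulty` table text."""
    rows = []
    for line in text.splitlines():
        if line.count("|") < 2:
            continue  # prompt, blank lines, and the ---+--- separator
        target, prop, value = (cell.strip() for cell in line.split("|", 2))
        if prop == "Property":
            continue  # header row
        if prop:
            rows.append([target, prop, value])
        elif rows:
            # Continuation row: wrapped target and/or value text
            # (assumption: pieces join with no separator).
            rows[-1][0] += target
            rows[-1][2] += value
    faults = {}
    for target, prop, value in rows:
        faults.setdefault(target, {})[prop] = value
    return faults

SAMPLE = """\
Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/FM0
/SP/faultmgmt/0/    | class                  | fault.chassis.device.fan.fail
faults/0            |                        |
/SP/faultmgmt/0/    | uuid                   | 8692c3e4-G481-635e-f8e2-f3f215d1
faults/0            |                        | 13f0
"""

for target, props in parse_show_faulty(SAMPLE).items():
    print(target, props)
```

The fru, class, and sunw-msg-id values are usually the first things support asks for, so having them keyed by name saves some copy and paste.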

The command below shows the event log, which will also contain the system hardware errors.

 

-> show /SP/logs/event/list


To clear the hardware fault from the logs, run the command below.

 

-> set /SP/logs/event clear=true


Run this command to clear the fan error.

 

-> set /SYS/FM0 clear_fault_action=true

Try to clear the hardware fault. If the hardware is really having an issue, the hardware fault will come back in about a minute or less. If you can’t clear the error and you have a support contract, this is when you submit your ticket.
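
The clear-and-wait check can also be written down as a simple comparison. This is only an illustrative Python sketch: classify_faults is a made-up name, and it assumes you have already collected the list of faulted FRU paths yourself, once before the clear and once after waiting about a minute.

```python
def classify_faults(before, after):
    """Compare faulted FRU paths seen before the clear and after the wait.

    'recurring' FRUs came back after being cleared, which points to a
    real hardware issue; 'cleared' FRUs stayed gone.
    """
    before_set, after_set = set(before), set(after)
    return {
        "recurring": sorted(before_set & after_set),
        "cleared": sorted(before_set - after_set),
        "new": sorted(after_set - before_set),
    }

# The fan fault came back after the clear: time to open a ticket.
print(classify_faults(["/SYS/FM0"], ["/SYS/FM0"]))

# The fan fault stayed gone: it was not a real issue.
print(classify_faults(["/SYS/FM0"], []))
```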

If you have any questions or I missed something, let me know.


One Response to “Show Faulted Hardware in ILOM”

  1. Solair_Admin says:

    -> start /SP/faultmgmt/shell
    Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

    faultmgmtsp> fmadm faulty
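    ------------------- ------------------------------------ -------------- --------
    Time                UUID                                 msgid          Severity
    ------------------- ------------------------------------ -------------- --------
    2019-10-23/04:16:05 7d375caa-0c4f-e9e6-cd92-eeb12b3d3bf6 SPT-8000-DH    Critical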

    Problem Status : open
    Diag Engine : fdd 1.0
    System
    Manufacturer : Oracle Corporation
    Name : SPARC T5-2
    Part_Number : 33595397+1+1
    Serial_Number : AK00315371

    ----------------------------------------
    Suspect 1 of 1
    Fault class : fault.chassis.voltage.fail
    Certainty : 100%
    Affects : /SYS/MB
    Status : faulted

    FRU
    Status : faulty
    Location : /SYS/MB
    Manufacturer : Oracle Corporation
    Name : ASY,MB+TRAY+CPU,T5-2
    Part_Number : 7302920
    Revision : 02
    Serial_Number : 465769T+1515UL0KGC
    Chassis
    Manufacturer : Oracle Corporation
    Name : SPARC T5-2
    Part_Number : 33595397+1+1
    Serial_Number : AK00313371

    Description : A chassis voltage supply is operating outside of the
    allowable range.

    Response : The system will be powered off. The chassis-wide service
    required LED will be illuminated.

    Impact : The system is not usable until repaired. ILOM will not allow
    the system to be powered on until repaired.

    Action : Please refer to the associated reference document at
    http://support.oracle.com/msg/SPT-8000-DH for the latest
    service procedures and policies regarding this diagnosis.

    faultmgmtsp> exit
    -> set /SYS/MB clear_fault_action=true
    Are you sure you want to clear /SYS/MB (y/n)? y
    Set 'clear_fault_action' to 'true'

    -> start /SP/faultmgmt/shell
    Are you sure you want to start /SP/faultmgmt/shell (y/n)? y

    faultmgmtsp> fmadm repair /SYS/MB
    faultmgmtsp> fmadm faulty
    No faults found
    faultmgmtsp> exit
    -> start /SYS
    Are you sure you want to start /SYS (y/n)? y
    Starting /SYS

    ->
    -> start /HOST/console
    Are you sure you want to start /HOST/console (y/n)? y

    Serial console started. To stop, type #.

    Serial console started. To stop, type #.
    2019-10-23 12:09:57 0:00:0> NOTICE: Initializing MCU 0 Memory Link 0
    2019-10-23 12:10:13 0:00:0> NOTICE: Initializing MCU 0 Memory Link 1
    2019-10-23 12:10:30 0:00:0> NOTICE: Initializing MCU 1 Memory Link 0
    2019-10-23 12:10:47 0:00:0> NOTICE: Initializing MCU 1 Memory Link 1
    2019-10-23 12:11:03 0:00:0> NOTICE: Initializing MCU 2 Memory Link 0
    2019-10-23 12:11:20 0:00:0> NOTICE: Initializing MCU 2 Memory Link 1
    2019-10-23 12:11:37 0:00:0> NOTICE: Initializing MCU 3 Memory Link 0
    2019-10-23 12:11:53 0:00:0> NOTICE: Initializing MCU 3 Memory Link 1
    2019-10-23 12:12:13 0:00:0> NOTICE: Pausing for 120 seconds for Coherence Link tuning
    2019-10-23 12:14:13 0:00:0> NOTICE: Found optimal settings
    2019-10-23 12:15:37 0:00:0> NOTICE: Booting config = ldom-22102018
    [CPU 00:00:0] Hypervisor version: @(#)Hypervisor 1.15.1.a 2015/09/09 12:51
    2019-10-23 12:04:39 SP> NOTICE: Start Host in progress: Step 5 of 7
    2019-10-23 12:04:39 SP> NOTICE: Start Host in progress: Step 6 of 7
    NOTICE: Entering OpenBoot.
    NOTICE: 2019-10-23 12:05:05 SP> NOTICE: Start Host in progress: Step 7 of 7
    Fetching Guest MD from HV.
    NOTICE: Starting additional cpus.
    NOTICE: Initializing LDC services.
    NOTICE: Probing PCI devices.
    NOTICE: Finished PCI probing.

    SPARC T5-2, No Keyboard
    Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
    OpenBoot 4.38.1, 32.0000 GB memory available, Serial #108418372.
    Ethernet address 0:10:e0:76:55:44, Host ID: 86765544.
