Monday, August 26, 2013

How To Clear ILOM FMA Faults And Reset The SP?

Objective: To clear ILOM FMA faults and to reset the SP.

Applies to Engineered Systems (Exadata X2, X3) and other machines where ILOM FMA is used.

Solution:

1. Log in to the node's ILOM over SSH to reach the SP prompt (->):

 % ssh -l root <IP address of Service Processor>

2. List all known faults in the system:

-> show /SP/faultmgmt

3. Start the fault management shell to obtain pertinent information about the fault:

-> start /SP/faultmgmt/shell
Are you sure you want to start /SP/faultmgmt/shell(y/n)? y
faultmgmtsp>

* Use the 'fmadm faulty' command to identify the faulty component/FRU.

4. Example of identifying the fault to clear:

Show the faults:
faultmgmtsp> fmadm faulty

This lists each fault with its UUID. In this example, the fault is on /SYS/MB:
------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2012-12-15/21:53:29 68a0c563-e609-e8fb-9fae-c03c46867474 SPX86-8002-2J  Critical

Fault class : fault.chassis.domain.boot.power-off-unexpected

FRU         : /SYS/MB
             (Part Number: 511-1213-06)
             (Serial Number: 0328MSL-1106BA12XY)

Description : Power to server is not available due to a malfunctioning component detected by CPLD.

Use the above UUID to clear the fault.
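
When several faults are listed, copying each UUID by hand is error-prone. The following is a minimal Python sketch, not part of ILOM, that pulls the UUIDs out of 'fmadm faulty' output saved as text and prints the matching 'fmadm repair' commands to paste into the fault management shell. The function names and the SAMPLE text are illustrative assumptions; the sketch assumes the UUID column has the standard 8-4-4-4-12 hexadecimal layout shown above.

```python
import re

# Assumption: fault UUIDs use the standard 8-4-4-4-12 hexadecimal layout,
# as in the example output above.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b",
    re.IGNORECASE,
)


def extract_fault_uuids(fmadm_output):
    """Return the unique fault UUIDs in the order they first appear."""
    uuids = []
    for uuid in UUID_RE.findall(fmadm_output):
        if uuid not in uuids:
            uuids.append(uuid)
    return uuids


def repair_commands(fmadm_output):
    """Build one 'fmadm repair <UUID>' command per fault found."""
    return ["fmadm repair " + uuid for uuid in extract_fault_uuids(fmadm_output)]


# Hypothetical captured text, copied from the example above.
SAMPLE = """\
------------------- ------------------------------------ -------------- --------
Time                UUID                                 msgid          Severity
------------------- ------------------------------------ -------------- --------
2012-12-15/21:53:29 68a0c563-e609-e8fb-9fae-c03c46867474 SPX86-8002-2J  Critical

Fault class : fault.chassis.domain.boot.power-off-unexpected

FRU         : /SYS/MB
             (Part Number: 511-1213-06)
             (Serial Number: 0328MSL-1106BA12XY)
"""

if __name__ == "__main__":
    for command in repair_commands(SAMPLE):
        print(command)
```

An empty result from extract_fault_uuids means no UUIDs were found in the captured text, which matches the "repeat till empty" check in the next step. Review each fault before repairing it; the sketch only formats commands and does not decide whether a fault is safe to clear.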

5. Clear the fault:

faultmgmtsp> fmadm repair 68a0c563-e609-e8fb-9fae-c03c46867474

Show the faults again, and repeat the repair for each remaining UUID until the list is empty:
faultmgmtsp> fmadm faulty

Exit the fault management shell:
faultmgmtsp> exit

After clearing the faults, continue by resetting the SP.

6. Reset the SP:

-> reset /SP

Legend:

ILOM - Integrated Lights Out Manager
FMA - Fault Management Architecture
SP - Service Processor

You are all set! All your faults are now cleared.

Reference / Read More:

URL - http://docs.oracle.com/cd/E20815_01/html/E20894/gjuqk.html
Title - How to Clear Faults Using the Oracle ILOM Command-Line Interface

URL - http://docs.oracle.com/cd/E20689_01/html/E20695/z40000971312677.html
Title - Access the SP (Oracle ILOM)

URL - http://docs.oracle.com/cd/E20815_01/html/E20894/gjshy.html
Title - How to Reset the Oracle ILOM SP Using the Web Interface

Sun Server X2-8 (formerly Sun Fire X4800 M2) Diagnostics Guide, Sun Server X2-8 (formerly Sun Fire X4800 M2) Documentation Library
Section: How to Clear Faults Using the Oracle ILOM Command-Line Interface
URL: http://docs.oracle.com/cd/E20815_01/html/E20894/gjuqk.html
Applies to Exadata as well.

Oracle Solaris Administration: Common Tasks, Oracle Solaris 11 Information Library
Section: Fault Management Overview
URL: http://docs.oracle.com/cd/E23824_01/html/821-1451/gliqg.html
Applies to Exadata FMA as well.

Writing Device Drivers, Oracle Solaris 11 Information Library
Section: Oracle Fault Management Architecture I/O Fault Services
URL: http://docs.oracle.com/cd/E23824_01/html/819-3196/fmaiofs.html
Applies to Exadata FMA as well.
