Archive for the ‘Solaris’ Category

How To Add-Remove Vdisk From Guest LDOMS

September 8th, 2019, posted in Solaris

I’ll share the simple steps needed to add and remove a virtual disk in a running domain without any outage.
This is a system running Oracle VM Server for SPARC 3.1 with a Solaris 11.1 guest domain named ldom0.
I used NFS storage because it is easy to set up and lets me use live migration.

 

Adding a virtual disk to a running domain

The entire sequence of commands in the control domain defines, adds and removes a disk while the guest domain runs:

# mkfile -n 20g  /ldomsnfs/ldom0/disk1.img       # 1. create a disk image file
# ldm add-vdsdev /ldomsnfs/ldom0/disk1.img vol01@primary-vds0 # 2. define vdisk
# ldm add-vdisk vdisk01 vol01@primary-vds0 ldom0 # 3. add disk to the domain
# ldm rm-vdisk vdisk01 ldom0                     # 4. take it away from the domain.
# ldm rm-vdsdev vol01@primary-vds0               # 5. undefine the virtual disk
# rm /ldomsnfs/ldom0/disk1.img                   # 6. save a little space.

That’s all there is to it. The new disk is available for the domain’s use after step 3 until I take it away in step 4.
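If you want to review the commands before running them in the control domain, the first three steps can be wrapped in a small helper that only prints them. This is a minimal sketch: the function name vdisk_commands and its argument order are my own invention, and the virtual disk service name primary-vds0 is simply the one used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the ldm commands for steps 1-3.
# Usage: vdisk_commands DOMAIN IMAGE SIZE VOLUME VDISK
vdisk_commands() {
    domain=$1
    image=$2
    size=$3
    volume=$4
    vdisk=$5
    vds=primary-vds0    # virtual disk service name used in this post

    echo "mkfile -n $size $image"                      # 1. create a disk image file
    echo "ldm add-vdsdev $image $volume@$vds"          # 2. define the virtual disk
    echo "ldm add-vdisk $vdisk $volume@$vds $domain"   # 3. add it to the domain
}

vdisk_commands ldom0 /ldomsnfs/ldom0/disk1.img 20g vol01 vdisk01
```

Printing instead of executing keeps the sketch safe to try anywhere; once the output looks right, the lines can be pasted into a root shell in the control domain.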

 

Viewing reconfiguration from within the guest

Let’s take a look from the guest domain’s perspective.
In the guest, you can see one disk before adding more (before command 3, above) via the format command:

# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c3d1 
          /virtual-devices@100/channel-devices@200/disk@1
Specify disk (enter its number): ^C

There’s one disk until ldm add-vdisk is issued in the control domain.
That results in a dynamic reconfiguration event that can be seen, if you are curious, by entering dmesg within the guest:

# dmesg|tail
...snip...
Nov 20 12:03:23 ldom0 vdc: [ID 625787 kern.info] vdisk@0 is online using ldc@16,0
Nov 20 12:03:23 ldom0 cnex: [ID 799930 kern.info] channel-device: vdc0
Nov 20 12:03:23 ldom0 genunix: [ID 936769 kern.info] vdc0 is /virtual-devices@100/channel-devices@200/disk@0
Nov 20 12:03:23 ldom0 genunix: [ID 408114 kern.info] /virtual-devices@100/channel-devices@200/disk@0 (vdc0) online

You can see the added disk using format and then use it. In this case I created a temporary ZFS pool.

# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c3d0 
          /virtual-devices@100/channel-devices@200/disk@0
       1. c3d1 
          /virtual-devices@100/channel-devices@200/disk@1
Specify disk (enter its number): ^C
# zpool create temp c3d0
# zpool list
NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
rpool  19.9G  5.27G  14.6G  26%  1.00x  ONLINE  -
temp   19.9G   112K  19.9G   0%  1.00x  ONLINE  -

At this point I can just go ahead and use the added disk space. I could have done other things like add it to the existing ZFS pool
to make it a mirror, but this illustrates the point.

 

What happens if I try to remove an in-use disk

It could be very damaging to remove a virtual device while it is in use, so the default behavior
is that Solaris tells the Logical Domains Manager that the device is in use and cannot be removed.
That’s a very important advantage of Oracle VM Server for SPARC: the logical domains framework and Solaris work cooperatively,
in this and many other aspects.

In this case, we’re prevented from yanking a disk while it is in use.
If I try to remove the disk while it’s in use, I get an error message – exactly what you want:

# ldm rm-vdisk vdisk01 ldom0
Dynamic reconfiguration of the virtual device on domain ldom0
failed with error code (-122).
The OS on domain ldom0 did not report a reason for the failure.
Check the logs on that OS instance for any further information.
Failed to remove vdisk instance

The reason is “because it’s in use!” 🙂 An administrator would log into the guest to see what file systems are mounted.
This behavior can be overridden using the “-f” option if you are certain
you know what you’re doing.

Removing the disk

I issued zpool destroy temp in the guest, repeated the ldm rm-vdisk command, and it worked.
Using zpool export temp would work just as well, and if I choose I can then add that virtual disk to a different
domain, which could use zpool import temp to access data created by ldom0.
With other file systems, a regular umount would have the same effect, making it possible to remove the disk without -f.
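Before asking the control domain to remove a disk, it can help to check inside the guest whether a ZFS pool still holds it. The sketch below reads zpool status text on standard input and reports whether a device name is listed as a pool member. The function name disk_in_pool is hypothetical, the sample text is hand-written in the usual zpool status layout rather than captured from ldom0, and I am assuming the device appears under its bare name (c3d0) rather than with a slice suffix.

```shell
#!/bin/sh
# Hypothetical helper: read `zpool status` output on stdin and succeed
# if the given device name appears as a row of the config table.
# (On Solaris, use nawk if the default awk rejects -v.)
disk_in_pool() {
    awk -v dev="$1" '$1 == dev && NF >= 5 { found = 1 } END { exit !found }'
}

# Hand-written sample in the usual `zpool status` layout.
sample='  pool: temp
 state: ONLINE
config:
        NAME        STATE     READ WRITE CKSUM
        temp        ONLINE       0     0     0
          c3d0      ONLINE       0     0     0'

if printf '%s\n' "$sample" | disk_in_pool c3d0; then
    echo "c3d0 is in use: run zpool destroy or zpool export first"
fi
```

In the guest itself the call would be zpool status | disk_in_pool c3d0.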

The format command now shows only one disk again, and dmesg shows the kernel messages logged when the disk went offline:

Nov 20 12:42:10 ldom0 vdc: [ID 990228 kern.info] vdisk@0 is offline
Nov 20 12:42:10 ldom0 genunix: [ID 408114 kern.info] /virtual-devices@100/channel-devices@200/disk@0 (vdc0) offline

 

Summary

Solaris and the logical domain manager are engineered to work together in a coordinated fashion
to provide operational flexibility.
One of the values this provides is that
administrators can safely add and remove virtual devices while domains run.
This can be used for operational tasks like adding or removing disk capacity or IOPS as needed.
The same capabilities are also available for virtual network devices.


Show Faulted Hardware in ILOM

August 5th, 2019, posted in Solaris

This page shows how to find and clear faulted hardware in an ILOM (Integrated Lights Out Manager). I will use the example of a chassis fan module error. If you follow my notes and the error clears, then you didn’t have a real issue. If you can’t clear the error after following my notes, then you have a real hardware issue: a fault can’t be cleared while the underlying problem is still there.

This is how you login to the command line interface for the ILOM.

 

man@earth> ssh root@ilom


The command below is one way to show system faults. The only target you should see is shell. If you see anything other than shell, it is a fault. In the example below, the ILOM shows a bad system fan, listed as 0 (/SYS/FM0).

 

-> show /SP/faultmgmt

/SP/faultmgmt
Targets:
shell
0 (/SYS/FM0)

Properties:

Commands:
cd
show


Using the show faulty command is another way to see the system faults. This command shows a lot more detail. If you have a support contract with Oracle, you will want to paste the output of this command into the ticket you submit to MOS. The show faulty command can be used without any paths, which is extra useful if you are coming in from a chassis ILOM.

 

-> show faulty
Target              | Property               | Value
--------------------+------------------------+---------------------------------
/SP/faultmgmt/0     | fru                    | /SYS/FM0
/SP/faultmgmt/0/    | class                  | fault.chassis.device.fan.fail
 faults/0           |                        |
/SP/faultmgmt/0/    | sunw-msg-id            | SPX86-8X00-33
 faults/0           |                        |
/SP/faultmgmt/0/    | component              | /SYS/FM0
 faults/0           |                        |
/SP/faultmgmt/0/    | uuid                   | 8692c3e4-G481-635e-f8e2-f3f215d1
 faults/0           |                        | 13f0
/SP/faultmgmt/0/    | timestamp              | 2013-10-02/12:10:43
 faults/0           |                        |
/SP/faultmgmt/0/    | detector               | /SYS/FM0/ERR
 faults/0           |                        |
/SP/faultmgmt/0/    | product_serial_number  | 1203FMM107
 faults/0           |                        |
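If you save the show faulty output to a file on your workstation, a few lines of awk can boil it down to property=value pairs for a ticket or a monitoring script. This is only a sketch: the function name faulty_props is hypothetical, the sample is a shortened copy of the table above, and values that the ILOM wraps onto a second line (such as the uuid) come out truncated because continuation rows are skipped.

```shell
#!/bin/sh
# Hypothetical helper: turn saved `show faulty` output (on stdin) into
# property=value lines. Rows with an empty Property column are the
# wrapped continuation rows and are skipped.
# (On Solaris, use nawk if the default awk lacks gsub.)
faulty_props() {
    awk -F'|' 'NF == 3 {
        prop = $2
        val = $3
        gsub(/^[ \t]+|[ \t]+$/, "", prop)   # trim the column padding
        gsub(/^[ \t]+|[ \t]+$/, "", val)
        if (prop != "" && prop != "Property") print prop "=" val
    }'
}

# Shortened copy of the table above.
sample='Target              | Property | Value
--------------------+----------+------------------------------
/SP/faultmgmt/0     | fru      | /SYS/FM0
/SP/faultmgmt/0/    | class    | fault.chassis.device.fan.fail
 faults/0           |          |'

printf '%s\n' "$sample" | faulty_props
```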

The command below shows the event log, which will also contain the system hardware errors.

 

-> show /SP/logs/event/list


To clear the event log, run the command below. Note that clearing is done with set, not show.

-> set /SP/logs/event clear=true


Run this command to clear the fan error.

 

-> set /SYS/FM0 clear_fault_action=true

Try to clear the hardware fault. If the hardware really has a problem, the fault will come back in about a minute or less. If you can’t clear the error and you have a support contract, this is when you submit your ticket.

If you have any questions or I missed something let me know.


Solaris Fault Manager

July 1st, 2019, posted in Solaris
Fault Manager is part of the Solaris self-healing functionality, which provides fault isolation and component restart. Fault Manager handles hardware components
(SMF takes care of software components).

Make sure that the service is running and the required packages are installed.
# pkginfo |grep fmd
system      SUNWfmd         Fault Management Daemon and Utilities
system      SUNWfmdr        Fault Management Daemon and Utilities (Root) 
# svcs fmd
STATE          STIME    FMRI
online         Jun_29   svc:/system/fmd:default



Display Fault Manager Configuration:
# fmadm config
MODULE                   VERSION STATUS  DESCRIPTION
cpumem-diagnosis         1.6     active  CPU/Memory Diagnosis
cpumem-retire            1.1     active  CPU/Memory Retire Agent
eft                      1.16    active  eft diagnosis engine
fmd-self-diagnosis       1.0     active  Fault Manager Self-Diagnosis
io-retire                1.0     active  I/O Retire Agent
sysevent-transport       1.0     active  SysEvent Transport Agent
syslog-msgs              1.0     active  Syslog Messaging Agent
zfs-diagnosis            1.0     active  ZFS Diagnosis Engine
zfs-retire               1.0     active  ZFS Retire Agent



For example, the kernel sends an error to FMD, and FMD forwards the error to a module. There are two types of module:
1. Diagnosis engine: provides a diagnosis based on symptoms
2. Agent: responds to a given diagnosis and takes action, for example taking a faulty CPU offline

The fault manager maintains two log files:
1. error log - list of errors sent to the fault manager daemon
2. fault log - list of diagnosed and repaired problems

See the fault log with:
# fmdump
See the error log with:
# fmdump -e

Tips:
-u - limits the output to a specific UUID
-T - displays events that occurred BEFORE a specific time (yyyy-mm-dd)
-t - displays events that occurred AFTER a specific time (yyyy-mm-dd)
-V - verbose output

Run the command below to see whether Fault Manager shows any failed resources. In this example we see that memory module DIMM 3 failed.


# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Jun 23 02:30:30 2578e639-38cd-4cd8-9c16-87e96116f41e  AMD-8000-2F    Major

Fault class : fault.memory.dimm_sb
Affects     : mem:///motherboard=0/chip=1/memory-controller=0/dimm=3/rank=0
                  degraded but still in service
FRU         : "CPU 1 DIMM 3" (hc://:product-id=Sun-Fire-X4200-Server:chassis-id=0000000000:server-id=oryx/motherboard=0 
		/chip=1/memory-controller=0/dimm=3)

Description : The number of errors associated with this memory module has
              exceeded acceptable levels.  Refer to
              http://sun.com/msg/AMD-8000-2F for more information.

Response    : Pages of memory associated with this memory module are being
              removed from service as errors are reported.

Impact      : Total system memory capacity will be reduced as pages are
              retired.

Action      : Schedule a repair procedure to replace the affected memory
              module.  Use fmdump -v -u <EVENT_ID> to identify the module.
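The Action line suggests feeding the EVENT-ID to fmdump -v -u. When several faults are listed, picking the IDs out by hand gets tedious, so here is a small sketch that extracts them from saved fmadm faulty output and prints the matching fmdump commands. The function name fault_event_ids is hypothetical, and it assumes the summary line layout shown above, where the EVENT-ID is the fourth whitespace-separated field.

```shell
#!/bin/sh
# Hypothetical helper: read `fmadm faulty` output on stdin and print each
# EVENT-ID. A UUID is 36 characters of hex groups joined by dashes, and in
# the summary line it is the fourth field (after month, day and time).
fault_event_ids() {
    awk 'length($4) == 36 && $4 ~ /^[0-9a-f]+(-[0-9a-f]+)+$/ { print $4 }'
}

# Header and summary lines copied from the output above.
sample='--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Jun 23 02:30:30 2578e639-38cd-4cd8-9c16-87e96116f41e  AMD-8000-2F    Major'

# Print (do not run) one fmdump command per fault.
for id in $(printf '%s\n' "$sample" | fault_event_ids); do
    echo "fmdump -v -u $id"
done
```

On a live system the input would come from fmadm faulty itself instead of the sample variable.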



Note the link with more information (a knowledge base article); go there and it tells you about the resolution. Say you are now replacing the DIMM. Once the DIMM is replaced, you need to update the resource cache to indicate that the issue is resolved.
# fmadm repair 2578e639-38cd-4cd8-9c16-87e96116f41e
fmadm: recorded repair to 2578e639-38cd-4cd8-9c16-87e96116f41e



Reset the Fault Manager module. If you don't know which one, the web link mentioned above will tell you.
# fmadm reset eft
fmadm: eft module has been reset



Verify that there are no more faulty resources.
# fmadm faulty
No output, super! That means there is no hardware issue.

Running Commands in the Solaris Background

September 25th, 2018, posted in Solaris

When you type a command and press the Return key, your system runs the command, waits for the command to complete a task, and then prompts you for another command. However, some commands can take a long time to finish, and you might prefer to type other commands in the meantime. If you want to run additional commands while a previous command runs, you can run a command in the background.

If you know you want to run a command in the background, type an ampersand (&) after the command as shown in the following example.

$ bigjob &
[1] 7493
$

The number in brackets is the job number, and the number that follows it is the process ID. The command bigjob will now run in the background, and you can continue to type other commands. After the job completes, you will see a message similar to the following the next time you type another command, such as date in the following example.

$ date
Tue Oct 31 15:44:59 MST 2000
[1]    Done                 bigjob
$

If you plan to log off before a background job completes, use the nohup (no hangup) command to enable the job to complete, as shown in the following example. If you do not use the nohup command, the background job terminates when you log off.

$ nohup bigjob &
[3] 7495
$
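In a script you usually want to start a job in the background, do something else, and then collect its result. The sketch below shows that pattern with the standard shell variable $! and the wait built-in; sleep 1 is just a stand-in for a long-running command such as bigjob.

```shell
#!/bin/sh
# sleep 1 stands in for a long-running command such as bigjob.
sleep 1 &
pid=$!                 # process ID of the most recent background job
echo "started background job"

# ...other commands can run here while the job works...

wait "$pid"            # block until that job finishes
status=$?              # wait returns the exit status of the job
echo "job finished with exit status $status"
```

Using wait with an explicit process ID, instead of a bare wait, returns that job's own exit status, which matters once a script starts more than one background job.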

AD ADMIN Utilities

September 17th, 2018, posted in Oracle Queries, Solaris, Uncategorized


AdAdmin

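AD Administration (adadmin) is a utility used to perform maintenance tasks on an Oracle Applications system. The tasks are grouped into the menus described below.

Log in as the applmgr user: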
[applmgr@apps ~]$ adadmin

Preliminary Tasks:

1. Running the environment file.
2. Verifying that ORACLE_HOME is set properly.
3. Ensuring that ORACLE_HOME/bin and AD_TOP/bin are in your path.
4. Shutting down the concurrent managers when relinking certain files or performing certain database tasks.
5. Ensuring sufficient temporary disk space.
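Tasks 2 and 3 in the list above are easy to check with a few lines of shell before starting adadmin. This is a minimal sketch and not part of the AD utilities: the function name check_ad_env is hypothetical, and it only tests that the two variables are set and that their bin directories appear in PATH.

```shell
#!/bin/sh
# Hypothetical pre-flight check for tasks 2 and 3 above.
# Prints one line per problem and returns non-zero if any check fails.
check_ad_env() {
    rc=0
    if [ -z "${ORACLE_HOME:-}" ]; then
        echo "ORACLE_HOME is not set"
        rc=1
    fi
    if [ -z "${AD_TOP:-}" ]; then
        echo "AD_TOP is not set"
        rc=1
    fi
    for dir in "${ORACLE_HOME:-}/bin" "${AD_TOP:-}/bin"; do
        case ":$PATH:" in
            *":$dir:"*) ;;                          # directory is on PATH
            *) echo "$dir is not in PATH"; rc=1 ;;
        esac
    done
    return $rc
}

check_ad_env || echo "fix the environment before starting adadmin"
```

Wrapping PATH in colons before matching avoids false hits on directories that merely share a prefix with the one being checked.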

Ad administration prompts:
Your default directory is ‘/u01/app/apps/uatappl’.
Is this the correct APPL_TOP [Yes]

Filename [adadmin.log] :

APPL_TOP is set to /u01/app/apps/uatappl

Please enter the batchsize [1000] :

Enter the password for your ‘SYSTEM’ ORACLE schema: manager

Enter the ORACLE password of Application Object Library [APPS] : apps

Ad admin Log files:

The main AD Administration log file is called adadmin.log by default. This name can be
changed when starting up AD Administration.

Errors and warnings are listed in the log file
/u01/app/apps/uatappl/admin/UAT/log/adadmin.log

Generate Applications Files menu

There are five functional choices in the Generate Applications Files menu.

1. Generate Message Files:
This task generates message binary files in the $PROD_TOP/mesg directory from Oracle Application Object Library tables.

We generally perform this task only when instructed to do so in a readme file of a patch.

2. Generate Form Files:
This task generates binary Oracle Forms files for all installed languages from the form definition files. Extension (*.fmx)

Perform this task whenever we have an issue with a form or a set of forms.

Oracle Applications uses these binary files to display the data entry forms.

When we choose Generate form files:
It prompts the following:

3. Generate Report Files:
This task generates binary report files for all installed languages. Extension (*.rdf)

When we choose the Generate report files menu, it prompts the following.

4. Generate Graphics Files:
This task generates Oracle Graphics files for all installed languages. Extension (*.ogd)

The series of prompts and actions in this task is very similar to the prompts and actions in the Generate form files task.

5. Generate Product JAR Files:
The Generate product JAR files task prompts:
Do you wish to force generation of all jar files? [No]
If we choose No, it only generates JAR (Java Archive) files that are missing or out of date.
Choose Yes for this option when generating JAR files after upgrading the developer technology stack or after updating your Java version.

Maintain Applications Files tasks

Relink Application programs:
This task relinks all your Oracle Applications binary executables.
Select this task after you:
1. Install a new version of the database or a technology stack component.
2. Install an underlying technology component used with Oracle Applications.
3. Apply a patch to the application technology stack.
4. Apply a patch to the operating system.

This task executes the AD relink utility. Use AD admin, not the AD relink utility directly, to relink non-AD executables.

Create Applications environment file:
Select this task when you want to:
1. Create an environment file with settings that are different from your current environment file.
2. Recreate an environment file that is missing or corrupt.

Copy files to destinations:
The copy files to destinations task copies files from each product area to central locations where they can easily be referenced by Oracle Applications.

Use this option to update the java, HTML and media files in the common directories (such as JAVA_TOP, OA_TOP) when users have issues accessing them.

1. Java files are copied to JAVA_TOP
2. HTML files are copied to OAH_TOP
3. Media files are copied to OAM_TOP

Convert Character Set:

1. This task converts the character set of all translatable files in APPL_TOP.
2. You should select this task when changing the base language or adding additional languages to oracle applications
3. You may need to convert database character set and file system character set to one that will support the additional languages.

Maintain snapshot information:

1. This task records details for each file in the APPL_TOP (like file name and file version).
2. It also records summary information about patches that have been applied to the APPL_TOP.
3. The maintain snapshot information task stores information about files, file versions and bug fixes present in an APPL_TOP.
4. You must run the Maintain snapshot information option once for each APPL_TOP before you apply any patch that contains a “compatible feature prereq” line on that APPL_TOP.

Check for Missing files:

1. The check for missing files task verifies that the files needed to install, upgrade, or run Oracle Applications for the current configuration are in the current APPL_TOP.
2. Choose this task if you suspect there are files missing in your APPL_TOP.

Maintain Database Entries tasks:

Validate Apps Schema:

The Validate apps schema task runs a SQL script (advrfapp.sql) against the apps schema to verify the integrity of the schema.
It determines:

1. Problems you must fix (not specific to apps schema)
2. Problems you must fix (specific to apps schema)
3. Issues you may want to address (specific to apps schema)
A report called APPSschemaname.lst is produced in APPL_TOP/admin//out.
This report contains information about how to fix the issues. Running Validate Apps schema can find the following things.

1. Missing or invalid package.
2. Missing or invalid synonyms.
3. Invalid objects in apps schema.

This task is more effective if run:

1. Immediately after upgrading or applying a maintenance pack
2. After a patch is applied.
3. After performing export/import (migration)
4. When doing custom development in the apps schema.

Recreate Grants and Synonyms:

This task recreates grants and synonyms for the Oracle Applications public schema (applsyspub).
It also recreates grants on some packages from system to apps.

Run this task when grants and synonyms are missing from the database. This may occur as a result of
1. Custom development
2. Incomplete database migrations
3. Patches and administrative sessions that failed to run successfully to completion

Maintain multi-lingual tables:

MLS, or multilingual support, is Oracle Applications’ ability to operate in multiple languages simultaneously. When running the Maintain multi-lingual tables task you can select the number of parallel workers. It is generally run during the NLS install and maintenance processes. This task runs the NLINS.sql script for every product. It invokes PL/SQL routines that maintain multilingual tables and untranslated rows.

Check Dual Tables:

This task looks for a DUAL table accessible by Oracle Applications and ensures the correct grants are set up. If no such table exists, or if an existing DUAL table has more than one row, AD administration displays an error. If a DUAL table containing only one row exists, AD admin completes successfully.

Maintain multiple reporting currencies:

This option varies depending on whether you currently have multiple reporting currencies (MRC) enabled or not.
If MRC functionality is implemented in your database, the option reads maintain multiple reporting currencies.

Convert to multiple organizations:

The conversion to multiple organizations does the following:

1. Confirms that you want to run the task
2. Asks for the number of parallel workers
3. Creates a script to disable and re-enable triggers in the APPS schema
4. Disables all triggers in the apps schema
5. Converts seed data and transaction data to multiple organizations in parallel.
6. Re-enables all previously disabled triggers in the apps schema.

Compile and Reload Database Entries tasks:

Compile Apps Schema:

This task compiles uncompiled program units (PL/SQL and Java) in the apps schema.
You can perform this task with multiple workers.
When running this task, AD administration prompts:
Run Invoker’s Rights processing in incremental mode [No]?
Type Yes at this prompt to run Invoker Rights processing only on packages that have changed since Invoker Rights processing was last run, or accept the default to run Invoker Rights processing on all packages.
During the upgrade progress.

Compile Menu Information:

This option compiles menu data structures.
Choose this task after uploading menu entries.

It asks if you want to force compilation for all menus,

1. If you choose the default [No], only menus with changes are compiled
2. If you enter yes all menus are compiled

Compile flexfield:

Run this task if the readme of a patch indicates that this step should be performed.
Details of the task with a list of compilation status of every flexfield are written to a log file.
The name of the log file is in the format .req. The main AD Administration log file contains the exact name of this log file.

Reload JAR Files to Database:

This option runs the loadjava utility to reload all appropriate oracle applications
JAR files into the database.

Change Maintenance mode:

1. Must be enabled before patching oracle applications
2. Improves patching performance
3. Restricts users access to system
4. Is enabled and disabled using AD administration
