OSWatcher should be an integral part of every database server and the de facto standard tool of choice for collecting and archiving the operating system and network metrics that can be leveraged to diagnose performance issues. This blog will show you a better way to download and configure OSWatcher for your database and even non-database servers. First, download the oswbb-service RPM from Oracle Support: How To Start OSWatcher Black Box (OSWBB) Every System Boot [ID 580513.1]

Oracle Support recommends that you run OSWatcher and collect its information for an extended period. Normally, if a database node crashes, the system administrators need to restart OSWBB manually. Now there is an RPM supplied by Oracle that automatically starts OSWatcher for you at system startup.

After you download the oswbb-service RPM, install it with the rpm -ihv command:

$ sudo rpm -ihv oswbb-service-1.1.5-1.noarch.rpm
Preparing...                ########################################### [100%]
   1:oswbb-service          ########################################### [100%]

Before you start OSWatcher, make sure that you modify the configuration file /etc/oswbb.conf. Pay particular attention to the OSW_HOME and OSW_RETENTION parameters:

# Set OSW_HOME to the directory where you unpacked OSW or OSWbba
OSW_HOME='/u01/app/oracle/oswbb'
# Set OSW_INTERVAL to the number of seconds between collections
OSW_INTERVAL='30'
# Set OSW_RETENTION to the number of hours logs are to be retained
OSW_RETENTION='168'
# Set OSW_USER to the owner of the OSW_HOME directory
OSW_USER='root'
# Set OSW_COMPRESSION to the desired compression facility
OSW_COMPRESSION='gzip'

By default, the retention is set to 2 days; in this example, we extended it to 7 days (7 × 24 = 168 hours). Space permitting, you may want to increase this parameter to as much as 31 days.

After you make the appropriate changes to the oswbb.conf file, you can start OSWatcher with the service command:

$ sudo /sbin/service oswbb start
Starting OSWatcher:                                        [  OK  ]

Allowed options for oswbb are:

Usage: /etc/init.d/oswbb {start|stop|reload|restart|condrestart|status|info}

Next, set up OSWatcher to restart during system boot with the chkconfig command:

$ sudo /sbin/chkconfig oswbb on

You can confirm that OSWatcher is configured to restart with the system with the following command:

# /sbin/chkconfig --list |grep -i oswbb
oswbb          	0:off	1:off	2:on	3:on	4:on	5:on	6:off

The OSWatcher logs are stored in the ${OSW_HOME}/archive directory.
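
To see what has been collected so far and how much space it consumes, you can poke around the archive directory; a minimal sketch (the oswvmstat subdirectory name is an assumption based on the per-metric directories OSWatcher typically creates):

# Review archive size and list the most recent vmstat captures
du -sh /u01/app/oracle/oswbb/archive
ls -lt /u01/app/oracle/oswbb/archive/oswvmstat | head

# Files are compressed per OSW_COMPRESSION (gzip above), so view them with zless or zcat
zless "$(ls -t /u01/app/oracle/oswbb/archive/oswvmstat/*.gz | head -1)"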

Here's a simple one-liner cron job that checks whether the OSWatcher process is running. If the oswbb process is not running, it restarts it with the appropriate parameters. The job runs once per day. In combination with the auto-start procedure described above, this should ensure that OSWatcher is always running and providing vital metrics about the health of the Oracle server:

1 0 * * * ps -ef | grep oswbb | grep -v grep > /dev/null || ( cd /u01/app/oracle/oswbb && nohup ./startOSWbb.sh 60 168 > /dev/null 2>&1 & )
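
If you prefer something sturdier than a one-liner, the same check can live in a small wrapper script that cron invokes instead. This is a minimal sketch (the script name, log file, and location are placeholders), reusing the interval and retention from the oswbb.conf example above:

#!/bin/bash
# check_oswbb.sh -- restart OSWatcher if it is not running
OSW_HOME=/u01/app/oracle/oswbb
INTERVAL=30        # seconds between snapshots (matches OSW_INTERVAL above)
RETENTION=168      # hours of archive to keep (matches OSW_RETENTION above)

# The collector typically shows up as OSWatcher.sh once startOSWbb.sh has launched it;
# the pattern below matches either name (adjust it if your version differs)
if ! pgrep -f 'OSWatcher|startOSWbb' > /dev/null; then
    cd "$OSW_HOME" || exit 1
    nohup ./startOSWbb.sh "$INTERVAL" "$RETENTION" > /dev/null 2>&1 &
    echo "$(date): restarted OSWatcher" >> "$OSW_HOME/restart.log"
fi

Schedule it from cron in place of the one-liner, for example: 1 0 * * * /u01/app/oracle/oswbb/check_oswbb.sh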

Posted by Charles Kim, Oracle ACE Director, VMware vExpert


Here are some find command tips for DBAs to save time on day-to-day tasks.

1. Removing files older than 14 days from the current working directory

find . -type f -mtime +14 -exec rm -f {} \;

2. Finding the top 5 largest files from the current working directory

find . -ls | sort -nrk 7 | head -5

3. Find files larger than 100MB from the current working directory

find . -size +100000k

4. Delete audit records that are older than 30 days

find $ORACLE_HOME/rdbms/audit -name "*.aud" -mtime +30 -exec rm {} \;

5. Delete files in /tmp owned by the dba group that are older than 5 days

find /tmp -group dba -type f -mtime +5 -exec rm -f {} \;
find /tmp/dba -group dba -type f -mtime +5 -exec rm -f {} \;

6. Delete *.trc files more than 5 days old.

find $TRACE_DIR -name '*.trc' -type f -mtime +5 -exec rm {} \;
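
Before wiring any of these rm-based one-liners into cron, it is worth previewing what would match; a minimal sketch using tip 5 as the example:

# Preview first: list the matches without deleting anything
find /tmp -group dba -type f -mtime +5 -print

# Then delete; terminating -exec with + batches file names into fewer rm invocations than \;
find /tmp -group dba -type f -mtime +5 -exec rm -f {} +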

Every DBA should have raccheck in their arsenal, and it should be a prerequisite for every RAC deployment as one of the audit tools used to confirm a proper configuration. RACcheck is a RAC Configuration Audit tool provided by Oracle Support, designed to audit core configuration settings within the Real Application Clusters (RAC), Automatic Storage Management (ASM), and Grid Infrastructure stack. The RACcheck tool audits configuration for the following categories:

* OS kernel parameters
* OS packages
* Other OS configuration settings important to RAC.
* Grid Infrastructure
* RDBMS
* ASM
* Database parameters
* Other database configuration settings important to RAC
* Upgrade Readiness assessment

vnaqa1 > ./raccheck

CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /u01/app/grid/11203?[y/n][y]

Checking ssh user equivalency settings on all nodes in cluster

Node dalvna2 is configured for ssh user equivalency for oracle user
 
Node dalvna3 is configured for ssh user equivalency for oracle user
 

Searching for running databases . . . . .

. . 
List of running databases registered in OCR
1. vnadbqa
2. vnatmpqa
3. All of above
4. None of above

Select databases from list for checking best practices. For multiple databases, select 3 for All or comma separated number like 1,2 etc [1-4][3].1

Searching out ORACLE_HOME for selected databases.

. . . 


Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
-------------------------------------------------------------------------------------------------------
                                                 Oracle Stack Status                            
-------------------------------------------------------------------------------------------------------
Host Name  CRS Installed  ASM HOME       RDBMS Installed  CRS UP    ASM UP    RDBMS UP  DB Instance Name
-------------------------------------------------------------------------------------------------------
dalvna1 Yes             Yes             Yes             Yes        Yes      Yes      vnaqa1  
dalvna2 Yes             Yes             Yes             Yes        Yes      Yes      vnadbqa2  
dalvna3 Yes             Yes             Yes             Yes        Yes      Yes      vnadbqa3  
-------------------------------------------------------------------------------------------------------

104 of the included audit checks require root privileged data collection . If sudo is not configured or the root password is not available, audit checks which  require root privileged data collection can be skipped.


1. Enter 1 if you will enter root password for each  host when prompted

2. Enter 2 if you have sudo configured for oracle user to execute root_raccheck.sh script 

3. Enter 3 to skip the root privileged collections 

4. Enter 4 to exit and work with the SA to configure sudo  or to arrange for root access and run the tool later.

Please indicate your selection from one of the above options[1-4][1]:- 2

*** Checking Best Practice Recommendations (PASS/WARNING/FAIL) ***



Log file for collections and audit checks are at
/home/oracle/ck/raccheck/raccheck_062713_142945/raccheck.log

Running raccheck in serial mode because expect(/usr/bin/expect) is not available to supply root passwords on remote nodes

NOTICE:  Installing the expect utility (/usr/bin/expect) will allow raccheck to gather root passwords at the beginning of the process and execute raccheck on all nodes in parallel speeding up the entire process. For more info - http://www.nist.gov/el/msid/expect.cfm.  Expect is available for all major platforms.  See User Guide for more details.

=============================================================
                    Node name - dalvna1                                
=============================================================
Collecting - ASM DIsk I/O stats 
Collecting - ASM Disk Groups 
Collecting - ASM Diskgroup Attributes 
Collecting - ASM disk partnership imbalance 
Collecting - ASM diskgroup attributes 
Collecting - ASM initialization parameters 
Collecting - Active sessions load balance for vnadbqa database 
Collecting - Archived Destination Status for vnadbqa database 
Collecting - Cluster Interconnect Config for vnadbqa database 
Collecting - Database Archive Destinations for vnadbqa database 
Collecting - Database Files for vnadbqa database 
Collecting - Database Instance Settings for vnadbqa database 
Collecting - Database Parameters for vnadbqa database 
Collecting - Database Properties for vnadbqa database 
Collecting - Database Registry for vnadbqa database 
Collecting - Database Sequences for vnadbqa database 
Collecting - Database Undocumented Parameters for vnadbqa database 
Collecting - Database Workload Services for vnadbqa database 
Collecting - Dataguard Status for vnadbqa database 
Collecting - Files not opened by ASM 
Collecting - Log Sequence Numbers for vnadbqa database 
Collecting - Percentage of asm disk  Imbalance 
Collecting - Process for shipping Redo to standby for vnadbqa database 
Collecting - Redo Log information for vnadbqa database 
Collecting - Standby redo log creation status before switchover for vnadbqa database 
Collecting - CPU Information
Collecting - CRS active version
Collecting - CRS oifcfg
Collecting - CRS software version
Collecting - CSS Reboot time
Collecting - CSS disktimout
Collecting - Cluster interconnect (clusterware)
Collecting - Clusterware OCR healthcheck 
Collecting - Clusterware Resource Status
Collecting - Huge pages configuration
Collecting - Kernel parameters
Collecting - Linux module config.
Collecting - Maximum number of semaphore sets on system
Collecting - Maximum number of semaphores on system
Collecting - Maximum number of semaphores per semaphore set
Collecting - Memory Information
Collecting - OS Packages
Collecting - Operating system release information and kernel version
Collecting - Oracle Executable Attributes
Collecting - Patches for Grid Infrastructure 
Collecting - Patches for RDBMS Home 
Collecting - Shared memory segments
Collecting - Table of file system defaults
Collecting - Voting disks (clusterware)
Collecting - number of semaphore operations per semop system call

Preparing to run root privileged commands  dalvna1.

Collecting - ACFS and ASM driver version comparison [ACFS] 
Collecting - Broadcast Requirements for Networks 
Collecting - CRS user time zone check 
Collecting - Custom rc init scripts (rc.local) 
Collecting - Generic ACFS health [ACFS] 
Collecting - Grid Infastructure user shell limits configuration 
Collecting - Health of the mounted ACFS file systems [ACFS] 
Collecting - Health of unmounted ACFS file systems [ACFS] 
Collecting - Interconnect interface config 
Collecting - Network interface stats 
Collecting - OCFS2 disks 
Collecting - Root Open File Limit 
Collecting - Verify ioctl to advm [ACFS] 
Collecting - Volume list for unmount ACFS file system [ACFS] 
Collecting - ocsf status 
Collecting - root time zone check 


Data collections completed. Checking best practices on dalvna1.
--------------------------------------------------------------------------------------


 INFO =>    ORA-00600 errors found in alert log for vnadbqa
 INFO =>    ORA-07445 errors found in alert log for vnadbqa
 WARNING => Some user sessions lack proper failover mode (BASIC) and method (SELECT) for vnadbqa
 INFO =>    Some data or temp files are not autoextensible for vnadbqa
 WARNING => Open file limit for root user (ulimit -n) is NOT >= 65536 or unlimited
 WARNING => NIC bonding is NOT configured for public network (VIP)
 WARNING => OSWatcher is not running as is recommended.
 INFO =>    Jumbo frames (MTU >= 8192) are not configured for interconnect
 FAIL =>    Database parameter DB_BLOCK_CHECKSUM is NOT set to recommended value on vnaqa1 instance
 FAIL =>    Database parameter DB_LOST_WRITE_PROTECT is NOT set to recommended value on vnaqa1 instance
 INFO =>    umask for RDBMS owner is not set to 0022
 WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for vnadbqa
 INFO =>    Operational Best Practices
 INFO =>    Consolidation Database Practices
 INFO =>    Computer failure prevention best practices
 INFO =>    Data corruption prevention best practices
 INFO =>    Logical corruption prevention best practices
 INFO =>    Database/Cluster/Site failure prevention best practices
 INFO =>    Client failover operational best practices
 WARNING => fast_start_mttr_target has NOT been changed from default on vnaqa1 instance

 INFO =>    Information about hanganalyze and systemstate dump
 INFO =>    Validate Your Configuration is in Compliance with Oracle Security Alert for CVE-2012-1675 /u01/app/oracle/product/11203/db_1
 INFO =>    Validate Your Listener Configurations are in Compliance with Oracle Security Alert for CVE-2012-1675 /u01/app/grid/11203
 INFO =>    Database failure prevention best practices
 FAIL =>    Primary database is NOT protected with Data Guard (standby database) for real-time data protection and availability for vnadbqa
 WARNING => Redo log write time is more than 500 milliseconds for vnadbqa
 WARNING => ASM memory_target is < recommended value
 WARNING => TFA Collector is either not installed or not running
 INFO =>    Parallel Execution Health-Checks and Diagnostics Reports for vnadbqa
 FAIL =>    The data files should be recoverable for vnadbqa


Best Practice checking completed.Checking recommended patches on dalvna1.
---------------------------------------------------------------------------------


Collecting patch inventory on  CRS HOME /u01/app/grid/11203
Collecting patch inventory on ORACLE_HOME /u01/app/oracle/product/11203/db_1 
---------------------------------------------------------------------------------
1 Recommended CRS patches for 112030 from /u01/app/grid/11203 on dalvna1
---------------------------------------------------------------------------------
Patch#   CRS  ASM    RDBMS RDBMS_HOME                              Patch-Description                            
---------------------------------------------------------------------------------
16083653 yes          yes /u01/app/oracle/product/11203/db_1GRID INFRASTRUCTURE PATCH SET UPDATE 11      
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
35 Recommended RDBMS patches for 112030 from /u01/app/oracle/product/11203/db_1 on dalvna1
---------------------------------------------------------------------------------
Patch#   RDBMS    ASM     type                Patch-Description                       
---------------------------------------------------------------------------------
13819954  yes            N-APPLY             Patch description: "Database Patch Set U
16083653  yes            merge               GRID INFRASTRUCTURE PATCH SET UPDATE 11.
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
              Clusterware patches summary report
---------------------------------------------------------------------------------
Total patches  Applied on CRS Applied on RDBMS Applied on ASM 
---------------------------------------------------------------------------------
1              1              1                0              
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
              RDBMS homes patches summary report
---------------------------------------------------------------------------------
Total patches  Applied on RDBMS Applied on ASM ORACLE_HOME    
---------------------------------------------------------------------------------
 35             35             0                /u01/app/oracle/product/11203/db_1
---------------------------------------------------------------------------------



=============================================================
                    Node name - dalvna2                                
=============================================================
Collecting - CPU Information
Collecting - CRS active version
Collecting - CRS oifcfg
Collecting - CRS software version
Collecting - Cluster interconnect (clusterware)
Collecting - Huge pages configuration
Collecting - Kernel parameters
Collecting - Linux module config.
Collecting - Maximum number of semaphore sets on system
Collecting - Maximum number of semaphores on system
Collecting - Maximum number of semaphores per semaphore set
Collecting - Memory Information
Collecting - OS Packages
Collecting - Operating system release information and kernel version
Collecting - Oracle Executable Attributes
Collecting - Patches for Grid Infrastructure 
Collecting - Patches for RDBMS Home 
Collecting - Shared memory segments
Collecting - Table of file system defaults
Collecting - number of semaphore operations per semop system call

Preparing to run root privileged commands  dalvna2.

Collecting - ACFS and ASM driver version comparison [ACFS] 
Collecting - Broadcast Requirements for Networks 
Collecting - CRS user time zone check 
Collecting - Generic ACFS health [ACFS] 
Collecting - Grid Infastructure user shell limits configuration 
Collecting - Health of the mounted ACFS file systems [ACFS] 
Collecting - Health of unmounted ACFS file systems [ACFS] 
Collecting - Interconnect interface config 
Collecting - Network interface stats 
Collecting - OCFS2 disks 
/tmp/root_raccheck.sh: line 54: /sbin/mounted.ocfs2: No such file or directory
Collecting - Root Open File Limit 
Collecting - Verify ioctl to advm [ACFS] 
Collecting - Volume list for unmount ACFS file system [ACFS] 
Collecting - ocsf status 
/tmp/root_raccheck.sh: line 74: /etc/init.d/o2cb: No such file or directory
Collecting - root time zone check 


Data collections completed. Checking best practices on dalvna2.
--------------------------------------------------------------------------------------


 WARNING => Open file limit for root user (ulimit -n) is NOT >= 65536 or unlimited
 WARNING => NIC bonding is NOT configured for public network (VIP)
 WARNING => OSWatcher is not running as is recommended.
 INFO =>    Jumbo frames (MTU >= 8192) are not configured for interconnect
 FAIL =>    Database parameter DB_BLOCK_CHECKSUM is NOT set to recommended value on vnadbqa2 instance
 FAIL =>    Database parameter DB_LOST_WRITE_PROTECT is NOT set to recommended value on vnadbqa2 instance
 INFO =>    umask for RDBMS owner is not set to 0022
 WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for vnadbqa
 WARNING => fast_start_mttr_target has NOT been changed from default on vnadbqa2 instance

 INFO =>    Validate Your Configuration is in Compliance with Oracle Security Alert for CVE-2012-1675 /u01/app/oracle/product/11203/db_1
 INFO =>    Validate Your Listener Configurations are in Compliance with Oracle Security Alert for CVE-2012-1675 /u01/app/grid/11203
 WARNING => Redo log write time is more than 500 milliseconds for vnadbqa
 WARNING => ASM memory_target is < recommended value
 WARNING => TFA Collector is either not installed or not running


Best Practice checking completed.Checking recommended patches on dalvna2.
---------------------------------------------------------------------------------


Collecting patch inventory on  CRS HOME /u01/app/grid/11203 
Collecting patch inventory on ORACLE_HOME /u01/app/oracle/product/11203/db_1 
---------------------------------------------------------------------------------
1 Recommended CRS patches for 112030 from /u01/app/grid/11203 on dalvna2
---------------------------------------------------------------------------------
Patch#   CRS  ASM    RDBMS RDBMS_HOME                              Patch-Description                            
---------------------------------------------------------------------------------
16083653 yes          yes /u01/app/oracle/product/11203/db_1GRID INFRASTRUCTURE PATCH SET UPDATE 11      
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
35 Recommended RDBMS patches for 112030 from /u01/app/oracle/product/11203/db_1 on dalvna2
---------------------------------------------------------------------------------
Patch#   RDBMS    ASM     type                Patch-Description                       
---------------------------------------------------------------------------------
13819954  yes            N-APPLY             Patch description: "Database Patch Set U
16083653  yes            merge               GRID INFRASTRUCTURE PATCH SET UPDATE 11.
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
              Clusterware patches summary report
---------------------------------------------------------------------------------
Total patches  Applied on CRS Applied on RDBMS Applied on ASM 
---------------------------------------------------------------------------------
1              1              1                0              
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
              RDBMS homes patches summary report
---------------------------------------------------------------------------------
Total patches  Applied on RDBMS Applied on ASM ORACLE_HOME    
---------------------------------------------------------------------------------
 35             35             0                /u01/app/oracle/product/11203/db_1
---------------------------------------------------------------------------------



=============================================================
                    Node name - dalvna3                                
=============================================================
Collecting - CPU Information
Collecting - CRS active version
Collecting - CRS oifcfg
Collecting - CRS software version
Collecting - Cluster interconnect (clusterware)
Collecting - Huge pages configuration
Collecting - Kernel parameters
Collecting - Linux module config.
Collecting - Maximum number of semaphore sets on system
Collecting - Maximum number of semaphores on system
Collecting - Maximum number of semaphores per semaphore set
Collecting - Memory Information
Collecting - OS Packages
Collecting - Operating system release information and kernel version
Collecting - Oracle Executable Attributes
Collecting - Patches for Grid Infrastructure 
Collecting - Patches for RDBMS Home 
Collecting - Shared memory segments
Collecting - Table of file system defaults
Collecting - number of semaphore operations per semop system call

Preparing to run root privileged commands  dalvna3.

Collecting - ACFS and ASM driver version comparison [ACFS] 
Collecting - Broadcast Requirements for Networks 
Collecting - CRS user time zone check 
Collecting - Generic ACFS health [ACFS] 
Collecting - Grid Infastructure user shell limits configuration 
Collecting - Health of the mounted ACFS file systems [ACFS] 
Collecting - Health of unmounted ACFS file systems [ACFS] 
Collecting - Interconnect interface config 
Collecting - Network interface stats 
Collecting - OCFS2 disks 
/tmp/root_raccheck.sh: line 54: /sbin/mounted.ocfs2: No such file or directory
Collecting - Root Open File Limit 
Collecting - Verify ioctl to advm [ACFS] 
Collecting - Volume list for unmount ACFS file system [ACFS] 
Collecting - ocsf status 
/tmp/root_raccheck.sh: line 74: /etc/init.d/o2cb: No such file or directory
Collecting - root time zone check 


Data collections completed. Checking best practices on dalvna3.
--------------------------------------------------------------------------------------


 WARNING => Open file limit for root user (ulimit -n) is NOT >= 65536 or unlimited
 WARNING => NIC bonding is NOT configured for public network (VIP)
 WARNING => OSWatcher is not running as is recommended.
 INFO =>    Jumbo frames (MTU >= 8192) are not configured for interconnect
 FAIL =>    Database parameter DB_BLOCK_CHECKSUM is NOT set to recommended value on vnadbqa3 instance
 FAIL =>    Database parameter DB_LOST_WRITE_PROTECT is NOT set to recommended value on vnadbqa3 instance
 INFO =>    umask for RDBMS owner is not set to 0022
 WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for vnadbqa
 WARNING => fast_start_mttr_target has NOT been changed from default on vnadbqa3 instance

 INFO =>    Validate Your Configuration is in Compliance with Oracle Security Alert for CVE-2012-1675 /u01/app/oracle/product/11203/db_1
 INFO =>    Validate Your Listener Configurations are in Compliance with Oracle Security Alert for CVE-2012-1675 /u01/app/grid/11203
 WARNING => Redo log write time is more than 500 milliseconds for vnadbqa
 WARNING => ASM memory_target is < recommended value
 WARNING => TFA Collector is either not installed or not running


Best Practice checking completed.Checking recommended patches on dalvna3.
---------------------------------------------------------------------------------


Collecting patch inventory on  CRS HOME /u01/app/grid/11203 
Collecting patch inventory on ORACLE_HOME /u01/app/oracle/product/11203/db_1 
---------------------------------------------------------------------------------
1 Recommended CRS patches for 112030 from /u01/app/grid/11203 on dalvna3
---------------------------------------------------------------------------------
Patch#   CRS  ASM    RDBMS RDBMS_HOME                              Patch-Description                            
---------------------------------------------------------------------------------
16083653 yes          yes /u01/app/oracle/product/11203/db_1GRID INFRASTRUCTURE PATCH SET UPDATE 11      
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
35 Recommended RDBMS patches for 112030 from /u01/app/oracle/product/11203/db_1 on dalvna3
---------------------------------------------------------------------------------
Patch#   RDBMS    ASM     type                Patch-Description                       
---------------------------------------------------------------------------------
13819954  yes            N-APPLY             Patch description: "Database Patch Set U
16083653  yes            merge               GRID INFRASTRUCTURE PATCH SET UPDATE 11.
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
              Clusterware patches summary report
---------------------------------------------------------------------------------
Total patches  Applied on CRS Applied on RDBMS Applied on ASM 
---------------------------------------------------------------------------------
1              1              1                0              
---------------------------------------------------------------------------------


---------------------------------------------------------------------------------
              RDBMS homes patches summary report
---------------------------------------------------------------------------------
Total patches  Applied on RDBMS Applied on ASM ORACLE_HOME    
---------------------------------------------------------------------------------
 35             35             0                /u01/app/oracle/product/11203/db_1
---------------------------------------------------------------------------------





---------------------------------------------------------------------------------
                      CLUSTERWIDE CHECKS
---------------------------------------------------------------------------------
---------------------------------------------------------------------------------
 
Detailed report (html) - /home/oracle/ck/raccheck/raccheck_dalvna1_vnadbqa_062713_142945/raccheck_dalvna1_vnadbqa_062713_142945.html


UPLOAD(if required) - /home/oracle/ck/raccheck/raccheck_dalvna1_vnadbqa_062713_142945.zip
Posted in RAC


We are already discussing our presentation delivery options for VMworld 2013 in San Francisco and Barcelona. We are considering videos and animations to take our presentation to a whole new level. I will be co-presenting on Virtualizing Mission Critical Oracle RAC with vSphere and vCenter Operations.

Session ID: VAPP5834
Session Title: Virtualizing Mission Critical Oracle RAC with vSphere and vCOPS
Track: Virtualizing Applications
Accepted presentation location: Both at San Francisco and Barcelona

Our Abstract:
“Oracle Real Application Clusters (RAC) is one of the most complex and mission critical platforms to virtualize. This detailed technical session will teach advanced deployment techniques for enterprise virtual platforms. Best practices and lessons learned from numerous enterprise projects will be shared with attendees. We will show how to fully provision advanced clusters in a matter of hours. Save days and even weeks of deploy time by attending this session. Anyone virtualizing Tier1 and Tier0 platforms will gain insights and knowledge of how to properly deploy business critical systems in VMware environments. Understanding key monitoring and performance metrics for Oracle complex systems through vCOPS will be shared with attendees. The secret sauce of customized vCOPS user interfaces for Oracle on VMware will be discussed.”

Presented by Charles Kim, Oracle ACE Director, VMware vExpert


To uninstall a package from the OS, you will simply pass the remove parameter to the yum command:

[root@rac1 Packages]# yum remove rsync
Loaded plugins: product-id, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Setting up Remove Process
Resolving Dependencies
--> Running transaction check
---> Package rsync.x86_64 0:3.0.6-9.el6 will be erased
--> Finished Dependency Resolution
Repository rhel-source is listed more than once in the configuration
Repository rhel-source-beta is listed more than once in the configuration

Dependencies Resolved

==============================================================================================================================================================================
 Package                    Arch                        Version                            Repository                                                                    Size
==============================================================================================================================================================================
Removing:
 rsync                      x86_64                      3.0.6-9.el6                        @anaconda-RedHatEnterpriseLinux-201301301459.x86_64/6.4                      682 k

Transaction Summary
==============================================================================================================================================================================
Remove        1 Package(s)

Installed size: 682 k
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Erasing    : rsync-3.0.6-9.el6.x86_64                                                                                                                                   1/1 
  Verifying  : rsync-3.0.6-9.el6.x86_64                                                                                                                                   1/1 

Removed:
  rsync.x86_64 0:3.0.6-9.el6                                                                                                                                                  

Complete!

Yum can be leveraged to remove multiple packages at the same time by adding more package names to the command. Note that when other installed packages depend on the package being removed, yum will pull those dependent packages into the removal as well. Removing a package while ignoring its dependencies is only possible with the rpm command, and it is not advised because it can potentially introduce instability and leave the system in a non-functioning state.
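
For completeness, this is roughly what the two cases look like; treat the forced removal as a last resort:

# Remove several packages in one yum transaction
yum remove rsync wget screen

# Force removal of a single package while ignoring dependency checks
# (not advised; packages that depend on it may stop working)
rpm -e --nodeps rsync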


Posted by: Charles Kim, Oracle ACE Director and VMware vExpert

We will review the basics of installing Red Hat Enterprise Linux 6 Update 4 for the Intel 64-bit platform in a virtualized infrastructure to prepare an environment for installing an Oracle cluster and database(s). To simplify the process and to demonstrate package installation procedures, we will select the option to install the Basic Server and then add the packages required for Oracle databases afterwards. We will also go through the process of creating a local yum repository from the installation media to install packages with dependencies. As the final lesson, we will configure and customize the Linux environment.

If you have a Red Hat subscription, you can download the RHEL 6 ISO image files from Red Hat's Software & Download Center customer portal. If you do not already have a subscription, you can obtain a free 30-day evaluation subscription from https://access.redhat.com/downloads. Each of the DVD ISO images is about 3-4 GB in size. After you download the ISO image, create a bootable DVD or USB drive and reboot the system to start the installation.

In the Boot Menu, if there is no response within 60 seconds, the default option to Install or upgrade an existing system using the GUI will be executed.

You will be given the option to perform a disk check on the installation media. Click on Skip.

The Welcome screen does not require any input. Click on Next to continue.

Select the language preference to be used for the installation. Please choose English from this option and click on Next

Rhel keyboard
Please select the default U.S. English and click on Next

Rhel basic storage
Select the Basic Storage type and click on Next

Rhel storage device warning

Since this is a fresh install, click on Yes, discard any data button

Enter computer name

Add hostname for the node
Click on Configure Network
Click on Edit

Edit system eth0

Change the Method to Manual
Supply IP and Netmask
Click on Apply
Then click on Close
Then Click on Next

Enter time zone

Select your timezone
Click on Next

Enter root password

Enter the password for root
Click on Next

Weak password

If this is a non-development environment, you will want to choose a more secure password. Since this is my lab, I will click on Use Anyway and continue.

Installation

Click on Review and Modify partition layout
Click on Next

Select a device

Click on Create

Add a partition

Select /tmp for Mount Point
Enter 4096 for Size (MB)
Click on OK

Format warnings

Click on Format

Warning storage configuration

Click on Write changes to disk

Boot loader

Click Next from the Boot loader list screen

Basic server

Select Basic Server and click on Next
It will perform a dependency check and start to perform the installation

Packages completed

Congratulations

Let’s remount our DVD so that we can copy all the RPMs from the DVD to a centralized location on the file system:
Mount cd
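
In case the screenshot does not render, the mount step looks roughly like this (a sketch; the device name and mount point are assumptions and may differ on your system):

# Mount the RHEL installation DVD read-only (the device may also appear as /dev/sr0)
mkdir -p /media/cdrom
mount -o ro /dev/cdrom /media/cdrom

# On the RHEL 6 media, the RPMs live under the Packages directory
ls /media/cdrom/Packages | head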

In order to set up a local yum repository, we need to install the createrepo package. The createrepo package has dependencies on two additional packages: deltarpm and python-deltarpm. To successfully install the createrepo package, we will invoke the rpm command with the -ihv option and provide the names of all three packages:
Rpm install
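
The rpm invocation shown in the screenshot would be along these lines (a sketch; the exact file names carry version numbers, hence the wildcards, and the mount point matches the assumption above):

# Install createrepo and its two dependencies in a single rpm transaction from the media
cd /media/cdrom/Packages
rpm -ihv deltarpm-*.rpm python-deltarpm-*.rpm createrepo-*.rpm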

We have successfully installed the createrepo package. The next step will be to copy all the RPMs from the DVD to an area on the local file system. In my example, I am copying the files to the /tmp file system but you will want to select a more permanent location.

We need to create a local yum repository (local.repo) file with the following entries:

[root@rac1 yum.repos.d]# cat /etc/yum.repos.d/local.repo
[local]
name=Local Repository
baseurl=file:///tmp/Packages
enabled=1
gpgcheck=0
protect=1

Notice the baseurl parameter. You will need to replace this parameter with the location of your local directory where you placed all the RPMs. After the files are copied, we will invoke the createrepo command and provide the location of the directory where the RPMs were copied to:
Createrepo
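
The copy and the createrepo run look roughly like this (a sketch; /tmp/Packages matches the baseurl in local.repo above, but a more permanent location is preferable):

# Copy the RPMs off the media into the directory referenced by baseurl
mkdir -p /tmp/Packages
cp /media/cdrom/Packages/*.rpm /tmp/Packages/

# Build the repository metadata and clear any stale yum caches
createrepo /tmp/Packages
yum clean all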

We have successfully created our local yum repository. Now we are ready to install the packages required for Oracle with yum. Let's install the rsync RPM for demonstration purposes:

[root@rac1 Packages]# yum install rsync
Loaded plugins: product-id, security, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Repository rhel-source is listed more than once in the configuration
Repository rhel-source-beta is listed more than once in the configuration
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package rsync.x86_64 0:3.0.6-9.el6 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

==============================================================================================================================================================================
 Package                                 Arch                                     Version                                       Repository                               Size
==============================================================================================================================================================================
Installing:
 rsync                                   x86_64                                   3.0.6-9.el6                                   local                                   334 k

Transaction Summary
==============================================================================================================================================================================
Install       1 Package(s)

Total download size: 334 k
Installed size: 682 k
Is this ok [y/N]: y
Downloading Packages:
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : rsync-3.0.6-9.el6.x86_64                                                                                                                                   1/1 
  Verifying  : rsync-3.0.6-9.el6.x86_64                                                                                                                                   1/1 

Installed:
  rsync.x86_64 0:3.0.6-9.el6                                                                                                                                                  

Complete!

Now you are ready to install the remaining packages required by Oracle; a combined yum command follows the lists below. Here is the comprehensive list for Oracle Database 11g Release 2:

•	binutils-2.20.51.0.2-5.11.el6 (x86_64)
•	compat-libstdc++-33-3.2.3-69.el6 (x86_64)
•	glibc-2.12-1.7.el6 (x86_64)
•	ksh-*.el6 (x86_64) 
•	libaio-0.3.107-10.el6 (x86_64)
•	libgcc-4.4.4-13.el6 (x86_64)
•	libstdc++-4.4.4-13.el6 (x86_64)
•	make-3.81-19.el6 (x86_64)
•	compat-libcap1-1.10-1 (x86_64)
•	gcc-4.4.4-13.el6 (x86_64)
•	gcc-c++-4.4.4-13.el6 (x86_64)
•	glibc-devel-2.12-1.7.el6 (x86_64)
•	libaio-devel-0.3.107-10.el6 (x86_64)
•	libstdc++-devel-4.4.4-13.el6 (x86_64)
•	sysstat-9.0.4-11.el6 (x86_64)
•	unixODBC-2.2.11-7.1.x86_64.rpm 
•	unixODBC-2.2.11-7.1.i386.rpm 
•	unixODBC-devel-2.2.11-7.1.x86_64.rpm
•	unixODBC-devel-2.2.11-7.1.i386.rpm

Optional 32-bit Client Software

•	compat-libstdc++-33-3.2.3-69.el6 (i686)
•	glibc-2.12-1.7.el6 (i686)
•	glibc-devel-2.12-1.7.el6 (i686)
•	libaio-0.3.107-10.el6 (i686)
•	libaio-devel-0.3.107-10.el6 (i686)
•	libgcc-4.4.4-13.el6 (i686)
•	libstdc++-4.4.4-13.el6 (i686)
•	libstdc++-devel-4.4.4-13.el6 (i686)

In addition to the above-mentioned RPMs, here is an additional set of packages that may be of interest to you as an administrator:

•	screen
•	wget
•	rsync
•	nmon
•	uuencode
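
With the local repository in place, the required 64-bit packages listed above can be pulled in with a single yum transaction; a sketch (yum resolves the exact versions, and the optional 32-bit variants can be added with an explicit .i686 suffix if the client software is needed):

# Install the Oracle Database 11g Release 2 prerequisite packages
yum install -y binutils compat-libstdc++-33 compat-libcap1 gcc gcc-c++ \
    glibc glibc-devel ksh libaio libaio-devel libgcc libstdc++ \
    libstdc++-devel make sysstat unixODBC unixODBC-devel

# Handy administrator extras mentioned above (availability depends on your repository)
yum install -y screen wget rsync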

Next, let's modify our kernel parameters in /etc/sysctl.conf. In your environment, if a kernel parameter value is already higher than the value listed below, do not lower it. Range values (such as net.ipv4.ip_local_port_range) should match exactly. Parameters such as SHMMAX should be sized according to the amount of physical memory on the database server (for example, I generally tell customers that a good starting point is one half to two thirds or more of physical memory, depending on how much physical memory they have and how much PGA and how many dedicated server processes they expect). SHMALL is derived from the physical RAM size divided by the page size.
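
As a worked example of that sizing guidance (a sketch; the one-half fraction is a starting point, not a hard rule): on a server with 64 GB of RAM and a 4 KB page size, half of memory gives SHMMAX = 32 x 1024^3 = 34359738368 bytes, and SHMALL = 64 GB / 4 KB = 16777216 pages. The same values can be derived on the host itself:

# Derive starting values for kernel.shmmax and kernel.shmall from physical RAM
PAGE_SIZE=$(getconf PAGE_SIZE)                                # typically 4096
MEM_BYTES=$(awk '/MemTotal/ {print $2 * 1024}' /proc/meminfo) # MemTotal is reported in kB

# SHMMAX: roughly half of physical memory as a starting point, in bytes
echo "kernel.shmmax = $(( MEM_BYTES / 2 ))"

# SHMALL: total shareable memory expressed in pages
echo "kernel.shmall = $(( MEM_BYTES / PAGE_SIZE ))"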

# --Setting up Kernel Parameters for Oracle Database 11g R2 installation
#

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 4398046511104

# Controls the maximum total amount of shared memory, in pages
kernel.shmall = 1073741824
# For 11g, recommended value for file-max is 6815744
fs.file-max = 6815744
# For 10g, uncomment 'fs.file-max 327679', comment other entries for this parameter and re-run sysctl -p
# fs.file-max:327679
kernel.msgmni = 2878
kernel.sem = 250 32000 100 142
kernel.shmmni = 4096
net.core.rmem_default = 262144
# For 11g, recommended value for net.core.rmem_max is 4194304
net.core.rmem_max = 4194304
# For 10g, uncomment 'net.core.rmem_max 2097152', comment other entries for this parameter and re-run sysctl -p
# net.core.rmem_max=2097152
net.core.wmem_default = 262144
# For 11g, recommended value for wmem_max is 1048576
net.core.wmem_max = 1048576
# For 10g, uncomment 'net.core.wmem_max 262144', comment other entries for this parameter and re-run sysctl -p
# net.core.wmem_max:262144
fs.aio-max-nr = 3145728
# For 11g, recommended value for ip_local_port_range is 9000 65500
net.ipv4.ip_local_port_range = 9000 65500
# For 10g, uncomment 'net.ipv4.ip_local_port_range 1024 65000', comment other entries for this parameter and re-run sysctl -p
# net.ipv4.ip_local_port_range:1024 65000
# Added min_free_kbytes 50MB to avoid OOM killer on EL4/EL5
vm.min_free_kbytes = 51200

To load these new settings into the running kernel, run the "sysctl -p" command as root.

As the final steps, we need to create the dba, oinstall, and asmadmin groups and the oracle user account. The following set of scripts can be leveraged to automate this process:

/usr/sbin/groupadd -g xxxx oinstall
/usr/sbin/groupadd -g xxxx dba
/usr/sbin/groupadd -g xxxx asmadmin

Where xxxx is the gid that conforms to your corporate standards.

useradd -u xxxx -g xxxx -s /bin/bash -d /home/oracle -c "Oracle Admin" oracle
passwd oracle

Where xxxx is the uid and gid that conforms to your corporate standards.

Lastly, we will create the directories for the Grid Infrastructure and Oracle database software:

mkdir -p /u01/app/11.2.0/grid
mkdir -p /u01/app/oracle/product
mkdir -p /u01/app/oraInventory
mkdir -p /u01/app/oracle/general

chown -R oracle:oinstall /u01/app
chmod -R 775 /u01/app

Set the home directory for the oracle user to /home/oracle, and make sure the nobody user exists.
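
A quick sanity check for the nobody account (RHEL creates it by default, so the useradd branch rarely fires):

# Verify the unprivileged 'nobody' user exists; create it if it is missing
id nobody || /usr/sbin/useradd nobody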


You should synchronize the system time between your RAC nodes, and even between your primary and standby database servers, by enabling the NTP daemon. Enable NTP with the -x option to allow for gradual time changes, also referred to as slewing. This slewing option is mandatory for Real Application Clusters (RAC) and is also recommended for Data Guard configurations. To set up NTP with the -x option, modify the /etc/sysconfig/ntpd file, add the flag to the OPTIONS variable, and restart the service with the "service ntpd restart" command.

# Drop root to id 'ntp:ntp' by default.
#OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -g"
OPTIONS="-x -u ntp:ntp -p /var/run/ntpd.pid"

You can check your current NTP configuration by checking the process status and filtering on the ntp daemon. In the example below, we will start the ntpd service and check to confirm that the settings are correct with the ps command:

[root@rac1 sysconfig]# service ntpd start
Starting ntpd:                                             [  OK  ]

[root@rac1 sysconfig]# ps -ef |grep -i ntp
ntp       3496     1  0 10:38 ?        00:00:00 ntpd -x -u ntp:ntp -p /var/run/ntpd.pid
root      3500  2420  0 10:39 pts/1    00:00:00 grep -i ntp
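
To make sure ntpd comes back with the slewing option after a reboot, also enable the service at boot time and confirm the runlevel configuration:

# Enable ntpd across reboots and verify
chkconfig ntpd on
chkconfig --list ntpd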

We need to test the private interconnect network bandwidth to make sure that it will accommodate our RAC cache fusion workload. The majority of companies leverage their existing network switches and provide a separate, non-routable private VLAN that is available only to the members of the cluster.

For this particular example, we are testing the bandwidth with 1M, 4M, 10M, and 100M block sizes.

cat scp_performance.ksh
echo == 1M files x 100
dd if=/dev/zero bs=1M count=100 |ssh rhel59dra-priv dd of=/dev/null
echo == 1M files x 1000
dd if=/dev/zero bs=1M count=1000 |ssh rhel59dra-priv dd of=/dev/null

echo == 4M files x 100
dd if=/dev/zero bs=4M count=100 |ssh rhel59dra-priv dd of=/dev/null
echo == 10M files x 100
dd if=/dev/zero bs=10M count=100 |ssh rhel59dra-priv dd of=/dev/null
echo == 100M files x 100
dd if=/dev/zero bs=100M count=100 |ssh rhel59dra-priv dd of=/dev/null

Here is a sample of the output from executing this simple shell script:

$ ./scp_performance.ksh
== 1M files x 100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 1.99392 seconds, 52.6 MB/s
200671+10061 records in
204800+0 records out
104857600 bytes (105 MB) copied, 1.89443 seconds, 55.4 MB/s
== 1M files x 1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 18.505 seconds, 56.7 MB/s
2012814+89686 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 18.4261 seconds, 56.9 MB/s
== 4M files x 100
100+0 records in
100+0 records out
419430400 bytes (419 MB) copied, 7.16756 seconds, 58.5 MB/s
806277+34772 records in
819200+0 records out
419430400 bytes (419 MB) copied, 7.09103 seconds, 59.1 MB/s
== 10M files x 100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB) copied, 17.959 seconds, 58.4 MB/s
2016332+86632 records in
2048000+0 records out
1048576000 bytes (1.0 GB) copied, 17.9087 seconds, 58.6 MB/s
== 100M files x 100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 183.415 seconds, 57.2 MB/s
20145249+862669 records in
20480000+0 records out
10485760000 bytes (10 GB) copied, 183.329 seconds, 57.2 MB/s
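
Keep in mind that dd piped through ssh measures throughput through the ssh encryption layer, so it tends to understate the raw link speed. If you also want a quick unencrypted measurement, a tool such as iperf can be used instead; a sketch, assuming iperf is installed on both nodes:

# On the remote node (rhel59dra-priv): start an iperf server
iperf -s

# On the local node: run a 30-second throughput test across the private interconnect
iperf -c rhel59dra-priv -t 30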
Posted in RAC

This little exercise demonstrates how to move the control file from one ASM disk group to another, make multiple copies of the control file, or even move the control file from the file system into ASM. First, change the initialization parameter to point to the new control file locations. At a minimum, we should always multiplex the control files in ASM: one copy of the control file should go into the +DATA disk group, and another copy should go into the +FRA disk group:

SQL> alter system set control_files='+DATA/PROD/controlfile/control01.ctl','+fra/PROD/controlfile/control02.ctl' scope=spfile;

System altered.

Next, we need to shut down the database and start an instance in NOMOUNT mode. If you are on RAC, this is particularly important because all instances need to be down:
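
On RAC, the RMAN shutdown below stops only the local instance. One simple way to be certain every instance is down first is srvctl; a sketch, using the PROD database name from this example:

# Stop all instances of the database across the cluster, then confirm nothing is left running
srvctl stop database -d PROD -o immediate
srvctl status database -d PROD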

[oracle@rhel59a dbca]$ rman target /
RMAN> shutdown immediate;

using target database control file instead of recovery catalog
Oracle instance shut down

RMAN> startup nomount;

connected to target database (not started)
Oracle instance started

Total System Global Area     801701888 bytes

Fixed Size                     2232640 bytes
Variable Size                595594944 bytes
Database Buffers             197132288 bytes
Redo Buffers                   6742016 bytes

Next, we want to restore the control file from one of the existing copies (the location referenced by the original control_files initialization parameter) and mount the database to confirm that it worked:

RMAN> restore controlfile from '+DATA02/PROD/controlfile/current.266.809597795';

Starting restore at 11-MAR-13
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=16 instance=PROD1 device type=DISK

channel ORA_DISK_1: copied control file copy
output file name=+DATA/PROD/controlfile/control01.ctl
output file name=+FRA/PROD/controlfile/control02.ctl
Finished restore at 11-MAR-13

RMAN> alter database mount;

database mounted
released channel: ORA_DISK_1

As the final step, we want to shut the instance down again and bring the database up across all the RAC nodes with the srvctl command:

RMAN> shutdown immediate;

database dismounted
Oracle instance shut down

RMAN> exit

Recovery Manager complete.

[oracle@rhel59a dbca]$ srvctl start database -d PROD

SQL> show parameter control_files

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
control_files                        string      +DATA/PROD/controlfile/contr
                                                 ol01.ctl, +FRA/PROD/controlf
                                                 ile/control02.ctl
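
Optionally, you can also confirm both multiplexed copies directly from V$CONTROLFILE; a quick sketch run from any instance:

# List the control file copies the running instance is actually using
sqlplus -S / as sysdba <<'EOF'
select name from v$controlfile;
EOF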

Posted in ASM