APPS:ASM:DATAGUARD:EXADATA:OEM:RAC: 2012

Friday, September 21, 2012

Simple Mantra on Developing Database Software:

Here is the simple mantra from my favourite Oracle Man Tom Kyte.

1).You should do it in a single SQL statement if at all possible.

2).If you cannot do it in a single SQL statement, do it in PL/SQL.

3).If you cannot do it in PL/SQL, try a Java stored procedure.

4).If you cannot do it in Java, do it in a C external procedure.

5).If you cannot do it in a C external procedure, you might want to seriously think about why it is you need to do it.

Wednesday, August 29, 2012

Tom Kyte's View on Pro C

"You need to know C well before using Pro*C.Pro*c simply lets you embedded SQL in C code. If you do not know how to code in C, you will not get very far at all with Pro*C.I use pro*c only when I cannot accomplish the task efficiently in PLSQL or SQL. In 9i or Above,with external tables, merge, pipelined functions -- I'm very very hard pressed to find a reason to use C.I've use pro*c to dump data to flat files and sometimes to load it."

Friday, August 3, 2012

How To Set Time Prompt in SQLPLUS

We need to modify the glogin.sql sql file which will be located under $ORACLE_HOME/sqlplus/admin.
Take the backup of the glogin.sql and add the below entry.

set lines 200
set pages 200
set termout off
col dbname new_value prompt_dbname
select substr(global_name,1,instr(global_name,'.')-1) dbname
from global_name;
set sqlprompt "&&prompt_dbname> "
set termout on
set time on

System Statistics : A Quick Look

1).Till 8i ANALYZE command was used to collect statistics.However ANALYZE has following Problems

       1).Doesn't support External Tables.
       2).Not good for partitioned Table. As it gathers stats for only segment and index level.

2).From 9i Its recommended to use dbms_stats.With dbms_stats,You can export, import, lock, unlock, get, set, gather and delete at database/dictionary/table/schema/index/system level.

3).Till 8i the CBO's Cost Estimations were based on I/O Cost Model.

4).Single Block and Multi Block reads are equally costly. Though in reality single block reads are faster. This gave unfair advantage to the Full Table Scans.To fix this we used optimizer_index_caching and optimizer_index_cost_adj is used.

5).New Cost model was introduced in 9i. Known as "CPU Cost Model".Wherein it gets performance of CPU and I/O Subsystem.

6).9i has no System Stats were available by default.10g has.

7).There are 2 types of System Statistics

        1).No workload Stats(Artificial Application Load).

        2).Workload Stats(Real Application Load)

8).The main difference is method used to measure I/O Performance.You can still use I/O Cost Model in 10g by using _optimizer_cost_model to IO

9).System Stats are stored in AUX_STATS$ table. 
 
   SQL>SELECT pname, pval1, pval2   FROM sys.aux_stats$  WHERE sname = 'SYSSTATS_INFO';
 
       For Pname – Status can be completed / badstat

10).SYSSTATS_MAIN set contains system statistics themselves.
    
   SQL>SELECT pname, pval1   FROM sys.aux_stats$  WHERE sname = 'SYSSTATS_MAIN';
 
11).Single set of statistics per database.So if a RAC machine has different configuration.Need to decide whether you need enable system statistics.

12).You Gather system statistics using below command,
 
   SQL>dbms_stats.gather_system_stats(gathering_mode => 'noworkload');

13).When using “noworkload” make sure the system is idle. As database uses synthetic benchmark. In 10g even if you delete the system statistics. Database will gather the statistics during next start-up.

14).Gathering statistics usually takes less than one minute.

15).With workload Statistcs,you use

         1).exec dbms_stats.gather_system_stats(gathering_mode => 'start')
         2).Do actual work.  
         3).dbms_stats.gather_system_stats(gathering_mode => 'stop')

16).To avoid manually taking the ending snapshot, it is also possible to set the parameter gathering_mode

   SQL> dbms_stats.gather_system_stats(gathering_mode => 'interval', interval => 30)

17).You can also manually the set the statistics. This is helpful if you have different workload for in different period and you want to average the values.Say for example,

   SQL>dbms_stats.set_system_stats(pname => 'CPUSPEED', pvalue => 772);

18).The CPU cost is defined as number of single-block reads that can be performed per second.


19).Workload statistics are preferred to “noworkload” when both of them are available.

20).Query optimizer performs several sanity checks that could disable or partially replace workload statistics.For example if mreadtim is less than sreadtime CBO re-computes again.

Courtesy: Troubleshooting Oracle Performance by Christian Antognini.

How to Start 11g R2 RAC Cluster

There are 2 methods.

Method 1:

1).Start the HAS (High Availability Service) on each node.

$ crsctl start has

2).To Start the Cluster on both the nodes.

$ crsctl start cluster –all

If OHASD is already running, you can start the Clusterware stack on any node from local node

Method 2:

1).Run the below command on each node. Starts both HAS and CRS

$ crsctl start crs

Noticed that just starting has also starts up the whole stack.

How to Check Clusterware Version and Name

To check the software versions on a single node. Software version is the lastest version installed in on an cluster.

You would use this option,When you do the rolling upgrade.

$ crsctl query crs softwareversion [hostname]

Active version is lowest version anywhere in the cluster. This is the command you would normally need to use

$ crsctl query crs activeversion

The version of oracle Clusterware must be always greater than the oracle products installed in the clusters

Permanently operating oracle Clusterware (software version vs active version) is not supported.

To check the cluster name use,

$ORA_CRS_HOME/bin/cemutlo -n

UNIX Useful Commands - FIND

1).Find files modified within specific time

Find files that have been modified less than 3 days ago.

`find . -mtime -3`

To be more specific and just check for the file change time, use:

`find . -ctime -3`

To be even more specific you can set a date range by using: Find all files changed, and thus modified, at least one day ago but within three days ago.

`find . -ctime +1 -a -ctime -3`

2).Find and Replace text in Multiple Files

Use,

`find . -type f | xargs grep -l 'IBM' | xargs sed -i '' -e 's/IBM/ORACLE/g'`

The first part of the command is find, which finds all files and excludes directories. That result is then piped to grep, which lists all files that contain IBM. The results of is then sent to sed, which replaces all occurances of IBM with ORACLE.

3).Find and Move files to a specific location

Use,

`find . -newer lmd.trc -exec cp '{}' /tmp/ \ ;`

Above command find the file newer than the lmd.trc and then copies all the files to the tmp directory.

Unix Useful Commands - Part 2

1). Find whether OS is 64/32 Bit Kernel in UNIX

OS	Command
AIX	getconf -a \| grep –i KERN OR bootinfo -K OR file /usr/lib/boot/unix
HP-UX	getconf -a \| grep -i KERN
Solaris	isainfo –kv
Linux	uname –a

2).Find Free Physical Memory in UNIX

OS	Command
AIX	bootinfo -r
HP-UX	grep -i Physical /var/adm/syslog/syslog.log
Solaris	prtconf
Linux	free -m

3).Find CPU details in UNIX

OS	Command
AIX	lsdev -Cc processor
HP-UX	ioscan –fnC processor
Solaris	psrinfo -v
Linux	cat /proc/cpuinfo

4).Find SWAP Size details in UNIX

OS	Command
AIX	lsps –a
HP-UX	`swapinfo -a`
Solaris	swap –l
Linux	free -m

Clusterware Processes in 11g RAC R2 Environment

i).Cluster Ready Services (CRS)

$ ps -ef | grep crs | grep -v grep

root 25863 1 1 Oct27 ? 11:37:32 /opt/oracle/grid/product/11.2.0/bin/crsd.bin reboot

crsd.bin => The above process is responsible for start, stop, monitor and failover of resource. It maintains OCR and also restarts the resources when the failure occurs.

This is applicable for RAC systems. For Oracle Restart and ASM ohasd is used.

ii).Cluster Synchronization Service (CSS)

$ ps -ef | grep -v grep | grep css

root 19541 1 0 Oct27 ? 00:05:55 /opt/oracle/grid/product/11.2.0/bin/cssdmonitor

root 19558 1 0 Oct27 ? 00:05:45 /opt/oracle/grid/product/11.2.0/bin/cssdagent

oragrid 19576 1 6 Oct27 ? 2-19:13:56 /opt/oracle/grid/product/11.2.0/bin/ocssd.bin

cssdmonitor => Monitors node hangs(via oprocd functionality) and monitors OCCSD process hangs (via oclsomon functionality) and monitors vendor clusterware(via vmon functionality).This is the multi threaded process that runs with elavated priority.

Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdmonitor

cssdagent => Spawned by OHASD process.Previously(10g) oprocd, responsible for I/O fencing.Killing this process would cause node reboot.Stops,start checks the status of occsd.bin daemon

Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdagent

occsd.bin => Manages cluster node membership runs as oragrid user.Failure of this process results in node restart.

Startup sequence: INIT --> init.ohasd --> ohasd --> ohasd.bin --> cssdagent --> ocssd --> ocssd.bin

iii) Event Management (EVM)

$ ps -ef | grep evm | grep -v grep

oragrid 24623 1 0 Oct27 ? 00:30:25 /opt/oracle/grid/product/11.2.0/bin/evmd.bin

oragrid 25934 24623 0 Oct27 ? 00:00:00 /opt/oracle/grid/product/11.2.0/bin/evmlogger.bin -o /opt/oracle/grid/product/11.2.0/evm/log/evmlogger.info -l /opt/oracle/grid/product/11.2.0/evm/log/evmlogger.log

evmd.bin => Distributes and communicates some cluster events to all of the cluster members so that they are aware of the cluster changes.

evmlogger.bin => Started by EVMD.bin reads the configuration files and determines what events to subscribe to from EVMD and it runs user defined actions for those events.

iv).Oracle Root Agent

$ ps -ef | grep -v grep | grep orarootagent

root 19395 1 0 Oct17 ? 12:06:57 /opt/oracle/grid/product/11.2.0/bin/orarootagent.bin

root 25853 1 1 Oct17 ? 16:30:45 /opt/oracle/grid/product/11.2.0/bin/orarootagent.bin

orarootagent.bin => A specialized oraagent process that helps crsd manages resources owned by root, such as the network, and the Grid virtual IP address.

The above 2 process are actually threads which looks like processes. This is a Linux specific

v).Cluster Time Synchronization Service (CTSS)

$ ps -ef | grep ctss | grep -v grep

root 24600 1 0 Oct27 ? 00:38:10 /opt/oracle/grid/product/11.2.0/bin/octssd.bin reboot

octssd.bin => Provides Time Management in a cluster for Oracle Clusterware

vi).Oracle Agent

$ ps -ef | grep -v grep | grep oraagent

oragrid 5337 1 0 Nov14 ? 00:35:47 /opt/oracle/grid/product/11.2.0/bin/oraagent.bin

oracle 8886 1 1 10:25 ? 00:00:05 /opt/oracle/grid/product/11.2.0/bin/oraagent.bin

oragrid 19481 1 0 Oct27 ? 01:45:19 /opt/oracle/grid/product/11.2.0/bin/oraagent.bin

oraagent.bin => Extends clusterware to support Oracle-specific requirements and complex resources. This process runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1).

ORACLE HIGH AVAILABILITY SERVICES STACK

i) Cluster Logger Service

$ ps -ef | grep -v grep | grep ologgerd

root 24856 1 0 Oct27 ? 01:43:48 /opt/oracle/grid/product/11.2.0/bin/ologgerd -m mg5hfmr02a -r -d /opt/oracle/grid/product/11.2.0/crf/db/mg5hfmr01a

ologgerd => Receives information from all the nodes in the cluster and persists in a CHM repository-based database. This service runs on only two nodes in a cluster

ii).System Monitor Service (osysmond)

$ ps -ef | grep -v grep | grep osysmond

root 19528 1 0 Oct27 ? 09:42:16 /opt/oracle/grid/product/11.2.0/bin/osysmond

osysmond => The monitoring and operating system metric collection service that sends the data to the cluster logger service. This service runs on every node in a cluster

iii). Grid Plug and Play (GPNPD):

$ ps -ef | grep gpn

oragrid 19502 1 0 Oct27 ? 00:21:13 /opt/oracle/grid/product/11.2.0/bin/gpnpd.bin

gpnpd.bin => Provides access to the Grid Plug and Play profile, and coordinates updates to the profile among the nodes of the cluster to ensure that all of the nodes have the most recent profile.

iv).Grid Interprocess Communication (GIPC):

$ ps -ef | grep -v grep | grep gipc

oragrid 19516 1 0 Oct27 ? 01:51:41 /opt/oracle/grid/product/11.2.0/bin/gipcd.bin

gipcd.bin => A support daemon that enables Redundant Interconnect Usage.

v). Multicast Domain Name Service (mDNS):

$ ps -ef | grep -v grep | grep dns

oragrid 19493 1 0 Oct27 ? 00:01:18 /opt/oracle/grid/product/11.2.0/bin/mdnsd.bin

mdnsd.bin => Used by Grid Plug and Play to locate profiles in the cluster, as well as by GNS to perform name resolution. The mDNS process is a background process on Linux and UNIX and on Windows.

vi).Oracle Grid Naming Service (GNS)

$ ps -ef | grep -v grep | grep gns

gnsd.bin => Handles requests sent by external DNS servers, performing name resolution for names defined by the cluster.

Troubleshooting :Cursor :Pin S Wait Event

Recently we had performance issue with a batch job on a 3 Node RAC running on Linux.The job got hung from the database side, all we could see is session waiting on the Cursor pin S wait event. is not moving anywhere

When I tried to find out the BLOCKING_SESSION & BLOCKING_INSTANCE in the v$session. The column was blank.!!

Here is how the issue got fixed.

select sid,serial#,p1,p2,p3 from v$session where sid=178;
SID SERIAL# P1 P2 P3
---------- ---------- ---------- ---------- -------------------- ---------- ---------- ---------- ----------
178 12856 3552538230 6.4425E+11 2.5761E+10

Get the P3 Value from the above query and then substitute in the below query.You will the get Location

select decode(trunc(2.5376E+10/4294967296),0,trunc(2.5376E+10/65536),trunc(2.5376E+10/4294967296)) LOCATION_ID from dual;

LOCATION_ID
----------------------
5

The below query may not be useful in debugging. But could help to find more about the wait event.

SELECT MUTEX_TYPE, LOCATION FROM x$mutex_sleep WHERE mutex_type like 'Cursor Pin%'
and location_id=&&LOCATION_ID;
Enter value for location_id: 5
old 4: and location_id=&&LOCATION_ID
new 4: and location_id=5
MUTEX_TYPE LOCATION
--------------------------------------------
Cursor Pin kkslce [KKSCHLPIN2]

select MUTEX_TYPE,LOCATION,REQUESTING_SESSION,BLOCKING_SESSION from V$MUTEX_SLEEP_HISTORY where MUTEX_IDENTIFIER=3552538230;

MUTEX_TYPE LOCATION REQUESTING_SESSION BLOCKING_SESSION
------------------------------------------------------------------------------------------------ ------------------ ----------------
Cursor Pin kkslce [KKSCHLPIN2] 178 150

Here P3 from the v$session is value for MUTEX_IDENTIFIER.

Lets see what session 150 is doing.

select SID,SERIAL#,STATUS,ACTION,STATE,EVENT from v$session where sid=150;

SID SERIAL# STATUS ACTION STATE EVENT
---------------- ----------------------------------------------------------------------------------------------------------
150 2905 KILLED JDBC Thin Client WAITED SHORT TIME library cache lock

Here we see that session is actually killed but still holding the lock.

select p.pid,p.SPID,s.SID from v$process p,v$session s where s.paddr = p.addr and s.SID = 150

SPID is the Operating system identifier (OS-PID)
SID is the Oracle session identifier
PID is the Oracle process identifier

Find the Oracle process and kill it at OS level.Once done.the Batch resumed smoothly.