Wednesday, April 16, 2014

All about Infiniband switches on Exadata :


1).The InfiniBand Network connects the database servers and Exadata Storage Servers using the InfiniBand switches on the rack. It is a private network between the database servers and Exadata Storage Servers.
2).A Exadata Rack contains at least 2 Infiniband switches. The third switch is called Spine Switch. Which connects both the leaf switches in half and full rack database machines.






3). Spine switch is for Connection of multiple racks to form a single and larger database machine environment.

4). Each Server (Storage and Database Server) Contains 2 Infiniband ports which are bonded together in ACTIVE/PASSIVE way (till X3 and ACTIVE/ACTIVE in X4).

5). The Active Passive connections are spread across both the switches using FAT-TREE switched Fabric Network Topology.

6). Infiniband switches run centOS.

7). MONITOR SWITCH PORTS:   To check failed switch and sensor hardware that exceeds preset thresholds, Run these commands every 1-2 minutes.
Use,
Login to the switch using root .
and run,
$ showunhealthy
OK - No unhealthy sensors
$ checkpower
PSU 0 present OK
PSU 1 present OK
All PSUs OK

7.1).In case of any issue reported on the above command, Use "env_test" command.
Login as root on IB switch and run,
# env_test
Environment test started:
Starting Environment Daemon test:
Environment daemon running
Environment Daemon test returned OK
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.27 V
Measured 3.3V Standby = 3.39 V
Measured 12V = 11.97 V
Measured 5V = 4.99 V
Measured VBAT = 3.09 V
Measured 2.5V = 2.49 V
Measured 1.8V = 1.78 V
Measured I4 1.2V = 1.22 V
Voltage test returned OK
Starting PSU test:
PSU 0 present OK
PSU 1 present OK
PSU test returned OK
Starting Temperature test:
Back temperature 29
Front temperature 30
SP temperature 48
Switch temperature 43, maxtemperature 45
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 12099
Fan 2 running at rpm 11881
Fan 3 running at rpm 12208
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting Onboard ibdevice test:
Switch OK
All Internal ibdevices OK
Onboard ibdevice test returned OK
Starting SSD test:
SSD test returned OK
Environment test PASSED

8).MONITOR IB SWITCH PORTS : Use ibqueryerrors.pl on any of the database node or switches. Storage servers need not be checked as its automatically checked by Exadata Cell software(Part of MS)
Login as root to database or IB Switch and run,
# ibqueryerrors.pl -s RcvSwRelayErrors,RcvRemotePhysErrors,XmtDiscards,XmtContraintErrors,RcvContraintErrors,ExcBufOverrunErrors,Vl15Dropped
You should run this every 1 or 2 min to check if the value is raising.
Check for SymbolErrors,RcvErrors,LinkIntegrityErrors

9).  To check infiniband Firmware versions,
On infiniband Switch, Login as root user and then,
# version | head -1 | cut -d" " -f5
10).Monitor Database Node IB Ports:
Login to database server as root and then run
ibstatus => check that every port shows up in the output(2 per node).

 Sample Output :

Infiniband device 'mlx4_0' port 1 status:
        default gid:     fe80:0000:0000:0000:0021:2800:01ce:d28b
        base lid:        0x26
        sm lid:          0x3
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)
        link_layer:      IB

Infiniband device 'mlx4_0' port 2 status:
        default gid:     fe80:0000:0000:0000:0021:2800:01ce:d28c
        base lid:        0x27
        sm lid:          0x3
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)
        link_layer:      IB

perfquery => Check for SymbolErrors, RcvErrors, LinkIntegrityErrors
ifconfig, => Check the bondib0(ib0 and ib1) are up.
ping => Check for connectivity.
rds-ping => Check for connectivity.

11). Monitor the infiniband Fabric. Can be run from either Database node or one of the infiniband switches.
To Locate SM running,
Login as root on DB node or IB switch and run
# sminfo
sminfo: sm lid 3 sm guid 0x2128469156a0a0, activity count 55495849 priority 14 state 3 SMINFO_MASTER
and then
# ibswitches
Switch  : 0x002128469156a0a0 ports 36 "SUN DCS 36P QDR aeldb3sw-ibs0 10.146.28.50" enhanced port 0 lid 3 lmc 0
Switch  : 0x00212846914ba0a0 ports 36 "SUN DCS 36P QDR aeldb3sw-ibb0 10.146.28.52" enhanced port 0 lid 2 lmc 0
Switch  : 0x002128469157a0a0 ports 36 "SUN DCS 36P QDR aeldb3sw-iba0 10.146.28.51" enhanced port 0 lid 1 lmc 0

From the above command, 0x002128469156a0a0   is the Switch where SM is running(Compare the GUID) from the above command(sminfo)
Or login to one of IB switch and simply run,
# getmaster
Local SM not enabled
20140131 09:55:06 Master SubnetManager on sm lid 3 sm guid 0x2128469156a0a0 : SUN DCS 36P QDR aeldb3sw-ibs0 10.146.28.50

12) On a Full or Half rack node, Spine switch is present and thats where the SM should be running,
To identify spine switch,
run,
# ibnetdiscover -p | awk '/^SW + [0-9] +  + [0-9] + + 0x[0-9 \ a-e]+ + [0 - 9] + x .DR - [SW | CA] .*/ {if (spine [$4] == " ") spine[$4] == "yes" if ((spine [$8] == "CA")  spine[$4] == "no" } END { for (val in spine) if (spine [val] == "yes") print  val }'

13). Infiniband Cables are not as robust as Ethernet (RJ45) ones.  InfiniBand copper cables have strict

Specifications which define the minimum bend radius that they can tolerate.

Thursday, March 27, 2014

OPTIMIZER MODE in Oracle

                ALL_ROWS   : The optimizer will attempt to find an execution plan that completes the statement (typically meaning “returns all the rows”) in the shortest possible time. There are no special constraints built into this code. Used typically in OLAP Systems.

                FIRST_ROWS_N (N => 1, 10,100,1000) The optimizer first estimates the number of rows that will be returned by completely analyzing just the first join order. This tells it what fraction of the total data set the query is supposed to fetch, and it restarts the entire optimization process with the target of finding the execution plan that minimizes the resources required to return that fraction of the total data. This option was introduced in 9i. Used typically in OLTP systems.

                FIRST_ROWS:  Deprecated in 9i, but maintained for backward compatibility. The optimizer will attempt to find an execution plan to return the first row of a result set as rapidly as possible. There are several high-level constraints built into the code.

There are two other options for the optimizer_mode (even in 10g): rule and choose (default).

Up to Oracle9i, the default value is choose. This means that if object statistics are available for at least one of the objects referenced in the SQL statement that has to be optimized, the value all_rows is used; otherwise, rule is used.

As of Oracle Database 10g, the rule-based optimizer has been desupported , and therefore the new default value is all_rows. Consequently, the values choose and rule are no longer supported

Any tables that missed the most recent pass of statistics generation will then be subject to dynamic sampling. This is because the default value in 10g for the parameter optimizer_dynamic_sampling is 2 (which means use dynamic sampling on any table without statistics), rather than 1, as it was in 9i. 


Courtesy  : CBO Fundamentals by Johnathan Lewis and Troubleshooting Oracle Performance  by Christian Antognini

Friday, March 7, 2014

My SQL Scrapbook for Performance Tuning

1).Historic Details about Execution Plan , Elapsed Time , Plan Hash Value and Number of Execution.

col smple_time for a23
col SMPLE_TIME for 999999999999999
select sn.end_interval_time smple_time,
       sql.SQL_ID,
       sql.instance_number    ins,
       sql.PLAN_HASH_VALUE      planhash,
       sql.OPTIMIZER_COST       cost,
       sql.FETCHES_DELTA        fchs,
       sql.EXECUTIONS_DELTA     exes,
       sql.ROWS_PROCESSED_DELTA/1000    Krwps,
       sql.ELAPSED_TIME_DELTA/1000000   elps
from DBA_HIST_SQLSTAT sql, dba_hist_snapshot sn
where sql.sql_id='&sql_iD'
and sql.snap_id= sn.snap_id
and sql.instance_number=sn.instance_number
order by 1;

2).Historic Execution from AWR

select * from table(dbms_xplan.display_awr('&SQL_ID',PLAN_HASH_VALUE));

3). Bind Variables Used in a Query.

col END_INTERVAL_TIME for 999999999
col NAME for a999999
col name for 99999999
select
sn.END_INTERVAL_TIME,
sb.NAME,
sb.VALUE_STRING
from
DBA_HIST_SQLBIND sb,
DBA_HIST_SNAPSHOT sn
where
sb.sql_id='&SQL_ID' and
sb.WAS_CAPTURED='YES' and
sn.snap_id=sb.snap_id
order by
sb.snap_id,
sb.NAME;

4).Check the Histogram.

select column_name from dba_tab_col_statistics where table_name='&TABLE_NAME' and  and owner='&OWNER' order by COLUMN_NAME ;

5).Check the Index Positioning.

select index_name,column_name,column_position from dba_ind_columns where index_name='&INDEX_NAME' and index_owner='&INDEX_OWNER'

6).To Tune the Query.

From the Session level (as sys).

Alter session set sql_trace=TRUE; => This will not give Binds and Waits Events.
Alter session set statistics_level=ALL;  OR
Use Gather_Plan_Statistics Hint in the query you want to tune. Something like below one..
select /*GATHER_PLAN_STATISTICS */

And get the explain plan using.

SELECT * FROM table (DBMS_XPLAN.DISPLAY_CURSOR('&SQL_ID', NULL, 'ALLSTATS LAST')); 

7). Run SQL Tuning Advisor 

DECLARE
  l_sql_tune_task_id  VARCHAR2(100);
BEGIN
  l_sql_tune_task_id := DBMS_SQLTUNE.create_tuning_task (
                          begin_snap  => &BEG_SNAP,
                          end_snap    => &END_SNAP,
                          sql_id      => '&SQL_ID',
                          scope       => DBMS_SQLTUNE.scope_comprehensive,
                          time_limit  => 10000,
                          task_name   => 'MYTUNINGTASK_&SQL_ID',
                          description => 'Tuning task for statement &SQL_ID in AWR.');
  DBMS_OUTPUT.put_line('l_sql_tune_task_id: ' || l_sql_tune_task_id);
END;
/

EXEC DBMS_SQLTUNE.execute_tuning_task(task_name => '&SQL_ID');

SET LONG 10000;
SET PAGESIZE 1000
SET LINESIZE 200
SELECT DBMS_SQLTUNE.report_tuning_task('&SQL_ID') AS recommendations FROM dual;
SET PAGESIZE 24

SET LONG 10000;
SET PAGESIZE 1000
SET LINESIZE 200
spool sqltune_report.lst
SELECT DBMS_SQLTUNE.report_tuning_task('&SQL_ID') AS recommendations FROM dual;
SET PAGESIZE 24
spool off

8).Find AWR Snapshot  Interval and Change its Interval 

select
       extract( day from snap_interval) *24*60+
       extract( hour from snap_interval) *60+
       extract( minute from snap_interval ) "Snapshot Interval",
       extract( day from retention) *24*60+
       extract( hour from retention) *60+
       extract( minute from retention ) "Retention Interval"
from dba_hist_wr_control;

Output is in Minutes.

execute dbms_workload_repository.modify_snapshot_settings (interval => 60,retention => 1576800);


9).Find the Optimizer Statistics for an Index.

set serveroutput on
declare
      l_numrows   number;
     l_numlblks  number;
     l_numdist   number;
     l_avglblk   number;
     l_avgdblk   number;
     l_clstfct   number;
     l_indlevel  number;
   begin
    dbms_stats.get_index_stats(ownname ='&SCHEMA_NAME', indname ='&INDEX_NAME',
                    numrows => l_numrows, numlblks => l_numlblks,
                    numdist => l_numdist, avglblk => l_avglblk,
                    avgdblk => l_avgdblk, clstfct => l_clstfct,
                    indlevel => l_indlevel);
     dbms_output.put_line('No. of rows: ' || l_numrows);
     dbms_output.put_line('No. of blks: ' || l_numlblks);
     dbms_output.put_line('No. of distinct values: ' || l_numdist);
     dbms_output.put_line('Avg leaf blocks for distinct keys: ' || l_avglblk);
     dbms_output.put_line('Avg data blocks pointed to in the table: ' || l_avgdblk);
     dbms_output.put_line('Clustering factor: ' || l_clstfct);
     dbms_output.put_line('Index height: ' || l_indlevel);
   end;
   /

10). To find Longops Sessions

set pagesize 2000
set linesize 150
col message for a70
col opname for a30
col username for a10
select sid||','||serial# sid_Serial, start_time, opname, sofar, totalwork, round(sofar/totalwork*100) Done, round(time_remaining/60) Time_left, username, message
from gv$session_longops where totalwork != 0 and (sid,serial#) in (select sid,serial# from gv$session)
--and totalwork!=sofar
order by start_time;