Wednesday, April 16, 2014

All about InfiniBand switches on Exadata:


1). The InfiniBand network connects the database servers and the Exadata Storage Servers through the InfiniBand switches in the rack. It is a private network between the database servers and the Exadata Storage Servers.
2). An Exadata rack contains at least two InfiniBand switches, the leaf switches. Half and full rack database machines have a third switch, called the spine switch, which connects to both leaf switches.

3). The spine switch is used to connect multiple racks into a single, larger database machine environment.

4). Each server (storage and database) has two InfiniBand ports, which are bonded active/passive up to X3 and active/active in X4.

5). The active and passive connections are spread across both leaf switches in a fat-tree switched fabric topology.

6). The InfiniBand switches run CentOS.

7). MONITOR SWITCH HARDWARE: To check for failed switch hardware and for sensors that exceed preset thresholds, run these commands every 1-2 minutes.
Log in to the switch as root and run
# showunhealthy
OK - No unhealthy sensors
# checkpower
PSU 0 present OK
PSU 1 present OK
All PSUs OK
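
Since these two checks are meant to be repeated every minute or two, a small wrapper can help. This is only a sketch: the function name switch_health_ok is made up for illustration, and it assumes a healthy switch prints the two "all clear" lines shown above ("No unhealthy sensors" and "All PSUs OK").

```shell
# switch_health_ok (hypothetical helper): reads the combined output of
# showunhealthy and checkpower on stdin and succeeds only if both
# "all clear" lines from the sample output above are present.
switch_health_ok() {
    out=$(cat)
    printf '%s\n' "$out" | grep -q 'No unhealthy sensors' &&
        printf '%s\n' "$out" | grep -q 'All PSUs OK'
}

# Possible polling loop on the switch (left commented out; it never returns):
# while true; do
#     { showunhealthy; checkpower; } | switch_health_ok ||
#         echo "$(date): switch needs attention"
#     sleep 60
# done
```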

7.1). If either of the above commands reports a problem, use the "env_test" command.
Log in as root on the IB switch and run
# env_test
Environment test started:
Starting Environment Daemon test:
Environment daemon running
Environment Daemon test returned OK
Starting Voltage test:
Voltage ECB OK
Measured 3.3V Main = 3.27 V
Measured 3.3V Standby = 3.39 V
Measured 12V = 11.97 V
Measured 5V = 4.99 V
Measured VBAT = 3.09 V
Measured 2.5V = 2.49 V
Measured 1.8V = 1.78 V
Measured I4 1.2V = 1.22 V
Voltage test returned OK
Starting PSU test:
PSU 0 present OK
PSU 1 present OK
PSU test returned OK
Starting Temperature test:
Back temperature 29
Front temperature 30
SP temperature 48
Switch temperature 43, maxtemperature 45
Temperature test returned OK
Starting FAN test:
Fan 0 not present
Fan 1 running at rpm 12099
Fan 2 running at rpm 11881
Fan 3 running at rpm 12208
Fan 4 not present
FAN test returned OK
Starting Connector test:
Connector test returned OK
Starting Onboard ibdevice test:
Switch OK
All Internal ibdevices OK
Onboard ibdevice test returned OK
Starting SSD test:
SSD test returned OK
Environment test PASSED
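
With output this long it can be easier to look only at the verdict lines. A minimal sketch, assuming every sub-test ends with a "... test returned ..." line and the run ends with "Environment test PASSED" as in the sample above. What a failing run prints is not shown here, so anything other than "returned OK" is simply treated as suspect. The function name is hypothetical.

```shell
# env_test_problems (hypothetical helper): reads env_test output on stdin.
# Prints every "test returned" line whose verdict is not OK and fails;
# otherwise succeeds only if the final PASSED line is present.
env_test_problems() {
    out=$(cat)
    bad=$(printf '%s\n' "$out" | grep ' test returned ' | grep -v ' test returned OK')
    if [ -n "$bad" ]; then
        printf '%s\n' "$bad"
        return 1
    fi
    printf '%s\n' "$out" | grep -q 'Environment test PASSED'
}

# Possible use on the switch:
# env_test | env_test_problems || echo "env_test reported a problem"
```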

8). MONITOR IB SWITCH PORTS: Use ibqueryerrors.pl on any database node or switch. The storage servers need not be checked, as the Exadata cell software (the MS component) checks them automatically.
Log in as root to a database node or IB switch and run
# ibqueryerrors.pl -s RcvSwRelayErrors,RcvRemotePhysErrors,XmtDiscards,XmtConstraintErrors,RcvConstraintErrors,ExcBufOverrunErrors,VL15Dropped
Run this every 1-2 minutes to see whether any value is rising. The -s option suppresses the listed counters.
Check for SymbolErrors, RcvErrors, LinkIntegrityErrors.
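
To tell whether a counter is rising, two runs have to be compared. A minimal sketch that makes no assumption about the exact report layout: it prints the lines of a later saved report that were not in an earlier one. The function name and the /tmp file names are placeholders.

```shell
# new_error_lines (hypothetical helper): $1 is an earlier saved report,
# $2 a later one. Prints the lines of $2 that do not appear verbatim in $1,
# i.e. ports whose reported counters changed or that are newly listed.
new_error_lines() {
    grep -F -x -v -f "$1" "$2"
}

# Possible use, with the same ibqueryerrors.pl command line for both runs:
#   <ibqueryerrors.pl command> > /tmp/ib_errs.1
#   sleep 120
#   <ibqueryerrors.pl command> > /tmp/ib_errs.2
#   new_error_lines /tmp/ib_errs.1 /tmp/ib_errs.2
```

No output means nothing changed between the two runs.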

9). To check the InfiniBand firmware version,
log in to the InfiniBand switch as root and run
# version | head -1 | cut -d" " -f5

10). Monitor database node IB ports:
Log in to the database server as root and run
ibstatus => Check that every port shows up in the output (2 per node).

Sample output:

Infiniband device 'mlx4_0' port 1 status:
        default gid:     fe80:0000:0000:0000:0021:2800:01ce:d28b
        base lid:        0x26
        sm lid:          0x3
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)
        link_layer:      IB

Infiniband device 'mlx4_0' port 2 status:
        default gid:     fe80:0000:0000:0000:0021:2800:01ce:d28c
        base lid:        0x27
        sm lid:          0x3
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            40 Gb/sec (4X QDR)
        link_layer:      IB

perfquery => Check for SymbolErrors, RcvErrors, LinkIntegrityErrors
ifconfig => Check that bondib0 (ib0 and ib1) is up.
ping => Check for connectivity.
rds-ping => Check for connectivity.
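
The ibstatus check can be scripted. This sketch counts the ports reported as both ACTIVE and LinkUp, assuming the output layout matches the sample above. The function name is made up, and the expected count of 2 comes from the "2 per node" rule.

```shell
# count_active_ports (hypothetical helper): reads ibstatus output on stdin
# and prints how many ports show "state: 4: ACTIVE" followed by
# "phys state: 5: LinkUp".
count_active_ports() {
    awk '
        /^[ \t]*state:/      { active = ($2 == "4:" && $3 == "ACTIVE") }
        /^[ \t]*phys state:/ { if (active && $4 == "LinkUp") n++; active = 0 }
        END                  { print n + 0 }
    '
}

# Possible use on a database node:
# [ "$(ibstatus | count_active_ports)" -eq 2 ] || echo "an IB port is not up"
```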

11). Monitor the InfiniBand fabric. These commands can be run from either a database node or one of the InfiniBand switches.
To locate the running Subnet Manager (SM),
log in as root on a DB node or IB switch and run
# sminfo
sminfo: sm lid 3 sm guid 0x2128469156a0a0, activity count 55495849 priority 14 state 3 SMINFO_MASTER
and then
# ibswitches
Switch  : 0x002128469156a0a0 ports 36 "SUN DCS 36P QDR aeldb3sw-ibs0 10.146.28.50" enhanced port 0 lid 3 lmc 0
Switch  : 0x00212846914ba0a0 ports 36 "SUN DCS 36P QDR aeldb3sw-ibb0 10.146.28.52" enhanced port 0 lid 2 lmc 0
Switch  : 0x002128469157a0a0 ports 36 "SUN DCS 36P QDR aeldb3sw-iba0 10.146.28.51" enhanced port 0 lid 1 lmc 0

In the ibswitches output, 0x002128469156a0a0 (aeldb3sw-ibs0) is the switch where the SM is running: its GUID matches the "sm guid" reported by sminfo, which prints it without the leading zeros.
Alternatively, log in to one of the IB switches and simply run
# getmaster
Local SM not enabled
20140131 09:55:06 Master SubnetManager on sm lid 3 sm guid 0x2128469156a0a0 : SUN DCS 36P QDR aeldb3sw-ibs0 10.146.28.50
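
The GUID comparison is easy to get wrong by eye, because sminfo prints the GUID without leading zeros (0x2128469156a0a0) while ibswitches pads it (0x002128469156a0a0). A small sketch that normalizes both before comparing, assuming the line formats shown above. The function name is hypothetical.

```shell
# sm_switch (hypothetical helper): $1 is the sminfo output line; ibswitches
# output is read on stdin. Prints the ibswitches line whose GUID equals the
# "sm guid" once the 0x prefix and leading zeros are stripped from both.
sm_switch() {
    sm_guid=$(printf '%s\n' "$1" | sed -n 's/.*sm guid \(0x[0-9a-fA-F]*\).*/\1/p')
    awk -v sm="$sm_guid" '
        BEGIN { sm = tolower(sm); sub(/^0x0*/, "", sm) }
        $1 == "Switch" {
            guid = tolower($3); sub(/^0x0*/, "", guid)
            if (sm != "" && guid == sm) print
        }
    '
}

# Possible use on a DB node or IB switch:
# ibswitches | sm_switch "$(sminfo)"
```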

12). On a full or half rack, a spine switch is present, and that is where the SM should be running.
To identify the spine switch, run the command below. It prints the GUID of every switch that has no channel adapter (CA) attached, i.e. a switch connected only to other switches.
# ibnetdiscover -p | awk '/^SW +[0-9]+ +[0-9]+ +0x[0-9a-f]+ +[0-9]+x .DR - (SW|CA) .*/ { if (spine[$4] == "") spine[$4] = "yes"; if ($8 == "CA") spine[$4] = "no" } END { for (val in spine) if (spine[val] == "yes") print val }'

13). InfiniBand cables are not as robust as Ethernet (RJ45) cables. InfiniBand copper cables have strict specifications that define the minimum bend radius they can tolerate.