Wednesday, October 19, 2016

Oracle RAC interview questions and answers

RAC :
1-Difference between 11gr1 and 11gr2
Well, there is not much difference between 10g and 11gR1 RAC.
But there is a significant difference in 11gR2.
Prior to 11gR2 (i.e. in 10g and 11gR1 RAC), the following were managed by Oracle CRS:
    • Databases
    • Instances
    • Applications
    • Node Monitoring
    • Event Services
    • High Availability
From 11gR2 onwards it is a complete HA stack, managing and providing the following resources just like other cluster software such as VCS:
  • Databases
  • Instances
  • Applications
  • Cluster Management
  • Node Management
  • Event Services
  • High Availability
  • Network Management (provides DNS/GNS/MDNSD services on behalf of other traditional services), SCAN – Single Client Access Name, and HAIP
  • Storage Management (with help of ASM and other new ACFS filesystem)
  • Time synchronization (rather than depending upon traditional NTP)
  • Removed the OS-dependent hang checker etc.; this is managed with its own additional monitor processes
Oracle 9i RAC

  • OPS (Oracle Parallel Server) was renamed as RAC
  • CFS (Cluster File System) was supported
  • OCFS (Oracle Cluster File System) for Linux and Windows
  • watchdog timer replaced by hangcheck timer

Oracle 10g R1 RAC

  • Cluster Manager replaced by CRS
  • ASM introduced
  • Concept of Services expanded
  • ocrcheck introduced
  • ocrdump introduced
  • AWR was instance specific

Oracle 10g R2 RAC

  • CRS was renamed as Clusterware
  • asmcmd introduced
  • CLUVFY introduced
  • OCR and Voting disks can be mirrored
  • Can use FAN/FCF with TAF for OCI and ODP.NET
Oracle 11g R1 RAC

  • Oracle 11g RAC parallel upgrades - Oracle 11g has rolling upgrade features whereby a RAC database can be upgraded without any downtime.
  • Hot patching - Zero downtime patch application.
  • Oracle RAC load balancing advisor - Starting from 10g R2 we have the RAC load balancing advisor utility. In 11g the RAC load balancing advisor is only available with clients that use .NET, ODBC, or the Oracle Call Interface (OCI).
  • ADDM for RAC - Oracle has incorporated RAC into the Automatic Database Diagnostic Monitor, for cross-node advisories. Running the script addmrpt.sql gives a report for a single instance and will not report on all instances in RAC; this is known as instance ADDM. But using the new package DBMS_ADDM, we can generate a report for all instances of the RAC database; this is known as database ADDM.
  • Optimized RAC cache fusion protocols - moves on from the general cache fusion protocols in 10g to deal with specific scenarios where the protocols could be further optimized.
  • Oracle 11g RAC Grid provisioning - The Oracle grid control provisioning pack allows us to "blow-out" a RAC node without the time-consuming install, using a pre-installed "footprint".

Oracle 11g R2 RAC

  • We can store everything on ASM, including the OCR & voting files.
  • ASMCA
  • Single Client Access Name (SCAN) - eliminates the need to change the tns entry when nodes are added to or removed from the cluster. RAC instances register with the SCAN listeners as remote listeners. The SCAN is a fully qualified name. Oracle recommends assigning 3 addresses to the SCAN, which creates three SCAN listeners.
  • Clusterware components: crfmond, crflogd, GIPCD.
  • AWR is consolidated for the database.
  • 11g Release 2 Real Application Clusters (RAC) has server pooling technologies so it's easier to provision and manage database grids. This update is geared toward dynamically adjusting servers as corporations manage the ebb and flow between data requirements for data warehousing and applications.
  • By default, LOAD_BALANCE is ON.
  • GSD (Global Services Daemon), gsdctl introduced.
  • GPnP profile.
  • Cluster information in an XML profile.
  • Oracle RAC OneNode is a new option that makes it easier to consolidate databases that aren’t mission critical, but need redundancy.
  • raconeinit - to convert database to RacOneNode.
  • raconefix - to fix RacOneNode database in case of failure.
  • racone2rac - to convert RacOneNode back to RAC.
  • Oracle Restart - the feature of Oracle Grid Infrastructure's High Availability Services (HAS) to manage associated listeners, ASM instances and Oracle instances.
  • Oracle Omotion - Oracle 11g Release 2 RAC introduces a new feature called Oracle Omotion, an online migration utility. The Omotion utility relocates an instance from one node to another, for example when an instance failure happens.
  • Omotion utility uses Database Area Network (DAN) to move Oracle instances. Database Area Network (DAN) technology helps seamless database relocation without losing transactions.
  • Cluster Time Synchronization Service (CTSS) is a new feature in Oracle 11g R2 RAC, which is used to synchronize time across the nodes of the cluster. CTSS can act as a replacement for the NTP protocol.
  • Grid Naming Service (GNS) is a new service introduced in Oracle RAC 11g R2. With GNS, Oracle Clusterware (CRS) can manage Dynamic Host Configuration Protocol (DHCP) and DNS services for the dynamic node registration and configuration.
  • Cluster interconnect: Used for data blocks, locks, messages, and SCN numbers.
  • Oracle Local Registry (OLR) - new in Oracle 11gR2 as part of Oracle Clusterware. The OLR is the node's local repository, similar to the OCR (but local) and is managed by OHASD. It contains data of the local node only and is not shared among other nodes.
  • Multicasting is introduced in 11gR2 for private interconnect traffic.
  • I/O fencing prevents updates by failed instances by detecting failure and preventing split brain in the cluster. When a cluster node fails, the failed node needs to be fenced off from all the shared disk devices or diskgroups. This methodology is called I/O fencing, sometimes called disk fencing or failure fencing.
  • Re-bootless node fencing (restart)‏ - instead of fast re-booting the node, a graceful shutdown of the stack is attempted.
  • Clusterware log directories: acfs*
  • HAIP (IC VIP).
  • Redundant interconnects: NIC bonding, HAIP.
  • RAC background processes: DBRM – Database Resource Manager, PING – Response time agent.
  • Virtual Oracle 11g RAC cluster - Oracle 11g RAC supports virtualization.
  • Additionally, in Version 11g R2 with ASMCMD, you can manage an ASM instance and disk group operations such as:
      • Instance startup/shutdown
      • Set the disk discovery paths
      • Create a disk group
      • Drop a disk group

2-Cache fusion
-We know that every instance of a RAC database has its own local buffer cache, which performs the usual cache functionality for that instance. Now there could be occasions when a transaction/user on instance A needs to access a data block which is being owned/locked by another instance B. In such cases, instance A will request instance B for that data block and hence access the block through the interconnect mechanism. This concept is known as CACHE FUSION, where one instance can work on or access a data block in another instance's cache via the high-speed interconnect.
-Global Cache Service (GCS) is the heart of Cache Fusion concept. It is through GCS that data integrity in RAC is maintained when more than one instance need a particular data block.
-CACHE FUSION helps resolve all the possible contentions that could happen between instances in a RAC setup. There are 3 possible contentions in a RAC setup, which we are going to discuss in detail here with a mention of cache fusion wherever applicable.
1.      Read/Read contention:  Read-Read contention is usually not a problem at all, because the table/row will be in shared lock mode for both transactions and neither of them is trying to take an exclusive lock anyway.
2.      Read/Write contention:   This one is interesting. 

Here is more about this contention and how the concept of cache fusion helps resolve this contention
1.     A data block is in the buffer cache of instance A and is being updated. An exclusive lock has been acquired on it.
2.     After some time instance B is interested in reading that same data block and hence sends a request to GCS. So far so good: a Read/Write contention has been induced.
3.     GCS checks the availability of that data block and finds that instance A has acquired an exclusive lock. Hence, GCS asks instance A to release the block for instance B.
4.     Now there are two options: either instance A releases the lock on that block (if it no longer needs it) and lets instance B read the block from disk, OR instance A creates a CR image of the block in its own buffer cache and ships it to the requesting instance via the interconnect.
5.     The holding instance notifies the GCS accordingly (if the lock has been released or the CR image has been shipped)
6.     Creation of CR image, shipping it to the requesting instance and involvement of GCS is where CACHE FUSION comes into play
http://www.dba-oracle.com/images/OracleRACCacheFusion001.JPG
3.     Write/Write contention:  
This is the case where both instance A and instance B are trying to acquire an exclusive lock on the data block. A data block is in the buffer cache of instance A and is being updated; an exclusive lock has been acquired on it.
1.     Instance B sends the data block request to the GCS
2.     GCS checks the availability of that data block and finds that instance A has acquired an exclusive lock. Hence, GCS asks instance A to release the block for instance B
3.     There are 2 options - either instance A releases the lock on that block (if it no longer needs it) and lets instance B read the block from the disk OR instance A creates a PI image of the block in its own buffer cache, makes the redo entries and ships the block to the requesting instance via interconnect
4.     Holding instance also notifies the GCS that lock has been released and a PI has been preserved
5.     Instance B now acquires the exclusive lock on that block and continues with its normal processing. At this point GCS records that data block is now with instance B
6.     The whole mechanism of resolving this contention with the due involvement of GCS is attributed to the CACHE FUSION.
http://www.dba-oracle.com/images/OracleRACCacheFusion002.JPG
 -Global Enqueue Services (GES) is responsible for managing locks across the cluster. As a side note, GES was previously called the Distributed Lock Manager (DLM). When Oracle introduced their first cluster on DEC VAX systems, the clustered database used the VAX's cluster lock manager, but it was not designed for the demands of a transactional database and did not scale well. Oracle designed the DLM to have a scalable global lock manager. The term DLM still persists in many publications.
-For Oracle RAC, we need the buffer caches on all instances to appear to be global across the cluster.
The processes running to support an Oracle RAC instance include:

•         LMS:  This process is GCS. This process used to be called the Lock Manager Server.
•         LMON:  The Lock Monitor. This process is the GES master process.
•         LMD:  The Lock Manager Daemon. This process manages incoming lock requests.
•         LCK0: The instance enqueue process. This process manages lock requests for library cache objects.
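A quick way to get a feel for how much cache fusion traffic is flowing across the interconnect is to look at the global cache (gc) statistics in gv$sysstat. This is only a minimal monitoring sketch; statistic names can vary slightly between versions:
SQL> SELECT inst_id, name, value
       FROM gv$sysstat
      WHERE name IN ('gc cr blocks received','gc current blocks received',
                     'gc cr blocks served','gc current blocks served')
      ORDER BY inst_id, name;
High received/served counts together with high 'gc' wait events indicate blocks being shipped between instances rather than read from disk.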

3-Background process-
ACMS    — Atomic Control file to Memory Service (ACMS)
GTX0-j  — Global Transaction Process
LMON    — Global Enqueue Service Monitor
LMD     — Global Enqueue Service Daemon
LMS     — Global Cache Service Process
LCK0    — Instance Enqueue Process
DIAG    — Diagnosability Daemon
RMSn    — Oracle RAC Management Processes (RMSn)
RSMN    — Remote Slave Monitor
DBRM    — Database Resource Manager (from 11g R2)
PING    — Response Time Agent (from 11g R2)
ACMS:  Atomic Controlfile to Memory Service (ACMS)
In RAC, the per-instance ACMS process ensures that a distributed SGA memory update is either globally committed on success or globally aborted if a failure occurs.
GTX0-j: Global Transaction Process
The GTX0-j process provides transparent support for XA global transactions.
The database autotunes the number of these processes based on the workload of XA global transactions.
LMON: Global Enqueue Service Monitor
The LMON process monitors global enqueues and resources across the cluster and performs global enqueue recovery operations.
LMD: Global Enqueue Service Daemon
The LMD process manages incoming remote resource requests within each instance.
LMS: Global Cache Service Process
The LMS process maintains records of data file statuses and of each cached block by recording them in the Global Resource Directory (GRD). It also controls the flow of messages to remote instances, manages global data block access, and transmits block images between the buffer caches of different instances. This processing is part of the Cache Fusion feature.
LCK0: Instance Enqueue Process
The LCK0 process manages non-Cache Fusion resource requests such as library and row cache requests.
RMSn: Oracle RAC Management Processes (RMSn)
The RMSn processes perform manageability tasks for RAC. Tasks accomplished by an RMSn process include creation of resources related to Oracle RAC when new instances are added to the clusters.
RSMN: Remote Slave Monitor manages background slave process creation and communication on remote instances. These background slave processes perform tasks on behalf of a coordinating process running in another instance.
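To quickly check which of these RAC-specific background processes are actually running on a node, you can grep the OS process list (a simple illustrative check; the instance name RACDB1 used in the output is just an example):
$ ps -ef | egrep 'ora_(lmon|lmd|lms|lck|diag|acms|gtx|rms|rsm)' | grep -v grep
Each process appears with the instance name appended, for example ora_lmon_RACDB1, ora_lms0_RACDB1 and so on.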
11gR2 Clusterware background processes—
Process Name
Functionality
crsd
•The CRS daemon (crsd) manages cluster resources based on configuration information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes.
cssd
•Cluster Synchronization Service (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interfaces with your clusterware to manage node membership information. CSS has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides input/output fencing. This service formerly was provided by Oracle Process Monitor daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle Clusterware restarting the node.
diskmon
•Disk Monitor daemon (diskmon): Monitors and performs input/output fencing for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when ocssd is started.
evmd
•Event Manager (EVM): Is a background process that publishes Oracle Clusterware events
mdnsd
•Multicast domain name service (mDNS): Allows DNS requests. The mDNS process is a background process on Linux and UNIX, and a service on Windows.
gnsd
•Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and external DNS servers. The GNS process performs name resolution within the cluster.
ons
•Oracle Notification Service (ONS): Is a publish-and-subscribe service for communicating Fast Application Notification (FAN) events
oraagent
•oraagent: Extends clusterware to support Oracle-specific requirements and complex resources. It runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1).
orarootagent
•Oracle root agent (orarootagent): Is a specialized oraagent process that helps CRSD manage resources owned by root, such as the network, and the Grid virtual IP address
oclskd
•Cluster kill daemon (oclskd): Handles instance/node evictions requests that have been escalated to CSS
gipcd
•Grid IPC daemon (gipcd): Is a helper daemon for the communications infrastructure
ctssd
•Cluster Time Synchronization Service daemon (ctssd): Manages time synchronization between the nodes, rather than depending on NTP
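To verify that these clusterware daemons are up on a node, the usual checks are crsctl plus the OS process list. A minimal sketch:
$ crsctl check crs
$ crsctl check cluster -all
$ ps -ef | egrep 'crsd.bin|ocssd.bin|evmd.bin|octssd.bin|ohasd.bin|gipcd.bin|mdnsd.bin|gpnpd.bin' | grep -v grep
crsctl check crs reports the state of OHASD, CRS, CSS and EVM on the local node; crsctl check cluster -all does the CSS/CRS/EVM check across all nodes.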


4-Different parameters in init.ora files.
There are three types of initialization parameters in the Real Application Clusters environment:
* Parameters that must be Identical across all Instances
* Parameters that must be Unique across all Instances 
* Multi-Valued Parameters
Parameters that must be Identical across all Instances
Specify these parameter values in the SPFILE, or within the individual PFILEs for each instance. The following list contains the parameters that must be identical on every instance:
·         ACTIVE_INSTANCE_COUNT
·         ARCHIVE_LAG_TARGET
·         CLUSTER_DATABASE
·         CLUSTER_DATABASE_INSTANCES
·         COMPATIBLE
·         CONTROL_FILES
·         DB_BLOCK_SIZE
·         DB_DOMAIN
·         DB_FILES
·         DB_NAME
·         DB_RECOVERY_FILE_DEST
·         DB_RECOVERY_FILE_DEST_SIZE
·         DB_UNIQUE_NAME
·         INSTANCE_TYPE (RDBMS or ASM)
·         PARALLEL_MAX_SERVERS
·         REMOTE_LOGIN_PASSWORD_FILE
·         UNDO_MANAGEMENT
The setting for DML_LOCKS must be identical on every instance only if set to zero.
Unique Parameters (across instances)
Parameters in this category are unique for a particular instance. They specify the identity of the individual instance and give independent characteristics to it:
·         instance_name :- Unique name of the instance.
·         instance_number:- unique number that maps the instance
·         rollback_segments:- one or more rollback segments by name to this instance.
·         thread :- number of the redo thread to be used by an instance.
·         undo_tablespace:- undo tablespace to be used when an instance starts up.
·         Note :- If your SPFILE contains instance-specific settings, then these settings take precedence over settings made with ALTER SYSTEM commands.
·         SPFILE parameter values and RAC
·         You can change parameter settings using the ALTER SYSTEM SET command from any instance:
ALTER SYSTEM SET <parameter>=<value> SCOPE=MEMORY SID='<instance_name>';
·         To remove an entry from your SPFILE:
ALTER SYSTEM RESET <parameter> SCOPE=SPFILE SID='<instance_name>';
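As a concrete (illustrative) example, assuming an instance named RACDB2 and an undo tablespace UNDOTBS2 that exist in your environment, an instance-specific parameter would be set and removed like this:
SQL> ALTER SYSTEM SET undo_tablespace='UNDOTBS2' SCOPE=SPFILE SID='RACDB2';
SQL> ALTER SYSTEM RESET undo_tablespace SCOPE=SPFILE SID='RACDB2';
Using SID='*' instead applies the setting to all instances.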

5-OCR and voting disk and their backup recovery
OCR: It is created at the time of Grid Infrastructure installation. It stores the information needed to manage Oracle Clusterware and its components such as the RAC database, listener, VIP, SCAN IP & services.
A minimum of 1 and a maximum of 5 copies of the OCR are possible.
Voting Disk: It manages information about node membership. Each voting disk must be accessible by all nodes in the cluster. If any node does not pass its heartbeat across to the other nodes or to the voting disk, then that node will be evicted via the voting disk.
A minimum of 1 and a maximum of 15 copies of the voting disk are possible.
New Facts:
  • We can store OCR And Voting disk on ASM or certified cluster file system.
  • We can dynamically add or replace voting disk & OCR.
  • Backup of the voting disk using the "dd" command is not supported.
  • The voting disk and OCR can be kept in the same diskgroup or in different diskgroups.
  • Voting disk and OCR automatic backups are kept together in a single file.
  • Automatic backup of the voting disk and OCR happens every four hours, at the end of each day, and at the end of each week.
  • You must have a root or sudo privileged account to manage them.
To find current location of Voting disk:
[oracle@rsingle ~]$ crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
—  —–    —————–                ——— ———
1. ONLINE   6a60a2c3510c4fbfbff62dcdc279b247 (ORCL:DATA1) [DATA]
Relocate or multiplex the voting disk to another diskgroup (with normal redundancy):
[root@rsingle ~]# /u01/app/11.2.0/grid/bin/crsctl replace votedisk +CRS
Successful addition of voting disk afb77b2693a24f1ebfe876784103e82a.
Successful addition of voting disk 3e2542c5b1154ffdbfc8b6dea7dce390.
Successful addition of voting disk 8e0f3c5921cc4f93bf223de1465d83cc.
Successful deletion of voting disk 6a60a2c3510c4fbfbff62dcdc279b247.
Successfully replaced voting disk group with +CRS.
CRS-4266: Voting file(s) successfully replaced
New location of Voting disk:
[root@rsingle ~]# /u01/app/11.2.0/grid/bin/crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
—  —–    —————–                ——— ———
1. ONLINE   afb77b2693a24f1ebfe876784103e82a (ORCL:DATA2) [CRS]
2. ONLINE   3e2542c5b1154ffdbfc8b6dea7dce390 (ORCL:DATA3) [CRS]
3. ONLINE   8e0f3c5921cc4f93bf223de1465d83cc (ORCL:DATA4) [CRS]
Located 3 voting disk(s).
To find the location of the current OCR:
[oracle@rsingle ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       2964
Available space (kbytes) :     259156
ID                       : 1390115973
Device/File Name         :      +DATA
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check bypassed due to non-privileged user
Create mirror copy of OCR online
[oracle@rsingle ~]$ sudo ocrconfig -add +CRS
Password:
Check location of OCR after mirror copy creation:
[root@rsingle ~]# /u01/app/11.2.0/grid/bin/ocrcheck
Status of Oracle Cluster Registry is as follows :
Version                  :          3
Total space (kbytes)     :     262120
Used space (kbytes)      :       2964
Available space (kbytes) :     259156
ID                       : 1390115973
Device/File Name         :      +DATA
Device/File integrity check succeeded
Device/File Name         :       +CRS
Device/File integrity check succeeded
Device/File not configured
Device/File not configured
Device/File not configured
Cluster registry integrity check succeeded
Logical corruption check succeeded
Another file to find location of OCR:
[root@rsingle ~]# cat /etc/oracle/ocr.loc
#Device/file  getting replaced by device +CRS
ocrconfig_loc=+DATA
ocrmirrorconfig_loc=+CRS
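For the backup/recovery part of the question, the OCR is backed up automatically by CRSD, and you can also take manual backups or logical exports. A minimal sketch using ocrconfig (run as root; the export path is just an example):
[root@rsingle ~]# /u01/app/11.2.0/grid/bin/ocrconfig -showbackup
[root@rsingle ~]# /u01/app/11.2.0/grid/bin/ocrconfig -manualbackup
[root@rsingle ~]# /u01/app/11.2.0/grid/bin/ocrconfig -export /backup/ocr_before_change.dmp
[root@rsingle ~]# /u01/app/11.2.0/grid/bin/ocrconfig -restore <backup_file_from_showbackup>
Because the voting disks are backed up automatically along with the OCR in 11gR2, restoring the OCR (or re-running crsctl replace votedisk) is the supported recovery path rather than dd.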

6-Node eviction/Split brain/Fencing/STONITH(Shoot the node in the head):
-Split-brain scenarios in a RAC cluster can be defined as functional overlapping of separate instances, each in its own world without any guaranteed consistency or coordination between them.
-Basically, communication is lost among the nodes of a cluster, with them being evicted/kicked out of the cluster, resulting in node/instance reboots. This can happen due to a variety of reasons ranging from hardware failure on the private cluster interconnect to nodes becoming unresponsive because of CPU/memory starvation issues. Measures that help prevent this include:
• Establish redundancy at the networking tier: redundant switches/Network Interface Cards (NICs) trunked/bonded/teamed together.
• Allocate enough CPU for the application workloads and establish limits for CPU consumption. Basically, CPU exhaustion can lead to a point where the node becomes unresponsive to the other nodes of the cluster, resulting in a split-brain scenario leading in turn to node evictions.
• Allocate enough memory for the various applications and establish limits for memory consumption. Automatic Memory Management comes in handy, as it puts limits on both SGA and Program Global Area (PGA) areas.
• Employ/deploy DBRM along with IORM (if you are operating RAC clusters on Exadata).
• Set up and configure instance caging (CPU_COUNT parameter) for multi-tenant database RAC nodes; monitor and watch out for RESMGR:CPU quantum waits related to instance caging, especially when instances have been overconsolidated.
• Ensure that no antivirus software is active/present on any of the RAC nodes of the cluster.
• Patch to the latest versions of the Oracle Database software. Many bugs associated with various versions are known to cause split-brain scenarios to occur. Staying current with the latest CPU/PSUs is known to mitigate stability/performance issues.
• Avoid allocating/configuring an excessive number of LMS processes. LMS is a CPU-intensive process and, if not configured properly, can cause CPU starvation to occur very rapidly, ultimately resulting in node evictions.
• Partition large objects to reduce I/O and improve overall performance.
• Parallelization and AUTO DOP: set up/configure/tune carefully. Turning on Automatic Degree of Parallelism (AUTO DOP - PARALLEL_DEGREE_POLICY=AUTO) can have negative consequences on RAC performance.
NODE EVICTION : -
-A node eviction is the mechanism/process (piece of code) designed within the Oracle Clusterware to ensure cluster consistency and maintain overall cluster health by removing the node(s) that either suffers critical issues or doesn’t respond to other nodes’ requests in the cluster in a timely manner.
-For example, when a node in the cluster is hung or suffering critical problems, such as network or disk latency that prevents it from maintaining the heartbeat rate within the internal timeout value, or if the cluster stack or clusterware is unhealthy, the node will leave the cluster and do a fast self-reboot to ensure overall cluster health.
-When a node doesn't respond to another node's request in a timely manner, the node will receive a poison packet through disk/network with the instruction to leave the cluster by killing itself. When the problematic node reads the poison (kill) packet, it will evict itself and leave the cluster.
Node Evictions—Top/Common Causes and Factors
The following are only a few of the most common symptoms/factors that lead to node evictions, cluster stack sudden death, reboots, and status going unhealthy:
• Network disruption, latency, or missing network heartbeats
• Delayed or missing disk heartbeats (the CSS timeouts involved can be checked as shown after this list)
• Corrupted network packets on the network may also cause CSS reboots on certain platforms
• Slow interconnect or failures
• Known Oracle Clusterware bugs
• Unable to read/write or access the majority of the voting disks (files)
• Lack of sufficient resource (CPU/memory starvation) availability on the node for OS scheduling by key CRS daemon processes
• Manual termination of the critical cluster stack daemon background processes (css, cssdagent, cssdmonitor)
• No space left on the device for the GI or /var file system
• Sudden death or hang of the CSSD process
• ORAAGENT/ORAROOTAGENT excessive resource (CPU, memory, swap) consumption resulting in node eviction on specific OS platforms
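When investigating evictions due to missed heartbeats, it helps to know the CSS timeout settings in effect. These can be read with crsctl (as a reference point, the commonly documented 11gR2 defaults are misscount 30 seconds and disktimeout 200 seconds, but check your own cluster):
$ crsctl get css misscount
$ crsctl get css disktimeout
$ crsctl get css reboottime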

7-TAF ,FAN ,ONS
1) TAF with tnsnames
TAF (Transparent Application Failover) is a feature of Oracle Net Services for OCI clients which will move a session to a backup connection if the session fails. With Oracle 10g Release 2, you can define the TAF policy on the service using the dbms_service package. It only works with OCI clients. It will only move the session and, if the parameter is set, it will fail over the SELECT statement. For insert, update or delete transactions, the application must be TAF aware and roll back the transaction. Yes, you should enable FCF on your OCI client when you use TAF; it will make the failover faster.
Note: TAF will not work with JDBC thin.
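A minimal tnsnames.ora sketch of a TAF-enabled alias, reusing the SCAN name used later in this post (the service name MYSRV is just an example and must exist as a cluster service):
MYSRV_TAF =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = kracnode-scan.india.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = MYSRV)
      (FAILOVER_MODE = (TYPE = SELECT)(METHOD = BASIC)(RETRIES = 180)(DELAY = 5))
    )
  )
TYPE=SELECT re-fetches the rows of an in-flight query after failover; METHOD=BASIC creates the backup connection only at failover time.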

2) FAN with tnsnames with aq notifications true
FAN is a feature of Oracle RAC which stands for Fast Application Notification. This allows the database to notify the client of any change (Node up/down, instance up/down, database up/down). For integrated clients, inflight transactions are interrupted and an error message is returned. Inactive connections are terminated. 
FCF is the client feature for Oracle Clients that have integrated with FAN to provide fast failover for connections. Oracle JDBC Implicit Connection Cache, Oracle Data Provider for .NET (ODP.NET) and Oracle Call Interface are all integrated clients which provide the Fast Connection Failover feature.
3) 
FCF, along with FAN when using connection pools
FCF is a feature of Oracle clients that are integrated to receive FAN events, abort in-flight transactions and clean up connections when a down event is received, as well as create new connections when an up event is received. Tomcat or JBoss can take advantage of FCF if the Oracle connection pool is used underneath. This can be either UCP (Universal Connection Pool for Java) or ICC (JDBC Implicit Connection Cache). UCP is recommended as ICC will be deprecated in a future release.
4) ONS, with clusterware either FAN/FCF
ONS is part of the clusterware and is used to propagate messages both between nodes and to application-tiers
ONS is the foundation for FAN upon which is built FCF.
RAC uses FAN to publish configuration changes and LBA (load balancing advisory) events. Applications can react to those published events in two ways:
- by using ONS api (you need to program it)
- by using FCF (automatic by using JDBC implicit connection cache on the application server)
You can also respond to FAN events by using server-side callouts, but these run on the server side (as their name suggests).
Relationship between FAN/FCF/ONS
ONS –> FAN –> FCF
ONS -> send/receive messages on local and remote nodes.
FAN -> uses ONS to notify other processes about changes in configuration of service level
FCF -> uses FAN information, working with connection pools (Java and others).
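In practice, the server-side piece of this is a cluster service created with srvctl; the TAF/FAN attributes live on the service. A hedged sketch with made-up database, service and instance names (the flags shown are the 11gR2 ones: -P TAF policy, -e failover type, -m failover method, -w failover delay, -z failover retries, -q AQ/FAN notifications):
$ srvctl add service -d RACDB -s MYSRV -r RACDB1,RACDB2 -P BASIC -e SELECT -m BASIC -w 5 -z 180 -q TRUE
$ srvctl start service -d RACDB -s MYSRV
$ srvctl status service -d RACDB -s MYSRV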

8-Oracle RAC startup sequence

 
While reading the book "http://www.amazon.com/Pro-Oracle-Database-11g-Linux/dp/1430229586" I found an excellent diagram representing the startup sequence in a very understandable manner (at least for me).

[Diagram: Oracle Clusterware 11gR2 startup sequence, bottom to top: OHASD, its agents, CSSD, CRSD and the CRSD agents]
Without much hassle or discussion, taking a bottom-to-top approach, here is how I understand it now:
OHASD Phase:-
  • OHASD (Oracle High Availability Services Daemon) starts first, and it will start
OHASD Agent Phase:-
  • OHASD Agent starts and in turn this will start
gipcd
Grid interprocess communication daemon, used for monitoring cluster interconnect
mdnsd
Multicast DNS service. It resolves DNS requests on behalf of GNS.
gns
The Grid Naming Service (GNS), a gateway between DNS and mdnsd, resolves DNS requests
gpnpd
Grid Plug and Play daemon. Basically a profile, similar to the OCR contents, stored in XML format in $GI_HOME/gpnp/profiles/<peer> etc. This is also used by OCSSD to read the ASM disk locations and start up without ASM having to be up; moreover, it provides the plug-and-play profile which can be distributed across the nodes of the cluster.
evmd/
evmlogger
The EVM service is provided by the evmd daemon, which publishes information about events happening in the cluster: stop node, start node, start instance, etc.
  • cssdagent (cluster synchronization service agent), in turn starts
ocssd
Cluster synchronization services daemon, which manages node membership in the cluster.
If the cssdagent finds that ocssd is down, it will reboot the node to protect data integrity.

  • cssdmonitor (cluster synchronization service monitor), replaces oprocd and provides I/O fencing

  • OHASD orarootagent starts and in turn starts
crsd.bin
Cluster ready services, which manages high availability of cluster resources , like stopping , starting, failing over etc.
diskmon.bin
disk monitor (diskdaemon monitor) provides I/O fencing for exadata storage
octssd.bin
Cluster time synchronization service; provides Network Time Protocol-like services but manages time on its own rather than depending on the OS.
CRSD Agent Phase:- crsd.bin starts two more agents
crsd orarootagent(Oracle root agent) starts and in turn this will start
gns
The Grid Naming Service, which performs name resolution within the cluster (when GNS is configured)
gns vip
The virtual IP address on which the GNS daemon listens
Network
Monitor the additional networks to provide HAIP to cluster interconnects
Scan vip
Monitors the SCAN VIP; if it is found failed or unreachable, it is failed over to another node
Node vip
Monitors the node VIP; if it is found failed or unreachable, it is failed over to another node
crsd oraagent (Oracle agent) starts and in turn it will start the following (the same functionality was managed in 11gR1 and 10g by the racgmain and racgimon background processes, which is now handled by the crsd Oracle agent itself):
ASM & disk groups
Start & monitor local asm instance
ONS
FAN feature, provides notification to interested client
eONS
FAN feature, provides notification to interested client
SCAN Listener
Start & Monitor scan listener
Node Listener
Start & monitor the node listener (rdbms?)
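You can cross-check this startup sequence on a running cluster by listing the lower-stack (ohasd-managed) resources and the crsd-managed resources separately; a simple sketch:
$ crsctl stat res -t -init     # ohasd stack: ora.cssd, ora.ctssd, ora.asm, ora.crsd, ora.evmd, ...
$ crsctl stat res -t           # crsd stack: database, listeners, SCAN VIPs, node VIPs, services, ...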
Hope this helps, comments & suggestions are most welcome.

9-Installation of RAC database
10-Preinstallation check using runcluvfy statement
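A typical pre-install verification run looks like the following (the node names rac1,rac2 are placeholders; run it as the grid software owner from the unzipped installation media):
$ ./runcluvfy.sh stage -pre crsinst -n rac1,rac2 -fixup -verbose
$ ./runcluvfy.sh comp nodecon -n rac1,rac2 -verbose
After the Grid Infrastructure install, the equivalent post-check is cluvfy stage -post crsinst -n rac1,rac2 -verbose.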
11-RAC patching steps
Applying PSU Patch 11.2.0.1.2 To A Two Node RAC
Posted by Srikrishna Murthy Annam on May 5, 2011
NOTE : This article is aimed at showing the issues and resolution while applying the PSU patch 11.2.0.1.2 (9655006) to a specific environment and not generalized. Please refer the readme of the PSU patch for detailed procedure to apply PSU patch.
Video demos are also uploaded to show you the errors and the successful installation of the PSU to a 2-node RAC.
1. Record Pre Patch Information.
1.     Login to each node in RAC as grid user and execute the following command.
$GRID_ORACLE_HOME/OPatch/opatch lsinventory
$GRID_ORACLE_HOME/OPatch/opatch lsinventory -bugs_fixed | grep -i ‘GI PSU’
2.     Login to each node in RAC as oracle user and execute the following command.
$ORACLE_HOME/OPatch/opatch lsinventory
$ORACLE_HOME/OPatch/opatch lsinventory -bugs_fixed | grep -i ‘DATABASE PSU’
3.     Connect to each instance and record registry information.
SQL> select comp_name,version,status from dba_registry;
2. OPatch Utility Information.
$ORACLE_HOME/OPatch/opatch version -h /u01/home/oracle/product/11.2.0/db_1
$GRID_ORACLE_HOME/OPatch/opatch version -h /u01/home/11.2.0/grid
3. OCM Configuration.
Create ocm response file using the following command and provide appropriate values for the prompts.
$GRID_ORACLE_HOME/OPatch/ocm/bin/emocmrsp
Verify the created file using,
$GRID_ORACLE_HOME/OPatch/ocm/bin/emocmrsp –verbose ocm.rsp
NOTE: The OPatch utility will prompt for your OCM (Oracle Configuration Manager) response file when it is run; without it we cannot proceed further.
4. Validation of Oracle Inventory.
$GRID_ORACLE_HOME/OPatch/opatch lsinventory -detail -oh /u01/home/11.2.0/grid
$ORACLE_HOME/OPatch/opatch lsinventory -detail –oh /u01/home/oracle/product/11.2.0/db_1
5. One-off Patch Conflict Detection and Resolution.
NA
6. Download and Unzip the PSU Patch 9655006
$cd /u01/home/oracle/product/11.2.0/db_1/patches/psupatch
$unzip p9655006_11201_Linux.zip
$chmod -R 777 *
7. Patching GI Home
NOTE: If the GI home is shared, then make sure to shut down the GI stack on all remote nodes. Keep the GI stack up and running on the local node.
NOTE: If the GI home is not shared, then make sure the GI stack is running on all nodes in the cluster.
Our Grid home is not shared, so don't shut down any services.
$su – ( Login to root user )
#/u01/home/11.2.0/grid/OPatch/opatch auto /u01/home/oracle/product/11.2.0/db_1/patches/psupatch -oh /u01/home/11.2.0/grid
Monitor the logfile created in $GRID_ORACLE_HOME/cfgtoollogs/
Execute the above opatch command on each RAC node as root user.
** Please refer to the Issues & Resolutions section in the same document for any issues.
8. Patching RAC Database Homes
All Oracle processes and applications (such as emconsole and emagent) that are running from the database home and that are not managed by clusterware should be stopped manually before you apply the patch using the opatch auto command.
$su – ( Login to root user )
# /u01/home/oracle/product/11.2.0/db_1/OPatch/opatch auto /u01/home/oracle/product/11.2.0/db_1/patches/psupatch -oh /u01/home/oracle/product/11.2.0/db_1
Monitor the logfile created in $ORACLE_HOME/cfgtoollogs/
Execute the above opatch command on each RAC node as root user.
** Please refer to the Issues & Resolutions section in the same document for any issues.
9. Loading Modified SQL Files into the Database.
For each database instance running on the Oracle home being patched, connect to the database using SQL*Plus. Connect as SYSDBA and run the catbundle.sql script as follows:
cd $ORACLE_HOME/rdbms/admin
sqlplus /nolog
SQL> CONNECT / AS SYSDBA
SQL> @catbundle.sql psu apply
SQL> QUIT
Check the log files in $ORACLE_HOME/cfgtoollogs/catbundle for any errors
10. Patch Successful Verification Steps.
1.     Login to each node in RAC as grid user and execute the following command.
$GRID_ORACLE_HOME/OPatch/opatch lsinventory -bugs_fixed | grep -i ‘GI PSU’
2.     Login to each node in RAC as oracle user and execute the following command.
$ORACLE_HOME/OPatch/opatch lsinventory -bugs_fixed | grep -i ‘DATABASE PSU’
3.     Connect to each instance and record registry information.
SQL> select comp_name,version,status from dba_registry;
SQL> select * from dba_registry_history;
11. Issues & Resolutions.
Issue 1:
When applying the patch, you may get the following error
The opatch minimum version check for patch /u01/home/oracle/product/11.2.0/db_1/patches/9655006/custom failed for /u01/home/11.2.0/grid
The opatch minimum version check for patch /u01/home/oracle/product/11.2.0/db_1/patches/9655006/etc failed for /u01/home/11.2.0/grid
The opatch minimum version check for patch /u01/home/oracle/product/11.2.0/db_1/patches/9655006/files failed for /u01/home/11.2.0/grid
Opatch version check failed for oracle home /u01/home/11.2.0/grid
Opatch version check failed
update the opatch version for the failed homes and retry
Solution:
Ref Note : 1308858.1
We need to provide the Patch unzipped base directory, not the directory including patch number
Ex : /u01/home/oracle/product/11.2.0/db_1/patches/psupatch/9655006 ( Wrong )
/u01/home/oracle/product/11.2.0/db_1/patches/psupatch ( correct )
Issue 2
Patch may exit with the following error messages
Unable to determine if /u01/home/11.2.0/grid is shared oracle home
Enter ‘yes’ if this is not a shared home or if the prerequiste actions are performed to patch this shared home (yes/no):yes
You must kill crs processes or reboot the system to properly
cleanup the processes started by Oracle clusterware
The Oracle Clusterware stack failed to stop.
You should stop the stack with ‘crsctl stop crs’ and rerun the command
The opatch Applicable check failed for /u01/home/11.2.0/grid. The patch is not applicable for /u01/home/11.2.0/grid
patch ././9655006 apply failed for home /u01/home/11.2.0/grid
Solution :
This error may be specific to this environment, though I want to mention it here. The error is due to some cluster resources still referring to a 10g database installed earlier. Delete those resources from the clusterware:
# crsctl delete resource <resource_name> -f
Issue 3 :
PSU patch for GRID home on node 2 failed with the following error
Unable to determine if /u01/home/11.2.0/grid is shared oracle home
Enter ‘yes’ if this is not a shared home or if the prerequiste actions are performed to patch this shared home (yes/no):yes
Successfully unlock /u01/home/11.2.0/grid
patch ././9655006 apply failed for home /u01/home/11.2.0/grid
Verified the detailed log file located in /u01/home/11.2.0/grid/cfgtoollogs/ and found a permission issue on some files.
The following actions have failed:
Copy failed from ‘/u01/home/oracle/product/11.2.0/db_1/patches/psupatch/9655006/files/bin/crsctl.bin’ to ‘/u01/home/11.2.0/grid/bin/crsctl.bin’…
Copy failed from ‘/u01/home/oracle/product/11.2.0/db_1/patches/psupatch/9655006/files/bin/oifcfg.bin’ to ‘/u01/home/11.2.0/grid/bin/oifcfg.bin’…
Solution:
Give 777 permissions to the files crsctl.bin and oifcfg.bin.
Issue 4 :
PSU Patch for RDBMS Home on node 1 failed with the following error
Unable to determine if /u01/home/oracle/product/11.2.0/db_1 is shared oracle home
Enter ‘yes’ if this is not a shared home or if the prerequiste actions are performed to patch this shared home (yes/no):yes
patch ././9655006/custom/server/9655006 apply failed for home /u01/home/oracle/product/11.2.0/db_1
Verified the detailed log file located in /u01/home/oracle/product/11.2.0/db_1/cfgtoollogs and found a platform mismatch issue.
Running prerequisite checks…
Prerequisite check “CheckPatchApplicableOnCurrentPlatform” failed.
The details are:
Patch ( 9655006 ) is not applicable on current platform.
Platform ID needed is : 46
Platform IDs supported by patch are: 226
UtilSession failed: Prerequisite check “CheckPatchApplicableOnCurrentPlatform” failed.
OPatch failed with error code 73
Solution:
export OPATCH_PLATFORM_ID=226 and execute the psu patch command again.
Issue 5:
While applying Grid Infrastructure PSU (patch 9343627) to $GRID_HOME, opatch prerequisite check CheckSystemSpace failed:
Running prerequisite checks…
 Prerequisite check “CheckSystemSpace” failed.
 The details are:
 Required amount of space(4373569440) is not available.
 UtilSession failed: Prerequisite check “CheckSystemSpace” failed.

 OPatch failed with error code 73
Solution:
Follow the note ID 1088455.1 and make sure you have enough space on disk in GRID HOME mount point.

12-RAC upgrade—
E:\DBA\DBA documents\Database upgrade
13-Server kernel parameter shared memory
E:\DBA\RAC\Semaphore_sharedmemory
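For reference, the shared memory and semaphore settings usually land in /etc/sysctl.conf on Linux. The values below are the commonly documented 11gR2 starting points (kernel.shmmax/kernel.shmall depend on physical RAM, so treat this as an illustrative sketch, not a definitive configuration):
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 6815744
fs.aio-max-nr = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default = 262144
net.core.rmem_max = 4194304
net.core.wmem_default = 262144
net.core.wmem_max = 1048576
Apply the changes with sysctl -p and verify with sysctl -a | grep -E 'shm|sem'.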
14-SCAN IP,VIP,PUBLIC IP and private IP
VIPs
-A VIP is an IP address assigned to a cluster node, monitored by Oracle Clusterware, which can be failed over to a surviving node in case of node failure.
-In a RAC cluster, a VIP is assigned to a specific node and monitored by the Clusterware, and each node can be configured with one or more VIPs. Suppose that rac1.example.com has VIP 10.21.8.100. If the node rac1.example.com fails, then the Clusterware running on the other surviving cluster nodes will immediately relocate VIP 10.21.8.100 to a surviving node (say rac2.example.com).
-Since the IP address remains available on the network, connection requests to the IP address 10.21.8.100 will immediately receive a CONN_RESET message (think of it as a "no listener" error message), and the connection attempt will continue with the next address in the address list. There is no wait for a TCP timeout, as the IP address responds to the connection request immediately.
-VIP eliminates unnecessary waits for TCP timeouts. VIPs are also monitored by Clusterware, providing additional high availability for incoming connections. Clusterware uses ioctl functions (on Unix and Linux) to check the availability of the network interfaces assigned for VIPs.
-As a secondary method, the gateway server is also pinged to verify network interface availability. Any errors in the ioctl command will immediately relocate the IP address to another available interface or relocate the VIP resource to another node.
-Listeners are not started after a VIP fails over to another node in RAC. The network-layer CONN_RESET error message is the key to eliminating TCP timeouts.
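The VIP resources themselves are queried and managed with srvctl; a quick sketch (the node name rac1 is a placeholder):
$ srvctl status nodeapps
$ srvctl status vip -n rac1
$ srvctl config vip -n rac1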
15-Describe SCAN working behavior-
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access Oracle Databases running in a cluster. The benefit is that the client's connect information does not need to change if you add or remove nodes in the cluster. Having a single name to access the cluster allows clients to use the EZConnect client and the simple JDBC thin URL to access any database running in the cluster, independently of which server(s) in the cluster the database is active on. SCAN provides load balancing and failover for client connections to the database. The SCAN works as a cluster alias for databases in the cluster.

SCAN Concepts
  • Single client access name (SCAN) is the virtual hostname to provide for all clients connecting to the cluster (as opposed to the vip hostnames in 10g and 11gR1).  
  • SCAN is a domain name registered to at least one and up to three IP addresses, either in the domain name service (DNS) or the Grid Naming Service (GNS).
  • By default, the name used as the SCAN is also the name of the cluster and must be globally unique throughout your enterprise. The default value for the SCAN is based on the local node name. SCAN name must be at least one character long and no more than 15 characters in length, must be alphanumeric - cannot begin with a numeral and may contain hyphens (-). If you require a SCAN that is longer than 15 characters, then select an Advanced installation.
  • For installation to succeed, the SCAN must resolve to at least one address.
  • SCAN VIP addresses must be on the same subnet as virtual IP addresses and public IP addresses.
  • Oracle strongly recommends that you do not configure SCAN VIP addresses in the hosts file. But if you use the hosts file to resolve SCAN name, you can have only one SCAN IP address.
  • If hosts file is used to resolve SCAN hostname, you will receive Cluster Verification Utility failure at end of installation (see Note: 887471.1 for more details)
  • For high availability and scalability, Oracle recommends that you configure the SCAN to use DNS Round Robin resolution to three addresses.
  • Because the SCAN is associated with the cluster as a whole, rather than to a particular node, the SCAN makes it possible to add or remove nodes from the cluster without needing to reconfigure clients. It also adds location independence for the databases, so that client configuration does not have to depend on which nodes are running a particular database.
  • Clients can continue to access the cluster in the same way as with previous releases, but Oracle recommends that clients accessing the cluster use the SCAN. Clients using the SCAN can also access the cluster using EZCONNECT.
  • Grid Infrastructure will start local listener LISTENER on all nodes to listen on local VIP, and SCAN listener LISTENER_SCAN1 (up to three cluster wide) to listen on SCAN VIP(s); 11gR2 database by default will set local_listener to local LISTENER, and remote_listener to SCAN listener.
  • SCAN listener will be running off GRID_HOME, and by default, in 11gR2 local listener will be running off GRID_HOME as well.
There are 2 options for defining the SCAN:
1.      Define the SCAN in your corporate DNS (Domain Name Service)
2.      Use the Grid Naming Service (GNS)

Define the SCAN in your corporate DNS (Domain Name Service)

If you choose Option 1, you must ask your network administrator to create a single name that resolves to 3 IP addresses using a round-robin algorithm. Three IP addresses are recommended considering load balancing and high availability requirements regardless of the number of servers in the cluster. The IP addresses must be on the same subnet as your public network in the cluster. The name must be 15 characters or less in length, not including the domain, and must be resolvable without the domain suffix (for example: "krac-scan" must be resolvable as opposed to "krac-scan.india.com"). The IPs must not be assigned to a network interface (on the cluster), since Oracle Clusterware will take care of it.

kracnode-scan     192.168.1.72
                  192.168.1.70
                  192.168.1.71

You can check the SCAN configuration in DNS using "nslookup". If your DNS is set up to provide round-robin access to the IPs resolved by the SCAN entry, then run the "nslookup" command at least twice to see the round-robin algorithm work. The result should be that each time, "nslookup" returns the set of 3 IPs in a different order.

First nslookup

[root@kracnode2 ~]# nslookup kracnode-scan
Server:         192.168.1.100
Address:        192.168.1.100#53

Name:   kracnode-scan.india.com
Address: 192.168.1.72
Name:   kracnode-scan.india.com
Address: 192.168.1.70
Name:   kracnode-scan.india.com
Address: 192.168.1.71

Second nslookup

[root@kracnode2 ~]# nslookup kracnode-scan
Server:         192.168.1.100
Address:        192.168.1.100#53

Name:   kracnode-scan.india.com
Address: 192.168.1.70
Name:   kracnode-scan.india.com
Address: 192.168.1.71
Name:   kracnode-scan.india.com
Address: 192.168.1.72

Note: If your DNS server does not return a set of 3 IPs as shown above or does not round-robin, ask your network administrator to enable such a setup. DNS using a round-robin algorithm on its own does not ensure failover of connections; however, the Oracle client typically handles this. It is therefore recommended that the minimum version of the client used is the Oracle Database 11g Release 2 client.
How can we configure the SCAN and SCAN listener?

During Typical installation, you are prompted to confirm the default Single Client Access Name (SCAN), which is used to connect to databases within the cluster irrespective of which nodes they are running on.  If you change the SCAN from the default, then the name that you use must be globally unique throughout your enterprise.

The root script (root.sh or rootupgrade.sh) will create as many SCAN VIP resources (ora.scan1.vip, ...) and corresponding SCAN listener resources (ora.LISTENER_SCAN1.lsnr, ...) as there are IP addresses the SCAN name resolves to; i.e. if the SCAN name resolves to two IP addresses, it will create two SCAN VIP resources and two corresponding SCAN listener resources.

A SCAN VIP and the corresponding SCAN listener work as a pair: when a SCAN VIP fails over to another node, the corresponding SCAN listener also fails over to the same node.
When a SCAN VIP failover happens, it will always select the node with the fewest running SCAN VIPs; i.e., if SCAN VIPs run on node1, node2 and node3 of a 4-node cluster and node3 goes down, the SCAN VIP and corresponding SCAN listener will fail over to node4, as the other two nodes already have one SCAN VIP running each.

Also we can use 'srvctl' to add/modify the scan resource and the listeners.  Please refer to "Real Application Clusters Admin and Deployment Guide" or Note 1053147.1 for more information.
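For reference, the srvctl commands involved look like this (run the modify commands as root; the SCAN name and port are placeholders):
$ srvctl config scan
$ srvctl config scan_listener
$ srvctl status scan
$ srvctl status scan_listener
# srvctl modify scan -n kracnode-scan
# srvctl modify scan_listener -p 1521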

Do we still need to configure local listeners on each node?

Yes, you would need to configure independent local listeners for each node.  SCAN listeners are not replacements for the node listeners.

A new set of cluster processes called SCAN listeners will run on three nodes in a cluster (or on all nodes if there are fewer than 3). If you have more than three nodes, regardless of the number of nodes you have, there will be at most three SCAN listeners. The database registers with the SCAN listener through the remote_listener parameter in the init.ora/spfile. If any of these clustered processes fail, they are automatically restarted on a new node.

How does SCAN work?

"When a client submits a request, the SCAN listener listening on a SCAN IP address and the SCAN port is contacted on the client's behalf. Because all services on the cluster are registered with the SCAN listener, the SCAN listener replies with the address of the local listener on the least-loaded node (each SCAN listener keeps updated cluster load statistics) where the service is currently being offered. Finally, the client establishes a connection to the service through the listener on the node where the service is offered. All of these actions take place transparently to the client without any explicit configuration required in the client."

[oracle@kracnode1]$ srvctl STATUS SCAN_LISTENER
SCAN Listener LISTENER_SCAN1 is enabled
SCAN listener LISTENER_SCAN1 is running on node kracnode1
SCAN Listener LISTENER_SCAN2 is enabled
SCAN listener LISTENER_SCAN2 is running on node kracnode2
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node kracnode3
[oracle@kracnode1]$

Instead of DNS or GNS, Can we use '/etc/hosts' to resolve SCAN?

Oracle strongly recommends that you do not configure SCAN VIP addresses in the hosts file. But if you use the hosts file to resolve the SCAN name, you can have only one SCAN IP address, and you will receive a Cluster Verification Utility failure at the end of installation (see Note 887471.1 for more details).
Can we use the previous method (Using VIP) for client connection?
Clients can continue to access the cluster in the same way as with previous releases. VIPs are still used internally and can still be used for connections, but Oracle strongly recommends that clients accessing the cluster use the SCAN. Clients using the SCAN can also access the cluster using EZCONNECT.
Is it mandatory to use SCAN?
It's highly recommended to use SCAN unless there's strong business reason preventing it from being used.
Is it supported to remove SCAN?
SCAN is an elementary part of 11gR2 Grid Infrastructure, it's not supported to remove SCAN.
11gR2 RAC: Why SCAN & Node Listeners in 11gRAC
Hello,
A long (and somewhat confusing) note on listeners in 11gR2 RAC.
As you are all aware, we have two sets of listeners running on the database servers under Grid Infrastructure (aka RAC), namely the SCAN listeners and the node listeners.
How do they work?
The SCAN works by being able to resolve to multiple IP addresses reflecting multiple listeners in the cluster handling public client connections. When a client submits a request, the SCAN listener listening on a SCAN IP address and the SCAN port is contacted on the client's behalf. Because all services on the cluster are registered with the SCAN listener, the SCAN listener replies with the address of the local listener on the least-loaded node where the service is currently being offered. Finally, the client establishes a connection to the service through the listener on the node where the service is offered. All of these actions take place transparently to the client without any explicit configuration required in the client.

A picture can save 100 words of explanation
[Diagram: SCAN listeners, node listeners, and PMON service registration]
We have 3 SCAN listeners; the RAC instances register their services with them via the remote_listener parameter (registration is done by PMON of each instance), and PMON also registers with the node listeners (local_listener), which actually point to your database.
  • Indeed, both sets of listeners run from the same home (the Grid home), but for different functionality. From 11gR2 onwards these listeners are part of the clusterware and are managed by oraagent. This oraagent, spawned by crsd, takes care of our listeners in terms of configuration and monitoring.
  • When a listener is started by oraagent, you can see lines in the listener.ora file marked "# line added by Agent".
  • Another important aspect of the listener.ora files is that they are managed by oraagent whenever a clusterware command (for example srvctl) is used to manage these listeners.
  • Further, the listeners are monitored by the oraagent process every 60 seconds, which you can see in $GRID_HOME/log/{nodename}/agent/crsd/oraagent_oracle/oraagent_oracle.log
  • You must set the local_listener and remote_listener parameters to the SCAN listener, or to a description or address list of the same, since this is now the only way to register your database/instance with the SCAN listeners. Even if you don't specify it, oraagent will automatically add the local_listener value, but you must explicitly set the remote_listener parameter to the cluster SCAN name, which is what gives you TAF (see the sketch after this list).
For example,
If you delete the listener.ora and restart the listener with srvctl start listener, a listener.ora will reappear. The original configuration will reappear in listener.ora and the manually modified listener.ora will be renamed (with a timestamp suffix) 
  • The agent also creates and maintains the file endpoints_listener.ora; this file is there for backward compatibility.
  • Oraagent also comes into action at instance startup: when the instance is started with srvctl (as opposed to a 'manually' started instance from sqlplus), it sets the LOCAL_LISTENER parameter dynamically (this is done with an ALTER SYSTEM command, and only if the parameter has not been set in the spfile).
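A minimal sketch of the registration parameters mentioned above, assuming the SCAN name and port used elsewhere in this post (oraagent normally maintains local_listener for you, so the one to set explicitly is remote_listener):
SQL> ALTER SYSTEM SET remote_listener='kracnode-scan.india.com:1521' SCOPE=BOTH SID='*';
SQL> SHOW PARAMETER listener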
You may also have observed that there are no TCP/IP settings in the listener.ora file. Where are the TCP/IP settings of my listeners? Only IPC endpoints are listed in listener.ora; as you see below, the listener.ora contains only listener entries with the IPC protocol.
Notes:- Before proceed further
What is a dynamic listening endpoint?
A dynamic listening endpoint is the address or connection point to the listener, the best-known endpoint in the Oracle DB world being TCP, port 1521.
Oraagent connects to the listener via IPC and activates the TCP (TCPS, etc.) endpoints as specified in the clusterware configuration.
Example Listener.ora settings:-
[oracle@rac1]$ cat /u01/app/11.2.0.2/grid/network/admin/listener.ora
LISTENER=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))) # line added by Agent
LISTENER_SCAN3=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN3)))) # line added by Agent
LISTENER_SCAN2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))) # line added by Agent
LISTENER_SCAN1=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))) # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN1=ON # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN2=ON # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER_SCAN3=ON # line added by Agent
ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=ON # line added by Agent
WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = /u01/app/11.2.0.2/grid/network/admin/cost)
)
)
So here no SID, no port, and no host address are configured; this is because these settings are wholly managed by the clusterware.
The SCAN listeners have their endpoints (via IPC or TCP) defined in endpoints_listener.ora, which looks like this:
endpoints_listener.ora
LISTENER_SCAN3=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=strac201a-vip)(PORT=1522))(ADDRESS=(PROTOCOL=TCP)(HOST=172.24.21.89)(PORT=1522)(IP=FIRST)))) # line added by Agent
LISTENER_SCAN2=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=strac201a-vip)(PORT=1521))(ADDRESS=(PROTOCOL=TCP)(HOST=172.24.21.89)(PORT=1521)(IP=FIRST)))) # line added by Agent
tnsnames.ora

MYDB1522 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = strac201-scan.frxntnyc.frx2.com)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = MYDB1522)
)
)
As you can see, the listener.ora file points to the IPC SCAN listener endpoints, whose VIP/hostname TCP endpoints are in endpoints_listener.ora. Once these listeners are started, the database PMON registers the SID/services with these listeners (see picture above) as per the tns entries or the local_listener/remote_listener parameters; PMON registers with both the node and the SCAN listeners.
For example,
When you just start a listener ,
$GRID_HOME/bin/lsnrctl stop listener; $GRID_HOME/bin/lsnrctl start listener;
This command starts only the IPC endpoint.
However, oraagent is posted at listener startup and activates the rest of the endpoints (notably listening on the TCP port). This can be seen, for example, by running the following a few seconds after the listener restart: $GRID_HOME/bin/lsnrctl status listener (which will list all the active endpoints); the active endpoints are in fact the database instances. This is possible due to the listener.ora parameter you usually see: ENABLE_GLOBAL_DYNAMIC_ENDPOINT_{LISTENER_NAME}=ON
Another check,
When you set ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER=OFF in listener.ora
and manually stop and start the listener (manually means not using srvctl):
$GRID_HOME/bin/lsnrctl stop listener
$GRID_HOME/bin/lsnrctl start listener
then check listener.ora: the parameter edited above has not changed, and in this case the TCP endpoint will not be started, that is, the listener will be listening only on IPC.
Check with: $GRID_HOME/bin/lsnrctl status listener
If we do the same exercise by stopping and starting the listener with srvctl, as would be the typical way to do it, we will see that the parameter ENABLE_GLOBAL_DYNAMIC_ENDPOINT_LISTENER in listener.ora is set back to ON.
Now, the moral of the post:
Always use srvctl to add, stop and start listeners.
And always use srvctl to start instances/databases, as it sets the local_listener parameter automatically to the endpoints.
That way the listener.ora file is managed promptly and neatly by the oraagent process.


16- As you said the voting disk & OCR reside in ASM diskgroups, but as per the startup sequence OCSSD starts before ASM; how is this possible?
How does OCSSD start if the voting disk & OCR reside in ASM diskgroups?
You might wonder how CSSD, which is required to start the clustered ASM instance, can be started if the voting disks are stored in ASM. This sounds like a chicken-and-egg problem: without access to the voting disks there is no CSS, hence the node cannot join the cluster; but without being part of the cluster, CSSD cannot start the ASM instance. To solve this problem the ASM disk headers have new metadata in 11.2: you can use kfed to read the header of an ASM disk containing a voting disk. The kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the voting file. This does not require the ASM instance to be up. Once the voting disks are located, CSS can access them and join the cluster.
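You can see these fields yourself with kfed; a quick illustrative check against one of the ASM disks (the device path below is just an example):
$ kfed read /dev/oracleasm/disks/DATA1 | grep -E 'kfdhdb.vfstart|kfdhdb.vfend'
Non-zero vfstart/vfend values indicate that the disk holds a voting file copy.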
17-All srvctl and crsctl commands.
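A short working set for this question, as a sketch (the database, instance and node names are placeholders):
$ srvctl status database -d RACDB
$ srvctl start instance -d RACDB -i RACDB1
$ srvctl stop database -d RACDB -o immediate
$ srvctl status nodeapps -n rac1
$ srvctl config database -d RACDB
$ crsctl check cluster -all
$ crsctl stat res -t
# crsctl stop crs        (run as root; stops the full stack on the local node)
# crsctl start crs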
18-Server pool
-A new feature introduced with 11gR2, server pools, is a dynamic grouping of Oracle RAC instances into abstracted RAC pools, by virtue of which a RAC cluster can be efficiently and effectively partitioned, managed, and optimized.
-Simply put, they are a logical grouping of server resources in a RAC cluster and enable the automatic on-demand startup and shutdown of RAC instances on various nodes of a RAC cluster.
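A small sketch of creating and inspecting a server pool (the pool name and limits are placeholders; run the add command with appropriate privileges):
# srvctl add srvpool -g mypool -l 1 -u 3 -i 10
$ srvctl config srvpool -g mypool
$ srvctl status srvpool -g mypool
A policy-managed database assigned to this pool will then have its instances started automatically on whichever servers the clusterware places in the pool.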
19-Policy managed database vs administrator managed database.
20-Instance Caging
-When multiple instances are consolidated in a RAC DB node, resource distribution, consumption, and management become a challenge for a DBA/DMA. Each instance typically will have different requirements and behavior during peak and non-peak business timeframes. There is a high probability that a CPU thirsty instance could impact the performance of other instances configured on the same node.
-As a powerful new feature of Oracle 11gR2, instance caging lets DBAs significantly simplify and control an instance's overall CPU/core usage by restricting each instance to a certain number of CPUs. Large-scale RAC environments can benefit greatly from this option.
SQL> ALTER SYSTEM SET CPU_COUNT=2 SCOPE=BOTH SID='INSTANCE_NAME';
SQL> ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = default_plan|myplan SCOPE=BOTH SID='INSTANCE_NAME';
SQL> show parameter cpu_count
SQL> SELECT instance_caging FROM v$rsrc_plan WHERE is_top_plan = 'TRUE';
After enabling the feature, CPU usage can be monitored by querying the gv$rsrc_consumer_group and gv$rsrcmgrmetric_history dynamic views. These views provide useful input about instance caging usage, giving in minute detail the CPU consumption and throttling feedback for the past hour.
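A minimal monitoring sketch against v$rsrcmgrmetric_history (the CPU time columns are reported in milliseconds, hence the division by 1000):
SQL> SELECT begin_time, consumer_group_name,
            cpu_consumed_time/1000 AS cpu_used_secs,
            cpu_wait_time/1000 AS cpu_throttled_secs
       FROM v$rsrcmgrmetric_history
      ORDER BY begin_time, consumer_group_name;
A steadily growing cpu_throttled_secs for a consumer group means the instance cage (CPU_COUNT plus the resource plan) is actively limiting that workload.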


