Why is a Virtual IP Necessary in an Oracle RAC Environment

By: Scott Jesse, Bill Burton, Bryan Vongray


Why must we use a VIP address in an Oracle RAC environment? The simple answer to this question is TCP timeouts. Let’s discuss this a bit more.

TCP timeouts, believe it or not, play a huge part in the perceived availability of applications. When a node in an Oracle RAC environment goes down, or in any MAA environment with multiple addresses to attempt, there may be no way for the client to know this. If a client is connecting using a TNS alias or a service that allows connection to multiple nodes, the client may unknowingly make its first connection attempt to the node that is down. This in and of itself is not a problem: multiple addresses should be in the list, so when the client fails to get a response from the first address, the next one is tried, until the connection succeeds. The problem lies in the time it takes to move on to the next address in the list.
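For reference, a connect-time failover address list of the kind described here might look like the following in tnsnames.ora (the host names, port, and service name are illustrative placeholders, not taken from the text):

```
ORCL =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = orcl))
  )
```

With FAILOVER = ON, SQL*Net walks the ADDRESS_LIST in order until one address answers.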

How long does the client wait to determine that the host it is trying to reach is not accessible? The time can range anywhere from a few seconds to a few minutes, and in some environments this is simply unacceptable. If a node goes down for several hours, days, or weeks, the database may still be humming along just fine, with x number of nodes still accessing the database. However, some clients may always be trapped into making the initial connection attempt to the down node, and will therefore be stuck in front of a seemingly interminable hourglass while the connection times out, before being rerouted to the next address in the list.
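A minimal sketch of this sequential failover, in Python with the OS-level TCP connect stubbed out, shows where the delay comes from: every down address in front of a live one costs the client a full TCP timeout before the next address is even tried.

```python
def try_addresses(addresses, connect, timeout=30.0):
    """Attempt each address in order; return (address, seconds_waited).

    `connect` stands in for the OS-level TCP connect: it returns True on
    success and False when the attempt times out after `timeout` seconds.
    """
    waited = 0.0
    for addr in addresses:
        if connect(addr):
            return addr, waited
        waited += timeout  # each down node costs the client a full TCP timeout
    raise ConnectionError("all addresses failed after %.0f seconds" % waited)

# Stub cluster state: node1 is down, node2 is up.
up = {"node1-vip": False, "node2-vip": True}
addr, waited = try_addresses(["node1-vip", "node2-vip"], lambda a: up[a])
# addr is "node2-vip"; waited is 30.0 -- the client sat through one full timeout
```

The connection does eventually succeed, but only after the client has waited out the entire timeout on the dead address.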

Reining in TCP timeouts at the OS

Unfortunately, this time is generally outside of the control of Oracle. In addition, it varies from client to client and from operating system to operating system. It is controlled by operating system timeout values on the client side, so making modifications to all clients can be cumbersome: there may be many clients, and each may need its own configuration changes. Changing the timeout values may also have adverse consequences for other applications that the clients are running, if those applications rely on a higher TCP timeout value for whatever reason.

To make matters worse, the behavior may not be consistent. If client-side load-balancing is enabled, it is possible that some connections will succeed immediately on their first attempt, because they just happened to connect randomly to a node that is available. At other times, however, the connection time increases, because the client randomly and unwittingly picks the down node for its first connection attempt. The result of this is confusion and frustration at the client side, even though from the database’s perspective everything is functioning as it should.
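This inconsistency can be sketched in a few lines of Python (with node names as placeholders): when client-side load balancing is on, SQL*Net picks the first address to try at random, so with one node down in a three-node cluster, roughly one connection in three starts out against the dead node.

```python
import random

def pick_first_address(addresses, load_balance=True, rng=random):
    """With client-side load balancing (LOAD_BALANCE=ON), SQL*Net picks the
    first address to try at random; otherwise it always starts at the top."""
    return rng.choice(addresses) if load_balance else addresses[0]

addresses = ["node1-vip", "node2-vip", "node3-vip"]  # assume node1 is down
# Roughly one in three randomized first attempts lands on the down node.
hits = sum(pick_first_address(addresses) == "node1-vip" for _ in range(3000))
```

Two users running the same application therefore see very different connect times, purely by luck of the draw.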

Giving the MAA DBA control over TCP timeouts

Enter the VIP address. By using a VIP address, Oracle eliminates the problem of TCP timeouts on the initial connection, without the need to make any changes to a single client machine. This is done by requiring all client connections to come in on a VIP address. When all nodes are functioning properly, each VIP runs on its assigned node, and connections are directed to the appropriate listener and service. When the unthinkable happens, and a node fails (gasp!), CRS will kick in, and the VIP for that node will be brought online on one of the remaining nodes of the cluster, where it can respond to a ping and also to connection attempts. Note that this VIP will not be accepting connections to the database at this time. However, since the IP address is available, it can respond to a connection attempt immediately.

The response given to the client will generally be in the form of an ORA-12541, advising that no listener is available. This is because the node where the VIP now resides has its own listener, but that listener listens on its own VIP, not on the VIPs of any other nodes. The client, receiving the message that there is no listener, will then immediately retry using the next IP in the ADDRESS_LIST, rather than waiting up to two minutes for the timeout we would normally expect. Thus, a connect-time failover has still occurred, but the connection attempt succeeds within a second or less. Even when a client connects through the SCAN, the local listener still listens on the VIP.
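The difference between the two failure modes can be sketched as follows (Python, with the network stubbed out and the 60-second timeout an assumed OS default): a dead IP makes the client wait out the full TCP timeout, while a failed-over VIP refuses the connection immediately, so the retry against the next address happens almost instantly.

```python
TCP_TIMEOUT = 60.0  # assumed client-side OS default, in seconds

def attempt_cost(node_state):
    """Approximate wall-clock cost of one connect attempt, by node state."""
    if node_state == "up":
        return 0.0          # listener answers and the connection proceeds
    if node_state == "vip-failed-over":
        return 0.001        # VIP answers but no listener -> immediate ORA-12541
    if node_state == "dead-ip":
        return TCP_TIMEOUT  # nothing answers -> the client waits out the timeout
    raise ValueError(node_state)

def failover_cost(first, second="up"):
    """Total delay before connecting successfully on the second address."""
    return attempt_cost(first) + attempt_cost(second)
```

With the VIP failed over, the first attempt fails in milliseconds instead of minutes, which is the whole point of the mechanism.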

Why a SCAN virtual IP?

Having answered the question of why we need VIPs, your next logical question might be, "Why a SCAN VIP?" and subsequently, "Why does Oracle recommend three of them?" The SCAN VIP works a bit differently from the "normal" VIP. We will call the "normal" VIP a local VIP for the moment, as these VIPs are associated with the local listener. We know that if a node fails, the VIP fails over to another node. So far, the SCAN VIP and local VIP act the same. The difference appears after failover: the failed-over local VIP replies to a ping, but no listener accepts SQL*Net connections on it, because the local listener cannot fail over with it. The SCAN listeners, in comparison, can run on any node of the cluster. So the SCAN VIP not only has the task of avoiding a wait for the TCP timeout; it must also ensure that the SCAN listener associated with it can be started on any available node in the cluster, if needed.

Why does Oracle recommend that you set up the SCAN name with three IP addresses, thus having three SCAN VIPs and three SCAN listeners? The answer is related to the subject of this book: Maximum Availability. You can configure your cluster with only one SCAN VIP/listener, but this would make a responsible MAA DBA very nervous. An MAA DBA could not sleep at night, concerned that her single SCAN listener would fail. Redundancy is the answer, but it is not the whole story, because the questions "Why not two?" and "Why not one SCAN per node?" can still be asked. Two would cover the redundancy requirement, but three spreads the connection load so that no single SCAN listener is overwhelmed, keeps the CPU cost per node down, and keeps the cluster's placement decisions simple. Having three is plenty.
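As a sketch of what "three IP addresses behind one name" means in practice, the SCAN name is typically a single DNS entry that resolves, round-robin, to three addresses (the name and addresses below are illustrative placeholders, using documentation-reserved ranges):

```
cluster-scan.example.com.   IN A   192.0.2.11
cluster-scan.example.com.   IN A   192.0.2.12
cluster-scan.example.com.   IN A   192.0.2.13
```

Clients connect to the one SCAN name, and DNS hands back the three addresses in rotating order, so no client configuration references any individual node.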

NOTE

The benefit of using SCAN is that the network configuration files on the client computer do not need to be modified when nodes are added to or removed from the cluster.
