Overview

In any mission-critical communications system, built-in redundancy is imperative. VCOM supports this with a Failover capability.

Failover is the ability to switch over automatically from a primary working server to a secondary backup server in the event of a catastrophic failure of the primary server or its associated network.


Implementation


The IP address of the secondary server is configured in the primary server. When the primary server starts, it immediately establishes a connection to the secondary server. Through this connection the primary server shares its licensing information with the secondary server, so the secondary server does not require its own license. The primary server also conveys any operational changes to the system configuration to the secondary server in real time so that the two configurations remain synchronized. This connection remains active as long as both servers are running. If the connection is lost, the secondary server immediately assumes the active role and allows VCOM clients to connect. In some cases, however, the complexities of a network failure may warrant the secondary server becoming the active server even though the connection is not lost.
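The secondary server's side of this link can be pictured with a minimal Python sketch. It is an illustration only, not VCOM's actual implementation: the class name SecondaryServer, its method names, and the sample license and configuration fields are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SecondaryServer:
    """Hypothetical model of the secondary server's view of the inter-server link."""

    active: bool = False
    license_info: dict = field(default_factory=dict)
    config: dict = field(default_factory=dict)

    def on_license(self, license_info: dict) -> None:
        # Licensing is shared by the primary, so no separate license is needed.
        self.license_info = dict(license_info)

    def on_config_change(self, key: str, value: object) -> None:
        # Configuration change pushed by the primary in real time.
        self.config[key] = value

    def on_link_lost(self) -> None:
        # Losing the connection to the primary makes the secondary active.
        self.active = True

    def accepts_logins(self) -> bool:
        # Only the active server allows VCOM clients to log in.
        return self.active


# Example with made-up values: synchronize, then lose the link.
secondary = SecondaryServer()
secondary.on_license({"seats": 50})
secondary.on_config_change("channel_count", 8)
secondary.on_link_lost()
```

The sketch covers only the link-loss path. The cases in which the secondary server becomes active while the connection is still up are not modeled here.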


When a VCOM client logs into the primary server, the IP address of the secondary server is automatically provided to it. If communication with the primary server is lost, the client automatically attempts to connect to the secondary server. If the secondary server is available and active, the VCOM client logs into it. Once the secondary server has become the active server, switching back to the primary server generally requires manual authorization: the condition that caused the failover must first be properly evaluated to ensure it cannot recur and unnecessarily disrupt active communications. The manual switchover is controlled through the System Administration application.
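The client-side behavior can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function choose_server and the can_login callback are hypothetical stand-ins for whatever connection check a real VCOM client performs, and the addresses are placeholders.

```python
from typing import Callable, Optional


def choose_server(
    primary: str,
    secondary: Optional[str],
    can_login: Callable[[str], bool],
) -> Optional[str]:
    """Return the address the client should log into, or None if neither accepts logins.

    `secondary` is None until the client has logged into the primary once,
    because the secondary address is only learned at that login.
    """
    for address in (primary, secondary):
        # A server that is unreachable or not active refuses the login.
        if address is not None and can_login(address):
            return address
    return None


# Example: the primary is down and the secondary has become active.
reachable = {"192.0.2.20"}
target = choose_server("192.0.2.10", "192.0.2.20", lambda addr: addr in reachable)
```

Because a server that is not active refuses logins, trying the primary first does not send clients back to it after a failover. They return only once an administrator has manually reactivated it.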


Failover Criteria


In normal operation, the primary server is always the active server. In general, as long as the communications link between the primary and secondary servers is connected, the primary server remains the active server. A server that is not the active server does not allow logins. Many different scenarios can result in a failover event. The most common are as follows:


  • Communication link between primary and secondary servers lost due to primary server failure. In this simplest scenario, the secondary server would recognize the loss of the primary server and immediately become the active server. All clients would also recognize the loss of the primary server and would immediately reconnect to the secondary server.


  • Communication link between primary and secondary servers lost due to failure of network infrastructure. In this scenario, the secondary server would recognize the loss of the primary server and immediately become the active server. Since the primary server is still running, it would also still consider itself to be the active server. However, if the network failure also resulted in the simultaneous loss of the majority of connected VCOM clients, the primary server will deactivate itself, forcing all remaining clients to connect to the secondary server.


  • Communication link between primary and secondary servers not lost, but partial failure of network infrastructure. In this scenario, the partial network failure may result in the loss of a large portion of the VCOM clients. In this case the primary server will instruct the secondary server to activate, allowing connections to be made. If the secondary server reports that client connections were established, the primary server will deactivate itself, forcing all remaining clients to connect to the secondary server.
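The primary server's side of the last two scenarios can be summarized as a small decision function. The Python sketch below is an interpretation of the criteria above, not VCOM's actual logic: the names Action and primary_decision are hypothetical, and the single loss_threshold of 0.5 is an assumption standing in for both "the majority" and "a large portion" of clients, which the text does not quantify.

```python
from enum import Enum


class Action(Enum):
    STAY_ACTIVE = "stay_active"
    REQUEST_SECONDARY_ACTIVATION = "request_secondary_activation"
    DEACTIVATE = "deactivate"


def primary_decision(
    link_up: bool,
    clients_before: int,
    clients_now: int,
    secondary_has_clients: bool = False,
    loss_threshold: float = 0.5,  # assumed value, not taken from the source
) -> Action:
    """Decide what a still-running primary server does after a network event."""
    if clients_before <= 0:
        # No clients were connected, so there is no loss to measure.
        return Action.STAY_ACTIVE

    lost_fraction = (clients_before - clients_now) / clients_before
    if lost_fraction <= loss_threshold:
        return Action.STAY_ACTIVE

    if not link_up:
        # Link to the secondary lost and most clients gone: step aside.
        return Action.DEACTIVATE

    if secondary_has_clients:
        # Link intact and the secondary confirms clients connected to it.
        return Action.DEACTIVATE

    # Link intact but many clients lost: ask the secondary to activate.
    return Action.REQUEST_SECONDARY_ACTIVATION
```

For example, with ten clients connected before the event and three afterwards, an intact link yields REQUEST_SECONDARY_ACTIVATION, and a later call with secondary_has_clients=True yields DEACTIVATE.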


Summary

VCOM Failover support is an integral part of any mission-critical communications solution. While its operation will never be visible to the end user, its availability ensures that communications capability is quickly restored after an unforeseen catastrophic primary server failure.