This FAQ answers questions about High Availability on SAP systems. It covers general nomenclature as well as SAP-specific questions. Links to detailed information are available on the High Availability page on SDN.
- What is a "High Availability Cluster"?
- What is "switchover"?
- What is "failover"?
- What is "fallback"?
- What is an "active-passive cluster"?
- What is a "standby system"?
- What is an "active-active cluster"?
- What does "virtualization" mean?
- What do "shared nothing" and "shared all" mean?
- What is a "SPOF"?
- How can I set up a highly available system?
- What is ASCS/SCS?
- What is the difference between HA for ABAP and HA for Java?
- Is SAP HA also supported for heterogeneous system landscapes?
- Does SAP support Microsoft Geo Clusters?
- Is there a session failover mechanism for SAP NetWeaver AS Java?
- What is an "Enqueue Replication Server"?
- Is an "Enqueue Replication Server" necessary for setting up an SAP HA system landscape?
- Is SAP NetWeaver AS Java available also if a central service or the database fails?
- What is the difference between Enqueue Server and Standalone Enqueue Server?
- Why is SAP using an active-passive cluster configuration?
- Are my transactions lost after a database failover?
- Is there a certification for SAP HA solutions?
- Who performs the High Availability setups for SAP solutions?
- Does SAP NetWeaver AS ABAP/AS Java/Composition Environment support High Availability?
SAP High Availability and the Windows Environment
- How does SAP support High Availability in Windows environments?
- What is a cluster resource group?
- Why can't I access the shared disks when they are part of another cluster node?
- How many failover nodes are currently supported by MSCS/SAP?
- How can I initiate a failover on MSCS manually?
- What is the "Threshold"?
- Are there other cluster environments available for the Microsoft Windows platform?
- Where can I find additional information regarding the Microsoft Cluster Service?
SAP High Availability and the UNIX Environment
- How does SAP support High Availability solutions on Unix/Linux?
- Where can I find documentation and resources for SAP HA setups on Unix/Linux?
- I want to set up my Dialog Instance on Windows and the SCS on Unix. Is this type of setup supported by SAP?
General High Availability
What is a "High Availability Cluster"?
What is "Switchover"?
A "switchover" is a planned switch from a primary server to a standby server, that is, one that takes place without a failure of the primary server. A switchover is always initiated by the system administrator.
What is "Failover"?
A “failover” is an unplanned switch from a primary server to a standby server, triggered by a failure of the primary server node. Unlike a switchover, a failover is performed automatically by the cluster software. Some cluster software, such as Microsoft Cluster Service, also provides an option for a manual failover for testing purposes.
What is "Fallback"?
A “fallback” is the process of switching back from a secondary server node to the primary server node after a failover has occurred and the primary server node is available again. A fallback can be performed automatically by the cluster software or "intelligently", that is, manually by the system administrator.
What is an "Active-Passive Cluster"?
An active-passive cluster consists of at least two independent server nodes. The primary server node performs all operations. A secondary node acts as a so-called "standby system."
If the primary node fails, the cluster software automatically fails over to the standby server node, which starts the processes and resumes the work of the primary server node. A cluster group is active on only one server node at a time.
Please note that an active-passive cluster configuration does not imply that the standby server node carries no workload. "Active-passive" refers only to the cluster group: in such a configuration, a resource can be active on only one server node at a time. However, if a cluster environment contains more than one cluster group, these groups can be distributed across the nodes of the cluster.
The SAP Central Services Instance is implemented as an active-passive cluster.
Active-Passive cluster in normal state
Active-Passive cluster in failover state
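The terms switchover, failover, and fallback can be illustrated with a small model. The following is a minimal sketch only, not how any cluster software is implemented: the class `ClusterGroup`, its methods, and the node names `node_a` and `node_b` are hypothetical. It shows that a cluster group is active on exactly one node at a time, and that two groups can be distributed so that the "standby" node of one group still carries workload.

```python
from dataclasses import dataclass


@dataclass
class ClusterGroup:
    """Hypothetical model of a cluster group that is active on one node at a time."""
    name: str
    primary: str         # preferred node for this group
    standby: str         # node that takes over
    active_on: str = ""  # node currently running the group

    def __post_init__(self):
        # In the normal state the group runs on its primary node.
        self.active_on = self.primary

    def _other(self, node):
        return self.standby if node == self.primary else self.primary

    def switchover(self):
        """Planned move to the other node, initiated by the administrator."""
        self.active_on = self._other(self.active_on)

    def failover(self, failed_node):
        """Unplanned move, triggered by the cluster software when a node fails."""
        if self.active_on == failed_node:
            self.active_on = self._other(failed_node)

    def fallback(self):
        """Return to the primary node once it is available again."""
        self.active_on = self.primary


# Two groups distributed over two nodes: each node is the standby for the
# other group but still runs its own group.
scs = ClusterGroup("SCS", primary="node_a", standby="node_b")
db = ClusterGroup("DB", primary="node_b", standby="node_a")

for group in (scs, db):
    group.failover("node_a")  # node_a crashes

after_failover = {g.name: g.active_on for g in (scs, db)}

scs.fallback()  # node_a has been repaired
after_fallback = {g.name: g.active_on for g in (scs, db)}

print(after_failover)
print(after_fallback)
```

In this sketch only the SCS group moves when `node_a` fails; the DB group was already running on `node_b` and is unaffected.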
What is a "Standby System"?
A standby system is a redundant cluster node that takes over the processes if the primary cluster server fails. This is referred to as a "failover" and is performed automatically by the cluster software. There can be several standby cluster nodes; the number is limited only by the capabilities of the cluster software. For example, the Microsoft Cluster Service supports up to 8 server nodes, which means a maximum of 7 standby nodes.
"Standby systems" can be in a "hot" state or a "cold" state. With a "hot standby", the processes also run on the standby node, so if a failure occurs the cluster resource is already running and does not need to be started on the standby system. With a "cold standby", the clustered resource must be started on the standby system when a failover occurs, which causes a (short) downtime during the failover. The SAP Central Services Instance is usually implemented as a "cold standby" system because the SCS is a lightweight component that does not take long to start.
What is an "Active-Active Cluster"?
An active-active cluster consists of at least two independent server nodes. The workload within a cluster resource is shared between the server nodes. If a cluster node crashes, its processes are resumed by the remaining cluster nodes. An active-active cluster configuration means that a cluster resource is active on all cluster nodes. The aim of an active-active cluster is not only to provide a highly available system but also to distribute the workload between the cluster nodes. Applications with a very high workload, such as databases, benefit from an active-active setup. Because the SAP Central Services Instance is a lightweight component, an active-active setup does not make sense for it. The SCS is therefore implemented as an active-passive cluster resource.
What does "virtualization" mean?
The term “virtualization” in the context of HA refers to a kind of abstraction performed by the cluster software. The software creates a virtual host that owns a virtual hostname, virtual disk, and so on. “Virtual” in this sense means that such resources are not tied to one physical machine but can be owned by any of them. The cluster software manages which node currently owns or runs a resource. Related “resources” are usually grouped into logical containers (for example, groups on MSCS or packages on HPSG) that can fail over independently.
What do "shared nothing" and "shared all" mean?
The terms "shared nothing" and "shared all" specify a type of architecture within an active-active cluster. "Shared nothing" means that every cluster node has its own data partition. This implies that such a setup is not highly available, because if a node fails, the data of the failed node is no longer available. In a "shared all" environment, the cluster nodes that run the same service share a data partition and access the data concurrently.
These options have to be supported by the cluster software. For example, MSCS does not support the "shared all" option.
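The difference between the two architectures can be made concrete with a small sketch. The function `reachable_data`, the node names, and the partition names below are hypothetical and serve only as an illustration; they do not describe any particular cluster product. The sketch computes which data partitions can still be reached after one node fails.

```python
def reachable_data(owners, failed_node):
    """Return the partitions that at least one surviving node can still access.

    owners maps a partition name to the set of nodes that can access it.
    """
    return sorted(p for p, nodes in owners.items() if nodes - {failed_node})


# Shared nothing: each node owns its own data partition exclusively.
shared_nothing = {"part_1": {"node_a"}, "part_2": {"node_b"}}

# Shared all: all nodes access the same data partition concurrently.
shared_all = {"part_1": {"node_a", "node_b"}}

left_nothing = reachable_data(shared_nothing, "node_a")
left_all = reachable_data(shared_all, "node_a")

print(left_nothing)  # the data of the failed node is missing
print(left_all)      # all data is still reachable
```

Under these assumptions, the failure of `node_a` makes `part_1` unreachable in the "shared nothing" layout, while the "shared all" layout loses no data.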
What is a "SPOF"?
A Single Point of Failure (SPOF) is any component within a system that, if it fails, causes the loss of a runtime-critical service. A SPOF can be a hardware or a software component. However, this FAQ only covers the potential software SPOFs identified by SAP: the Message Server, the Enqueue Server, the central file system, and the database. For High Availability it is necessary to eliminate these SPOFs. Be aware that it is possible to introduce additional SPOFs through configuration and programming, by adding critical, non-redundant components yourself. These must be identified through an analysis and either be eliminated or covered by failover services.
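Such an analysis can start from a simple inventory of components. The following is a minimal sketch under stated assumptions: the function `find_spofs`, the dictionary layout, the `covered` flag (meaning "redundant or covered by a failover service"), and the entry `custom_batch_scheduler` are all hypothetical and only illustrate the idea.

```python
def find_spofs(components):
    """Return the critical components that are neither redundant nor covered by failover."""
    return sorted(
        name
        for name, info in components.items()
        if info["critical"] and not info["covered"]
    )


# Hypothetical inventory before any HA measures are in place.
landscape = {
    "message_server": {"critical": True, "covered": False},
    "enqueue_server": {"critical": True, "covered": False},
    "central_file_system": {"critical": True, "covered": False},
    "database": {"critical": True, "covered": False},
    "dialog_instances": {"critical": True, "covered": True},         # several instances
    "custom_batch_scheduler": {"critical": True, "covered": False},  # hypothetical custom add-on
}

spofs = find_spofs(landscape)
print(spofs)
```

The custom component appears in the result alongside the SAP-identified SPOFs, which is the case the paragraph above warns about.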
SAP High Availability Setup
How can I set up a highly available system?
SAP supports the installation of highly available systems in several respects. However, High Availability requires additional software that is not delivered by SAP. In addition, it is important to analyze your system to make sure that no single points of failure are overlooked, because it is possible to introduce them through configuration or through custom software. We recommend engaging experienced consultants for this analysis.
In general, SAP systems are made highly available through their technology components, which are the application servers. However, it is also the responsibility of a running program not to introduce additional single points of failure. At SAP this is ensured through extensive quality management; for custom development it should be considered carefully.
What is ASCS/SCS?
With SAP NetWeaver 04 Java, the Message Server and the Enqueue Server are separated from the Central Instance. These two services are grouped within the SAP Central Services Instance (SCS). From NW04s, the ABAP Central Services can also be separated from the Central Instance. Each stack, ABAP and Java, has its own Message Service and Enqueue Service. For ABAP systems the Central Services are referred to as ASCS; for Java systems they are referred to as SCS. The ASCS and the SCS are classified as SPOFs and therefore require a High Availability setup. If the ASCS is integrated within the ABAP Central Instance (the standard in NetWeaver 04), the Central Instance of the ABAP system also needs an HA setup.
What is the difference between HA for ABAP and HA for Java?
In SAP NetWeaver 6.40 ABAP, the Message Server and the Enqueue Server are integrated within the ABAP Central Instance (CI).
In SAP NetWeaver 6.40 Java, the Message Server and the Enqueue Server are implemented as services within the SAP System Central Services Instance (SCS) and are thereby separated from the Central Instance (CI).
With SAP NetWeaver 04s ABAP, the Message Server and the Enqueue Server in the ABAP stack can also be separated from the Central Instance (CI) and moved to the ABAP SAP Central Services Instance (ASCS). This is recommended for HA setups because the ASCS is a lightweight component that can be switched over easily.
Is SAP HA also supported for heterogeneous system landscapes?
SAP does not officially support heterogeneous HA cluster environments at the moment. However, preparations for official support of this type of setup are currently being evaluated.
Does SAP support Microsoft Geo clusters?
Replication can be synchronous or asynchronous, depending on the functionality of the storage subsystem, the accepted amount of data loss during a failover, the physical layout of the storage area network (distance between the storage boxes, signal latency, capacity and speed of the network connection) and, last but not least, the budget of the customer and the functionality supported by the database vendor.
Standard SAP installation procedures are normally used during the SAP system installation of those configurations. However, depending on the chosen configuration, some steps differ. Here again, the hardware vendor is responsible for delivering the information and support or for performing the installation.
Is there a session failover mechanism for SAP NetWeaver AS Java?
NetWeaver AS Java supports a session failover mechanism using DB, local persistence, or shared memory (7.1), which can be implemented in applications. For further information on how to do that, see the documentation for Failover System in Version 7.0 or Configuring Shared Memory for Version 7.1.
What is an "Enqueue Replication Server"?
The Enqueue Server contains the central locking table for the SAP cluster. Besides database locks, the table also contains infrastructure locks on system-wide objects. It is therefore necessary to protect the locking table against a failure of the Standalone Enqueue Server. The SAP Enqueue Replication Server provides a replication mechanism for the Enqueue Server by holding a copy of the locking table within its shared memory segment. After a failure of the Enqueue Server, the locking table can be restored from this copy. Since SAP NW04 SP15/NW04s SR1, an automated installation of the Enqueue Replication Server is available for Windows environments. UNIX/Linux installations are handled by SAP hardware partners.
Note: You can only protect Standalone Enqueue Servers with an Enqueue Replication Server. The standard Enqueue Server in an ABAP CI (Enqueue work process) cannot be protected by an Enqueue Replication Server.
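The replication idea can be sketched in a few lines. This is a conceptual illustration only and not SAP's implementation or protocol: the classes `EnqueueServer` and `ReplicationServer`, their methods, and the lock names are hypothetical. Every change to the lock table is mirrored to a replica, and a restarted enqueue server rebuilds its table from that copy.

```python
class ReplicationServer:
    """Hypothetical replica that holds a copy of the lock table."""

    def __init__(self):
        self.table = {}  # lock name -> owner


class EnqueueServer:
    """Hypothetical enqueue server that mirrors every lock change to a replica."""

    def __init__(self, replica):
        self.replica = replica
        self.locks = {}  # lock name -> owner

    def acquire(self, name, owner):
        if name in self.locks:
            return False  # lock is already held
        self.locks[name] = owner
        self.replica.table[name] = owner  # mirror the new lock
        return True

    def release(self, name):
        self.locks.pop(name, None)
        self.replica.table.pop(name, None)  # mirror the release

    def restore_from_replica(self):
        """Rebuild the lock table after a restart."""
        self.locks = dict(self.replica.table)


replica = ReplicationServer()
primary = EnqueueServer(replica)
primary.acquire("MARA/000123", "user_1")  # lock names are made up
primary.acquire("VBAK/000777", "user_2")
primary.release("VBAK/000777")

# Simulated failure: the primary process is gone and a new one is started.
restarted = EnqueueServer(replica)
restarted.restore_from_replica()

granted = restarted.acquire("MARA/000123", "user_3")
print(restarted.locks, granted)
```

In this sketch the lock held by `user_1` survives the restart, so the later request by `user_3` for the same object is refused.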
Is an "Enqueue Replication Server" necessary for setting up an SAP HA system landscape?
Is SAP NetWeaver AS Java available also if a central service or the database fails?
What is the difference between Enqueue Server and Standalone Enqueue Server?
Since NW04 Java, the Enqueue Server and the Message Server for a J2EE Engine are standalone services hosted by the SAP Central Services instance (SCS). The difference between a Standalone Enqueue Server and an Enqueue Service is therefore only formal: the term “service” refers to the Enqueue as part of the SCS; the term “server” refers to the Enqueue as a process, either enserver (.exe) or the enqueue work process within the ABAP stack. From NW04s, the Enqueue Server and the Message Server are also implemented as services within the ABAP SAP Central Services Instance (ASCS).
Are my transactions lost after a database failover?
The impact of a database failover depends on the implementation of the database. Some databases support a session failover mechanism, others do not. Please consult the database-specific documentation for further information.
Is there a certification for SAP HA Solutions?
Does SAP NetWeaver AS ABAP, AS Java and Composition Environment support High Availability?
Yes. They all do, under the conditions described in this FAQ and in the manuals.
SAP High Availability and the Windows environment
How does SAP support High Availability in Windows environments?
What is a cluster resource group?
Why can't I access the shared disks when they are part of another cluster node?
How many failover nodes are currently supported by MSCS/SAP?
How can I initiate a failover on MSCS manually?
What is the "Threshold"?
Are there other cluster environments available for the Microsoft Windows platform?
Where can I find additional information regarding the Microsoft Cluster Service?
- Guide to Creating and Configuring a Server Cluster under Windows Server 2003
- Clustering Services - General introduction to MSCS
- Step-by-Step Guide to Installing Cluster Service
- Technical Overview of Windows Server 2003 Clustering Services
- Windows 2000 Clustering Technologies - Information for W2K clustering services
- Clustering Technology Community
Books
- "WINDOWS NT Microsoft Cluster Service," by Richard R. Lee ISBN: 0-07-882500-8
- "Implementing SAP R/3 using Microsoft Cluster Server," by David V. Watts, Mauro Gatti, Ralf Schmidt-Dannert ISBN: 0-13-019847-1
- "Tuning Microsoft Server Clusters: Guaranteeing High Availability for Business Networks," by Robert W. Buchanan, Robert Buchanan ISBN: 0071417397
- "Windows Server 2003 Clustering & Load Balancing", by Robert Shimonski ISBN: 0072226226
SAP High Availability and the Unix environment
How does SAP support High Availability solutions on UNIX/Linux?
The High Availability setup on UNIX/Linux is supported by our partners and not by SAP directly.
Where can I find documentation and resources for SAP HA setups on UNIX/Linux?
SAP does not provide any specific cluster guides for the implementation of SAP clusters within Unix HA environments, because this task is handled by our hardware partners.