Support pipeline

Posted by Zoran Popovic Sep 23, 2010

Support Pipeline scenario


In this article I explain the actual upgrade scenarios that I proposed in 2008 for the upgrade part of a larger optimization, roll-out and upgrade project of an SAP ERP landscape (ECC5 to ECC6 Ehp3). In the end, the final scenario with all consolidation options was chosen, and the upgrade was finished with a minimum development downtime (development freeze) of less than one week (instead of two months or more). The whole upgrade lasted almost six months, including all preparations, sandbox installations and tests: the official development system upgrade started in February 2009, and Go Live was successfully completed by the end of the last weekend in May 2009.

Basic Scenario - SCENARIO 1


The official (ASAP) Upgrade Roadmap document available on the Marketplace (service.sap.com/upgrade) and in Solution Manager, the Upgrade Master Guide, SDN and other official SAP sources offer many upgrade scenarios. The simplest one is the scenario which I will call SCENARIO 1 ("by the book"), while the more sophisticated ones (like CBU, switch scenarios, etc.) are not discussed here. I will not discuss all the technical details of an upgrade process, only those which affect decisions about the project timeline, development freeze duration and possible hardware purchase or h/w leasing. What is discussed here is the possibility of variants of SCENARIO 1 which use parallel Transport Management routes (so-called Support Pipelines, Transport Lines, or just Lines, [2]) with some degree of additional hardware usage or consolidation, in order to make the development freeze as short as possible, and for other support purposes during the upgrade project.


In almost any standard upgrade procedure scenario we must have a test SANDBOX system (as described in references). In the standard upgrade SCENARIO 1, a sandbox system made as a homogeneous copy of the production system is needed for:


  • Initial technical upgrade (from ECC5 to ECC6): prepare, start of upgrade, SPAU/SPDD, modifications, support packs, etc.


  • Functional tests needed to improve procedures, correct problems, and create and plan all necessary activities - this can be iterated until satisfactory results are achieved


  • Generating the upgrade script (a detailed set of upgrade activities, participants and dependencies, e.g. as given in references)


In business environments without such strict validation practices it is possible to use this system as the future upgraded D (development) system (as proposed, if I have understood well, as I wasn't present at his presentation/workshop), or it could be used as a place for formal functional tests (as proposed in references, upgrade/business case) and as a future Q (Test and QA) system - though they also propose creating a new P (production) system. The company needs three-tiered (D-Q-P) lines in the landscape for validation purposes. In all cases, regardless of which servers are physically used for ECC6, three-tiered systems (D-Q-P) in any upgrade scenario need this series of events (as given in references, [4]) - the UPGRADE LINE:


Figure 1

Fig. 1 - upgrade steps in the basic scenario - scenario 1


Parallel Hardware Infrastructure - SCENARIO 1A


In a variant of such a scenario (one of many which involve system copies or original systems and landscapes), we could use a parallel h/w infrastructure for ECC6 - let us call it SCENARIO 1A:


Figure 2

Fig. 2 - "parallel" h/w landscape


Arrows in fig. 2 show the direction of system copy. D5-Q5-P5 is the existing ECC5 line (landscape) of SAP systems (D5 stands for development, Q5 for the test system and P5 for production on ECC5, while D6 is development on ECC6, and so on), which is both the maintenance line and, later on, the upgrade line. D'6-Q'6-P'6 is the ECC6 sandbox line (though only one sandbox system is normally sufficient for "rehearsal" purposes and for upgrade script generation), made by heterogeneous system copy of each system and established as a separate line in TMS on a new set of servers. In SCENARIO 1, on the other hand, there is only the D5-Q5-P5 line. This scenario does not differ much from the basic scenario (development freeze starts as soon as the upgrade of the development system starts).


At one point, the following series of events (*) is inevitable in both previous scenarios (SCENARIO 1 and SCENARIO 1A) - the development freeze begins at the same place:


  • 1. Development freeze
  • 2. D: Upgrade D5 to D6, corrections, formal unit testing, possibly repeating step 2.
  • 3. Q: Transports generated by the D5 upgrade in step 2, including corrections (somewhat faster than a complete upgrade of the Q5 system), formal integration tests, possibly repeating steps (2. and) 3. If successful, we have Q6 as the result.
  • 4. P: optional test upgrade on a renewed sandbox, possibly repeating steps 2. or 3.
  • 5. Transports from the D5/Q5 upgrade, final upgrade of P5 to P6 and GoLive


The point here is that integration tests should take at least a few weeks (with all the performance impact on the new Q6 system), not counting the days needed for the pure technical upgrade itself. In all scenarios mentioned here the D'6-Q'6-P'6 line actually plays the role of the sandbox system expanded to a whole landscape, which might decrease downtime somewhat in the development freeze phase - but with extra hardware and effort which is not justified (at least I seriously doubt it is, and if the h/w is not borrowed I am certain). This downtime is not important if it does not influence the Development Freeze, as explained in the next scenario - SCENARIO 1B - and this is the only important difference between these two scenarios. Hardware can be consolidated even further, as in the scenario proposed below:


Parallel Hardware Infrastructure - SCENARIO 1B


An alternative solution (with variants) uses an additional support line made of heterogeneous system copies of D and Q (or renewed copies from SCENARIO 1A):


Figure 3a

Fig. 3a - Temporary Support (pipe)line scenarios (as in [2])


In that manner, the production system stays intact until the previously described step 5 - let us call this SCENARIO 1B. Formal validation (unit and integration) tests can be done without a development freeze (**). In this case the hardware is performance-ready, with optionally minimal additional hardware needed for the D and Q copies instead of a whole parallel h/w infrastructure as in SCENARIO 1A. Therefore, I propose SCENARIO 1B.


SCENARIO 1B could OPTIONALLY be realized more comfortably with additional hardware (this is not a mandatory requirement), which could be leased/borrowed (or bought) during the initial upgrade phase as project support resources.

In this case it is even possible to make several levels of h/w consolidation with the existing hardware infrastructure and other business requirements (and our future SAP systems if bought, depending on business needs, e.g. an additional D/Q Solution Manager system, BI requirements, an additional archiving solution with SAP Content Server, etc.) - our existing SAP hardware was recently upgraded with additional RAM. So the BI test system BWQ needs just one physical node, and the other node could be used in the support line of servers (and it might even be temporarily left on only one cluster node with functionality intact). The sandbox system can be established on one of our Integrity servers (also upgraded with RAM, like SAP5 in fig. 3c). MSCS switchover tests are necessary only on the new (ECC6) ERQ servers.


If possible, borrowing servers or some RAM from our hardware vendor (HP, Belgrade) or its partners could significantly simplify this scenario. This would save a lot of time and effort and minimize the risk of these activities. Nevertheless, it is possible to achieve sufficient hardware consolidation by moving RAM modules from our existing, less loaded servers in order to improve performance on critical servers (e.g. Q5 systems for testing, D5 for the upgrade to D6), with insignificant risk of performance issues on the old development and sandbox systems. But these actions have to be planned in time and carried out with devoted attention.



Possible h/w Consolidations with SCENARIO 1B


SCENARIO 1B can be compared with SCENARIO 1A using the diagram in Fig. 3b, similar to Fig. 3a:

Figure 3b

Fig. 3b - Upgrade and Temporary Support (Production Maintenance) lines


But here D5 and Q5 are copied into D'5 and Q'5 (and P5 into the SANDBOX system), which together with P5 form the new Temporary Support (Production Maintenance) line: D'5-Q'5-P5. It is used for urgent changes in the system (so-called "fast-tracks", kept as few as possible). These additional changes need to be taken care of in the post-GoLive phase or during the upgrade. The physically same servers D5 and Q5 are then upgraded to D6 and Q6 by the upgrade script generated on the SANDBOX system. The development freeze is shorter than in SCENARIO 1A.


The bottom line here is that it is possible not to buy NEW HARDWARE for project support (no servers at least), with only potential performance issues with the new ERD (and sandbox) consolidated on one of our existing servers, which could be handled successfully. Fig. 5 below shows a diagram of the SAP System Landscape in our company with systems, transport lines and physical server names. The idea is to "borrow" two servers from the ERQ and BWQ servers, using existing clusters and leaving the temporary support line out of the cluster (which should not be an issue). Fig. 3c below gives a proposal of hardware consolidations using the existing hardware landscape and only one available additional server (SAP5, rx4640 with 1 CPU and 4G RAM). Physical servers are greyed-out, clustered nodes are grouped (MSCS, Microsoft Cluster Services):

Figure 3c

Fig 3c - Consolidation h/w maps



Better consolidation could be achieved by using MCOD for the D'5 and Q'5 systems on one server (two systems with one database). This also enables using the existing EVA Storage snapshot functionality for creating instant copies of database VLUNs while saving space (which is very important), and Oracle transportable tablespaces for binding the D5 and Q5 databases into one database. But in this scenario 8G RAM is mandatory for optimal performance. Using virtualization technology would enable more options and better consolidation in all scenarios.


All urgent changes would be treated as fast tracks (urgent corrections as covered by standard change management procedures in the existing "normal" landscape). After the production upgrade and Go Live (but before any new change requests in the upgraded landscape), these would be implemented. There are two possibilities for handling them:


  • each development in D'5 should be repeated in D6 (transports are only possible to D5, i.e. before the upgrade to D6), and transported to Q6 and P6 later


  • after the upgrade, D'5 (or a system copy of D'5) could be upgraded, and all fast tracks could be transported to D6 after that



The upgrade steps in SCENARIO 1B can, in a sketch, be compared to the steps (*) described earlier, which now need no Development Freeze in step 1, but only during the final step 5 - the upgrade of production:


  • 1-3. Same as steps (*) 2-4.
  • 4. Development Freeze (**)
  • 5. Transports from the D5/Q5 upgrade, final upgrade of P5 to P6 and GoLive
  • 6. Copying fast-tracks back to the upgraded D and Q systems.
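The difference between the two step lists can be illustrated with a small calculation. This is only a sketch: the step names and all durations below are hypothetical placeholders, not figures from the project, and the functions simply encode where the freeze starts in each scenario.

```python
# Hypothetical step durations in days (illustration only, not project data).
STEP_DAYS = {
    "upgrade_d": 10,         # D5 -> D6, corrections, unit tests
    "transports_q": 15,      # transports to Q, integration tests
    "sandbox_rehearsal": 5,  # optional test upgrade on a renewed sandbox
    "upgrade_p": 4,          # final upgrade of P5 to P6 and GoLive
}


def freeze_days_scenario_1(steps):
    """SCENARIO 1 / 1A: the freeze starts before the D upgrade, so every step counts."""
    return sum(steps.values())


def freeze_days_scenario_1b(steps):
    """SCENARIO 1B: the freeze is declared only for the final production upgrade."""
    return steps["upgrade_p"]


if __name__ == "__main__":
    print("Scenario 1 / 1A freeze:", freeze_days_scenario_1(STEP_DAYS), "days")
    print("Scenario 1B freeze:    ", freeze_days_scenario_1b(STEP_DAYS), "days")
```

With these placeholder numbers the freeze shrinks from the sum of all steps to the production upgrade alone; the price is the fast-track changes that have to be carried over afterwards.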


Hardware Requirements and Sizing Estimations Concerning ECC6 Upgrade


In all these scenarios I find that the pure technical upgrade of ECC systems can be estimated as follows (per Note 901070, SAP benchmark information, and Oracle documents, all given in references) - by upgrading from ECC5 to ECC6 (and from Oracle 9.2 to 10.2) we have only a slight increase in the need for h/w resources:


System Component

0-5% / insignificant

0-5% / insignificant



Checking the necessary disk space and datafile consumption against the guide is necessary. Some indications in references show that it is possible to have up to 25% DB growth during the upgrade, but this is not an issue for the ERP system as it already has more than 50% free space in the database files, and the sandbox system didn't show such behaviour.
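The headroom argument above is simple arithmetic, sketched here under the assumption that the 25% growth applies to the used part of the datafiles. The function names are made up for this illustration.

```python
def free_fraction_after_growth(used_fraction, growth=0.25):
    """Free share of the allocated datafiles after the used part grows by `growth`."""
    return 1.0 - used_fraction * (1.0 + growth)


def max_used_fraction(growth=0.25):
    """Largest used share that still fits after the given growth."""
    return 1.0 / (1.0 + growth)


if __name__ == "__main__":
    # Worst case for "more than 50% free": exactly half of the space is used.
    print(free_fraction_after_growth(0.5))   # free share left after 25% growth
    print(max_used_fraction(0.25))           # fill level at which 25% growth no longer fits
```

So even in the worst case of exactly 50% used space, a 25% growth still leaves more than a third of the datafiles free.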


The only important issue in all these scenarios is the currently available Storage space for the Temporary Support hardware (in any scenario, but especially for SCENARIO 1A) and during project implementation. With some additional effort and incoming resources for backup disks this might be solved, too, but it needs to be carefully estimated. In any case, Storage space has to be well planned for the future.


Hardware Sizing and Other Project Requirements - Quicksizing


On the other hand, system sizing and Storage space are also significantly influenced by the sizing produced by the Rollout & Optimization activities and the corresponding parts of the project. This has to be discussed separately: BC needs input from each team about the estimated number of users and their types, significant changes in system usage (numbers of documents generated, etc.) and other information needed for the quicksizing utility, which should be part of the regular project business blueprint activities. These are planned to run until 01.09.2008, but we need this info ASAP in order to inform our h/w vendor (HP) in time and to place orders as needed. Timelines are still not clear. I expect that it would be mostly about additional RAM, CPUs and Storage, and optionally about borrowing servers. HP also offered us help from its SAP experts at least on this topic, which could be very useful for the whole project.



The BI hardware landscape and system sizing for its upgrade are unknown in terms of project scope (to be defined in the project scope). For Solution Manager, the company has requirements for only one upgraded system (as seen from workshops, a SolMan landscape with development and production is not in the scope of this project). Using Web GUI for BI and Java instances for BI should also be considered in system sizing.


Borrowing servers as temporary resources and support during the project is also a good option. It mainly influences the project timeline, and the servers are not important after GoLive if no User Requirements emerge during BBP (e.g. BI systems and Solution Manager usage).






[1] Road Map V3.2 (available in Solution Manager, or Upgrade media)

[2] http://www.thespot4sap.com/upgrade_guide_v2.pdf

[3] http://www.sap-press.de/download/dateien/1324/sappress_netweaver_application_server.pdf

[4] ADM326 Upgrade Course

[5] http://www.sapdb.info/sap-ecc6-upgrade/

[6] Upgrade Master (service.sap.com/instguides)

[7] http://service.sap.com/benchmark

[9] Keter case study: Keter_ECC6_Upgrade_Presentation_170107.ppt

[10] http://service.sap.com/quicksize

[11] http://service.sap.com/upgrade

SAP, Linux, Virtualization - Itanium ... continued


I started this topic in the post SAP, Linux, Virtualization and - Itanium ..., and I now have some new findings and results. Moreover, I have broadened the referenced categories and audience, as it might also be interesting to people interested in SAP with virtualization and in moving to Linux (both from Windows and Unix).

I have decided to make some approximate tests and quickly improvised benchmarks with the virtualized and bare metal SAP systems I have. My goal was a comparison with similar bare metal systems and different platforms, not exact results comparable with official tests. I am interested in making further inquiries and I welcome any suggestions and recommendations, both about benchmarking and about this environment in general. Sometimes it is very difficult to draw meaningful interpretations from specific and formal benchmark results. General look and feel is maybe not the only important thing, and it is not an exact measure, but it is certainly very important. Here is the data ...

h2. Some benchmarks

Tests were made on SAP systems that are homogeneous copies of the production system ERP / ECC6.0 Ehp3 with a 1.1T Oracle db, with 113499 chosen objects for SGEN. I have three systems, with the system IDs given here:

    • ERM: central system, installed on HVM based on host with new 4p 1.6GHz Montvale CPUs (2 cores, 2 threads each), 4 VCPUs, 6G RAM, Windows Server 2003 SP2 EE, latest updates

    • ERC: central system, installed on PVM based on the same host with new 4p 1.6GHz Montvale CPUs, 4 VCPUs, 6G RAM, RHEL5.4 with latest patches

h3. General observations

SGEN load running time on ERC (about 8-9 hours) was similar to or even slightly faster than on ERP, while on ERM it ran more than twice as long, and the same was true of database statistics. A long-running job for closing the financial period, which lasts about 2.5 hours on ERP, runs about 4 hours on ERM. I got the wildest difference with the FS10N transaction (I measured the minimum time needed over several attempts for three stages with the same parameters: initial summary generation, first dialogue about all items, and the final report):

| System | 1st stage | 2nd stage | 3rd stage |
| ERP | 1 second | 2 seconds | 21 seconds |
| ERC | 1 second | 7 seconds | 1 minute |
| ERM | 10 seconds | 45 seconds | 30 minutes |
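To put the FS10N timings above on one scale, they can be converted to seconds and divided by the ERP (bare metal) values. The numbers are the ones reported above; the dictionary layout and function name are only an illustration.

```python
# FS10N stage timings from the table above, converted to seconds.
FS10N_SECONDS = {
    "ERP": (1, 2, 21),
    "ERC": (1, 7, 60),
    "ERM": (10, 45, 1800),
}


def slowdown(system, baseline="ERP", timings=FS10N_SECONDS):
    """Per-stage slowdown factor of `system` relative to `baseline`."""
    return tuple(t / b for t, b in zip(timings[system], timings[baseline]))


if __name__ == "__main__":
    for name in ("ERC", "ERM"):
        factors = ", ".join(f"{f:.1f}x" for f in slowdown(name))
        print(f"{name} vs ERP: {factors}")
```

Read this way, the PVM guest (ERC) stays within a small factor of bare metal, while the HVM guest (ERM) is slower by one to two orders of magnitude in the later stages.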

I know that what I offer here is probably not the best way to prove ideas about performance, but it is what I could do in the shortest time, trying to gain a general overall impression of these systems. I would like to hear other people's experience or opinions, or even better some practical suggestions about these tests and systems.

h3. Few words on disk I/O

 For a start, I want to depict one of the most important indicators of both database performance and virtualization - the guest disk I/O. I have intentionally avoided network I/O as it is less critical: I got quite similar performance while testing Copy/Paste in bare metal Windows and smbclient on the PVM. There are many test tools (fio, iozone, bonnie, bonnie++, lmbench, unixbench, netperf, ...) and more formal approaches, but here I have used simple Copy/Paste time in Windows, or just cp or dd on Linux (averaged over a few tests, with different files in order to avoid cache buffering), and it is definitely not an exact science. The first test was with about 500MB or so of small files (Program Files directory), the second with 800MB of 110-120KB files, and the third was based on much larger files (datafiles up to 1GB) - the 1.1T database was copied in 9 hours on the PVM using cp (both SAN vluns presented on the same storage, ntfs-3g and ext3), and dd was similar, which was close to bare metal ERP.

| Test | ERM | ERP | VMWare | ERC |
| diff. small files | 30MB/s / / |
| 1x 110-120KB | 25MB/s | 62MB/s | 26.6MB/s | 66MB/s |
| 1GB | / | 53MB/s | / | 35.6MB/s |
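The copy-and-time method described above can be sketched in a few lines. This is a rough illustration, not the procedure actually used: the helper names are made up, the demo writes a small temporary file, and unlike the tests above it does nothing to avoid the page cache. The first helper also reproduces the arithmetic for the large copy, assuming "1.1T" means 1.1 TiB and the rate is expressed in MiB/s.

```python
import os
import shutil
import tempfile
import time


def throughput_mib_s(num_bytes, seconds):
    """Average throughput in MiB/s for num_bytes moved in the given time."""
    return num_bytes / seconds / 2**20


def timed_copy(src, dst):
    """Copy src to dst and return the observed throughput in MiB/s."""
    start = time.perf_counter()
    shutil.copyfile(src, dst)
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return throughput_mib_s(os.path.getsize(src), elapsed)


def demo_copy(size_mib=8):
    """Create a throwaway file of size_mib MiB and time one copy of it."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as handle:
            handle.write(os.urandom(size_mib * 2**20))
        return timed_copy(src, dst)


if __name__ == "__main__":
    print(f"small local copy: {demo_copy():.1f} MiB/s (page cache included)")
    # 1.1 TiB copied in 9 hours, as reported for the database copy on the PVM.
    print(f"database copy:    {throughput_mib_s(1.1 * 2**40, 9 * 3600):.1f} MiB/s")
```

Under that assumption the 9-hour database copy works out to about 35.6 MiB/s, which agrees with the ERC figure for large files in the table.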


 I still don't have all the figures, but I have learned that a VMWare ESX4.0 physical drive (SAN lun as raw drive using a virtual LSI scsi adapter) performs on the same storage similarly to a Xen HVM physical drive (also SAN lun) with larger files. Generally, these results do not give the whole picture, except that HVMs have poor I/O (pity that SIOEMU domains are not available) compared to PVM and bare metal. They show that disk I/O bandwidth on PVMs, compared to HVMs, can sometimes be as good as on bare metal, with consistent results for files of 100MB or less in size, and better but inconsistent results with much larger files - while latency remains poor. On forums I saw that people seem to get consistent results if the file size is less than 100MB, but very inconsistent ones with file sizes > 1GB.

h3. Some DB related results


Here I show the basic settings and statistics for the database of each system (as a reminder, the ERP db node has no SAP instance on it, it is just a db node in a cluster), with an ad hoc test which executes a simple query (with 259480 records in the table and "set timing on" in sqlplus) and takes the minimum value (the first execution usually takes more time, mostly due to parsing and a cold buffer). ERP is probably tuned better, but the other systems are sized and set according to their resources, at least for the most important parameters (including a clean situation in ST02 for ABAP, but I have omitted to correct the db cache for ERC).

| System | ERP | ERM | ERC |
| RAM | 24G | 6G | 6G |
| sga_max_size | 12G | 3.5G | 4G |
| pga_aggregate_target | 2G | 800M | 850M |
| shared_pool_size | 1.2G | 832M | 400M |
| PHYS_MEMSIZE (abap) | 14G | 3G | 2.5G |
| db cache (buffer) | 6G | 2G | 1.1G |
| data buffer quality | 99.5% | 91.3% | 96.6% |
| DD cache quality | 95.9% | 98.8% | 90.7% |
| select count(*) from dba_objects | 3.35sec | 2.35sec | 1.58sec |
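The "run it several times and keep the minimum" method used for the query timing above can be sketched as follows. This is only a stand-in: it uses an in-memory SQLite table instead of Oracle and sqlplus, and the table name, row count and helper names are assumptions made for the illustration.

```python
import sqlite3
import time


def min_query_time(conn, sql, runs=5):
    """Run sql several times; return (fastest time, all timings) in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql).fetchall()
        timings.append(time.perf_counter() - start)
    # The minimum discards warm-up effects such as parsing and cold buffers.
    return min(timings), timings


def demo_query(rows=10000):
    """Build a throwaway table and time a COUNT(*) over it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO objects (name) VALUES (?)",
        ((f"obj{i}",) for i in range(rows)),
    )
    result = min_query_time(conn, "SELECT COUNT(*) FROM objects")
    conn.close()
    return result


if __name__ == "__main__":
    best, all_runs = demo_query()
    print(f"best of {len(all_runs)} runs: {best * 1000:.3f} ms")
```

Taking the minimum rather than the mean is a deliberate choice here: on a system with other load, the fastest run is the one least disturbed by it.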


 These values depend on system load and user activity, so they can show ambiguous results without proper conditions (at least running long enough with a similar load). I have made ad hoc tests using the SE30 transaction's Tips & Tricks (one of my favorites), which show slightly better results on HVMs in some cases. There are some template queries and snippets there which can be executed and measured in microseconds (each having two variants, one on the left side, another on the right side of the screen). I have executed some of them during regular system load, each a few times (up to 5-10 times), and took the lowest (minimum) value on each system (which is also very close to the average):

SE30 (microseconds, lowest value per each system)

 Test / ERP / ERM / ERC (sub-columns: similar, main, repeated per system)
 Select aggregates: 8383 434 7480 457 6789 447 6937436
 Select with view: 167610 19926 342670 16090 190347 128948825
 using subqueries: 644 583 576 487 489615
 Internal tables
 Comparing int. tables: 461 43 306 25 407 27411
 Supply/Demand vs. select: 466 464 502 511 487 4899781703
 Type: total db total db total db/
 DIALOG: 323 891253
 BACKGROUND: 133075070

 I have also given data for a BI / SEM IDES system BIS (based on NW7.0 ABAP+Java stack), running on a PVM with the same virtual resources as ERC - the only difference is that its db is 0.13T in size (while our production BI system with EP and Java stack is also about 1T in size).

 Finally, ST03N statistics are useful for system tuning, but very elusive for benchmarking because they also depend heavily on user activity, which can vary a lot (and I still don't have good data - I've used here a weekend night with a similarly low number of users, but I am not satisfied with that).

h1. Conclusions


 PVM guests have better performance in general (as expected), but the database should not be on a virtual machine if maximum performance is needed without some scaling out (otherwise it depends, e.g. on using Oracle RAC). Booting a PVM guest takes less than a minute, while Windows on HVM takes minutes to become available and accessible through RDP. There are many additional benefits beside performance with PVM (high availability above all, and licensing: the RHEL advanced platform support subscription allows an unlimited number of guests per host with an unlimited number of sockets at a fraction of the price of other vendors) - therefore, the migration path from Windows to Linux on Itanium is highly recommended and more justified than any other. I would need to do more thorough testing for detailed conclusions (comparison with HP-UX above all), but I am now confident that this platform (Xen on RHEL5.4) is very stable and gives predictable and usable performance, and more - it is supported by SAP (and somewhat by Oracle) and HP as a commercial platform with good support. There aren't many licensing benefits coming from virtualization except eliminating Windows or virtualization licenses, while only HP-UX offers some level of licensing consolidation with VSE and Capacity-On-Demand supported by Oracle (dynamic CPU resource usage, but this makes sense only on big Superdomes with a large number of CPUs and occasional extremely high peaks).


 A few thoughts about the future of Itanium: up to now Intel has had almost the same income share from ia64 as the RISC competition, and this trend will continue in the future (Itaniums will now use the same motherboard chipsets as new Xeons). Software vendors and other hardware vendors might have a different perspective. There aren't any other signs I can follow from Intel and HP, and I am waiting for the latest TPC-C / SAP sd2tier w/SAPS and TPC-H benchmarks with BladeSystem infrastructure (not many at the moment), which follows a comparably good price/performance ratio. I also expect more official benchmark results with virtualization (VMWare at least, if not Xen) and scaling out (Oracle RAC / VM). Software vendors give different and sometimes confusing signals (SAP is very clear, Itanium is to stay - no change ahead so far), while other big hardware vendors don't offer real alternatives AFAIK. If it is about a critical business environment and not about the best price/performance ratio or HPC, there is no good reason to change the CPU architecture to anything other than Itanium. If it is about consolidation and virtualization while keeping the existing hardware architecture and critical business alongside, there are many options, but all come down to expensive HP-UX (justified for those with the highest demands), or Linux with Xen (Red Hat or maybe Suse). Otherwise, low risk, flexibility and Windows together, even on another architecture - cannot be justified.

h1. P2V and V2P migration

 Physical to virtual migration (P2V) using SAN storage in this environment is more or less very simple - no conversion tools are necessary if vluns are used as physical (raw) drives. This approach has benefits not just in performance terms; it is also good if you need a disaster recovery scenario which involves moving to a previously prepared physical machine to which these vluns can easily be presented. The P2V procedure is based on Microsoft kb 314082 article, and it is mostly about MergeIDE.reg (as found in that article), which should be imported into the physical instance before migration (otherwise a BSoD is inevitable because IDE drivers are not present). I have also copied (just in case) system32\drivers\pciidex.sys and windows\inf\mshdc.inf from a working HVM Windows guest to the physical instance - and that's it, piece of cake. The usual import of NVRAM with nvrboot might be needed, too. V2P migration is similar, and if some specific drivers other than the EFI boot driver or those in the HP PSP are needed, %SystemRoot%\Setupapi.log on a target machine should be

h1. Host clone

 I have to mention this - as part of a disaster recovery scenario, testing or deployment, using shared SAN storage makes life much easier. Not only can guest systems be easily deployed (including the usual methods with sysprep on Windows) and cloned from template installations - RHEL server instances as hosts can be cloned easily, too. After making a copy (snapshot, snapclone, mirror) of the system disk and presenting it to a new host, it is only necessary to change the following (as a precaution I boot from the installation CD virtual media with linux mpath rescue and work under /mnt/sysimage/..., but that is not necessary):


    1. host name in: /etc/sysconfig/network

    2. IP address in: /etc/sysconfig/network-scripts/ifcfg-eth0
 Those are the only host-dependent settings (if kept like that), but one can make specific hosts where additional changes might be needed.
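The two edits listed above can be sketched as a small text transformation. This is an illustration only: it works on file contents passed in as strings rather than on the real files, it assumes the usual KEY=value layout with HOSTNAME and IPADDR keys, and the host names and addresses are made-up examples.

```python
import re


def set_key(text, key, value):
    """Replace the KEY=... line in text, or append it if the key is absent."""
    line = f"{key}={value}"
    pattern = re.compile(rf"^{re.escape(key)}=.*$", re.MULTILINE)
    if pattern.search(text):
        # A function replacement avoids backslash interpretation in the value.
        return pattern.sub(lambda match: line, text)
    if text and not text.endswith("\n"):
        text += "\n"
    return text + line + "\n"


if __name__ == "__main__":
    # Example contents of /etc/sysconfig/network on the cloned system disk.
    network = "NETWORKING=yes\nHOSTNAME=oldhost.example.com\n"
    # Example contents of /etc/sysconfig/network-scripts/ifcfg-eth0.
    ifcfg = "DEVICE=eth0\nBOOTPROTO=static\nIPADDR=192.0.2.10\n"
    print(set_key(network, "HOSTNAME", "newhost.example.com"), end="")
    print(set_key(ifcfg, "IPADDR", "192.0.2.20"), end="")
```

When booted into rescue mode as described, the same edits would be applied to the copies under /mnt/sysimage/ instead of the live files.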


h1. Additional information

Here are some important facts I have omitted to write in the previous post:


    1.  VERY IMPORTANT: just as with MSCS or hosts with boot from SAN, it is necessary to set disk timeout and IDE behavior (Xen uses exclusively IDE on HVM):

        HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E96A-E325-11CE-BFC1-08002BE10318}\0001
        HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Class\{4D36E96A-E325-11CE-BFC1-08002BE10318}\0002
        ResetErrorCountersOnSuccess = 1 (DWORD)

      ... and also, the most important thing is to set disk timeout:

        HKLM\System\CurrentControlSet\Services\Disk\TimeOutValue = 78 (DWORD, hex; 120 decimal)

      Without this, Windows guests might become unstable from time to time (other settings here are also important).

      - for old HP EVA storages (with A/P (active/passive) paths, e.g. EVA 5000 with firmware older than VCS4.00), it is only necessary to have defined for each multipath (instead of the other parameters in the multipath example for A/P storages which I have given, beside wwid and alias):

          prio_callout         "mpath_prio_hp_sw /dev/%n"
          path_checker         hp_sw
          rr_weight            priorities

... and as the defaults section in multipath.conf is set for newer A/A (active/active) paths, it is necessary to use old paths with these parameters explicitly set. It is also recommended to use preferred paths on old storages.

    1. some general recommendations about moving guests manually (from host to host, migration is unsupported on Itanium - as told, but this doesn't mean it is not possible to have it, at least manually / using scripts):

      - before making changes to storage vlun, flushing buffer cache and invalidation on the host is needed to make host aware of them (e.g. putting virtual machine down on a host, doing restore on the storage or making changes on a different remote host - then after that comes this cache dropping before bringing the virtual machine up again - otherwise, data corruption and inconsistencies are expected):

   sync
   echo 1 > /proc/sys/vm/drop_caches

This is not needed if the guest is being brought up for the first time since the last reboot of that host or if using GFS / Cluster Suite. If using advanced disk caching on Windows guests, it would be also good to do disk cache flushing on the guest before making storage snapshot (beside making applications prepared for the backup) with a useful tool:

  Sysinternals - sync

- about virtual EFI issue: if you expect many interventions in guest EFI during boot, it is recommended to use dom0_max_vcpus=4 in elilo.conf and (dom0-cpus 4) in /etc/xen/xend-config.sxp, and usually to give about 10% of the physical memory to Dom0 (and even 1G more, depending on the number of active guest systems, if swapping is evident - e.g. shown by free), from my experience - for example, for 32G RAM this would be used (with (dom0-min-mem 3200) in xend-config.sxp to avoid memory ballooning):

   append="dom0_max_vcpus=4 dom0_mem=3200M -- quiet rhgb"

... otherwise, 2G of RAM for Dom0 is sufficient (and even smaller number of vcpus, depending on number of HVM guests and virt-manager responsiveness - they spend resources for I/O on Dom0, which can be addressed by crediting additionally - all these settings do not change performance of DomUs significantly which remains completely stable).

- similar to an OS kernel scheduler, Xen schedules cpu resources to guests based on their relative weight (default 256 for all domains - a domain with twice the weight gets twice the resources), for example (boosting Dom0):

    xm sched-credit -d Domain-0 -w 1024

- there is a way to partition available CPUs and avoid unnecessary context switching between different physical CPUs by setting the cpus parameter in the guest configuration file (in /etc/xen). For instance, BL870c has 4 CPUs, each with two cores, each core with two threads (can be set in EFI shell with cpuconfig threads on) - these are numbered by Xen as cpus 0-15 (the first cpus in this list are used for Dom0) - so, if a guest is based only on the second CPU, it should include cpus="4-7" in its configuration file.
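The three rules of thumb in the notes above (Dom0 memory, scheduler weights, cpu ranges) can be written down as small helpers. This is a sketch under stated assumptions: the function names are made up, the memory rule counts 1G as 1000M so that 32G gives the 3200M used above, the weight shares only hold when all domains compete for CPU, and the cpu numbering is assumed to run contiguously socket by socket as in the BL870c example.

```python
def dom0_mem_mb(physical_gb, fraction=0.10):
    """About 10% of physical memory for Dom0, in MB (32G -> 3200M)."""
    return int(round(physical_gb * 1000 * fraction))


def weight_shares(weights):
    """Relative CPU share per domain under full contention, from scheduler weights."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def cpus_for_socket(socket_index, cores_per_socket=2, threads_per_core=2):
    """Range string for the cpus parameter covering one physical CPU (0-based index)."""
    per_socket = cores_per_socket * threads_per_core
    first = socket_index * per_socket
    return f"{first}-{first + per_socket - 1}"


if __name__ == "__main__":
    mem = dom0_mem_mb(32)
    print(f'append="dom0_max_vcpus=4 dom0_mem={mem}M -- quiet rhgb"')
    print(f"(dom0-min-mem {mem})")
    # Dom0 boosted to 1024, two guests left at the default weight of 256.
    print(weight_shares({"Domain-0": 1024, "guest1": 256, "guest2": 256}))
    # Second physical CPU of a 4 x 2 cores x 2 threads machine.
    print(f'cpus="{cpus_for_socket(1)}"')
```

For the example machine this reproduces the dom0_mem=3200M value and the cpus="4-7" range given in the text; the share calculation only shows how the boosted Dom0 weight compares with default-weight guests.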

- another important EFI setting is MPS optimization (performance optimization with maximum PCIe payload), which is only available on HP-UX, OpenVMS and Linux (not supported on Windows):

    ioconfig mps_optimize on

... and in the EFI shell there is also a drvcfg -s command which should be used in some environments (depending on the OS, though I didn't notice an important change with this setting); or, using the drivers command to extract driverNr deviceNr pairs, issue it manually (for a pair 27 2B here):

drvcfg -s 27 2B

... and then (similar to EBSU setup), through given FC driver menu driven setup (options, example: 4, 6,0, 11, 12, set OS mode, back to 4, 5, 0, 11), reboot server with RS command from MP.

- the mandatory acpiconfig setting for cell based (NUMA) rxNN20 and similar Integrity servers is single-pci-domain; for newer ones (rxNN40 and newer) it is default (instead of windows):

    acpiconfig default

- in the HVM guest (virtual) EFI environment I have used the following settings in NVRAM for OsLoadOptions (the first disables DEP, which sometimes caused serious problems on MSCS bare metal machines, and /novesa proved useful on some newer machines, not just during Windows setup) - an example of how to set this in the EFI shell (before making the guest bootable) is:

      map -r
      cd MSutil
      nvrboot

... and then the I option (import), choosing 1 for the boot option and 2 for the parameter line:

    OsLoadOptions=/noexecute=alwaysoff /redirect /novesa

- about installation - I omitted to mention that during host or guest installation (which is generally needed only once per site / per template), the default Virtualization and Development software groups are sufficient, though after browsing more thoroughly through the Oracle and SAP prerequisites I found that some additional packages were missing: sysstat, compat-openldap, libXp, 'Legacy Software Development', and an rpm in saplocales_rhel5_ ia64_version-2.zip found in Note 1048303
    h2. To Do ...


     I find very few interesting things about this subject that are left somewhat unknown or untested, and they are mostly not important. Above all, I am looking for features that we use every day with SAP on Windows which should be mapped and migrated to RHEL - AFAIK, there isn't one single such feature that is not available on RHEL, but there are important features (like virtualization with IA64) which are not available on Windows ...

      1. I haven't checked available HA scenarios - it is probably a variant of Netweaver failover, and I would prefer to see behaviour of cluster on guest systems

      2. SSO, SNC login with Front End clients on Windows having RHEL server ? Yes, in a way, having SPNego and Kerberos instead of SNC - and this goes also for Java AS / Enterprise Portal and ABAP. I know that it is possible to have SPNego with Kerberos (harder option in security terms, safer than SNC in general even on Windows GUI, fully supported by Microsoft, SAP and RHEL), but this should be tested. Great SDN blogs and SAP Notes available about this:

    SPNego - Wiki
    Configuring and troubleshooting SPNego -- Part 2
    Configuring SPNego with ABAP datasource -- Part 2
    Note 968191 - SPNego: Central Note
    Note 994791 – Wizard-based SPNego configuration
    Note 1082560 - SAP AS Java can not start after running SPNego wizard
    Using Logon Tickets on Microsoft Based Web Applications

    I am currently using SSO with existing Windows domain authentication via SNC for Windows Front End, and Smart Card authentication via PKI and client certificate authentication for WebGUI and Enterprise Portal. SPNego authentication fully supports all these scenarios in a RHEL server environment. In the simplest form, if the SAP user account is the same as in Windows ADS it passes through SSO; otherwise, username/password is available, or special handling (mapping) of accounts.

      1. comparing distributed against single server systems (database not virtualized), and other interesting combinations which I didn't cover here (including performance scaling with number of (v)cpus), moving more of the existing physical systems to virtual, testing backup and restore scenarios - experience with Data Protector 6.11 on RHEL is excellent: everything worked from the start (I had difficult times on Windows with SAP integration and other things in the past), and there are really many possibilities (at the moment I do guest backup using physical image backup on the host through SAN on tapes or ZDB - works very stable, fast and reliable).
    h1. References


        SAP on Linux   
        Setting the TimeOutValue registry for Windows 2000 or 2003
        Technet - disk timeout
        Oracle Metalink note ID 563608.1: Oracle SLES / Xen is supported
        XEN - SIOEMU Domain
        SAP Note 1122387 – Linux: Supported Virtualization technologies with SAP
        SAP Note 171356 – Virtualization on Linux: Essential information
        SAP Note 1400911 - Linux: SAP on KVM - Kernel-based Virtual Machine
        Virtualbox - MergeIDE
        Microsoft kb article about booting IDE device (P2V): kb314982
        HTTP-based Cross-Platform Authentication via the Negotiate Protocol
        SSO with logon tickets
        about SSO on help.sap.com
        SAP, Itanium, Linux and virtualization ...
        IDG press release
        Wikipedia - Tukwila





    h1. *Introduction*


    I have based this article mainly on a proposal I created last summer after finishing tests on prototype systems (a detailed document is available on my LinkedIn profile); additional tests and unofficial benchmarking data will follow later in new posts. This text is mainly intended for those who own SAP on Itanium and Windows, and/or for those interested in SAP, Itanium and virtualization. Many things have happened in the meantime, one of which is Red Hat's decision (by the end of last December) to discontinue support for Itanium (or IPF, Intel's Itanium Processor Family) in the announced future release RHEL 6. This was not so unexpected, and RH states that this decision was made because of the small number of Itanium systems sold in the recent period, a number which does not tend to grow. But this story is not likely to end that fast: RH is fully supporting Itanium on RHEL 5 to the end of its life in March 2014, while giving extended support until year 2017 through OEMs. Meanwhile Oracle, which bought Sun with its SPARC architecture and other hardware (there was a failed attempt to port Solaris to Itanium), entered the Xen advisory board (http://xenbits.xensource.com) with people involved in Open Source contributions, Oracle Enterprise Linux and Xen IA64 (Itanium) ports. RHEL 6 is to be supported on IBM's POWER architecture, and so are other "traditional" RISC/CISC CPU architectures, x86 and x86_64 (http://kbase.redhat.com/faq/docs/DOC-23403). There are no clear indications from other software and hardware vendors about Itanium's future (at least not from the ones I deal with here and whose support I have: HP, SAP and Microsoft), but there are current benchmarking results and research studies with predictions.


    Back in 2003 my
    company bought HP EVA 5000 storage and OpenVMS cluster with two Alpha
    ES47 nodes to support Oracle 9i RAC database for our legacy
    application, along with a rack of Xeon based Proliants (it's a kind
    of tradition, before them we had Alpha and VAX systems). A year after
    or more, it was decided that we should migrate to SAP. It was an
    ambitious project: we had 9-12 months from start to going live,
    implementing BW3.5, ERP 2004 with MM, WM (having MFC automated
    warehouse), FI/CO, SD, PP, QM, PM, TM, logistics and procurement, etc
    (maybe I have omitted something), everything that we had in the
    legacy Oracle application but HR, CRM and some remote locations
    (which were covered in later roll-out and optimization projects). We
    also implemented Solution Manager as project base and maintenance
    base, having all the necessary strict Q&A and GMP procedures. All
    this had to be supported with appropriate hardware and
    infrastructure. The first choice (among others) was HP's offer
    with 10 Integrity (Itanium based) servers and HP-UX. Somewhere in the
    early beginnings, our project management decided to change OS
    platform to - Windows !!! Today, five years later, we have at least
    twice larger system (1TB database in size in production, number of
    users - up to 800 with expected growth, additional affiliate
    production plants, automated warehouses, end users, etc) working
    quite well with only minor changes (added RAM up to 16-24G in
    production, same number of CPUS: 2x mx2 1.1GHz per node), and some
    infrastructure improvements. We never had any serious unplanned
    downtime or system failures, performance, reliability, availability
    and stability was predictable (apart from some OS problems with
    Microsoft MSCS and one short storage outage).


    But, in the past
    year we had a large roll-out and optimization project involving ERP
    upgrade from ECC5.0 to ECC6.0, and it showed that project members,
    developers, IT staff and key users always needed (and still need)
    additional sandbox systems, IDES demo systems, system copies and
    similar for different purposes (from experimenting and unofficial
    testing, to training and Q&A, project preparation and operation,
    etc). This is showing the need for new level of flexibility which
    could be only obtained with virtualization. There were many options,
    from buying new physical machines (very expensive even for less
    expensive architectures), manual consolidation (MCOS and MCOD, with
    or without Microsoft WSRM - Windows System Resource Manager), thin OS
    (single kernel) virtualization (like VZ, Parallels Virtuozzo, but as
    we have strict security standards and procedures which conflict with
    OS patch level for VZ, this wasn't convenient even for testing), to
    full virtualization platforms. It turns out that with Itanium you can
    only have HP-UX VSE / Itanium VM or RHEL if you want official SAP
    support (as explained later). There is also a very interesting and
    sophisticated SAP solution - Adaptive Controller, but for now, Xen on
    RHEL is doing a very good job. I have two HP BL870c hosts with 6
    active sandbox systems (homogeneous copies of ERP development, Q&A
    and production, BI/SEM IDES with Java instance, ERP IDES, etc) and
    they all work properly and are very stable. From a performance standpoint,
    HP-UX with Windows guests doesn't offer more at all (without AVIO
    drivers, similar to PV guest drivers on Xen which don't exist only
    for Itanium), the only thing I miss at the moment is Live Migration
    on RHEL/Xen (which Integrity VM supports, as it also has some other
    nice-to-have features). But I am able to move virtual machines to
    different hosts manually without a problem (multipathing and shared
    storage do their work).


    SAP supports
    virtual guests, without responsibility about their performance, and
    database on production systems should be on physical servers if
    maximum performance is needed - otherwise, there aren't important
    reasons not to do it even with database. We use Oracle 10g at the
    moment (on Windows), which works fine on guest systems - but
    because of Windows, it has to be HVM, full virtualization. For
    optimum performance (and many other good reasons), the best option
    would be migration to RHEL, including production systems (all the
    tests show that it would be a smooth transition). There is a general
    trend about migration of SAP systems from Unix to Linux (for all the
    good reasons), while migration from Windows is less popular. Thing is
    that 60% of all recently sold Itanium servers are Unix (read: HP-UX),
    35% are Linux (RHEL and SLES), and only 5% are Windows
    (first link among references).
    Microsoft has no intention to introduce Hyper-V on Itanium (as Citrix
    also doesn't currently, because the Xen code branch they bought
    didn't cover ia64), but it is supported in all important flavours on
    Windows Server 2008. Migration to a different hardware platform is
    not an option, just as migration to HP-UX at the moment (current
    number of Itanium servers does not justify that risk and expense).
    But, the old aging hardware should be either upgraded (but it costs –
    an additional CPU costs almost as much as one BL860c, and though that is partly
    a consequence of the local vendor's clumsiness, it is not very
    reasonable), or replaced with new servers – or, used for
    non-production (until dead), which aligns very well with
    virtualization (RHEL/Xen paravirtualization is available for old
    Madison cores, but HVM isn't). There are better reasons to migrate to
    RHEL beside this (money saving driver) – decision makers can decide
    to change CPU architecture at one point, but with RHEL we can already
    have the level of flexibility we need – far better than in any
    other option: HP-UX is too expensive, and Windows doesn't offer it.
    One important aspect is coming from GMP and other compliance issues,
    so we need same platform both on test (Q&A) systems and in
    production, and this justifies even more migration to RHEL5.


    The starting point was: SAP ERP and BI systems on Windows Server 2003
    on HP Integrity servers and HP EVA storage virtualization solution, and the current stop is: some of those
    systems working on Xen fully virtualized guest machines on Windows
    Server 2003 and on HP Integrity blade servers, and some working on
    paravirtualized guest machines on RHEL5. Next stop would be more
    systems involved, and finally – everything migrated to RHEL, either
    on Xen virtual machines or bare metal. Also, I find that Integrity blade servers are far more affordable (comparable to Xeon based Proliants) than expensive mid-range cell based and other Integrity flavours.




    environment and Itanium platform*

    Currently, in the environment I am describing here, IPF (Itanium Processor
    Family) architecture is being used exclusively for the needs of SAP
    systems in my company. There are 18 such servers at the moment (10x HP rx4640 + 2x
    rx2620 with Madison cores, 2x rx640 and 2x BL860c with Itanium
    dual-core Montecito cores, 2x BL870c with Montvale cores), beside
    additional x86 HP Proliant DL360 G4/5 servers (4x without additional
    servers for support SAP routers and Web Dispatchers which are in
    Microsoft Cluster for production just as SAP central services on
    production and test systems, or on VMWare platform for support and
    other purposes). Beside VMWare as main VTI for x86/x86_64 processor
    families, HP EVA Storage nodes are used as form of FC SAN Storage
    virtualization (also using MSA2012 for VMWare, additional EVA 8400
    obtained, but not yet ready for production).


    SAP landscape currently consists of:

      • BI systems (development, test (Q&A) and production), BI7.0 SP18

      • CEN system (transport domain controller, CUA, partly central
        monitoring), NW04

      • Solution Manager (MOPZ, EWA, monitoring, project roadmaps, ticketing, and

      • SAP routers (for end users, and for support and external access)

      • SAP Web Dispatchers (BI portal, WebGUI)

      • network printing servers with SAPSprint, SAP Web Console for some RF

      • different sandbox and other systems: homogeneous system copies, training
        systems, IDES systems, etc. - all working as Xen guests at the

    h2. *Equipment has a lifetime (it is aging)*

    Among these 18 servers, 12 servers are entering
    5th year of usage: rx4640/2620 servers on which lies the main SAP
    landscape (it consists of ERD/Q/P, BWD/Q/P systems). All these
    servers are very reliable and stable – there never was any serious
    hardware failure or issue on production servers (or even any incident
    as a consequence) until this day ! But, with hardware aging, support
    becomes more expensive, additional components or spare parts also
    become very expensive, and also vendor desupport dates are getting
    closer and closer for equipment and different functionalities (usual
    practice is to have maintenance renewal periods and IT equipment
    amortization within 3-5 years, but sometimes it makes sense to extend
    it). There are at least two possible roads: continued use and
    planning about Itanium platform, or migration to another processor
    architecture (as mentioned). If such migration is planned, then all
    other technological aspects and existing options must be taken into
    account (some less demanding and ”painful”, some are not, but
    offer different advantages and overall total cost – just as
    changing the OS platform, changing the server vendor, storage vendor,
    etc. should be considered as well). A trade-in or amortization model
    has not been practice up to now, as these are not currently offered by
    all available vendors (used or refurbished equipment is
    also not used in production systems in a critical business environment like this), and probably
    will not be in the foreseeable future. There is an option to stay on
    Itanium platform by sole migration to Itanium blade servers which
    has for consequence mandatory need for additional servers based on
    current requirements for the main landscape, but without all the
    other additional systems (this is not justified and will be explained
    further on).


    servers – unused brute force*

    Main argument for Itanium system application is
    usage within OLTP systems and databases – an example: for CPU patch
    19 for Oracle based on standard README instructions,
    downtime lasts about half an hour (given for some internal Oracle
    tests for a database with 2000 views and 4000 objects, while our
    production system has at least twice as many), whereas on our production systems (old 2p mx2 rx4640 1.1GHz) it lasts about 5 minutes. But, except for the biggest loads
    during the day (peak is usually 13:00-15:00 on work days and in the
    end of the month), this power mostly remains unused during the rest
    of the day (average CPU load on our production does not go over 10%
    on central instances with database and central service nodes, and up
    to 35% on dialog instances, having at least 400 active users among
    800 users) and this is expected state. Consolidated CPU usage and
    other server resources usage can be realized by installing additional
    SAP or DB instances on the same OS instance (operating system
    instance, or in general OE, Operating Environment) on the same
    server, allowed by SAP MCOD or MCOS installations – but there are
    many possible problems of coexistence of such instances during their
    work, usage and maintenance life-cycle. That is one of the reasons
    why consolidated OS environments should be isolated and that is
    usually done through some form of virtualization. It can be justified
    above all simply by using non-productive and productive systems and
    their total cost – but, completely virtualized production
    environments have become a standard experience today and also a need
    for many business environments (if it is possible to scale resources
    and estimate bottom lines, then it is just necessary to take into
    account the “price” for virtualization).

    Existing solutions for virtualization which were
    considered while preparing this text: Citrix XenServer, Hyper-V / Win2k8,
    HP-UX VSE/VM, vPars, nPars, Parallels Virtuozzo / OpenVZ, Novell Suse / Xen,
    Oracle VM, Fedora / Xen, FreeBSD / Xen, Red Hat Enterprise Linux (RHEL),
    and also Centos / EPEL / xenbits, Sun Virtual Box, QEMU/KVM, VMWare.
    There were also other solutions which were not considered for obvious
    reasons. Some solutions were eliminated from the start
    because they are not available for IPF and are not even planned to
    be supported on Itanium any time soon (Hyper-V, VMWare,
    XenServer, Virtual Box, and Oracle VM up to some point), only on
    x86/x86_64 (I am dealing with an SAP environment which is mostly based on IPF platforms and Windows
    OS, apart from some remaining OpenVMS and Alpha systems). The following
    criteria are applied to the remaining solutions:


      1. only full virtualization (hardware based, HVM) or paravirtualization
        (PVM, more favoured) with a hypervisor is supported (solutions which provide full
        isolation) – HP vPars, Virtuozzo, OpenVZ represent OS / “Single Kernel“ virtualization which is not supported (by SAP and other
        vendors) on production systems

      2. all other official requirements given by SAP as a software vendor for
        our production systems, and by all other software and hardware vendors about platform support

      3. fully working installation prototype which confirms solution feasibility
        in our system environment, e.g. boot from SAN

    For HP-UX it is
    also mandatory to have additional expenses for HP installation and
    maintenance support including additional preparations, training and
    similar activities which are not needed for RHEL (though, HP-UX
    systems offer high level of availability, manageability and more,
    they represent top of the business and industrial standard, but it is
    very questionable if this is really needed). RHEL Xen still does not
    support many advanced features (like Live Migration, PCI
    passthrough, paravirtualized Windows drivers – HP-UX doesn't
    support those drivers too, etc) which are easily enabled on other
    platforms (x86, x86_64) or which HP-UX supports; it is not likely that all
    of them will be supported, but some might be.

    Furthermore, HP BL870c can support almost 4
    completely isolated active guest systems (even on physically
    different processors, CPU partitions avoiding kind of context switching – HP nPars can not go
    in granularity under the physically available resources, including
    CPU, Integrity VM can), and without a significant performance loss it
    can support up to 7-15 such guest systems (having 16 HT cores)
    compared to 1-core based bare metal systems, and more if high-end
    performance and load is not expected. Of course, number of inactive
    guest systems is almost unlimited – ready to be started and used
    when needed at any time if resources are available, without
    additional reinstallation, side-effects and server setup.



    Business requirements, coming from business case which emerges here during
    development, testing and usage of all SAP systems, are putting high
    demands about fast changes and mechanisms which enable such
    flexibility having all the existing business standards and SAP usage
    standards preserved (GMP and validation practice, QA,
    availability/reliability SLAs above all, etc). Specific requirements
    which arise from these are:

      1. one or more system copies in a given time (usually ASAP), homogeneous or
        heterogeneous – as part of the strategy for testing and/or
        development (but not for Q&A needs), and by given requirement or
        expressed need (testing of new or risky features, system
        types/usages and functionalities)

      1. change of the existing architecture / SAP Landscape – for instance, the
        problem of the peak loads on the Q system during testing periods
        (systems were never sized for these needs), Support Pipeline (to
        circumvent transport change and validation requirements in order to
        speed up cut-over after system upgrade, roll-out and similar),
        training and sandbox systems – last two examples show in our
        practice that system copies are far more efficient and usable than
        any other solution (aligned with storage split-mirror technologies
        like snapshots and snapclones) compared to unjustified client copies
        or exports on systems with already more than 1 TB in DB size, which
        can not support frequent requests for Q system data refreshments.
        One temporary solution is also to combine those two procedures (as
        used in R/3 system upgrade to ECC6.0 here during year 2009), and even
        more efficient would be to slightly change validation practice
        (which in essence remains same, but using far more efficient system
        copies). This is aligned with SAP note
        432027 - Strategy for using SAP Support Packages, which
        describes additional sandbox and test system usage (e.g. sandbox in
        Upgrade Master Guide) and as part of the official landscape.

        step further*
        (a solution and a problem) would be involvement of additional
        development and Q&A systems, but also using more efficient
        change management and automated test procedures using SAP Solution
        Manager and eCATT tools which we already have available at hand
        without additional investments (but there are also many, many other
        useful tools) except for additional planned effort by validation and
        different SAP teams. Every testing of changes during any of the SAP
        projects is a convenient opportunity for preparation and
        implementation of such tools and solutions.

      1. Changes during system patching or system upgrade which require alternate
        system landscapes (as during the previously mentioned upgrade in year 2009),

    *Solution* - Pool of Servers

    The strategy which enables the fastest possible
    response on large number of requests: certain number of servers is
    prepared like an “on hold“ template (firmware / hardware, FC /
    network, SAN storage and other needed resources, then the OS
    installation, Windows domain name reservation and AD infrastructure
    settings, antivirus / firewall / proxy, account specifications and
    all other OS settings, Oracle RDBMS and SAP s/w installation), and
    then by a well defined procedure a copy is made using HP Storageworks
    Business Copy technology in a shortest time according to SAP
    standards (homogeneous copy based on database copy, with additional
    post-installation activities which last not more than an hour).
    Server in the pool can be activated in a very short time, deactivated
    or renewed with fresh data (just by making database copy and
    post-installation tasks again).

    The only problem with this solution (as mentioned about the missing
    features for Xen on Itanium) is a very poor disk I/O. This is the
    consequence of the needed full virtualization (HVM) for Windows
    guests (3-6 times slower than on bare metal, while network I/O is
    less deteriorated). This makes critical impact on database
    performance in some cases (though parsing is done faster than on
    older bare metal systems), which makes the whole ERT system in
    average slower (e.g. closing the financial period on guest system takes 4
    hours in batch instead of 2.5 hours as in production system). There are many workarounds and solutions for this
    problem - using paravirtualized I/O drivers if it was possible,
    iSCSI/NFS storage approach (not top performance, again), etc - but
    the best thing would be moving database to physical RHEL machine, using
    MCOD wherever possible. Database nodes wouldn't be easily
    reallocated, but one of the possible solutions in future would also
    involve adaptive infrastructure (SAP ACC, with Solution Manager -
    Adaptive Computing Controller) as part of the SAP strategy for system
    usage and consolidation:

      1. all
        SAP systems can adapt to business needs very fast and in a flexible

      2. dynamic
        load balancing according to system load (physical and virtual)

      3. easier
        SAP landscape maintenance


    solution would involve ACC because it not only better avoids
    possible human errors, but also better follows vendor recommendations and
    standards given with SAP Solution Manager as the base
    technical tool which is really useful in all system maintenance and
    management tasks according to GMP / ITIL / ITSM, and which is free of
    charge (unless additional Service Desk Extended component and
    functionalities are needed for non-SAP systems and Service Desk /
    Help Desk, and similar). Therefore it makes sense to make additional
    effort and improve this system and its usage in our environment.




    *Realization options*

    These are the only existing usable IPF
    consolidation platforms for pool of servers and their perspectives (having Windows on Itanium systems):

      1. acquisition
        of new physical servers (without hypervisor and virtualization)

    most efficient (but not optimal) use of existing knowledge and

    price of 1 Itanium server is between $5К and $20К in average, and
    more (without support agreements and other hidden maintenance costs,
    human labor)

      1. HP-UX
        Integrity VM hypervisor:

          1. HP
            Integrity VM (software virtualization) as part of VSE (Virtual
            Server Environment) offers also full isolation, as opposed to vPars
            (similar to OpenVZ and appropriate Parallels Virtuozzo solution)

          1. HP
            nPars as part of VSE (cell based virtualization, similar to IBM
            LPARs, HP-UX as hypervisor is only controlling them), demanding
            specific cell-based (NUMA) hardware (rx7640 or stronger, like Superdome)

    highly stable platform which represents one of the leading industry

    licenses only for MC OE (DCOE, VSEOE now) are about $10000 per CPU

    as a possible minimum is more than $2000 per CPU), performance factor

      1. Xen
        with RHEL5 as hypervisor

    side: supported
    by SAP and HP, Open Source, quite stable and tested, in-house
    knowledge available.

    not support all the features other options have

                         (at least some of the
    features like live migration), performance factor


    All these solutions are stable and robust, but maintenance and support for RHEL / Xen infrastructure is about 1200 euros per server for RHEL Advanced Platform (unlimited sockets, unlimited number of guests, cluster suite and GFS included), and that makes it the best option for system consolidation.


    *Realization steps*

    In general, following implementation gradual
    phases and steps are proposed in a short overview:

      1. Prototype
        preparation and testing (in process).
      2. Virtualization
        of all IA64 sandbox (and test) systems and preparation of pool of
        servers (for system copies), Solution Manager (SOD), dialog
        instances of development systems.
      3. In
        parallel, all 32-bit SAP servers on VMWare platform should be
        virtualized (old Solution Manager, IDES, CEN; even SAP routers, but
        not those for end users, for a start).
      4. Complete
        virtualization of development systems.
      5. Test
        drive virtualization for Q&A and production systems (they must
        be aligned, dialog instances first).
      6. Complete
        virtualization of the whole SAP landscape.
    Taking Q&A servers together with productive
    servers is necessary because of the nature of their intended usage.
    The last phase is not yet planned (if it ever will be), at least not
    with a final date or specific requirements at the moment (higher
    reliability (disaster recovery), greater security and scalability).
    This might justify HP-UX as an option as its implementation has
    references nearby, with full implementation and integration support
    from HP in a critical environment (with given dates).

    For the successful implementation of these phases,
    good preparation is crucial, especially planning of the needed

      1. storage

      2. available network and optical ports and switches, settings on the firewall, etc.

      3. total number of servers needed in the landscape (categorized according to
        usage types and associated with possible hosts)

      4. this provides the information needed to estimate the people-days needed for
        implementation of the pool of servers in each of the phases, and
        also the physical resources needed for the requested service levels
        according to user requirements.

      1. Much storage space is saved by using HP BC Snapshot functionality (as
        during the upgrade), but snapshots also have some impact on EVA
        performance, put constraints on the original VLUN, and make storage
        maintenance more complex. The price of EVA FC disks is about 15-20
        eur/G, which also has to be noted (a snapshot grows to about 10% of
        the original disk size in a normal usage period, which gives around
        1.5-2 eur/G) – there are other storage options, but there are also
        other parts of the TCO structure

      2. backup (about 7 eur/G on average, but varies from 1 to 15 eur)

      3. network equipment and security (antivirus, firewall, LAN, FC), licenses

      4. guest OS licenses and support agreements

      5. infrastructure in the data center room (space, air conditioning, power supply)

      6. system maintenance (human resources, licenses for some monitoring ...)

      7. other hidden expenses ...
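
    The snapshot figure in the storage item above is simple arithmetic: disk price per G multiplied by the fraction of the original disk a snapshot grows to. A minimal sketch of that estimate follows; the helper name snapshot_cost is my own placeholder, and the 10% growth is only the rough value quoted above, not a measured one.

```shell
#!/bin/sh
# snapshot_cost PRICE_PER_G GROWTH_FRACTION
# Prints the storage price per G of original disk size that a snapshot adds,
# assuming the snapshot grows to GROWTH_FRACTION of the original disk.
snapshot_cost() {
    awk -v price="$1" -v growth="$2" 'BEGIN { printf "%.2f\n", price * growth }'
}

snapshot_cost 15 0.10   # lower bound of the quoted disk price range
snapshot_cost 20 0.10   # upper bound of the quoted disk price range
```

    The same helper can be reused with a different growth fraction if snapshots are kept for longer periods.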

    Therefore, it is of the utmost importance to have
    business requirements carefully prepared and estimated, because all
    real technical requirements (like additional hardware or licenses,
    memory, servers, storage) follow from them. For instance, if a large enough
    number of servers is hosted (and that number grows very easily with
    virtualization), the required I/O bandwidth on the SAN and the number of
    storage spindles grow with it.



    *About implementation*

    Installation of the current environment is divided into three groups:

      1. installation of hosts (Dom0s)

      2. installation of HVM (fully virtualized) Windows guests

      3. installation of PV (paravirtualized) RHEL guests

    While the installation of hosts and PV guests has
    many similar steps (network setup – I am using the trunk utility as a
    local NTLM proxy for http/ftp – RHN registration, ntfs-3g utilities
    installation, etc.), there are things to be set only on hosts (xend
    service and boot parameters, HP support pack installation, etc.).

    For guests I am generally proposing using copies of the template systems,
    for many different reasons, one of which is making good use of
    EVA storage features (snapshots and snapclones) or LVM on local disks. If
    making a guest from scratch (or a new template), it is always best to
    use physical (phy:/) drives instead of files – they have better
    performance and enable moving the guest to another host (with shared
    storage) – and to always use multipathd.

    I am giving the following hints for people who
    have experience with Windows installations and SAP administration on
    Itanium (and who have some basic knowledge and awareness of things
    about Linux) – many things are similar, but there are also many
    misleading details.






    *Host Installation*


      1. RHEL installation is quite simple and straightforward. No EFI boot
        driver is needed (a Windows installation needs it); do the usual server
        preparation, a firmware update if needed, and boot device configuration with
        the EBSU setup disk. Boot from SAN is used, so the HBA adapter should be
        configured there and a VLUN prepared. After booting from the installation
        disk, the installer should be started manually by entering “linux
        mpath vnc=1”
        (vnc is optional; the GUI is then available through a vnc client on
        screen 1, or http://that-host:5901)
      2. installation number can be entered after the installation (using
        rhn_register), Virtualization group should be chosen if available
        (beside Development, Web Server and others) - or later, packages to
        be installed are: xen, kernel-xen, xen-ia64-guest-firmware,
        virt-manager, libvirt and virt-viewer, or just “yum groupinstall
        Virtualization” (if doing it manually with rpm -ivh in VT folder of
        the installation disk: libvirt-python, python-virtinst,
        Virtualization-en-US, perl-Sys-Virt, xen-devel ../Server/bridge-utils
        ../Server/xen-libs ../Server/gnome-python2-gnomekeyring
        ../Server/iscsi-initiator-utils ../Server/gtk-vnc
        ../Server/gtk-vnc-python ../Server/cyrus-sasl-md5)

        installation lasts up to 1 hour – all setup parameters (like
        network address, gateway, etc) should be prepared

        check if /boot/efi/efi/redhat/elilo.conf is set correctly before

        /etc/resolv.conf should contain at least one line "nameserver
        DNS_IP_address" and one "domain group.hemofarm.com",
        enabling and disabling network interface is done with ifup eth0 /
        ifdown eth0
      3. with some poor terminal emulations/clients, firstboot (an additional
        setup wizard) gets stuck after restart; it shuts down automatically after
        about 10 minutes, and after logging in it can be disabled with
        “chkconfig firstboot off”
      4. trunk (local NTLM proxy, http://ntlmaps.sourceforge.net/),
        after unpacking with tar -xzf, has to be configured
        (NT_DOMAIN:HEMONET, USER:proxyuser, PASSWORD:somepass,
        PARENT_PROXY:proxy_server) and started manually with “scripts/ntlmaps”
        (or put into /etc/rc.local to start automatically after boot), then
        it is possible to start “rhn_register”
      5. also, for other utilities accessing internet, like yum (package
        installer), edit ~/.bash_profile (using gedit in GUI console, or vi


        update can be started with “yum update” after setting up software
        channels and other additional settings on RHN side (if needed)

        LVM/multipath does not work well with FAT partitions, so before
        each kernel update/installation /boot/efi must be mounted manually
        (e.g. “pvs” shows the system disk device for the
        system VG LogVol00, like /dev/sd..., for instance /dev/sdh2, and then
        the EFI partition is mounted with: “mount -t vfat /dev/sdh1 /boot/efi”)
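
    The manual /boot/efi mount described above can be wrapped in a small helper. This is only a sketch under the assumption that the EFI partition is always partition 1 on the same disk as the system volume group (as in the /dev/sdh2 and /dev/sdh1 example); the function names efi_partition_for and mount_efi are my own placeholders.

```shell
#!/bin/sh
# efi_partition_for PV_DEVICE
# Given the partition holding the system volume group (e.g. /dev/sdh2),
# print partition 1 of the same disk, assumed here to be the EFI partition.
efi_partition_for() {
    printf '%s1\n' "$(printf '%s' "$1" | sed 's/[0-9]*$//')"
}

# mount_efi PV_DEVICE
# Mount the EFI partition on /boot/efi unless something is already mounted there.
mount_efi() {
    if grep -q ' /boot/efi ' /proc/mounts; then
        return 0
    fi
    mount -t vfat "$(efi_partition_for "$1")" /boot/efi
}

# Usage before a kernel update (device taken from the pvs output):
#   mount_efi /dev/sdh2 && yum update
```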
    using Windows (cifs) shares: e.g.
    \\server\x$ can be mounted on /mnt/tmp with:

    mount -t cifs -o dom=HEMONET,username=someuser //server/x$ /mnt/tmp

    or for simple copying smbclient is useful (like ftp)
      1. ntfs-3g installation (http://www.ntfs-3g.org):

        tar xvf ntfs-3g....
        yum install ntfsprogs
      2. firewall should be configured (opening needed ports),
        and selinux (for a start and troubleshooting, “setenforce 0”
        and SELINUX=permissive in /etc/selinux/config)
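      3. Xen Dom0 configuration for several network adapters: create
        /etc/xen/scripts/network-multi-bridge containing:

        /etc/xen/scripts/network-bridge "$@" vifnum=0 bridge=xenbr0 netdev=eth0
        /etc/xen/scripts/network-bridge "$@" vifnum=1 bridge=xenbr1 netdev=eth1
        /etc/xen/scripts/network-bridge "$@" vifnum=2 bridge=xenbr2 netdev=eth2
        /etc/xen/scripts/network-bridge "$@" vifnum=3 bridge=xenbr3 netdev=eth3

        and then: “chmod
        +x /etc/xen/scripts/network-multi-bridge”
        and edit /etc/xen/xend-config.sxp, putting “(network-script network-multi-bridge)”
        instead of “(network-script network-bridge)”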

        also, put there (dom0-min-mem 2048) ... and dom0_mem=2G into
        elilo.conf to reserve 2G for Dom0 and avoid memory ballooning (2G
        should be enough) – elilo.conf example:

        -- palhalt rhgb quiet"
      4. vncserver setup (/etc/sysconfig/vncservers), vncpasswd should be set
        for the running account, and initial vncserver start (“chkconfig
        vncserver on” for the service, too)
      5. HP support pack installation:

        yum install net-snmp net-snmp-libs
        yum install tog-pegasus
        service tog-pegasus start

        is needed in /etc/sysconfig/network - in case of additional problems
        with hpsmhd start, in /etc/hosts should be put:

        host.group.hemofarm.com host

    ... initial configuration “/etc/init.d/hpima sample” (or better, reconfiguration, especially after each kernel update):


    (and additional restart of services: tog-pegasus, hpmgmtbase, hpima,
    hplip, hpsmhd)
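
    Going back to the Dom0 network setup for several adapters: the bridge wrapper script has one line per physical interface, so it can be generated instead of typed. A minimal sketch follows, assuming the stock Xen script lives at /etc/xen/scripts/network-bridge and that interface ethN is paired with bridge xenbrN; the function name gen_multi_bridge is my own placeholder.

```shell
#!/bin/sh
# gen_multi_bridge COUNT
# Print a wrapper script that calls the stock network-bridge script once
# for each interface eth0 .. eth(COUNT-1), pairing ethN with xenbrN.
gen_multi_bridge() {
    count=$1
    printf '%s\n' '#!/bin/sh'
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '/etc/xen/scripts/network-bridge "$@" vifnum=%d bridge=xenbr%d netdev=eth%d\n' \
            "$i" "$i" "$i"
        i=$((i + 1))
    done
}

# Usage (as root on the Dom0):
#   gen_multi_bridge 4 > /etc/xen/scripts/network-multi-bridge
#   chmod +x /etc/xen/scripts/network-multi-bridge
```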


    kernel parameters and related settings and recommendations (by
    Oracle, SAP):


    ulimit -u 16384 -n 65536
    hangcheck-timer hangcheck_tick=30 hangcheck_margin=18


    kernel.sem=1250 256000 100 1024
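
    The four kernel.sem values are, in order, SEMMSL, SEMMNS, SEMOPM and SEMMNI. A small sketch that labels them and compares the running values with the recommended ones is shown here; the function names sem_fields and sem_matches are my own placeholders, and the recommended values are simply the ones listed above.

```shell
#!/bin/sh
# sem_fields "SEMMSL SEMMNS SEMOPM SEMMNI"
# Label the four whitespace-separated values of kernel.sem.
sem_fields() {
    set -- $1
    printf 'semmsl=%s semmns=%s semopm=%s semmni=%s\n' "$1" "$2" "$3" "$4"
}

# sem_matches CURRENT RECOMMENDED
# Succeeds when both settings carry the same four values,
# regardless of tabs or spaces between them.
sem_matches() {
    [ "$(sem_fields "$1")" = "$(sem_fields "$2")" ]
}

sem_fields "1250 256000 100 1024"

# On a live host:
#   sem_matches "$(sysctl -n kernel.sem)" "1250 256000 100 1024" || echo "kernel.sem differs"
```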

        1. old A/P storages should have exclusively Failover (not failback, not
          None) as the preferred path in the presentation options (mixed environments like the one I have at the moment are very problematic; an upgrade to A/A is more than recommended. It is just a firmware upgrade, but it needs planning as a potential risk)
        2. multipath.conf example (starting the service and enabling it with chkconfig
          is also needed) - example content of /etc/multipath.conf:

      blacklist {
          devnode "(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
          devnode "(hd|xvd|vd)[a-z]"
          wwid ""
      }

      # Make sure our multipath devices are enabled.

      defaults {
          udev_dir /dev
          polling_interval 10
          selector "round-robin 0"
          path_grouping_policy group_by_prio
          getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
          prio_callout "/sbin/mpath_prio_alua /dev/%n"
          path_checker tur
          rr_min_io 100
          rr_weight uniform
          failback immediate
          max_fds 65536
          no_path_retry 12
          user_friendly_names yes
      }

      blacklist_exceptions {
          wwid "3600508b4000129f70002900002030000"
          wwid "36005"
          wwid "36001"
      }

      multipaths {
          multipath {
              # for newer A/A storages, from EVA VCS4.0
              wwid 3600508b4000e302d0000a00001200000
              alias sap-trn-sys
          }
          multipath {
              # for A/P storage, up to EVA VCS4.0
              wwid 3600508b4000129f70002900000cb0000
              path_grouping_policy multibus
              path_checker readsector0
              prio_callout /bin/true
              no_path_retry queue
              rr_weight priorities
              alias sap-bisem-sys
          }
          ...
      }

      devices {
          device {
              # EVA 3000/5000 with new firmware, EVA 4000/6000/8000, EVA 4400
              vendor "(COMPAQ|HP)"
              product "HSV1[01][01]|HSV2[01]0|HSV300|HSV450"
              getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
              prio_callout "/sbin/mpath_prio_alua /dev/%n"
              features "0"
              hardware_handler "0"
              path_selector "round-robin 0"
              path_grouping_policy group_by_prio
              failback immediate
              rr_weight uniform
              no_path_retry 12
              rr_min_io 100
              path_checker tur
          }
          device {
              # MSA
              vendor "HP"
              product "MSA2[02]12fc|MSA2012i"
              getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
              features "0"
              hardware_handler "0"
              path_selector "round-robin 0"
              path_grouping_policy multibus
              failback immediate
              rr_weight uniform
              no_path_retry 18
              rr_min_io 100
              path_checker tur
          }
      }
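
      Since every new VLUN needs its own entry in the multipaths section, the stanza can be produced by a small helper and pasted into /etc/multipath.conf. This is only a sketch; the function name mpath_stanza is my own placeholder, and it emits just the wwid and alias, so the A/P-specific settings shown above still have to be added by hand where needed.

```shell
#!/bin/sh
# mpath_stanza WWID ALIAS
# Print one multipath { } entry for the multipaths section of /etc/multipath.conf.
mpath_stanza() {
    printf '    multipath {\n'
    printf '        wwid  %s\n' "$1"
    printf '        alias %s\n' "$2"
    printf '    }\n'
}

mpath_stanza 3600508b4000e302d0000a00001200000 sap-trn-sys
```

      After adding the entry and reloading the multipath service, the disk should appear as /dev/mapper/ followed by the chosen alias.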

        1. Data Protector installation (MUST be started from the PA-RISC disk,
          e.g. B6960-15008-01.tar.gz or downloaded from ITRC SUM):

          yum install xinetd
          (if not installed)

          add port 5555 in iptables firewall (/etc/sysconfig/iptables "-A
          INPUT -m state --state NEW -m tcp -p tcp --dport 5555 -j ACCEPT")

          omnisetup.sh -source /root/hp/B6960-15008-01/ -server dprotector.group.hemofarm.com -install da,ma,cc

          chkconfig --list | grep omni

          if no service is found, edit /etc/xinetd.d/omni manually:

          socket_type = stream
          protocol = tcp
          wait = no
          user = root
          server = /opt/omni/lbin/inet
          server_args = inet -log /var/opt/omni/log/inet.log
          disable = no

      INFO: linux rescue is a method of disaster recovery, by booting the
      installation disk with “linux rescue” or better, “linux rescue
      mpath”, and it is also possible to do a reinstallation or upgrade
      over the existing OS file system (as a repair method) by choosing any
      of these options, guided by the installer dialogues


      *NOTE: always, always make a backup of everything that is important ...*

      *PV Guests*





      sqlplus / as sysdba ...

      - SAP:

      su - ertadm
      startsap r3
      stopsap r3

      (all instead of r3 if Java stack is present, as it is with BI systems)


          1. shutting down:

            shutdown -h 0
            (as with other guest or bare metal systems, a forced shutdown always
            carries the risk of damaging the system disk or something more –
            a linux rescue boot with a prepared sap-rescue configuration with the
            affected system disk would help, or a restore from a backup ...) or
          2. guests can be backed up just like any other machine using Data Protector or other usual tools (no FC tape media, though, unless using PCI passthrough, which is not available on Itanium even though it has VT-d hardware support)


        *HVM Guests*


          1. It is enough to copy an existing template configuration (e.g.
            /etc/xen/sap-test) for a new guest, having at least changed
            name=new_machine (instead of name=sap-test), and also
            /usr/lib/xen/boot/nvram_sap-test into

            is instead of nvrboot import from the backup on the EFI partition –
            HVM guests have their own EFI environment, just as any bare metal
          2. for new guest disks created either as snapshots / snapclones or CA
            replicas of the template disks, check the output of parted /dev/disk/by-id/...
            print before proceeding – if any I/O errors occur, try changing in
            EVA Command View the Presentation / Preferred path/mode on the old EVA
            5000 (A or B; after the upgrade to the new A/A VCS4.xx firmware this will no
            longer be a problem).

            It is recommended to set, for each VLUN, its wwid accordingly in
            /etc/multipath.conf with an appropriate /dev/mapper/alias ...

            ntfsls utility one can even check the M$ partition without any
          3. for each new disk (which is not a copy/replication and has no gpt label)
            initialization must be done: parted /dev/disk/by-id/... mklabel gpt
            (it is also possible as usual in the EFI shell) ...
          4. after Windows boot and administrator login, first the IP address
            should be set in the GUI console (vnc) and remote desktop enabled
            (for a newly installed machine, it should be already set in the
            template) and then continue working with RDP (rdesktop)
          5. the next step is changing the name of the machine and adding it to the Windows
            domain: newsid /a newname (restart should be done only from
            virt-manager or with xm create, anything else will fail to bring up
            the network correctly)
          6. machine preparation is the same as with any other Windows machine
            (Oracle and SAP installation, mind installing Montecito Oracle
            patches if needed)
          7. e.g. SAP
            ERT 30 instance in the template with preinstalled ERT instance can be
            used only for the same machine, which comes down to same MAC address,
            having same database schema and licenses preserved with (otherwise,
            additional setup is needed):

            ... on
            the old system:
              exp file=lic.dmp userid='/ as sysdba'

            ... on the newly copied system:
            imp file=lic.dmp userid='/ as sysdba' fromuser=SAPERP touser=SAPERP

          1. SAPDATA_HOME and oraarch folders (or whatever is set in sapinst for the
            template) are best kept on C: - HVM guests have the
            unfortunate constraint of a maximum of four drives
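
        The first HVM step above (copying the template configuration and changing the guest name) can be scripted. The following is a minimal sketch, assuming the name appears on a single line starting with "name" and that neither name contains characters special to sed; the function name clone_xen_cfg is my own placeholder. It deliberately touches nothing but the name line, so disks, MAC address and the nvram file still need the manual steps described above.

```shell
#!/bin/sh
# clone_xen_cfg TEMPLATE_NAME NEW_NAME
# Read a Xen guest configuration on stdin and write it to stdout with the
# guest name on the "name = ..." line replaced; all other lines pass through.
clone_xen_cfg() {
    sed "/^name[[:space:]]*=/s/$1/$2/"
}

# Usage:
#   clone_xen_cfg sap-test new_machine < /etc/xen/sap-test > /etc/xen/new_machine
```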

        *NOTE: always, always make a backup of everything that is important ...*



        I had several problems during all these tests, and these
        are the most significant issues:

          1. The EFI environment can slow down the vnc console and virt-manager refresh
            terribly for unknown reasons (xm commands are usable, but also a bit
            slower, while the guest domains and Dom0 work perfectly normally) –
            a RH service request is open on this one

          2. ORA-07445 occurs intermittently on Dom0s and DomUs (but not on the normal
            kernel) – a metalink SR is open on this one (bug 8868468) - the
            only viable workarounds are not to use Xen for Oracle, or to somehow
            prepare the database without Xen, and then use it under Xen (this is not
            causing serious issues, but it would be unacceptable in production)


        Note 962334 - Linux: SAP on Xen virtual machine

        Note 1048303 - Red Hat Enterprise Linux 5.x: Installation and upgrade

        Note 958253 - SUSE LINUX Enterprise Server 10: Installation notes

        Note 941735 - SAP memory management for 64-bit Linux systems

        Note 964705 - Linux IPF: "floating-point assist fault" in Linux

        Note 784391 - SAP support terms and 3rd-party Linux kernel drivers

        Note 527843 - Oracle RAC support in the SAP environment

        Note 592393 - FAQ: Oracle



        Linux Supported platforms
          Supported Platforms
        homogeneous copy x64 to ia64
          Re: System copy  from SuSe Linux-IA64_64 Ent  to SuSe Linux-X8_64 Ent
        SAP on Linux:
          SAP on Linux


        SAPS http://www.sap.com/solutions/benchmark/measuring/index.epx