
DB2 backup size optimization through ACS/snapshot technology

Former Member
0 Kudos

Hi,

We have an ECC 6.0 production system with DB2 version 10.1 as the database. It is a non-partitioned database that has grown to 3.5 TB; its daily online backup is around 1.5 TB in size and takes almost a day to complete successfully. This resource-intensive backup has an adverse effect on system performance. We are looking for ways to compress and optimize the backup size and have come across ACS and snapshot technology.

We have a few points to discuss:

1) Is there any better method available to reduce the DB2 backup size?

2) What are the pros and cons of using ACS/snapshot technology?

3) Is ACS/snapshot technology compatible with IBM Power (pSeries) servers?

4) Is any document/guide available for this?

Kindly help us on the above points.

Regards,

Aditya

SAP Basis

Accepted Solutions (1)

Former Member
0 Kudos

Hi Aditya,

Table and index compression would likely be a good way to reduce conventional backup times. You should also check for and correct any severe tablespace size skew since that can reduce read parallelism as the backup progresses.  Ensure that your LUNs have a queue depth much higher than the AIX default of 3, and that you have configured enough prefetchers and a reasonable degree of IO parallelism.  Finally, the SAP default extent size of 32KB is IMHO bizarrely small and detrimental to backup performance in most cases. A more normal extent size of 128KB or 256KB, aligned to the RAID strip size, is better. Unfortunately this requires new tablespaces, but you can migrate online while compressing at the same time.
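As a rough illustration of that kind of migration (PRD, SAPSR3, BIGTABLE and the *_128K tablespace names are placeholders, and this assumes the usual SAP 16 KB page size plus a DB2 compression license; SAP's DB6CONV tool is normally used for such online moves):

    # new tablespaces with a 128 KB extent (8 x 16 KB pages), aligned to the RAID strip size
    db2 "CREATE LARGE TABLESPACE BTABD_128K PAGESIZE 16K MANAGED BY AUTOMATIC STORAGE EXTENTSIZE 8 PREFETCHSIZE AUTOMATIC"
    db2 "CREATE LARGE TABLESPACE BTABI_128K PAGESIZE 16K MANAGED BY AUTOMATIC STORAGE EXTENTSIZE 8 PREFETCHSIZE AUTOMATIC"
    # enable row compression, then move the table online into the new tablespaces
    db2 "ALTER TABLE SAPSR3.BIGTABLE COMPRESS YES ADAPTIVE"
    db2 "CALL SYSPROC.ADMIN_MOVE_TABLE('SAPSR3','BIGTABLE','BTABD_128K','BTABI_128K','BTABD_128K','','','','','','MOVE')"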

Also make sure that you have enough sessions if you are using TSM. I have a customer with a 4TB ECC database, compressed. They have implemented the above changes and use 3 or 4 sessions. The online backup takes about 3 hours, not that amazing but better than a day.  In short, your challenge is to eliminate all IO bottlenecks 🙂
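To make the session point concrete, a multi-session TSM backup might look roughly like this (sketch only; PRD and all of the numbers are assumptions to be tuned for your environment, and buffer sizes are given in 4 KB pages, so the utility heap must be large enough to hold them):

    db2 backup db PRD online use tsm open 4 sessions with 8 buffers buffer 16384 parallelism 8 compress include logs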

All of the above will improve large sequential read performance generally.  Snapshot backup is another option, but it is too much to explain fully here, I feel. One downside of snapshot backup is reduced flexibility: you cannot restore just a single tablespace, nor can you restore to different containers. However, the performance advantages may far outweigh this.

Jeremy

Former Member
0 Kudos

Hi Jeremy,

Thanks a lot for this concise and clear description of how to control the size. We checked the tablespaces and found that the BTAB data and index tablespaces are the largest. Currently we have 4 db2bm EDUs and 16 prefetchers for around 30+ tablespaces. So, for 30+ containers, can we set a comparable parallelism value to increase database performance? We have learnt that each db2bm agent has a one-to-one mapping with a specific tablespace, so increasing their number will not help performance. One solution is to distribute the data evenly over the tablespaces using DB2 tools. The other option we see is to increase the prefetcher value, since one tablespace can have more than one prefetcher retrieving data. Will that improve performance?
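For reference, a quick way to see the current settings and the EDUs involved might be something like the following (PRD is a placeholder; this is only a sketch of where to look, not a tuning recommendation):

    # current prefetcher setting (NUM_IOSERVERS) in the database configuration
    db2 get db cfg for PRD show detail | grep -i NUM_IOSERVERS
    # list the engine dispatchable units; during a backup you should see db2bm and db2med EDUs alongside the db2pfchr prefetchers
    db2pd -db PRD -edus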

How do we check the DB2 extent size and the LUN queue depth? Kindly guide us on this.

Also, we are using tape drives rather than TSM, so how should we proceed with that?

Also, Jeremy, we would like to know more about other options such as snapshot and ACS technology. We are on DB2 10.1, and if ACS technology requires us to upgrade the database to 10.5, we can consider that a plus point.

We want to explore the maximum available options and will pick the best one.

Kindly guide us on the above points, and please provide any information/guides/documents on snapshot backup/ACS technology.

Thanks in advance,

Aditya

SAP Basis

Former Member
0 Kudos

Hi Aditya,

The IBM DB2 Knowledge Center would probably be the best place to start for ACS: http://www-01.ibm.com/support/knowledgecenter/#!/SSEPGG_10.1.0/com.ibm.db2.luw.admin.ha.doc/doc/c005...

There are also other, older resources such as Redbooks, that describe backup and recovery with ACS.

Note that the ACS scripted interface is available in 10.1 as of fix pack 3 if you want to integrate a non shrink-wrapped snapshot copy solution with DB2.  External scripts are also possible, as before.
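Purely for orientation, the invocations look roughly like this (PRD, the script path and the timestamp are placeholders; check the exact syntax for your fix pack level in the Knowledge Center):

    # integrated ACS (vendor library configured via the DB2 ACS API)
    db2 backup db PRD online use snapshot
    # scripted interface (10.1 fix pack 3 and later), driving your own snapshot script
    db2 backup db PRD online use snapshot script '/usr/sap/PRD/scripts/acs_snap.sh'
    # restoring from a snapshot image
    db2 restore db PRD use snapshot taken at <timestamp>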

I'm sure that you can get a great result using snapshot backups, but be sure to prepare and above all test backup and recovery with your solution carefully.   A couple of pointers:

  • Your logs must be properly separated from data so that you can restore the database from snapshot without losing log files needed for subsequent rollforward.
  • If your snapshot technology is not instantaneous, things get more interesting.  You will probably want to use the EXCLUDE LOGS option so that database log writes can continue while the snapshot executes; otherwise your application will hang during the backup.
  • Also, for non-instantaneous snapshot backups with EXCLUDE LOGS, archive the current log file immediately before suspending writes.  DB2 cannot start a new log file while writes are suspended, so archiving first lets you begin with an empty log file; also make sure that log files are sized large enough to carry the transaction load while the snapshot completes.  If you force log archives for any other reason, e.g. a DR solution, ensure that this does not happen during the backup window.  (A minimal sequence is sketched after this list.)
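If you end up scripting this yourself, or just want to understand what the flow does, the core sequence is roughly the following (PRD is a placeholder and the snapshot step is SAN-vendor specific):

    db2 connect to PRD
    # archive the current log so the suspended period starts with an empty log file
    db2 archive log for database PRD
    # suspend I/O writes; transactions continue, but no new log file can be started
    db2 set write suspend for database
    # trigger the storage-level snapshot of the data volumes here (vendor-specific CLI)
    db2 set write resume for database
    # on a copy brought up from the snapshot, db2inidb PRD as snapshot|standby|mirror initializes the database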

I'm sure there is a lot more that could be added, but the devil is in the details, and those are for you to work out 🙂

Jeremy

Former Member
0 Kudos

Going back to the conventional backups, you would need to determine where the bottlenecks are.  Perhaps the write performance to tape is the problem rather than the reads.

Some additional points:

  • You could test a dummy backup, writing to /dev/null, to ascertain quickly whether you are bottlenecked on the tape writes or the database reads (see the sketch after this list).
  • DB2 backs up the largest tablespaces first to reduce the skew issue, so tablespace skew is only a problem when severe.  The symptom that you will see is good initial backup throughput that degrades as the backup runs.
  • You can observe LUN throughput with: iostat -D 60
    • High values in sqfull or avgwqsz indicate IO request bottlenecks
  • At the start of the backup, all tablespaces are in the Backup in Progress state.  If you see that most tablespaces revert to normal state quickly but that one or two big tablespaces remain in Backup in Progress for ages afterwards, again that is symptomatic and means you are not getting the consistent degree of parallelism that you could.
  • In theory you can boost parallelism within tablespaces not just across them.  My personal experience with this for backup has not been very positive, but maybe different storage characteristics will get better results.
  • lsattr -El hdisk1 (or appropriate LUN name) shows the LUN queue depth setting.
  • Extent size is in syscat.tablespaces, or a tablespace snapshot.  The degree of IO parallelism within a tablespace is controlled by the ratio of the prefetchsize to extentsize and the number of containers, or by the DB2_PARALLEL_IO registry setting.  It is quite complicated to explain accurately so please read up on these in the DB2 Knowledge Center.
  • Also be aware that the prefetch rules for backups are a bit different to standard prefetch - backup always reads single extents according to the documentation.  So the small SAP default extent size of 32KB means you end up reading from each (say) 256KB RAID strip 8 times.  Although caching may help, this is still overhead.
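To make a couple of these checks concrete (PRD is a placeholder; note that extent and prefetch sizes in the catalog are reported in pages, not KB):

    # dummy backup to /dev/null: isolates database read throughput from tape/TSM write throughput
    db2 backup db PRD online to /dev/null with 8 buffers buffer 16384 parallelism 8
    # extent size and prefetch size per tablespace (values are in pages of the given page size)
    db2 "select varchar(tbspace,30) as tbspace, pagesize, extentsize, prefetchsize from syscat.tablespaces"
    # tablespace states during a running backup (look for 'Backup in Progress')
    db2 list tablespaces show detail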

Answers (1)

wong_norman
Discoverer
0 Kudos

Another option to consider is to take incremental/delta backups.  Take a full backup at the weekend (or whenever the system is quiet), then take daily incremental backups for the rest of the week.
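As a rough sketch only (PRD and the target path are placeholders; trackmod must be enabled and followed by one full backup before incremental backups are possible):

    db2 update db cfg for PRD using TRACKMOD YES
    # weekend full backup (the new baseline once trackmod is on)
    db2 backup db PRD online to /db2/backup compress
    # daily cumulative incremental (changes since the last full backup)
    db2 backup db PRD online incremental to /db2/backup compress
    # or a delta backup (changes since the most recent backup of any type)
    db2 backup db PRD online incremental delta to /db2/backup compress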

The ACS snapshots are good for onsite restores, but you cannot do offsite recoveries with them as the snapshot is held within the SAN.