Sustainable Data Storage

Posted by Bill Morton Jan 3, 2011

It doesn’t take much to see that the exponential growth of incoming content (data and electronic files) from all sources is outpacing your ability to acquire more storage. To keep your head above water, you need strategies for controlling that growth. You are storing content from many different sources, and although the composition of the content varies dramatically, the strategy is the same: identify what you need to keep, and secure the freedom to destroy what you don’t need to keep.

Simple as that sounds, it can be extraordinarily complex. Considering the volume and diversity of the content, identification can be problematic. Considering the litigation risk and compliance issues, if you can’t confidently identify your important records, you won’t have the freedom to destroy anything. Fortunately, because you Run Better with SAP, the same disciplines and best practices that enable you to structure your master data and processes in SAP applications will come to your rescue when classifying your content.

In my last blog – “Is Your Data Ready for the New Year?” – I wrote about legacy data systems and their impact on your Green IT initiatives. This time, I consider your production SAP database and your strategy for archiving historical data. Because we are talking about structured SAP data, it’s easy to identify what you need to keep and what you can archive. I will cover the more challenging types of content in subsequent posts.

According to the 2008 ASUG Data Archiving Survey, the average SAP database is larger than 1.6 TB and 15% are larger than 6.4 TB. The average database is growing by more than 400 GB per year and 10% are growing by more than 1.5 TB. Most SAP customers eventually take advantage of the data archiving features built into the SAP system – 80% of those surveyed have archived some of their data. The main reasons for archiving are Conversion or Migration, Performance, Hardware and Admin Cost, System Availability, and Legal Requirements.

Your production SAP database is stored on high-performance storage hardware – what I call “Tier 1” storage. This is typically the most expensive type of storage. The archived data is stored in an archive storage system – what I call “Tier 3” storage – in a compressed format, so it occupies only about 20% of the original size. The archive storage is typically the least expensive type of storage.

It is obvious that data archiving reduces data storage requirements. What is not quite as obvious is that the reduction is amplified because you have multiple copies of your SAP database. You don’t have just a production SAP database. You also keep copies of it for backup, disaster recovery, development, testing, and training. Those 3 to 7 copies are most likely stored on Tier 2 storage, a middle class of storage that can be less expensive than Tier 1 storage but in many organizations is not.

The archive system is sharable, so you don’t need to maintain an archive system for each Tier 2 instance. That means that each terabyte that you eliminate from your production database eliminates 3 to 7 terabytes from your Tier 2 storage.
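As a rough sketch of that multiplier, the short Python function below tallies what archiving a given amount of production data frees across tiers. The function name and the 1,000 GB input are illustrative assumptions; the 20% compressed size and the 3 to 7 copies come from the text above.

```python
def storage_effect_gb(archived_gb, tier2_copies, compressed_fraction=0.20):
    """Return (tier1_freed, tier2_freed, tier3_added) in gigabytes.

    Assumes every Tier 2 copy shrinks by the same amount as production
    and that one shared archive holds the compressed data.
    """
    tier1_freed = archived_gb
    tier2_freed = archived_gb * tier2_copies
    tier3_added = archived_gb * compressed_fraction
    return tier1_freed, tier2_freed, tier3_added


# Illustrative input: archive 1,000 GB with 3, 5, or 7 Tier 2 copies.
for copies in (3, 5, 7):
    t1, t2, t3 = storage_effect_gb(1000, copies)
    print(f"{copies} copies: frees {t1 + t2:,.0f} GB, adds {t3:,.0f} GB of archive")
```

The archive grows by only a fraction of what is removed, while the Tier 2 saving scales with the number of copies.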

Let’s take an example of an organization that has been running SAP for several years, has a 1.5-terabyte SAP database with 5 copies in Tier 2 storage, and has never done any archiving. We’ll assume that the SAP database is growing at a rate of 15% per year, which is well below the average rate of those who responded to the ASUG survey described above. By Year 10, the SAP database will have grown to about 5.3 terabytes if no data archiving is done.
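The projection is plain compound growth. A minimal sketch, assuming Year 1 is the starting size so that Year 10 reflects nine years of growth (the function name is illustrative):

```python
def projected_size_gb(initial_gb, annual_growth, year):
    """Database size in `year`, where year 1 is the starting size."""
    return initial_gb * (1 + annual_growth) ** (year - 1)


size_year_10 = projected_size_gb(1500, 0.15, 10)
print(f"Year 10 with no archiving: {size_year_10:,.0f} GB")  # 5,277 GB
```

Under that assumption the result lines up with the 5,277 GB shown for Year 10 in the table.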

The table below shows what the SAP database storage looks like before and after implementing data archiving. It also looks out 10 years in the future.

 

| Scenario | Tier 1 Storage (gigabytes) (1) | Tier 2 Storage (gigabytes) (2) | Tier 3 Storage (gigabytes) (3) | Storage Cost (4) | Electricity Used (kWh) (5) | Electricity Cost (6) | Carbon Emissions (lbs. CO2) (7) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Year 1: Before Archiving | 1,500 | 7,500 | 0 | $247,500 | 10,249/yr. | $1,042/yr. | 16,399/yr. |
| Year 1: After Archiving (8) | 900 | 4,500 | 120 | $149,940 | 6,286/yr. | $639/yr. | 10,058/yr. |
| Year 1 Reduction (Increase) | 600 | 3,000 | (120) | $97,560 | 3,963 | $403 | 6,341 |
| Year 10: With No Archiving (9) | 5,277 | 26,384 | 0 | $870,674 | 36,055/yr. | $3,667/yr. | 57,689/yr. |
| Year 10: With Archiving (10) | 1,396 | 6,981 | 318 | $234,194 | 266,700/yr. | $2,660/yr. | 41,844/yr. |
| Cumulative Reduction (Increase) Over 10 Year Period | 19,135 | 95,677 | (2,128) | $3,131,815 | 128,325 | $13,051 | 205,321 |
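The spreadsheet behind the Year 10 figures is not shown, so the following is only one possible reading of footnotes (8) to (10): archive 40% in Year 1, then each following year grow the production database by 15% and archive 10% of its size at the start of that year, compressing archived data to 20%. The function name and that year-by-year ordering are assumptions; this reading happens to reproduce the Year 10 storage columns.

```python
def archive_trajectory(start_gb=1500, copies=5, growth=0.15,
                       first_year_archive=0.40, yearly_archive=0.10,
                       compressed_fraction=0.20, years=10):
    """Return (tier1, tier2, tier3) in GB for the final year (assumed model)."""
    archived = start_gb * first_year_archive
    tier1 = start_gb - archived                 # Year 1 after archiving
    tier3 = archived * compressed_fraction      # shared, compressed archive
    for _ in range(2, years + 1):
        archived = tier1 * yearly_archive       # 10% of start-of-year size
        tier1 = tier1 * (1 + growth) - archived
        tier3 += archived * compressed_fraction
    return tier1, tier1 * copies, tier3


t1, t2, t3 = archive_trajectory()
print(f"Year 10: Tier 1 {t1:,.0f} GB, Tier 2 {t2:,.0f} GB, Tier 3 {t3:,.0f} GB")
```

Calling `archive_trajectory(years=1)` gives the Year 1 "After Archiving" row.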

 

Although the first year energy savings are not substantial, the benefits compound year over year. In addition, the SAP database keeps growing year after year, so that the amount of archived data and associated savings increases each year.
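The Year 1 electricity figures can be checked with a few lines of Python. This sketch assumes the 0.13 figure in footnote (5) is a continuous draw per terabyte, that a year has 8,760 hours, and that 1 TB equals 1,000 GB; the names are illustrative.

```python
HOURS_PER_YEAR = 8760
KWH_PER_TB_HOUR = 0.13  # footnote (5), read as kWh per terabyte per hour


def annual_kwh(total_gb):
    """Annual electricity for a given amount of storage, in kWh."""
    return total_gb / 1000 * KWH_PER_TB_HOUR * HOURS_PER_YEAR


before = annual_kwh(1500 + 7500)       # Tier 1 + Tier 2, no archive
after = annual_kwh(900 + 4500 + 120)   # Tier 1 + Tier 2 + Tier 3
print(f"Before: {before:,.0f} kWh, after: {after:,.0f} kWh, "
      f"saved: {before - after:,.0f} kWh")
```

With those assumptions the output agrees with the Year 1 rows: 10,249 kWh before, 6,286 kWh after, and a reduction of 3,963 kWh.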

What’s more impressive is the money saved from avoided storage cost, which amounts to nearly $100,000 in the first year and accumulates to a total of over $3 million over the 10-year period. This compound benefit effect is found in most Green IT initiatives. So, the sooner that we act, the better off we’ll all be!
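The first-year cost figure follows directly from the per-gigabyte rates in footnote (4). A minimal sketch, with illustrative names:

```python
# Annual cost per gigabyte by tier, from footnote (4).
COST_PER_GB = {"tier1": 40, "tier2": 25, "tier3": 12}


def annual_storage_cost(tier1_gb, tier2_gb, tier3_gb):
    """Total annual storage cost in dollars across the three tiers."""
    return (tier1_gb * COST_PER_GB["tier1"]
            + tier2_gb * COST_PER_GB["tier2"]
            + tier3_gb * COST_PER_GB["tier3"])


cost_before = annual_storage_cost(1500, 7500, 0)
cost_after = annual_storage_cost(900, 4500, 120)
print(f"Before: ${cost_before:,}, after: ${cost_after:,}, "
      f"saved: ${cost_before - cost_after:,}")
```

This gives $247,500 before archiving, $149,940 after, and a Year 1 saving of $97,560.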

Next time I will write about the benefits of removing paper from your organization. Meanwhile, you can learn more about how SAP and its partners support Green IT through the SAP Sustainability Map on the EcoHub.

What do you spend on your SAP data storage? How do you keep data growth under control? Have you considered archiving as a way to reduce storage costs? Do you consider the environmental impact of your data center storage?

References:

(1) Tier 1 storage holds the production SAP database and offers high performance and availability. This is typically the most costly type of storage, for example, Fibre Channel storage.

(2) Assumes 5 copies of the production SAP database.

(3) Assumes archived data is compressed by 80% and shared by all Tier 1 and Tier 2 instances.

(4) According to industry experts, the total cost of storage consists of the following components: 36% hardware (disk); 6% software; 44% personnel; 1% connectivity; 6% facilities (environment, physical space, etc.); 6% disaster recovery; and 2% other (training, etc.). This table assumes that the annual cost per gigabyte is $40 (Tier 1), $25 (Tier 2), and $12 (Tier 3).

(5) Assumes electrical consumption of storage devices is 0.13 kWh per terabyte for each hour of operation. For example, 9 terabytes × 0.13 × 8,760 hours is about 10,249 kWh per year. Source: Tape and Disk Costs: What It Really Costs to Power the Devices; The Clipper Group Explorer(TM); Analyst: Dianne McAdam; Report #TCG2006046; June 4, 2006

(6) The rolling 12-month average cost of commercial electricity was 10.12 cents per kilowatt hour in August 2010. Source: U.S. Energy Information Administration

(7) The average U.S. CO2 emissions rate is 1.58 pounds per kWh, based on the 7.18 × 10^-4 metric tons CO2 / kWh non-baseload national average emissions rate for converting kilowatt-hours into avoided units of carbon dioxide emissions. Source: eGRID2007 Version 1.1, U.S. annual non-baseload CO2 output emission rate, year 2005 data; U.S. Environmental Protection Agency, Washington, DC.

(8) Assumes that 40% of the production data is archived in Year 1.

(9) Assumes that the SAP production database grows at an annual rate of 15%.

(10) Assumes that 10% of the data is archived in each of Years 2-10.

Is Your Data Ready for the New Year?

2010 is winding down and some of you may already have visions of sugar plums displacing more mundane thoughts of Enterprise Information Management. Take a moment to look back over the year and contemplate the amount of data that you have accumulated. Then, think about next year. More of the same, right? Well, sort of. Most of you will find that you accumulated much more data this year than last year, and that the rate at which you are ingesting new data is increasing well beyond the rate at which you are able to get rid of it. The rate of data growth has simply become unsustainable for most organizations. And it will get worse next year. Why? Because the composition of your data keeps shifting from structured, easily managed databases to unstructured, hard-to-manage rich media from more and more new sources. Visions of photos, sound and video, multimedia presentations, wikis, blogs, and tweets – oh my! – are displacing the sugar plums, aren’t they?

For starters, let’s look beyond the data that you accumulated this year and contemplate the data that you had at the beginning of the year and that you still have going into the New Year. Some of it is legacy data and some is historical data from your production systems. This article looks at legacy data and legacy systems and their impact on your efforts to become a Green IT organization. Next time, we will look at the historical data that you could be archiving from your production SAP system.

By “legacy data”, I am referring to data that lives in some old application and that you are keeping because you don’t know how to get rid of it. The software that created it and the hardware that provides access to it are outmoded or obsolete. The data is “important” because there are business users who demand that you keep the legacy system running. You would like to replace the servers with ones that consume less energy, but the applications preclude you from doing so.

What is the impact of these legacy systems on your Green IT strategy? The average mid-range enterprise server consumes 10,635 kWh per year and the average high-end enterprise server consumes a whopping 142,017 kWh per year! Referring to the table below, that means that each such legacy server is responsible for roughly 17,000 to 224,000 pounds of carbon dioxide emissions annually. Most large enterprises have hundreds of such servers.

| Server Type | Average Watts per Server (1) | Average kWh per Year per Server (1) | Annual Electricity Cost (2) | Annual Carbon Footprint (lbs. CO2) (3) |
| --- | --- | --- | --- | --- |
| Volume Server | 222 | 3,889 | $396 | 6,145 |
| Mid-range Enterprise Server | 607 | 10,635 | $1,082 | 16,803 |
| High-end Enterprise Server | 8,106 | 142,017 | $14,443 | 224,387 |
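The kWh and carbon columns can be reproduced from the wattage alone. This sketch applies the assumptions stated in references (1) and (3): servers run 100% of the year, total consumption is twice the direct server draw, and emissions are 1.58 pounds of CO2 per kWh. The function and variable names are illustrative.

```python
def server_annual_kwh(avg_watts, overhead_factor=2.0, hours_per_year=8760):
    """Annual kWh per server, including cooling and auxiliary equipment."""
    return avg_watts * overhead_factor * hours_per_year / 1000


def annual_co2_lbs(kwh, lbs_per_kwh=1.58):
    """Annual CO2 emissions in pounds for a given electricity use."""
    return kwh * lbs_per_kwh


servers = {
    "Volume Server": 222,
    "Mid-range Enterprise Server": 607,
    "High-end Enterprise Server": 8106,
}
for name, watts in servers.items():
    kwh = server_annual_kwh(watts)
    print(f"{name}: {kwh:,.0f} kWh/yr, {annual_co2_lbs(kwh):,.0f} lbs CO2/yr")
```

Multiplying the result by the number of legacy servers you can unplug gives a first estimate of the annual reduction.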

 

If the business users of these legacy systems are SAP users, they would like to have all of the information that they use available directly in SAP. However, migrating the legacy data and requisite applications is generally not practical.

We have assisted many customers in addressing this dilemma by converting the legacy information into reports that are archived into the same archive server that also supports other SAP archiving needs. An index is created in the SAP database so that the business users can access the legacy information in a quick and efficient manner – without leaving the SAP application interface. This is a popular scenario for SAP Document Access by OpenText.

The result is a win-win-win situation. The business users are happy because they have ongoing access to their important legacy information without having to bring up the legacy applications. IT is happy because they can unplug the legacy servers and eliminate the cost of operating and supporting them. And, the environment gets a little bit greener because less carbon is emitted from electrical energy production.

I do have two cautions for you. First, be careful how you dispose of the legacy servers, so that you minimize the impact of e-waste. Many server vendors, including Green IT Community members HP and Bull, have programs that will guide you to recycle or dispose of the servers in compliance with local requirements. Second, before shipping any data storage devices, be sure to securely erase the data or physically destroy the device so that your important information does not end up on WikiLeaks.

Next time, we will look at the environmental impact of historical data in your SAP database that can be archived. Meanwhile, you can learn more about how SAP and its partners support Green IT through the SAP Sustainability Map on the EcoHub.

How many legacy systems is your organization bringing with it into 2011? Do you have a strategy for decommissioning legacy systems? How are you disposing of decommissioned servers and data storage devices?

References:

(1) To arrive at this estimate, the servers are assumed to operate 100% of the year, and the total electricity consumption (including cooling and auxiliary equipment) is assumed to be twice the direct server power consumption, based on typical industry practice. Source: Estimating Total Power Consumption by Servers in the U.S. and the World, Jonathan G. Koomey, Ph.D., February 15, 2007

(2) The rolling 12-month average cost of commercial electricity was 10.12 cents per kilowatt hour in August 2010. Source: U.S. Energy Information Administration

(3) The average U.S. CO2 emissions rate is 1.58 pounds per kWh, based on the 7.18 × 10^-4 metric tons CO2 / kWh non-baseload national average emissions rate for converting kilowatt-hours into avoided units of carbon dioxide emissions. Source: eGRID2007 Version 1.1, U.S. annual non-baseload CO2 output emission rate, year 2005 data; U.S. Environmental Protection Agency, Washington, DC.
