Put Your Data on the Storage & Backup Systems Diet



Overview

Data is growing at a rate of 50% per year or more, requiring you to purchase and consume more and more disk. You may now be under considerable pressure to manage these massive increases in data stored in production environments and put a halt to the capital expenditure on additional disk.

Using storage as efficiently as possible should now be a critical objective for your organisation. Follow this five-step plan to take back control of your data storage.

  1. Remove non-business-related files
  2. Archive old or infrequently used data
  3. Remove Duplicate Files
  4. Remove Duplicate Data
  5. Compress the remaining data you have

1) Remove non-business-related files

Like a 7-day detox, this can provide a very quick fix. Use a tool such as fileinsight from Brocade (brocade.com/support/fileinsight.jsp), which scans both Windows file servers and non-Windows NAS devices and can generate reports on file data quantity, age, size, types and other file metadata statistics to identify candidates for removal. However, be aware that deleting all MP3s may also remove your CEO’s favourite podcast or HR’s training library; the key is understanding and identifying the different data. Analyse your findings and ensure you understand your users’ data requirements before applying any company-wide policies. Discuss with your systems integrator which SRM (Storage Resource Management) or HSM (Hierarchical Storage Management) tools can help you automate the process.
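To get a feel for the kind of report such a tool produces, the following is a minimal Python sketch that walks a directory tree and totals files and bytes by extension and by age. It is not how fileinsight works internally; the function name summarise_tree and the age bands (under 1 year, 1 to 3 years, over 3 years) are assumptions chosen for illustration.

```python
import os
import time
from collections import defaultdict


def summarise_tree(root, now=None):
    """Total file count and bytes per extension and per age band under `root`.

    Age is based on last-modified time; the three bands are illustrative
    assumptions, not an industry standard.
    """
    now = time.time() if now is None else now
    by_ext = defaultdict(lambda: {"files": 0, "bytes": 0})
    by_age = defaultdict(lambda: {"files": 0, "bytes": 0})
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            # Lower-case the extension so .MP3 and .mp3 are counted together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            age_days = (now - st.st_mtime) / 86400
            if age_days < 365:
                band = "<1y"
            elif age_days < 1095:
                band = "1-3y"
            else:
                band = ">3y"
            for bucket in (by_ext[ext], by_age[band]):
                bucket["files"] += 1
                bucket["bytes"] += st.st_size
    return dict(by_ext), dict(by_age)
```

A report like this only tells you where the bulk is; it does not tell you which files are business related, so review the totals with the data owners before deleting anything.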

2) Archive old or infrequently used data

Once you have analysed which data is old, orphaned (i.e. files that are no longer accessible because their applications have been retired), infrequently accessed, non-mission-critical and so on, create an archiving policy to move this data to less expensive tier 2 disk or tape. File and email archiving is well served by market-proven solutions such as EMC DiskXtender and EmailXtender, Symantec Enterprise Vault etc., but before choosing a solution it is essential to complete the following tasks:

  1. Set a data retention policy
  2. Integrate data retention policies into one archiving system if possible
  3. Enforce data retention based on a published central retention document
  4. Store data in classes appropriate to the age of data and its access requirements
  5. Relocate inactive data to the archive
  6. Retain application transparency for users regardless of where the data resides
  7. Have the relevant agency in each country you operate in confirm that your archiving methodology is legal

Once the above is defined, work with your systems integrator to ensure that the solution offered meets your exact requirements.
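As an illustration of storing data in classes appropriate to its age and access requirements, the Python sketch below assigns files to a storage class from the number of days since last access. The thresholds (180 and 730 days), the class names and the FileRecord type are all hypothetical; your own retention policy document should supply the real values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRecord:
    path: str
    days_since_access: int
    size_bytes: int


# Hypothetical thresholds; take the real ones from your retention policy.
TIER2_AFTER_DAYS = 180
TAPE_AFTER_DAYS = 730


def storage_class(record):
    """Return the storage class a file belongs in under the assumed policy."""
    if record.days_since_access >= TAPE_AFTER_DAYS:
        return "tape"
    if record.days_since_access >= TIER2_AFTER_DAYS:
        return "tier2"
    return "primary"


def plan_archive(records):
    """Group file paths by target storage class without moving anything."""
    plan = {"primary": [], "tier2": [], "tape": []}
    for record in records:
        plan[storage_class(record)].append(record.path)
    return plan
```

Producing a plan first, rather than moving files immediately, lets you review the candidates against the published retention document before any data is relocated.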

3) Remove Duplicate Files

A simple fix is to remind users to store documents in a shared area rather than keeping a personal copy of a commonly used document in their own user directory. How many copies of the same organisation chart, company overview PowerPoint or expenses spreadsheet template do you have?

Your future storage infrastructure investment should consider how it deals with file deduplication, also referred to as single-instance storage (SIS), which compares a file to be stored, backed up or archived with those already stored by checking its attributes against an index. If the file is unique, it is stored and the index is updated; if not, only a pointer to the existing file is stored. The result is that only one instance of the file is saved and subsequent copies are replaced with a “stub” that points to the original file. Vendors such as NetApp are now adding this feature to their file systems as a no-cost option.
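The store-once-and-point idea can be sketched in a few lines of Python. This is an in-memory illustration only, and it assumes the index is keyed on a SHA-256 hash of the file content; real products may compare other attributes, and the class name SingleInstanceStore is hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Keep one copy of each unique file content; later copies become stubs."""

    def __init__(self):
        self._blobs = {}  # content digest -> content, stored once
        self._stubs = {}  # file name -> content digest (the "stub")

    def put(self, name, content):
        """Store `content` under `name`; return True if it was new data."""
        digest = hashlib.sha256(content).hexdigest()
        is_new = digest not in self._blobs
        if is_new:
            self._blobs[digest] = content  # unique: store it and update the index
        self._stubs[name] = digest  # every name just points at the stored copy
        return is_new

    def get(self, name):
        """Follow the stub back to the single stored instance."""
        return self._blobs[self._stubs[name]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self._blobs.values())
```

If ten users each save the same organisation chart, put returns True once and False nine times, and stored_bytes grows by the size of one copy only.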

4) Remove Duplicate Data

Advanced block-level data reduction and data de-duplication techniques have been used in the secondary storage environment to cost-effectively address the avalanche of backup and archival data that organisations must keep, delivering more than 2x compression and in many cases over 20x compression on backup and archival data. However, the algorithms and techniques used are not well suited to the primary storage environment.

Primary storage requires something very different: real-time compression and optimisation without performance degradation, and without the need to change existing infrastructure. New approaches and algorithms are required in order to bring the same level of optimisation to the production storage environment.

When evaluating block-level deduplication, it is important to understand the different methodologies, which include:

(1) In-line processing data de-duplication

(2) Post-processing data de-duplication

(3) Parallel processing I/O de-duplication

Ensure that your systems integrator explains which methodology is best suited to your requirements before recommending a specific vendor’s solution.
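Whichever methodology is used, the underlying idea of block-level deduplication is the same: split the data into blocks, store each unique block once and keep a list of references to rebuild the original. The Python sketch below shows that idea under simplifying assumptions: fixed 4096-byte blocks, a SHA-256 index held in memory, and function names invented for this example. Commercial products typically use more sophisticated chunking and indexing.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size for this sketch


def dedupe_blocks(data, block_size=BLOCK_SIZE):
    """Split `data` into blocks and store each unique block once.

    Returns (store, recipe): `store` maps block digest -> block bytes and
    `recipe` is the ordered list of digests needed to rebuild the data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).digest()
        store.setdefault(digest, block)  # keep only the first copy of a block
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe):
    """Reassemble the original data from the stored blocks."""
    return b"".join(store[digest] for digest in recipe)


def reduction_ratio(data, store):
    """Original size divided by unique block bytes (index overhead ignored)."""
    stored = sum(len(block) for block in store.values())
    return len(data) / stored if stored else 1.0
```

In this sketch the hashing happens as the data arrives, which resembles in-line processing; a post-processing design would write the data first and run the same comparison afterwards.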

5) Compress the remaining data you have

The data that you need to be available on primary storage can be further slimmed down by compressing it so it takes up less space.

There are two main types of data compression: lossy and lossless.

(1) Lossy data compression. After lossy data compression is applied, the original file can never be recovered exactly, as data has been lost. This type of compression does have its uses in sound, video, graphics and picture files. An example of lossy compression is the MP3 format, which reduces file size by discarding parts of the sound that the human ear is unlikely to perceive. This compression method is clearly not acceptable for text-based files.

(2) Lossless data compression. Lossless data compression works by finding repeated patterns in a message and encoding those patterns in an efficient manner. Lossless data compression is ideal for text.
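The lossless property is easy to check for yourself. The short Python sketch below uses the standard library zlib module to compress some data, decompress it again and confirm that the result is identical to the original. The function name compress_report is an assumption for this example, and the saving you see will depend entirely on how repetitive your data is.

```python
import zlib


def compress_report(data, level=6):
    """Compress `data` with zlib and verify the round trip is exact."""
    packed = zlib.compress(data, level)
    restored = zlib.decompress(packed)
    return {
        "original_bytes": len(data),
        "compressed_bytes": len(packed),
        "lossless": restored == data,  # True when every byte is recovered
    }
```

Highly repetitive text compresses very well, whereas files that are already compressed (for example MP3 or JPEG) usually gain little.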

Traditionally, capacity optimisation has centred on secondary storage management techniques like de-duplication and high-density disk architectures such as SATA. But a new software class is changing the old equation by optimising primary storage as well as secondary. Products such as the Storwize STN-6000 can compress primary storage data by up to 20:1.

Benefits of the Storage & Backup Systems Diet

You may now be able to manage increases in data stored in your production environment and put a halt to the capital expenditure on additional disk, and you will be using your storage as efficiently as possible. You will also:

  • Reduce footprint
    • Reduces data centre costs – rack, hosting, rent
  • Reduce power consumption (disk and air-conditioning)
    • Less disk requires less power, creates less heat and reduces the cooling requirement
  • Potentially recover data more quickly
    • Smaller amounts of better structured data allow for faster recovery in the event of a loss or disaster

Summary

All these steps are essentially the foundation of a formal data management policy. In practice some of the steps can become quite complicated and require a great deal of thought and planning.

To ensure that this is an exercise that is undertaken only once and provides a platform for future data storage and management purchases, you should engage with a partner who will take the time not only to undertake a capacity-based assessment of your storage infrastructure, but also to understand your complete environment and develop a strategy that can be implemented in line with your business requirements.

Source by John Malabon
