The hardware and software to make an all-flash data center a reality are on the market and ready to go. IT leaders and data center managers know that flash makes a world of difference in application performance, but the one thing stopping enterprises from adopting this approach is the lack of a business case that demonstrates the overall price and total cost of ownership of an all-flash environment. What they don’t know is whether their enterprises need to, and should, make capital expenditures so that every single application (big or small, mission critical or not) achieves superior levels of performance.
Hello Cloud, Goodbye Constant Configuration
I have to admit that when I log into a Linux box and realize that I have some technical chops left, I get a deep feeling of satisfaction. I am also in the habit of spinning up a Windows Server in order to test network routes/ACLs in the cloud since I like using the Windows version of tools like Wireshark. Despite my love for being logged into a server, I do see the writing on the wall. Logging into a server to do installs or make configuration changes is fast becoming a thing of the past. Given the number of mistakes we humans make, it’s probably about time.
While discussing backups and recovery validation in my first blog of this series, “The Case for Disaster Recovery Validation“, I cited “…The secondary purpose of backups is to recover data from an earlier time, according to a user-defined data retention policy…” [Wikipedia]. In this blog, I will review data loss/retention in regard to backups and archives, and the difficulties inherent in specifying retention times.
The burgeoning cost of storing increased amounts of data may have put an unfair burden on most IT organizations because the business and application “users” are failing to specify the retention needs of the data created. Now, more than ever before, IT has to balance budgets with regulatory compliance, industry standards and company constraints. Retention times may vary from five days to fifty years, depending on the purpose of the data retention strategy, the type of data and the functional use of the data: financial, health, education, research, government, etc. Further, retention policies apply to one or both systems of data retention: backup and archiving.
Like most organizations, you probably are hosting your unstructured data on traditional NAS platforms. The days of storing this data on these legacy systems are coming to an end. Let’s look at some of the setbacks that plague traditional NAS:
- Expensive to scale
- Proprietary data protection – third-party backup software is needed to catalog and index
- Inability to federate your data between disparate storage platforms onsite or in the cloud
- High file counts, which can cripple performance, increase backup windows, and require additional flash technology for metadata management
- File count limitations
- High “per-TB” cost
- Some platforms are complex to administer
- High maintenance costs after Year 3
Disaster Recovery Planning (DRP) has gotten much attention in the wake of natural and man-made disasters in recent years. But executives continue to doubt the ability of IT to restore business IT infrastructure after a serious disaster. And this does not even include the increasing number of security breaches worldwide. By many reports, the confidence level in IT recovery processes is less than 30%, calling into question the vast amounts of investment poured into recovery practices and recovery products. Clearly, backup vendors are busy – see the compiled list of backup products and services at the end of this article (errors and omissions regretted).
If archiving is defined as intelligent data management, then neither Backup Technologies, nor Hierarchical Storage Management (HSM) techniques, nor Storage Resource Management (SRM) tools qualify; however, these continue to be leveraged for archiving as substitute products. Even “Information Lifecycle Management” that would benefit from archiving is now equated with archiving. This has led to a proliferation of archiving products that tend to serve different purposes for different organizations.
IT organizations have long valued the notion of preserving copies of data in case “work” got lost. In fact, with every occurrence of data disaster, the role of data backup operations has strengthened, and no company can do without a strategy in place. Since 1951, when Mauchly and Eckert ushered in the era of digital computing with the construction of UNIVAC, the computing industry has seen all kinds of media on which data could be kept for later recall: punch cards, magnetic tapes, floppy disks, hard drives, CD-R/RW, flash drives, DVD, Blu-ray and HD-DVD, to name a few. And the varying formats and delivery methods have helped create generations of vendors with competing technologies.
Backups had come of age, cloaked and dressed with a respectable name, “data protection”: the magic wand that was insurance for “data loss.” But they also became increasingly costly and hard to manage with data complexity, growth and retention. Thus came about the concept of “archiving,” defined simply as “long term data.” That, coupled with another smart idea for moving data to less expensive storage (tier), helped IT organizations to reduce costs. The HSM technique dovetails into tiered storage management, as it is really a method to move data that is not changing or not being accessed frequently. HSM was first implemented by IBM and also by DEC VAX/VMS systems. In practice, HSM is typically performed by dedicated software, such as IBM Tivoli Storage Manager, Oracle’s SAM-QFS, SGI DMF, Quantum StorNext or EMC Legato OTG DiskXtender.
On the other hand, SRM tools evolved as quota management tools for companies trying to deal with hard-to-control data growth, and now include SAN management functions. Many of the HSM players sell tools in this space as well: IBM Tivoli Storage Productivity Center, Quantum Vision, EMC Storage Resource Management Suite, HP Storage Essentials, HDS Storage Services Manager (Aptare) and NetApp SANscreen (Onaro). Other SRM products include Quest Storage Horizon (Monosphere), SolarWinds Storage Profiler (Tek-Tools) and CA Storage Resource Manager. Such tools are able to provide analysis, create reports and target inefficiencies in the system, creating a “containment” approach to archiving.
Almost as old as the HSM technique is the concept of Information Lifecycle Management (ILM). ILM recognizes archiving as an important function distinct from backup. In 2004, SNIA gave ILM a broader definition by aligning it with business processes and value, while associating it with five functional phases: Creation and Receipt; Distribution; Use; Maintenance; Disposition. Storage and Backup vendors embraced the ILM “buzzword” and re-packaged their products as ILM solutions, cleverly embedding HSM tools in “policy engines.” And so, with these varied implementations of “archiving tools,” businesses have come to realize different levels of satisfaction.
Kelly J. Lipp, who today evaluates products from the Active Archive Alliance members, wrote (in 1999) the paper entitled “Why archive is archive, backup is backup and backup ain’t archive.” Kelly wrote this simple definition: “Backup is short term and archive is long term.” He then ended the paper with this profound statement: “We can’t possibly archive all of the data, and we don’t need to. Use your unique business requirements and the proper tools to solve your backup and archive issues.”
However, the Active Archive Alliance promotes a “combined solution of open systems applications, disk, and tape hardware that gives users an effortless means to store and manage ALL their data.” ALL their data? Yes, say many of the pundits who rely on search engines to “mine” for hidden nuggets of information.
Exponential data growth is pushing all existing data management technologies to their limits, and newer locations for storing data (the latest being “storage clouds”) attempt to solve the management dilemma. But there is growing concern that, for the bulk of data that is “unstructured,” no orderly process exists to bring back the information that is of value to the business.
Although businesses rely on IT to safeguard data, the value of the information contained therein is not always known to IT. Working with available tools, IT chooses attributes such as age, size and location to measure the worth, and then executes “archiving” to move this data out, so that computing systems may perform adequately. But much like the clutter stored in our basements, data that collects meaninglessly may become “data bloat.” Data survival then depends on proper information classification and organization.
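To make the attribute-based approach concrete, here is a minimal sketch of selecting “archive” candidates purely by age and size. The function name and the thresholds are hypothetical, not taken from any particular HSM or SRM product, and the point is that nothing in it knows what the data is worth to the business.

```python
import time
from pathlib import Path


def archive_candidates(root, min_age_days=365, min_size_bytes=0, now=None):
    """Return files under root that look 'cold' by attributes alone.

    A file qualifies when it has not been modified for min_age_days
    and is at least min_size_bytes. Both thresholds are illustrative.
    """
    now = time.time() if now is None else now
    cutoff = now - min_age_days * 86400  # 86400 seconds per day
    picked = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        info = path.stat()
        if info.st_mtime <= cutoff and info.st_size >= min_size_bytes:
            picked.append(path)
    return sorted(picked)

# Hypothetical usage: archive_candidates("/srv/share", min_age_days=730)
```

A rule like this moves data out efficiently, but it says nothing about classification, which is the gap the rest of this article is about.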
Traditionally, data has seen formal organization in the form of databases—all variations of SQL, email and document management included. With the advent of Big Data and the use of ecosystems such as “Hadoop,” large databases now leverage flat file systems that are better suited for mapping search algorithms. And this may be considered as yet another form of archiving because data stored here is “immutable” anyway. All of these databases (and the many related applications) tend to have more formal archiving processes, but little visibility into the underlying storage. Newer legal and security requirements tend to focus on such databases, leading to the rise of “archiving” for compliance.
That brings us back full circle. While security and legality play a large role in today’s archiving world, one could argue that these tend to create “pseudo archives” that can be removed (deleted) when the stipulated time has passed. In contrast, a book or film on digital media adds to the important assets of a company that become the basis for its valuation and for future ideas. If one were to create a literary masterpiece, the file security surrounding the digitized asset is less consequential than the fact that 100 years later those files would still be valuable to the organization that owns them.
The meaning of archiving becomes clearer when viewed as distinctly different from backup. It is widely accepted that the purpose of a backup is to restore lost data. Thus, backup is preservation of “work in progress”: data that does not have to be understood, but resurrected as-is when needed. Archiving, on the other hand, is the preservation of a business’s digital assets: information that is valuable irrespective of time and needed when needed. The purpose of archiving is to hold assets in a meaningful way for later recall.
Backup is a simple IT process. Archiving is tied to business flow.
This suggests that archiving does not need “policy engines” and “security strongholds,” but rather information grouping, classification, search and association. Because these tend to be industry-specific, “knowledge engines” would be more appropriate for archiving tools. Increasingly, IT professional services are now working with businesses and vendors alike to bridge the gaps and bring about dramatic industry transformations through the implementation of intelligent archiving.
Backups have grown in importance since the days of early computing, and as technology has changed, so have the costs of preserving the data on different storage media. Backup technologies have also become substitute archiving tools, simply by setting long-term retention on the backed-up data.
With a plethora of tools and techniques developed to manage the storage growth, and contain the storage costs (the HSM techniques and the SRM tools), archiving has been implemented in different organizations for different purposes and with different meaning.
In defining Information Lifecycle Management, SNIA has elevated the importance of archiving, and thereby encouraged vendors to re-package HSM tools in policy engines. On the other hand, databases for SQL and email—and even Big Data ecosystems—have implemented archiving without visibility into the underlying storage.
As archiving tools continue to evolve, archiving is now considered distinctly different from backup. While backup protects “work in progress,” archiving preserves valuable business information. Unlike backups that need “policy engines,” archiving requires “knowledge engines” which may be industry-specific. IT professional services have stepped in to bridge the gaps and bring about transformations through the implementation of intelligent archiving.
1. “Why archive is archive, backup is backup and backup ain’t archive” by Kelly J. Lipp, 1999
Photo credit: Antaratma via Flickr
Symantec’s NetBackup has been in the business of protecting the VMware virtual infrastructures for a while. What we’ve seen over the last couple of versions is the maturing of a product that at this point works very well and offers several methods to back up the infrastructure.
Of course, the Query Builder is the mechanism that is used to create and define what is backed up. The choices can be as simple as servers in this folder, on this host or cluster—or more complex, defined by the business data retention needs.
Below are the high-level backup methods, with my thoughts on each and the merits thereof.
1: SAN Transport
To start, the VMware backup host must be a physical host in order to use the SAN transport. All LUNs (FC or iSCSI) that are used as datastores by the ESX clusters must also be masked and zoned (FC) to the VMware backup host.
When the backup process starts, the backup host can read the .vmdk files directly from the datastores using VADP (the vStorage APIs for Data Protection).
The obvious advantage here is that one can use the SAN fabric, bypassing the ESX hosts’ resources when backing up the virtual environment. In my experience, backup throughput is typically greater than with backups over Ethernet.
A Second Look
One concern I typically hear from customers, specifically from the VMware team, is that of presenting the same LUNs that are presented to the ESX cluster to the VMware backup host. There are a few ways to protect the data on these LUNs if this becomes a big concern, but I’ve never experienced any issues with a rogue NBU Admin in all the years I’ve been using this.
2: Hot-add Transport
Unlike the SAN Transport, a dedicated VMware backup host is not needed to back up the virtual infrastructure. For customers using filers such as NetApp or Isilon with NFS, Hot-add is for you.
Just like the SAN Transport, this offers protection by backing up the .vmdk’s directly from the datastores. Unlike the SAN Transport, the backup host (media server) can be virtualized saving additional cost on hardware.
A Second Look
While the above does offer some advantages over SAN Transport, the minor drawback is that ESX host resources are utilized in this method. Numerous factors determine how much impact, if any, this will have on your ESX farm.
3: NBD Transport
The backup method used with NBD is IP based. When the backup host starts a backup process, an NFC session is started between the backup host and the ESX host. Like Hot-add Transport, the backup host may be virtual.
The benefit of this option is that it is the easiest to configure and the simplest in concept compared to the other options.
A Second Look
As with everything in life, something easy always has drawbacks. The main one here is the cost in ESX host resources: they are definitely used, and the impact becomes more noticeable as more hosts are backed up.
With regard to NFC (Network File Copy), there is one NFC session per virtual server backup. If you were backing up 10 virtual servers off of one host, there would be 10 NFC sessions made to the ESX host VMkernel port (management port). While this won’t affect the virtual machine network, if your management network is 1Gb, that will be the bottleneck for backups of the virtual infrastructure. Plus, VMware limits the number of NFC sessions based upon the host’s transfer buffers, which total 32MB.
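A rough back-of-the-envelope sketch shows why the management network becomes the bottleneck. It assumes the sessions share the link evenly and ignores protocol overhead, so treat the numbers as a ceiling rather than a prediction; the function name is just for illustration.

```python
def per_session_mb_per_s(link_gbit_per_s, sessions):
    """Evenly divide a link's theoretical throughput (MB/s) across sessions."""
    if sessions < 1:
        raise ValueError("sessions must be at least 1")
    link_mb_per_s = link_gbit_per_s * 1000 / 8  # 1 Gb/s is 125 MB/s before overhead
    return link_mb_per_s / sessions


# 10 concurrent NFC sessions on a 1Gb management network
print(per_session_mb_per_s(1, 10))   # 12.5
# The same 10 sessions on a 10Gb management network
print(per_session_mb_per_s(10, 10))  # 125.0
```

At best, each of the 10 virtual server backups gets about 12.5MB/sec on a 1Gb management port.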
Wrap-up: Your Choice
While there are 3 options for backing up a virtual infrastructure, once you choose one, you are not limited to sticking with it. To get backups going, one could choose NBD Transport and eventually change to SAN Transport … that’s the power of change.
Photo credit: imuttoo
Avamar is a great tool to back up remote office file servers to a private or hybrid cloud infrastructure. However, performing an initial backup can be a challenge if the server holds more than a few GB and the connection to the remote office is less than 100Mb.
In this scenario, the recommended process is to “seed” the Avamar server with the data from the remote server. There are a number of devices that can be used to accomplish this. USB hard drives are the most often used; however, they can be painfully slow, as most modern servers only have USB 2.0 ports that can transfer at most around 60MB/sec, and the drives are limited to 3-4TB in size. Copying 3TB to a USB 2.0 drive will typically take 12-16 hours. Not unbearable, but quite a while.
Another option would be to install a USB 3.0 adapter card or eSATA card, but that requires shutting down the server and installing drivers, etc. An alternative that I have had a good deal of success with is using a portable NAS device like the Seagate GoFlex drives or, for larger systems, the Iomega/LenovoEMC StorCenter px4-300d. The px4-300d has an added feature that I will touch on later. These NAS devices leverage Gigabit Ethernet and can roughly double the transfer rate of USB 2.0.
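The arithmetic behind these estimates is easy to check. The sketch below uses decimal units and sustained best-case rates; the 120MB/sec figure for the NAS is simply my assumption for “roughly double” USB 2.0, and 12.5MB/sec is the theoretical ceiling of a 100Mb link.

```python
def transfer_hours(size_tb, rate_mb_per_s):
    """Hours needed to move size_tb at a sustained rate in MB/s."""
    size_mb = size_tb * 1_000_000  # decimal units: 1 TB = 1,000,000 MB
    return size_mb / rate_mb_per_s / 3600


print(round(transfer_hours(3, 60), 1))    # USB 2.0 at ~60 MB/s: 13.9
print(round(transfer_hours(3, 120), 1))   # Gigabit NAS at ~120 MB/s (assumed): 6.9
print(round(transfer_hours(3, 12.5), 1))  # 100Mb WAN at its ceiling: 66.7
```

Even at its theoretical best, a 100Mb link needs almost three days for the same 3TB, which is why seeding is worth the shipping.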
Moving the data to these “seeding” targets can be as simple as drag-and-drop or using a command line utility like Xcopy from Windows or Rsync from a Linux box once you plug in the USB device or mount a share from the NAS drives. When the data copy is complete, eject the USB drive or unmount the share, power down the unit, package the drive for shipping and send it back to the site where the Avamar grid lives.
At the target site attach the portable storage device to a client locally and configure a one-time backup of this data. With the Iomega device, it includes a pre-installed Avamar client that can be activated to the grid and backed up without having to go through an intermediary server.
Once you get this copy of the backup into the Avamar grid, activate and start the backup of the remote client. The client will hash its local data and compare it to what is on the grid, finding that the bulk of the unique data is already populated. This reduces the data that must be transferred to only what has changed or been added since the “seed”.
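The idea of hashing locally and sending only what the grid lacks can be sketched in a few lines. This is an illustration of the general technique only: the fixed 4-byte chunks and SHA-256 are my assumptions for the demo and are not a description of how Avamar actually chunks or hashes data.

```python
import hashlib


def chunk_hashes(data, chunk_size):
    """Hash every fixed-size chunk of data (fixed-size chunking is an assumption)."""
    return {
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    }


def chunks_to_send(data, known_hashes, chunk_size):
    """Return only the chunks whose hash the server does not already hold."""
    seen = set(known_hashes)
    missing = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in seen:
            missing.append(chunk)
            seen.add(digest)  # do not send the same new chunk twice
    return missing


seed = b"AAAABBBBCCCC"             # what was shipped on the seeding drive
grid = chunk_hashes(seed, 4)       # hashes the grid holds after the seed backup
current = b"AAAABBBBDDDDCCCC"      # remote server today: one new chunk
print(chunks_to_send(current, grid, 4))  # [b'DDDD']
```

Only the one new chunk crosses the WAN; everything that arrived with the seed is skipped.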
Photo credit: Macomb Paynes via Flickr
Storage Pools for the CX4 and VNX have been around a while now, but I still see a lot of people doing things that are against best practices. First, let’s start out talking about RAID Groups.
Traditionally, to present storage to a host you would create a RAID Group consisting of up to 16 disks. The most commonly used RAID Group types were R1/0, R5, R6, and Hot Spare. After creating your RAID Group, you would need to create a LUN on that RAID Group to present to a host.
Let’s say you have (50) 600GB 15K disks that you want to create RAID Groups on. You could create (10) R5 4+1 RAID Groups. If you wanted to have (10) 1TB LUNs for your hosts, you could create a 1TB LUN on each RAID Group. Each LUN then has the guaranteed performance of (5) 15K disks behind it, but at the same time, each LUN has at most the performance of (5) 15K disks.
[framed_box bgColor=”#F0F0F0″ textColor=”undefined” rounded=”true”] What if your LUNs require even more performance?
1. Create metaLUNs to keep it easy and effective.
2. Make (10) 102.4GB LUNs on each RAID Group, totaling (100) 102.4GB LUNs for your (10) RAID Groups.
3. Select the meta head from a RAID Group and expand it by striping it with (9) of the other LUNs from other RAID Groups.
4. For each of the other LUNs to expand you would want to select the meta head from a different RAID Group and then expand with the LUNs from the remaining RAID Groups.
5. That would then provide each LUN with the ability to have the performance of (50) 15K drives shared between them.
6. Once you have your LUNs created, you also have the option of turning FAST Cache (if configured) on or off at the LUN level.
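To see how the metaLUN steps above fit together, here is a small sketch that builds the layout: each metaLUN takes its head from a different RAID Group and is striped with one component LUN from each of the other nine. The function and field names are made up for illustration; this models the bookkeeping only and does not talk to an array.

```python
def metalun_layout(raid_groups=10, metaluns=10, component_gb=102.4):
    """Map each metaLUN to one component LUN per RAID Group.

    A component is identified as (raid_group, slot). MetaLUN m uses slot m
    in every RAID Group, with its head in RAID Group m, so no component
    LUN is used twice.
    """
    layout = []
    for m in range(metaluns):
        head_rg = m % raid_groups
        components = [((head_rg + k) % raid_groups, m) for k in range(raid_groups)]
        layout.append({
            "metalun": m,
            "head": components[0],       # the meta head
            "members": components[1:],   # the (9) LUNs it is striped with
            "size_gb": round(component_gb * raid_groups, 1),
        })
    return layout


for meta in metalun_layout()[:2]:
    print(meta["metalun"], meta["head"], len(meta["members"]), meta["size_gb"])
```

Each of the (10) metaLUNs ends up at 1024GB and touches all (10) RAID Groups, which is where the shared performance of (50) drives comes from.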
Depending on your performance requirement, things can quickly get complicated using traditional RAID Groups.
This is where CX4 and VNX Pools come into play.
[/framed_box] EMC took the typical RAID Group types (R1/0, R5, and R6) and made it so you can use them in Storage Pools. The chart below shows the different options for the Storage Pools. The asterisk notes that the 8+1 option for R5 and the 14+2 option for R6 are only available in the VNX OE 32 release.
Now on top of that you can have a Homogeneous Storage Pool – a Pool with only like drives, either all Flash, SAS, or NLSAS (SATA on CX4), or a Heterogeneous Storage Pool – a Storage Pool with more than one tier of storage.
If we take our example of having (50) 15K disks using R5 for RAID Groups and we apply them to pools we could just create (1) R5 4+1 Storage Pool with all (50) drives in it. This would then leave us with a Homogeneous Storage Pool, visualized below.
The chart to the right displays what will happen underneath the Pool, as it will create the same structure as the traditional RAID Groups. We would end up with a Pool that contains (10) R5 4+1 RAID Groups underneath that you wouldn’t see; you would only see the (1) Pool with the combined storage of the (50) drives. From there you would create your (10) 1TB LUNs on the Pool, and it will spread the LUNs across all of the RAID Groups underneath automatically. It does this by creating 1GB chunks and spreading them evenly across the hidden RAID Groups. You could also turn FAST Cache on or off at the Storage Pool level (if configured).
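As a mental model of that even spreading, the sketch below places the 1GB chunks of a single LUN across the hidden RAID Groups in simple round-robin order. Round-robin is my stand-in for “evenly”; the array’s real allocation logic is not something I am describing here.

```python
from collections import Counter


def place_slices(lun_size_gb, raid_groups, start=0):
    """Assign each 1 GB slice of a LUN to a RAID Group index, round-robin."""
    return [(start + s) % raid_groups for s in range(lun_size_gb)]


placement = place_slices(1024, 10)   # one 1TB LUN over (10) hidden RAID Groups
counts = Counter(placement)
print(sorted(counts.items()))
```

Every RAID Group ends up holding 102 or 103 of the LUN’s 1024 chunks, so the LUN draws on all (50) drives without any of the metaLUN bookkeeping.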
On top of that, the other advantage to using a Storage Pool is the ability to create a Heterogeneous Storage Pool, which allows you to have multiple tiers where the ‘hot’ data will move up to the faster drives and the ‘cold’ data will move down to the slower drives.
Another thing that can be done with a Storage Pool is create thin LUNs. The only real advantage of thin LUNs is to be able to over provision the Storage Pool. For example if your Storage Pool has 10TB worth of space available, you could create 30TB worth of LUNs and your hosts would think they have 30TB available to them, when in reality you only have 10TB worth of disks.
The problem with this is that the hosts think they have more space than they really do, so when the Storage Pool starts to get full, there is the potential to run out of space and have hosts crash. They may not crash, but it’s safer to assume that they will crash or that data will become corrupt: when a host tries to write data because it thinks it has space, but really doesn’t, something bad will happen.
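If you do over provision, it helps to keep the ratio and the real consumption in front of you. Here is a minimal sketch of that check using the 10TB/30TB example; the 70% warning threshold is an arbitrary number I picked for illustration, not an EMC recommendation.

```python
def thin_pool_status(pool_tb, lun_sizes_tb, consumed_tb, warn_pct=70.0):
    """Summarize how over provisioned a thin pool is and how full it really is."""
    provisioned_tb = sum(lun_sizes_tb)
    pct_full = 100.0 * consumed_tb / pool_tb
    return {
        "provisioned_tb": provisioned_tb,
        "oversubscription_ratio": provisioned_tb / pool_tb,
        "pct_full": pct_full,
        "warn": pct_full >= warn_pct,  # threshold is an assumed example value
    }


# 10TB of real disk, 30TB of thin LUNs, 8TB actually written so far
status = thin_pool_status(10, [10, 10, 10], consumed_tb=8)
print(status["oversubscription_ratio"], status["pct_full"], status["warn"])  # 3.0 80.0 True
```

The hosts still believe 22TB is free at this point, while the pool has only 2TB left.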
In my experience, people typically want to use thin LUNs only for VMware yet will also make the Virtual Machine disk thin as well. There is no real point in doing this. Creating a thin VM on a thin LUN will grant no additional space savings, just additional overhead for performance as there is a performance hit when using thin LUNs.
After the long intro to how Storage Pools work (and it was just a basic introduction; I left out quite a bit that I could’ve gone over in detail), we get to the part about what to do and what not to do.
Creating Storage Pools
Choose the correct RAID Type for your tiers. At a high level, R1/0 is for write-intensive applications, R5 is for read-intensive applications, and R6 is typically used on large NLSAS or SATA drives, where it is highly recommended due to the long rebuild times associated with those drives.
Use the number of drives in the preferred drive count options. This isn’t always the case as there are ways to manipulate how the RAID Groups underneath are created but as a best practice use that number of drives.
Keep in mind the size of your Storage Pool. If you have FAST Cache turned on for a very large Storage Pool and not a lot of FAST Cache, it is possible the FAST Cache will be used very ineffectively.
If there is a disaster, the larger your Storage Pool, the more data you can lose. For example, data in the pool can be lost if one of the RAID Groups underneath suffers a dual drive fault in R5, a triple drive fault in R6, or the loss of the wrong (2) disks in R1/0 (both halves of a mirrored pair).
Expanding Storage Pools
Use the number of drives in the preferred drive count options. If it is on a CX4 or a VNX that is pre VNX OE 32, the best practice is to expand by the same number of drives already in the tier that you are expanding (that is, to double it), as the data will not relocate within the same tier. If it is a VNX on at least OE 32, you don’t need to double the size of the pool, as the Storage Pool has the ability to relocate data within the same tier of storage, not just up and down tiers.
Be sure to use the same drive speed and size for the tier you are expanding. For example, if you have a Storage Pool with 15K 600GB SAS drives, you don’t want to expand it with 10K 600GB SAS drives as they will be in the same tier and you won’t get consistent performance across that specific tier. This would go for creating Storage Pools as well.
Graphics by EMC
Regular database backups of Microsoft Exchange environments are critical to maintaining the health and stability of the databases. Performing full backups of Exchange provides a database integrity checkpoint and commits transaction logs. There are many tools which can be leveraged to protect Microsoft Exchange environments, but one of the key challenges with traditional backups is the length of time that it takes to back up prior to committing the transaction logs.
Additionally, the database integrity should always be checked prior to backing up, to ensure the data being backed up is valid. This extended time often can interfere with daily activities, so it usually must be scheduled around other maintenance activities, such as daily defragmentation. What if you could eliminate the backup window time?
EMC RecoverPoint in conjunction with EMC Replication Manager can create application-consistent replicas with next to zero impact that can be used for staging to tape, direct recovery, or object-level recovery with Recovery Storage Groups or third-party applications. These replicas leverage Microsoft VSS technology to freeze the database, RecoverPoint bookmark technology to mark the image time in the journal volume, and then thaw the database in less than thirty seconds, and often in less than five seconds.
EMC Replication Manager is aware of all of the database server roles in the Microsoft Exchange 2010 Database Availability Group (DAG) infrastructure and can leverage any of the members (Primary, Local Replica, or Remote Replica) to be a replication source.
EMC Replication Manager automatically mounts the bookmarked replica images to a mount host running the Microsoft Exchange tools role and the EMC Replication Manager agent. The database and transaction logs are then verified using the eseutil utility provided with the Microsoft Exchange tools. This ensures that the replica is a valid, recoverable copy of the database. The validation of the databases can take from a few minutes to several hours, depending on the number and size of databases and transaction log files. The key point is that the load from this process does not impact the production database servers. Once the verification completes, EMC Replication Manager calls back to the production database to commit and delete the transaction logs.
Once the Microsoft Exchange database and transaction logs are validated, the files can be spun off to tape from the mount host, or depending on the retention requirement – you could eliminate tape backups of the Microsoft Exchange environment completely. Depending on the write load on the Microsoft Exchange server and how large the journal volumes for RecoverPoint are, you can maintain days or even weeks of retention/recovery images in a fairly small footprint – as compared to disk or tape based backup.
There are a number of recovery scenarios that are available from a solution based on RecoverPoint and Replication Manager. The images can be reverse-synchronized to the source – this is a fast delta-based copy, but is data destructive. Alternatively, the database files could be copied from the mount host to a new drive and mounted as a recovery storage group on the Microsoft Exchange server. The database and log files can also be opened on the mount host directly with tools such as Kroll OnTrack for mailbox and message-level recovery.
Photo Credit: pinoldy