
Advice from the Expert: Best Practices in Utilizing Storage Pools

By | Backup, Cisco, Data Loss Prevention, EMC, How To, Log Management, Networking, Storage, VMware | No Comments

Storage Pools for the CX4 and VNX have been around for a while now, but I still see a lot of people doing things that go against best practices. First, let’s start by talking about RAID Groups.

Traditionally, to present storage to a host you would create a RAID Group, which consisted of up to 16 disks; the most typically used RAID Group types were R1/0, R5, R6, and Hot Spare. After creating your RAID Group, you would create a LUN on it to present to a host.

Let’s say you have (50) 600GB 15K disks on which you want to create RAID Groups: you could create (10) R5 4+1 RAID Groups. If you wanted (10) 1TB LUNs for your hosts, you could create a 1TB LUN on each RAID Group. Each LUN would then have the guaranteed performance of (5) 15K disks behind it – but at the same time, each LUN has at most the performance of those (5) 15K disks.
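To make that tradeoff concrete, here is a quick back-of-the-envelope sketch in Python (my own illustration, not EMC tooling), using the common rule-of-thumb figure of ~180 IOPS per 15K disk:

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
# Rough math for (10) R5 4+1 RAID Groups built from (50) 600GB 15K disks,
# with one 1TB LUN per group. Illustrative only.
DISKS = 50
GROUP_SIZE = 5             # R5 4+1: 4 data disks + 1 parity
DATA_DISKS = 4
DISK_GB = 600
IOPS_PER_15K_DISK = 180    # common rule-of-thumb figure

groups = DISKS // GROUP_SIZE
print(f"RAID Groups: {groups}")                          # 10
print(f"Usable per group: {DATA_DISKS * DISK_GB} GB")    # 2400 GB, room for a 1TB LUN

# Each LUN is pinned to one group, so the 5 disks behind it are both
# its performance floor and its ceiling.
print(f"Per-LUN backend IOPS ceiling: {GROUP_SIZE * IOPS_PER_15K_DISK}")  # 900
[/framed_box]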
[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"] What if your LUNs require even more performance?

1. Create metaLUNs to keep it easy and effective.

2. Make (10) 102.4GB LUNs on each RAID Group, totaling (100) 102.4GB LUNs for your (10) RAID Groups.

3. Select the meta head from a RAID Group and expand it by striping it with (9) of the other LUNs from other RAID Groups.

4. For each subsequent metaLUN, select the meta head from a different RAID Group, then expand it with LUNs from the remaining RAID Groups.

5. Each metaLUN then has the performance of all (50) 15K drives shared among them.

6. Once you have your LUNs created, you also have the option of turning FAST Cache (if configured) on or off at the LUN level.

Depending on your performance requirement, things can quickly get complicated using traditional RAID Groups.

This is where CX4 and VNX Pools come into play.
[/framed_box] EMC took the typical RAID Group types – R1/0, R5, and R6 – and made it so you can use them in Storage Pools. The chart below shows the different options for Storage Pools; the asterisks note that the 8+1 option for R5 and the 14+2 option for R6 are only available in the VNX OE 32 release.

[Chart: Storage Pool RAID options per tier]

Now on top of that, you can have a Homogeneous Storage Pool – a Pool with only like drives, either all Flash, SAS, or NL-SAS (SATA on the CX4) – or a Heterogeneous Storage Pool – a Storage Pool with more than one tier of storage.

If we take our example of (50) 15K disks using R5 RAID Groups and apply it to Pools, we could simply create (1) R5 4+1 Storage Pool with all (50) drives in it. This leaves us with a Homogeneous Storage Pool, visualized below.

[Diagram: Homogeneous Storage Pool]

The chart to the right displays what happens underneath the Pool: it creates the same structure as the traditional RAID Groups. We would end up with a Pool containing (10) R5 4+1 RAID Groups underneath that you wouldn’t see; you would only see the (1) Pool with the combined storage of the (50) drives. From there you would create your (10) 1TB LUNs on the Pool, and it would spread the LUNs across all of the RAID Groups underneath automatically. It does this by slicing the LUNs into 1GB chunks and spreading them evenly across the hidden RAID Groups. You can also turn FAST Cache on or off at the Storage Pool level (if configured).
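A minimal sketch of that chunking behavior, assuming simple round-robin placement (the array’s actual allocator is more sophisticated, but the effect is the same):

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
# Spread one 1TB LUN in 1GB chunks across the (10) hidden R5 4+1 RAID Groups.
from collections import Counter

HIDDEN_RAID_GROUPS = 10
LUN_CHUNKS = 1024                      # 1TB LUN sliced into 1GB chunks

placement = Counter()
for chunk in range(LUN_CHUNKS):
    placement[chunk % HIDDEN_RAID_GROUPS] += 1   # round-robin across groups

for rg, chunks in sorted(placement.items()):
    print(f"Hidden RAID Group {rg}: {chunks} x 1GB chunks")
# Each group holds ~1/10th of the LUN, so all 50 spindles serve it.
[/framed_box]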

On top of that, the other advantage to using a Storage Pool is the ability to create a Heterogeneous Storage Pool, which allows you to have multiple tiers where the ‘hot’ data will move up to the faster drives and the ‘cold’ data will move down to the slower drives.

Another thing that can be done with a Storage Pool is creating thin LUNs. The only real advantage of thin LUNs is the ability to over-provision the Storage Pool. For example, if your Storage Pool has 10TB worth of space available, you could create 30TB worth of LUNs; your hosts would think they have 30TB available to them, when in reality you only have 10TB worth of disks.
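A hedged sketch of that over-provisioning math, using the numbers from the example (the 80% alert threshold is my illustrative choice, not an array default):

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
# 30TB of thin LUNs carved from a 10TB pool.
pool_tb = 10
thin_luns_tb = [10, 10, 10]            # what the hosts *think* they have

subscribed = sum(thin_luns_tb)
print(f"Subscribed: {subscribed} TB on {pool_tb} TB physical "
      f"({subscribed / pool_tb:.0%} subscribed)")

consumed_tb = 9.2                      # hypothetical current usage
if consumed_tb / pool_tb > 0.8:        # alert well before 100%
    print("WARNING: pool nearly full, yet hosts still see ~20TB of 'free' space")
[/framed_box]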

The problem arises when the Storage Pool starts to get full: the hosts think they have more space than they really do, so there is the potential to run out of space and have hosts crash. They may not crash, but it’s safer to assume they will, or that data will become corrupt, because when a host tries to write data it thinks it has room for and the space isn’t there, something bad will happen.

In my experience, people typically want to use thin LUNs only for VMware, yet they will also make the virtual machine disks thin. There is no real point in doing this: creating a thin VM on a thin LUN grants no additional space savings, just additional overhead, as there is a performance hit when using thin LUNs.

After the long intro to how Storage Pools work (and it was just a basic introduction – I left out quite a bit I could have gone over in detail), we get to the part about what to do and what not to do.

Creating Storage Pools

Choose the correct RAID type for your tiers. At a high level, R1/0 is for write-intensive applications, R5 is for read-heavy workloads, and R6 is typically used on large NL-SAS or SATA drives – and highly recommended there – due to the long rebuild times associated with those drives.

Use the number of drives in the preferred drive count options. This isn’t an absolute rule, as there are ways to manipulate how the RAID Groups underneath are created, but as a best practice stick to those drive counts.

Keep in mind the size of your Storage Pool. If you have FAST Cache turned on for a very large Storage Pool but only a small amount of FAST Cache, the FAST Cache can end up being used very inefficiently.

If there is a disaster, the larger your Storage Pool, the more data you can lose – for example, if one of the RAID Groups underneath suffers a double drive fault in R5, a triple drive fault in R6, or loses both disks of the same mirror pair in R1/0.

Expanding Storage Pools

Use the number of drives in the preferred drive count options. If it is a CX4, or a VNX running pre-OE 32 code, the best practice is to expand by the same number of drives already in the tier you are expanding, because data will not relocate within a tier. On a VNX running at least OE 32, you don’t need to double the size of the pool, as the Storage Pool can relocate data within the same tier of storage, not just up and down tiers (see the sketch below).

Be sure to use the same drive speed and size for the tier you are expanding. For example, if you have a Storage Pool with 15K 600GB SAS drives, you don’t want to expand it with 10K 600GB SAS drives as they will be in the same tier and you won’t get consistent performance across that specific tier. This would go for creating Storage Pools as well.
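Here is a small helper, purely my own sketch of the expansion rule of thumb above, that picks a drive count based on the array’s OE release:

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
# Pre-OE32 arrays don't rebalance within a tier, so expand a tier by its
# current drive count; OE32+ relocates within a tier, so any multiple of
# the preferred group size (e.g. 5 for R5 4+1) will level out over time.
def drives_to_add(current_tier_drives: int, preferred_group: int,
                  oe32_or_later: bool) -> int:
    if not oe32_or_later:
        return current_tier_drives     # double the tier
    return preferred_group             # smallest balanced increment

print(drives_to_add(50, 5, oe32_or_later=False))   # CX4 / pre-OE32 -> add 50
print(drives_to_add(50, 5, oe32_or_later=True))    # OE32+ -> add 5 (or more)
[/framed_box]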

Graphics by EMC


Your Go To Guide For IT Optimization & Cloud Readiness, Part I

By | Cloud Computing, How To, Networking, Virtualization | No Comments

As a Senior IT Engineer, I spend a lot of time in the field talking with current and potential clients. Over the last two years I began to see a trend in the questions that company decision makers were asking, revolving around developing and executing the right cloud strategy for their organization.

Across all the companies I’ve worked with, there are three major areas that C-level folks routinely inquire about: reducing cost, improving operations, and reducing risk. Over the years I’ve learned that an accurate assessment of the organization is imperative, as it’s the key to understanding the current state of the company’s IT infrastructure, people, and processes. From discovering these key items, I’ve refined the following framework to help decision makers effectively become cloud ready.

Essentially, IT infrastructure optimization and cloud readiness adhere to the same maturity curve, moving upstream from standardized to virtualized/consolidated and then converged. From there, the remaining journey is about automation and orchestration. Where an organization currently resides within that framework dictates my recommendations for tactical next steps toward more strategic goals.

Standardization is the first topic to explore, as it is the base of all business operations and direction. The main driver to standardize is the effort to reduce the number of server and storage platforms in the data center.

The more operating systems and hardware management consoles your administrators need to know, the less efficient they become.  There’s little use for Windows Server 2003 expertise in 2013, and it is important to find a way to port the app to your current standard.  The fewer standards your organization maintains, the fewer variables exist when troubleshooting issues. Ultimately, fewer standards allow IT to return its focus to initiatives essential to the business.  Implementing asset life-cycle policies can limit costly maintenance on out-of-warranty equipment and ensure your organization is always taking advantage of advances in technology.

After implementing a higher degree of standardization, organizations are better equipped to take the next step by moving to a highly virtualized state and by greatly reducing the amount of physical infrastructure that’s required to serve the business.  By now most everyone has at least leveraged virtualization to some degree.  The ability to consolidate multiple physical servers onto a single physical host dramatically reduces IT cost as an organization can provide all required compute resources on far fewer physical servers.

I know this because I’ve worked with several organizations who’ve experienced consolidation ratios of 20:1 or greater.  One client I’ve worked with has extensively reduced their data center footprint, migrating 1,200 physical servers onto 55 virtual hosts. While the virtual hosts tend to be much more robust than the typical physical application server, the cost avoidance is undeniable.  The power savings from decommissioning 1,145 servers at their primary data center came to over $1M in the first year alone.

Factor in cooling as well, plus the 3-year refresh cycle that would have required 1,100+ replacement servers to be purchased, and the savings add up quickly.  In addition to the hard-dollar cost savings, virtualization produces additional operational benefits.  Business continuity and disaster recovery exposure can be mitigated by using the high availability and off-site replication functionality embedded in today’s hypervisors.  Agility to the business increases as well, as the time required to provision a virtual server on an existing host is typically weeks to months faster than what’s required to purchase, receive, rack, power, and configure a physical server.
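The post only gives the ~$1M outcome, but the arithmetic is easy to sanity-check; the wattage and utility rate below are assumptions of mine, not the client’s actual figures:

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
# Back-of-the-envelope power savings from decommissioning 1,145 servers.
decommissioned = 1145
watts_per_server = 400     # assumed average draw, incl. a share of cooling
price_per_kwh = 0.25       # assumed blended rate

kwh_per_year = decommissioned * watts_per_server * 24 * 365 / 1000
print(f"~${kwh_per_year * price_per_kwh:,.0f} per year")   # ~$1.0M
[/framed_box]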

Please look for Part II of “Your Go To Guide For IT Optimization & Cloud Readiness,” where Mr. Rosenblum breaks down Convergence and Automation.

Photo by “reway2007”

Letting Cache Acceleration Cards Do The Heavy Lifting

By | EMC, How To, Log Management, Networking, Storage, VMware | No Comments

Up until now there has not been a great deal of intelligence around SSD cache cards and flash arrays, because they have primarily been configured as DAS (Direct Attached Storage). By moving read-intensive workloads off the storage array and up to the server, both individual application performance and overall storage performance can be enhanced. There are great benefits to using SSD cache cards in new ways, but before exploring new capabilities it is important to remember the history of the products.

The biggest problem with hard drives, either local or SAN-based, is that they have not been able to keep up with Moore’s Law of transistor density. In 1965 Gordon Moore, a co-founder of Intel, observed that the number of components in integrated circuits doubled every year; in 1975 he adjusted that prediction to doubling every two years. System processors (CPUs), memory (DRAM), system busses, and hard drive capacity have all kept doubling every two years, but hard drive performance has stagnated because of mechanical limitations (mostly heat, stability, and signaling reliability at increasing spindle speeds). This effectively limits individual hard drives to roughly 180 IOPS or 45MB/sec under typical random workloads, depending on block size.
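That ~180 IOPS ceiling falls straight out of the mechanics: every random I/O pays an average seek plus half a rotation. A quick sketch, using typical published seek times rather than any specific drive’s spec:

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
def max_random_iops(rpm: int, avg_seek_ms: float) -> float:
    # Average rotational latency is half of one full rotation.
    half_rotation_ms = (60_000 / rpm) / 2
    return 1000 / (avg_seek_ms + half_rotation_ms)

print(f"15K RPM:  {max_random_iops(15_000, 3.5):.0f} IOPS")   # ~182
print(f"7.2K RPM: {max_random_iops(7_200, 8.5):.0f} IOPS")    # ~79
[/framed_box]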

The next challenge is that in an effort to consolidate storage and increase spindle count, availability, and efficiency, we have pulled storage out of our servers and placed that data on SAN arrays. There is tremendous benefit to this; however, doing so introduces new considerations. The network bandwidth is roughly 1/10th of the system bus interconnect (8Gb FC = 1GB/sec vs PCIe 3.0 x16 = 16GB/sec). An array may have 8 or 16 front-end connections, yielding an aggregate of 8-16GB/sec, where a single PCIe slot has the same amount of bandwidth. The difference is that multiple servers share the array’s resources, and each can potentially impact the others.

Cache acceleration cards address both the mechanical limitations of hard drives and the shared-resource conflict of storage networks for a specific subset of data. These cards utilize NAND flash (either SLC or MLC, but more on that later) memory packaged on a PCIe card with an interface controller to provide high bandwidth and throughput for read intensive workloads on small datasets of ephemeral data.

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"] I realize there were a lot of qualifying statements there, so let’s break it down…

  • Why read intensive? Compared to SLC NAND flash, MLC NAND flash has a much higher write penalty, making writes more costly in terms of time and the overall life expectancy of a drive/card.
  •  Why small datasets? Most cache acceleration cards are fairly small in comparison to hard drives – the largest top out at ~3TB (typical sizes are 300-700GB) – and the cost per GB is much higher than comparable hard drive storage.
  •  Why ephemeral data and what does that mean? Ephemeral data is data that is temporary, transient, or in process. Things like page files, SQL server TEMPDB, or spool directories.
[/framed_box] Cache acceleration cards address the shared-resource conflict by pulling resource intense activities back onto the server and off of the SAN arrays. How this is accomplished is the key differentiator of the products available today.


FusionIO is one of the companies that made a name for themselves early in the enterprise PCI and PCIe flash cache acceleration market. Their solutions have primarily been DAS (Direct Attached Storage) solutions based on SLC and MLC NAND flash. In early 2011, with the acquisition of ioTurbine software, FusionIO added write-through caching to their SSD cards to accelerate VMware guest performance. More recently – mid-2012 – FusionIO released their ION enterprise flash array, which consists of a chassis containing several of their PCIe cards, with RAID protection across the cards for availability. Available interconnects include 8Gb FC and InfiniBand. EMC released VFCache in 2012 and has subsequently released two additional updates.

The EMC VFCache is a re-packaged Micron P320h or LSI WarpDrive PCIe SSD with a write-through caching driver, targeted primarily at read-intensive workloads. Subsequent releases have enhanced VMware functionality and added the ability to run in “split-card” mode, with half the card used for read caching and the other half as DAS. EMC’s worst-kept secret is “Project Thunder,” the coming release of the XtremIO acquisition: an all-SSD array that will support both read and write workloads, similar to the FusionIO ION array.

SSD caching solutions are extremely powerful for very specific workloads. By moving read-intensive workloads off the storage array and up to the server, both individual application performance and overall storage performance can be enhanced. The key to determining whether these tools will help is careful analysis of reads vs. writes and the locality of reference of the active data. If random write performance is required, consider SLC-based cards or caching arrays over MLC.


Images courtesy of “The Register” and “IRRI images”


Avamar And Data Domain: The Two Best Deduplication Software Appliances

By | Avamar, Deduplication, Networking, Storage | No Comments

For years, backup-to-disk technologies have evolved to ingest large amounts of data very quickly, especially when compared to the newest tape options. This evolution has pushed backup applications to leverage disk targets for equally fast restores, even down to the file level.

Essentially, this means that today clients can integrate disk-based backup solutions to fulfill the following conditions:
[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"] – Mitigate the risk of traditional tape failures

– Reduce the amount of time it takes to perform large back-up jobs

– Reduce the capacity required at the backup target by 10-20X compared to tape

– Reduce the amount of data traversing the network during a back-up job (Avamar or similar “source-based” technologies)

– Lower the total cost of ownership in comparison to tape

– Enable clients to automate the “off-site” requirement for tape by way of replicating one disk system to another over long distances

– Lower the RTO and RPO for clients based on custom policies available

[/framed_box] Data Domain’s deduplication methods are useless without backup software in place. By leveraging Data Domain’s OST functionality (DDBoost), we can combine Data Domain’s deep compression ability with the superior archiving abilities of Avamar.

Through source-based deduplication, Avamar dedupes on the host side, enabling environments with lower bandwidth and longer backup windows to complete the backup process much faster. Also, after the initial backup completes, this strategy results in less data on disk, which is good for everyone.


Where Data Domain shines most is in its ability to compress the already-deduplicated data up to 10X more than Avamar. This integration allows Avamar to send weekend, month-end, and year-end backups to the Data Domain, allowing for much longer retention. This expands Avamar’s reach into extended retention cycles on disk, which is one of the faster restore methods.

Data Domain’s “target-based” deduplication technology means the backup/deduplication process happens at the actual DD Appliance. Data Domain is the actual target, as it is here that the deduplication takes place.

All data has to go over the network to the target when leveraging Data Domain: if there is a need to back up 10TB, then 10TB must traverse the network to the DD appliance. When leveraging Avamar, I may only need to send 2TB over the network, since the data has been deduped prior to pushing to the target.
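A simple sketch of that difference in bytes on the wire; the 5:1 dedupe ratio is an assumption chosen to match the 10TB-to-2TB example, and real ratios vary by data set:

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
backup_tb = 10
dedupe_ratio = 5                 # assumed: only 1/5th of the data is unique

target_based_tb = backup_tb                  # Data Domain dedupes at the target
source_based_tb = backup_tb / dedupe_ratio   # Avamar dedupes before sending

print(f"Target-based (Data Domain): {target_based_tb} TB crosses the network")
print(f"Source-based (Avamar):      {source_based_tb} TB crosses the network")
[/framed_box]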

Taking Data Domain even further, Avamar can replicate backups to another Data Domain off site.

Allowing Avamar to control the replication enables it to keep the catalogues and track the location of the backup. This ability gives the end user ease of management when a request is made to restore. The prerequisites for DDBoost are both the license enabler for DDBoost and the Replicator on Data Domain. Overall this integration of the two “Best Deduplication Appliances” allows the end user a much wider spectrum of performance, use and compliance.

For a deeper dive into deduplication strategies, read the article from IDS CTO Justin Mescher about Data Domain vs EMC Avamar: which deduplication technology is better.


Adventures In Networking: Hardships In Finding The Longest Match

By | How To, Networking, VMware | No Comments

Sometimes in life you have to learn things the hard way. Recently I learned why the Longest Match Rule (longest match algorithm) works and why it applies not only to routing, but to other situations as well.

I was adding a new storage array and datastores to an existing VMware cluster using iSCSI. The existing VMware environment was laid out as follows:
[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]vSwitch0 = VM Network & Service Console (
vSwitch1 = iSCSI (
vSwitch2 = vMotion (
vSwitch3 = Testing
[/framed_box] The new storage array and iSCSI targets landed on a new vSwitch (vSwitch4). The old environment had both iSCSI and vMotion on the same network. For the new environment I wanted to completely separate the iSCSI and vMotion traffic by assigning them to different networks. Both iSCSI networks needed to stay up for migrations to happen, so the new environment was laid out as follows:
[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"] vSwitch0 = VM Network & Service Console (
vSwitch1 = iSCSI (
vSwitch2 = vMotion (
vSwitch3 = Testing
vSwitch4 = iSCSI (

[/framed_box] First, vSwitch4 was created, and the new storage was configured and presented to VMware, just as planned. The problem occurred when the subnet mask on vSwitch2 was modified from /16 to /24. As soon as this change to the subnet mask on vSwitch2 happened, access to all the VMs went down. After scrambling for about five minutes to retrace the steps taken prior to the problem, I was able to determine that the subnet change had caused the outage. Changing the subnet mask on vSwitch2 back to /16 slowly brought everything back online.

What caused this outage?

One simple mistake!

When the subnet was changed from /16 to /24, the third octet also needed to be changed to differentiate the iSCSI and vMotion networks. When the /24 subnet was applied to vSwitch2 (the vMotion network), the Longest Match Rule matched the longer extended network prefix. The new /24 still covered addresses used by vSwitch1, so data within the /16 network would traverse the /24, dropping the iSCSI targets and all of their datastores.

A network with a longer match describes a smaller set of IPs than a network with a shorter match, which in turn means the longer match is more specific. The network with the longest match is the selected path because it has the greatest number of matching bits against the destination IP address of the packets (see below).

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"] #1, = 00001010.00001100.00000001.00000000
#2, = 00001010.00001100.00000000.00000000
#3, = 00001010.00000000.00000000.00000000
[/framed_box] By simply changing the third octet on vSwitch2, I was able to change its subnet to a /24.
The final and working configuration was laid out as follows:
[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"] vSwitch0 = VM Network & Service Console (
vSwitch1 = iSCSI ( – left in place for migration
vSwitch2 = vMotion (
vSwitch3 = Testing
vSwitch4 = iSCSI (
[/framed_box]
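To watch the Longest Match Rule pick the route for itself, here is a quick sketch with Python’s ipaddress module, using the three networks from the binary table above:

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
import ipaddress

routes = [ipaddress.ip_network(""),
          ipaddress.ip_network(""),
          ipaddress.ip_network("")]

dest = ipaddress.ip_address("")
matches = [net for net in routes if dest in net]
best = max(matches, key=lambda net: net.prefixlen)   # most matching bits wins
print(best)   # -> the /24 beats the /16 and the /8
[/framed_box]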

Photo From: maximilian.haack

Networking & The Importance Of VLANs

By | Networking, Replication, VMware | No Comments

We have all become familiar with the term VLAN when talking about networking. Some people cringe and worry when they hear “VLAN,” while others rejoice and relish the idea. I used to be in the camp that cringed and worried – only because I lacked basic knowledge about VLANs.

So let’s start with the basics: what is a VLAN? 

VLAN stands for Virtual Local Area Network; a VLAN has the same characteristics and attributes as a physical Local Area Network (LAN). A VLAN is a separate IP sub-network, which allows multiple networks and subnets to reside on the same switched network – a service typically provided by routers. A VLAN essentially becomes its own broadcast domain. VLANs can be structured by department, function, or protocol, allowing a finer level of granularity. VLANs are defined on the switch by individual ports; this allows VLANs to be placed on specific ports to restrict access.

A VLAN cannot communicate directly with another VLAN, which is done by design. If VLANs are required to communicate with one another the use of a router or layer 3 switching is required. VLANs are capable of spanning multiple switches and you can have more than one VLAN on multiple switches. For the most part VLANs are relatively easy to create and manage. Most switches allow for VLAN creation via Telnet and GUI interfaces, which is becoming increasingly popular.

VLANs can address many issues, such as:

  1. Security – Security is an important function of VLANs. A VLAN separates data that could be sensitive from the general network, decreasing the chance that users will gain access to data they are not authorized to see. Example: an HR department’s computers/nodes can be placed in one VLAN and an Accounting department’s in another, keeping that traffic completely separate. The same principle can be applied to protocols such as NFS, CIFS, replication, VMware (vMotion), and management.
  2. Cost – Cost savings can be realized by eliminating the need for additional expensive network equipment. VLANs also allow the network to work more efficiently and make better use of bandwidth and resources.
  3. Performance – Splitting up a switch into VLANs allows for multiple broadcast domains, which reduces unnecessary traffic on the network and increases network performance.
  4. Management – VLANs allow for flexibility within the current infrastructure and for simplified administration of multiple network segments within one switching environment.

VLANs are a great resource and tool to assist in fine tuning your network. Don’t be afraid of VLANs, rather embrace them for the many benefits that they can bring to your infrastructure.

Photo Credit: ivanx

The Shifting IT Workforce Paradigm Part II: Why Don’t YOU Know How “IT” Works In Your Organization?

By | Backup, Cloud Computing, How To, Networking, Security | No Comments

When I write about CIOs taking an increased business-oriented stance in their jobs, I sometimes forget that without a team of people who are both willing and able to do the same, their ability to get out of the data center and into the board room is drastically hampered.

I work with a company from time to time that embodies for me the “nirvana state” of IT: they know how to increase revenue for the organization. They do this while still maintaining focus on IT’s other two jobs – avoiding risk and reducing cost. How do they accomplish this? They know how their business works, and they know how their business uses their applications. The guys in this IT shop can tell you precisely how many IOPS any type of end-customer business transaction will create. They know that if they can do something with their network, their code, and/or their gear that gives an additional I/O or CPU tick back to the applications, they can serve X number of additional transactions, which translates into Y dollars in revenue – and if they can do that without buying anything, it creates P profit.

The guys I work with aren’t the CIO, although I do have relationships with the COO, VP of IT, etc. To clarify: these aren’t business analysts who crossed over into IT from the business side to provide this insight. These are the developers, infrastructure guys, security specialists, etc. At this point, I think if I asked the administrative assistant who greets me at the door every visit, she’d be able to tell me how her job translates into the business process and how it affects revenue.

Some might say that since this particular company is a custom development shop, this should be easy: they have to know the business processes to write the code that drives them. Yes and no. I think most people who make that statement haven’t closely examined the developers coming out of college these days. I have a number of nieces, nephews, and children of close friends who are all going into IT, and let me tell you, the stuff they’re teaching in the development classes these kids take isn’t about optimizing code to a business process, and it isn’t about the utility of IT.

It’s about teaching a foreign language more than teaching the ‘why you do this’ of things. You’re not getting this kind of thought and thought-provoking behavior out of the current generation of graduates by default. It comes from caring. In my estimation, it comes from those at the top giving people enough latitude to make intelligent decisions and demanding that they understand what the company is doing and, more importantly, where it wants to be.

They set goals, clarify those goals, and they make it clear that everyone in the organization can and does play a role in achieving those goals. These guys don’t go to work every day wondering why they are sitting in the cubicle, behind the desk, with eight different colored lists on their whiteboard around a couple of intricately complicated diagrams depicting a new code release. They aren’t cogs in a machine, and they’re made not to feel as though they are. If you want to be a cog, you don’t fit in this org, pretty simple.  That’s the impression I get of them, anyway.

The other important piece of this is that they don’t trust their vendors. That’s probably the wrong way to say it. It’s more about questioning everything from their partners, taking nothing for granted, and demanding that their vendors explain how everything works so they understand how they plug into it and then take advantage of it. They don’t buy technology for the sake of buying technology. If older gear works, they keep the older gear, but they understand the ceiling of that gear, and before they hit it, they acquire new. They don’t always buy the cheapest, but they buy the gear that will drive greater profitability for the business.

That’s how IT should be buying. Not a cost matrix of four different vendors who are all fighting to be the apple the others are compared to. Rather – which solution will help me be more profitable as a business because I can drive more customer transactions through the system? Of course, 99% of organizations I deal with couldn’t tell you what the cost of a business transaction is. Probably 90% of them couldn’t tell you what the business transaction of their organization looks like.

These guys aren’t perfect, they have holes. They are probably staffed too thin to reach peak efficiency and they could take advantage of some newer technologies to be more effective. They could probably use a little more process in a few areas. But at the end of the day, they get it. They get that IT matters, they get that information is the linchpin to their business, and they get that if the people who work in the organization care, the organization is better. They understand that their business is unique and they have a limited ability to stay ahead of the much larger companies in their field; thus they must innovate, but never stray too far from their foundation or the core business will suffer.

It’s refreshing to work with a company like this. I wish there were more stories like this organization and that the trade rags would highlight them more prominently. They deserve a lot of credit for how they operate and what they look to IT to do for them.

Even though I can’t name them here I’ll just say good job guys, keep it up, and thanks for working with us.

Photo Credit: comedy_nose


What’s Better Than Advil For Datacenter Headaches? An Organized Datacenter!

By | How To, Networking | No Comments

How does your datacenter, server rack, or switch rack look? Like a rat’s nest?

If you had a power problem, how long would it take you to track down the bad power cord? How much time would you waste figuring out whether it was a bad network cable, NIC, or switch port?

Yes, it can be a daunting task to organize your datacenter, but if the time is spent now it could save you hours and many headaches later. As an IT professional, you put a lot of money into your datacenter and it is a valuable asset to your company. It is essential to keep it organized to reduce an outage window and maximize your investment in the technology you have implemented. Taking some additional time up front to plan and organize will protect your equipment, extend its life, and make management easier.


Power

Labeling both ends of your power cables will allow you to easily trace a power cable.  Doing so has many benefits:

  • The most obvious: in the event of a power failure, your forethought will allow you to easily trace the failed power cable
  • It simplifies load balancing power between different PDUs, circuits, or power supplies
  • When the time comes to decommission or move equipment already in your datacenter, you will know with a glance which power cables can be removed

Running power cables to the edge of a server or switch rack eliminates clutter when a failed power supply needs to be replaced, making the swap a simple and fast process. Whether you use vertical or horizontal PDUs will depend on the type of racks you have; either will allow cables to be neatly routed and managed.


Network

  • Labeling both ends of your network cables eliminates the need to trace cable when a problem happens, or when replacing cables, testing, or moving to a different switch/port.
  • Once again, running network cables to the edge of the server or switch rack will eliminate clutter and make replacing a failed part a simple and fast process.
  • Color coordinating your network cables assists in troubleshooting and moves; it also provides a visual reference for each cable’s purpose.


Cooling

Have you ever wondered where all of the hot air expelled from the equipment in your server and switch racks ends up?  The majority of the time it exits out the rear. If you have a rat’s nest of cables behind the fans, the equipment cannot expel hot air away from the racks. This can lead to overheating not only the equipment, but possibly the room as well.

  • If it is not expelled, stagnant hot air will, over time, cause your cables to become brittle and fail. Unused cables can also pose a serious fire risk.


Management

  • Troubleshooting, replacing, removing, and adding new equipment, power, or network will be a simple task if your datacenter is organized and clean.
  • Not only will management be easier for you, an organized datacenter also allows guests to easily figure out where a network cable goes. If a problem ever arises and a technician needs to work on some equipment, he/she will have easy access.

Photo Credit: mrtom

Internet Running Out of IP Addresses?! Fear Not, IPv6 Here to Save the Day

By | Backup, How To, Networking | No Comments

As everyone may (or may not) be aware, we are running out of IP version 4 addresses. Okay, not really, but almost all of them have been given out to service providers to pass on to customers, and at that point they will eventually run out. Fear not: this doesn’t mean the internet will come to a screeching halt.  It only means that it will be time to move on to the next iteration of networking, called IP version 6 (IPv6 for short).  Most of the rest of the world is already running it to a high degree.

With this post, I’m going to take some time to lift the veil off of IPv6. Every time I mention it to anyone – be it a customer, an old coworker, or a longtime networker – it draws a sense of fear. Don’t be afraid of IPv6, people! It’s not as scary as it seems.

Let’s start with a quick comparison. Currently, there are approximately 4.3 billion IPv4 addresses under the current 32-bit scheme. That’s less than one for every person in the world! Think about how many you are using right now.  Here’s me:

1.  Cell phone

2. Air card

3. Home network

4. My work computer

5. TV

We’ve gotten around the limitation by using something called Port Address Translation (PAT). PAT should really be called “PATCH,” because we are patching the IPv4 network due to a gross underestimate of the growth of the internet. PAT normally occurs on a firewall: we can use one public IP address to represent outgoing/incoming traffic for our entire network. That is why we have RFC 1918 addresses (10/8, 192.168…and so on) – those addresses are reserved so they can hide behind a public IP address, letting every company have as many IP addresses as it needs. Because of the reserved address space, the available public IP addresses are actually around 3.2 billion. That’s less than one for every two people!

Theoretically, a single PAT IP could represent over 65,000 clients (you may see flames begin to shoot out of your firewall). So, what are the drawbacks? For one, it makes troubleshooting connection issues more difficult. Setting firewall rules also becomes harder and can result in connectivity issues. The idea of end-to-end connectivity is thrown out the door, since it truly no longer exists. Lastly, as translations pile up, you place higher and higher loads on the firewall – resources that could otherwise go toward improving latency and throughput.  PAT’s time is through!  Thanks, but good riddance!

IPv6 uses 128-bit addressing.  That’s about 340,000,000,000,000,000,000,000,000,000,000,000,000 addresses – roughly 48,000,000,000,000,000,000,000,000,000 for every person on earth.  For a comparison in binary:

IPv4: 10101010101010101010101010101010

IPv6:  10101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010

Luckily, IPv6 addresses are represented in hex.  Though the binary number above looks painful and overwhelming, a single IPv6 address on your network can be as simple as this:

2001:db8::1
That’s not so bad, is it? In a follow-up post, I will demystify the IPv6 addressing scheme.
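If you want to sanity-check the math (and preview the hex shorthand), a few lines of Python will do it; 2001:db8::/32 is the reserved documentation prefix, used here purely as an example:

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
import ipaddress

print(f"IPv4 addresses: {2**32:,}")                 # ~4.3 billion
print(f"IPv6 addresses: {2**128:,}")                # ~3.4 x 10^38
print(f"IPv6 per person (~7B): {2**128 // 7_000_000_000:.2e}")
print(f"Ports behind one PAT IP: {2**16:,}")        # the ~65,000 client limit

# 128 bits collapse nicely in hex: runs of zeros compress to '::'
addr = ipaddress.ip_address("2001:0db8:0000:0000:0000:0000:0000:0001")
print(addr)                                         # -> 2001:db8::1
[/framed_box]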

For up to date IPv6  statistics and IPv4 exhaustion dates around the world, look here:

Photo credit: carlospons via Flickr

Sick Over Gateway Redundancy? Cisco’s Got A Solution For That …

By | Cisco, How To, Networking | No Comments

A testament to the ever-adapting pioneers that they are, Cisco developed the first gateway redundancy protocol: Hot Standby Router Protocol (HSRP). HSRP allows the default gateway to fail over to another router, based on a priority that can rise or fall contingent upon interface tracking.

The Internet Engineering Task Force (IETF) later created an almost identical standard: Virtual Router Redundancy Protocol (VRRP), as defined in RFC 2338. The only real differentiator is the terminology. If you have non-Cisco routers, or are pairing a Cisco router with another vendor’s, then you are using VRRP.

Here is an example of the old days:

[Diagram: the old days – a single router as the default gateway, no redundancy]


Next in the long line of gateway redundancy protocols came HSRP, which allows for failover of the default gateway. The only way to load balance was by creating two different HSRP groups – multiple HSRP (MHSRP) – using different IP addresses for the default gateways. Hence you would have to configure Dynamic Host Configuration Protocol (DHCP) pools that hand out two separate gateway addresses for the SAME IP range. Sounds painful, right?

Let’s look at general HSRP operation. For example, you could have Router 1 and Router 2 running HSRP, both tracking their WAN links. Below is normal HSRP operation: the router on the left is actively forwarding traffic as the default gateway, and the one on the right is waiting for it to fail or lose its WAN link. Notice that the standby router is doing absolutely nothing, aside from looking pretty.

[Diagram: normal HSRP operation – the active router forwards all traffic while the standby waits]


Now, the WAN link fails and the other router takes over.

[Diagram: the active router’s WAN link fails and the standby router takes over]


When the link goes down, the other router takes over forwarding traffic. It is a time-tested strategy, but if you have two routers, why not utilize both?

Introducing another Cisco first: Gateway Load Balancing Protocol (GLBP). GLBP introduces two router roles:

  1. The Active Virtual Gateway (AVG): responsible for handing out virtual Media Access Control (MAC) addresses to the other routers as well as responding to clients’ Address Resolution Protocol (ARP) requests.
  2. The Active Virtual Forwarder (AVF): responsible for forwarding traffic sent to its assigned virtual MAC.

The AVG generally gives out the virtual MAC addresses in round-robin fashion (though there are other choices). Some clients get the MAC of Router 1 and some get the MAC of Router 2, yet every client receives the same ONE gateway IP address.
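A toy sketch of that round-robin behavior; the gateway IP and virtual MACs below are hypothetical, and real GLBP encodes group and forwarder numbers in the MAC:

[framed_box bgColor="#F0F0F0" textColor="undefined" rounded="true"]
from itertools import cycle

VIRTUAL_GATEWAY = ""                  # hypothetical shared gateway IP
virtual_macs = cycle(["0007.b400.0101",       # AVF on Router 1
                      "0007.b400.0102"])      # AVF on Router 2

# Every client ARPs for the SAME gateway IP; the AVG alternates its replies.
for client in ["PC-1", "PC-2", "PC-3", "PC-4"]:
    print(f"{client} ARPs for {VIRTUAL_GATEWAY} -> learns {next(virtual_macs)}")
[/framed_box]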

[Diagram: the AVG answers ARP requests, alternating between the two virtual MACs]


Normal Operation:

[Diagram: normal GLBP operation – both routers forward traffic for the same gateway IP]


Now, I’m sure you are wondering what happens on a link failure or router loss.

Since there are only two routers in these scenarios, the AVG would take over forwarding for the failed router’s virtual MAC address, making the failover absolutely seamless. The router on the right would lose its link and report that it is no longer able to forward traffic. Okay, it might be a little more complicated than that, but you get the gist.

[Diagram: GLBP failover – the surviving router forwards for both virtual MACs]


GLBP is a great solution for load balancing, and it offers your users seamless failover of their default gateway upon the failure of a router.

Perhaps the IETF will make this a standard too!

Photo Credit: DominiqueGodbout