Reconsidering 2-Node Architecture: Design Your Data Center With “3s and 5s”

In the enterprise IT world, most people would agree that putting all your eggs in one basket is not ideal. However, if you scale too wide, you take on a lot of management overhead. That’s why a brilliant and experienced enterprise architect once told me that he prefers “3s and 5s.”

If anything in your data center is built for redundancy, reconsider any design that is active/active or active/standby with only two nodes. If you run two nodes active/active, you have to monitor each node to ensure it never goes above 50% utilization, so that either one can absorb the full load when the other fails. If you run active/standby, the standby node sits idle, so half your capacity is wasted at all times. Either way, you’re not getting efficiency out of your systems.
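The utilization argument is simple arithmetic: in an evenly balanced cluster of N identical nodes that must survive the loss of one node, each node can safely run at no more than (N−1)/N utilization. A minimal sketch of that math (the function name and the assumption of identical, evenly rebalanced nodes are mine, not from any particular product):

```python
# Rough headroom math for an N-node active/active cluster that must
# survive the loss of one node. Assumes identical nodes and even
# load rebalancing after a failure -- a deliberate simplification.

def max_safe_utilization(nodes: int) -> float:
    """Highest per-node utilization that still lets the surviving
    nodes absorb the full load after a single node failure."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes for redundancy")
    return (nodes - 1) / nodes

for n in (2, 3, 5, 8):
    print(f"{n} nodes: keep each node at or below "
          f"{max_safe_utilization(n):.0%}")
```

Two nodes cap you at 50% per node; three nodes already lift that to about 67%, and five nodes to 80%, which is the efficiency gain the matrix above illustrates.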

Design Comparison Matrix

Consider the following matrix comparing number of nodes versus efficiency, resiliency, and performance:

[Matrix: number of nodes vs. efficiency, resiliency, and performance]

But why 3s and 5s and not 4s and 6s? “Because it forces a recode of the software and moves to a structure that is a truly cloud-like design instead of old-school, fault-tolerant or highly available designs,” says David Bolthouse, enterprise architect.

Look how inefficient and rigid 2-node systems are in this comparison. You cannot burst much when the business needs to burst, whether because of an acquisition, a sudden spike in workload, or the need for headroom at peak usage times.

I’m suggesting that there is a sweet spot somewhere between 3- and 8-node systems.

If you have a blank slate on new designs for servers, networking, and storage, then consider looking at multi-node architectures. This concept can also be applied to DR and Cloud, as just having two sites—one production and one failover site—is not the future of cloud-based data centers.

The whole point of cloud is to distribute workloads and data across systems to achieve a higher level of overall efficiency, resiliency, and performance. Notice I didn’t say cost. We’re not quite there yet, but that’s coming. Distributing your risk across nodes and locations will eventually drive down costs, not just from the additional efficiencies gained, but also because you will be able to afford to invest in more value-driven products.

Take Cleversafe, which replicates data across multiple nodes and multiple locations. Because object-based storage is inexpensive, it can afford all those copies while still keeping costs under control. Instead of thinking about your ability to recover from a failure, you will be thinking about how many failures you can sustain without much business impact. If the applications are written correctly, there may be very little business interruption at all.
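The shift from “can I recover?” to “how many failures can I sustain?” can be sketched with the math behind a generic k-of-n dispersal (erasure-coding) scheme, where an object is split into n slices and any k of them are enough to rebuild it. The parameters below are hypothetical illustrations, not Cleversafe’s actual defaults:

```python
# Failure-tolerance math for a generic k-of-n dispersal scheme:
# an object is cut into `total_slices` slices spread across nodes
# or sites, and any `slices_needed` of them can rebuild it.

def failures_tolerated(total_slices: int, slices_needed: int) -> int:
    """Slice (node/site) losses the system can absorb and still
    reconstruct every object."""
    if not 0 < slices_needed <= total_slices:
        raise ValueError("need 0 < slices_needed <= total_slices")
    return total_slices - slices_needed

def storage_overhead(total_slices: int, slices_needed: int) -> float:
    """Raw-to-usable storage ratio (triple replication would be 3.0)."""
    if not 0 < slices_needed <= total_slices:
        raise ValueError("need 0 < slices_needed <= total_slices")
    return total_slices / slices_needed

# Hypothetical layout: 12 slices, any 8 rebuild the object.
print(failures_tolerated(12, 8))              # survives 4 losses
print(f"{storage_overhead(12, 8):.2f}x raw")  # 1.50x raw storage
```

With this hypothetical 12/8 layout you could lose four nodes or sites at once and still serve data, while paying only 1.5x in raw storage instead of the 3x that full triple replication would cost.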

Photo credit: Thomas Lieser via Flickr