Sneakernet vs. WAN: When Moving Data With Your Feet Beats Using The Network

Andrew S. Tanenbaum was quoted in 1981 as saying “Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.”

The underlying story on this is written in the non-fiction section of Wikipedia. It was derived from NASA and their Deep Space network tracking station between Goldstone, CA and their other location at Jet Propulsion Labs about 180 miles away. Common today, as much as it were 30 years ago, a backhoe took out the 2400bps circuit between the two locations. The estimate to fix it was about one full day. So, they loaded a car with 9-track magnetic tapes and drove it 3-4 hours from one location to the other to get the data there six times faster than over the wire.

So, they loaded a car with 9-track magnetic tapes and drove it 3-4 hours from one location to the other to get the data there six times faster than over the wire.

That got me to thinking about IT and business projects that require pre-staging data. Normally, we IT folks get wind of a project weeks or months in advance. With such ample notice, how much data can we pre-stage in that amount of time?

With a simple 100Mbit connection between locations, and using a conservative compression ratio, we can move nearly 1TB of data in a day. That seems plenty of time to move source installation files, ISOs, and even large databases. Remembering that our most precious resource is time, anything a script or computer can do instead of us manually doing is worth a careful consideration.

Below is a chart listing out common bandwidth options and the time to complete a data transfer.

chart - 1

The above example is not as much about data center RPOs and RTOs, as it is about just moving data from one location to another. For DR objectives, we need to size our circuit so that we never fall below the minimums during critical times.

For example, if we have two data center locations with a circuit in between, and our daily change rate of 100TB of data is 3%, we will still need to find the peak data change rate timeframe before we can size the circuit properly.

chart - 2

If 50% of the data change rate occurs from 9am to 3pm, then we need a circuit that can sustain 250GB per hour. A dedicated gigabit circuit can handle this traffic, but only if it’s a low latency connection (the location are relatively close to one another). If there’s latency, we will most certainly need a WAN optimization product in between. But in the event of a full re-sync of data, it would take 9-10 days to move all that data over the wire plus the daily change rate. So unless we have RPOs and RTOs measuring weeks, or unless we have weeks to ramp-up to a DR project, we will have a tough time during a full re-sync, and wouldn’t be able to rely on DR during this time.

So, that might be a case where it makes sense to sneakernet the data from one location to the other.

Photo credits via Flickr: Nora Kuby