Are networks now faster than disks?

This is a software design question.

I used to go by the following rule of thumb for speed:

cache memory > memory > disk > network

Each step is 5-10 times slower than the one before it (e.g. cache memory is 10 times faster than main memory).

Now it seems that gigabit Ethernet has lower latency than local disk. So maybe reads from a large remote in-memory DB are faster than local disk reads. This feels like heresy to an old-timer like me. (I just spent some time building a local on-disk cache to avoid network round trips, hence my question.)

Does anybody have any experience / numbers / advice in this area?

And yes, I know that the only real way to find out is to build and measure, but I was wondering about the general rule.
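(For the measuring part, a minimal sketch like the one below is what I have in mind. It's Python; /path/to/big/file and remote-host are placeholders, and it assumes a trivial echo service running on the remote box.)

import os
import socket
import time

def avg_disk_read(path, block=4096, trials=50):
    # Buffered reads at scattered offsets; use a file larger than RAM,
    # or the page cache will hide the seek cost on repeat runs.
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    start = time.perf_counter()
    for i in range(trials):
        os.lseek(fd, (i * 7919 * block) % (size - block), os.SEEK_SET)
        os.read(fd, block)
    os.close(fd)
    return (time.perf_counter() - start) / trials

def avg_network_rtt(host, port, trials=50):
    # One-byte ping-pong; assumes the far end echoes whatever it receives.
    sock = socket.create_connection((host, port))
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    start = time.perf_counter()
    for _ in range(trials):
        sock.sendall(b"x")
        sock.recv(1)
    sock.close()
    return (time.perf_counter() - start) / trials

print("disk read:   %.3f ms" % (avg_disk_read("/path/to/big/file") * 1e3))
print("network RTT: %.3f ms" % (avg_network_rtt("remote-host", 7777) * 1e3))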

edit:

This is the interesting data from the top answer:

  • Round trip within same datacenter 500,000 ns

  • Disk seek 10,000,000 ns

This is a shock to me; my mental model was that a network round trip is inherently slow. And it's not: it's 20x faster than a disk 'round trip' (one seek).

Jeff Atwood posted a very good blog post on the topic: http://blog.codinghorror.com/the-infinite-space-between-words/


Here are some numbers that you are probably looking for, as quoted by Jeff Dean, a Google Fellow:

Numbers Everyone Should Know

L1 cache reference                             0.5 ns
Branch mispredict                              5 ns
L2 cache reference                             7 ns
Mutex lock/unlock                            100 ns (25)
Main memory reference                        100 ns
Compress 1K bytes with Zippy              10,000 ns (3,000)
Send 2K bytes over 1 Gbps network         20,000 ns
Read 1 MB sequentially from memory       250,000 ns
Round trip within same datacenter        500,000 ns
Disk seek                             10,000,000 ns
Read 1 MB sequentially from network   10,000,000 ns
Read 1 MB sequentially from disk      30,000,000 ns (20,000,000)
Send packet CA->Netherlands->CA      150,000,000 ns

(The figures in parentheses are later revised values for the same measurements.)

It's from his presentation titled Designs, Lessons and Advice from Building Large Distributed Systems, and you can get it here:

  • Dr Jeff Dean Keynote PDF or on slideshare.net

The talk was given at Large-Scale Distributed Systems and Middleware (LADIS) 2009.
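
Plugging those figures into a quick back-of-the-envelope calculation (the style the "Google Pro Tip" link below recommends) shows why the round-trip number is so striking. A sketch in Python, using only values from the table above:

# All values in nanoseconds, straight from the table above.
ROUND_TRIP_DC = 500_000        # round trip within same datacenter
DISK_SEEK     = 10_000_000
READ_1MB_NET  = 10_000_000     # read 1 MB sequentially from network
READ_1MB_DISK = 30_000_000     # read 1 MB sequentially from disk

# Fetching a small record: one round trip vs. one seek.
small_remote = ROUND_TRIP_DC                # ~0.5 ms
small_disk   = DISK_SEEK                    # ~10 ms

# Fetching a full 1 MB payload.
big_remote = ROUND_TRIP_DC + READ_1MB_NET   # ~10.5 ms
big_disk   = DISK_SEEK + READ_1MB_DISK      # ~40 ms

print(f"small record: remote RAM wins by {small_disk / small_remote:.0f}x")
print(f"1 MB payload: remote RAM wins by {big_disk / big_remote:.1f}x")

So for small random reads the remote in-memory store wins by roughly 20x, and even for bulk 1 MB reads it still comes out ahead on these numbers.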

Other Info

  • Google Pro Tip: Use Back-Of-The-Envelope-Calculations To Choose The Best Design
  • Stanford 295 Talk Software Engineering Advice from Building Large-Scale Distributed Systems

It's said that gcc -O4 emails your code to Jeff Dean for a rewrite.



There are a lot of variables when it comes to network vs. disk, but in general, disk is faster.

The SATA 3.0 and SAS buses run at 6 Gbps, vs. a network's 1 Gbps minus protocol overhead. With a 15k RPM SAS RAID-10 array, the network is going to seem dog slow. On top of that you have the disk cache, and possibly solid-state drives, which depending on the scenario could also increase speed. Random vs. sequential data access is a factor, as is the block size in which data is transferred. That all depends on the application being used to access the disk.
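
For a rough sense of what those link rates mean in bytes per second, here is a quick conversion sketch in Python (the 8b/10b encoding factor for SATA is standard; the ~94% Ethernet efficiency is an approximation for full-size frames with TCP/IP headers):

# Convert raw link rates to approximate usable throughput (MB/s).
sata3_MBps = 6_000_000_000 / 10 / 1_000_000   # 8b/10b coding: 10 line bits per data byte -> ~600 MB/s
gige_MBps  = 1_000_000_000 / 8 / 1_000_000    # 125 MB/s raw
gige_eff   = gige_MBps * 0.94                 # ~94% efficiency after TCP/IP + Ethernet framing (approx.)

print(f"SATA 3.0 ceiling:    {sata3_MBps:.0f} MB/s")
print(f"GigE raw:            {gige_MBps:.0f} MB/s")
print(f"GigE after overhead: {gige_eff:.0f} MB/s")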

And I have not even touched on the fact that whatever you are transporting over the network is usually going to or coming from a disk anyway... so, again, disk is faster.