Cost-effective way to build a server with lots of RAM

Unusual requirements sometimes benefit from unusual solutions. Sure, you can give 6 figures to Sun, Dell or HP and be done with it, but that is not the only game in town.

For single-box solutions, getting up to 128GB is very cheap (32 x 4GB ~ USD 3,000), even with homebrew motherboards that cost less than USD 1,000. (Don't mock the makers. If it's good enough for Google ...)

256GB is seriously more expensive (32 x 8GB ~ USD 18,000), and beyond that ...

Alternatively, have you considered cheap boxes interconnected with InfiniBand (10Gbps)?

You could build a 4-node, 16-processor (64-core), 512GB machine that way and still have change from USD 25,000.

You would also get graceful degradation, provided your application can keep running on 3 machines when one of them fails, and possibly linear scaling in cost up to 8 nodes (just add 4 more nodes). At that point you are looking at a cool 128-core, 1TB RAM beast for < USD 50,000.
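To put the numbers above side by side, here is a rough back-of-the-envelope sketch. The class and method names are made up for illustration, and the prices are simply the ballpark figures quoted in this answer, not current quotes. Note that the single-box figures cover RAM only, while the cluster figures are upper bounds for the whole system, so the rows are not strictly comparable.

```java
import java.util.Locale;

// Back-of-the-envelope comparison of the configurations quoted above.
// All prices are the rough figures from the text, not vendor quotes.
class RamCostSketch {

    // USD per GB for a configuration with the given total price and capacity.
    static double usdPerGb(double totalUsd, int totalGb) {
        return totalUsd / totalGb;
    }

    public static void main(String[] args) {
        String[] labels = {
            "single box, 32 x 4GB (RAM only)",
            "single box, 32 x 8GB (RAM only)",
            "4-node cluster (upper bound, whole system)",
            "8-node cluster (upper bound, whole system)"
        };
        double[] usd = {3_000, 18_000, 25_000, 50_000};
        int[] gb = {32 * 4, 32 * 8, 512, 1024};

        for (int i = 0; i < labels.length; i++) {
            System.out.printf(Locale.ROOT, "%-45s %5d GB  %6.2f USD/GB%n",
                    labels[i], gb[i], usdPerGb(usd[i], gb[i]));
        }
    }
}
```

With these inputs the RAM-only cost per GB roughly triples when moving from 4GB to 8GB modules (about 23.44 vs 70.31 USD/GB), while the cluster ceiling stays at about 48.83 USD/GB for both 4 and 8 nodes, which is what linear scaling in cost means here.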

Before you dismiss the InfiniBand proposal as exotic: it isn't, for the type of machine you are asking for. For example, 141 of the top 500 supercomputers are built this way, including 4 of the top 10 (http://top500.org/connfam/8).


Alright, look. You're not going to find a server that has the sort of RAM footprint you're looking for, at least not one that doesn't require its own electrical grid.

Why not take a scalable approach, and use memcached? You can spread the memory around to different machines across the network. The data never has to touch a disk drive, and with the sort of ultra-fast network you can buy with the money you're talking about, latency will hardly be a problem at all.
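If it helps to picture how the memory gets spread around, here is a toy sketch of the idea: the client hashes each key to decide which machine holds it. Everything in it is an assumption for illustration. `ShardedCacheSketch` is a hypothetical class, the "nodes" are plain in-process maps standing in for remote cache servers, and the hash-modulo rule is the simplest possible scheme rather than what any particular memcached client does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of client-side sharding. Each "node" is an in-process map
// standing in for one cache server's RAM. Hypothetical class, not the
// API of any real memcached client.
class ShardedCacheSketch {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    ShardedCacheSketch(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("need at least one node");
        }
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // Pick the node that owns a key: hash modulo node count.
    // floorMod keeps the index non-negative for negative hash codes.
    int nodeFor(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void set(String key, String value) {
        nodes.get(nodeFor(key)).put(key, value);
    }

    // Returns null when the key is not cached.
    String get(String key) {
        return nodes.get(nodeFor(key)).get(key);
    }

    int sizeOfNode(int index) {
        return nodes.get(index).size();
    }

    public static void main(String[] args) {
        ShardedCacheSketch cache = new ShardedCacheSketch(4);
        for (int i = 0; i < 1000; i++) {
            cache.set("key-" + i, "value-" + i);
        }
        System.out.println(cache.get("key-42"));
        for (int n = 0; n < 4; n++) {
            System.out.println("node " + n + ": " + cache.sizeOfNode(n) + " entries");
        }
    }
}
```

The application only ever calls `set` and `get`; which machine holds a given key is the client's business. One caveat of plain hash-modulo is that changing the number of nodes remaps most keys, which is why real deployments often prefer consistent hashing.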

Here's a memcached client for Java: http://www.whalin.com/memcached/

And here's an intro to memcached in case you're not familiar: http://www.danga.com/memcached/

Look into it. It's going to be way more cost-effective than building a single monster machine with an insane amount of RAM. Besides, if you're doing something that has that kind of requirement, it's probably mission critical, and you don't want a single point of failure.