What RAM options do I need to know before buying Server RAM?

This is a proposed Canonical Question about Server Memory.

I have to buy a Dell R420 server and there are various combinations (1600 and 1333 MHz RDIMMs and UDIMMs) and Performance Optimized vs. Advanced ECC, with and without sparing. I noticed that UDIMMs are only available as 4 GB DIMMs, so I will ultimately have to go to 16 GB RDIMMs.

What are these options and what do I need to know about them?


Solution 1:

RAM for servers comes with a few common metrics that specify its capacity and its ability to work in a particular configuration. To add to the confusion, there are different names for what is essentially the same thing, and the "standard" name changes depending on which type of RAM you're using.

Capacity (1GB, 4GB, 32GB, etc)

This is easy enough; everyone should already be familiar with the concept that RAM comes in different capacities. The particular type of RAM determines what the maximum size of a single stick can be, but that's irrelevant in practice because actual implementations limit the amount of RAM a system can support (i.e., check the documentation for your system to see what capacity it supports).

RAM's capacity can be organized in different configurations. Usually there's just one standard configuration for RAM of a certain size. If you're buying ultra-cheap RAM off the Internet, be warned that it may be non-standard (especially if the seller mentions the organization) and not supported by your server.

Speed (1600MHz, etc)

For the purposes of this Answer, you want the speed of the RAM to match the maximum speed of the system. RAM that is one or sometimes two "speeds" faster will work as well, though at the lesser speed. Similarly RAM that is one or two "speeds" slower will work, also at the lesser speed.

Integrity Protection (ECC or Non-ECC)

ECC is the most common form of integrity protection (i.e., making sure cosmic rays didn't flip any bits and none of the memory locations are going bad). In most systems the RAM must be either ECC or non-ECC, whichever the system requires. Occasionally this is called 72-bit memory (a misnomer left over from the 64 data bits getting 8 extra bits of ECC alongside the data bus).
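
As a rough illustration of where the 64 + 8 = 72 figure comes from, the sketch below counts the check bits needed by a textbook Hamming code with single-error correction and double-error detection (SECDED). This assumes the textbook construction; the function name is made up for this example, and real memory controllers may use different codes.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits needed by a textbook Hamming SECDED code (illustrative only)."""
    r = 0
    # Hamming condition: r check bits must be able to point at every data bit,
    # every check bit, and also encode the "no error" case.
    while 2 ** r < data_bits + r + 1:
        r += 1
    # One extra overall parity bit adds double-error detection.
    return r + 1


data = 64
check = secded_check_bits(data)
print(data, "+", check, "=", data + check)  # 64 + 8 = 72
```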

When RAM has ECC, that protection information can be checked at a variety of times. The most basic protection reads and checks the ECC data only when that memory location is read. More advanced options allow the system to check ECC regularly. I've most frequently seen this called "memory scrubbing". It works much like disk array scrubbing, and like disk array scrubbing you should have it enabled unless there's a good reason to disable it.

ECC is also one of the measures that reduce the impact of the Row Hammer bug.

Bus Electrical Capacity (Unbuffered or Registered)

We're not electrical engineers, so all you really need to know is that Buffered or Registered RAM allows more RAM in a system than Unbuffered RAM does. Like ECC, this is something that must be supported by the system. Unlike ECC, many new servers support both Unbuffered/Unregistered and Buffered/Registered RAM, whereas older servers tended to support only one or the other. Registers are a type of buffer, but the terms are used interchangeably when applied to RAM. I have never seen a system that can mix Unbuffered and Registered RAM at the same time.

When you see UDIMM, the "U" is for "Unbuffered". The "R" in RDIMM is "Registered".

  • Ranks

    Registered RAM has well-defined electrical "usage" characteristics measured in "ranks". Each RAM channel (or bus) in a system can support only so many ranks at each speed it supports. Typically systems are rated at two speeds (i.e., the channel runs at speed X with up to A ranks, drops to speed Y above that, and supports at most B ranks in total).

    There is RAM available with the same capacity and speed that takes up different numbers of ranks. Typically, the more capacity a module has, the more ranks it takes up. Low voltage modules take up fewer ranks (per the module's specifications).
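
The rank budget described above can be sketched as a simple lookup. The rating table and the function name here are hypothetical and only illustrate the idea; the real limits come from your system's documentation.

```python
# Hypothetical channel rating: speed in MHz -> maximum total ranks at that speed.
# These numbers are made up for illustration, not taken from any real server.
CHANNEL_RATING = {1600: 4, 1333: 8}


def channel_speed(module_ranks, rating=CHANNEL_RATING):
    """Return the fastest rated speed whose rank limit covers all modules, else None."""
    total_ranks = sum(module_ranks)
    for speed in sorted(rating, reverse=True):  # try the fastest speed first
        if total_ranks <= rating[speed]:
            return speed
    return None  # too many ranks for this channel at any rated speed


print(channel_speed([2, 2]))     # 1600
print(channel_speed([2, 2, 2]))  # 1333
print(channel_speed([4, 4, 4]))  # None
```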

Footnotes

  • There are a variety of configuration options unrelated to what physical RAM you need to buy for your server. These include mirroring the RAM (just like RAID1, but for RAM), sparing (spare RAM that takes over when a module goes bad), and timing and related optimizations.

  • Modern servers typically have the memory controller(s) integrated into the CPU instead of a separate North Bridge chip. This means that on systems that support multiple CPUs, a memory slot can only be used if the CPU socket it corresponds to is populated. Similarly, some CPUs require memory to be populated in their slots for the system to work. See the system's documentation for details.

  • Modern servers typically have more than one memory channel. These channels operate mostly independently, which will allow greater memory bandwidth in memory-intensive usage scenarios. Generally you should plan on distributing memory across all channels on all populated CPUs as evenly as is realistic to ensure the best performance. 
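
One way to picture "as evenly as is realistic" is a round-robin assignment over every CPU and channel. This is only a sketch: the function name and the CPU and channel counts are assumptions for illustration, and the slot population order your server actually requires is defined in its documentation.

```python
def spread_modules(module_count, cpus, channels_per_cpu):
    """Assign identical modules round-robin across every (cpu, channel) pair."""
    # Alternate between CPUs first so each populated CPU gets the same share.
    slots = [(cpu, ch) for ch in range(channels_per_cpu) for cpu in range(cpus)]
    layout = {slot: 0 for slot in slots}
    for i in range(module_count):
        layout[slots[i % len(slots)]] += 1
    return layout


# Example: 8 modules on a hypothetical 2-CPU board with 3 channels per CPU.
layout = spread_modules(8, cpus=2, channels_per_cpu=3)
for (cpu, channel), count in sorted(layout.items()):
    print(f"CPU {cpu} channel {channel}: {count} module(s)")
```

With these inputs both CPUs end up with four modules each, and no channel holds more than one module more than any other.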

Solution 2:

When upgrading the memory of an existing server you should probably start by confirming what memory modules you have installed now and what extra/new/replacement modules are actually supported by the (main board) vendor and BIOS.

To comply with warranty and your hardware support contracts you may be required to buy genuine spare parts from the vendor, rather than using aftermarket memory modules. Most vendors list certified spare parts for their hardware, and most memory manufacturers also have product selectors directing you to products that should work with your server.

A common pitfall is that older servers don't support new larger capacity memory modules, which based on all their other properties do fit and would be expected to work.

The most common approach is to populate currently empty memory banks, rather than upgrading to larger sized memory modules. NB You can't populate memory banks assigned to empty CPU sockets.

Finding out what you have now

Some remote management consoles, such as HP's iLO, will display the current memory configuration.

The Linux dmidecode -t memory command will display the maximum amount of memory the main board supports as well as information about what memory is present in the populated memory banks and which ones are still empty.
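
If you want to summarize that output rather than read it by eye, a small parser is enough. The sketch below works on an abbreviated, illustrative sample instead of calling dmidecode (which normally needs root). The function name is made up for this example, and the exact fields and units vary between dmidecode versions and vendors, so treat the sample as an assumption.

```python
# Abbreviated, illustrative sample in the style of `dmidecode -t memory` output.
SAMPLE = """\
Memory Device
    Size: 4096 MB
    Locator: DIMM_A1
    Speed: 1333 MHz

Memory Device
    Size: No Module Installed
    Locator: DIMM_A2
    Speed: Unknown
"""


def parse_memory_devices(text):
    """Collect the "Key: Value" fields of each "Memory Device" block into a dict."""
    devices = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Memory Device":
            current = {}
            devices.append(current)
        elif not stripped:
            current = None  # a blank line ends the current block
        elif current is not None and ":" in stripped:
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    return devices


devices = parse_memory_devices(SAMPLE)
populated = [d for d in devices if d.get("Size") != "No Module Installed"]
empty = [d["Locator"] for d in devices if d.get("Size") == "No Module Installed"]
for d in populated:
    print(d["Locator"], d["Size"], d["Speed"])
print("empty slots:", empty)
```

To run it against a real machine you could pass the command's captured text to parse_memory_devices instead of SAMPLE.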

For Windows systems WMI should provide similar information with wmic MemoryChip.

Mixing memory modules of different sizes

Although it always feels somewhat wrong, I haven't seen any compelling reason why it is bad per se. The owner's manual confirms that it is a supported configuration, provided that all rules regarding memory are complied with.

In multi-CPU configurations you need a balanced memory configuration where each CPU has the same amount of memory on the same memory channels. For example, in a 2-CPU configuration you can have 2 GB in slot A1 and 4 GB in slot A2 as long as that is mirrored on the second CPU: 2 GB in slot B1 and 4 GB in slot B2.
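
The balance rule can be expressed as a quick check. This is a minimal sketch with a made-up data layout (CPU name mapped to module sizes in GB, ordered by slot, with 0 for an empty slot); your server's manual may impose further rules that it does not cover.

```python
def is_balanced(layout):
    """True if every CPU has the same module sizes in the same slot positions."""
    per_cpu = list(layout.values())
    return all(slots == per_cpu[0] for slots in per_cpu)


# Slots A1/A2 belong to the first CPU, B1/B2 to the second.
print(is_balanced({"A": [2, 4], "B": [2, 4]}))  # True
print(is_balanced({"A": [2, 4], "B": [4, 2]}))  # False
print(is_balanced({"A": [2, 4], "B": [2, 0]}))  # False
```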

Mixing memory modules of different speeds

You can mix modules of different speeds as long as the main board supports those speeds. The BIOS is supposed to find the lowest common denominator and run all modules at that speed. Since faster memory is typically more expensive, this seems a small waste of money, although it does allow you to cannibalise some older systems to upgrade others.
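
In other words, the speed you end up with is roughly the minimum of what the system and every installed module can do. The helper below is a hypothetical sketch of that rule only; it assumes all speeds involved are supported by the board and ignores other limits, such as rank counts, that can lower the speed further.

```python
def effective_speed(system_max_mhz, module_speeds_mhz):
    """Speed all modules run at: the slowest of the system limit and every module."""
    return min([system_max_mhz, *module_speeds_mhz])


print(effective_speed(1600, [1600, 1333]))  # 1333
print(effective_speed(1333, [1600, 1600]))  # 1333
```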