Maxing out both PDUs in a rack with redundant power
If I have a rack full of servers using redundant power supplies, and I'm keeping the amps at 80% (16A of 20A), isn't this still asking for trouble in the event of a power loss on one of the circuits?
If power is distributed 50-50 (e.g. an average of 1A per power supply on a server that would use 2A when using only one power supply), when one circuit fails, the load is going to pull from the other circuit (and will likely throw a breaker).
I guess I've always just gone the 80% route without thinking too much about it, but does this mean I should only be "maxing out" our circuits to 40%?
Running dual-PSU servers in a balanced power distribution mode doesn't mean that each PSU draws exactly 50% of the power a single PSU would draw. Balancing the load is usually quite inefficient. I don't have exact numbers, but as far as I know it wastes about 20% in power.
However, running them in balanced mode and then pushing each PDU up to 80% is begging for trouble. I see absolutely no chance of the remaining PDU surviving if you're already that far up in usage with balanced power distribution.
One other thing to keep in mind is the most catastrophic scenario in this situation: a site-wide power failure where both PDUs go offline, and one of them breaks its circuit because of overvoltage or similar (not that uncommon). What happens when all your servers power on at the same time, with a single 20A circuit to share? I'm sure I don't have to explain it further.
If I were you I'd start making preemptive changes right away. Set the servers to high-efficiency power mode (if they support it), which usually puts 2-3% of the load on the standby PSU and the rest on the other. You'll quickly see if you're maxing out one of the PDUs. Also, make sure the servers have their power-on delays staggered in clusters, so that they don't all power on at the same time after an outage.
Edit: Oh, and of course, order bigger PDUs and bigger circuits :-)
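To make the staggered power-on idea a bit more concrete, here is a rough Python sketch that groups servers into batches so the startup draw of each batch, plus the steady draw of everything already running, stays under a chosen budget. The function name, the amp figures and the 30-second step are all made up for illustration; the actual delay has to be set through whatever power-on delay option your servers' firmware offers, and real startup draw should be measured rather than guessed.

```python
def stagger_power_on(servers, budget_amps, step_seconds=30):
    """Assign a power-on delay (in seconds) to each server.

    servers: list of (name, startup_amps, steady_amps) tuples.
             All amp values are assumptions you would have to measure.
    budget_amps: the most you are willing to draw from the one circuit.
    """
    schedule = {}
    settled = 0.0        # steady draw of servers from earlier batches
    batch_startup = 0.0  # startup draw of the batch currently being filled
    batch_steady = 0.0   # steady draw this batch will settle to
    delay = 0

    for name, startup_amps, steady_amps in servers:
        # Start a new batch if adding this server would exceed the budget.
        if batch_startup and settled + batch_startup + startup_amps > budget_amps:
            settled += batch_steady
            batch_startup = batch_steady = 0.0
            delay += step_seconds
        # Even alone in a batch, this server does not fit on the circuit.
        if settled + startup_amps > budget_amps:
            raise ValueError(f"{name} does not fit within {budget_amps} A")
        schedule[name] = delay
        batch_startup += startup_amps
        batch_steady += steady_amps

    return schedule


# Hypothetical rack: each server peaks at 6 A on boot and settles at 2 A.
rack = [("a", 6.0, 2.0), ("b", 6.0, 2.0), ("c", 6.0, 2.0), ("d", 6.0, 2.0)]
plan = stagger_power_on(rack, budget_amps=16.0)
print(plan)
```

If the function raises, the circuit is too small for the rack even with staggering, which is the "order bigger circuits" case.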
Much of this depends on your tolerance for risk and the policies at your datacenter facility. If this is your office's datacenter/computer room, the conditions may be different than at a co-location facility.
My colo strictly enforces the 80% utilization rule. This is for their protection, my protection and customer/provider SLAs. But it's also mandated by the electrical code here.
High-end server power supplies can be configured to balance load across the circuits or to run in a high-efficiency mode, where the load is unbalanced...
Balanced PSU mode on an HP ProLiant server:
hpasmcli> SHOW POWERSUPPLY
Power supply #1
Present : Yes
Redundant: Yes
Condition: Ok
Hotplug : Supported
Power : 105 Watts
Power supply #2
Present : Yes
Redundant: Yes
Condition: Ok
Hotplug : Supported
Power : 95 Watts
hpasmcli> SHOW POWERMETER
Power Meter #1
Power Reading : 200
High-efficiency mode on an HP ProLiant server:
hpasmcli> SHOW POWERMETER
Power Meter #1
Power Reading : 290
hpasmcli> SHOW POWERSUPPLY
Power supply #1
Present : Yes
Redundant: Yes
Condition: Ok
Hotplug : Supported
Power : 255 Watts
Power supply #2
Present : Yes
Redundant: Yes
Condition: Ok
Hotplug : Supported
Power : 35 Watts
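If you want to compare the split across many servers, the per-supply wattage can be pulled out of captured SHOW POWERSUPPLY output with a few lines of Python. This is only a sketch: it assumes the output looks like the listings above, the function names are my own, and how you capture the text from each host is left out.

```python
import re


def parse_power_supplies(text):
    """Return {psu_number: watts} from SHOW POWERSUPPLY style output."""
    readings = {}
    current = None
    for line in text.splitlines():
        header = re.search(r"Power supply #(\d+)", line)
        if header:
            current = int(header.group(1))
            continue
        # Matches "Power : 255 Watts" but not "Power supply #1".
        watts = re.search(r"Power\s*:\s*(\d+)\s*Watts", line)
        if watts and current is not None:
            readings[current] = int(watts.group(1))
    return readings


def load_shares(readings):
    """Return each supply's fraction of the total measured draw."""
    total = sum(readings.values())
    if total == 0:
        return {}
    return {psu: watts / total for psu, watts in readings.items()}


# The high-efficiency listing from above, pasted in as a string.
sample = """Power supply #1
Present : Yes
Redundant: Yes
Condition: Ok
Hotplug : Supported
Power : 255 Watts
Power supply #2
Present : Yes
Redundant: Yes
Condition: Ok
Hotplug : Supported
Power : 35 Watts
"""

readings = parse_power_supplies(sample)
shares = load_shares(readings)
print(readings, shares)
```

A share close to 0.5 on each supply suggests balanced mode; a heavily lopsided split suggests high-efficiency mode.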
Your situation sounds like mine. You have two 20A feeds and are using redundant power supplies. You really don't want to load either side more than 40% (8 amps). You want the aggregate across both feeds to be 80% of a single circuit, so that's 16 amps total for you. No more!
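The arithmetic is easy to check for your own numbers. The sketch below takes the pessimistic view that, when one feed dies, the surviving feed picks up the full combined draw; the function name and the returned fields are just for illustration.

```python
def failover_check(feed_a_amps, feed_b_amps, breaker_amps=20.0, limit_fraction=0.8):
    """Estimate the load on the surviving feed if the other one fails.

    Assumes the whole combined draw lands on the survivor (worst case).
    """
    limit = breaker_amps * limit_fraction
    combined = feed_a_amps + feed_b_amps
    return {
        "limit_amps": limit,                       # 80% of the breaker rating
        "failover_amps": combined,                 # draw on the surviving feed
        "within_limit": combined <= limit,         # still inside the 80% rule
        "over_breaker": combined > breaker_amps,   # past the breaker's rating
    }


# Both feeds loaded to 80% (16 A each): failover puts 32 A on a 20 A circuit.
maxed = failover_check(16.0, 16.0)

# Both feeds loaded to 40% (8 A each): failover lands exactly on the 16 A limit.
safe = failover_check(8.0, 8.0)

print(maxed)
print(safe)
```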
As you can see below, I run a bit close to the edge. I had a temporary server in the environment for 45 days and had to receive special permission from the facility to run over the power limit for the duration.
In my experience, server power supplies do not behave as you predict. In other words, a server drawing 1/1 in a steady state with both PDUs working will go to 1.5/0 or 0/1.5 if a PDU or circuit dies, not to 2/0. You can test this for yourself with a smart PDU or with a Kill A Watt meter.
The real question is whether the 80% load rule is an absolute requirement or not.
In my case, I simply stored an extra PDU in the cage and monitored the power closely. Occasionally, we'd drift above 80% utilization on one circuit or another. It never got to the point where the facility complained.