Best sysadmin WTF?

You've surely witnessed it with your own eyes (or will, sooner or later): that dreadful project/system/situation where something got SO screwed up you just can't believe it actually went the way it did.

Mismanagement? Misbudgeting? Misunderstanding? Just plain, silly ignorance? Name your cause; it sure happened (and, sadly, keeps happening a lot).

Describe it here, for amusement (albeit of the somewhat cynical kind) and, hopefully, learning.

Some rules:

  • This is not the place for random (even if utterly devastating) admin errors, so please avoid "argh, I mistyped that rm -r" or "OMG I JUST COPIED THE CORRUPTED DATABASE OVER MY LAST GOOD BACKUP" (been there, done that); those stories belong elsewhere. This is about "exactly what sort of drugs was whoever designed/implemented this system under the influence of?".
  • One WTF per post, so each can be properly commented on.
  • Please post something you actually witnessed :-)
  • If it was you who did it, it still qualifies :-)

I'll be adding some material soon; feel free to add your own, and please do :-)


Solution 1:

Email response from a Microsoft support engineer to a reported issue:

"As far as my opinion of your issue, I have one word: WEIRD."

Gold!

Solution 2:

I was called by a company I had never heard of before, which had been tasked with implementing an Exchange 2003 mail server for a customer and had no clue at all how to do it; nothing too strange, right? I work as a freelance consultant, so I'm perfectly fine doing the jobs you don't know how to do (and taking your money for it).

So I went to the customer's site, and discovered something quite strange: every single server on the network was a domain controller; all 15 or so of them.

Then I discovered something even stranger: none of them was properly replicating with any other, overall Active Directory behaviour could only be described as "erratic", users were having just about every network issue you can imagine, and Exchange flatly refused to install with errors unknown to mankind.

So I had a look at the network configuration on the server, and I saw... it was using the ISP's public DNS servers. Then I looked at another server... same thing. Then I looked at a DC... same again. Then I asked... and it was officially confirmed: each and every computer on the network (about 1,500 of them) was using the ISP's DNS servers instead of a proper domain controller.

I proceeded to explain that DNS is quite critical for proper Active Directory operation (a quick check for this sort of misconfiguration is sketched after the list below), and was able to reconstruct the back story:

  • Someone originally set up the AD domain correctly, using a DC as the DNS server for every computer.
  • He/she/it didn't know anything about forwarders and/or firewall configuration, so computers were unable to resolve public Internet names.
  • So came the idea of using the ISP's DNS servers on the computers; they configured them on every one of them.
  • They started getting lots of "can't find a domain controller" errors (who would have guessed?).
  • They thought the problem was caused by not having enough DCs, so they proceeded to promote every single server to that role.
  • Needless to say, this only made things worse, as the new DCs also used the wrong DNS servers, so they too were unable to replicate.
  • This went on for months; they just "got used" to the network being totally unreliable.
  • On top of this, they tried to launch Exchange setup, which crashed miserably; only then did they decide to call in an external consultant, and until then they had absolutely no clue that something was totally, definitely wrong with their network setup.
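To make the failure mode concrete: domain members locate their DCs through DNS SRV records (such as _ldap._tcp.dc._msdcs.<domain>) that exist only in the AD-integrated DNS zone, so an ISP's public resolver simply cannot answer the locator queries. Below is a minimal sketch of how you could check which resolver actually knows about the DCs; it assumes the third-party dnspython package, and the domain name and server addresses are purely hypothetical:

    # Minimal sketch (assumes the dnspython package; the AD domain
    # "corp.example.com" and the IP addresses below are made-up examples).
    import dns.exception
    import dns.resolver

    AD_DOMAIN = "corp.example.com"                    # hypothetical AD DNS domain
    DC_LOCATOR = f"_ldap._tcp.dc._msdcs.{AD_DOMAIN}"  # SRV record only the AD zone holds

    def can_locate_dc(nameserver: str) -> bool:
        """Ask one specific DNS server for the DC locator SRV record."""
        resolver = dns.resolver.Resolver(configure=False)
        resolver.nameservers = [nameserver]
        try:
            return len(resolver.resolve(DC_LOCATOR, "SRV")) > 0
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer,
                dns.resolver.NoNameservers, dns.exception.Timeout):
            return False

    print(can_locate_dc("10.0.0.10"))     # a DC hosting the AD zone -> True
    print(can_locate_dc("203.0.113.53"))  # the ISP's public resolver -> False

The fix, of course, was the one the original admin had missed: point every member machine back at the AD-integrated DNS servers, and configure forwarders (and the firewall) on those servers so public Internet names still resolve.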

Solution 3:

Once upon a time I had a client, a small business (10 people) with an electronic health record system (not a medical doctor). One day I noticed that the backups had been failing. Upon testing, the tape drive turned out not to be working at all. I mentioned this to the owner, who said he was well aware the drive was bad, but that it was too expensive to replace.

Sure -- that's not very WTF.

The WTF is that he had his staff rotating the tapes daily, taking them to a safety deposit box, and all that jazz for the 6-9 months since the drive had died.

"Don't tell the staff, it might worry them"

Solution 4:

I was working as a sysadmin for a Big Government Agency (one of the main bodies of Italy's government), and had been managing their data center for some months. One evening, my phone rings and my boss tells me something Very Bad is happening: a total power outage.

Ok, we have UPSes, right?

Yes, but they won't last long, so better go there and shut down everything until power returns.

I go there, make my way through the dark corridors, arrive at the server room... and am greeted by what can only be described as pure hell. Literally. The room was so hot you could have baked cakes in it. UPS power was ok, but half of the servers had already shut down from overheating and the remaining ones were screaming in agony.

The reason?

Servers were on UPS power... air conditioning was not.