IBM Server takes a long time to boot past UEFI to OS
I have a pair of IBM System x3620 servers. These servers do fine once they finally reach the point where the operating system takes over, but it takes them forever to get past the new-fangled UEFI boot system... a good five minutes or so; maybe longer. I haven't timed it, but it's the kind of thing where you go get a cup of coffee while you wait and it's still going when you come back.
Normally the only time I shut these down is for a monthly maintenance cycle (usually just Windows updates). It's built-in maintenance time, so the extra 5 minutes don't count against our SLAs and aren't a big deal. However, if I ever have an outage, I'd sure like to get those 5 minutes back. Is there anything I can do to tell them to just go ahead and boot already? I've already disabled everything I can find to disable as far as extra boot options go.
All IBM uEFI machines take ages to boot: after the lengthy uEFI initialization and module startup, the legacy BIOS emulation kicks in and the PCI-E option ROMs get executed, and so on. This is "normal" on all IBM uEFI machines, whether blade or standard rack server.
You could disable legacy BIOS boot and the option ROMs, optimize the boot order, and generally keep the machine at the newest firmware level offered by IBM.
I agree. The System X uEFI legacy implementation is so painfully slow that I might even avoid selling these machines as a platform to my clients.
Measured from the time it starts a legacy USB key boot until I get an OS prompt, the IBM takes a ridiculously long time. I am using SmartOS (an illumos/OpenSolaris derivative; for all intents and purposes, once booted it runs and acts a lot like Solaris 11), which works like Puppy Linux: it loads a 275MB "compressed" blob (the entire OS) and then boots the OS in memory. This really showcases the problem with IBM's uEFI implementation of legacy booting.
BEG: 1:27:05 pm (start SmartOS USB 2.0 USB key)
END: 1:33:38 pm (done, into running SmartOS; we read 275MB)
TOOK: 6:33 (six minutes and 33 seconds, i.e. 393 seconds; pretty slow, only about 0.70MB/sec.)
It is almost as if the UEFI implementation uses a tiny block size, like 512-byte reads, rather than a larger buffer. Once I am in the OS I can benchmark the USB key I booted from. IMHO, if the IBM UEFI code would just read with an 8192-byte block size, or better yet a 32768-byte block size, the resulting boot would be super fast.
So once in a running SmartOS system I saw the following performance characteristics for my USB key, with block sizes ranging from 512 bytes to 131072 bytes. It looks like either an 8192 block size (12.3 MB/sec in a booted OS) or better yet a 32768 block size (20.02 MB/sec in a booted OS) would be a good choice. It also looks like a 512 block size (0.64 MB/sec in a booted OS) matches pretty closely what I seem to experience in my lengthy boots.
time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=512 count=524288
524288+0 records in
524288+0 records out
real 31m19.499s => 00.64MB/sec. on SmartOS like Solaris 11 (this is the speed of the IBM bios boot speed)

time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=1024 count=262144
262144+0 records in
262144+0 records out
real 1m39.989s => 02.56MB/sec. SmartOS like Solaris 11

time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=2048 count=131072
131072+0 records in
131072+0 records out
real 0m50.215s => 05.09MB/sec. SmartOS like Solaris 11

time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=4096 count=65536
65536+0 records in
65536+0 records out
real 0m33.056s => 07.74MB/sec. SmartOS like Solaris 11

time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=8192 count=32768
32768+0 records in
32768+0 records out
real 0m20.757s => 12.33MB/sec. SmartOS like Solaris 11

time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=32768 count=8192
8192+0 records in
8192+0 records out
real 0m12.785s => 20.02MB/sec. on SmartOS like Solaris 11 (as expected and seen on a Win7 box)

time dd if=/dev/dsk/c1t0d0p0 of=/dev/null bs=131072 count=2048
2048+0 records in
2048+0 records out
real 0m11.532s => 22.19MB/sec. SmartOS like Solaris 11
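For anyone who wants to redo the arithmetic on their own hardware, here is a minimal Python sketch of how a MB/sec figure can be derived from a dd run (block size times count, divided by elapsed time), and of how long a 275MB image would take to read at that rate. The function names are hypothetical, and it assumes "MB" means 2**20 bytes. It only covers the raw read, not the rest of the boot.

```python
MIB = 2**20  # assumption: "MB" in the dd figures means 2**20 bytes


def throughput_mb_per_sec(block_size, count, elapsed_sec):
    """Read rate in MB/sec for a dd run of `count` blocks of `block_size` bytes."""
    return block_size * count / elapsed_sec / MIB


def seconds_to_read(image_mb, rate_mb_per_sec):
    """Time to read an image of `image_mb` megabytes at the given rate."""
    return image_mb / rate_mb_per_sec


if __name__ == "__main__":
    # (block size, count, elapsed seconds) taken from two of the dd runs
    runs = [(8192, 32768, 20.757), (32768, 8192, 12.785)]
    for bs, count, elapsed in runs:
        rate = throughput_mb_per_sec(bs, count, elapsed)
        load = seconds_to_read(275, rate)
        print(f"bs={bs}: {rate:.2f} MB/sec, 275MB image in about {load:.0f} s")
```

At either of these block sizes the read portion of the boot would take well under half a minute, which is the point of the comparison.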
I was using a new IBM x3550 M3 with UEFI (BIOS) rev 1.13 (12GB RAM and one 2.266GHz Xeon processor):
Firmware Type   Version String   Release Date
IMM             YUOOC7E          09/30/2011
UEFI            D6E154A          09/23/2011
DSA             DSYT89P          10/28/2011
I must say I am sorely disappointed with the "speed" of USB booting in the legacy BIOS mode of the IBM UEFI implementation.
Food for thought: a Supermicro XSCA9F or an Oracle-Sun X4275 will boot my 275MB USB key image in just 32 or 33 seconds respectively, while the IBM x3550 M3 takes 393 seconds (6:33) for the same image, roughly 12 times slower.
This performance is totally unacceptable, and the issue exists across the entire System X line. I have been in contact with IBM and they just say to try a uEFI boot loader (which to me is like saying: learn the UEFI spec, learn GRUB2 and write your own custom boot loader; yes, it's doable, but I don't have an extra 2-3 weeks to mess with this stuff). A "pure" uEFI boot should be fast, but I cannot prove it, and I could then no longer use "standard distros"; as I indicated, I would be forced to write my own uEFI boot loader.
I reported this "slow legacy booting" problem under IBM Problem/Ticket # A02PGGK. I even tried contacting the uEFI developer (I think it is Michael Brinkman) directly. However, IBM doesn't seem to care to acknowledge this issue or the large community of people and companies impacted by it.
I have also posted a similar analysis to a thread at http://communities.intel.com/thread/3909?wapkw=uEFI which was already discussing "slow booting" back in Sep. 2009. It describes the same issue I have been seeing:
Boot time is too slow. It take about 20 minutes to boot a VMware ESX when EFI is used, compared to less than 2 minutes with the normal bios
This is the same roughly tenfold slowdown I experience. Hopefully some day IBM will fix this.
Jon Strabala