Networking doesn't initialize properly when pxebooting Linux Mint (live CD) using cifs, but works with nfs
I have a TFTP/DHCP/NFS/SMB server (Ubuntu server 12.04 LTS) on 192.168.26.1. I use pxelinux to display a menu containing startup and installation options for Windows, an Ubuntu network installer, and the Linux Mint 17 MATE live CD. Getting it running like this was already nasty and I'm running out of steam...
For Linux Mint, I have provided two netboot options: NFS and CIFS. NFS works fully: the user selects it in the boot menu and, a short while later, lands on the Linux Mint live CD desktop. With CIFS, however, networking doesn't initialize properly. When Linux Mint starts, networking hangs for 120 seconds. It then continues to boot to the desktop, but network-manager isn't started (and doesn't start). I suspected that the DHCP server might not be responding, but in the DHCP server log I can see the DHCP request and a successful response.
Once on the Linux Mint desktop, ifconfig reports an IP address assigned by the DHCP server, and pinging the server works.
My pxelinux configuration is as follows (everything after APPEND is on one line; I just split it up for readability on this site):
NFS:
LABEL linuxmint17
MENU LABEL Linux Mint 17
KERNEL linux-mint-17/image/casper/vmlinuz
APPEND
root=/dev/nfs boot=casper netboot=nfs
nfsroot=192.168.26.1:/var/lib/tftpboot/linux-mint-17/image
initrd=/linux-mint-17/image/casper/initrd.lz
CIFS:
LABEL linuxmint17smb
MENU LABEL Linux Mint 17 (SMB)
KERNEL linux-mint-17/image/casper/vmlinuz
APPEND
root=/dev/cifs boot=casper netboot=cifs
nfsroot=//192.168.26.1/tftpshare/linux-mint-17/image
ip=dhcp
initrd=/linux-mint-17/image/casper/initrd.lz
Note that I had to add the ip=dhcp option to the CIFS entry. Without it, the boot process hangs for 120 seconds while initializing networking and then doesn't continue. With it, the boot still hangs, but after 120 seconds it continues.
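Because the APPEND parameters have to end up on a single line in the real pxelinux configuration, the fragments shown above can be joined mechanically. The following is only a minimal Python sketch: `build_append` and `CIFS_APPEND_PARTS` are names invented for illustration, and the fragments are copied from the CIFS entry above.

```python
# Fragments copied from the CIFS entry above, one per display line.
CIFS_APPEND_PARTS = [
    "root=/dev/cifs boot=casper netboot=cifs",
    "nfsroot=//192.168.26.1/tftpshare/linux-mint-17/image",
    "ip=dhcp",
    "initrd=/linux-mint-17/image/casper/initrd.lz",
]


def build_append(parts):
    """Join kernel-parameter fragments into a single APPEND line."""
    # Split on whitespace first so stray newlines or doubled spaces
    # inside a fragment cannot leak into the final line.
    tokens = [tok for part in parts for tok in part.split()]
    return "APPEND " + " ".join(tokens)


if __name__ == "__main__":
    print(build_append(CIFS_APPEND_PARTS))
```

The result is one line starting with APPEND that can be pasted under the LABEL entry.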
The setup:
The client and server virtual machines are only connected to each other (internal network). There are no other machines in the network at all.
The server has all the pxe boot files under /var/lib/tftpboot/. The Linux Mint ISO (unmodified) is mounted under /var/lib/tftpboot/linux-mint-17/image. vmlinuz and initrd are in /var/lib/tftpboot/linux-mint-17/image/casper. /var/lib/tftpboot/ is an NFS export. There is a samba share called tftpshare that maps to /var/lib/tftpboot/ (read-only, allows access to everyone).
smb.conf
[tftpshare]
comment = TFTP Root
path = /var/lib/tftpboot
browsable = yes
guest ok = yes
read only = no
create mask = 0644
dhcpd.conf
authoritative;
subnet 192.168.26.0 netmask 255.255.255.0 {
range 192.168.26.10 192.168.26.40;
next-server 192.168.26.1;
filename "pxelinux.0";
}
There is a strange 2-minute gap in the syslog of the client machine after a successful boot to the live desktop environment:
Jun 14 13:13:18 mint kernel: [ 23.388873] intel_rapl: domain core energy ctr 0:0 not working, skip
Jun 14 13:13:18 mint kernel: [ 23.528409] intel_rapl: domain uncore energy ctr 0:0 not working, skip
Jun 14 13:13:18 mint kernel: [ 23.528453] intel_rapl: no valid rapl domains found in package 0
Jun 14 13:13:20 mint ntpdate[1198]: Can't find host ntp.ubuntu.com: Name or service not known (-2)
Jun 14 13:13:20 mint ntpdate[1198]: no servers can be used, exiting
(2 Minute gap without any entries, roughly at the time when the 120 second boot delay occurs)
Jun 14 13:15:19 mint dbus[864]: [system] Activating service name='org.freedesktop.ConsoleKit' (using servicehelper)
Jun 14 13:15:19 mint dbus[864]: [system] Activating service name='org.freedesktop.PolicyKit1' (using servicehelper)
Jun 14 13:15:19 mint acpid: starting up with netlink and the input layer
Jun 14 13:15:19 mint acpid: 9 rules loaded
Jun 14 13:15:19 mint acpid: waiting for events: event logging is off
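In a longer log, a gap like this can be located mechanically instead of by eye. The following is a minimal Python sketch, not something taken from the setup above: `find_gaps`, its `threshold` parameter, and the placeholder `year` (classic syslog timestamps carry no year) are assumptions made for illustration. The sample lines are shortened copies of the excerpt above.

```python
from datetime import datetime, timedelta

# Shortened copies of the syslog lines shown above.
SAMPLE = [
    "Jun 14 13:13:18 mint kernel: [ 23.528453] intel_rapl: no valid rapl domains found in package 0",
    "Jun 14 13:13:20 mint ntpdate[1198]: no servers can be used, exiting",
    "(2 Minute gap without any entries)",
    "Jun 14 13:15:19 mint dbus[864]: [system] Activating service name='org.freedesktop.ConsoleKit'",
    "Jun 14 13:15:19 mint acpid: 9 rules loaded",
]


def find_gaps(lines, threshold=timedelta(seconds=60), year=2000):
    """Return (previous, current, gap) for every silence longer than threshold."""
    gaps = []
    prev = None
    for line in lines:
        # The classic syslog timestamp is the first 15 characters,
        # e.g. "Jun 14 13:13:20". The year is a placeholder assumption.
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a timestamped line, skip it
        if prev is not None and stamp - prev > threshold:
            gaps.append((prev, stamp, stamp - prev))
        prev = stamp
    return gaps


if __name__ == "__main__":
    for before, after, gap in find_gaps(SAMPLE):
        print(f"{before:%H:%M:%S} -> {after:%H:%M:%S}: {gap.total_seconds():.0f} s of silence")
```

On the sample, this reports a single gap of 119 seconds between 13:13:20 and 13:15:19, which matches the 120-second boot delay.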
This is what happens in both cases when using CIFS:
On the server:
...
Jun 14 13:12:52 ubuntu-netboot in.tftpd[2722]: RRQ from 192.168.26.13 filename /linux-mint-17/image/casper/initrd.lz
Jun 14 13:13:14 ubuntu-netboot dhcpd: DHCPDISCOVER from 08:00:27:1c:c5:43 via eth1
Jun 14 13:13:14 ubuntu-netboot dhcpd: DHCPOFFER on 192.168.26.14 to 08:00:27:1c:c5:43 via eth1
Jun 14 13:13:14 ubuntu-netboot dhcpd: DHCPREQUEST for 192.168.26.14 (192.168.26.1) from 08:00:27:1c:c5:43 via eth1
Jun 14 13:13:14 ubuntu-netboot dhcpd: DHCPACK on 192.168.26.14 to 08:00:27:1c:c5:43 via eth1
The IP that is assigned to the client in case of a successful boot to the desktop, according to ifconfig, is indeed ...14.
This is what happens without the ip=dhcp option: [screenshot]
This is what happens with the ip=dhcp option, immediately before the desktop shows: [screenshot]
I'm thankful for any ideas. If any other logs (which?) would help, I can provide them.
Solution 1:
This problem has been solved by Serva (I'm related to Serva development).
The complete kernel and append lines, plus the additional initrd.gz required for PXE booting current Ubuntu/Mint live versions over CIFS, can be found here.
Basically, the problem is a Casper bug (AFAIK never reported or fixed before): in the case of a CIFS netmount, Casper forgets to export a kernel parameter. This later affects the networking configuration scripts, which end up recreating the file /etc/network/interfaces with delays and errors.
If we look at Serva's Ubuntu/Mint "append" line
append = showmounts toram root=/dev/cifs initrd=NWA_PXE/$HEAD_DIR$/casper/initrd.lz,NWA_PXE/$HEAD_DIR$/casper/INITRD_N11.GZ boot=casper netboot=cifs nfsroot=//$IP_BSRV$/NWA_PXE_SHARE/$HEAD_DIR$ NFSOPTS=-ouser=serva,pass=avres,ro ip=dhcp ro
we find that the embedded "initrd" variable is made of 2 "consecutively loaded" initrd files (initrd.lz and INITRD_N11.GZ)
initrd=NWA_PXE/$HEAD_DIR$/casper/initrd.lz,NWA_PXE/$HEAD_DIR$/casper/INITRD_N11.GZ
The first one (initrd.lz) is the one that ships with Ubuntu/Mint, while the second one (INITRD_N11.GZ) is a tiny (8K) custom initrd, originally developed by Serva, that contains the patched components. This approach avoids the need to recreate the big original initrd.lz (20 MB). INITRD_N11.GZ can be freely downloaded from Serva's site (please do not post direct links here).
If we continue analyzing the "append" line, we see the need to add the CIFS mounting options (the OP omitted this step), which in this case are carried by the somewhat misleadingly named variable "NFSOPTS":
NFSOPTS=-ouser=serva,pass=avres,ro
In this example the SMB share has a user=serva with password=avres and will be mounted read-only; of course, the user/pass parameters must be edited to match your share.
The TFTP paths and the CIFS locator are the ones required by Serva's repository structure; when the PXE server is not Serva, those parameters must be edited accordingly.
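To make the "edited accordingly" part concrete, here is a minimal Python sketch that assembles a line with the same shape as the one above for a non-Serva server. Everything in it is an assumption for illustration: `cifs_append` is an invented helper, the values are taken from the OP's question, the credentials are the example ones from this answer, and the location of INITRD_N11.GZ in the OP's TFTP tree is hypothetical. It emits the pxelinux form `APPEND ...` rather than Serva's `append = ...`.

```python
def cifs_append(server_ip, share, image_dir, initrds, user, password):
    """Build a casper CIFS APPEND line shaped like the one in this answer."""
    params = [
        "showmounts", "toram",
        "root=/dev/cifs",
        # Several initrds are comma-separated and loaded in the given order.
        "initrd=" + ",".join(initrds),
        "boot=casper", "netboot=cifs",
        f"nfsroot=//{server_ip}/{share}/{image_dir}",
        # Despite its name, NFSOPTS carries the CIFS mount options here.
        f"NFSOPTS=-ouser={user},pass={password},ro",
        "ip=dhcp", "ro",
    ]
    return "APPEND " + " ".join(params)


line = cifs_append(
    server_ip="192.168.26.1",          # from the OP's question
    share="tftpshare",                 # from the OP's smb.conf
    image_dir="linux-mint-17/image",   # from the OP's question
    initrds=[
        "/linux-mint-17/image/casper/initrd.lz",
        "/linux-mint-17/INITRD_N11.GZ",  # hypothetical location, adjust
    ],
    user="serva",                      # example credentials from this answer
    password="avres",
)
print(line)
```

The helper does no escaping, so a password containing a comma or a space would need different handling.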
If you PXE boot Ubuntu/Mint live versions from a CIFS share this way, there will be no network-related delays, and networking/Internet access will work right away after boot.
Edit:
The bug has since been reported on Ubuntu Launchpad and confirmed.