"A timeout was reached while waiting for the service to connect" error after rebooting

I have a custom-written Windows service that runs on a number of Hyper-V VMs. The VMs are rebooted a couple of times an hour as part of some automated tests. The service is set to start automatically, and almost all of the time it starts up fine.

However, maybe 5% of the time, with no pattern that I can discern, the service fails to start. When it fails, I get an error in Event Viewer saying

A timeout was reached (30000 milliseconds) while waiting for the My Service Name service to connect.

When this occurs, I can start the service manually, or reboot again, and the service starts fine.

The thing I can't figure out is that the 30-second timeout doesn't appear to be occurring in my code. The very first line of my service class's OnStart() method logs "Starting..." to its log4net log. When the service fails to start, nothing is logged at all, which suggests that either log4net can't log for some reason or the timeout occurs before OnStart() is called.

The service runs on a variety of OSes, from XP all the way up to Win7 and 2008R2, and I know that setting the service to delayed start may solve this for Vista and later, but that seems like a hack.

I haven't been able to remote debug this because it happens so intermittently and during system startup, and I'm at a loss for further ways to figure out what's going on. Any ideas?


Solution 1:

My guess - and that's all it is - is that the disk is thrashing hard during startup, to the point where the .NET Framework itself isn't starting in the 30 seconds that Windows allocates for services to start.

A kludgy workaround may be to set the service to start manually, then write a very small stub service in unmanaged code (e.g. C++, Delphi) to start the service.

Another approach may be to start the service remotely from another machine. The sc command should do the job nicely.
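As a rough illustration of the remote-start idea, the sketch below builds the `sc \\machine start service` command line and retries it a few times from another machine. It is written in Python only for brevity. The host name `TESTVM01`, the retry settings, and the helper names are made up for the example, and it assumes `sc.exe` is on the PATH and the caller has rights on the target machine.

```python
import subprocess
import time


def build_sc_start_command(host, service_name):
    # Equivalent to typing: sc \\HOST start "Service Name"
    return ["sc", "\\\\" + host, "start", service_name]


def start_remote_service(host, service_name, attempts=3, delay_seconds=10,
                         run=subprocess.run):
    """Try to start the service, retrying on failure. Returns True on success.

    Assumption: a zero exit code from sc means the start request was accepted.
    Any non-zero code (including "already running") is treated as a failure.
    """
    command = build_sc_start_command(host, service_name)
    for attempt in range(attempts):
        result = run(command)
        if result.returncode == 0:
            return True
        if attempt < attempts - 1:
            # Give the rebooting machine some time before trying again.
            time.sleep(delay_seconds)
    return False
```

A call such as `start_remote_service("TESTVM01", "My Service Name")` could be run from the test controller once the VM has finished rebooting. The `run` parameter exists only so the retry logic can be exercised without actually invoking `sc`.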

Solution 2:

I was seeing this error in the Event Viewer when trying to install a service with PowerShell.

The problem was that the "Service Name" and "Service Display Name" values in my PowerShell script differed from those I had specified in the program.cs file of my console application.

Solution 3:

For what it's worth, I received this message (almost immediately upon service startup) because version 4.5 of the .NET Framework was not installed on the target machine. I rolled back the version I was targeting to 4.0 (which was already installed on the target machine), and the service worked as expected.
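One way to rule this cause out before deploying is to check the `Release` value that .NET Framework 4.5 and later write under the `NDP\v4\Full` registry key. The following is a minimal sketch in Python, assuming the standard registry location and the 4.5 release number 378389. The function names are invented for the example, and the registry read only works on Windows.

```python
# Release number written by .NET Framework 4.5; later versions use larger values.
NET45_MIN_RELEASE = 378389
NDP_V4_FULL_KEY = r"SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full"


def has_net45_or_later(release):
    """release is the 'Release' DWORD, or None if the value is absent."""
    return release is not None and release >= NET45_MIN_RELEASE


def read_release_value():
    """Read the Release DWORD from the registry (Windows only)."""
    import winreg  # imported lazily so the pure check above works anywhere
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, NDP_V4_FULL_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, "Release")
            return value
    except FileNotFoundError:
        # Key or value missing: only 4.0 (or nothing) is installed.
        return None
```

On the target machine, `has_net45_or_later(read_release_value())` returning False would be consistent with the failure described above.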

Solution 4:

I think I may have found another contributing factor to this kind of "does not start on reboot" error.

It appears that if the Windows Event Log is set to "Overwrite events older than 7 days" with a maximum size of 512 KB, and a lot of activity occurs within that window, the Event Log is effectively full because it cannot overwrite the events generated inside that timeframe. If you set the event log to a much larger size, or to "Overwrite events as needed", you won't experience this issue.
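The arithmetic behind this can be sketched in a few lines of Python. The event rate and the average event size of 500 bytes used below are illustrative guesses, not measured figures, and the function names are made up for the example.

```python
def days_until_full(max_size_kb, events_per_day, avg_event_bytes):
    """Estimate how many days of events fit in a log of the given size."""
    capacity_bytes = max_size_kb * 1024
    return capacity_bytes / (events_per_day * avg_event_bytes)


def log_fills_before_retention(max_size_kb, retention_days, events_per_day,
                               avg_event_bytes):
    """True if the log runs out of space before any event is old enough to overwrite."""
    return days_until_full(max_size_kb, events_per_day, avg_event_bytes) < retention_days


# Assumed workload: 200 events per day at roughly 500 bytes each.
# 7 days of such events need 700,000 bytes, more than the 524,288 available.
busy = log_fills_before_retention(512, 7, events_per_day=200, avg_event_bytes=500)

# Same settings with a 20 MB log leave plenty of headroom.
roomy = log_fills_before_retention(20 * 1024, 7, events_per_day=200, avg_event_bytes=500)
```

Under these assumed numbers `busy` is True and `roomy` is False, which matches the observation that enlarging the log (or allowing overwrite as needed) makes the problem go away.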