Hangfire job is crashing the entire application; how do we get Hangfire to handle the errors?

We have several Hangfire jobs, and occasionally one of them brings down our entire site. Hangfire isn't catching the errors. We even added a try/catch around the ProcessCore() call, but the catch block is never hit.

Has anyone else had this issue? Is there some specific setup needed to avoid this? (In the error below, BackgroundServices.UpdateClientStats is the Hangfire job.)

RecurringJob.AddOrUpdate("updateClientStats", () => container.GetInstance<UpdateClientStats>().Process(null), Cron.Daily(7));

public class UpdateClientStats : BackgroundAppService {
    private readonly ClientStatsHelper clientStatsHelper;

    public UpdateClientStats(DBContext context, IDomainRuleViolationCollection domainRuleViolationCollection, ClientStatsHelper clientStatsHelper) : base(context, domainRuleViolationCollection) {
        this.clientStatsHelper = clientStatsHelper;
    }

    protected async override Task ProcessCore() {
        var userIds = context.Users.Where(x => x.Status == ClientStatus.Active && x.TeamMemberRole.HasFlag(TeamMemberRole.Runner)).Select(x => x.Id).ToList();

        userIds.ForEach(async userId => {
            await clientStatsHelper.UpdateClientStats(userId, false, false);
        });
    }
}


public async Task UpdateClientStats(long clientUserId, bool skipJournalEntries, bool skipLastScheduledWorkoutDate) {
    using (var scope = serviceScopeFactory.CreateScope()) {
        var context = scope.ServiceProvider.GetService<DBContext>();

        ... more code logic here...

        var user = await context.Users.Where(x => x.Id == clientUserId).FirstOrDefaultAsync();

        if (user != null) {
            user.UpdateNextScheduledEvent(await Helpers.ClientHelper.GetClientNextEventDate(context.ClientProgramWorkoutDays.AsQueryable(), clientUserId));

            ... more logic here...

            user.UpdateStats(completionPercentage, pastCompletionPercentage, durations7Days.Miles, durationsPast7Days.Miles, durations7Days.TimeInSeconds,
                durationsPast7Days.TimeInSeconds, durations30Days.Miles, durationsPast30Days.Miles, durations30Days.TimeInSeconds, durationsPast30Days.TimeInSeconds,
                durationsTrailing7to34Days.Miles, durationsTrailing7to34Days.TimeInSeconds, intensityInSeconds7Days, intensityInSecondsTrailing7to34Days);

            await context.SaveChangesAsync();
        }
    }
}



public abstract class BackgroundAppService : BaseAppService {
    public BackgroundAppService(DBContext context, IDomainRuleViolationCollection domainRuleViolationCollection) : base(context, domainRuleViolationCollection) {
    }

    protected abstract Task ProcessCore();

    [DisableConcurrentExecution(timeoutInSeconds: 10)]
    [AutomaticRetry(Attempts = 0, OnAttemptsExceeded = AttemptsExceededAction.Delete)]
    public void Process(PerformContext performContext) {
        using (LogContext.PushProperty("ApplicationName", "Hangfire"))
        using (LogContext.PushProperty("Hangfirejob", this.GetType().Name))
        using (LogContext.PushProperty("HangfireJobID", performContext?.BackgroundJob?.Id))
        using (LogContext.Push(new PerformContextEnricher(performContext))) {
            Log.Information("Job {jobName} Started", this.GetType().Name);

            try {
                ProcessCore().Wait();
            } catch (Exception ex) {
                Log.Error(ex.ToString());
                throw ex;
            }

            Log.Information("Job {jobName} Finished", this.GetType().Name);
        }
    }
}


Your app crashed because of System.InvalidOperationException

Your app, crashed because of System.InvalidOperationException and aborted the requests it was processing when the overflow occurred. As a result, your app’s users may have experienced HTTP 502 errors.

This call stack caused the exception:

Microsoft.Data.Common.ADP.ExceptionWithStackTrace
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Storage.RelationalConnection+d__50.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
Microsoft.EntityFrameworkCore.Storage.RelationalConnection+d__50.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Storage.RelationalConnection+d__47.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Storage.RelationalCommand+d__17.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor+AsyncQueryingEnumerable`1+AsyncEnumerator+d__17[[System.__Canon System.Private.CoreLib]].MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor+d__20`1[[System.__Canon System.Private.CoreLib]].MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
Microsoft.EntityFrameworkCore.Query.ShapedQueryCompilingExpressionVisitor+d__20`1[[System.__Canon System.Private.CoreLib]].MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
OurApp.Application.Helpers.ClientStatsHelper+d__2.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess
System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification
OurApp.Application.BackgroundServices.UpdateClientStats+<b__2_2>d.MoveNext
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw
System.Threading.Tasks.Task+<>c.b__139_1
System.Threading.QueueUserWorkItemCallback+<>c.<.cctor>b__6_0
System.Threading.ExecutionContext.RunForThreadPoolUnsafe[[System.__Canon System.Private.CoreLib]]
System.Threading.QueueUserWorkItemCallback.Execute
System.Threading.ThreadPoolWorkQueue.Dispatch
System.Threading._ThreadPoolWaitCallback.PerformWaitCallback

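I would say the problem comes from this call:

userIds.ForEach(async userId => {
    await clientStatsHelper.UpdateClientStats(userId, false, false);
});

List<T>.ForEach takes an Action<T>, so the async lambda is implicitly compiled as an async void delegate. Each invocation returns to ForEach at its first incomplete await, so ForEach only starts the updates; ProcessCore() then completes before they finish, and the try/catch around ProcessCore().Wait() never sees their exceptions. An exception thrown from an async void method is instead rethrown on the thread pool (hence the ThreadPoolWorkQueue.Dispatch frames at the bottom of your stack trace), where nothing can catch it, and that takes down the whole process. See "Avoid async void".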
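To make the fire-and-forget behaviour concrete, here is a minimal, self-contained sketch. FakeUpdate and AsyncVoidDemo are hypothetical stand-ins (they are not part of the code in the question), and the sketch deliberately does not throw, because an exception escaping an async void lambda would terminate the demo process too.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncVoidDemo {
    // Counts how many fake updates have run to completion.
    public static int Completed;

    // Hypothetical stand-in for clientStatsHelper.UpdateClientStats.
    static async Task FakeUpdate(int id) {
        await Task.Delay(200);                 // simulated database work
        Interlocked.Increment(ref Completed);
    }

    // Returns the number of updates finished at the moment ForEach returns.
    public static int CompletedRightAfterForEach() {
        var ids = new List<int> { 1, 2, 3 };

        // ForEach expects Action<int>, so this lambda is compiled as async void:
        // ForEach cannot await it and no Task is handed back to the caller.
        ids.ForEach(async id => {
            await FakeUpdate(id);
        });

        // All three updates are still inside Task.Delay at this point,
        // so this is expected to be 0.
        return Completed;
    }
}
```

Calling AsyncVoidDemo.CompletedRightAfterForEach() should return 0: the loop has merely started the work, which is why a try/catch around the caller has nothing to observe.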

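Try a plain foreach loop instead, so that each call is awaited and any exception propagates through the Task returned by ProcessCore():

foreach (var userId in userIds) {
    await clientStatsHelper.UpdateClientStats(userId, false, false);
}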
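As a rough illustration of why the awaited loop helps, the sketch below mirrors the shape of ProcessCore() and Process() with hypothetical stand-ins (FakeUpdate, AwaitedLoopDemo, and the failing user id 2 are invented for the demo). It blocks with GetAwaiter().GetResult() rather than Wait(), which is my own choice here: Wait() would surface the same failure wrapped in an AggregateException.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitedLoopDemo {
    // Hypothetical stand-in for clientStatsHelper.UpdateClientStats; fails for user 2.
    static async Task FakeUpdate(long userId) {
        await Task.Delay(10);
        if (userId == 2) {
            throw new InvalidOperationException("update failed for user " + userId);
        }
    }

    // Same shape as ProcessCore(): every call is awaited, so a failure
    // faults the Task that this method returns.
    static async Task ProcessCore() {
        var userIds = new List<long> { 1, 2, 3 };
        foreach (var userId in userIds) {
            await FakeUpdate(userId);
        }
    }

    // Mirrors the try/catch in Process(). Returns the caught message,
    // or an empty string if nothing was caught.
    public static string RunAndCatch() {
        try {
            ProcessCore().GetAwaiter().GetResult();
            return "";
        } catch (InvalidOperationException ex) {
            return ex.Message;
        }
    }
}
```

Here AwaitedLoopDemo.RunAndCatch() returns "update failed for user 2", meaning the catch block is reached. In the real job, that is the point where Hangfire would get to see the failure and apply its own error handling.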

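If you want to run all the updates concurrently, you can collect the tasks and await them together:

var tasks = new List<Task>();
foreach (var userId in userIds) {
    tasks.Add(clientStatsHelper.UpdateClientStats(userId, false, false));
}
await Task.WhenAll(tasks);

Be aware that this starts one update per user at the same time, each with its own DbContext, much as the original ForEach did, so with many active users it may put pressure on the database connection pool.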
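If running everything at once turns out to be too heavy, one possible variation (not something the question's code does) is to cap the number of updates in flight with a SemaphoreSlim. This is only a sketch: ThrottledDemo, FakeUpdate, and the MaxObserved counter are hypothetical names used to make the example self-contained and checkable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledDemo {
    // Highest number of fake updates seen running at the same time.
    public static int MaxObserved;
    static int inFlight;

    // Hypothetical stand-in for clientStatsHelper.UpdateClientStats.
    static async Task FakeUpdate(long userId) {
        int now = Interlocked.Increment(ref inFlight);
        int prev;
        // Record the peak concurrency in a thread-safe way.
        do {
            prev = MaxObserved;
        } while (now > prev && Interlocked.CompareExchange(ref MaxObserved, now, prev) != prev);

        await Task.Delay(20);                  // simulated database work
        Interlocked.Decrement(ref inFlight);
    }

    // Runs one update per user id, with at most maxConcurrency in flight.
    public static async Task RunAsync(IEnumerable<long> userIds, int maxConcurrency) {
        using (var gate = new SemaphoreSlim(maxConcurrency)) {
            var tasks = userIds.Select(async userId => {
                await gate.WaitAsync();        // wait for a free slot
                try {
                    await FakeUpdate(userId);
                } finally {
                    gate.Release();            // free the slot even on failure
                }
            }).ToList();

            // The lambda above returns a Task (Select expects Func<long, Task>),
            // so failures are observed here rather than lost.
            await Task.WhenAll(tasks);
        }
    }
}
```

For example, after awaiting ThrottledDemo.RunAsync(ids, 3) with ten ids, MaxObserved stays at or below 3. The right limit for a real job depends on the database and pool settings, which the question does not state.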

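You should also avoid rethrowing with throw ex;, which resets the exception's stack trace so that it appears to originate in your catch block. A bare throw; preserves the original trace. See Is there a difference between "throw" and "throw ex"?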
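The difference can be seen in a small sketch. RethrowDemo and Fail are hypothetical names; Fail is marked NoInlining only so that its frame reliably shows up in the trace.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RethrowDemo {
    // The method where the exception really originates.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Fail() {
        throw new InvalidOperationException("boom");
    }

    // Rethrows with "throw ex;" and returns the resulting stack trace.
    public static string TraceAfterThrowEx() {
        try {
            try {
                Fail();
            } catch (Exception ex) {
                throw ex;   // stack trace restarts here; the Fail frame is lost
            }
        } catch (Exception outer) {
            return outer.StackTrace ?? "";
        }
        return "";
    }

    // Rethrows with a bare "throw;" and returns the resulting stack trace.
    public static string TraceAfterThrow() {
        try {
            try {
                Fail();
            } catch (Exception) {
                throw;      // original stack trace, including Fail, is kept
            }
        } catch (Exception outer) {
            return outer.StackTrace ?? "";
        }
        return "";
    }
}
```

The trace returned by TraceAfterThrow() includes the Fail frame, while the one from TraceAfterThrowEx() starts at the rethrow site. In the job above, that is the difference between a log that points at the failing query and one that only points at Process().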