"FetchError no backend connection" error when Apache is running

[centos@ip-172-35-25-65 ~]$  varnishlog
    0 CLI          - Rd ping
    0 CLI          - Wr 200 19 PONG 1635280998 1.0
    0 CLI          - Rd ping
    0 CLI          - Wr 200 19 PONG 1635281001 1.0
   10 SessionOpen  c 127.0.0.2 55870 127.0.0.2:80
   10 ReqStart     c 127.0.0.2 55870 894208400
   10 RxRequest    c GET
   10 RxURL        c /
   10 RxProtocol   c HTTP/1.0
   10 RxHeader     c X-Real-IP: 198.95.75.75
   10 RxHeader     c X-Forwarded-For: 198.95.75.75
   10 RxHeader     c X-Forwarded-Proto: https
   10 RxHeader     c X-Forwarded-Port: 80
   10 RxHeader     c Host: staging03.cherry.com
   10 RxHeader     c Connection: close
   10 RxHeader     c Cache-Control: max-age=0
   10 RxHeader     c Authorization: Basic aGc6am9objEyMw==
   10 RxHeader     c Upgrade-Insecure-Requests: 1
   10 RxHeader     c User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36
   10 RxHeader     c Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
   10 RxHeader     c Accept-Encoding: gzip, deflate
   10 RxHeader     c Accept-Language: en-US,en;q=0.9,fr;q=0.8
   10 RxHeader     c Cookie: ajs_anonymous_id=%22424f4cd9-cbbc-4ead-83b1-273cb21cf453%22; _fbp=fb.1.1630002144579.2012566540; __qca=P0-1416512434-1630002144589; _edwvts=708154457303700204; _gid=GA1.2.1572498662.1635275261; ajs_user_id=%224543534%40mimpi99.com%22; _gcl_au=1.1.
   10 VCL_call     c recv pass
   10 VCL_call     c hash
   10 Hash         c /
   10 Hash         c staging03.cherry.com
   10 Hash         c 80
   10 Hash         c ajs_anonymous_id=%22424f4cd9-cbbc-4ead-83b1-273cb21cf453%22; _fbp=fb.1.1630002144579.2012566540; __qca=P0-1416512434-1630002144589; _edwvts=708154457303700204; _gid=GA1.2.1572498662.1635275261; ajs_user_id=%224543534%40mimpi99.com%22; _gcl_au=1.1.1880042
   10 VCL_return   c hash
   10 VCL_call     c pass pass
   10 FetchError   c no backend connection
   10 VCL_call     c error deliver
   10 VCL_call     c deliver deliver
   10 TxProtocol   c HTTP/1.1
   10 TxStatus     c 503
   10 TxResponse   c Service Unavailable
   10 TxHeader     c Server: Varnish
   10 TxHeader     c Content-Type: text/html; charset=utf-8
   10 TxHeader     c Retry-After: 5
   10 TxHeader     c Content-Length: 392
   10 TxHeader     c Accept-Ranges: bytes
   10 TxHeader     c Date: Tue, 26 Oct 2021 20:43:23 GMT
   10 TxHeader     c X-Varnish: 894208400
   10 TxHeader     c Via: 1.1 varnish
   10 TxHeader     c Connection: close
   10 TxHeader     c X-Age: 0
   10 TxHeader     c X-Cache: MISS
   10 Length       c 392
   10 ReqEnd       c 894208400 1635281003.852778196 1635281003.852984428 0.000073195 0.000165701 0.000040531
   10 SessionClose c error
   10 StatSess     c 127.0.0.2 55870 0 1 1 0 1 0 273 392
    0 CLI          - Rd ping
    0 CLI          - Wr 200 19 PONG 1635281004 1.0
    0 CLI          - Rd ping
    0 CLI          - Wr 200 19 PONG 1635281007 1.0
    0 CLI          - Rd ping
    0 CLI          - Wr 200 19 PONG 1635281010 1.0
    0 CLI          - Rd ping
    0 CLI          - Wr 200 19 PONG 1635281013 1.0

I captured the Varnish log above while the client was receiving this response:

Error 503 Service Unavailable
Service Unavailable

Guru Meditation:
XID: 894208400

At first I thought Apache was not running, because when I stop Varnish I get a 502 Bad Gateway error from nginx. So I checked the Apache error log:

[Tue Oct 26 14:53:47 2021] [notice] SELinux policy enabled; httpd running as context unconfined_u:system_r:httpd_t:s0
[Tue Oct 26 14:53:47 2021] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Tue Oct 26 14:53:47 2021] [notice] Digest: generating secret for digest authentication ...
[Tue Oct 26 14:53:47 2021] [notice] Digest: done
[Tue Oct 26 14:53:47 2021] [notice] FastCGI: process manager initialized (pid 23090)
[Tue Oct 26 14:53:47 2021] [notice] Apache/2.2.15 (Unix) DAV/2 mod_fastcgi/2.4.6 configured -- resuming normal operations
[Tue Oct 26 14:53:52 2021] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/
[Tue Oct 26 14:53:52 2021] [error] [client 127.0.0.1] File does not exist: /var/www/html/favicon.ico, referer: http://staging03.hgreg.com/
[Tue Oct 26 15:01:21 2021] [error] [client 127.0.0.1] Directory index forbidden by Options directive: /var/www/html/
[Tue Oct 26 15:01:42 2021] [notice] caught SIGTERM, shutting down
[Tue Oct 26 15:01:42 2021] [notice] SELinux policy enabled; httpd running as context unconfined_u:system_r:httpd_t:s0
[Tue Oct 26 15:01:42 2021] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Tue Oct 26 15:01:42 2021] [notice] Digest: generating secret for digest authentication ...
[Tue Oct 26 15:01:42 2021] [notice] Digest: done
[Tue Oct 26 15:01:42 2021] [notice] FastCGI: process manager initialized (pid 23299)
[Tue Oct 26 15:01:42 2021] [notice] Apache/2.2.15 (Unix) DAV/2 mod_fastcgi/2.4.6 configured -- resuming normal operations
[Tue Oct 26 15:11:56 2021] [notice] caught SIGTERM, shutting down

The log ends with "caught SIGTERM, shutting down", so I restarted Apache. I still get the same error, and nothing new appears in error_log.

[centos@ip-172-35-25-65 ~]$ sudo service httpd restart
Stopping httpd:                                            [  OK  ]
Starting httpd:                                            [  OK  ]
[centos@ip-172-35-25-65 ~]$ date
Tue Oct 26 17:12:32 EDT 2021
[centos@ip-172-35-25-65 ~]$ 

I ran a Puppet configuration that did not run to completion, but I ended up with the same files. So I am wondering what the issue might be. Here is one of the Apache config files; it is loaded because every file ending in .conf is included:

<VirtualHost *>
    ServerName preprod.staging03.cherry.com

    
    
    ServerAlias betacherry.staging03.cherry.com staging03.cherry.com
    
    

    DocumentRoot /home/staging03/version/preprod.staging03.cherry.com
    ServerAdmin [email protected]

    SetEnv environment preprod
    SetEnv project staging03

    UseCanonicalName Off
    #CustomLog /var/log/httpd/preprod.staging03.cherry.com_log combined
    #CustomLog /var/log/httpd/preprod.staging03.cherry.com-bytes_log "%{%s}t %I .\n%{%s}t %O ."

    ## User cherry # Needed for Cpanel::ApacheConf
    UserDir disabled
    UserDir enabled staging03
    
      #<IfModule mod_suphp.c>
    #    suPHP_UserGroup staging03 staging03
    #</IfModule>
    
    SuexecUserGroup staging03 staging03
    
    <directory "/home/staging03/version">
        AddHandler php5-fcgi .php
        Action php5-fcgi /php5-fcgi-staging03
        AllowOverride All

        
        AuthType Basic
        AuthName "staging03-preprod"
        AuthUserFile "/etc/httpd/conf.d/htpasswd.staging03"
        require valid-user

        satisfy any
        deny from all

        Order deny,allow
        SetEnvIf X-Hg-Internal-IP 1 HgInternalIP=1
        Allow from env=HgInternalIP

        SetEnvIf User-Agent "Amazon CloudFront" AmazonCloudFront
        Allow from env=AmazonCloudFront

        SetEnvIf User-Agent "^(.*)Lighthouse(.*)$" Lighthouse=1
        Allow from env=Lighthouse
        
    </directory>
    <IfModule concurrent_php.c>
        php5_admin_value open_basedir "/home/staging03:/usr/lib/php:/usr/local/lib/php:/tmp"
    </IfModule>
    <IfModule !concurrent_php.c>
        <IfModule mod_php5.c>
            php_admin_value open_basedir "/home/staging03:/usr/lib/php:/usr/local/lib/php:/tmp"
        </IfModule>
        <IfModule sapi_apache2.c>
            php_admin_value open_basedir "/home/staging03:/usr/lib/php:/usr/php4/lib/php:/usr/local/lib/php:/usr/local/php4/lib/php:/tmp"
        </IfModule>
    </IfModule>
    <IfModule !mod_disable_suexec.c>
        <IfModule !mod_ruid2.c>
            SuexecUserGroup staging03 staging03
        </IfModule>
    </IfModule>
    <IfModule mod_ruid2.c>
        RMode config
        RUidGid staging03 staging03
    </IfModule>
    <IfModule itk.c>
        # For more information on MPM ITK, please read:
        #   http://mpm-itk.sesse.net/
        AssignUserID staging03 staging03
    </IfModule>
</VirtualHost>

So which files should I look at, and how do I check whether Apache is the problem? We have nginx routing to Varnish, which then routes to Apache, so I suspect Apache. But the log gives me no useful information and Apache runs without any issue; it just isn't serving the page, and Varnish can't reach Apache for some reason.

I am running CentOS 6, and I have another server with the same configuration that runs well. When I diff the etc folders, I don't see any significant difference.


Solution 1:

Based on your logs, I can see that both Varnish and Apache are running on the same machine. Varnish should run on port 80 and Apache on port 8080.

Apparently there is also an Nginx running, so I'm assuming that's for TLS termination, running on port 443.
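One possible cause of "no backend connection" is a mismatch between the port Apache listens on and the port in the VCL backend definition. As a rough sketch, the two helper functions below (hypothetical names, not part of Apache or Varnish) print the active Listen directives and the backend .port so you can compare them. The paths in the usage note are the usual CentOS defaults and are an assumption about your setup.

```shell
# Print every active (non-commented) Listen directive found in *.conf files
# under the given Apache configuration directory.
apache_listen_ports() {
    local conf_dir="$1"
    grep -rhE --include='*.conf' '^[[:space:]]*Listen[[:space:]]+' "$conf_dir"
}

# Print the value of every .port setting in the given VCL file,
# for example 8080 for:  .port = "8080";
vcl_backend_port() {
    local vcl_file="$1"
    sed -n 's/^[[:space:]]*\.port[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$vcl_file"
}
```

Typical usage would be apache_listen_ports /etc/httpd followed by vcl_backend_port /etc/varnish/default.vcl. If the two disagree, Varnish is trying to connect to a port where nothing is listening.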

Step 1: ensure Apache is successfully listening on port 8080

Run sudo netstat -plnt to figure out which ports are used by each service.

Ensure that the httpd service is listening on port 8080 and verify this by running curl -I localhost:8080.
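If you want a check that doesn't depend on netstat or curl, here is a minimal sketch using bash's built-in /dev/tcp pseudo-device. The function names port_open and report_port are made up for this example, and it assumes bash was built with /dev/tcp support, which is normally the case on CentOS.

```shell
# Return 0 if a TCP connection to host:port succeeds, non-zero otherwise.
# Relies on bash's /dev/tcp redirection, so no extra tools are required.
# Note: a firewalled (dropped) port may make this wait for the TCP timeout.
port_open() {
    local host="$1" port="$2"
    # The subshell opens the connection on fd 3; the fd is closed when it exits.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Print a one-line verdict for host:port.
report_port() {
    local host="$1" port="$2"
    if port_open "$host" "$port"; then
        echo "${host}:${port} open"
    else
        echo "${host}:${port} closed"
    fi
}
```

Running report_port 127.0.0.1 8080 tells you whether anything accepts connections on the port Varnish is expected to use. Since your virtual host matches on ServerName, it may also be worth sending the real host name, for example curl -I -H 'Host: staging03.cherry.com' http://127.0.0.1:8080/.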

Step 2: add a health probe for the backend in your VCL file

The standard VCL doesn't offer a health probe for your default backend. Using the VCL code below, you can constantly monitor the backend health:

backend default {
    .host = "127.0.0.1";
    .port = "8080";
    .probe = {
        .url = "/";
        .timeout = 2s;
        .interval = 5s;
        .window = 10;
        .threshold = 5;
    }
}

Once the probe is added and the new VCL is loaded, you can call the following command to check the health of the backend, based on the probe:

varnishlog -g raw -i backend_health

If the output contains Still sick, you know that the backend is not available, and the status code in that log line may tell you why.
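To get a quick tally instead of reading the raw lines, something like the sketch below could help. probe_summary is a hypothetical helper that reads log lines on standard input and counts the probe verdicts; it assumes the Backend_health records contain the phrases Still sick, Went sick, Still healthy, or Back healthy.

```shell
# Count healthy and sick probe results in Backend_health log lines read
# from standard input, and print a single summary line.
probe_summary() {
    awk '
        /Still sick|Went sick/       { sick++ }
        /Still healthy|Back healthy/ { healthy++ }
        END { printf "healthy=%d sick=%d\n", healthy, sick }
    '
}
```

On Varnish 4.1 or later, where -d processes the existing log entries and then exits, you could feed it with varnishlog -d -g raw -i Backend_health | probe_summary. A result where sick is non-zero and healthy is zero points at the backend rather than at Varnish.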

Step 3: upgrade your Varnish server

I couldn't help but notice terms like RxHeader in your VSL output. This is a clear hint that you're using an ancient version of Varnish that is no longer supported.

The RxHeader tag was replaced with ReqHeader back in Varnish 4.0, which is itself an old release by now.

My advice: upgrade to Varnish 6.0 LTS. This LTS version of Varnish comes with frequent bug fixes and security patches. See https://www.varnish-software.com/developers/tutorials/installing-varnish-centos/ to learn how to install this version on CentOS.