Website unreachable / PR_END_OF_FILE_ERROR / curl connects to localhost instead

For a few weeks now I have had a very strange issue, and I'm stuck on how to debug it further, especially as I'm more of a Linux guy sitting in front of macOS (10.15.4).

From time to time I get an error in Firefox/Chrome when requesting a website: PR_END_OF_FILE_ERROR (Firefox) or ERR_CONNECTION_CLOSED (Chrome). Waiting a couple of minutes resolves the issue (I tried really doing nothing, just stepping away for 5 minutes). Sometimes it then works for the next 3 hours without a problem, sometimes the issue comes back after 5 minutes.

So it's very inconsistent, and I haven't yet found out what triggers the issue.

What I found out so far:

  • it does not seem to be the antivirus, Sophos (9.9.8): I disabled it and still had the issue
  • it does not seem to be the VPN: I get the issue both when I'm on the company VPN (OpenVPN via Tunnelblick, or IPSec) and when I'm not connected to it
  • I get it on both WiFi and Ethernet
  • I get it at home and at work
  • sometimes some sites work while others don't; MS Teams, for example, seems to work all the time (the issue also happens when MS Teams is not running)

The last clue I have is that it looks like some kind of hidden proxy, judging by the curl output I get while the issue is occurring.

1st try:

Tue May 19 08:31:12,xxx@MBP-133-xxx-2 ~ $ curl stackoverflow.com -v > /dev/null
Trying 151.101.129.69...
* TCP_NODELAY set
* Connected to stackoverflow.com (127.0.0.1) port 80 (#0)
> GET / HTTP/1.1
> Host: stackoverflow.com
> User-Agent: curl/7.64.1
> Accept: */*
>
* Empty reply from server
* Connection #0 to host stackoverflow.com left intact
curl: (52) Empty reply from server
* Closing connection 0

2nd try (just 1 second later):

Tue May 19 08:31:13,xxx@MBP-133-xxx-2 ~ $ curl stackoverflow.com -v > /dev/null
Trying 151.101.129.69...
* TCP_NODELAY set
* Connected to stackoverflow.com (151.101.129.69) port 80 (#0)
> GET / HTTP/1.1
> Host: stackoverflow.com
> User-Agent: curl/7.64.1
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< cache-control: no-cache, no-store, must-revalidate
< location: https://stackoverflow.com/
< server: Microsoft-IIS/10.0
< x-flags: AA
< x-aspnet-duration-ms: 0
< x-request-guid: 450a82e6-54bc-483f-8825-a778ac09d170
< x-is-crawler: 1
< x-providence-cookie: 0fa36a94-4864-6d95-d737-cc8ab7e2a285
< Transfer-Encoding: chunked
< Accept-Ranges: bytes
< Date: Tue, 19 May 2020 06:31:14 GMT
< Via: 1.1 varnish
< Connection: keep-alive
< X-Served-By: cache-fra19134-FRA
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1589869874.022703,VS0,VE93
< Vary: Fastly-SSL
< X-DNS-Prefetch-Control: off
< Set-Cookie: prov=0fa36a94-4864-6d95-d737-cc8ab7e2a285; domain=.stackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
<
{ [5 bytes data]
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host stackoverflow.com left intact
* Closing connection 0

If I keep repeating this, it jumps more or less randomly between the two states. Sometimes it fails more often, sometimes it succeeds more often.

But the most striking thing about the failing connection is that curl prints Trying 151.101.129.69... and then Connected to stackoverflow.com (127.0.0.1).
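To put numbers on this flip-flopping, the two curl calls could be turned into a small probe loop. This is only a sketch: classify_peer and probe are names made up for illustration, and it assumes that curl's %{remote_ip} write-out variable reports the same peer address as the "Connected to" line in the verbose output.

```shell
#!/bin/sh
# Classify the peer address curl reports: a loopback address means something
# local accepted the connection, anything else is treated as a real remote peer.
classify_peer() {
  case "$1" in
    127.*|::1) echo "loopback" ;;
    "")        echo "no-connection" ;;
    *)         echo "remote" ;;
  esac
}

# Probe a URL repeatedly (default 20 times, one second apart) and print one
# line per attempt: time, peer IP as seen by curl, and the verdict.
probe() {
  url=$1
  count=${2:-20}
  i=0
  while [ "$i" -lt "$count" ]; do
    # -w '%{remote_ip}' prints the address curl actually connected to,
    # even when the request itself fails with "Empty reply from server".
    ip=$(curl -s -o /dev/null --max-time 5 -w '%{remote_ip}' "$url")
    printf '%s %s %s\n' "$(date '+%H:%M:%S')" "${ip:-none}" "$(classify_peer "$ip")"
    i=$((i + 1))
    sleep 1
  done
}
```

Calling probe stackoverflow.com 30 while the browser shows the error, and again while it works, would show how often the connection ends up on loopback in each phase.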

I do not have a proxy set up, and /etc/hosts contains only development entries, definitely not stackoverflow.com. Between the two curl calls I didn't do anything (it really was just two calls in a row).
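For anyone who wants to double-check the same two things on their own machine, a rough sketch follows. hosts_entries_for is a hypothetical helper name; it only looks for exact hostname matches and ignores commented-out lines.

```shell
#!/bin/sh
# Print every line of a hosts file (default /etc/hosts) that maps the given
# hostname. Comments are stripped first, so commented-out entries don't count.
hosts_entries_for() {
  host=$1
  file=${2:-/etc/hosts}
  awk -v h="$host" '
    { sub(/#.*/, "") }   # drop everything after a "#"
    { for (i = 2; i <= NF; i++) if ($i == h) { print; break } }
  ' "$file"
}
```

hosts_entries_for stackoverflow.com should print nothing if the hosts file is not involved. For the proxy side, env | grep -i _proxy shows proxy variables curl would pick up from the environment, and on macOS scutil --proxy prints the system-wide proxy configuration.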

I'm open to any suggestions on how to debug this further.

edit-00:

As requested, the output of curl --trace-ascii trace.log stackoverflow.com

Failure:

== Info:   Trying 151.101.193.69...
== Info: TCP_NODELAY set
== Info: Connected to stackoverflow.com (127.0.0.1) port 80 (#0)
=> Send header, 81 bytes (0x51)
0000: GET / HTTP/1.1
0010: Host: stackoverflow.com
0029: User-Agent: curl/7.64.1
0042: Accept: */*
004f:
== Info: Empty reply from server
== Info: Connection #0 to host stackoverflow.com left intact

Working:

== Info:   Trying 151.101.129.69...
== Info: TCP_NODELAY set
== Info: Connected to stackoverflow.com (151.101.129.69) port 80 (#0)
=> Send header, 81 bytes (0x51)
0000: GET / HTTP/1.1
0010: Host: stackoverflow.com
0029: User-Agent: curl/7.64.1
0042: Accept: */*
004f:
<= Recv header, 32 bytes (0x20)
0000: HTTP/1.1 301 Moved Permanently
<= Recv header, 52 bytes (0x34)
0000: cache-control: no-cache, no-store, must-revalidate
<= Recv header, 38 bytes (0x26)
0000: location: https://stackoverflow.com/
<= Recv header, 28 bytes (0x1c)
0000: server: Microsoft-IIS/10.0
<= Recv header, 13 bytes (0xd)
0000: x-flags: AA
<= Recv header, 25 bytes (0x19)
0000: x-aspnet-duration-ms: 0
<= Recv header, 54 bytes (0x36)
0000: x-request-guid: 6a97f488-95a4-4071-bcc0-f83d8209df2f
<= Recv header, 17 bytes (0x11)
0000: x-is-crawler: 1
<= Recv header, 59 bytes (0x3b)
0000: x-providence-cookie: 3108f7a8-ac68-c64f-680e-2e5dd3bcc55c
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 22 bytes (0x16)
0000: Accept-Ranges: bytes
<= Recv header, 37 bytes (0x25)
0000: Date: Tue, 19 May 2020 06:57:58 GMT
<= Recv header, 18 bytes (0x12)
0000: Via: 1.1 varnish
<= Recv header, 24 bytes (0x18)
0000: Connection: keep-alive
<= Recv header, 33 bytes (0x21)
0000: X-Served-By: cache-fra19171-FRA
<= Recv header, 15 bytes (0xf)
0000: X-Cache: MISS
<= Recv header, 17 bytes (0x11)
0000: X-Cache-Hits: 0
<= Recv header, 38 bytes (0x26)
0000: X-Timer: S1589871478.446474,VS0,VE94
<= Recv header, 18 bytes (0x12)
0000: Vary: Fastly-SSL
<= Recv header, 29 bytes (0x1d)
0000: X-DNS-Prefetch-Control: off
<= Recv header, 139 bytes (0x8b)
0000: Set-Cookie: prov=3108f7a8-ac68-c64f-680e-2e5dd3bcc55c; domain=.s
0040: tackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/;
0080:  HttpOnly
<= Recv header, 2 bytes (0x2)
0000:
<= Recv data, 5 bytes (0x5)
0000: 0
0003:
== Info: Connection #0 to host stackoverflow.com left intact

Output of scutil --dns

DNS configuration

resolver #1
  search domain[0] : company.com
  nameserver[0] : 172.27.10.42
  nameserver[1] : 172.27.10.41
  nameserver[2] : 172.27.10.43
  if_index : 10 (en14)
  flags    : Request A records
  reach    : 0x00020002 (Reachable,Directly Reachable Address)

resolver #2
  domain   : local
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300000

resolver #3
  domain   : 254.169.in-addr.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300200

resolver #4
  domain   : 8.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300400

resolver #5
  domain   : 9.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300600

resolver #6
  domain   : a.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300800

resolver #7
  domain   : b.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 301000

DNS configuration (for scoped queries)

resolver #1
  search domain[0] : company.com
  nameserver[0] : 172.27.10.42
  nameserver[1] : 172.27.10.41
  nameserver[2] : 172.27.10.43
  if_index : 10 (en14)
  flags    : Scoped, Request A records
  reach    : 0x00020002 (Reachable,Directly Reachable Address)

Content of /etc/resolv.conf

search company.com
nameserver 172.27.10.42
nameserver 172.27.10.41
nameserver 172.27.10.43

My /etc/hosts is long and has a lot of entries because of local development. But it boils down to this:

255.255.255.255 broadcasthost
::1             localhost
fe80::1%lo0     localhost
127.0.0.1       localhost

# a lot of entries like
127.0.0.1       abc.nauts.eu

127.0.0.1   localhost drupal8.local gitlab pharmadbv2.local webgrind.local
127.0.0.1   xhprof.nauts.eu
255.255.255.255 broadcasthost
::1             localhost

127.0.0.1   kubernetes.docker.internal

edit-01: It is an HTTP/HTTPS issue only. For example, when the issue occurs for an internal server, I have no problem connecting to that same server via ssh.

edit-02:

curl --noproxy company.com --trace-ascii trace2.log jenkins.company.com

== Info:   Trying 172.27.10.63...
== Info: TCP_NODELAY set
== Info: Connected to jenkins.company.com (127.0.0.1) port 80 (#0)
=> Send header, 84 bytes (0x54)
0000: GET / HTTP/1.1
0010: Host: jenkins.company.com
002c: User-Agent: curl/7.64.1
0045: Accept: */*
0052:
== Info: Empty reply from server
== Info: Connection #0 to host jenkins.company.com left intact
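Since even with --noproxy the connection ends up on 127.0.0.1, a possible next step is to look at which local processes are listening for TCP connections. The sketch below filters the output of lsof -nP -iTCP -sTCP:LISTEN for a given port. listeners_on_port is a made-up helper name, and it assumes lsof's usual layout where the second-to-last column is address:port. An interceptor may also redirect traffic to a different local port, so an empty result for port 80 does not rule one out.

```shell
#!/bin/sh
# Read `lsof -nP -iTCP -sTCP:LISTEN` output on stdin and print
# "command pid address" for every socket listening on the given port.
listeners_on_port() {
  port=$1
  awk -v p="$port" '
    NR > 1 {                          # skip the lsof header line
      addr = $(NF - 1)                # e.g. 127.0.0.1:80 or *:80
      n = split(addr, parts, ":")     # the port is the last ":"-separated part
      if (parts[n] == p) print $1, $2, addr
    }
  '
}
```

Usage would be sudo lsof -nP -iTCP -sTCP:LISTEN | listeners_on_port 80, and the same for 443. Running lsof without the filter lists all listeners, which helps spot processes on unexpected ports.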

Solution 1:

I think I finally figured it out.

Never trust an AV. If you set it to off, that doesn't mean it's really disabled...

I uninstalled it and the issue suddenly disappeared. That kind of makes sense: Sophos inspects the traffic to scan for malware/viruses, which is why it proxies the traffic through itself. Why it still does this when it is disabled is another question.

It seems that at some point something got seriously broken in my Sophos installation, which is why this issue appeared. Maybe it was the update to Catalina, or a Sophos update.

I have now installed Sophos again (company policy) and the issue is still gone. So the fix was reinstalling the AV...

If somebody has a similar issue, try to UNINSTALL any AV scanner. Do not trust the disabled state!
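To check whether a scanner that is switched off still has components loaded, something like the following sketch may help. I did not run this at the time. third_party_kexts is a hypothetical helper, and it assumes the kextstat column layout "Index Refs Address Size Wired Name (Version) ...". I have not verified which components Sophos actually installs.

```shell
#!/bin/sh
# Read `kextstat` output on stdin and print name and version of every loaded
# kernel extension whose bundle id does not start with com.apple.
third_party_kexts() {
  awk 'NR > 1 && $6 !~ /^com\.apple\./ { print $6, $7 }'
}
```

On macOS 10.15 this would be used as kextstat | third_party_kexts. In addition, ps -axo pid,comm | grep -i sophos shows processes that are still running, and systemextensionsctl list shows installed system extensions.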