What is the difference between "CONNECT" and "GET HTTPS"?
Before getting to the real question, let me explain how my project works: for the sake of simplicity, my proxy runs on my laptop, where the client (my browser) also runs; the remote server will be, for example, YouTube.
The client is connected to a specific port of the proxy through the SwitchOmega plugin. When the client wants to connect to www.youtube.com, the proxy gets the following request:
CONNECT www.youtube.com:443 HTTP/1.1
Host: www.youtube.com:443
Proxy-Connection: keep-alive
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.87 Safari/537.36
I was told that when a proxy gets a CONNECT request, it should open a TCP connection to IP:Port, return a 200 OK message to the client, and relay data in both directions until one side of the connection is closed.
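As a rough illustration of that description, here is a minimal Python sketch of the proxy side. The names parse_connect_target, relay and handle_connect are made up for this example. It assumes an already-accepted client socket and a plain hostname:port target (no IPv6 literals), and it skips error handling, so treat it as a sketch rather than a complete proxy.

```python
import select
import socket


def parse_connect_target(request_line):
    """Split 'CONNECT host:port HTTP/1.1' into a (host, port) tuple."""
    method, target, _version = request_line.split()
    if method.upper() != "CONNECT":
        raise ValueError(f"not a CONNECT request: {method}")
    host, _, port = target.rpartition(":")
    return host, int(port)


def relay(a, b):
    """Copy bytes in both directions until either side closes."""
    peers = {a: b, b: a}
    while True:
        readable, _, _ = select.select([a, b], [], [])
        for sock in readable:
            data = sock.recv(65536)
            if not data:  # one side closed: stop tunnelling
                return
            peers[sock].sendall(data)


def handle_connect(client):
    """Serve one CONNECT request on an already-accepted client socket."""
    # Read until the blank line that ends the request headers.
    head = b""
    while b"\r\n\r\n" not in head:
        chunk = client.recv(4096)
        if not chunk:
            return
        head += chunk
    request_line = head.split(b"\r\n", 1)[0].decode("latin-1")
    host, port = parse_connect_target(request_line)
    # Open the TCP connection first, then tell the client the tunnel is up.
    with socket.create_connection((host, port)) as upstream:
        client.sendall(b"HTTP/1.1 200 Connection established\r\n\r\n")
        relay(client, upstream)
```

After the 200 response the proxy no longer interprets anything: the TLS handshake and every later request pass through relay as opaque bytes.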
With another plugin that tracks HTTP requests, HTTP Trace, I see a different request in my browser:
GET https://www.youtube.com/
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
and other data...
So why does my proxy receive CONNECT www.youtube.com:443 HTTP/1.1 while HTTP Trace shows GET https://www.youtube.com/? Do they mean the same thing?
CONNECT sets up the connection (the tunnel) that the later requests travel through.
CONNECT
The CONNECT method converts the request connection to a transparent TCP/IP tunnel, usually to facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy.
GET, by contrast, retrieves the data.
GET
The GET method requests a representation of the specified resource. Requests using GET should only retrieve data and should have no other effect. (This is also true of some other HTTP methods.) The W3C has published guidance principles on this distinction, saying, "Web application design should be informed by the above principles, but also by the relevant limitations."
Source - Hypertext Transfer Protocol
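To see both methods from the client side, Python's standard http.client can be pointed at a proxy with set_tunnel; it then issues CONNECT to the proxy before sending the actual request through the tunnel. A small sketch, where make_tunnelled_connection is a made-up helper and 127.0.0.1:8080 is only an assumed placeholder for wherever your proxy listens:

```python
import http.client


def make_tunnelled_connection(proxy_host, proxy_port, target_host, target_port=443):
    """Prepare an HTTPS connection that goes through an HTTP proxy.

    Nothing is sent here; the CONNECT is issued when the connection
    is first used (for example by conn.request).
    """
    # The TCP socket will be opened to the proxy...
    conn = http.client.HTTPSConnection(proxy_host, proxy_port)
    # ...and the proxy will be asked to CONNECT to the real target.
    conn.set_tunnel(target_host, target_port)
    return conn


# Hypothetical usage (needs a proxy actually listening on that address):
#   conn = make_tunnelled_connection("127.0.0.1", 8080, "www.youtube.com")
#   conn.request("GET", "/")
#   print(conn.getresponse().status)
```

With a setup like this the proxy only ever sees the CONNECT line and its headers; the GET is sent inside the TLS session and is not readable by the proxy.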
I think you are dealing with a cosmetic issue.
GET https://www.youtube.com/ is most likely just what is logged to indicate that the fetching is done with GET, and that the target is https://www.youtube.com.
There is no standardised way for a proxy to support GET https:// URIs. It was mooted a couple of years back at the IETF HTTP WG but discarded for various reasons (mainly trust issues with proxies, if I recall).
GET https://www.youtube.com/ is very unlikely to be the request actually sent to the proxy. As others have said, CONNECT is used to connect to www.youtube.com:443, and then a separate GET request is sent through the tunnel; that request does not contain the scheme (protocol) or authority (server:port etc.) parts of the URI.
In your example it would be:
GET / HTTP/1.1
host: www.youtube.com:443
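To make the relationship concrete, here is a small Python sketch that takes the URL as HTTP Trace displays it and derives the two messages that would actually go on the wire. The function name split_https_request is made up for illustration, and the headers are cut down to Host only.

```python
from urllib.parse import urlsplit


def split_https_request(url):
    """Return (connect_message, tunnelled_get) for an https:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("expected an https:// URL")
    host = parts.hostname
    port = parts.port or 443  # default HTTPS port when none is given
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    # What the proxy receives in clear text: authority only, no path.
    connect = (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n\r\n"
    )
    # What travels inside the tunnel: path only, no scheme or authority
    # in the request line. The Host value mirrors the example above;
    # browsers typically leave out the default port.
    inner = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n\r\n"
    )
    return connect, inner


connect, inner = split_https_request("https://www.youtube.com/")
print(connect)
print(inner)
```

The logged "GET https://www.youtube.com/" is effectively these two messages combined into one display line: the scheme and authority end up in the CONNECT, the path ends up in the GET.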