
Are HTTPS URLs encrypted?

Are all URLs encrypted when using TLS/SSL (HTTPS) encryption? I would like to know because I want all URL data to be hidden when using TLS/SSL (HTTPS).

If TLS/SSL gives you total URL encryption then I don't have to worry about hiding confidential information from URLs.

It's probably a bad idea to put confidential data in the URL anyway. It will be displayed in the browser's address bar too, remember? People don't like it if their password is visible to anyone who happens to glance at the screen. Why do you think you need to put confidential data in the URL?
URLs are also stored in browser history and server logs - if I wanted to have my name and password stored somewhere, it would not be in these two places.
For example, suppose I visit https://somewhere_i_trust/ways_to_protest_against_the_government/. Then the URL contains confidential data, namely the suggestion that I am considering protesting against my government.
I was asking myself this question when making an HTTP request from a native (not browser based) App. I'm guessing this may interest mobile App developers. In this case, the comments above (while true) are irrelevant (no url visible, no browsing history), making the answer, to my understanding a simple: "Yes, it's encrypted".
For those who think once you are HTTPS no one knows where you're going, read this first: The hostname of the server (e.g. example.com) will still be leaked due to SNI. This has absolutely nothing to do with DNS and the leak will occur even if you don't use DNS or use encrypted DNS.

matteo

Yes, the SSL connection is between the TCP layer and the HTTP layer. The client and server first establish a secure encrypted TCP connection (via the SSL/TLS protocol) and then the client will send the HTTP request (GET, POST, DELETE...) over that encrypted TCP connection.

Note however (as also noted in the comments) that the domain name part of the URL is sent in clear text during the first part of the TLS negotiation. So, the domain name of the server can be sniffed. But not the rest of the URL.
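
To make the split concrete, here is a minimal Python sketch that separates a URL into the part an on-path observer can usually see (the hostname, via SNI, plus the port from the TCP header) and the part that travels only inside the encrypted channel. The function name `wire_visibility` and the example URL are illustrative assumptions, not part of any library.

```python
from urllib.parse import urlsplit

def wire_visibility(url: str) -> dict:
    """Split an https URL into what a passive observer can usually see
    and what is sent only inside the encrypted TLS channel."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("this sketch only covers https URLs")
    encrypted = parts.path or "/"
    if parts.query:
        encrypted += "?" + parts.query
    return {
        # Hostname leaks through SNI in the ClientHello; the port is in the TCP header.
        "visible": f"{parts.hostname}:{parts.port or 443}",
        # Path and query string belong to the HTTP request, sent after the handshake.
        "encrypted": encrypted,
    }

print(wire_visibility("https://example.com/path/?some=parameters&go=here"))
```

This only models the transport layer; it says nothing about browser history, logs, or Referer headers.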


It is still worth noting the thing mentioned by @Jalf in the comment on the question itself. URL data will also be saved in the browser's history, which may be insecure long-term.
Not just GET or POST. Can also be DELETE, PUT, HEAD, or TRACE.
Yes, it could be a security issue for a browser's history. But in my case I'm not using a browser (also, the original post did not mention a browser). I'm using a custom HTTPS call behind the scenes in a native app. It's a simple solution to making sure your app's server connection is secure.
Note however that the DNS resolve of the URL is probably not encrypted. So someone sniffing your traffic could still probably see the domain you're trying to access.
SNI breaks the 'host' part of SSL encryption of URLs. You can test this yourself with wireshark. There is a selector for SNI, or you can just review your SSL packets when you connect to remote host.
Community

Since nobody provided a wire capture, here's one.
Server Name (the domain part of the URL) is presented in the ClientHello packet, in plain text.

The following shows a browser request to:
https://i.stack.imgur.com/path/?some=parameters&go=here

https://i.stack.imgur.com/rdHZZ.png

From https://www.ietf.org/rfc/rfc3546.txt:

3.1. Server Name Indication [TLS] does not provide a mechanism for a client to tell a server the name of the server it is contacting. It may be desirable for clients to provide this information to facilitate secure connections to servers that host multiple 'virtual' servers at a single underlying network address.

In order to provide the server name, clients MAY include an extension of type "server_name" in the (extended) client hello.

In short:

FQDN (the domain part of the URL) MAY be transmitted in clear inside the ClientHello packet if SNI extension is used

The rest of the URL (/path/?some=parameters&go=here) has no business being inside ClientHello since the request URL is a HTTP thing (OSI Layer 7), therefore it will never show up in a TLS handshake (Layer 4 or 5). That will come later on in a GET /path/?some=parameters&go=here HTTP/1.1 HTTP request, AFTER the secure TLS channel is established.
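
If you don't have Wireshark at hand, the same observation can be reproduced offline. The sketch below uses Python's standard `ssl` module with in-memory BIOs to produce the bytes a client would send first (the ClientHello) without opening any network connection. The helper name `client_hello_bytes` is made up for this example, and it assumes a Python/OpenSSL build that sends a plain-text SNI extension, which is the common default.

```python
import ssl

def client_hello_bytes(hostname: str) -> bytes:
    """Return the first TLS flight (the ClientHello) for `hostname`,
    generated in memory without opening a network connection."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()   # bytes from the server (none in this sketch)
    outgoing = ssl.MemoryBIO()   # bytes the client would put on the wire
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the client has written its ClientHello and now waits for the server.
        pass
    return outgoing.read()

hello = client_hello_bytes("example.com")
# The hostname shows up as plain bytes inside the SNI extension.
print(b"example.com" in hello)
# The path is never handed to the TLS layer, so it cannot be in the handshake.
print(b"/path/" in hello)
```

Note that the TLS layer is only ever given the hostname; the path and query string exist solely in the HTTP request that is written after the handshake completes.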

EXECUTIVE SUMMARY

Domain name MAY be transmitted in clear (if SNI extension is used in the TLS handshake) but URL (path and parameters) is always encrypted.

MARCH 2019 UPDATE

Thank you carlin.scott for bringing this one up.

The payload in the SNI extension can now be encrypted via this draft RFC proposal. This capability only exists in TLS 1.3 (as an option and it's up to both ends to implement it) and there is no backwards compatibility with TLS 1.2 and below.

CloudFlare is doing it and you can read more about the internals here — If the chicken must come before the egg, where do you put the chicken?

In practice this means that instead of transmitting the FQDN in plain text (like the Wireshark capture shows), it is now encrypted.

NOTE: This addresses the privacy aspect more than the security one since a reverse DNS lookup MAY reveal the intended destination host anyway.

SEPTEMBER 2020 UPDATE

There's now a draft RFC for encrypting the entire Client Hello message, not just the SNI part: https://datatracker.ietf.org/doc/draft-ietf-tls-esni/?include_text=1

At the time of writing this browser support is VERY limited.


Perfect answer, with complete explanation from A to Z. I love the Executive summary. Made my day @evilSnobu
Perfect answer, upvote! Still consider the client part since the browser history may be a leak. However, regarding the transport layer, URL-parameters are encrypted.
You may want to update this answer with the fact that TLS 1.3 encrypts the SNI extension, and the biggest CDN is doing just that: blog.cloudflare.com/encrypted-sni Of course a packet sniffer could just do a reverse-dns lookup for the IP addresses you're connecting to.
@evilSnobu, but username:password part of username:password@domain.com is encrypted, right? So it's secure to pass sensitive data in url using https.
They are encrypted on the wire (in transport) but if either end (user or server) logs the URL to a plain text file and does not sanitize credentials... now that's a different conversation.
Community

As the other answers have already pointed out, https "URLs" are indeed encrypted. However, your DNS request/response when resolving the domain name is probably not, and of course, if you were using a browser, your URLs might be recorded too.


And URL recording is important since there are Javascript hacks that allow a completely unrelated site to test whether a given URL is in your history or not. You can make a URL unguessable by including a longish random string in it, but if it's a public URL then the attacker can tell that it has been visited, and if it has a short secret in it, then an attacker could brute-force that at reasonable speed.
@SteveJessop, please provide a link to "Javascript hacks that allow a completely unrelated site to test whether a given URL is in your history or not"
@Pacerier: hacks date of course, but what I was talking about at the time was things like stackoverflow.com/questions/2394890/…. It was a big deal back in 2010 that these issues were being investigated and the attacks refined, but I'm not really following it at the moment.
@Pacerier: more examples: webdevwonders.com/…, webdevwonders.com/…
You can use OpenDNS with its encrypted DNS service. I use it on my Mac, but I found the Windows version not working properly. That was a while ago though, so it might work OK now. For Linux nothing yet. opendns.com/about/innovations/dnscrypt
DanMan

Entire request and response is encrypted, including URL.

Note that when you use an HTTP proxy, it knows the address (domain) of the target server, but doesn't know the requested path on this server (i.e. request and response are always encrypted).
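
As a rough illustration of why the proxy learns only the domain: for HTTPS, a client normally asks the proxy to open a tunnel with a `CONNECT` request that names just host and port, and the TLS session then runs through that tunnel. The sketch below builds such a request from a URL. The function name and the example URL are assumptions for this example.

```python
from urllib.parse import urlsplit

def proxy_connect_request(url: str) -> str:
    """Build the plain-text CONNECT request a client would send to an
    HTTP proxy before starting TLS with the target server."""
    parts = urlsplit(url)
    authority = f"{parts.hostname}:{parts.port or 443}"
    # Only host and port appear here; path and query are sent later, inside TLS.
    return f"CONNECT {authority} HTTP/1.1\r\nHost: {authority}\r\n\r\n"

print(proxy_connect_request("https://www.example.com/secret/path?token=abc"))
```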


Mihai Maruseac

I agree with the previous answers:

To be explicit:

With TLS, the first part of the URL (https://www.example.com/) is still visible as it builds the connection. The second part (/herearemygetparameters/1/2/3/4) is protected by TLS.

However there are a number of reasons why you should not put parameters in the GET request.

First, as already mentioned by others:
- leakage through browser address bar
- leakage through history

In addition to that you have leakage of the URL through the HTTP Referer: the user sees site A on TLS, then clicks a link to site B. If both sites are on TLS, the request to site B will contain the full URL from site A in the Referer header of the request. An admin of site B can then retrieve it from the log files of server B.
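
How much of the URL ends up in the Referer header depends on the referrer policy in effect. The following is a simplified sketch of three policy values from the W3C Referrer Policy specification (`no-referrer`, `origin`, `unsafe-url`); it does not model the other values or any browser's exact behaviour, and the function name and example URL are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

def referer_value(page_url, policy):
    """Approximate the Referer header a browser would send when the user
    follows a link from `page_url` under the given Referrer-Policy."""
    parts = urlsplit(page_url)
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"
    if policy == "no-referrer":
        return None                      # header is omitted entirely
    if policy == "origin":
        return f"{parts.scheme}://{host}/"   # scheme and host only
    if policy == "unsafe-url":
        # Full URL including the query string (fragment and userinfo dropped).
        return urlunsplit((parts.scheme, host, parts.path, parts.query, ""))
    raise ValueError(f"policy not modelled in this sketch: {policy}")

url = "https://site-a.example/account?u=username&pw=123123"
for policy in ("unsafe-url", "origin", "no-referrer"):
    print(policy, "->", referer_value(url, policy))
```

A site that cannot avoid sensitive query strings can at least send a restrictive `Referrer-Policy` response header, but keeping secrets out of the URL is the more robust fix.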


@EJP You didn't understand what Tobias is saying. He's saying that if you click a link on site A that will take you to site B, then site B will get the referrer URL. For example, if you are on siteA.com?u=username&pw=123123, then siteB.com (which is linked to on the page of siteA.com) will receive "siteA.com?u=username&pw=123123" as the referring URL, sent to siteB.com inside the HTTPS by the browser. If this is true, that's very bad. Is this true Tobias?
@EJP, the domain is visible because of SNI which all modern web browsers use. Also see this diagram from the EFF showing that anyone can see the domain of the site you are visiting. This isn't about browser visibility. It's about what is visible to eavesdroppers.
@trusktr: Browsers should not send a Referer header from HTTPS pages. This is part of the HTTP specification.
@MartinGeisler, Keyword is "should". Browsers don't much care about "should" (as opposed to "must"). From your own link: "strongly recommended that the user be able to select whether or not the Referer field is sent. For example, a browser client could have a toggle switch for browsing openly/anonymously, which would respectively enable /disable the sending of Referer and From information". Oops, which is exactly what Chrome did. Except Chrome leaks the Referrer even if you are in incognito mode.
mikemaccana

Yes and no.

The server address portion is NOT encrypted since it is used to set up the connection.

This may change in future with encrypted SNI and DNS but as of 2018 both technologies are not commonly in use.

The path, query string etc. are encrypted.

Note for GET requests the user will still be able to cut and paste the URL out of the location bar, and you will probably not want to put confidential information in there that can be seen by anyone looking at the screen.


Would like to +1 this, but I find the "yes and no" misleading - you should change that to just point out that the server name will be resolved using DNS without encryption.
In my understanding, the OP uses the word URL in the right sense. I think this answer is more misleading, as it doesn't clearly distinguish between the hostname in the URL and the hostname in the DNS resolution.
The URL is encrypted. Every aspect of the HTTP transaction is encrypted. Not just 'everything else'. Period. -1.
@EJP but the DNS lookup does use what is at one point part of the URL, so to the non-technical person, the entire URL is not encrypted. The non-technical person who's merely using Google.com to look up non-technical things does not know where the data ultimately resides or how it is handled. The domain, which is part of the URL the user is visiting, is not 100% encrypted because I as the attacker can sniff which site he is visiting. Only the /path of a URL is inherently encrypted to the layman (it doesn't matter how).
@EJP, @​trusktr, @​Lawrence, @​Guillaume. All of you are mistaken. This has nothing to do with DNS. SNI "send the name of the virtual domain as part of the TLS negotiation", so even if you don't use DNS or if your DNS is encrypted, a sniffer can still see the hostname of your requests.
Rhodri Cusack

An addition to the helpful answer from Marc Novakowski - the URL is stored in the logs on the server (e.g., in /etc/httpd/logs/ssl_access_log), so if you don't want the server to maintain the information over the longer term, don't put it in the URL.
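
If you cannot keep sensitive values out of the URL entirely, one mitigation is to redact them before anything is written to a log. This is a minimal sketch for application-level logging only (it does not change what the web server itself writes to its access log). The parameter names in `SENSITIVE_KEYS` and the function name are assumptions for this example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names; adjust to whatever your application uses.
SENSITIVE_KEYS = {"password", "pw", "token"}

def redact_url(url, sensitive=SENSITIVE_KEYS):
    """Return `url` with the values of sensitive query parameters masked,
    so the result is safer to write to a log file."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    cleaned = [(k, "REDACTED" if k.lower() in sensitive else v) for k, v in pairs]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(cleaned), parts.fragment)
    )

print(redact_url("https://example.com/login?u=alice&pw=123123"))
```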


Nicolas Guérinet

It is now 2019 and TLS v1.3 has been released. According to Cloudflare, the server name indication (SNI, aka the hostname) can be encrypted thanks to TLS v1.3. So, I told myself: great! Let's see how it looks within the TCP packets of cloudflare.com. I captured a ClientHello handshake packet sent to the Cloudflare server, using Google Chrome as the browser and Wireshark as the packet sniffer. I can still read the hostname in plain text within the ClientHello packet, as you can see below. It is not encrypted.

https://i.stack.imgur.com/BZDy4.jpg

So, beware of what you can read because this is still not an anonymous connection. A middleware application between the client and the server could log every domain that is requested by a client.

So, it looks like the encryption of the SNI requires additional implementations to work along with TLSv1.3

UPDATE June 2020: It looks like the Encrypted SNI is initiated by the browser. Cloudflare has a page for you to check if your browser supports Encrypted SNI:

https://www.cloudflare.com/ssl/encrypted-sni/

At this point, I think Google chrome does not support it. You can activate Encrypted SNI in Firefox manually. When I tried it for some reason, it didn't work instantly. I restarted Firefox twice before it worked:

Type: about:config in the URL field.

Check if network.security.esni.enabled is true. Clear your cache / restart

Go to the website I mentioned before.

As you can see VPN services are still useful today for people who want to ensure that a coffee shop owner does not log the list of websites that people visit.


"the SNI can be encrypted" - that's the key point. Checking cloudflare.com/ssl/encrypted-sni with current Google Chrome says "Your browser did not encrypt the SNI when visiting this page." It takes two to tango...
Apparently current Firefox can do ESNI, but it's disabled by default: you need to enable network.security.esni.enabled, set network.trr.mode to 2 (which currently sets your DoH resolver to CloudFlare), and restart the browser (sic!); then it will use ESNI - where supported by the domain's infrastructure. See blog.mozilla.org/security/2018/10/18/… for details.
pbhj

A third party that is monitoring traffic may also be able to determine the page visited by examining your traffic and comparing it with the traffic another user generates when visiting the site. For example, if there were only 2 pages on a site, one much larger than the other, then comparing the size of the data transfer would tell which page you visited. There are ways this could be hidden from the third party, but they're not normal server or browser behaviour. See for example this paper from SciRate, https://scirate.com/arxiv/1403.0297.

In general other answers are correct, practically though this paper shows that pages visited (ie URL) can be determined quite effectively.
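
The two-page example can be written down as a toy sketch: an observer who knows the approximate transfer size of each page simply picks the closest match for an observed encrypted transfer. This is only an illustration of the idea; the attack in the cited paper uses far richer traffic features. All names and byte counts below are invented for the example.

```python
def guess_page(observed_bytes, known_sizes):
    """Return the page whose known transfer size is closest to the
    number of bytes observed on the (encrypted) connection."""
    return min(known_sizes, key=lambda page: abs(known_sizes[page] - observed_bytes))

# Hypothetical sizes an observer measured by visiting the site themselves.
known = {"/small-page": 12_000, "/large-page": 480_000}

print(guess_page(455_300, known))
```

Encryption hides the bytes but not how many of them were sent, which is why padding or other traffic shaping is needed to defeat this kind of analysis.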


That would really only be feasible on very small sites, and in those cases, the theme/tone/nature of the site would probably still be about the same on each page.
From the citation I gave: "We present a traffic analysis attack against over 6000 webpages spanning the HTTPS deployments of 10 widely used, industry-leading websites in areas such as healthcare, finance, legal services and streaming video. Our attack identifies individual pages in the same website with 89% accuracy [...]". It seems your conclusion as to feasibility is wrong.
For anyone interested in reading more about this sort of vulnerability, these types of attacks are generally referred to as side-channel attacks.
Don Gillis

You can not always count on privacy of the full URL either. For instance, as is sometimes the case on enterprise networks, supplied devices like your company PC are configured with an extra "trusted" root certificate so that your browser can quietly trust a proxy (man-in-the-middle) inspection of https traffic. This means that the full URL is exposed for inspection. This is usually saved to a log.

Furthermore, your passwords are also exposed and probably logged and this is another reason to use one time passwords or to change your passwords frequently.

Finally, the request and response content is also exposed if not otherwise encrypted.

One example of the inspection setup is described by Checkpoint here. An old style "internet café" using supplied PC's may also be set up this way.


Community

Linking to my answer on a duplicate question. Not only is the URL available in the browser's history and the server-side logs, it is also sent as the HTTP Referer header, which, if you use third-party content, exposes the URL to sources outside your control.


Provided your third-party calls are HTTPS as well, though, this isn't an issue, right?
It'd be encrypted with the third party's certificate, so they could see the URL.
Ricardo BRGWeb

Although there are some good answers already here, most of them focus on browser navigation. I'm writing this in 2018 and probably someone wants to know about the security of mobile apps.

For mobile apps, if you control both ends of the application (server and app), as long as you use HTTPS you're secure. iOS or Android will verify the certificate and mitigate possible MitM attacks (that would be the only weak point in all this). You can send sensitive data through HTTPS connections and it will be encrypted during transport. Only your app and the server will know any parameters sent through HTTPS.

The only "maybe" here would be if client or server are infected with malicious software that can see the data before it is wrapped in https. But if someone is infected with this kind of software, they will have access to the data, no matter what you use to transport it.
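
When you control both ends, the certificate check mentioned above can be tightened further by comparing the server certificate against a fingerprint shipped with the app (often called certificate pinning). This goes beyond what the platform does by default and is shown here only as a hedged sketch: `fingerprint_matches` is a hypothetical helper, and in a real Python client the DER bytes would come from something like `SSLSocket.getpeercert(binary_form=True)`.

```python
import hashlib
import hmac

def fingerprint_matches(cert_der: bytes, pinned_sha256_hex: str) -> bool:
    """Compare the SHA-256 fingerprint of a DER-encoded certificate
    with a fingerprint that was embedded in the app at build time."""
    actual = hashlib.sha256(cert_der).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, pinned_sha256_hex.lower())

# Placeholder bytes standing in for a real certificate.
fake_cert = b"not a real certificate"
pinned = hashlib.sha256(fake_cert).hexdigest()
print(fingerprint_matches(fake_cert, pinned))
```

The trade-off is operational: a pinned fingerprint has to be updated in the app whenever the server certificate is rotated.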


pedrorijo91

While you already have very good answers, I really like the explanation on this website: https://https.cio.gov/faq/#what-information-does-https-protect

in short: using HTTPS hides:

HTTP method

query params

POST body (if present)

Request headers (cookies included)

Status code


Chris Rutledge

Additionally, if you're building a ReSTful API, browser leakage and http referer issues are mostly mitigated as the client may not be a browser and you may not have people clicking links.

If this is the case, I'd recommend an OAuth2 login to obtain a bearer token. In that case the only sensitive data would be the initial credentials, which should probably be in a POST request anyway.
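
A rough sketch of that pattern with Python's standard library: the credentials go in the POST body of the login request, and later API calls carry the token in an `Authorization` header, so neither ever appears in a URL. The endpoint URLs, the use of the password grant, and the function names are assumptions for illustration; the requests are only built here, not sent.

```python
import urllib.request
from urllib.parse import urlencode

def build_login_request(token_url, username, password):
    """Build a POST request that carries the credentials in the body."""
    body = urlencode(
        {"grant_type": "password", "username": username, "password": password}
    ).encode()
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )

def build_api_request(url, token):
    """Build an API request that carries the bearer token in a header."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})

# Hypothetical endpoints and values.
login = build_login_request("https://api.example.com/oauth/token", "alice", "secret")
api = build_api_request("https://api.example.com/v1/items", "abc123")
print(login.get_method(), login.full_url)
print(api.get_header("Authorization"))
```

Because the secrets live in the body and headers, they stay out of server access logs, browser history, and Referer headers, which typically record only the URL.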