For pages already specified (either by HTTP header or by meta tag) to have a Content-Type with a UTF-8 charset, is there any benefit to adding accept-charset="UTF-8" to HTML forms?
(I understand the accept-charset
attribute is broken in IE for ISO-8859-1, but I haven't heard of a problem with IE and UTF-8. I'm just asking if there's a benefit to adding it with UTF-8, to help prevent invalid byte sequences from being entered.)
"User agents may interpret this value as the character encoding that was used to transmit the document" - does this mean it's safer to explicitly mention it? Not sure. From my experience, I agree with what @elusive says.
If the page is already interpreted by the browser as being UTF-8, setting accept-charset="utf-8"
does nothing.
If you set the encoding of the page to UTF-8 in a <meta>
and/or HTTP header, it will be interpreted as UTF-8, unless the user deliberately goes to the View->Encoding menu and selects a different encoding, overriding the one you specified.
In that case, accept-charset would have the effect of setting the submission encoding back to UTF-8 in the face of the user messing about with the page encoding. However, this still won't work in IE, due to the previously discussed problems with accept-charset in that browser.
So it's IMO doubtful whether it's worth including accept-charset
to fix the case where a non-IE user has deliberately sabotaged the page encoding (possibly messing up more on your page than just the form).
Personally, I don't bother.
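To make the overridden-encoding case concrete, here is a minimal Python sketch of what the server would see. It assumes the browser percent-encodes a field using whatever encoding it currently applies to the page; the helper `encode_field` and the field name `name` are hypothetical, made up for illustration.

```python
from urllib.parse import quote, unquote_to_bytes

def encode_field(name: str, value: str, page_encoding: str) -> str:
    """Roughly mimic how a browser serializes one form field (hypothetical helper)."""
    return f"{name}={quote(value, encoding=page_encoding)}"

# Normal case: the page is read as UTF-8, so the field is sent as UTF-8 bytes.
utf8_body = encode_field("name", "café", "utf-8")
print(utf8_body)    # name=caf%C3%A9

# Override case: the user forced the page to windows-1252 (Python codec "cp1252").
cp1252_body = encode_field("name", "café", "cp1252")
print(cp1252_body)  # name=caf%E9

# A server that expects UTF-8 cannot decode the overridden submission.
raw = unquote_to_bytes(cp1252_body.split("=", 1)[1])
try:
    raw.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
print(decoded_ok)   # False
```

Either way, the server still has to cope with bytes that are not valid UTF-8, since nothing forces a client to send well-formed input.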
I did not encounter any problems using UTF-8 with IE (6+) or any other major browser out there. You need to make sure that a UTF-8 meta tag is set (IE needs this), that all your files are UTF-8 encoded, and that the webserver sends a matching UTF-8 charset header. Then there should not be any problem if you omit accept-charset.
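As a rough illustration of that setup, the following Python sketch is a tiny WSGI app that declares UTF-8 in both the HTTP header and a meta tag, and leaves accept-charset off the form. The app name `app`, the `/submit` action and the `comment` field are placeholders, not anything from the question.

```python
def app(environ, start_response):
    """Minimal WSGI app declaring UTF-8 in the header and in the markup."""
    body = (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        "<title>Form</title></head>\n"
        # No accept-charset here: the form inherits the page encoding.
        '<body><form method="post" action="/submit">\n'
        '<input type="text" name="comment"><input type="submit">\n'
        "</form></body></html>\n"
    ).encode("utf-8")  # the bytes on the wire must really be UTF-8
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, it can be served with the standard library, for example `wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever()`.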
accept-charset attribute was necessary (made any difference) given a UTF-8 HTTP header.
"may interpret" and that the default is UNKNOWN.
UNKNOWN/unset always means the current page encoding, whether that was the server's page encoding set in a header/meta, or the encoding explicitly set by the user as an override. Exception that probably doesn't affect you: most browsers will not send form submissions in a non-ASCII-superset encoding like UTF-16 even if the page was served as that. It doesn't really make sense to do so.
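A quick Python check shows why a non-ASCII-superset encoding is awkward for form data. The sample string `q=a&b=c` is made up; the point is only that the `=` and `&` delimiters stop being single ASCII bytes under UTF-16.

```python
field = "q=a&b=c"  # hypothetical form body, pure ASCII characters

ascii_bytes = field.encode("ascii")
utf8_bytes = field.encode("utf-8")
utf16_bytes = field.encode("utf-16-le")

# UTF-8 is an ASCII superset: ASCII text encodes to identical bytes.
print(utf8_bytes == ascii_bytes)  # True

# UTF-16 is not: every character takes two bytes, with NULs interleaved.
print(utf16_bytes)  # b'q\x00=\x00a\x00&\x00b\x00=\x00c\x00'
```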