I am using a hosting company and it will list the files in a directory if the file index.html
is not there. It uses ISO 8859-1 as the default encoding.
If the server is Apache, is there a way to set UTF-8 as the default instead?
I found out that it is actually using a DOCTYPE of HTML 3.2 and there is no charset
at all... so it is not setting any encoding. But is there a way to change it to use UTF-8?
AddDefaultCharset to utf-8 at all (On Debian, it's in /etc/apache2/conf-available/charset.conf).
In httpd.conf add (or change if it's already there):
AddDefaultCharset utf-8
Add this to your .htaccess:
IndexOptions +Charset=UTF-8
Or, if you have administrator rights, you could set it globally by editing httpd.conf and adding:
AddDefaultCharset UTF-8
(You can use AddDefaultCharset in .htaccess too, but it won't affect Apache-generated directory listings that way.)
.htaccess can affect all the subdirectories as well; Apache probably looks for an .htaccess file in each parent directory, all the way up to the root directory of the website folder.
.htaccess works on all servers, and it affects all subdirectories as well. However, Apache-generated directory listing pages can't be forced to UTF-8 by using .htaccess (AFAIK).
Using .htaccess files is generally bad practice. Bugs become harder to track when server settings are distributed across various files. There's a slight performance hit too: with each requested file, Apache has to read the directory's .htaccess file and all .htaccess files of parent directories. .htaccess should therefore only be used either for directory-specific settings (e.g. preventing access to a specific directory) or when there is absolutely no possibility of gaining administrator rights.
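For readers who do control the server, the per-directory settings discussed above can live in the main configuration instead of .htaccess. A minimal sketch, assuming a hypothetical document root /var/www/example (that path is not from the original posts):

```apache
# Hypothetical path; replace it with the real document root.
<Directory "/var/www/example">
    # Tell Apache not to look for .htaccess files in this tree.
    AllowOverride None
    # Charset for Apache-generated directory listings (mod_autoindex).
    IndexOptions +Charset=UTF-8
    # Charset parameter added to text/plain and text/html responses.
    AddDefaultCharset UTF-8
</Directory>
```

With AllowOverride None, any existing .htaccess files under that directory are ignored, so this only suits setups where all settings can be moved into the server configuration.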
See AddDefaultCharset Directive, AddCharset Directive, and this article.
AddDefaultCharset utf-8
But I have to use Chinese characters now and then. Previously, I translated the Chinese characters to Unicode code points and included them in the document using the hack, but that is only practical for pages with a few such characters. There is a better way: encode the charset information in the filename, and Apache will output the proper encoding header based on that. This is possible thanks to the AddCharset lines in the configuration file, such as the line below from conf/httpd.conf:
AddCharset UTF-8 .utf8
So if you have a file whose name ends in .html.utf8, Apache will serve the page as UTF-8 and will put the matching character-encoding declaration in the response header.
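The same mechanism extends to other encodings, which may help when some files are in a legacy Chinese encoding. A sketch, assuming mod_mime is loaded; the extensions are arbitrary labels chosen for illustration, and page.html.utf8 is a hypothetical filename:

```apache
<IfModule mod_mime.c>
    # Each extension is mapped to the charset Apache should announce.
    AddCharset UTF-8  .utf8
    AddCharset Big5   .big5
    AddCharset GB2312 .gb2312
</IfModule>
# Example: page.html.utf8 has two extensions. ".html" selects the
# text/html media type (assuming the usual MIME type mapping) and
# ".utf8" selects the UTF-8 charset.
```

Note that this only changes the charset Apache announces. The file contents must actually be saved in the named encoding.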
In file .htaccess, add this line:
AddCharset utf-8 .html .css .php .txt .js
This is for those that do not have access to their server's configuration file. It is just one more thing to try when other attempts failed.
As for performance issues from using an .htaccess file, I have not seen any. My typical page load times are 150-200 ms with or without an .htaccess file.
What good is performance if your page does not render correctly? Most shared servers do not allow user access to the configuration file, which is the preferred place to set a character set.
.htaccess files, don't start now. There are performance & administrative reasons why this is a Bad Idea(tm).
On Ubuntu 12.04, it's sufficient to uncomment the line AddDefaultCharset UTF-8 in /etc/apache2/conf.d/charset. If you're using upstream Apache, the file may be called httpd.conf, and you may have to insert the line.
/etc/apache2/conf.d/charset. It is a custom include file from your distribution, as is any other file that's not httpd.conf.
/etc/apache2/conf-enabled/charset.conf on my distribution (Ubuntu 16.04). Also didn't work.
For completeness, on Apache2 on Ubuntu, you will find the default charset in charset.conf in conf-available.
Uncomment the line
AddDefaultCharset UTF-8
I'm not sure whether you have access to the Apache config (httpd.conf), but you should be able to set an AddDefaultCharset directive. See:
http://httpd.apache.org/docs/2.0/mod/core.html
or the equivalent Apache 1.x docs (http://httpd.apache.org/docs/1.3/mod/core.html#adddefaultcharset).
Look for the mod_mime.c module and make sure the following is set:
AddDefaultCharset utf-8
However, this only works when "the response content-type is text/plain or text/html".
You should also make sure that your pages have a charset set as well. See this for more info:
http://www.w3.org/TR/REC-html40/charset.html
This is untested, but it will probably work.
In your .htaccess file, add:
<Files ~ "\.html?$">
Header set Content-Type "text/html; charset=utf-8"
</Files>
However, this will require mod_headers on the server.
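Because the snippet above depends on mod_headers, one possible variant wraps it in an <IfModule> guard so that Apache skips it when the module is not loaded, rather than rejecting the directive. It also uses <FilesMatch>, which the Apache documentation prefers over the <Files ~ "..."> form. Like the original, this is a sketch and is untested here:

```apache
<IfModule mod_headers.c>
    # Match files ending in .htm or .html.
    <FilesMatch "\.html?$">
        # Overwrite the Content-Type header, including the charset.
        Header set Content-Type "text/html; charset=utf-8"
    </FilesMatch>
</IfModule>
```

In an .htaccess file this also assumes the host allows the FileInfo override, since the Header directive needs it.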
<Files> tags.
Just a hint if you have long filenames in UTF-8 format: by default they will be shortened to 20 bytes, so the last character may be "cut in half" and therefore not be recognized properly. In that case you may want to set the following:
IndexOptions Charset=UTF-8 NameWidth=*
The NameWidth setting will prevent your file names from being shortened, so they are displayed properly and remain readable.
As other users already mentioned, this should be added either in httpd.conf or apache2.conf (if you do have admin rights) or in .htaccess (if you don't).
Where all the HTML files are in UTF-8 and don't have meta tags for content type, I was only able to set the needed default for these files to be sent by Apache 2.4 by adding both directives:
AddLanguage ru .html
AddCharset UTF-8 .html
Just leave 'default_charset' empty in WHM: default_charset = ''
P.S.: In WHM, go → Home → Service Configuration → PHP Configuration Editor → click 'Advanced Mode' → find 'default_charset' and leave it blank. Just nothing, not UTF-8 and not ISO.
overrides the Apache default charset (cf /etc/apache2/conf.d/charset)
If this is not enough, then you probably created your original file in the ISO 8859-1 character encoding. You have to convert it to the proper character set:
iconv -f ISO-8859-1 -t UTF-8 source_file.php -o new_file.php
In my case I added this to file .htaccess:
AddDefaultCharset off
AddDefaultCharset windows-1252
/etc/apache2/conf-available/charset.conf