
How to change the default encoding to UTF-8 for Apache

I am using a hosting company and it will list the files in a directory if the file index.html is not there. It uses ISO 8859-1 as the default encoding.

If the server is Apache, is there a way to set UTF-8 as the default instead?

I found out that it is actually using a DOCTYPE of HTML 3.2 and there is no charset at all, so it is not setting any encoding. But is there a way to change it to use UTF-8?

This question is very old but currently (in 2021), at least in my case (Debian 10), the utf-8 characters are served properly and it seems that it's not needed to uncomment or change the setting AddDefaultCharset to utf-8 at all (On Debian, it's in /etc/apache2/conf-available/charset.conf).

MartinodF

In httpd.conf add (or change if it's already there):

AddDefaultCharset utf-8
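
To confirm the directive took effect, it can help to look at the charset the server declares in its Content-Type response header. The following is a minimal sketch in Python; the function names are illustrative only, and any URL passed to served_charset is whatever page you want to check (none is assumed here).

```python
from email.message import Message
from typing import Optional
from urllib.request import urlopen


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


def served_charset(url: str) -> Optional[str]:
    """Fetch url and report the charset declared in the response headers."""
    with urlopen(url) as response:
        content_type = response.headers.get("Content-Type")
    # No Content-Type header at all means no declared charset.
    return charset_from_content_type(content_type) if content_type else None
```

A result of None means the server did not declare a charset, which matches the situation described in the question.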

Where in the file does one add this? Anywhere?
@Geoffrey Yes. If it's not already there, you can put it anywhere. However, I usually put every "custom" directive at the bottom of the file for a number of reasons (overriding pre-existing directives, order, and just to easily see what I changed from the stock config).
Add AddDefaultCharset utf-8 to .htaccess - worked a charm for me. (if you don't have access to httpd.conf)
Is it case sensitive?
Since this answer is from 2009: in Ubuntu 18, you change this configuration in /etc/apache2/conf-available/charset.conf
Mathias Bynens

Add this to your .htaccess:

IndexOptions +Charset=UTF-8

Or, if you have administrator rights, you could set it globally by editing httpd.conf and adding:

AddDefaultCharset UTF-8

(You can use AddDefaultCharset in .htaccess too, but it won’t affect Apache-generated directory listings that way.)


This is a great solution and less invasive than modifying the httpd.conf file.
On my server, the .htaccess file affects all the subdirectories as well; Apache probably looks for .htaccess files in each parent directory, all the way up to the root directory of the website folder.
Yes, that’s how .htaccess works on all servers — it affects all subdirectories as well. However, Apache-generated directory listing pages can’t be forced to UTF-8 by using .htaccess (AFAIK).
Please note changing serverwide settings via .htaccess files is generally bad practice. Bugs become harder to track when server settings are distributed across various files. There's a slight performance hit too: with each requested file, Apache has to read the directory's .htaccess file and all .htaccess files of parent directories. .htaccess should therefore only be used for either directory specific settings (e.g. preventing access to a specific directory) or when there is absolutely no possibility to gain administrator rights.
Up voted, the IndexOptions +Charset=UTF-8 did the trick for me, thanks!
Eugene Yokota

See AddDefaultCharset Directive, AddCharset Directive, and this article.

AddDefaultCharset utf-8

But I have to use Chinese characters now and then. Previously, I translated Chinese characters to Unicode code points and included them in the document using the &# hack. But that is only practical for pages with a few such characters. There is a better way: encode the charset information in the filename, and Apache will output the proper encoding header based on that. This is possible thanks to the AddCharset lines in the configuration file, such as the line below from conf/httpd.conf:

AddCharset UTF-8 .utf8

So if you have a file whose name ends in .html.utf8, Apache will serve the page as if it is encoded in UTF-8 and will put the matching character-encoding information in the response header.
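
The extension-to-charset lookup can be pictured with a toy model in Python. This is only a simplified sketch of the idea, not Apache's actual implementation; the mapping table and the function name charset_for are made up for illustration, and the table mirrors the single AddCharset line above.

```python
from typing import Dict, Optional

# Hypothetical table mirroring "AddCharset UTF-8 .utf8".
CHARSET_BY_EXTENSION: Dict[str, str] = {"utf8": "UTF-8"}


def charset_for(filename: str,
                mapping: Dict[str, str] = CHARSET_BY_EXTENSION) -> Optional[str]:
    """Return the charset implied by the filename's extensions, or None."""
    charset = None
    # Everything after the first dot is treated as a list of extensions.
    for extension in filename.split(".")[1:]:
        found = mapping.get(extension.lower())
        if found is not None:
            charset = found  # a later (rightmost) match replaces an earlier one
    return charset
```

In this model index.html.utf8 resolves to UTF-8, while plain index.html has no charset of its own and falls back to whatever the server default is.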


Peter Mortensen

In file .htaccess, add this line:

AddCharset utf-8 .html .css .php .txt .js

This is for those that do not have access to their server's configuration file. It is just one more thing to try when other attempts failed.

As far as performance issues regarding the use of file .htaccess, I have not seen this. My typical page load times are 150-200 ms with or without file .htaccess.

What good is performance if your page does not render correctly? Most shared servers do not allow user access to the configuration file which is the preferred place to add a character set.


I can't explain, but only this solution works for me. That's why a big +1
As mentioned by @Robbert earlier - if you are not already using .htaccess files, don't start now. There are performance & administrative reasons why this is a Bad Idea(tm)
When you do not include the extensions AddCharset is applied to Content Types text/html and text/plain.
This worked for me while all the answers above didn't. +1
The accepted answer only affects text/html and text/plain: httpd.apache.org/docs/2.4/mod/core.html#adddefaultcharset
Bjartur Thorlacius

On Ubuntu 12.04, it's sufficient to uncomment the line AddDefaultCharset UTF-8 in /etc/apache2/conf.d/charset. If you're using upstream Apache, the file may be called httpd.conf, and you may have to insert the line.


There is no such file as /etc/apache2/conf.d/charset. It is a custom include file by your distribution. As is any other file that’s not httpd.conf.
It's /etc/apache2/conf-enabled/charset.conf on my distribution (Ubuntu 16.04). It also didn't work.
Can you update your answer, e.g. with Linux distribution information, incl. version. E.g., what was the original Linux distribution and version? (But without "Edit:", "Update:", or similar - the answer should appear as if it was written today.)
David Glance

For completeness, on Apache2 on Ubuntu, you will find the default charset in charset.conf in conf-available.

Uncomment the line

AddDefaultCharset UTF-8

What is "conf-available"? A section in a configuration file? A file? Where is the file located?
On Ubuntu 20.04 the file is here: /etc/apache2/conf-available/charset.conf
Jon

I'm not sure whether you have access to the Apache config (httpd.conf) but you should be able to set an AddDefaultCharset Directive. See:

http://httpd.apache.org/docs/2.0/mod/core.html

Look for the mod_mime.c module and make sure the following is set:

AddDefaultCharset utf-8 

or the equivalent Apache 1.x docs (http://httpd.apache.org/docs/1.3/mod/core.html#adddefaultcharset).

However, this only works when "the response content-type is text/plain or text/html".

You should also make sure that your pages have a charset set as well. See this for more info:

http://www.w3.org/TR/REC-html40/charset.html


Peter Mortensen

This is untested, but it will probably work.

In your .htaccess file, add:

<Files ~ "\.html?$">  
     Header set Content-Type "text/html; charset=utf-8"
</Files>

However, this will require mod_headers on the server.


That worked for me, whereas the chosen solution did not. Thank you! In fact, I didn't even have to wrap it in <Files> tags.
What is "mod_headers"? Where does it go or how is it set?
This worked for me, when none of the other answers would. I also found out that there is a FilesMatch tag that also worked.
Peter Mortensen

Just a hint if you have long filenames in UTF-8 format: by default they will be shortened to 20 bytes, so the last character may be "cut in half" and therefore not recognized properly. In that case you may want to set the following:

IndexOptions Charset=UTF-8 NameWidth=*

The NameWidth=* setting prevents your file names from being shortened, so they are displayed in full and remain readable.

As other users already mentioned, this should be added either in httpd.conf or apache2.conf (if you do have admin rights) or in .htaccess (if you don't).
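
Why cutting a name at a fixed number of bytes can break the last character is easy to reproduce outside Apache. The sketch below is in Python and is only an illustration of byte-based truncation; the helper names are made up, and the width of 20 is taken from the description above.

```python
def truncate_utf8(name: str, width: int) -> bytes:
    """Cut the UTF-8 encoding of name after width bytes, ignoring character boundaries."""
    return name.encode("utf-8")[:width]


def is_cut_mid_character(name: str, width: int) -> bool:
    """Report whether truncating name to width bytes splits a multi-byte character."""
    try:
        truncate_utf8(name, width).decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# "é" takes two bytes in UTF-8, so one leading ASCII letter shifts the boundary.
shifted_name = "a" + "é" * 15   # 31 bytes; byte 20 is the first half of an "é"
aligned_name = "é" * 15         # 30 bytes; byte 20 ends a complete "é"
```

Here is_cut_mid_character(shifted_name, 20) is true, because the cut lands inside a two-byte character, while the aligned name happens to survive the same cut.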


What shortens them to 20 bytes? What is the context?
hon2a

Where all the HTML files are in UTF-8 and don't have meta tags for content type, I was only able to set the needed default for these files to be sent by Apache 2.4 by adding both directives:

AddLanguage ru .html
AddCharset UTF-8 .html

Peter Mortensen

Just leave 'default_charset' empty in WHM: default_charset = ''

P.S.: In WHM, go → Home → Service Configuration → PHP Configuration Editor → click 'Advanced Mode' → find 'default_charset' and leave it blank. Just nothing, not UTF-8 and not ISO.


ISO what? ISO 8859-1?
Peter Mortensen

overrides the Apache default charset (cf /etc/apache2/conf.d/charset)

If this is not enough, then you probably created your original file with the ISO 8859-1 encoding character set. You have to convert it to the proper character set:

iconv -f ISO-8859-1 -t UTF-8 source_file.php -o new_file.php
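
If iconv is not at hand, the same re-encoding can be sketched in Python. This assumes the source file really is ISO 8859-1; the function names and file paths are placeholders chosen for illustration.

```python
from pathlib import Path


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO 8859-1 bytes as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


def convert_file(source: str, target: str) -> None:
    """Read source as ISO 8859-1 and write the UTF-8 version to target."""
    Path(target).write_bytes(latin1_to_utf8(Path(source).read_bytes()))
```

Calling convert_file("source_file.php", "new_file.php") would mirror the iconv command above. Running it on a file that is already UTF-8 will double-encode the non-ASCII characters, so check the original encoding first.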

Peter Mortensen

In my case I added this to file .htaccess:

AddDefaultCharset off
AddDefaultCharset windows-1252