
Should I use encoding declaration in Python 3?

Python 3 uses UTF-8 encoding for source-code files by default. Should I still put an encoding declaration at the beginning of every source file? For example: # -*- coding: utf-8 -*-


Martijn Pieters

Because the default is UTF-8, you only need to use that declaration when you deviate from the default, or if you rely on other tools (like your IDE or text editor) to make use of that information.

In other words, as far as Python is concerned, only when you want to use an encoding that differs do you have to use that declaration.

Other tools, such as your editor, can support similar syntax, which is why the PEP 263 specification allows for considerable flexibility in the syntax: it must be a comment, the text "coding" must be there, followed by either a : or = character and optional whitespace, followed by a recognised codec.
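
To make that flexibility concrete, here is a minimal sketch that applies the regular expression given in PEP 263 to a few comment styles. The helper name declared_encoding and the sample lines are illustrative assumptions, not part of the PEP. The sketch only extracts the name and does not check that it is a recognised codec. Python itself only looks for the declaration on the first or second line of a file.

```python
import re

# Regular expression from PEP 263 for an encoding declaration line.
CODING_RE = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_encoding(line):
    """Return the codec name declared on this line, or None (hypothetical helper)."""
    match = CODING_RE.match(line)
    return match.group(1) if match else None


examples = [
    "# -*- coding: utf-8 -*-",            # Emacs style
    "# vim: set fileencoding=latin-1 :",  # Vim style, "fileencoding" contains "coding"
    "#coding=cp1252",                     # bare form
    "# just a comment",                   # no declaration
    "x = 1  # coding: utf-8",             # not a comment-only line, so ignored
]
results = [declared_encoding(line) for line in examples]
print(results)
```

The Emacs and Vim forms both match because the pattern only requires the text "coding" somewhere inside a comment-only line.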

Note that the declaration only applies to how Python reads the source code. It doesn't apply to executing that code, so it has no effect on how printing, opening files, or any other I/O operations translate between bytes and Unicode. For more details on Python, Unicode, and encodings, I strongly urge you to read the Python Unicode HOWTO, or the very thorough Pragmatic Unicode talk by Ned Batchelder.
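
As a small illustration of that distinction, the sketch below shows that the bytes produced at I/O time depend on the codec chosen for the I/O call, not on any source declaration. The file name example.txt and the sample text are assumptions made for the example.

```python
import os
import tempfile

# A str holds Unicode text, regardless of how the source file was encoded.
text = "caf\u00e9"

# The bytes depend on the codec chosen when encoding.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# The same holds for files: pass encoding= explicitly instead of relying
# on the platform default.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")  # hypothetical file name
    with open(path, "w", encoding="latin-1") as f:
        f.write(text)
    with open(path, "rb") as f:
        raw = f.read()
    with open(path, encoding="latin-1") as f:
        round_trip = f.read()

print(utf8_bytes, latin1_bytes, raw, round_trip == text)
```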


The # -*- coding: utf-8 -*- may still be useful for some editors to switch to the expected encoding when editing the source file.
@pepr A Byte Order Mark could do the same, no?
@endolith: the UTF-8 BOM is an abomination on this earth brought forth by Microsoft. See en.wikipedia.org/wiki/Byte_order_mark#UTF-8
@MartijnPieters Your link doesn't seem to agree with you
@endolith: no, the WP article only summarises the background; it is my own opinion that it is an abomination. The point of a BOM is to record the byte order (hence the name, Byte Order Mark). There is no byte order confusion in UTF-8; the mark only has that function in UTF-16 and UTF-32. The value is already a re-purposed zero-width no-break space character (handy, as accidental printing then ends up with entirely invisible output), and re-using that as a magic constant is wrong, in my view.
Sławomir Lenart

No, if:

the entire project uses only UTF-8, which is the default,

and you're sure your IDE doesn't need the encoding declaration in each file.

Yes, if:

your project relies on a different encoding,

or relies on several encodings.

For multi-encoding projects:

If some files are encoded in something other than UTF-8, then you should add the encoding declaration to the UTF-8 files too, because the golden rule is "Explicit is better than implicit."
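
In a mixed-encoding project, one way to check which encoding Python will use for a given source file is the standard library's tokenize.detect_encoding. The sketch below runs it on in-memory byte strings. The helper name source_encoding and the sample sources are assumptions made for illustration. Note that detect_encoding returns a normalised name, so a latin-1 declaration is reported as iso-8859-1.

```python
import io
import tokenize


def source_encoding(source_bytes):
    """Return the encoding Python would use to decode this source (hypothetical helper)."""
    encoding, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    return encoding


# No declaration: Python 3 falls back to the UTF-8 default.
no_declaration = b"print('hello')\n"

# Explicit declaration on the first line.
with_declaration = b"# -*- coding: latin-1 -*-\nprint('hello')\n"

print(source_encoding(no_declaration))
print(source_encoding(with_declaration))
```

For files on disk, the same function can be fed the readline method of a file opened in binary mode.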

Reference:

PyCharm doesn't need that declaration:

Configuring encoding for specific file in PyCharm

vim doesn't need that declaration, but:

# vim: set fileencoding= :