What are the rules about using an underscore in a C++ identifier?

It's common in C++ to name member variables with some kind of prefix to denote the fact that they're member variables, rather than local variables or parameters. If you've come from an MFC background, you'll probably use m_foo. I've also seen myFoo occasionally.

C# (or possibly just .NET) seems to recommend using just an underscore, as in _foo. Is this allowed by the C++ standard?

Note that ignoring these rules does not necessarily mean your code will fail to compile or run, but it is likely that your code will not be portable across compilers and compiler versions, since name clashes cannot be ruled out. To back this up, I know of an implementation of an important system that used the underscore-plus-capital-letter naming convention everywhere. There were no errors due to this. Of course, it is bad practice.

24 revs, 13 users 35% Roger Pate

The rules (which did not change in C++11):

Reserved in any scope, including for use as implementation macros:

identifiers beginning with an underscore followed immediately by an uppercase letter

identifiers containing adjacent underscores (or "double underscore")

Reserved in the global namespace:

identifiers beginning with an underscore
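
As a rough illustration of the rules listed above, the following sketch shows which spellings are reserved and which remain available to user code. All names in it (app, Widget, _size, _twice) are made up for the example, and the reserved spellings appear only in comments.

```cpp
#include <cstddef>

// Reserved in every scope (shown only in comments, since user code
// must not declare them):
//   int _Count;      // underscore followed by an uppercase letter
//   int my__value;   // contains adjacent underscores
//
// Reserved in the global namespace only:
//   int _count;      // leading underscore at global scope

namespace app {  // hypothetical user namespace

class Widget {
public:
    explicit Widget(std::size_t size) : _size(size) {}

    std::size_t size() const { return _size; }

private:
    // Underscore followed by a lowercase letter: not reserved here,
    // because this is a class member, not a global name.
    std::size_t _size;
};

inline std::size_t doubled(std::size_t n) {
    std::size_t _twice = n * 2;  // local scope: also not reserved
    return _twice;
}

}  // namespace app
```

Whether such names are a good idea stylistically is a separate question; the sketch only shows what the rules permit.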

Also, everything in the std namespace is reserved. (You are allowed to add template specializations, though.)
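
A minimal sketch of the template specialization exception, assuming a hypothetical user type app::Point: specializing std::hash for it is allowed, whereas adding new declarations to namespace std is not. The hash combination used here is purely illustrative.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>

namespace app {  // hypothetical user namespace

struct Point {
    int x;
    int y;
    bool operator==(const Point& other) const {
        return x == other.x && y == other.y;
    }
};

}  // namespace app

// Specializing a standard template for a user-defined type is permitted.
namespace std {
template <>
struct hash<app::Point> {
    size_t operator()(const app::Point& p) const noexcept {
        // Simple combination of the two coordinate hashes (illustrative only).
        return hash<int>()(p.x) ^ (hash<int>()(p.y) << 1);
    }
};
}  // namespace std
```

With this specialization in place, std::unordered_set<app::Point> can be used without passing a custom hasher.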

From the 2003 C++ Standard:

17.4.3.1.2 Global names [lib.global.names]

Certain sets of names and function signatures are always reserved to the implementation:

Each name that contains a double underscore (__) or begins with an underscore followed by an uppercase letter (2.11) is reserved to the implementation for any use.

Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.165

165) Such names are also reserved in namespace ::std (17.4.3.1).

Because C++ is based on the C standard (1.1/2, C++03) and C99 is a normative reference (1.2/1, C++03) these also apply, from the 1999 C Standard:

7.1.3 Reserved identifiers

Each header declares or defines all identifiers listed in its associated subclause, and optionally declares or defines identifiers listed in its associated future library directions subclause and identifiers which are always reserved either for any use or for use as file scope identifiers.

All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use.

All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces.

Each macro name in any of the following subclauses (including the future library directions) is reserved for use as specified if any of its associated headers is included; unless explicitly stated otherwise (see 7.1.4).

All identifiers with external linkage in any of the following subclauses (including the future library directions) are always reserved for use as identifiers with external linkage.154

Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.

No other identifiers are reserved. If the program declares or defines an identifier in a context in which it is reserved (other than as allowed by 7.1.4), or defines a reserved identifier as a macro name, the behavior is undefined.

If the program removes (with #undef) any macro definition of an identifier in the first group listed above, the behavior is undefined.

154) The list of reserved identifiers with external linkage includes errno, math_errhandling, setjmp, and va_end.

Other restrictions might apply. For example, the POSIX standard reserves a lot of identifiers that are likely to show up in normal code:

Names beginning with a capital E followed by a digit or uppercase letter may be used for additional error code names.

Names that begin with either is or to followed by a lowercase letter may be used for additional character testing and conversion functions.

Names that begin with LC_ followed by an uppercase letter may be used for additional macros specifying locale attributes.

Names of all existing mathematics functions suffixed with f or l are reserved for corresponding functions that operate on float and long double arguments, respectively.

Names that begin with SIG followed by an uppercase letter are reserved for additional signal names.

Names that begin with SIG_ followed by an uppercase letter are reserved for additional signal actions.

Names beginning with str, mem, or wcs followed by a lowercase letter are reserved for additional string and array functions.

Names beginning with PRI or SCN followed by any lowercase letter or X are reserved for additional format specifier macros.

Names that end with _t are reserved for additional type names.

While using these names for your own purposes right now might not cause a problem, they do raise the possibility of conflict with future versions of that standard.
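
To get a feel for how many ordinary names fall under these patterns, here is a minimal sketch of a spelling check. The namespace naming and the function matches_reserved_pattern are hypothetical names invented for this example. It looks only at the spelling, knows nothing about which headers are included, and leaves out the rule about math functions suffixed with f or l because that would need a list of function names.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>

namespace naming {  // hypothetical namespace for this sketch

inline bool upper(char c) { return c >= 'A' && c <= 'Z'; }
inline bool lower(char c) { return c >= 'a' && c <= 'z'; }
inline bool digit(char c) { return c >= '0' && c <= '9'; }

// True if s begins with prefix and has at least one more character after it.
inline bool has_prefix(const std::string& s, const std::string& prefix) {
    return s.size() > prefix.size() &&
           s.compare(0, prefix.size(), prefix) == 0;
}

// True if name matches one of the spelling patterns listed above.
inline bool matches_reserved_pattern(const std::string& name) {
    // E followed by a digit or an uppercase letter.
    if (has_prefix(name, "E") && (digit(name[1]) || upper(name[1]))) return true;

    // is or to followed by a lowercase letter.
    for (const char* p : {"is", "to"}) {
        if (has_prefix(name, p) && lower(name[2])) return true;
    }
    // LC_, SIG_ or SIG followed by an uppercase letter.
    if (has_prefix(name, "LC_") && upper(name[3])) return true;
    if (has_prefix(name, "SIG_") && upper(name[4])) return true;
    if (has_prefix(name, "SIG") && upper(name[3])) return true;

    // str, mem or wcs followed by a lowercase letter.
    for (const char* p : {"str", "mem", "wcs"}) {
        if (has_prefix(name, p) && lower(name[3])) return true;
    }
    // PRI or SCN followed by a lowercase letter or X.
    for (const char* p : {"PRI", "SCN"}) {
        if (has_prefix(name, p) && (lower(name[3]) || name[3] == 'X')) return true;
    }
    // Names ending with _t.
    const std::size_t n = name.size();
    return n >= 2 && name[n - 2] == '_' && name[n - 1] == 't';
}

}  // namespace naming
```

Under this check, everyday names such as tolerance, strategy, and member all match one of the patterns, which shows how broad the reserved prefixes are.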

Personally I just don't start identifiers with underscores. New addition to my rule: don't use double underscores anywhere, which is easy as I rarely use underscores.

After doing research on this article I no longer end my identifiers with _t as this is reserved by the POSIX standard.

The rule about any identifier ending with _t surprised me a lot. I think it comes from the POSIX standard (I am not sure yet, and am looking for clarification and official chapter and verse). This is from the GNU libtool manual, listing reserved names.

CesarB provided the following link to the POSIX 2004 reserved symbols and notes 'that many other reserved prefixes and suffixes ... can be found there'. The POSIX 2008 reserved symbols are defined here. The restrictions are somewhat more nuanced than those above.


The C++ standard doesn't "import" the C one, does it? They import certain headers, but not the language as a whole, or naming rules, as far as I know. But yeah, the _t one surprised me as well. But since it's C, it can only apply to the global namespace. It should be safe to use _t inside classes, as I read it.
The C++ Standard doesn't "import" the C Standard. It references the C Standard. The C++ library introduction says "The library also makes available the facilities of the Standard C Library". It does that by including headers of the C Standard library with appropriate changes, but not by "importing" it. The C++ Standard has its own set of rules that describes the reserved names. If a name reserved in C should be reserved in C++, that is the place to say so. But the C++ Standard doesn't say so. So I don't believe that things reserved in C are reserved in C++, but I could well be wrong.
This is what I found about the "_t" issue: n1256 (C99 TC3) says: "Typedef names beginning with int or uint and ending with _t" are reserved. I think that still allows using names like "foo_t", but I think these are then reserved by POSIX.
So 'tolerance' is reserved by POSIX as it starts with 'to' + a lowercase letter? I bet a lot of code breaks this rule!
@LokiAstari, "The C++ standard is defined in terms of the C standard. Basically it says the C++ is C with these differences and additions." Nonsense! C++ only references the C standard in [basic.fundamental] and the library. If what you say is true, where does C++ say that _Bool and _Imaginary don't exist in C++? The C++ language is defined explicitly, not in terms of "edits" to C, otherwise the standard could be much shorter!
paercebal

The rules for avoiding name collisions are both in the C++ standard (see Stroustrup's book) and mentioned by C++ gurus (Sutter, etc.).

Personal rule

Because I did not want to deal with special cases and wanted a simple rule, I devised a personal one that is both simple and correct:

When naming a symbol, you will avoid collision with compiler/OS/standard libraries if you:

never start a symbol with an underscore

never name a symbol with two consecutive underscores inside.

Of course, putting your code in a unique namespace helps to avoid collisions, too (but won't protect against evil macros).

Some examples

(I use macros because they are the most code-polluting of C/C++ symbols, but it could be anything from a variable name to a class name.)

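#define _WRONG          // leading underscore followed by an uppercase letter
#define __WRONG_AGAIN   // starts with two consecutive underscores
#define RIGHT_          // a trailing underscore is fine
#define WRONG__WRONG    // two consecutive underscores in the middle
#define RIGHT_RIGHT     // single underscores inside the name are fine
#define RIGHT_x_RIGHT   // single underscores inside the name are fine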
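
The personal rule can also be written down as a small check, sketched here under the assumption that names are plain ASCII strings. The function name follows_personal_rule is made up for this example; it is stricter than the standard, since it also rejects a leading underscore followed by a lowercase letter.

```cpp
#include <string>

// Hypothetical helper implementing the two-part personal rule:
// no leading underscore, and no two consecutive underscores anywhere.
inline bool follows_personal_rule(const std::string& name) {
    if (name.empty() || name[0] == '_') return false;
    return name.find("__") == std::string::npos;
}
```

Applied to the macro names above, it accepts RIGHT_, RIGHT_RIGHT and RIGHT_x_RIGHT and rejects the other three.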

Extracts from C++0x draft

From the n3242.pdf file (I expect the final standard text to be similar):

17.6.3.3.2 Global names [global.names]

Certain sets of names and function signatures are always reserved to the implementation:

Each name that contains a double underscore (__) or begins with an underscore followed by an uppercase letter (2.12) is reserved to the implementation for any use.

Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.

But also:

17.6.3.3.5 User-defined literal suffixes [usrlit.suffix]

Literal suffix identifiers that do not start with an underscore are reserved for future standardization.

This last clause is confusing, unless you consider that a name starting with one underscore and followed by a lowercase letter is OK as long as it is not defined in the global namespace.
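
A minimal sketch of what the literal suffix clause means in practice, assuming a hypothetical units namespace and Distance type: a user-defined suffix has to start with an underscore, and here it is followed by a lowercase letter and declared inside a namespace.

```cpp
namespace units {  // hypothetical namespace

struct Distance {
    long double meters;
};

// The suffix starts with an underscore, as user-defined literal suffixes
// must; suffixes without one are reserved for future standardization.
constexpr Distance operator""_km(long double value) {
    return Distance{value * 1000.0L};
}

}  // namespace units
```

After a using-directive for units, an expression such as 2.5_km yields a Distance of 2500 meters.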


@Meysam : __WRONG_AGAIN__ contains two consecutive underscores (two at the beginning, and two at the end), so this is wrong according to the standard.
@BЈовић : WRONG__WRONG contains two consecutive underscores (two in the middle), so this is wrong according to the standard
"putting your code in an unique namespace helps to avoid collision, too": but this is still not enough, since the identifier may collide with a keyword regardless of scope (e.g. __attribute__ for GCC).
Why is there any problem of having two consecutive underscores in the middle according to the standard? User-defined literal suffixes apply to literal values like 1234567L or 4.0f; IIRC this refers to http://en.cppreference.com/w/cpp/language/user_literal
"Why is there any problem of having two consecutive underscores in the middle according to the standard?" Because the standard says those are reserved. This is not advice on good or bad style. It's a decision made by the standard. Why did they decide this? I guess the first compilers already used such conventions informally before standardization.
Community

From MSDN:

Use of two sequential underscore characters ( __ ) at the beginning of an identifier, or a single leading underscore followed by a capital letter, is reserved for C++ implementations in all scopes. You should avoid using one leading underscore followed by a lowercase letter for names with file scope because of possible conflicts with current or future reserved identifiers.

This means that you can use a single underscore as a member variable prefix, as long as it's followed by a lower-case letter.

This is apparently taken from section 17.4.3.1.2 of the C++ standard, but I can't find an original source for the full standard online.

See also this question.


I found a similar text in n3092.pdf (the draft of C++0x standard) at section: "17.6.3.3.2 Global names"
Interestingly, this seems to be the only answer which has direct, concise answer to the question.
@hyde: Actually, it isn't, since it skips the rule not to have any identifiers with a leading underscore in the global namespace. See Roger's answer. I'd be very wary of citing the MS VC docs as an authority on the C++ standard.
@sbi I was referring to "you can use a single underscore as a member variable prefix, as long as it's followed by a lower-case letter" in this answer, which answers the question on the question text directly and concisely, without being drowned in a wall of text.
First, I still consider the lack of any hint that the same rule does not apply to the global namespace a failure. What's worse, though, is that adjacent underscores are forbidden not only at the beginning of, but anywhere in, an identifier. So this answer isn't merely omitting a fact, but actually makes at least one actively wrong claim. As I said, referring to the MSVC docs is something I wouldn't do unless the question is solely about VC.
Max Lybbert

As for the other part of the question, it's common to put the underscore at the end of the variable name to not clash with anything internal.

I do this even inside classes and namespaces because I then only have to remember one rule (compared to "at the end of the name in global scope, and the beginning of the name everywhere else").
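
A short sketch of the trailing-underscore convention, using a made-up Account class: members end in an underscore, so parameters and accessors can use the plain name without clashing with anything reserved.

```cpp
#include <string>
#include <utility>

class Account {  // hypothetical class for illustration
public:
    Account(std::string owner, int balance)
        : owner_(std::move(owner)), balance_(balance) {}

    // The trailing underscore separates the member from the parameter.
    void deposit(int amount) { balance_ += amount; }

    const std::string& owner() const { return owner_; }
    int balance() const { return balance_; }

private:
    std::string owner_;
    int balance_;
};
```

The same spelling works unchanged at global scope, which is the point of having only one rule to remember.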


Arsen Khachaturyan

Yes, underscores may be used anywhere in an identifier. I believe the rules are: any of a-z, A-Z, _ for the first character, and those plus 0-9 for the following characters.
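
The spelling rule described here can be sketched as a small check. The function name valid_identifier_spelling is invented for the example, and it covers only the basic letters, digits, and underscore mentioned above. It ignores keywords and says nothing about whether a name is reserved.

```cpp
#include <string>

// Hypothetical helper: checks only the spelling rule (letter or underscore
// first, then letters, digits, or underscores).
inline bool valid_identifier_spelling(const std::string& name) {
    if (name.empty()) return false;

    auto letter_or_underscore = [](char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    };

    if (!letter_or_underscore(name[0])) return false;
    for (char c : name) {
        const bool is_digit_char = (c >= '0' && c <= '9');
        if (!letter_or_underscore(c) && !is_digit_char) return false;
    }
    return true;
}
```

A name such as _Foo passes this check even though it is reserved, so being spelled validly and being available to user code are two separate questions.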

Underscore prefixes are common in C code -- a single underscore means "private", and double underscores are usually reserved for use by the compiler.


They are common in libraries. They should not be common in user code.
People do write libraries in C, you know.
"Yes, underscores may be used anywhere in an identifier." This is wrong for global identifiers. See Roger's answer.
@sbi According to the C and C++ standards, yes, semantically, global identifiers with leading underscores are reserved. They are syntactically valid identifiers though, and the compiler won't stop you from naming a function _Foo, though by doing so you're relying on nonstandard implementation details and thus risk having your code broken by future versions of the language/standard library implementation/OS.
@BenW: TTBOMK, the C++ standard simply says that global identifiers starting with an underscore are not allowed, without making any distinction between syntax and semantics. (Also any identifiers starting with an underscore followed by a capital letter, and any identifiers with two consecutive underscores.)