In Python, identifiers (names of variables, functions, classes, etc.) need to be defined according to rules. Names that do not follow the rules cannot be used as identifiers and will result in an error.
The following information is provided here.
- Characters that can and cannot be used in identifiers (names)
- ASCII characters
- Unicode characters
- Normalization
- Check if the string is a valid identifier: isidentifier()
- Words that cannot be used as identifiers (names) (reserved words)
- Words that should not be used as identifiers (names)
- Naming conventions in PEP 8
The following description applies to Python 3; the behavior may differ in Python 2.
Characters that can and cannot be used in identifiers (names)
This section describes which characters can and cannot be used in identifiers (names).
There are many details, but basically all you need to remember is the following.
- Use uppercase and lowercase letters, numbers, and underscores.
- The first character cannot be a number.
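The two rules above can be sketched as a regular expression for ASCII-only names. The helper name `is_ascii_identifier` is made up for this illustration, and it deliberately ignores Unicode characters and reserved words, which are covered in the later sections.

```python
import re

# Hypothetical helper: the first character is a letter or an underscore,
# and the remaining characters are letters, digits, or underscores (ASCII only).
_ASCII_IDENTIFIER = re.compile(r'[A-Za-z_][A-Za-z0-9_]*')


def is_ascii_identifier(name):
    return _ASCII_IDENTIFIER.fullmatch(name) is not None


print(is_ascii_identifier('AbcDef_123'))  # True
print(is_ascii_identifier('_abc'))        # True
print(is_ascii_identifier('1_abc'))       # False
print(is_ascii_identifier('AbcDef-123'))  # False
```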
ASCII characters
ASCII characters that can be used in identifiers (names) are uppercase and lowercase letters (A-Z, a-z), digits (0-9), and the underscore (_). Identifiers are case-sensitive.
AbcDef_123 = 100
print(AbcDef_123)
# 100
Symbols other than underscores cannot be used. (The exact wording of the error messages shown here depends on the Python version.)
# AbcDef-123 = 100
# SyntaxError: can't assign to operator
Also, a number cannot be used as the first character.
# 1_abc = 100
# SyntaxError: invalid token
Underscores can also be used at the beginning.
_abc = 100
print(_abc)
# 100
However, note that an underscore at the beginning may have a special meaning.
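As a rough illustration of two such meanings: a single leading underscore is only a convention for "internal use", while two leading underscores inside a class trigger name mangling. The class `Account` and its attributes are made up for this sketch.

```python
class Account:
    def __init__(self):
        self._balance = 0   # single underscore: "internal use" by convention only
        self.__pin = 1234   # double underscore: stored as _Account__pin


acct = Account()
print(acct._balance)           # 0 (still accessible; nothing is enforced)
print(hasattr(acct, '__pin'))  # False (the name was mangled)
print(acct._Account__pin)      # 1234
```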
Unicode characters
Since Python 3, Unicode characters can also be used.
変数1 = 100
print(変数1)
# 100
Not all Unicode characters can be used; whether a character is allowed depends on its Unicode category. For example, symbols such as punctuation marks and emoji cannot be used.
# 変数。 = 100
# SyntaxError: invalid character in identifier
# ☺ = 100
# SyntaxError: invalid character in identifier
See the official documentation for the Unicode category codes that can be used.
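To see which category a character belongs to, the standard library's `unicodedata.category()` can be used. The sketch below looks up the characters from the examples above; the variable name `categories` is only for illustration. Letter categories such as `Ll` and `Lo` are allowed anywhere in a name, `Nd` (digits) is allowed except as the first character, and `Po` (punctuation) and `So` (symbols) are not allowed at all.

```python
import unicodedata

# Map each sample character to its Unicode general category code.
categories = {ch: unicodedata.category(ch) for ch in ['a', '1', '_', '変', '。', '☺']}

for ch, cat in categories.items():
    print(repr(ch), cat)
# 'a' Ll
# '1' Nd
# '_' Pc
# '変' Lo
# '。' Po
# '☺' So
```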
Although non-ASCII characters such as Chinese characters can be used without error, in many cases there is no advantage to using them in names.
Normalization
Identifiers containing Unicode characters are converted to the normalization form NFKC when the code is parsed. For example, full-width letters are converted to half-width letters (ASCII characters).
Note that names that look different in the source code are treated as the same identifier after normalization, so assigning to one overwrites the other.
ＡＢＣ = 100  # full-width letters
ABC = -100  # half-width (ASCII) letters
print(ＡＢＣ)
# -100
print(ABC)
# -100
print(ＡＢＣ is ABC)
# True
Check if the string is a valid identifier: isidentifier()
Whether or not a string is valid as an identifier can be checked with the string method isidentifier().
It returns True if the string is valid as an identifier, and False if it is not.
print('AbcDef_123'.isidentifier())
# True
print('AbcDef-123'.isidentifier())
# False
print('変数1'.isidentifier())
# True
print('☺'.isidentifier())
# False
Words that cannot be used as identifiers (names) (reserved words)
Some words (reserved words) cannot be used as identifiers even though they are valid identifier strings.
Because a reserved word is a valid identifier string, isidentifier() returns True, but an error occurs if it is actually used as an identifier.
print('None'.isidentifier())
# True
# None = 100
# SyntaxError: can't assign to keyword
To get a list of reserved words and to check if a string is a reserved word, use the keyword module of the standard library.
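A minimal sketch of that check: `keyword.kwlist` and `keyword.iskeyword()` are part of the standard library, while the helper `is_usable_name` is a made-up name that combines the keyword check with isidentifier().

```python
import keyword

# The list of reserved words depends on the Python version.
print(keyword.kwlist)

print(keyword.iskeyword('None'))  # True
print(keyword.iskeyword('none'))  # False (reserved words are case-sensitive)


def is_usable_name(name):
    # Hypothetical helper: a valid identifier string that is not a reserved word.
    return name.isidentifier() and not keyword.iskeyword(name)


print(is_usable_name('AbcDef_123'))  # True
print(is_usable_name('None'))        # False
print(is_usable_name('1_abc'))       # False
```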
Words that should not be used as identifiers (names)
Some names, such as the names of Python's built-in functions, are valid identifiers, so you can assign new values to them like any other variable.
For example, len() is a built-in function that returns the number of elements in a list or the number of characters in a string.
print(len)
# <built-in function len>
print(len('abc'))
# 3
If you assign a new value to the name len, the original function is hidden behind the new value and can no longer be called by that name. Note that no error or warning is issued when the new value is assigned.
print(len('abc'))
# 3
len = 100
print(len)
# 100
# print(len('abc'))
# TypeError: 'int' object is not callable
Another common mistake is writing list = [0, 1, 2], which makes list() unusable. Be careful.
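One way to guard against this is to check a candidate name against the standard library's `builtins` module before using it. The helper `shadows_builtin` below is a hypothetical name for this sketch, not an existing function.

```python
import builtins


def shadows_builtin(name):
    # Hypothetical helper: True if `name` is also defined in the builtins module.
    return hasattr(builtins, name)


print(shadows_builtin('len'))    # True
print(shadows_builtin('list'))   # True
print(shadows_builtin('total'))  # False

# The original function stays reachable through the builtins module,
# even when the bare name has been reassigned elsewhere.
print(builtins.len('abc'))  # 3
```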
Naming conventions in PEP 8
PEP stands for Python Enhancement Proposal, a document that describes new features and other aspects of Python. PEP 1 defines it as follows:
"PEP stands for Python Enhancement Proposal. A PEP is a design document providing information to the Python community, or describing a new feature for Python or its processes or environment."
PEP 1 — PEP Purpose and Guidelines | Python.org
PEP 8 is the eighth PEP. It is titled "Style Guide for Python Code" and, as the title says, it is the style guide for Python code.
Naming conventions are also covered.
See PEP 8 itself for details; for example, the following naming styles are recommended.
- Modules: lowercase_underscore (lowercase letters + underscores)
- Packages: lowercase (all lowercase letters)
- Classes, exceptions: CapitalizedWords (CamelCase; capitalize the first letter of each word, no underscores)
- Functions, variables, and methods: lowercase_underscore (lowercase letters + underscores)
- Constants: ALL_CAPS (capital letters + underscores)
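The styles listed above might look like this in a single module. All names here (the file name `shopping_cart.py`, `MAX_ITEMS`, `CartFullError`, `ShoppingCart`, `count_items`) are invented for this illustration.

```python
# Module file name (hypothetical): shopping_cart.py

MAX_ITEMS = 3  # constant: ALL_CAPS


class CartFullError(Exception):  # exception: CapitalizedWords
    pass


class ShoppingCart:  # class: CapitalizedWords
    def __init__(self):
        self.item_names = []  # variable: lowercase_underscore

    def add_item(self, item_name):  # method: lowercase_underscore
        if len(self.item_names) >= MAX_ITEMS:
            raise CartFullError('cart is full')
        self.item_names.append(item_name)


def count_items(cart):  # function: lowercase_underscore
    return len(cart.item_names)


cart = ShoppingCart()
cart.add_item('apple')
print(count_items(cart))  # 1
```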
Unless your organization has its own naming conventions, it is recommended to follow PEP 8.