Valid and invalid names and naming conventions for identifiers (e.g. variable names) in Python

In Python, identifiers (names of variables, functions, classes, etc.) need to be defined according to rules. Names that do not follow the rules cannot be used as identifiers and will result in an error.

This article covers the following topics.

  • Characters that can and cannot be used in identifiers (names)
    • ASCII characters
    • Unicode characters
      • Normalization (NFKC)
  • Check if the string is a valid identifier: isidentifier()
  • Words that cannot be used as identifiers (names) (reserved words)
  • Words that should not be used as identifiers (names)
  • Naming conventions in PEP 8

The descriptions below apply to Python 3; Python 2 may behave differently.

Characters that can and cannot be used in identifiers (names)

This section describes which characters can and cannot be used in identifiers (names).

There are many details, but in practice you only need to remember the following.

  • Use uppercase and lowercase letters, numbers, and underscores.
  • The first character cannot be a number.
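
The two rules above can be sketched as a regular expression. This is only an illustration for the ASCII case: the helper name is_ascii_identifier is made up for this example, and it does not account for Unicode characters or reserved words, both of which are discussed later.

```python
import re

# Assumption for this sketch: ASCII-only names.
# First character: a letter or underscore; the rest: letters, digits, underscores.
ASCII_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_ascii_identifier(name):
    """Hypothetical helper: True if name follows the two rules above."""
    return ASCII_IDENTIFIER.fullmatch(name) is not None

print(is_ascii_identifier("AbcDef_123"))
# True

print(is_ascii_identifier("1_abc"))
# False

print(is_ascii_identifier("AbcDef-123"))
# False
```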

ASCII characters

The ASCII characters that can be used in identifiers (names) are uppercase and lowercase letters (A-Z, a-z), digits (0-9), and the underscore (_). Letters are case-sensitive.

AbcDef_123 = 100
print(AbcDef_123)
# 100

Symbols other than the underscore cannot be used. (The exact wording of the error messages shown in this article varies between Python versions.)

# AbcDef-123 = 100
# SyntaxError: can't assign to operator

Also, a digit cannot be used as the first character.

# 1_abc = 100
# SyntaxError: invalid token

Underscores can also be used at the beginning.

_abc = 100
print(_abc)
# 100

However, note that a leading underscore may carry a special meaning.
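
As a rough illustration of that special meaning, the sketch below shows two common cases inside a class: a single leading underscore, which is only a convention for "internal use", and a double leading underscore, which triggers name mangling. The class Account and its attributes are hypothetical names chosen for this example.

```python
class Account:
    """Hypothetical class used only to illustrate leading underscores."""

    def __init__(self):
        self.owner = "alice"   # public attribute
        self._balance = 100    # single underscore: "internal" by convention only
        self.__pin = 1234      # double underscore: stored as _Account__pin

acct = Account()

print(acct._balance)
# 100

print(acct._Account__pin)
# 1234

print(hasattr(acct, "__pin"))
# False
```

Neither form prevents access; the single underscore is a hint to readers, and the mangled name is mainly a way to avoid clashes in subclasses.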

Unicode characters

In Python 3, Unicode characters can also be used in identifiers.

変数1 = 100
print(変数1)
# 100

Not all Unicode characters are allowed; whether a character can be used depends on its Unicode category. For example, punctuation marks and symbols such as emoji cannot be used.

# 変数。 = 100
# SyntaxError: invalid character in identifier

# ☺ = 100
# SyntaxError: invalid character in identifier

See the official documentation for the Unicode category codes that can be used.
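
To see which category a given character belongs to, the standard library's unicodedata.category() can be used. The following is a small sketch, not an exhaustive check; the sample characters are simply the ones used in this article.

```python
import unicodedata

samples = ["a", "1", "_", "変", "。", "☺"]

# Map each sample character to its Unicode general category code.
categories = {ch: unicodedata.category(ch) for ch in samples}

for ch in samples:
    # isidentifier() on a single character also reflects the
    # "cannot start with a digit" rule, hence False for "1".
    print(repr(ch), categories[ch], ch.isidentifier())
# 'a' Ll True
# '1' Nd False
# '_' Pc True
# '変' Lo True
# '。' Po False
# '☺' So False
```

Here Po (other punctuation) and So (other symbol) are categories that are not allowed in identifiers, while Nd (decimal digit) is allowed only after the first character.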

Just because Unicode characters such as kanji can be used without error does not mean they should be; in many cases there is no advantage to using them.

Normalization

Unicode characters in identifiers are converted to the normalization form NFKC when the source code is parsed. For example, full-width letters are converted to half-width letters (ASCII characters).

Note that two names that look different in the source code are treated as the same name if they normalize to the same string, so assigning to one overwrites the other.

# ＡＢＣ (full-width) and ABC (half-width) normalize to the same name
ＡＢＣ = 100
ABC = -100

print(ＡＢＣ)
# -100

print(ABC)
# -100

print(ＡＢＣ is ABC)
# True
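
The same conversion can be reproduced on ordinary strings with unicodedata.normalize(). This is a minimal sketch; the full-width letters are written as escape sequences so that the example does not depend on how the source file is displayed.

```python
import unicodedata

fullwidth = "\uff21\uff22\uff23"  # full-width letters A, B, C
normalized = unicodedata.normalize("NFKC", fullwidth)

print(normalized)
# ABC

print(normalized == "ABC")
# True

print(fullwidth == "ABC")
# False
```

As plain strings the two spellings are different; they only coincide after NFKC normalization, which is what happens to identifiers.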

Check if the string is a valid identifier: isidentifier()

You can check whether a string is valid as an identifier with the string method isidentifier().

It returns True if the string is a valid identifier and False otherwise.

print('AbcDef_123'.isidentifier())
# True

print('AbcDef-123'.isidentifier())
# False

print('変数1'.isidentifier())
# True

print('☺'.isidentifier())
# False

Words that cannot be used as identifiers (names) (reserved words)

Some words (reserved words, also called keywords) cannot be used as identifiers even though they are valid identifier strings.

Because a reserved word is a valid identifier string, isidentifier() returns True for it, but using it as an identifier raises an error.

print('None'.isidentifier())
# True

# None = 100
# SyntaxError: can't assign to keyword

To get the list of reserved words, or to check whether a string is a reserved word, use the keyword module of the standard library.
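
A short sketch of the keyword module follows. keyword.kwlist and keyword.iskeyword() are part of the standard library; the helper is_usable_name is a hypothetical name introduced here to combine the check with isidentifier().

```python
import keyword

# The full list of reserved words (contents vary slightly by Python version).
print(keyword.kwlist)

print(keyword.iskeyword("None"))
# True

print(keyword.iskeyword("abc"))
# False

def is_usable_name(name):
    """Hypothetical helper: valid identifier string and not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)

print(is_usable_name("abc"))
# True

print(is_usable_name("None"))
# False
```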

Words that should not be used as identifiers (names)

The names of Python's built-in functions, for example, are ordinary identifiers, so you can assign new values to them like any other variable.

For example, len() is a built-in function that returns the number of elements in a list or the number of characters in a string.

print(len)
# <built-in function len>

print(len('abc'))
# 3

If you assign a new value to the name len, that name no longer refers to the original function, so len() can no longer be called through it. Note that no error or warning is raised when the new value is assigned.

print(len('abc'))
# 3

len = 100
print(len)
# 100

# print(len('abc'))
# TypeError: 'int' object is not callable

Another common mistake is writing list = [0, 1, 2], which makes list() unusable. Be careful.
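
One way to guard against this is to check a candidate name against the standard builtins module before using it. The sketch below assumes that approach; shadows_builtin and demo are hypothetical names for this example.

```python
import builtins

def shadows_builtin(name):
    """Hypothetical helper: True if name is already a built-in name."""
    return hasattr(builtins, name)

print(shadows_builtin("list"))
# True

print(shadows_builtin("my_list"))
# False

def demo():
    len = 100  # local name hides the built-in inside this function
    # The original function is still reachable through the builtins module.
    return builtins.len("abc")

print(demo())
# 3
```

The original function is hidden rather than destroyed: at module level, del len removes the shadowing name and len() works again.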

Naming conventions in PEP 8

PEP stands for Python Enhancement Proposal, a document that describes new features and other aspects of Python. PEP 1 defines it as follows:

"PEP stands for Python Enhancement Proposal. A PEP is a design document providing information to the Python community, or describing a new feature for Python or its processes or environment."
PEP 1 — PEP Purpose and Guidelines | Python.org

PEP 8 is the eighth of these and is titled "Style Guide for Python Code"; in other words, it is the style guide for Python.

It also covers naming conventions.

See PEP 8 itself for details; for example, the following styles are recommended.

  • Modules
    • lowercase_underscore
    • Lowercase + underscore
  • Packages
    • lowercase
    • All lowercase letters
  • Classes, exceptions
    • CapitalizedWords (CamelCase)
    • Capitalize the first letter of each word, no underscores
  • Functions, variables, and methods
    • lowercase_underscore
    • Lowercase + underscore
  • Constants
    • ALL_CAPS
    • Capital letters + underscore
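
The styles above might look like this in code. Every name in this sketch (MAX_RETRIES, ConnectionTimeoutError, HttpClient, fetch_page, and so on) is invented purely to illustrate the conventions.

```python
MAX_RETRIES = 3  # constant: capital letters + underscore


class ConnectionTimeoutError(Exception):
    """Exception: CapitalizedWords, with the 'Error' suffix PEP 8 suggests."""


class HttpClient:
    """Class: CapitalizedWords."""

    def fetch_page(self, page_url):
        """Method and argument: lowercase + underscore."""
        retry_count = 0  # variable: lowercase + underscore
        return (page_url, retry_count, MAX_RETRIES)


client = HttpClient()
print(client.fetch_page("example"))
# ('example', 0, 3)
```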

These are conventions, not rules enforced by the interpreter. Still, unless your organization has its own naming conventions, following PEP 8 is recommended.
