Ignoring (disabling) escape sequences in Python with raw strings

In Python, if you prefix a string literal ('...' or "...") with one of the following characters, the backslashes in it are kept as they are and escape sequences are not expanded.

  • r
  • R

This is useful for strings that contain many backslashes, such as Windows paths and regular expression patterns.
This article covers the following topics.

  • escape sequence
  • Ignore (disable) escape sequences in raw strings
  • Convert normal strings to raw strings with repr()
  • Note the backslash at the end.

escape sequence

In Python, characters that cannot be typed directly into a normal string literal (such as tabs and newlines) are written as escape sequences that begin with a backslash, similar to the C language. Two examples of escape sequences are shown below.

  • \t
  • \n
s = 'a\tb\nA\tB'
print(s)
# a b
# A B
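
Tabs and newlines are not the only escape sequences. As a small sketch (the variable names here are illustrative only), the following shows a few other common ones, each of which is written with several source characters but produces a single character in the string.

```python
# Each escape sequence below is written with several source characters
# but produces exactly one character in the resulting string.
backslash = '\\'      # a single backslash
quote = '\''          # a single quote inside a single-quoted literal
hex_a = '\x41'        # the character with code point 0x41, i.e. 'A'
uni_a = '\u0041'      # the same character written as a Unicode escape

print(len(backslash), len(quote), len(hex_a), len(uni_a))
# 1 1 1 1

print(hex_a == uni_a == 'A')
# True
```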

Ignore (disable) escape sequences in raw strings

If you prefix a string literal ('...' or "...") with one of the following, the backslashes are kept as they are and escape sequences are not expanded. Such a literal is called a raw string.

  • r
  • R
rs = r'a\tb\nA\tB'
print(rs)
# a\tb\nA\tB

There is no special raw string type. A raw string is an ordinary str, and it is equal to a normal string in which each backslash is written as \\.

print(type(rs))
# <class 'str'>

print(rs == 'a\\tb\\nA\\tB')
# True

In a normal string, an escape sequence counts as one character, but in a raw string the backslash and the character after it are counted separately. The length of each string and its individual characters are as follows.

print(len(s))
# 7

print(list(s))
# ['a', '\t', 'b', '\n', 'A', '\t', 'B']

print(len(rs))
# 10

print(list(rs))
# ['a', '\\', 't', 'b', '\\', 'n', 'A', '\\', 't', 'B']
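
Regular expression patterns are the other common use mentioned at the start of this article. The following is a minimal sketch using the standard re module; the sample text and variable names are made up for illustration.

```python
import re

text = 'tab\there, 42 apples and 7 pears'

# With a raw string the pattern reaches the re module exactly as typed.
digits_raw = re.findall(r'\d+', text)

# The equivalent normal string needs the backslash doubled.
digits_normal = re.findall('\\d+', text)

print(digits_raw)
# ['42', '7']

print(digits_raw == digits_normal)
# True

# \b means "word boundary" to re, but '\b' in a normal string is a
# backspace character, so the pattern below silently matches nothing.
print(re.findall(r'\bpears\b', text))
# ['pears']

print(re.findall('\bpears\b', text))
# []
```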

Windows Path

Raw strings are useful when you want to write a Windows path as a string literal.

Windows paths use backslashes as separators, so in a normal string each backslash must be escaped as \\. In a raw string you can write the path as is. The two values are equivalent.

path = 'C:\\Windows\\system32\\cmd.exe'
rpath = r'C:\Windows\system32\cmd.exe'
print(path == rpath)
# True

Note that a raw string literal ending with an odd number of backslashes results in an error, as described in the last section. In this case, either write the whole string as a normal string, or write only the trailing backslash as a normal string and concatenate it.

path2 = 'C:\\Windows\\system32\\'
# rpath2 = r'C:\Windows\system32\'
# SyntaxError: EOL while scanning string literal
rpath2 = r'C:\Windows\system32' + '\\'
print(path2 == rpath2)
# True
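
Two other ways to get the trailing backslash are sketched below. The names rpath3, rpath4, and expected are illustrative and not part of the example above.

```python
# Adjacent string literals are joined into one string, and a raw literal
# can sit next to a normal one, so no + operator is needed.
rpath3 = r'C:\Windows\system32' '\\'

# Alternative: end the raw string with two backslashes (an even number)
# and slice off the final character.
rpath4 = r'C:\Windows\system32\\'[:-1]

expected = 'C:\\Windows\\system32\\'

print(rpath3 == expected)
# True

print(rpath4 == expected)
# True
```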

Convert normal strings to raw strings with repr()

If you want to turn a normal string into one in which the escape sequences are spelled out literally, as they would be in the corresponding raw string, you can use the built-in function repr().

s_r = repr(s)
print(s_r)
# 'a\tb\nA\tB'

repr() returns a string representation of an object that, when passed to eval(), yields an object with the same value. For a string, this representation includes the enclosing quote characters at the start and end.

print(list(s_r))
# ["'", 'a', '\\', 't', 'b', '\\', 'n', 'A', '\\', 't', 'B', "'"]

By slicing off the first and last characters, we get a string equal to the raw string written with the r prefix.

s_r2 = repr(s)[1:-1]
print(s_r2)
# a\tb\nA\tB

print(s_r2 == rs)
# True

print(r'\t' == repr('\t')[1:-1])
# True
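
The slice of repr() is not always identical to the original text. As a hedged sketch, the following shows one case where it differs (a string containing both kinds of quote) and an alternative based on the standard unicode_escape codec. The alternative assumes ASCII text, since that codec also rewrites non-ASCII characters as escapes.

```python
s = 'a\tb\nA\tB'
rs = r'a\tb\nA\tB'

# Assumption: the text is ASCII. The unicode_escape codec would also
# rewrite non-ASCII characters as backslash escapes.
escaped = s.encode('unicode_escape').decode('ascii')
print(escaped == rs)
# True

# Caveat for the repr() slice: when a string contains both quote
# characters, repr() escapes the single quote, so a backslash appears
# that the original string does not contain.
q = 'it\'s "quoted"'
print(repr(q)[1:-1])
# it\'s "quoted"

print(repr(q)[1:-1] == q)
# False

# The codec leaves quote characters alone.
print(q.encode('unicode_escape').decode('ascii') == q)
# True
```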

Note the backslash at the end.

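Even in a raw string, a backslash immediately before the closing quote keeps the quote from ending the literal, so an error occurs if the literal ends with an odd number of backslashes. An even number of backslashes is fine. The wording of the error depends on the Python version: Python 3.10 and later report "unterminated string literal" instead of "EOL while scanning string literal".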

# print(r'\')
# SyntaxError: EOL while scanning string literal

print(r'\\')
# \\

# print(r'\\\')
# SyntaxError: EOL while scanning string literal
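
To see why the odd count matters, the small sketch below (the variable name q is illustrative) shows that a backslash followed by a quote is accepted inside a raw string, and that both characters remain in the value.

```python
# A backslash before the quote keeps the literal open, but in a raw
# string the backslash itself stays in the value.
q = r'\''
print(q)
# \'

print(len(q))
# 2

print(q == "\\'")
# True

# An even number of trailing backslashes pairs up, so the literal
# closes normally and contains two backslash characters.
print(len(r'\\'))
# 2
```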