Using list comprehension notation in Python

In Python, list comprehension notation (list comprehensions) makes it simple to generate a new list.

In this article, we first cover the following:

  • Basic type of list comprehension notation
  • List comprehension notation with conditional branching by if
  • Combination with ternary operators (if else-like processing)
  • Combination with zip() and enumerate()
  • Nested list comprehension notation

Next, we explain the related comprehension notations with sample code:

  • Set comprehension notation (Set comprehensions)
  • Dictionary comprehension notation (Dict comprehensions)
  • Generator expressions

Basic type of list comprehension notation

The list comprehension notation is written as follows.

[expression for variable_name in iterable]

Each element of an iterable object such as a list, tuple, or range is assigned in turn to the variable name and evaluated with the expression. A new list whose elements are the evaluation results is returned.

An example is given along with an equivalent for statement.

squares = [i**2 for i in range(5)]
print(squares)
# [0, 1, 4, 9, 16]

# Equivalent for statement
squares = []
for i in range(5):
    squares.append(i**2)

print(squares)
# [0, 1, 4, 9, 16]

The same process can be done with map(), but the list comprehension notation is preferred for its simplicity and clarity.
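
For comparison, here is a minimal sketch of the same list built with map(). The name squares_map and the lambda are only for this illustration. In Python 3, map() returns an iterator, so it is wrapped in list() to get a list.

```python
# Same result as [i**2 for i in range(5)], written with map().
# map() returns a lazy iterator in Python 3, so list() is needed to build the list.
squares_map = list(map(lambda i: i**2, range(5)))
print(squares_map)
# [0, 1, 4, 9, 16]
```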

List comprehension notation with conditional branching by if

Conditional branching with if is also possible. Write the if clause at the end, as follows.

[expression for variable_name in iterable if condition]

Only the elements of the iterable object for which the conditional expression is true are evaluated by the expression, and a new list whose elements are the results is returned.

The conditional expression can refer to the loop variable.

An example is given along with an equivalent for statement.

odds = [i for i in range(10) if i % 2 == 1]
print(odds)
# [1, 3, 5, 7, 9]

# Equivalent for statement
odds = []
for i in range(10):
    if i % 2 == 1:
        odds.append(i)

print(odds)
# [1, 3, 5, 7, 9]

The same process can be done with filter(), but the list comprehension notation is preferred for its simplicity and clarity.
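
For comparison, a minimal sketch of the same selection written with filter(). The name odds_filter and the lambda are only for this illustration. Like map(), filter() returns an iterator in Python 3, so list() is used to build the list.

```python
# Same result as [i for i in range(10) if i % 2 == 1], written with filter().
# filter() keeps the elements for which the function returns a true value.
odds_filter = list(filter(lambda i: i % 2 == 1, range(10)))
print(odds_filter)
# [1, 3, 5, 7, 9]
```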

Combination with ternary operators (if else-like processing)

In the example above, only those elements that meet the criteria are processed, and those that do not meet the criteria are excluded from the new list.

If you want to switch the process depending on the condition, or if you want to process elements that do not satisfy the condition differently, as in if else, use the ternary operator.

In Python, the ternary operator is written as follows.

value_if_true if condition else value_if_false

This is used in the expression part of the list comprehension notation as shown below.

[value_if_true if condition else value_if_false for variable_name in iterable]

An example is given along with an equivalent for statement.

odd_even = ['odd' if i % 2 == 1 else 'even' for i in range(10)]
print(odd_even)
# ['even', 'odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even', 'odd']

# Equivalent for statement
odd_even = []
for i in range(10):
    if i % 2 == 1:
        odd_even.append('odd')
    else:
        odd_even.append('even')

print(odd_even)
# ['even', 'odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even', 'odd']

The true and false values can also be expressions that use the loop variable.

In the following example, the element is multiplied by 10 if the condition is satisfied; otherwise the original value is left unchanged.

odd10 = [i * 10 if i % 2 == 1 else i for i in range(10)]
print(odd10)
# [0, 10, 2, 30, 4, 50, 6, 70, 8, 90]
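
The two forms of if can also appear in the same comprehension: the if clause at the end decides which elements are kept, and the ternary operator decides what each kept element becomes. A small sketch (the name labeled and the multiple-of-3 condition are made up for this illustration):

```python
# The trailing "if i % 3 == 0" filters the elements (keeps 0, 3, 6, 9).
# The ternary operator in the expression part then labels each kept element.
labeled = ['odd' if i % 2 == 1 else 'even' for i in range(10) if i % 3 == 0]
print(labeled)
# ['even', 'odd', 'even', 'odd']
```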

Combination with zip() and enumerate()

Useful functions that are often used in the for statement include zip(), which combines multiple iterables, and enumerate(), which returns a value along with its index.

zip() and enumerate() can of course be used in list comprehension notation. No special syntax is involved, and it is not difficult if you consider the correspondence with the for statement.

Example of zip().

l_str1 = ['a', 'b', 'c']
l_str2 = ['x', 'y', 'z']

l_zip = [(s1, s2) for s1, s2 in zip(l_str1, l_str2)]
print(l_zip)
# [('a', 'x'), ('b', 'y'), ('c', 'z')]

# Equivalent for statement
l_zip = []
for s1, s2 in zip(l_str1, l_str2):
    l_zip.append((s1, s2))

print(l_zip)
# [('a', 'x'), ('b', 'y'), ('c', 'z')]

Example of enumerate().

l_enu = [(i, s) for i, s in enumerate(l_str1)]
print(l_enu)
# [(0, 'a'), (1, 'b'), (2, 'c')]

# Equivalent for statement
l_enu = []
for i, s in enumerate(l_str1):
    l_enu.append((i, s))

print(l_enu)
# [(0, 'a'), (1, 'b'), (2, 'c')]

The idea is the same as before when using if.

l_zip_if = [(s1, s2) for s1, s2 in zip(l_str1, l_str2) if s1 != 'b']
print(l_zip_if)
# [('a', 'x'), ('c', 'z')]

Each element can also be used to calculate a new element.

l_int1 = [1, 2, 3]
l_int2 = [10, 20, 30]

l_sub = [i2 - i1 for i1, i2 in zip(l_int1, l_int2)]
print(l_sub)
# [9, 18, 27]
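
zip() and enumerate() can also be combined in one comprehension. The following sketch uses made-up lists names and scores; note the parentheses needed to unpack the tuple that zip() produces, and that zip() stops at the shortest iterable.

```python
names = ['a', 'b', 'c']
scores = [10, 20, 30]

# enumerate() yields (index, item); each item is a (name, score) tuple from zip(),
# so the inner tuple is unpacked with parentheses.
rows = [(i, n, s) for i, (n, s) in enumerate(zip(names, scores))]
print(rows)
# [(0, 'a', 10), (1, 'b', 20), (2, 'c', 30)]

# zip() stops at the shortest iterable, so the extra element 'd' is dropped.
pairs = [(n, s) for n, s in zip(names + ['d'], scores)]
print(pairs)
# [('a', 10), ('b', 20), ('c', 30)]
```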

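Nested list comprehension notation

Just as for loops can be nested, list comprehension notation can contain multiple for clauses. They are written in the same order as the equivalent nested for statements, outermost first.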

[expression for variable_name_1 in iterable_1
    for variable_name_2 in iterable_2
        for variable_name_3 in iterable_3 ... ]

Line breaks and indentation are added here for readability but are not required by the grammar; the whole expression can be written on a single line.

An example is given along with an equivalent for statement.

matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat = [x for row in matrix for x in row]
print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Equivalent for statement
flat = []
for row in matrix:
    for x in row:
        flat.append(x)

print(flat)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]

The expression can also use more than one of the loop variables.

cells = [(row, col) for row in range(3) for col in range(2)]
print(cells)
# [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]

You can also do conditional branching.

cells = [(row, col) for row in range(3)
         for col in range(2) if col == row]
print(cells)
# [(0, 0), (1, 1)]

An if condition can also be attached to each for clause separately.

cells = [(row, col) for row in range(3) if row % 2 == 0
         for col in range(2) if col % 2 == 0]
print(cells)
# [(0, 0), (2, 0)]
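
Multiple for clauses always produce a single flat list. To build a list of lists instead, a comprehension can be placed in the expression part of another comprehension. A minimal sketch that transposes a 3x3 matrix (the name transposed is made up for this illustration):

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# The inner comprehension builds one new row per column index i;
# the outer comprehension collects those rows into a list of lists.
transposed = [[row[i] for row in matrix] for i in range(3)]
print(transposed)
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```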

Set comprehension notation (Set comprehensions)

Changing square brackets [] in the list comprehension notation to curly brackets {} creates a set (set-type object).

{expression for variable_name in iterable}

s = {i**2 for i in range(5)}

print(s)
# {0, 1, 4, 9, 16}
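
Because a set holds each value only once, a set comprehension also removes duplicates. A small sketch with a made-up word list; sorted() is used for printing because a set has no guaranteed order.

```python
words = ['apple', 'kiwi', 'plum', 'grape']

# Four words, but only two distinct lengths, so the set has two elements.
lengths = {len(w) for w in words}
print(sorted(lengths))
# [4, 5]
```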

Dictionary comprehension notation (Dict comprehensions)

Dictionaries (dict type objects) can also be generated with comprehension notation.

Enclose the expression in curly brackets {} and specify the key and value in the expression part as key: value.

{key: value for variable_name in iterable}

Any expression can be specified for key and value.

l = ['Alice', 'Bob', 'Charlie']

d = {s: len(s) for s in l}
print(d)
# {'Alice': 5, 'Bob': 3, 'Charlie': 7}

To create a new dictionary from a list of keys and a list of values, use the zip() function.

keys = ['k1', 'k2', 'k3']
values = [1, 2, 3]

d = {k: v for k, v in zip(keys, values)}
print(d)
# {'k1': 1, 'k2': 2, 'k3': 3}
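
When the pairs are used unchanged, passing zip() directly to dict() gives the same result; the comprehension is the better fit when keys or values need to be filtered or transformed. A minimal sketch (the names d_zip and d_big and the condition are made up for this illustration):

```python
keys = ['k1', 'k2', 'k3']
values = [1, 2, 3]

# No transformation needed: dict() accepts an iterable of (key, value) pairs.
d_zip = dict(zip(keys, values))
print(d_zip)
# {'k1': 1, 'k2': 2, 'k3': 3}

# Filtering and transforming: keep values of 2 or more and multiply them by 10.
d_big = {k: v * 10 for k, v in zip(keys, values) if v >= 2}
print(d_big)
# {'k2': 20, 'k3': 30}
```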

Generator expressions

If the square brackets [] of the list comprehension notation are replaced with round brackets (), a generator is returned rather than a tuple. This is called a generator expression.

Example of list comprehension notation.

l = [i**2 for i in range(5)]

print(l)
# [0, 1, 4, 9, 16]

print(type(l))
# <class 'list'>

Example of a generator expression. If you print() the generator as it is, its contents are not printed, but you can retrieve them by iterating over it with a for statement.

g = (i**2 for i in range(5))

print(g)
# <generator object <genexpr> at 0x10af944f8>

print(type(g))
# <class 'generator'>

for i in g:
    print(i)
# 0
# 1
# 4
# 9
# 16
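
One point to keep in mind is that a generator can be iterated over only once. The following sketch (the names g_once, first_pass, and second_pass are made up for this illustration) shows that a second pass yields nothing.

```python
g_once = (i**2 for i in range(5))

# The first pass consumes every element of the generator.
first_pass = list(g_once)
# The generator is now exhausted, so a second pass produces an empty list.
second_pass = list(g_once)

print(first_pass)
# [0, 1, 4, 9, 16]
print(second_pass)
# []
```

If the values are needed more than once, a list comprehension is the simpler choice.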

Like list comprehension notation, generator expressions support conditional branching with if and nesting.

g_cells = ((row, col) for row in range(0, 3)
           for col in range(0, 2) if col == row)

print(type(g_cells))
# <class 'generator'>

for i in g_cells:
    print(i)
# (0, 0)
# (1, 1)

For example, suppose a list with a large number of elements is generated and then looped through with a for statement. With list comprehension notation, the list containing all the elements is built before the loop starts. With a generator expression, the elements are generated one by one as the loop proceeds, which reduces the amount of memory used.
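
The difference can be observed roughly with sys.getsizeof(). This is only a sketch: getsizeof() reports the size of the container object itself (not the integers it refers to), and the exact byte counts depend on the Python version, so only the comparison is shown. The element count n is an arbitrary choice.

```python
import sys

n = 100_000  # arbitrary size chosen for this illustration

big_list = [i**2 for i in range(n)]   # all n elements exist in memory now
big_gen = (i**2 for i in range(n))    # nothing is computed until iteration

# The list object grows with n; the generator object stays small and constant.
list_is_bigger = sys.getsizeof(big_list) > sys.getsizeof(big_gen)
print(list_is_bigger)
# True
```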

If the generator expression is the only argument of the function, the round brackets () can be omitted.

print(sum([i**2 for i in range(5)]))
# 30

print(sum((i**2 for i in range(5))))
# 30

print(sum(i**2 for i in range(5)))
# 30

As for processing speed, the list comprehension notation is often faster than the generator notation when all elements are processed.

However, all() and any() stop as soon as the result is determined (all() at the first false value, any() at the first true value), so a generator expression may be faster than list comprehension notation in such cases.
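
The early exit can be made visible by recording which elements are actually evaluated. In this sketch, is_negative() and the sample data are hypothetical helpers written for the illustration.

```python
def is_negative(x, seen):
    """Record x in seen, then report whether it is negative."""
    seen.append(x)
    return x < 0

data = [1, -2, 3, 4, 5]

# List comprehension: every element is evaluated before any() starts.
seen_list = []
result_list = any([is_negative(x, seen_list) for x in data])

# Generator expression: any() stops at the first true value (-2).
seen_gen = []
result_gen = any(is_negative(x, seen_gen) for x in data)

print(result_list, len(seen_list))
# True 5
print(result_gen, len(seen_gen))
# True 2
```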

There is no tuple comprehension notation, but if you use a generator expression as an argument of tuple(), you can generate a tuple in the comprehension notation.

t = tuple(i**2 for i in range(5))

print(t)
# (0, 1, 4, 9, 16)

print(type(t))
# <class 'tuple'>