Determining if a list (array) has duplicate elements in Python

This article describes how to determine whether a list (array) in Python contains duplicate elements, that is, whether all of its elements are unique, in each of the following cases.

  • A list whose elements are not lists
  • A list whose elements are lists (two-dimensional arrays, lists of lists, etc.)

Removing or extracting duplicate elements from a list is covered in a separate article.

Note that lists can store elements of different types and are strictly different from arrays. If you need arrays for processing that depends on memory size or memory addresses, or for numerical processing of large data, use array (standard library) or NumPy.
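As a rough illustration of that difference, the sketch below uses the standard library array module, which stores values of a single type. The variable names are illustrative only and do not come from the article.

```python
from array import array

# A list can mix element types freely.
mixed = [0, "one", 2.0]

# An array stores values of one type only ('i' means signed int).
a = array("i", [0, 1, 1, 2])

# Appending a value of the wrong type raises TypeError.
try:
    a.append("three")
    wrong_type_rejected = False
except TypeError:
    wrong_type_rejected = True

print(wrong_type_rejected)
# True
```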

Determining whether a list has duplicate elements (when no element is a list)

If the list contains no mutable (updatable) objects such as lists, use set(), the constructor of the set type.

The set type is a data type that holds no duplicate elements. When a list is passed to the constructor set(), duplicate values are ignored and a set object containing only the unique values is returned.
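A minimal sketch of this behavior follows. The variable name unique is chosen here for illustration, and the result is sorted only to make the printed output stable, since sets are unordered.

```python
l = [0, 1, 1, 2]
unique = set(l)

# Sets are unordered, so sort before printing for a stable display.
print(sorted(unique))
# [0, 1, 2]

# The duplicate 1 was dropped, so the set is shorter than the list.
print(len(l), len(unique))
# 4 3
```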

The number of elements in this set object and in the original list are obtained with the built-in function len() and compared.

  • If the number of elements is equal, the original list contains no duplicate elements
  • If the number of elements differs, the original list contains duplicate elements

A function that returns False if there are no duplicate elements and True if there are is as follows.

def has_duplicates(seq):
    return len(seq) != len(set(seq))

l = [0, 1, 2]
print(has_duplicates(l))
# False

l = [0, 1, 1, 2]
print(has_duplicates(l))
# True

The example uses a list, but the same function also works with tuples.
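For instance, calling the function with tuples might look like the following. The function is repeated here only so that the snippet runs on its own.

```python
def has_duplicates(seq):
    return len(seq) != len(set(seq))

t = (0, 1, 2)
print(has_duplicates(t))
# False

t = (0, 1, 1, 2)
print(has_duplicates(t))
# True
```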

Mutable (updatable) objects such as lists cannot be elements of a set. Therefore, passing a list whose elements are lists (two-dimensional arrays, lists of lists, etc.) raises a TypeError. A way to handle this case is shown below.

l_2d = [[0, 1], [1, 1], [0, 1], [1, 0]]
# print(has_duplicates(l_2d))
# TypeError: unhashable type: 'list'
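One possible workaround, separate from the approach described in the next section, is to convert each inner list to a tuple, because tuples are hashable and can be set elements. The name has_duplicates_hashable is hypothetical, and the sketch assumes that every inner list contains only hashable values.

```python
def has_duplicates_hashable(seq):
    # Hypothetical helper: tuples are hashable, so they can be set elements.
    # Assumes every inner list contains only hashable values.
    return len(seq) != len(set(map(tuple, seq)))

l_2d = [[0, 1], [1, 1], [0, 1], [1, 0]]
print(has_duplicates_hashable(l_2d))
# True

l_2d = [[0, 0], [0, 1], [1, 1], [1, 0]]
print(has_duplicates_hashable(l_2d))
# False
```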

Determining whether a list has duplicate elements (when elements are lists)

For a list whose elements are lists (such as a list of lists), the following function can be used to determine whether there are duplicate elements.

def has_duplicates2(seq):
    seen = []
    unique_list = [x for x in seq if x not in seen and not seen.append(x)]
    return len(seq) != len(unique_list)

l_2d = [[0, 0], [0, 1], [1, 1], [1, 0]]
print(has_duplicates2(l_2d))
# False

l_2d = [[0, 0], [0, 1], [1, 1], [1, 1]]
print(has_duplicates2(l_2d))
# True

Instead of set(), a list comprehension builds a list containing only the unique values, and the element counts are compared. The details of this technique are covered in a separate article.
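The comprehension relies on list.append() returning None: the condition x not in seen is checked first, and not seen.append(x) is then always True while adding x to seen as a side effect. The same check can be written as an explicit loop, sketched below under the hypothetical name has_duplicates2_loop. Unlike the comprehension, this version returns as soon as the first duplicate is found.

```python
def has_duplicates2_loop(seq):
    # Hypothetical name; intended to give the same result as has_duplicates2.
    seen = []
    for x in seq:
        if x in seen:
            # x has appeared before, so a duplicate exists.
            return True
        seen.append(x)
    return False

l_2d = [[0, 0], [0, 1], [1, 1], [1, 0]]
print(has_duplicates2_loop(l_2d))
# False

l_2d = [[0, 0], [0, 1], [1, 1], [1, 1]]
print(has_duplicates2_loop(l_2d))
# True
```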

This function also works for lists whose elements are not lists.

l = [0, 1, 2]
print(has_duplicates2(l))
# False

l = [0, 1, 1, 2]
print(has_duplicates2(l))
# True

The examples so far determine whether any element list is duplicated (that is, whether the same list appears more than once).

Whether the values inside the element lists overlap can be determined after flattening the original list to one dimension.

l_2d = [[0, 1], [2, 3]]
print(sum(l_2d, []))
# [0, 1, 2, 3]

print(has_duplicates(sum(l_2d, [])))
# False

l_2d = [[0, 1], [2, 0]]
print(has_duplicates(sum(l_2d, [])))
# True

Here, sum() is used to flatten the list, but itertools.chain.from_iterable() can also be used. To flatten a list with three or more dimensions, a new function must be defined.
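Both options can be sketched as follows. itertools.chain.from_iterable() is part of the standard library, while flatten is a hypothetical helper written for this illustration; it assumes that nesting is done with lists only and treats every other value as a leaf.

```python
import itertools

# Flatten one level with itertools.chain.from_iterable().
l_2d = [[0, 1], [2, 0]]
flat = list(itertools.chain.from_iterable(l_2d))
print(flat)
# [0, 1, 2, 0]

# Same length comparison as has_duplicates().
print(len(flat) != len(set(flat)))
# True

def flatten(nested):
    # Hypothetical helper: recursively yields every non-list value.
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item

l_3d = [[[0, 1], [2]], [[3], [4, 0]]]
print(list(flatten(l_3d)))
# [0, 1, 2, 3, 4, 0]
```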