When splitting a comma-separated string into a list in Python, split() alone is enough if there are no spaces after the commas. If there are spaces, combine it with strip() to remove them. A list comprehension is a concise way to write this.
In this section, we first explain the following.
- split(): Split a string with a specified delimiter and return it as a list.
- strip(): Remove extra characters from the beginning and end of a string.
- List comprehension notation: Apply functions and methods to list elements.
We then show how to turn a string separated by commas and spaces, such as 'one, two, three', into a list with the spaces removed.
In addition, we discuss the following:
- How to get it as a list of numbers
- How to use join() to join a list and make it a string again
split(): Split a string with a specified delimiter and return it as a list
Using the string method split(), you can split a string with a specified delimiter and get the result as a list (array). The delimiter is specified with the argument sep.
If the argument sep is omitted, the string is split on whitespace and the result is returned as a list. Whitespace here includes spaces, tabs, and newlines, and consecutive whitespace characters are treated as a single separator, so you can also use split() without an argument to make a list from a tab-delimited string.
    s = 'one two three'
    l = s.split()
    print(l)
    # ['one', 'two', 'three']

    s = 'one two        three'
    l = s.split()
    print(l)
    # ['one', 'two', 'three']

    s = 'one\ttwo\tthree'
    l = s.split()
    print(l)
    # ['one', 'two', 'three']
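The difference between omitting sep and passing a single space explicitly is easy to miss. The following is a minimal sketch of that difference; the sample string and variable names are only illustrative.

```python
# Illustrative sample: two spaces between 'one' and 'two', then a tab.
s = 'one  two\tthree'

# No argument: any run of whitespace (spaces, tabs, newlines) is one separator.
by_whitespace = s.split()
print(by_whitespace)
# ['one', 'two', 'three']

# Explicit single space: every space is a separator, so the doubled space
# produces an empty string, and the tab is not treated as a separator.
by_space = s.split(' ')
print(by_space)
# ['one', '', 'two\tthree']
```

For whitespace-separated input, split() without an argument is therefore usually the more forgiving choice.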
If a delimiter is specified in the sep argument, the string is split at each occurrence of that delimiter and the result is returned as a list.
    s = 'one::two::three'
    l = s.split('::')
    print(l)
    # ['one', 'two', 'three']
For a comma-separated string with no extra whitespace, split(',') works without problems. However, if the string is separated by a comma plus whitespace, splitting on a comma alone leaves the whitespace at the beginning of each element.
    s = 'one,two,three'
    l = s.split(',')
    print(l)
    # ['one', 'two', 'three']

    s = 'one, two, three'
    l = s.split(',')
    print(l)
    # ['one', ' two', ' three']
You can also use a comma followed by a space (', ') as the delimiter, but this does not work if the number of spaces in the original string varies.
    s = 'one, two, three'
    l = s.split(', ')
    print(l)
    # ['one', 'two', 'three']

    s = 'one, two,  three'
    l = s.split(', ')
    print(l)
    # ['one', 'two', ' three']
The string method strip(), explained next, can handle any number of spaces, including the two spaces in the example above.
strip(): Remove extra characters from the beginning and end of a string.
strip() is a method to remove extra characters from the beginning and end of a string.
If the argument is omitted, a new string is returned with whitespace characters removed. The original string itself is not changed.
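    s = '  one  '
    print(s.strip())
    # one
    print(s)
    #   one   (the original string still has its surrounding spaces)
If a string is specified as an argument, it is treated as a set of characters, and any of those characters are removed from both ends of the string.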
    s = '-+-one-+-'
    print(s.strip('-+'))
    # one
In this case, spaces are not removed. If you want to remove whitespace as well, pass a string that includes a space as the argument, such as '-+ '.
    s = '-+- one -+-'
    print(s.strip('-+'))
    #  one  (the spaces around 'one' remain)

    s = '-+- one -+-'
    print(s.strip('-+ '))
    # one
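Because the argument is a set of characters rather than a prefix or suffix, the order of the characters does not matter, and matching characters inside the string are left alone. A small sketch with an illustrative sample string:

```python
s = '-+-one-two+-+'

# '+-' means "remove any '+' or '-' from both ends", in whatever order
# they appear. Stripping stops at the first character not in the set.
stripped = s.strip('+-')
print(stripped)
# one-two
```

The hyphen between 'one' and 'two' survives because strip() only works on the ends of the string.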
strip() handles both ends, but the following methods are also available.
- lstrip(): Process only the beginning.
- rstrip(): Process only the end.
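As a brief illustration of the one-sided variants, here is a minimal sketch; the sample strings are only illustrative, and repr() is used so the remaining spaces are visible.

```python
s = '  one  '

left_stripped = s.lstrip()    # removes leading whitespace only
right_stripped = s.rstrip()   # removes trailing whitespace only

print(repr(left_stripped))
# 'one  '
print(repr(right_stripped))
# '  one'

# Like strip(), both accept a string of characters to remove.
left_custom = '-+-one-+-'.lstrip('-+')
print(left_custom)
# one-+-
```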
List comprehension notation: apply functions and methods to list elements
If you want to apply a function or method to every element of a list and get a new list as the result, a list comprehension is more concise than a for loop.
Here, we apply strip() to each element of the list obtained by splitting the string with split(). This removes the extra whitespace from a comma-separated string and produces a clean list.
    s = 'one, two, three'
    l = [x.strip() for x in s.split(',')]
    print(l)
    # ['one', 'two', 'three']
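For comparison, the sketch below writes the same operation as an explicit for loop next to the list comprehension. The variable names l_loop and l_comp are only illustrative.

```python
s = 'one, two, three'

# Explicit loop: build the list step by step.
l_loop = []
for x in s.split(','):
    l_loop.append(x.strip())

# List comprehension: the same operation as a single expression.
l_comp = [x.strip() for x in s.split(',')]

print(l_loop == l_comp)
# True
print(l_comp)
# ['one', 'two', 'three']
```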
When this is applied to an empty string, the result is a list with a single empty string as its only element.
    s = ''
    l = [x.strip() for x in s.split(',')]
    print(l)
    print(len(l))
    # ['']
    # 1
If you want to get an empty list for an empty string, you can add a condition to the list comprehension.
    s = ''
    l = [x.strip() for x in s.split(',') if s != '']
    print(l)
    print(len(l))
    # []
    # 0
Also, if an element is missing between commas, as in 'one, , three', the first method produces an empty string element for it.
    s = 'one, , three'
    l = [x.strip() for x in s.split(',')]
    print(l)
    print(len(l))
    # ['one', '', 'three']
    # 3
If you want to ignore the missing elements, you can add a condition to the list comprehension.
    s = 'one, ,three'
    l = [x.strip() for x in s.split(',') if x.strip() != '']
    print(l)
    print(len(l))
    # ['one', 'three']
    # 2
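The pieces above can be combined into one small reusable function. This is only a sketch: split_and_strip is a hypothetical helper name, not a built-in, and it relies on the fact that an empty string is falsy in Python.

```python
def split_and_strip(s, sep=','):
    """Split s on sep, strip each piece, and drop empty pieces.

    Hypothetical helper for illustration; not part of the standard library.
    """
    # An empty string is falsy, so `if x.strip()` filters out pieces
    # that are empty or contain only whitespace.
    return [x.strip() for x in s.split(sep) if x.strip()]


print(split_and_strip('one, , three'))
# ['one', 'three']
print(split_and_strip(''))
# []
print(split_and_strip(' one ,two,, '))
# ['one', 'two']
```

With this filter, an empty input string yields an empty list without a separate check on s.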
Get as a list of numbers
If you want to convert a comma-separated string of numbers into a list of numbers rather than a list of strings, apply int() or float() to each element inside the list comprehension.
    s = '1, 2, 3, 4'
    l = [x.strip() for x in s.split(',')]
    print(l)
    print(type(l[0]))
    # ['1', '2', '3', '4']
    # <class 'str'>

    s = '1, 2, 3, 4'
    l = [int(x.strip()) for x in s.split(',')]
    print(l)
    print(type(l[0]))
    # [1, 2, 3, 4]
    # <class 'int'>
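The same pattern works with float(), and it is worth knowing what happens when an element is missing. The following is a minimal sketch with illustrative sample strings and variable names.

```python
s = '1.5, 2, 3.25'

# float() accepts both integer-like and decimal strings.
floats = [float(x.strip()) for x in s.split(',')]
print(floats)
# [1.5, 2.0, 3.25]

# An empty element cannot be converted: int('') raises ValueError.
s_missing = '1, , 3'
try:
    [int(x.strip()) for x in s_missing.split(',')]
except ValueError:
    print('conversion failed')
# conversion failed

# Skipping empty elements before converting avoids the error.
ints = [int(x.strip()) for x in s_missing.split(',') if x.strip()]
print(ints)
# [1, 3]
```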
join(): Merge a list and get it as a string
For the opposite operation, joining the elements of a list into a single string separated by a specific delimiter, use the join() method.
It is easy to make a mistake, but note that join() is a string method, not a list method. The list is specified as an argument.
    s = 'one, two, three'
    l = [x.strip() for x in s.split(',')]
    print(l)
    # ['one', 'two', 'three']

    print(','.join(l))
    # one,two,three

    print('::'.join(l))
    # one::two::three
You can write it in one line as follows.
    s = 'one, two, three'
    s_new = '-'.join([x.strip() for x in s.split(',')])
    print(s_new)
    # one-two-three
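Note that join() only accepts strings, which matters when the list holds numbers such as those produced with int() earlier. A short sketch, with illustrative variable names:

```python
nums = [1, 2, 3]

# join() only accepts strings; passing integers raises TypeError.
try:
    ','.join(nums)
except TypeError:
    print('join() needs strings')
# join() needs strings

# Convert each element with str() first.
s_joined = ', '.join([str(n) for n in nums])
print(s_joined)
# 1, 2, 3
```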
If you just want to change a fixed delimiter, it is easier to replace it with the replace() method.
    s = 'one,two,three'
    s_new = s.replace(',', '+')
    print(s_new)
    # one+two+three
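replace() only swaps the delimiter itself, so it is a good fit when the spacing is already consistent. The sketch below, using an illustrative string with irregular spacing, contrasts it with the split(), strip(), and join() combination.

```python
s = 'one, two,  three'

# replace() swaps the commas but leaves the spaces untouched.
replaced = s.replace(',', '+')
print(replaced)
# one+ two+  three

# split() + strip() + join() normalizes the spacing as well.
rejoined = '+'.join([x.strip() for x in s.split(',')])
print(rejoined)
# one+two+three
```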