python: convert "5,4,2,4,1,0" into [[5, 4], [2, 4], [1, 0]]

There are two important one line idioms in Python that help make this "straightforward".

The first idiom uses zip(). From the Python documentation:

The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using zip(*[iter(s)]*n).

So applying to your example:

>>> num_str = '5,4,2,4,1,0,3,0,5,1,3,3,14,32,3,5'
>>> zip(*[iter(num_str.split(","))]*2)
[('5', '4'), ('2', '4'), ('1', '0'), ('3', '0'), ('5', '1'), 
('3', '3'), ('14', '32'), ('3', '5')]

That produces tuples each of length 2.
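
To see why the idiom works, here is a minimal sketch. It assumes Python 3, where zip() returns a lazy iterator and so is wrapped in list(), and it uses a shortened input string purely for illustration. The list multiplication does not copy the iterator; it repeats a reference to the same one, so every tuple slot that zip() fills advances that single iterator:

```python
# Python 3 sketch; the short input string is only for illustration.
num_str = '5,4,2,4,1,0'

it = iter(num_str.split(','))   # one iterator over the split items
same = [it] * 2                 # two references to the *same* iterator
assert same[0] is same[1]

# zip() asks each argument for one item per output tuple. Since both
# arguments are the same iterator, each tuple consumes two consecutive items.
pairs = list(zip(*same))
print(pairs)  # [('5', '4'), ('2', '4'), ('1', '0')]
```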

If you want the length of the sub elements to be different:

>>> zip(*[iter(num_str.split(","))]*4)
[('5', '4', '2', '4'), ('1', '0', '3', '0'), ('5', '1', '3', '3'), 
('14', '32', '3', '5')]

The second idiom is list comprehensions. If you want sub elements to be lists, wrap in a comprehension:

>>> [list(t) for t in zip(*[iter(num_str.split(","))]*4)]
[['5', '4', '2', '4'], ['1', '0', '3', '0'], ['5', '1', '3', '3'], 
['14', '32', '3', '5']]
>>> [list(t) for t in zip(*[iter(num_str.split(","))]*2)]
[['5', '4'], ['2', '4'], ['1', '0'], ['3', '0'], ['5', '1'], ['3', '3'], 
['14', '32'], ['3', '5']]

Any sub element groups that are not complete will be truncated by zip(). So if the number of elements in your string is not a multiple of 2, for example, you will lose the last element.
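
A small sketch of that truncation, assuming Python 3 (hence the list() around zip()) and a made-up five-element string:

```python
# Five items: not a multiple of 2.
items = '5,4,2,4,1'.split(',')

pairs = list(zip(*[iter(items)] * 2))
print(pairs)  # [('5', '4'), ('2', '4')]  (the trailing '1' is silently dropped)
```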

If you want to keep sub elements that are not complete (i.e., if the number of elements in num_str is not a multiple of the sub element's length), use a slice idiom:

>>> l=num_str.split(',')
>>> [l[i:i+2] for i in range(0,len(l),2)]
[['5', '4'], ['2', '4'], ['1', '0'], ['3', '0'], ['5', '1'], 
['3', '3'], ['14', '32'], ['3', '5']]
>>> [l[i:i+7] for i in range(0,len(l),7)]
[['5', '4', '2', '4', '1', '0', '3'], ['0', '5', '1', '3', '3', '14', '32'], 
['3', '5']]
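
The slice idiom can be wrapped in a small helper. This is only a sketch: the name chunks is hypothetical, and the five-element input is chosen to show the short final group:

```python
def chunks(seq, n):
    """Split seq into lists of length n; the last one may be shorter."""
    # range() steps through the start index of every group.
    return [seq[i:i + n] for i in range(0, len(seq), n)]

parts = chunks('5,4,2,4,1'.split(','), 2)
print(parts)  # [['5', '4'], ['2', '4'], ['1']]
```

Unlike the zip() idiom, this needs a real sequence (something that supports len() and slicing), not an arbitrary iterator.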

If you want each element to be an int, you can apply that prior to the other transforms discussed here:

>>> nums=[int(x) for x in num_str.split(",")]
>>> zip(*[iter(nums)]*2)
# etc etc etc

As pointed out in the comments, with Python 2.4+, you can also replace the list comprehension with a Generator Expression by replacing the [ ] with ( ) as in:

 >>> nums=(int(x) for x in num_str.split(","))
 >>> zip(nums,nums)
 [(5, 4), (2, 4), (1, 0), (3, 0), (5, 1), (3, 3), (14, 32), (3, 5)]
 # or map(list,zip(nums,nums)) for the list of lists version...

If your string is long, and you know that you only need groups of 2 elements, this is more efficient, because no intermediate list of integers is built.


One option:

>>> num_str = '5,4,2,4,1,0,3,0,5,1,3,3,4,3,3,5'
>>> l = num_str.split(',')
>>> zip(l[::2], l[1::2])
[('5', '4'), ('2', '4'), ('1', '0'), ('3', '0'), ('5', '1'), ('3', '3'), ('4', '3'), ('3', '5')]

Reference: str.split(), zip(), General information about sequence types and slicing

If you actually want integers, you could convert the list to integers first using map:

>>> l = map(int, num_str.split(','))

Explanation:

split creates a list of the single elements. The trick is the slicing: the syntax is list[start:end:step]. l[::2] returns every second element starting from the first one (so the first, third, ...), whereas the second slice l[1::2] returns every second element starting from the second one (so the second, fourth, ...).
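
The two slices can be inspected separately. A short sketch, assuming Python 3 and a shortened input string; the names evens and odds are only illustrative:

```python
l = '5,4,2,4,1,0'.split(',')

evens = l[::2]    # items at indexes 0, 2, 4
odds = l[1::2]    # items at indexes 1, 3, 5
print(evens)      # ['5', '2', '1']
print(odds)       # ['4', '4', '0']

# Pair them up position by position, converting each tuple to a list.
pairs = [list(p) for p in zip(evens, odds)]
print(pairs)      # [['5', '4'], ['2', '4'], ['1', '0']]
```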

Update: If you really want lists, you could use map again on the result list (assuming the zipped result was stored in xy_list):

>>> xy_list = map(list, xy_list)

Note that @Johnsyweb's answer is probably faster as it seems to not do any unnecessary iterations. But the actual difference depends of course on the size of the list.


#!/usr/bin/env python

from itertools import izip

def pairwise(iterable):
    "s -> (s0,s1), (s2,s3), (s4, s5), ..."
    a = iter(iterable)
    return izip(a, a)

s = '5,4,2,4,1,0,3,0,5,1,3,3,4,3,3,5'
fields = s.split(',')
print [[int(x), int(y)] for x,y in pairwise(fields)]

Taken from @martineau's answer to my question, which I have found to be very fast.

Output:

[[5, 4], [2, 4], [1, 0], [3, 0], [5, 1], [3, 3], [4, 3], [3, 5]]
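
The script above is Python 2. On Python 3, itertools.izip no longer exists and the built-in zip() is already lazy, so a rough equivalent might look like the sketch below. The shortened input string is only for illustration. Note that this pairwise yields non-overlapping pairs, which is different from itertools.pairwise in newer Python versions:

```python
def pairwise(iterable):
    "s -> (s0,s1), (s2,s3), (s4, s5), ..."
    a = iter(iterable)
    # Both arguments are the same iterator, so zip consumes two items per pair.
    return zip(a, a)

s = '5,4,2,4,1,0'
result = [[int(x), int(y)] for x, y in pairwise(s.split(','))]
print(result)  # [[5, 4], [2, 4], [1, 0]]
```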

First, use split to make a list of numbers (as in all of the other answers).

num_list = num_str.split(",")

Then, convert to integers:

num_list = [int(i) for i in num_list]

Then, use the grouper recipe from the itertools documentation:

from itertools import izip_longest
def grouper(n, iterable, fillvalue=None):
   "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
   args = [iter(iterable)] * n
   return izip_longest(fillvalue=fillvalue, *args)

pair_list = grouper(2, num_list)

Of course, you can compress this into a single line if you're frugal:

pair_list = grouper(2, [int(i) for i in num_str.split(",")])
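
As a sketch of how the recipe behaves on Python 3 (where izip_longest is named zip_longest), here it is applied to a made-up odd-length input. The argument order follows the code above, and the tuples are converted to lists only to match the output format asked for in the question:

```python
from itertools import zip_longest

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

nums = [int(i) for i in '5,4,2,4,1'.split(',')]
pair_list = [list(g) for g in grouper(2, nums)]
print(pair_list)  # [[5, 4], [2, 4], [1, None]]
```

Unlike plain zip(), the incomplete last group is kept and padded with fillvalue rather than dropped.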

>>> num_str = '5,4,2,4,1,0,3,0,5,1,3,3,4,3,3,5'
>>> inums = iter([int(x) for x in num_str.split(',')])
>>> [[x, inums.next()] for x in inums]
[[5, 4], [2, 4], [1, 0], [3, 0], [5, 1], [3, 3], [4, 3], [3, 5]]
>>>