Split a string into evenly sized chunks
How would I be able to take a string like 'aaaaaaaaaaaaaaaaaaaaaaa' and split it into a tuple of length-4 chunks, like (aaaa, aaaa, aaaa)?
Use textwrap.wrap:
>>> import textwrap
>>> s = 'aaaaaaaaaaaaaaaaaaaaaaa'
>>> textwrap.wrap(s, 4)
['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaa']
Using a list comprehension or a generator expression:
>>> s = 'aaaaaaaaaaaaaaaaaaaaaaa'
>>> [s[i:i+4] for i in range(0, len(s), 4)]
['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaa']
>>> tuple(s[i:i+4] for i in range(0, len(s), 4))
('aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaa')
>>> s = 'a bcdefghi j'
>>> tuple(s[i:i+4] for i in range(0, len(s), 4))
('a bc', 'defg', 'hi j')
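If the slicing approach is needed in more than one place, it can be wrapped in a small function. This is only a sketch: the name chunk_string and the size check are my own additions, not part of the answer above.

```python
def chunk_string(s, size):
    """Return a tuple of consecutive slices of s, each at most `size` long.

    Hypothetical helper: the name and the ValueError check are assumptions.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    # Step through the string in strides of `size`; the last slice may be shorter.
    return tuple(s[i:i + size] for i in range(0, len(s), size))


chunks = chunk_string('aaaaaaaaaaaaaaaaaaaaaaa', 4)
print(chunks)
```

Because slicing never reads past the end of the string, the final chunk is simply shorter when len(s) is not a multiple of size.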
Another solution using regex:
>>> s = 'aaaaaaaaaaaaaaaaaaaaaaa'
>>> import re
>>> re.findall('[a-z]{4}', s)
['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa']
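Note that the pattern '[a-z]{4}' only matches lowercase letters and silently drops the trailing 'aaa'. If the remainder and other characters should be kept, one possible variation (a sketch, assuming Python 3) is to match between 1 and 4 of any character:

```python
import re

s = 'aaaaaaaaaaaaaaaaaaaaaaa'

# '.{1,4}' greedily matches up to 4 characters at a time, so the final
# match holds whatever is left over. re.DOTALL lets '.' match newlines too.
chunks = re.findall(r'.{1,4}', s, flags=re.DOTALL)
print(chunks)
```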
You could use the grouper recipe, zip(*[iter(s)]*4):
In [113]: s = 'aaaaaaaaaaaaaaaaaaaaaaa'
In [114]: [''.join(item) for item in zip(*[iter(s)]*4)]
Out[114]: ['aaaa', 'aaaa', 'aaaa', 'aaaa', 'aaaa']
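As the output shows, zip stops at the shortest input, so the incomplete final group ('aaa') is dropped. If that remainder matters, a possible variation (a sketch, assuming Python 3, where the function is named itertools.zip_longest) pads the last group with empty strings before joining:

```python
from itertools import zip_longest

s = 'aaaaaaaaaaaaaaaaaaaaaaa'

# The same iterator object is repeated 4 times, so each output tuple pulls
# 4 consecutive characters. fillvalue='' pads the short final group, and
# ''.join then makes the padding disappear.
groups = zip_longest(*[iter(s)] * 4, fillvalue='')
chunks = [''.join(group) for group in groups]
print(chunks)
```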
Note that textwrap.wrap may not split s into strings of length 4 if the string contains spaces:
In [43]: textwrap.wrap('I am a hat', 4)
Out[43]: ['I am', 'a', 'hat']
In this benchmark, the grouper recipe is faster than textwrap.wrap:
In [115]: import textwrap
In [116]: %timeit [''.join(item) for item in zip(*[iter(s)]*4)]
100000 loops, best of 3: 2.41 µs per loop
In [117]: %timeit textwrap.wrap(s, 4)
10000 loops, best of 3: 32.5 µs per loop
The grouper recipe also works with any iterable, while textwrap only works with strings.
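To illustrate the point about non-string input, here is a rough sketch that applies the same zip idiom to a generator. The wrapper name grouper and the sample data are mine, chosen only for illustration.

```python
def grouper(iterable, n):
    """Group items from any iterable into n-tuples.

    Hypothetical wrapper around the zip idiom; an incomplete final
    group is dropped, just as with the string example.
    """
    return zip(*[iter(iterable)] * n)


# A generator has no len() and cannot be sliced, but it can still be grouped.
squares = (x * x for x in range(10))
groups = list(grouper(squares, 4))
print(groups)
```

Here the last two squares (64 and 81) are discarded because they do not fill a complete group of 4.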