Python: split a string on at least 2 whitespaces

I would like to split a string only where there are two or more consecutive whitespace characters.

For example

str = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
print(str.split())

Results:

['10DEUTSCH', 'GGS', 'Neue', 'Heide', '25-27', 'Wahn-Heide', '-1', '-1']

I would like it to look like this:

['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']

In [4]: import re    
In [5]: text = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
In [7]: re.split(r'\s{2,}', text)
Out[7]: ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']

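Update (2021+), for pandas users.

The pandas string method `Series.str.split` (not Python's built-in `str.split`) accepts a regular expression to split on. See the pandas documentation for `Series.str.split` for details.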

import pandas as pd

row = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'
df = pd.DataFrame({'string' : row},index=[0])

print(df)
                                              string
0  10DEUTSCH        GGS Neue Heide 25-27     Wahn...

df1 = df['string'].str.split(r'\s{2,}', expand=True)
print(df1)

           0                     1           2   3   4
0  10DEUTSCH  GGS Neue Heide 25-27  Wahn-Heide  -1  -1

As has been pointed out, `str` is not a good name for your string because it shadows the built-in type, so using `words` instead:

output = [s.strip() for s in words.split('  ') if s]

The .split('  ') call (note the two spaces in the separator) gives you a list that includes empty strings and items with leading or trailing whitespace. The list comprehension iterates through that list, keeps only the non-empty items (if s), and .strip() removes any leading or trailing whitespace from each one.
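The steps above can be sketched on the example string from the question. This is only an illustration: the intermediate name `pieces` is not part of the original answer and is there just to separate the split step from the filtering step.

```python
words = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'

# Splitting on two spaces leaves empty strings (from long runs of spaces)
# and items that still carry a stray leading space (from odd-length runs).
pieces = words.split('  ')

# Keep only the non-empty items and trim the leftover whitespace.
output = [s.strip() for s in pieces if s]
print(output)
# ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']
```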

In [30]: strs='10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'

In [38]: filter(None, strs.split("  "))

Out[38]: ['10DEUTSCH', 'GGS Neue Heide 25-27', ' Wahn-Heide', ' -1', '-1']

In [32]: map(str.strip, filter(None, strs.split("  ")))

Out[32]: ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']

The session above is Python 2. In Python 3, filter and map return lazy iterators, so wrap the result in list() to get an actual list.
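A Python 3 version of the same idea might look like the sketch below. The names `non_empty` and `result` are illustrative only; the inner filter can stay lazy because the outer list() consumes it.

```python
strs = '10DEUTSCH        GGS Neue Heide 25-27     Wahn-Heide   -1      -1'

# filter(None, ...) drops the empty strings produced by splitting on two spaces.
non_empty = filter(None, strs.split('  '))

# map() is lazy in Python 3, so list() forces it to produce the final list.
result = list(map(str.strip, non_empty))
print(result)
# ['10DEUTSCH', 'GGS Neue Heide 25-27', 'Wahn-Heide', '-1', '-1']
```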
