Converting a loop with an assignment into a comprehension

Converting a loop into a comprehension is simple enough:

mylist = []
for word in ['Hello', 'world']:
    mylist.append(word.split('l')[0])

to

mylist = [word.split('l')[0] for word in ['Hello', 'world']]

But I'm not sure how to proceed when the loop body assigns an intermediate value to a name:

mylist = []
for word in ['Hello', 'world']:
    split_word = word.split('l')
    mylist.append(split_word[0]+split_word[1])

And the comprehension ends up looking like this:

mylist = [word.split('l')[0]+word.split('l')[1] for word in ['Hello', 'world']]

This calculates word.split('l') multiple times whereas the loop only calculates it once and saves a reference. I've tried the following:

mylist = [split_word[0]+split_word[1] for word in ['Hello', 'world'] with word.split('l') as split_word]

which fails because with doesn't work that way, and:

mylist = [split_word[0]+split_word[1] for word in ['Hello', 'world'] for split_word = word.split('l')]

which doesn't work either. I'm aware of unpacking via * and ** but I'm not sure where that would fit in here. Is it possible to turn these sorts of loops into comprehensions, hopefully in a tidy way?


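You can't directly translate that loop to a comprehension, at least not before Python 3.8. Comprehensions, being expressions, can only contain expressions, and assignment statements are statements. (The := assignment expression added in Python 3.8 is the one form of assignment that is allowed inside an expression.)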

However, that doesn't mean there are no options.

First, at the cost of calling split twice, you can just do this:

mylist = [word.split('l')[0]+word.split('l')[1] for word in ['Hello', 'world']]

But you don't want to call split twice.


The most general way around that is to use a chain of generator expressions (with one list comprehension at the end) to transform things:

words = (word.split('l') for word in ['Hello', 'world'])
mylist = [w[0]+w[1] for w in words]

If you really want to merge that all into one expression, you can:

mylist = [w[0]+w[1] for w in 
          (word.split('l') for word in ['Hello', 'world'])]

But unless you actually need it to be in an expression, it's probably more readable not to do that.
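A one-expression variant that stays close to the second attempt in the question is to bind the name with an extra for clause over a one-element list. This is only a sketch of that idiom: it works because `for ... in` is part of comprehension syntax, whereas `for ... =` is not.

```python
# Bind split_word once per word by iterating over a one-element list.
mylist = [split_word[0] + split_word[1]
          for word in ['Hello', 'world']
          for split_word in [word.split('l')]]
print(mylist)  # ['He', 'word']
```

The inner loop runs exactly once per word, so split is called only once per word. Whether this reads better than the nested generator expression is a matter of taste.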


A more specific way in this case is to replace the w[0]+w[1] with something equivalent that doesn't need to reference w twice:

mylist = [''.join(word.split('l')[:2]) for word in ['Hello', 'world']]

And you can always generalize this one, too. You can turn any expression into a function, which means you can avoid evaluating a subexpression more than once by passing it as an argument to that function: it is evaluated once at the call and is then available under the parameter name. If there isn't a function that does what you want, write it:

def join_up(split_word):
    return split_word[0]+split_word[1]

mylist = [join_up(word.split('l')) for word in ['Hello', 'world']]

If you need to make that all into one expression without repeating any work, it may not be pretty:

mylist = [(lambda split_word: split_word[0]+split_word[1])(word.split('l')) 
          for word in ['Hello', 'world']]

But ultimately, unless I already had a function lying around that did what I needed, I'd use the chain-of-generator-expressions solution.

Or, of course, just keep it in an explicit loop; there's nothing wrong with for loops, and if the intermediate temporary variable makes your code clearer, there's no better way to do that than with an assignment statement.
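If you can assume Python 3.8 or later (this is a syntax error on earlier versions), an assignment expression can do the binding inside the comprehension itself. A minimal sketch:

```python
# Requires Python 3.8+: := binds split_word as part of an expression.
# The left operand of + is evaluated first, so split_word is already
# bound by the time split_word[1] is evaluated.
mylist = [(split_word := word.split('l'))[0] + split_word[1]
          for word in ['Hello', 'world']]
print(mylist)  # ['He', 'word']
```

Note that a name bound with := inside a comprehension lives in the enclosing scope, so split_word is still defined after the comprehension finishes.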


Be creative.

mylist = [''.join(word.split('l')[:2]) for word in ['Hello', 'world']]

...

import operator
mylist = [''.join(operator.itemgetter(0, 1)(word.split('l')))
          for word in ['Hello', 'world']]
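To unpack what the itemgetter version is doing, here is a small sketch; the names first_two, parts, pair, short and sliced are only illustrative. It also shows one way the two variants differ: when the split yields fewer than two pieces, the slice quietly returns what it has, while itemgetter raises IndexError, as the original loop would.

```python
from operator import itemgetter

first_two = itemgetter(0, 1)   # callable returning (seq[0], seq[1])
parts = 'Hello'.split('l')     # ['He', '', 'o']
pair = first_two(parts)        # ('He', '')
joined = ''.join(pair)         # 'He'

# With fewer than two pieces, the slice still works but itemgetter raises.
short = 'abc'.split('l')       # ['abc']
sliced = ''.join(short[:2])    # 'abc'
try:
    first_two(short)
    raised = False
except IndexError:
    raised = True
```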