How to frame two for loops in list comprehension python
I have two lists as below
tags = [u'man', u'you', u'are', u'awesome']
entries = [[u'man', u'thats'],[ u'right',u'awesome']]
I want to extract entries from entries
when they are in tags
:
result = []
for tag in tags:
for entry in entries:
if tag in entry:
result.extend(entry)
How can I write the two loops as a single line list comprehension?
The best way to remember this is that the order of for loop inside the list comprehension is based on the order in which they appear in traditional loop approach. Outer most loop comes first, and then the inner loops subsequently.
So, the equivalent list comprehension would be:
[entry for tag in tags for entry in entries if tag in entry]
In general, if-else
statement comes before the first for loop, and if you have just an if
statement, it will come at the end. For e.g, if you would like to add an empty list, if tag
is not in entry, you would do it like this:
[entry if tag in entry else [] for tag in tags for entry in entries]
This should do it:
[entry for tag in tags for entry in entries if tag in entry]
The appropriate LC would be
[entry for tag in tags for entry in entries if tag in entry]
The order of the loops in the LC is similar to the ones in nested loops, the if statements go to the end and the conditional expressions go in the beginning, something like
[a if a else b for a in sequence]
See the Demo -
>>> tags = [u'man', u'you', u'are', u'awesome']
>>> entries = [[u'man', u'thats'],[ u'right',u'awesome']]
>>> [entry for tag in tags for entry in entries if tag in entry]
[[u'man', u'thats'], [u'right', u'awesome']]
>>> result = []
for tag in tags:
for entry in entries:
if tag in entry:
result.append(entry)
>>> result
[[u'man', u'thats'], [u'right', u'awesome']]
EDIT - Since, you need the result to be flattened, you could use a similar list comprehension and then flatten the results.
>>> result = [entry for tag in tags for entry in entries if tag in entry]
>>> from itertools import chain
>>> list(chain.from_iterable(result))
[u'man', u'thats', u'right', u'awesome']
Adding this together, you could just do
>>> list(chain.from_iterable(entry for tag in tags for entry in entries if tag in entry))
[u'man', u'thats', u'right', u'awesome']
You use a generator expression here instead of a list comprehension. (Perfectly matches the 79 character limit too (without the list
call))
In comprehension, the nested lists iteration should follow the same order than the equivalent imbricated for loops.
To understand, we will take a simple example from NLP. You want to create a list of all words from a list of sentences where each sentence is a list of words.
>>> list_of_sentences = [['The','cat','chases', 'the', 'mouse','.'],['The','dog','barks','.']]
>>> all_words = [word for sentence in list_of_sentences for word in sentence]
>>> all_words
['The', 'cat', 'chases', 'the', 'mouse', '.', 'The', 'dog', 'barks', '.']
To remove the repeated words, you can use a set {} instead of a list []
>>> all_unique_words = list({word for sentence in list_of_sentences for word in sentence}]
>>> all_unique_words
['.', 'dog', 'the', 'chase', 'barks', 'mouse', 'The', 'cat']
or apply list(set(all_words))
>>> all_unique_words = list(set(all_words))
['.', 'dog', 'the', 'chases', 'barks', 'mouse', 'The', 'cat']