Loop until list is not empty in Python
I'm working on a web scraper that has two classes. One fetches the data and the other processes it. The end result of the first class is a list of elements such as results = [1, 2, 3, 4, 5, ...]. The problem is that sometimes, due to a server-side error, the list can come out empty. How can I loop to restart the process until the list is not empty?
I kind of solved it like this, but I'm not sure whether it is efficient or good practice.
class DataScrapper:
    def __init__(self):
        ...

    def getData(self):
        self.results = []
        while not self.results:
            ...
        return self.results
Is this a Pythonic way of solving the problem? Is there a more efficient way? Thank you very much.
Solution 1:
Your idiom is simple and good for most cases.
You must, however, keep two things in mind:
- You don't cap the retries. If the server is down for a long time, your script will get stuck.
- You keep on generating requests even during downtimes. That can cause a large client and server load. I highly suggest using an exponential backoff strategy.
A quick Google search turned up the backoff library, which lets you do both:
import backoff

# Retry while getData() returns an empty list (at most 10 attempts).
@backoff.on_predicate(backoff.expo, lambda x: x == [], max_tries=10)
def getData(self):
    self.results = []
    ...
    return self.results
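It checks the return value, and if it's an empty list, runs the function again with increasing delays until you reach 10 tries. If every try comes back empty, the last (empty) result is returned, so the caller should still check for it.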
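If you would rather not add a dependency, the same two ideas (a retry cap and exponentially growing delays) can be sketched with the standard library alone. This is only a rough illustration: fetch_with_retries, its parameters, and the fake fetch function in the demo are hypothetical names, not part of the question's code or of the backoff library.

```python
import time


def fetch_with_retries(fetch, max_tries=10, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Call fetch() until it returns a non-empty list or max_tries is reached.

    fetch is any zero-argument callable returning a list (hypothetical
    stand-in for the scraping step). sleep is injectable so the function
    can be tested without actually waiting.
    """
    results = []
    for attempt in range(max_tries):
        results = fetch()
        if results:
            return results
        if attempt < max_tries - 1:
            # Wait 1s, 2s, 4s, ... (capped at max_delay) before the next try.
            sleep(min(base_delay * 2 ** attempt, max_delay))
    return results  # still empty: the caller decides how to handle it


if __name__ == "__main__":
    # Fake server that fails twice, then succeeds.
    responses = iter([[], [], [1, 2, 3]])
    delays = []
    print(fetch_with_retries(lambda: next(responses), sleep=delays.append))  # [1, 2, 3]
    print(delays)  # [1.0, 2.0]
```

Inside the class, getData could call something like fetch_with_retries(self._fetch_once), where _fetch_once is an assumed method that performs a single request and returns the list.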