Return empty list instead of TimeoutError if function times out
Solution 1:
You could wrap it in another decorator that catches the timeout exception and returns an empty list. If you don't want to write it on your own, you might want to use @ignore from funcy (https://funcy.readthedocs.io/en/stable/flow.html). Assuming you get a TimeoutError exception, your code should look like this:
import requests
import timeout_decorator
from bs4 import BeautifulSoup as bs
from funcy import ignore

# timeout_decorator raises its own TimeoutError class, not the built-in one,
# so that is the class @ignore has to catch
@ignore(timeout_decorator.TimeoutError, default=[])
@timeout_decorator.timeout(5, use_signals=False)
def get_soup(url):
    session = requests.Session()
    # set the User-Agent to that of a regular browser
    session.headers["User-Agent"] = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"
    # get the HTML content
    html = session.get(url).content
    # parse the HTML using Beautiful Soup
    soup = bs(html, "html.parser")
    return soup
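
If you would rather write the wrapping decorator yourself instead of pulling in funcy, a minimal sketch could look like the one below. The names default_on_error and default_factory are made up for this illustration, and the two small functions only simulate a timeout with the built-in TimeoutError so the example runs without any third-party packages. With timeout_decorator you would pass its own exception class instead.

```python
import functools


def default_on_error(exceptions, default_factory=list):
    """Hypothetical helper: return default_factory() instead of raising `exceptions`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except exceptions:
                # Build a fresh default on every call so callers never
                # share (and accidentally mutate) the same list object.
                return default_factory()
        return wrapper
    return decorator


@default_on_error(TimeoutError)
def always_times_out():
    # Stand-in for a function that is wrapped by a timeout decorator.
    raise TimeoutError("simulated timeout")


@default_on_error(TimeoutError)
def finishes_in_time():
    return ["ok"]
```

As with @ignore, this decorator has to be the outermost one, so that it sees the exception raised by the timeout decorator underneath it. Using a factory rather than a literal [] is a design choice to avoid handing the same list object to every caller.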