ReactorNotRestartable error in while loop with scrapy
Solution 1:
By default, CrawlerProcess's .start() will stop the Twisted reactor it creates when all crawlers have finished.
You should call process.start(stop_after_crawl=False) if you create process in each iteration.
Another option is to handle the Twisted reactor yourself and use CrawlerRunner. The docs have an example of doing that.
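The CrawlerRunner approach can be sketched roughly as follows. This is a minimal sketch rather than the example from the docs: the function name `run_until_item`, the parameters `max_attempts` and `delay`, and the spider name `my_spider` are assumptions for illustration. It also assumes the default Twisted reactor is compatible with the project's TWISTED_REACTOR setting. The retry loop lives inside the reactor, so the reactor is started and stopped exactly once.

```python
def run_until_item(spider_name="my_spider", max_attempts=5, delay=3.0):
    """Re-run a spider until it yields an item, using a single reactor run.

    All names and defaults here are illustrative assumptions.
    """
    # Imported locally so that importing this module does not install
    # a reactor or load the Scrapy project as a side effect.
    from twisted.internet import defer, reactor, task
    from scrapy import signals
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.project import get_project_settings

    results = []
    runner = CrawlerRunner(get_project_settings())

    def collect(item, **kwargs):
        # Called for every item_scraped signal of the connected crawler.
        results.append(item)

    @defer.inlineCallbacks
    def crawl_loop():
        try:
            for _ in range(max_attempts):
                crawler = runner.create_crawler(spider_name)
                crawler.signals.connect(collect, signal=signals.item_scraped)
                # Wait for this crawl to finish without stopping the reactor.
                yield runner.crawl(crawler)
                if results:
                    break
                # Non-blocking pause; time.sleep() would freeze the reactor.
                yield task.deferLater(reactor, delay, lambda: None)
        finally:
            # Stop the reactor once, after the last attempt.
            reactor.stop()

    # Start the loop only when the reactor is already running.
    reactor.callWhenRunning(crawl_loop)
    reactor.run()  # blocks until reactor.stop() is called above
    return results
```

Calling `run_until_item("my_spider")` from a script inside the Scrapy project would block until an item is scraped or the attempts run out. Because the reactor cannot be restarted, the function is meant to be called once per Python process.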
Solution 2:
I was able to solve this problem like this. process.start() should be called only once.
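from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings
# On Scrapy versions where scrapy.xlib is no longer available,
# use `from pydispatch import dispatcher` instead.
from scrapy.xlib.pydispatch import dispatcher

result = None

def set_result(item):
    # Without `global`, the assignment would only bind a local name.
    global result
    result = item

# Create the process and connect the signal handler once.
process = CrawlerProcess(get_project_settings())
dispatcher.connect(set_result, signals.item_scraped)
process.crawl('my_spider')
# Start the reactor exactly once, outside any loop; calling start()
# again after it returns raises ReactorNotRestartable.
process.start()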