Python Scrapy: Convert relative paths to absolute paths

Solution 1:

From the Scrapy docs:

def parse(self, response):
    # ... code omitted
    next_page = response.urljoin(next_page)
    yield scrapy.Request(next_page, self.parse)

That is, the response object has a method that does exactly this: response.urljoin() resolves a relative link against the URL of the page it was found on.

Solution 2:

What I do is:

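from urllib.parse import urljoin  # Python 2: from urlparse import urljoin
...

def parse(self, response):
    ...
    # strip() removes whitespace and newlines around the extracted href
    absolute_url = urljoin(response.url, extractedLink.strip())
    ...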

Note the strip(): I sometimes come across odd links whose href value is padded with whitespace and newlines, like:

<a href="
              /MID_BRAND_NEW!%c2%a0MID_70006_Google_Android_2.2_7%22%c2%a0Tablet_PC_Silver/a904326516.html
            ">MID BRAND NEW!&nbsp;MID 70006 Google Android 2.2 7"&nbsp;Tablet PC Silver</a>
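A minimal sketch of the strip-then-join step, using only the Python 3 standard library. The page URL and the shortened href are hypothetical stand-ins for a link like the one above.

```python
from urllib.parse import urljoin

# Hypothetical page URL and a whitespace-padded href (shortened).
page_url = "http://example.com/category/list.html"
raw_href = "\n              /item/a904326516.html\n            "

# Strip the padding first, then resolve against the page URL.
absolute_url = urljoin(page_url, raw_href.strip())
print(absolute_url)  # http://example.com/item/a904326516.html
```

Stripping explicitly keeps the result predictable, since how padded input is handled can differ between Python versions.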