How "legal" is site-scraping using cURL? [closed]

Recently I was experimenting with cURL and found that a lot is possible with it. I built a small script that crawls a music site that plays songs online. In the course of my experiment, I found that it is possible to crawl the song source as well (the site doesn't provide downloading).

I just want to know: is it fully LEGAL to crawl sites? I mean using HTTP on port 80.

There are lots of download managers available on the market that can download from almost any site. Are all of those valid and legal?


The exact answer to your question is yes. The only possible exceptions are cryptography laws in your country, if cURL was built with SSL support statically linked, or if you're exporting it from the US to one of a few countries considered hostile.

Scraping a site's publicly visible webpages is also legal, generally. If you download one copy each of all the pages you could see in your browser, you won't have any problems. If you start to cause problems for other users, it might be considered a denial-of-service attack. You may also need to check the site's terms and conditions, but since you already download a page in order to view it, there isn't much difference (it's a subtle technicality, at best).
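To make the "one copy each, without causing problems for other users" idea concrete, here is a minimal sketch in Python of how a crawler might pace itself. It is only an illustration under stated assumptions: the `PoliteScheduler` class, the `example-bot` user agent, the URLs, and the two-second delay are all hypothetical, and the robots.txt check is a common courtesy convention that this answer does not otherwise discuss. The sketch does no network I/O; it only decides whether and when a URL may be fetched.

```python
import time
import urllib.robotparser


def build_robot_rules(robots_txt):
    """Parse the text of a robots.txt file that was already downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteScheduler:
    """Hypothetical helper: fetch each URL at most once, at a limited rate."""

    def __init__(self, rules, user_agent, min_delay=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.rules = rules            # parsed robots.txt rules
        self.user_agent = user_agent  # assumed bot name, not a real one
        self.min_delay = min_delay    # seconds between two requests
        self._clock = clock
        self._sleep = sleep
        self._seen = set()
        self._last_request = None

    def may_fetch(self, url):
        """Return True if the caller should fetch `url` now.

        Skips URLs already fetched or disallowed by robots.txt, and
        blocks until at least `min_delay` seconds have passed since
        the previous approved request.
        """
        if url in self._seen:
            return False
        if not self.rules.can_fetch(self.user_agent, url):
            return False
        if self._last_request is not None:
            remaining = self.min_delay - (self._clock() - self._last_request)
            if remaining > 0:
                self._sleep(remaining)
        self._last_request = self._clock()
        self._seen.add(url)
        return True
```

A caller would wrap each download in `if scheduler.may_fetch(url): ...` and do the actual request there. The clock and sleep functions are injectable only so the pacing logic can be exercised without real waiting.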

Downloading music, however, is just that. It doesn't matter if you use Limewire, uTorrent, Megaupload, Flashget or cURL, you're still downloading music. That's legal if the artist/record label says it is, if you own a license, or generally if you're legally allowed to do so.

So, cURL is completely legal. But like anything else, what you do with it may not be.


IMDB.com explicitly forbids the use of scrapers like this with their site as part of their terms of service.


I can't comment on answers since I don't have rep here, but several answers have stated that it might not be legal depending on the website's terms of service. This is a subtle technicality, but if that's the case, then it's still legal; you could, however, be sued civilly for breach of contract or copyright violations (although copyright violations can be criminally illegal too). In general, just because a website's TOS says you can't do something doesn't mean they have the legal authority to bar you from doing it.


Generally it matters more what you do with it than how you acquire it. For example, you can copy a CD, but what did you do with that copy? Did you sell it to someone (illegal), or did you simply put it on a shelf on top of your old CD so you have a non-scratched copy (legal)? Likewise, even when you own the music outright, you still only have the right to copy it for your own use, not for others' use.

Here's a question. Generally, when it comes to the internet, if something is "published" by someone with the rights to publish it, and there are no stipulations that it isn't free to use (e.g. a TOS), then it is usually considered fair game to use it in a non-commercial manner. But what if the content isn't even part of the "visible" portion of the web page and requires source-scraping and folder surfing to access, even though it may be on a "public" network and accessible by non-secure means? Claiming a right to it is almost like claiming you can rob someone's house because they left their door open, which is a bit of a stretched analogy but valid to some extent. If there are no links to it on the page, it could be argued that the content was not "published" and therefore you never had the right to access it.

But that's probably much ado about nothing. If you aren't doing anything crazy or attempting to profit off of others' work, then usually no one cares if you source scrape.