wget does not recurse when piping the output to stdout [closed]

It does not seem possible to achieve my goal with current versions of wget.

After studying the source code for wget version 1.18, I came to these conclusions:

  • wget cannot recurse unless it stores the downloaded files, at least temporarily, as it does for --spider.

  • When passed -O filename, it keeps appending to filename and reparses the whole file after each download, loading it completely into memory (or mapping it). This is very cumbersome and inefficient.

  • When passed -O-, it writes the downloaded file to stdout and then attempts to reread "-" to look for more URLs to fetch, which causes stdin to be read for this purpose. This is a side effect of the implementation.
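
To illustrate the last point, here is a minimal Python sketch of how a "-" convention can lead to this behaviour. It is not wget's code: the names open_output and read_for_links are hypothetical, and the stdin parameter exists only so the example can be run without a terminal. The assumption is simply that the writer treats "-" as stdout while the reader treats "-" as stdin.

```python
import io
import sys


def open_output(name):
    """Return a binary stream for writing; "-" is taken to mean stdout."""
    if name == "-":
        return sys.stdout.buffer
    # Append mode, mirroring the "keeps appending to filename" behaviour.
    return open(name, "ab")


def read_for_links(name, stdin=None):
    """Return the bytes to scan for links; "-" is taken to mean stdin."""
    if name == "-":
        # The same name that meant stdout when writing now means stdin,
        # so the data just written is never seen again here.
        stream = stdin if stdin is not None else sys.stdin.buffer
        return stream.read()
    with open(name, "rb") as f:
        # The whole file is loaded, however much has been appended so far.
        return f.read()
```

With a regular file name, both functions refer to the same file, so the downloaded page can be reparsed. With "-", the write goes to stdout and the subsequent read comes from stdin, which is unrelated to what was just downloaded.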

I wrote a patch to add a more sensible piping option, relying on --spider to download HTML and CSS files for recursive operation and piping only these files before they are removed. I will publish the patch once it is reasonably tested and documented.