macOS Terminal - using wget and bash - ERROR "Argument list too long"
I'm using wget and bash to download a number of sequential files from a URL using a brace expansion like {1..####}, but I'm getting the error: Argument list too long

When I run
getconf ARG_MAX
it says 262144.

What is this limit in reference to? What command will increase the argument limit (or can I remove it, or set it to infinite)?
Solution 1:
ARG_MAX is the limit (in bytes of memory) on the combined size of the argument list plus the environment variables passed to an executable. See this previous question and this more detailed explanation.
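As a rough illustration (the exact accounting differs between systems), you can compare that budget against what your environment alone already uses:

getconf ARG_MAX   # total byte budget for the argument list plus the environment
env | wc -c       # approximate bytes already consumed by exported environment variables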
You can use xargs to split a list of arguments into small-enough-to-process groups, but depending on the form of the arguments (and whether they contain troublesome characters like whitespace, escapes, quotes, etc.) this can get complicated. One generally safe way to do it is to use printf '%s\0' to print the argument list with each item terminated by a null byte, then xargs -0 to consume the list:

printf '%s\0' https://example.com/prefix{1..100}.html | xargs -0 curl -O
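The same idea works with wget (the tool from the question); here is a sketch assuming the same example URL pattern, where -n 500 simply caps how many URLs each wget invocation receives:

printf '%s\0' 'https://example.com/prefix'{1..9999}'.html' | xargs -0 -n 500 wget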
Note that any arguments that should be passed to each invocation of the utility (like the -O in this example) must be included in the xargs invocation, not the printf arg list. Also, if there are arguments that need to be passed after the big list, you need a more complex invocation of xargs.
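For example (a sketch using cp, whose destination directory has to come last; the paths here are hypothetical), you can wrap the call in sh -c so the batched items land wherever "$@" appears:

printf '%s\0' ./downloads/prefix*.html |
  xargs -0 sh -c 'cp "$@" /some/backup/dir/' sh

The trailing sh becomes $0 inside the wrapper, and xargs appends each batch of file names as the positional parameters that "$@" expands to.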
Also, this may look like it shouldn't work because the huge argument list is still being passed to printf, but printf is a shell builtin, not a separate executable, so the expansion is handled inside bash itself and the limit doesn't apply.
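You can see the difference yourself (assuming /bin/echo exists as an external binary, as it does on macOS; the count needed to trip the limit depends on your ARG_MAX):

echo {1..200000} > /dev/null        # builtin echo: no exec(), so the limit never applies
/bin/echo {1..200000} > /dev/null   # external echo: fails with "Argument list too long"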
[BTW, I thought there must be a previous Q&A covering this, but I couldn't find one. If someone else finds a good one, please mark this question as a duplicate.]
Solution 2:
When you run
wget 'https://example.com/prefix'{1..9999}'.html'
the expansion of {1..9999} is done by the shell, resulting in an extremely long list of arguments (run echo foo{1..10} to see what happens).
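With a small range, the expansion looks like this:

$ echo foo{1..10}
foo1 foo2 foo3 foo4 foo5 foo6 foo7 foo8 foo9 foo10

With {1..9999}, the same thing happens, only with 9999 words handed to wget at once.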
Instead, you can just run
for i in {1..9999}; do
  wget 'https://example.com/prefix'${i}'.html'
done
or (as a one-liner)
for i in {1..9999}; do wget 'https://example.com/prefix'${i}'.html'; done
to have the shell handle the loop itself, instead of passing the whole expanded list to wget as arguments. The overall performance of the downloads is limited by the network anyway, so forking and executing 10,000 wget processes (instead of just one) doesn't have a noticeable impact.
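If you would rather avoid spawning one wget per file, a sketch of an alternative (assuming your wget supports reading a URL list from standard input via -i -) is to generate the URLs in the shell and feed them all to a single wget process:

for i in {1..9999}; do echo "https://example.com/prefix${i}.html"; done | wget -i -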
PS: Replace 9999 with whatever the highest number is, or use something like {1,7,9,15,22,36} for specific numbers.