Multiple domains into wget --accept-regex?
Solution 1:
Normally --accept-regex uses the POSIX Extended Regular Expression (ERE) syntax, where a single | separates alternative branches. (The same applies if you tell wget to use PCRE syntax with --regex-type pcre; PCRE is largely a superset of POSIX ERE and also uses | for alternation.)
Note that POSIX Extended regexp syntax (used by egrep or sed -E) is different from the POSIX Basic regexp syntax (used by grep or sed). For example, BRE uses \| for alternative branches (strictly speaking a GNU extension, since POSIX BRE defines no alternation) and | for a literal pipe symbol, but ERE does the exact opposite. The same goes for parentheses, braces, and several other special characters, which have to be backslash-prefixed in BRE to act as operators but not in ERE.
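To make the BRE/ERE difference concrete, here is a small sketch using grep instead of wget, so it runs offline. The sample host list (including example.com) is made up for illustration; the point is only that the same pattern behaves differently in the two syntaxes.

```shell
#!/bin/sh
# Sample input: one candidate host name per line (illustrative values).
hosts='de.wikipedia.org
upload.wikimedia.org
example.com'

# ERE (grep -E): a bare | separates alternatives, so both wiki hosts match.
ere_hits=$(printf '%s\n' "$hosts" | grep -c -E 'wikipedia|wikimedia')

# BRE (plain grep): a bare | is a literal pipe character, so no line matches.
bre_hits=$(printf '%s\n' "$hosts" | grep -c 'wikipedia|wikimedia')

echo "ERE matches: $ere_hits"
echo "BRE matches: $bre_hits"
```

Since --accept-regex defaults to the ERE flavour, the grep -E result is the one that reflects what wget would do with the pattern.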
In any case the regexp would look like this:
de.wikipedia.org|upload.wikimedia.org
(de.wikipedia|upload.wikimedia).org
More correct, since an unescaped dot matches any single character and a literal dot has to be escaped:
de\.wikipedia\.org|upload\.wikimedia\.org
(de\.wikipedia|upload\.wikimedia)\.org
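Why the escaping matters can be checked with grep -E, which uses the same ERE flavour. This is only a sketch: the look-alike string below is a made-up example of something an unescaped pattern would accept by accident.

```shell
#!/bin/sh
# Hypothetical look-alike string with other characters where the dots should be.
lookalike='deXwikipediaYorg'

loose='de.wikipedia.org|upload.wikimedia.org'
strict='de\.wikipedia\.org|upload\.wikimedia\.org'

# Unescaped dots match any character, so the look-alike slips through.
loose_hits=$(printf '%s\n' "$lookalike" | grep -c -E "$loose")

# Escaped dots only match a literal dot, so the look-alike is rejected.
strict_hits=$(printf '%s\n' "$lookalike" | grep -c -E "$strict")

# A genuine URL on the intended host still matches the strict pattern.
real_hits=$(printf '%s\n' 'https://de.wikipedia.org/wiki/Wget' | grep -c -E "$strict")

echo "loose: $loose_hits, strict: $strict_hits, real: $real_hits"
```

In practice the loose form rarely causes trouble for host names, but the strict form costs nothing and avoids surprises.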
Note that the | character is special in most interactive shells (it is the pipe operator), as are the parentheses, so any parameter containing them needs to be quoted:
wget --accept-regex "(de\.wikipedia|upload\.wikimedia)\.org"
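To see what the quoting buys you, the sketch below hands the pattern to a small helper instead of wget, so it runs without network access. show_args is a hypothetical helper, not part of wget; the commented wget line, its start URL, and the extra --recursive and --span-hosts options are assumptions about a typical crawl across both hosts.

```shell
#!/bin/sh
# Single quotes keep |, ( and ) away from the shell and leave the backslashes intact.
regex='(de\.wikipedia|upload\.wikimedia)\.org'

# show_args is a hypothetical stand-in for wget: it prints each argument it receives.
show_args() {
    printf '[%s]\n' "$@"
}

# The quoted pattern arrives as one single, unmodified argument.
received=$(show_args --accept-regex "$regex")
printf '%s\n' "$received"

# Assumed real-world use (start URL and extra options are illustrative only):
# wget --recursive --span-hosts --accept-regex "$regex" https://de.wikipedia.org/
```

Without the quotes the shell would try to parse the parentheses and the pipe itself, and wget would never see the full pattern.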