Marc Blase

Download CSS images with WGET

Sometimes you just don’t have the access you need when working on a site. While the thought of downloading all the included files individually sounds like fun, there’s always a better use of time. So I did some research, and the easiest way I found is WGET on the command line. This command downloads a page along with everything it needs to display properly, such as images and stylesheets:

wget --page-requisites http://site/path/page.html
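
If you want the copy to open properly offline, here is a minimal sketch that also rewrites links for local viewing and saves everything into a folder of its own. The function name grab_page and the folder name are placeholders I made up; the flags are standard WGET options.

```shell
# Sketch: fetch one page plus its requisites into a chosen folder.
# grab_page is a hypothetical helper name, not part of WGET.
grab_page() {
  # $1: page URL
  # $2: local directory to save into
  wget \
    --page-requisites \
    --convert-links \
    --directory-prefix="$2" \
    "$1"
}

# Example call, using the placeholder URL from above:
# grab_page "http://site/path/page.html" "page-copy"
```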

If you need to download a whole site, give this a go:

wget -m -p -E -k -K -np http://site/path/
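
Those short flags are hard to read at a glance, so here is the same command spelled out with long options, as a sketch. The wrapper name mirror_site is a placeholder of mine. Note that -E is called --adjust-extension in WGET 1.12 and later; older builds name it --html-extension.

```shell
# Sketch: long-option spelling of "wget -m -p -E -k -K -np URL".
# mirror_site is a hypothetical helper name, not part of WGET.
mirror_site() {
  # $1: URL of the site or directory to mirror
  wget \
    --mirror \
    --page-requisites \
    --adjust-extension \
    --convert-links \
    --backup-converted \
    --no-parent \
    "$1"
}

# Example call, using the placeholder URL from above:
# mirror_site "http://site/path/"
```

In order: --mirror turns on recursion with timestamping, --page-requisites pulls in images and stylesheets, --adjust-extension adds .html where needed, --convert-links rewrites links for local viewing, --backup-converted keeps the original of each converted file as .orig, and --no-parent stops WGET from climbing above the starting directory.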

Thanks to the Stack Overflow users for this solution.

UPDATE:

I ran into an issue trying to back up a WordPress site for a client whose original developer had taken it hostage and wouldn’t release it. They had turned on the “Discourage search engines” option in the admin, and WGET was failing because it obeys robots.txt during recursive downloads. So here’s a new method that gets around a robots.txt file containing Disallow: /, by telling WGET to ignore it with -e robots=off:

wget -e robots=off -k -K -E -r -l 10 -p -N -F --restrict-file-names=windows -nH http://DOMAIN.TLD/
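
For reference, here is a sketch of that command with long options. The wrapper name backup_site is a placeholder of mine. I left out -F (--force-html) in this version because, per the WGET manual, it only matters when URLs are read from a file with -i.

```shell
# Sketch: long-option spelling of the robots-off backup command.
# backup_site is a hypothetical helper name, not part of WGET.
backup_site() {
  # $1: URL of the site to back up
  wget \
    --execute robots=off \
    --recursive --level=10 \
    --page-requisites \
    --adjust-extension \
    --convert-links --backup-converted \
    --timestamping \
    --restrict-file-names=windows \
    --no-host-directories \
    "$1"
}

# Example call, using the placeholder domain from above:
# backup_site "http://DOMAIN.TLD/"
```

Here --level=10 caps how deep the recursion goes, --timestamping skips files that have not changed since the last run, --restrict-file-names=windows escapes characters that Windows does not allow in file names, and --no-host-directories saves files without a top-level folder named after the host.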

UPDATE:

I ran into an issue with a client site hosted on a SaaS Member Solutions website. Some of the assets and URLs, if not set up properly, would use the client’s proxy URL for the service (e.g. https://client-service-url.com), so the WGET commands above would not download anything served from the proxy URL. Thus spawned the following, which uses the -D flag to list the domains WGET is allowed to follow.

wget -e robots=off -k -K -r -l 3 -E -p -N -F --restrict-file-names=windows -H -D CLIENT_DOMAIN.com,CLIENT_VARIANT_DOMAIN.com -w 1 -nH https://CLIENT_DOMAIN.com
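
One thing worth knowing: the WGET manual says -D does not turn on host spanning by itself, so links to a second domain are only followed when -H (--span-hosts) is also given. Here is a sketch with long options that passes both. The wrapper name backup_across_domains is a placeholder of mine, and I left out -F as it only applies to input files.

```shell
# Sketch: recursive backup limited to a list of allowed domains.
# backup_across_domains is a hypothetical helper name, not part of WGET.
backup_across_domains() {
  # $1: start URL
  # $2: comma-separated list of domains WGET may follow
  wget \
    --execute robots=off \
    --recursive --level=3 \
    --page-requisites \
    --adjust-extension \
    --convert-links --backup-converted \
    --timestamping \
    --restrict-file-names=windows \
    --span-hosts --domains="$2" \
    --wait=1 \
    --no-host-directories \
    "$1"
}

# Example call, using the placeholder domains from above:
# backup_across_domains "https://CLIENT_DOMAIN.com" "CLIENT_DOMAIN.com,CLIENT_VARIANT_DOMAIN.com"
```

The --wait=1 option pauses one second between requests, which is kinder to the server. Because --no-host-directories is set, files from every allowed domain end up in the same folder tree.
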
Published on July 9, 2013