Thursday, April 7, 2016

How to download and archive a website using wget on OSX

(This is how to get wget without installing homebrew or installing someone else's compiled code.  I just wanted to remind myself how to install from source.)

Install Xcode from Apple.  It may already be on your Mac: mine had it, though I cannot remember whether I installed it previously or not.

Initialize Xcode so no weird errors happen when we call it later.  (Use Launchpad or Spotlight to find Xcode, run it once, and accept the EULA.)

Install the Xcode command line tools:

xcode-select --install
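
If you are not sure whether the command line tools are already there, a check like the one below can save a needless install.  This is only a sketch: dev_tools_status is a made-up helper name, and it assumes `xcode-select -p` exits non-zero when no developer directory is configured.  The optional argument exists only so the check can be tried on a machine without xcode-select.

```shell
# Hypothetical helper: prints "installed" or "missing".
# Assumption: `xcode-select -p` fails when no developer tools are configured.
dev_tools_status() {
  probe="${1:-xcode-select}"   # command to ask; defaults to xcode-select
  if command -v "$probe" >/dev/null 2>&1 && "$probe" -p >/dev/null 2>&1; then
    echo "installed"
  else
    echo "missing"
  fi
}

if [ "$(dev_tools_status)" = "missing" ]; then
  echo "Command line tools not found; run: xcode-select --install"
else
  echo "Command line tools already present at: $(xcode-select -p)"
fi
```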

Thankfully, curl is already built in.  So use curl to download the wget source, then build and install it:

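cd ~/Downloads
curl -O http://ftp.gnu.org/gnu/wget/wget-1.15.tar.gz
tar -zxvf wget-1.15.tar.gz
cd wget-1.15/
# --without-ssl skips SSL support, so this build cannot fetch https:// URLs
./configure --without-ssl
make
# installs under /usr/local by default, hence the sudo
sudo make install
cd ..
# clean up the tarball and the source tree
rm -rf ~/Downloads/wget*
wget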

The last line above simply checks that wget runs; with no URL it just prints a usage message.  Now test it with a real download:

wget http://ftp.gnu.org/gnu/wget/wget-1.15.tar.gz
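
If that test fails, a quick look at where wget landed and what it was built with can help.  The sketch below is mine, not part of wget: locate_tool and lists_https are hypothetical helper names, and lists_https assumes the feature list printed by `wget --version` marks HTTPS support as "+https" or "-https".

```shell
# Hypothetical helper: print where a tool lives, or fail if it is not on the PATH.
locate_tool() {
  if location=$(command -v "$1"); then
    echo "$1 found at $location"
  else
    echo "$1 not found on PATH" >&2
    return 1
  fi
}

# Hypothetical helper: reads `wget --version` output on stdin and
# succeeds only if it contains the literal string "+https".
# Assumption: the feature list uses "+https" / "-https".
lists_https() {
  grep -qF -- '+https'
}

if locate_tool wget; then
  wget --version | lists_https || echo "this wget was built without HTTPS support"
fi
```

Since the build above used --without-ssl, the "without HTTPS" message is the expected result here, and plain http:// URLs like the test download still work.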

Now download the desired web site:

cd ~/Downloads
wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains www.whatever.com --no-parent www.whatever.com/.
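
That is a long line to remember, so here is a sketch that spells out what each option does.  mirror_site is a hypothetical wrapper I made up for illustration, not a wget feature.  It takes the host name and, optionally, a runner command; passing echo as the runner gives a dry run that only prints the arguments.

```shell
# Hypothetical wrapper (not part of wget): mirror_site HOST [RUNNER]
# RUNNER defaults to wget; pass "echo" to print the arguments instead of downloading.
mirror_site() {
  host="$1"
  runner="${2:-wget}"
  set --                                      # reset the argument list
  set -- "$@" --recursive                     # follow links and fetch the pages they point to
  set -- "$@" --no-clobber                    # skip files that already exist locally
  set -- "$@" --page-requisites               # also fetch images, CSS, and other assets a page needs
  set -- "$@" --html-extension                # save pages with an .html suffix (older name for --adjust-extension)
  set -- "$@" --convert-links                 # rewrite links so the copy can be browsed offline
  set -- "$@" --restrict-file-names=windows   # escape characters that Windows file names cannot hold
  set -- "$@" --domains "$host"               # only follow links within this domain
  set -- "$@" --no-parent                     # never climb above the starting directory
  "$runner" "$@" "$host/"
}

# Dry run: prints the options and URL without contacting the site.
mirror_site www.whatever.com echo
```

Drop the trailing echo to run the real download, for example `mirror_site www.whatever.com` from inside ~/Downloads.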

Sources:
I followed these two pages but put my corrections in my instructions above.