Wget

Mainlining your server with wget, or How do I avoid the painful, slow download/upload process?
Using the wget program over SSH at the UNIX shell command line is a great shortcut for transferring software or other files from a remote server directly to your DreamHost server. You can avoid the sometimes painful and slow download/upload process, and mainline downloads straight to DreamHost's server using their big, fast pipes.

Note: rsync may be a better (faster, less complicated) option for users migrating between two rsync-enabled servers (such as moving from DreamHost to DreamHost PS).

Wget is a powerful tool, with lots of options, but even the basics are useful.

Prerequisites: You need an SSH or Telnet client, and you need to know how to log in to the server and change directory (cd command) to where you want to "inject" your files.

Basic Usage

wget http://www.example.com/file.tar.gz

or

wget ftp://ftp.example.com/pub/file.tar.gz

It should be possible to copy and paste the URL from your browser into the shell command line, but you're on your own to find the right combination of menu, keyboard, or mouse actions (e.g. Edit > Copy, Ctrl-C, or right-click to copy; Edit > Paste, Ctrl-V, right-click, or middle-click to paste).

When the file is on the server, you may need to use gunzip, unzip, and/or tar to expand and unpack the download. For the example above, tar zxvf file.tar.gz will do the trick. For a zip file: unzip file.zip. For a plain gzip file: gunzip file.gz.
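As a quick local illustration of packing and unpacking (the file names here are invented for the demo), you can build a small tarball and expand it with the same command you would run on a real download:

```shell
# Build a tiny example archive, then unpack it as you would a download.
mkdir -p demo
echo "hello" > demo/file.txt
tar czf demo.tar.gz demo     # pack (what the remote site did for you)
rm -r demo
tar zxf demo.tar.gz          # unpack; add v (tar zxvf) to see the file list
cat demo/file.txt
```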

If you need to pass variables to a script, you may need to enclose the URL in single quotes. This prevents the ampersand character from being interpreted by the shell as a command separator.
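For example (URL and parameters invented), single quotes keep the shell from treating & as "run the preceding command in the background":

```shell
# Unquoted, the shell would split this line at '&' and background the first part.
url='http://www.example.com/script.php?foo=1&bar=2'
echo "$url"
```

The same quoting applies directly to wget, e.g. wget 'http://www.example.com/script.php?foo=1&bar=2'.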

Advanced Usage
To create a mirror image of a folder on a different server (with the same structure as the original), you can simply pull it over FTP:

wget -r ftp://username:password@yourdomain.com/folder/*

This downloads 'folder/' and everything within it, keeping its directory structure intact. This can save you a lot of time compared to running wget on each file individually.

Now you could simply zip the folder using:

zip -r folder.zip folder

and then clean up by deleting the copy:

rm -rf folder

It's a great way to back up your whole website at once, and of course it's very helpful when moving large sites between hosts.
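If zip isn't available on your system, tar works just as well. Here is the same pack-and-clean-up sequence sketched locally (folder contents invented for the demo):

```shell
# Stand-in for a folder fetched with wget -r.
mkdir -p folder
echo "page" > folder/index.html
tar czf folder.tar.gz folder   # equivalent of: zip -r folder.zip folder
rm -rf folder                  # clean up the copy, keeping only the archive
tar ztf folder.tar.gz          # list the archive to confirm what's inside
```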

Download the entire contents of example.com

wget -r -l 0 http://www.example.com/

Taken from: GNU Wget Manual - Examples - Advanced Usage

Man Page Info
Run man wget in the shell for more options; the following is an excerpt:

NAME Wget - The non-interactive network downloader.

SYNOPSIS wget [option]... [URL]...

DESCRIPTION

GNU Wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.

Wget is non-interactive, meaning that it can work in the background, while the user is not logged on. This allows you to start a retrieval and disconnect from the system, letting Wget finish the work. By contrast, most of the Web browsers require constant user's presence, which can be a great hindrance when transferring a lot of data.
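The usual shell pattern for "start it and log out" looks like this, with a stand-in command in place of a long wget run; wget also has its own -b flag, which backgrounds itself and writes progress to wget-log:

```shell
# nohup keeps the job alive after you log out; output goes to a log file.
# 'sh -c ...' is a stand-in for a long-running wget command.
nohup sh -c 'echo "transfer finished"' > transfer.log 2>&1 &
wait                 # in real use you would simply log out here
cat transfer.log
```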

Wget can follow links in HTML and XHTML pages and create local versions of remote web sites, fully recreating the directory structure of the original site. This is sometimes referred to as "recursive downloading." While doing that, Wget respects the Robot Exclusion Standard (/robots.txt). Wget can be instructed to convert the links in downloaded HTML files to the local files for offline viewing.

Wget has been designed for robustness over slow or unstable network connections; if a download fails due to a network problem, it will keep retrying until the whole file has been retrieved. If the server supports regetting, it will instruct the server to continue the download from where it left off.
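Under the hood, resuming just asks the server for the bytes after the ones already on disk. This local sketch (file names invented) mimics what wget -c does:

```shell
printf 'Hello, '        > partial.txt   # an interrupted download
printf 'Hello, world\n' > remote.txt    # the complete file on the server
size=$(wc -c < partial.txt)             # bytes we already have
tail -c +$((size + 1)) remote.txt >> partial.txt   # append only the rest
cat partial.txt
```

In practice, resuming is simply wget -c followed by the URL; the server has to support range ("reget") requests for this to work.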

Custom Installation
For those of us who'd like to take advantage of the latest version of Wget, specifically versions with 'large file support', the information below will get you started.

Please keep in mind this article was designed for ADVANCED USERS who already have some *nix shell experience.

Custom Wget installations will NOT be supported by DH Staff.

TODO:
 * Create an 'uninstall' feature.
 * Custom OpenSSL option.

Create and run the following shell install script in your home directory:

wget_install.sh

#!/bin/sh
# Version 1.0.2, 2007-09-19
# - Initial Release 2007-09-19 by Chris Shymanik (chris@chipsncheese.com)
#   - Custom OpenSSL support still in development.
#   - Optional locale/man/info file wipe option added (1.0.2)

set -e

########## USER CONFIGURATION OPTIONS ##########

# Where do you want all this stuff built?
# ***Don't pick a directory that already exists!***
# Note: Directories that don't exist will be created for you!
SRCDIR=${HOME}/source

# Set DISTDIR to somewhere persistent.
DISTDIR=${HOME}/dist

# Delete contents of DISTDIR after installation? (Default: Yes)
DISTDEL="Yes"

# Wipe "unneeded" contents (info, man, and locale directories)?
# (Default: Yes)
MINIMALINSTALL="Yes"

# Where to install everything to? (Default: ${HOME})
# Note: Best to leave this AS IS for now. You've been warned.
INSTALLDIR=${HOME}

# Set BINDIR to wherever you keep your binaries.
# The default is ${HOME}/bin (/home/username/bin).
BINDIR=${HOME}/bin

# Set CONFIGDIR to wherever you keep your config (etc) files.
# The default is ${HOME}/etc (/home/username/etc).
CONFIGDIR=${HOME}/etc

# Enable Custom OpenSSL Installation? (!!This feature is NOT currently functional!!)
# *DISABLED*
#CUSTOMSSL="No"
# Path to your OpenSSL install. (Default: /usr/local/ssl)
LIBSSL=/usr/local/ssl

# Set whatever nice value you wish here.
# Higher values indicate lower priority;
# lower values indicate higher priority.
# Range: -20 to 20
NICE=19

# Name of the wget install package
# (without any extension, i.e. .tar.bz2)
WGT="wget-1.10.2"

# What features of wget do you wish to enable or disable?
# ***Probably best not to change anything here!***
# Note: Debugging isn't really necessary, so it's currently disabled.
WGETFEATURES="--prefix=${SRCDIR}/installtmp \
  --with-libssl=${LIBSSL} \
  --disable-debug"

########## DO NOT MODIFY BELOW ##########

sleep 1s

# Push the bin directory into the path.
export PATH=${BINDIR}:$PATH

# Pre-download clean-up and checking.
# Clear and/or create the source directory.
if [ -d ${SRCDIR} ]; then
    echo "Source directory already exists! Cleaning it..."
    rm -rf ${SRCDIR}/*
else
    echo "Creating source directory..."
    mkdir -p ${SRCDIR}
fi

# Create the installtmp directory (needed for custom install locations).
if [ -d ${SRCDIR}/installtmp ]; then
    echo "Something in the script is broken. Aborting..."
    exit
else
    echo "Creating the temporary install directory..."
    mkdir -p ${SRCDIR}/installtmp
fi

# Check for an existing wget install and remove it if present; else create BINDIR.
if [ -d ${BINDIR} ]; then
    echo "Deleting wget binary if it exists..."
    if [ -e ${BINDIR}/wget ]; then
        rm ${BINDIR}/wget >/dev/null 2>&1
    else
        echo "   Wget binary does not exist."
    fi
else
    echo "Creating BINDIR..."
    mkdir -p ${BINDIR} >/dev/null 2>&1
fi

# Check for an existing wget config directory and create it if it doesn't exist.
if [ -d ${CONFIGDIR} ]; then
    echo "Config directory exists! Doing nothing..."
else
    echo "Creating Config directory..."
    mkdir -p ${CONFIGDIR}
fi

# Grab the required source archives.
set +e
cd ${DISTDIR}

# Wget options
WGETOPT="-t1 -T10 -w5 -q -c"

# Do a bit of error checking while grabbing the sources.
if [ -e ${DISTDIR}/${WGT}.tar.gz ]; then
    echo "Skipping wget of ${WGT}.tar.gz"
else
    wget $WGETOPT ftp://ftp.ucsb.edu/pub/mirrors/linux/gentoo/distfiles/${WGT}.tar.gz
    # If the primary mirror fails, use the alternative mirror.
    if [ -e ${DISTDIR}/${WGT}.tar.gz ]; then
        echo "Got ${WGT}.tar.gz"
    else
        wget $WGETOPT http://ftp.gnu.org/gnu/wget/${WGT}.tar.gz
        # Check to make sure the alternative mirror worked.
        if [ -e ${DISTDIR}/${WGT}.tar.gz ]; then
            echo "Got ${WGT}.tar.gz"
        else
            echo "Failed to get ${WGT}.tar.gz. Aborting install!"
            exit 0
        fi
    fi
fi

# Unpack the source archives.
set -e
cd ${SRCDIR}
echo "Extracting ${WGT}..."
tar zxf ${DISTDIR}/${WGT}.tar.gz
echo "Done."

# Compile and install the package(s).
cd ${SRCDIR}/${WGT}
./configure ${WGETFEATURES}
nice -n ${NICE} make
make install
# make clean

# Post-install configuration.
sleep 2s
cd ${HOME} && clear

mv ${SRCDIR}/installtmp/bin/wget ${BINDIR}/wget
mv ${SRCDIR}/installtmp/etc/wgetrc ${CONFIGDIR}/wgetrc

# Minimal Install check
if [ ${MINIMALINSTALL} = "Yes" ]; then
    echo "Minimal Install selected. Wiping additional content."
    # Content is wiped during the post-install clean-up phase.
elif [ ${MINIMALINSTALL} = "No" ]; then
    echo "Full Install selected. Installing additional content."
    mv ${SRCDIR}/installtmp/share ${INSTALLDIR}/share
    mv ${SRCDIR}/installtmp/man ${INSTALLDIR}/man
    mv ${SRCDIR}/installtmp/info ${INSTALLDIR}/info
else
    echo "Unknown MINIMALINSTALL option! Keeping all content."
    mv ${SRCDIR}/installtmp/share ${INSTALLDIR}/share
    mv ${SRCDIR}/installtmp/man ${INSTALLDIR}/man
    mv ${SRCDIR}/installtmp/info ${INSTALLDIR}/info
fi

# Post-install clean-up.
# Kill some Lemmings...
rm -rf ${SRCDIR}/*

if [ ${DISTDEL} = "Yes" ]; then
    rm -rf ${DISTDIR}
elif [ ${DISTDEL} = "No" ]; then
    echo "Your DISTDIR will not be cleaned."
else
    echo "Unknown DISTDEL option! Keeping the contents of your DISTDIR by default."
    sleep 1s
fi

# Post-Install Notes
echo ""
echo "  Post-Install Notes:"
echo " ======================="
echo "Please be sure to modify the .bash_profile file to reflect your binary directory's path."
echo "See the wiki article for an example."
echo ""

# End of install
echo "Installation completed! $(date +%r)"

# EOF

Now modify your .bash_profile to include your binary directory (e.g. /home/username/bin) so that your custom Wget install is used by default:

umask 002
PS1='[\h]$ '
PATH=/home/username/bin:$PATH
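You can verify that PATH ordering works as intended before trusting it. This demo puts a stand-in wget in a scratch directory (so it won't clobber a real custom binary in ~/bin) and confirms the shell finds it first:

```shell
# Demonstrate PATH precedence with a stand-in binary in a scratch directory.
mkdir -p /tmp/custom-bin
printf '#!/bin/sh\necho "custom wget"\n' > /tmp/custom-bin/wget
chmod +x /tmp/custom-bin/wget
PATH=/tmp/custom-bin:$PATH
wget                      # the shell now runs the custom copy first
command -v wget           # shows which binary would be executed
```

With your real install, the same check is just command -v wget after sourcing the updated .bash_profile; it should print /home/username/bin/wget.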

Done!