The GNU utility wget can be used to access files through the Internet using HTTP, HTTPS, or FTP. One of its most useful features is that if a download is interrupted, you can restart it with the -c option and it continues from where it left off.
Go to http://examples.oreilly.com/upt3 for more information on wget.
The wget utility is installed by default on many systems, but if you can't find it, it can be downloaded from GNU at http://www.gnu.org/software/wget/wget.html.
The basic syntax for wget is very simple: type wget followed by the URL of the file or files you're trying to download:
wget http://www.somefile.com/somefile.htm
wget ftp://www.somefile.com/somefile
The file is downloaded and saved, and a status report is printed to the screen:
--16:51:58-- http://dynamicearth.com:80/index.htm
=> `index.htm'
Connecting to dynamicearth.com:80... connected!
HTTP request sent, awaiting response... 200 OK
Length: 9,144 [text/html]
0K -> ........ [100%]
16:51:58 (496.09 KB/s) - `index.htm' saved [9144/9144]
By default, wget saves the file in your current directory. If the download is interrupted, wget does not resume at the point of interruption by default; you need to specify an option (-c) for this behavior. The wget options are listed in Table 40-2. Both the short and long forms of each option are given, and short options that don't require input can be grouped together:
> wget -drc URL
For those options that do require an input, you don't have to separate the option and the input with whitespace:
> wget -ooutput.file URL
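As a rough illustration of resuming a transfer and of the two ways to attach an argument to an option, the sketch below wraps wget in a small helper. The URL and log file name are placeholders, and `DRY_RUN` and `run` are hypothetical names introduced here so the commands can be previewed without touching the network; they are not part of wget.

```shell
#!/bin/sh
# Hypothetical placeholders, not taken from the text above.
URL="http://www.example.com/big-file.tar.gz"
LOG="wget.log"

# Print the command when DRY_RUN is 1 (the default here); otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# -c resumes a partial download; -t 5 allows up to five tries.
resume_cmd() { run wget -c -t 5 "$URL"; }

# The log file argument can be attached directly to -o ...
log_attached() { run wget -c "-o$LOG" "$URL"; }

# ... or separated from it by whitespace.
log_separate() { run wget -c -o "$LOG" "$URL"; }

resume_cmd
log_attached
log_separate
```

Setting `DRY_RUN=0` before calling the functions would execute the commands instead of printing them.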
Option | Purpose | Examples |
---|---|---|
-V | Get version of wget | wget -V |
-h or --help | Get listing of wget options | wget --help |
-b or --background | Go to background after start | wget -b url |
-e or --execute=COMMAND | Execute command | wget -e COMMAND url |
-o or --output-file=file | Log messages to file | wget -o filename url |
-a or --append-output=file | Appends to log file | wget -a filename url |
-d or --debug | Turn on debug output | wget -d url |
-q or --quiet | Turn off wget's output | wget -q url |
-v or --verbose | Turn on verbose output | wget -v url |
-nv or --non-verbose | Turn off verbose output | wget -nv url |
-i or --input-file=file | Read URLs from file | wget -i inputfile |
-F or --force-html | Force input to be treated as HTML | wget -F url |
-t or --tries=number | Number of tries to get file | wget -t 3 url |
-O or --output-document=file | Forces all documents into specified file | wget -O savedfile -i inputfile |
-nc or --no-clobber | Don't clobber existing file | wget -nc url |
-c or --continue | Continue getting file | wget -c url |
--dot-style=style | Retrieval indicator | wget --dot-style=binary url |
-N or --timestamping | Turn on time-stamping | wget -N url |
-S or --server-response | Print HTTP headers, FTP responses | wget -S url |
--spider | Wget behaves as a web spider, doesn't download | wget --spider url |
-T or --timeout=seconds | Set the timeout | wget -T 30 url |
-w or --wait=seconds | Wait specified number of seconds | wget -w 20 url |
-Y or --proxy=on/off | Turn proxy on or off | wget -Y on url |
-Q or --quota=quota | Specify download quota size | wget -Q2M url |
-nd or --no-directories | Do not create directories in recursive download | wget -nd url |
-x or --force-directories | Opposite of -nd | wget -x url |
-nH or --no-host-directories | Disable host-prefixed directories | wget -nH url |
--cut-dirs=number | Ignore number directories | wget --cut-dirs=3 url |
-P or --directory-prefix=prefix | Set directory to prefix | wget -P test url |
--http-user=user --http-passwd=passwd | Set username and password | wget --http-user=user --http-passwd=password url |
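
Several of these options are often combined. The sketch below is one possible combination, not a recommended recipe: it writes a small URL list and prints a command that reads from it, saves under a directory prefix, skips files that already exist, and logs to a file. The file names, directory, URLs, and the `make_list` and `batch_cmd` helpers are all hypothetical.

```shell
#!/bin/sh
# Hypothetical names used for illustration only.
LIST="urls.txt"
DEST="downloads"
LOG="fetch.log"

# Write a small URL list, one URL per line (placeholder URLs).
make_list() {
    printf '%s\n' \
        "http://www.example.com/a.html" \
        "http://www.example.com/b.html" > "$LIST"
}

# Print the command line rather than running it:
#   -nc      do not overwrite existing files
#   -t 3     try each URL up to three times
#   -w 2     wait two seconds between retrievals
#   -P DIR   save files under DIR
#   -o FILE  write log messages to FILE
#   -i FILE  read URLs from FILE
batch_cmd() {
    echo wget -nc -t 3 -w 2 -P "$DEST" -o "$LOG" -i "$LIST"
}

make_list
batch_cmd
```

Removing the `echo` in `batch_cmd` would run the download for real.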
Copyright © 2003 O'Reilly & Associates. All rights reserved.