GNU Wget 1.18 Manual

wget [option]… [URL]…

http://host[:port]/directory/file
ftp://host[:port]/directory/file

ftp://user:password@host/path
http://user:password@host/path

ftp://host/directory/file;type=a

host:/dir/file

host[:port]/dir/file

wget -r --tries=10 http://fly.srk.fer.hr/ -o log

wget -drc URL

wget -d -r -c URL

wget -o log -- -x

wget -X '' -X /~nobody,/~somebody

wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z

wget --spider --force-html -i bookmarks.html

No options        -> ftp.xemacs.org/pub/xemacs/
-nH               -> pub/xemacs/
-nH --cut-dirs=1  -> xemacs/
-nH --cut-dirs=2  -> .

--cut-dirs=1      -> ftp.xemacs.org/xemacs/
...
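
The mapping in the two tables above can be reproduced with a small shell sketch. It does not call Wget and is not taken from Wget's source; `local_dir` is a hypothetical helper that only emulates, under the assumption that the tables describe the complete rule, how `-nH` and `--cut-dirs` shape the local directory for the remote directory `pub/xemacs` on `ftp.xemacs.org`.

```shell
#!/bin/sh
# Emulate (not invoke) how -nH and --cut-dirs shape the local directory.
# local_dir HOST DIRPATH NO_HOST CUT   (hypothetical helper, not part of Wget)
#   HOST     host name, e.g. ftp.xemacs.org
#   DIRPATH  remote directory without leading/trailing slash, e.g. pub/xemacs
#   NO_HOST  1 to emulate -nH, 0 otherwise
#   CUT      number of leading directory components to drop (--cut-dirs)
local_dir() {
    host=$1 path=$2 no_host=$3 cut=$4
    # Drop CUT leading components from the remote directory path.
    while [ "$cut" -gt 0 ] && [ -n "$path" ]; do
        case $path in
            */*) path=${path#*/} ;;
            *)   path= ;;
        esac
        cut=$((cut - 1))
    done
    # Prepend the host directory unless -nH is emulated.
    if [ "$no_host" -eq 0 ]; then
        result=$host${path:+/$path}
    else
        result=$path
    fi
    # An empty result means files land in the current directory.
    if [ -n "$result" ]; then
        printf '%s/\n' "$result"
    else
        printf '.\n'
    fi
}

local_dir ftp.xemacs.org pub/xemacs 0 0   # no options
local_dir ftp.xemacs.org pub/xemacs 1 0   # -nH
local_dir ftp.xemacs.org pub/xemacs 1 1   # -nH --cut-dirs=1
local_dir ftp.xemacs.org pub/xemacs 1 2   # -nH --cut-dirs=2
local_dir ftp.xemacs.org pub/xemacs 0 1   # --cut-dirs=1
```

Each call corresponds to one row of the tables, in the same order, so the printed lines can be compared against them directly.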

wget --no-cookies --header "Cookie: name=value"

wget --header='Accept-Charset: iso-8859-2' \
     --header='Accept-Language: hr'        \
       http://fly.srk.fer.hr/

wget --header="Host: foo.bar" http://localhost/

# Log in to the server.  This can be done only once.
wget --save-cookies cookies.txt \
     --post-data 'user=foo&password=bar' \
     http://example.com/auth.php

# Now grab the page or pages we care about.
wget --load-cookies cookies.txt \
     -p http://example.com/interesting/article.php

wget ftp://gnjilux.srk.fer.hr/*.msg

wget -r -nd --delete-after http://whatever.com/~popular/page/

wget -r -l 2 http://site/1.html

wget -r -l 2 -p http://site/1.html

wget -r -l 1 -p http://site/1.html

wget -r -l 0 -p http://site/1.html

wget -p http://site/1.html

wget -E -H -k -K -p http://site/document

wget --ignore-tags=a,area -H -k -K -r http://site/document

wget -rH -Dexample.com http://www.example.com/

wget -rH -Dfoo.edu --exclude-domains sunsite.foo.edu \
    http://www.foo.edu/

wget -I /people,/cgi-bin http://host/people/bozo/

wget -r --no-parent http://somehost/~luzer/my-archive/

<a href="foo.gif">
<a href="foo/bar.gif">
<a href="../foo/bar.gif">

<a href="/foo.gif">
<a href="/foo/bar.gif">
<a href="http://www.example.com/foo/bar.gif">

wget -S http://www.gnu.ai.mit.edu/

wget -N http://www.gnu.ai.mit.edu/

wget "ftp://ftp.ifi.uio.no/pub/emacs/gnus/*"

wget --timestamping -r ftp://ftp.gnu.org/pub/gnu/

variable = value

reject =
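
The sample initialization file notes that wgetrc command names are case-, underscore- and minus-insensitive, so `ftp_proxy`, `ftp-proxy` and `ftpproxy` name the same command. A rough way to picture that rule is the sketch below; `normalize_command` is a hypothetical helper written for illustration and is not how Wget itself parses the file.

```shell
#!/bin/sh
# Reduce a wgetrc command name to a canonical spelling by removing
# underscores and minus signs and lowercasing the rest.
# normalize_command is a hypothetical helper, not part of Wget.
normalize_command() {
    printf '%s\n' "$1" | tr -d '_-' | tr '[:upper:]' '[:lower:]'
}

# All three spellings collapse to the same canonical name.
normalize_command ftp_proxy
normalize_command ftp-proxy
normalize_command FtpProxy
```

Comparing the normalized forms is enough to decide whether two spellings refer to the same command.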

###
### Sample Wget initialization file .wgetrc
###

## You can use this file to change the default behaviour of wget or to
## avoid having to type many many command-line options. This file does
## not contain a comprehensive list of commands -- look at the manual
## to find out what you can put into this file. You can find this here:
##   $ info wget.info 'Startup File'
## Or online here:
##   https://www.gnu.org/software/wget/manual/wget.html#Startup-File
##
## Wget initialization file can reside in /usr/local/etc/wgetrc
## (global, for all users) or $HOME/.wgetrc (for a single user).
##
## To use the settings in this file, you will have to uncomment them,
## as well as change them, in most cases, as the values on the
## commented-out lines are the default values (e.g. "off").
##
## Commands are case-, underscore- and minus-insensitive.
## For example ftp_proxy, ftp-proxy and ftpproxy are the same.

##
## Global settings (useful for setting up in /usr/local/etc/wgetrc).
## Think well before you change them, since they may reduce wget's
## functionality, and make it behave contrary to the documentation:
##

# You can set a retrieval quota for beginners by specifying a value
# optionally followed by 'K' (kilobytes) or 'M' (megabytes).  The
# default quota is unlimited.
#quota = inf

# You can lower (or raise) the default number of retries when
# downloading a file (default is 20).
#tries = 20

# Lowering the maximum depth of the recursive retrieval is handy to
# prevent newbies from going too "deep" when they unwittingly start
# the recursive retrieval.  The default is 5.
#reclevel = 5

# By default Wget uses "passive FTP" transfer where the client
# initiates the data connection to the server rather than the other
# way around.  That is required on systems behind NAT where the client
# computer cannot be easily reached from the Internet.  However, some
# firewall software explicitly supports active FTP and in fact has
# problems supporting passive transfer.  If you are in such an
# environment, use "passive_ftp = off" to revert to active FTP.
#passive_ftp = off

# The "wait" command below makes Wget wait between every connection.
# If, instead, you want Wget to wait only between retries of failed
# downloads, set waitretry to the maximum number of seconds to wait (Wget
# will use "linear backoff", waiting 1 second after the first failure
# on a file, 2 seconds after the second failure, etc. up to this max).
#waitretry = 10

##
## Local settings (for a user to set in his $HOME/.wgetrc).  It is
## *highly* undesirable to put these settings in the global file, since
## they are potentially dangerous to "normal" users.
##
## Even when setting up your own ~/.wgetrc, you should know what you
## are doing before doing so.
##

# Set this to on to use timestamping by default:
#timestamping = off

# It is a good idea to make Wget send your email address in a `From:'
# header with your request (so that server administrators can contact
# you in case of errors).  Wget does *not* send `From:' by default.
#header = From: Your Name <username@site.domain>

# You can set up other headers, like Accept-Language.  Accept-Language
# is *not* sent by default.
#header = Accept-Language: en

# You can set the default proxies for Wget to use for http, https, and ftp.
# They will override the value in the environment.
#https_proxy = http://proxy.yoyodyne.com:18023/
#http_proxy = http://proxy.yoyodyne.com:18023/
#ftp_proxy = http://proxy.yoyodyne.com:18023/

# If you do not want to use a proxy at all, set this to off.
#use_proxy = on

# You can customize the retrieval outlook.  Valid options are default,
# binary, mega and micro.
#dot_style = default

# Setting this to off makes Wget not download /robots.txt.  Be sure to
# know *exactly* what /robots.txt is and how it is used before changing
# the default!
#robots = on

# It can be useful to make Wget wait between connections.  Set this to
# the number of seconds you want Wget to wait.
#wait = 0

# You can force creation of the directory structure, even if a single
# file is being retrieved, by setting this to on.
#dirstruct = off

# You can turn on recursive retrieving by default (don't do this if
# you are not sure you know what it means) by setting this to on.
#recursive = off

# To always back up file X as X.orig before converting its links (due
# to -k / --convert-links / convert_links = on having been specified),
# set this variable to on:
#backup_converted = off

# To have Wget follow FTP links from HTML files by default, set this
# to on:
#follow_ftp = off

# To try IPv6 addresses first:
#prefer-family = IPv6

# Set default IRI support state
#iri = off

# Force the default system encoding
#localencoding = UTF-8

# Force the default remote server encoding
#remoteencoding = UTF-8

# Turn on to prevent following non-HTTPS links when in recursive mode
#httpsonly = off

# Tune HTTPS security (auto, SSLv2, SSLv3, TLSv1, PFS)
#secureprotocol = auto
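
The `waitretry` comment in the sample file describes linear backoff: 1 second after the first failure on a file, 2 seconds after the second, and so on up to the configured maximum. The sketch below only tabulates that schedule for `waitretry = 10`; `retry_wait` is a hypothetical helper, and the code neither runs Wget nor claims to mirror its internals.

```shell
#!/bin/sh
# retry_wait FAILURES MAX   (hypothetical helper, not part of Wget)
# Print the number of seconds to wait after the given count of
# consecutive failures on one file, capped at MAX (the waitretry value).
retry_wait() {
    failures=$1 max=$2
    if [ "$failures" -lt "$max" ]; then
        printf '%s\n' "$failures"
    else
        printf '%s\n' "$max"
    fi
}

# Tabulate the wait for the first 12 consecutive failures
# with waitretry = 10.
n=1
while [ "$n" -le 12 ]; do
    printf 'failure %d: wait %ds\n' "$n" "$(retry_wait "$n" 10)"
    n=$((n + 1))
done
```

The wait grows by one second per failure and stays at the cap once it is reached.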

wget http://fly.srk.fer.hr/

wget --tries=45 http://fly.srk.fer.hr/jpg/flyweb.jpg

wget -t 45 -o log http://fly.srk.fer.hr/jpg/flyweb.jpg &

wget ftp://gnjilux.srk.fer.hr/welcome.msg

wget ftp://ftp.gnu.org/pub/gnu/
links index.html

wget -i file

wget -r https://www.gnu.org/ -o gnulog

wget --convert-links -r https://www.gnu.org/ -o gnulog

wget -p --convert-links http://www.example.com/dir/page.html

wget -p --convert-links -nH -nd -Pdownload \
     http://www.example.com/dir/page.html

wget -S http://www.lycos.com/

wget --save-headers http://www.lycos.com/
more index.html

wget -r -l2 -P/tmp ftp://wuarchive.wustl.edu/

wget -r -l1 --no-parent -A.gif http://www.example.com/dir/

wget -nc -r https://www.gnu.org/

wget ftp://hniksic:mypassword@unix.example.com/.emacs

wget -O - http://jagor.srce.hr/ http://www.srce.hr/

wget -O - http://cool.list.com/ | wget --force-html -i -

crontab
0 0 * * 0 wget --mirror https://www.gnu.org/ -o /home/me/weeklog

wget --mirror --convert-links --backup-converted  \
     https://www.gnu.org/ -o /home/me/weeklog

wget --mirror --convert-links --backup-converted \
     --html-extension -o /home/me/weeklog        \
     https://www.gnu.org/

wget -m -k -K -E https://www.gnu.org/ -o /home/me/weeklog

http://hniksic:mypassword@proxy.company.com:8001/

$ wget http://www.gnus.org/dist/gnus.tar.gz &
...
$ kill -HUP %%
SIGHUP received, redirecting output to `wget-log'.

wget -r http://www.example.com/

<meta name="robots" content="nofollow">

Copyright © 2000-2002, 2007-2008, 2015, 2018 Free Software
Foundation, Inc.
http://fsf.org/

Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

  Copyright (C)  YEAR  YOUR NAME.
  Permission is granted to copy, distribute and/or modify this document
  under the terms of the GNU Free Documentation License, Version 1.3
  or any later version published by the Free Software Foundation;
  with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
  Texts.  A copy of the license is included in the section entitled ``GNU
  Free Documentation License''.

    with the Invariant Sections being LIST THEIR TITLES, with
    the Front-Cover Texts being LIST, and with the Back-Cover Texts
    being LIST.

• Overview:		Features of Wget.
• Invoking:		Wget command-line arguments.
• Recursive Download:		Downloading interlinked pages.
• Following Links:		The available methods of chasing links.
• Time-Stamping:		Mirroring according to time-stamps.
• Startup File:		Wget’s initialization file.
• Examples:		Examples of usage.
• Various:		The stuff that doesn’t fit anywhere else.
• Appendices:		Some useful references.
• Copying this manual:		You may give out copies of this manual.
• Concept Index:		Topics covered by this manual.

• URL Format:
• Option Syntax:
• Basic Startup Options:
• Logging and Input File Options:
• Download Options:
• Directory Options:
• HTTP Options:
• HTTPS (SSL/TLS) Options:
• FTP Options:
• Recursive Retrieval Options:
• Recursive Accept/Reject Options:
• Exit Status:

• Spanning Hosts:		(Un)limiting retrieval based on host name.
• Types of Files:		Getting only certain files.
• Directory-Based Limits:		Getting only certain directories.
• Relative Links:		Follow relative links only.
• FTP Links:		Following FTP links.

• Wgetrc Location:		Location of various wgetrc files.
• Wgetrc Syntax:		Syntax of wgetrc.
• Wgetrc Commands:		List of available commands.
• Sample Wgetrc:		A wgetrc example.

• Simple Usage:		Simple, basic usage of the program.
• Advanced Usage:		Advanced tips.
• Very Advanced Usage:		The hairy stuff.

• Time-Stamping Usage:
• HTTP Time-Stamping Internals:
• FTP Time-Stamping Internals:

• Proxies:		Support for proxy servers.
• Distribution:		Getting the latest version.
• Web Site:		GNU Wget’s presence on the World Wide Web.
• Mailing Lists:		Wget mailing list for announcements and discussion.
• Internet Relay Chat:		Wget’s presence on IRC.
• Reporting Bugs:		How and where to report bugs.
• Portability:		The systems Wget works on.
• Signals:		Signal-handling performed by Wget.

• Robot Exclusion:		Wget’s support for RES.
• Security Considerations:		Security with Wget.
• Contributors:		People who helped.

	Index Entry	Section

#
	#wget:	Internet Relay Chat

.
	.css extension:	HTTP Options
	.html extension:	HTTP Options
	.listing files, removing:	FTP Options
	.netrc:	Startup File
	.wgetrc:	Startup File

A
	accept directories:	Directory-Based Limits
	accept suffixes:	Types of Files
	accept wildcards:	Types of Files
	append to log:	Logging and Input File Options
	arguments:	Invoking
	authentication:	Download Options
	authentication:	HTTP Options
	authentication credentials:	Download Options

B
	backing up converted files:	Recursive Retrieval Options
	backing up files:	Download Options
	bandwidth, limit:	Download Options
	base for relative links in input file:	Logging and Input File Options
	bind address:	Download Options
	bind DNS address:	Download Options
	bug reports:	Reporting Bugs
	bugs:	Reporting Bugs

C
	cache:	HTTP Options
	caching of DNS lookups:	Download Options
	case fold:	Recursive Accept/Reject Options
	client DNS address:	Download Options
	client IP address:	Download Options
	clobbering, file:	Download Options
	command line:	Invoking
	comments, HTML:	Recursive Retrieval Options
	connect timeout:	Download Options
	Content On Error:	HTTP Options
	Content-Disposition:	HTTP Options
	Content-Encoding, choose:	HTTP Options
	Content-Length, ignore:	HTTP Options
	continue retrieval:	Download Options
	contributors:	Contributors
	conversion of links:	Recursive Retrieval Options
	cookies:	HTTP Options
	cookies, loading:	HTTP Options
	cookies, saving:	HTTP Options
	cookies, session:	HTTP Options
	cut directories:	Directory Options

D
	debug:	Logging and Input File Options
	default page name:	HTTP Options
	delete after retrieval:	Recursive Retrieval Options
	directories:	Directory-Based Limits
	directories, exclude:	Directory-Based Limits
	directories, include:	Directory-Based Limits
	directory limits:	Directory-Based Limits
	directory prefix:	Directory Options
	DNS cache:	Download Options
	DNS IP address, client, DNS:	Download Options
	DNS server:	Download Options
	DNS timeout:	Download Options
	dot style:	Download Options
	downloading multiple times:	Download Options

E
	EGD:	HTTPS (SSL/TLS) Options
	entropy, specifying source of:	HTTPS (SSL/TLS) Options
	examples:	Examples
	exclude directories:	Directory-Based Limits
	execute wgetrc command:	Basic Startup Options

F
	FDL, GNU Free Documentation License:	GNU Free Documentation License
	features:	Overview
	file names, restrict:	Download Options
	file permissions:	FTP Options
	filling proxy cache:	Recursive Retrieval Options
	follow FTP links:	Recursive Accept/Reject Options
	following ftp links:	FTP Links
	following links:	Following Links
	force html:	Logging and Input File Options
	ftp authentication:	FTP Options
	ftp password:	FTP Options
	ftp time-stamping:	FTP Time-Stamping Internals
	ftp user:	FTP Options

G
	globbing, toggle:	FTP Options

H
	hangup:	Signals
	header, add:	HTTP Options
	hosts, spanning:	Spanning Hosts
	HSTS:	HTTPS (SSL/TLS) Options
	HTML comments:	Recursive Retrieval Options
	http password:	HTTP Options
	http referer:	HTTP Options
	http time-stamping:	HTTP Time-Stamping Internals
	http user:	HTTP Options

I
	idn support:	Download Options
	ignore case:	Recursive Accept/Reject Options
	ignore length:	HTTP Options
	include directories:	Directory-Based Limits
	incomplete downloads:	Download Options
	incremental updating:	Time-Stamping
	index.html:	HTTP Options
	input-file:	Logging and Input File Options
	input-metalink:	Logging and Input File Options
	Internet Relay Chat:	Internet Relay Chat
	invoking:	Invoking
	IP address, client:	Download Options
	IPv6:	Download Options
	IRC:	Internet Relay Chat
	iri support:	Download Options

K
	Keep-Alive, turning off:	HTTP Options
	keep-badhash:	Logging and Input File Options

L
	latest version:	Distribution
	limit bandwidth:	Download Options
	link conversion:	Recursive Retrieval Options
	links:	Following Links
	list:	Mailing Lists
	loading cookies:	HTTP Options
	local encoding:	Download Options
	location of wgetrc:	Wgetrc Location
	log file:	Logging and Input File Options

M
	mailing list:	Mailing Lists
	metalink-index:	Logging and Input File Options
	metalink-over-http:	Logging and Input File Options
	mirroring:	Very Advanced Usage

N
	no parent:	Directory-Based Limits
	no-clobber:	Download Options
	nohup:	Invoking
	number of tries:	Download Options

O
	offset:	Download Options
	operating systems:	Portability
	option syntax:	Option Syntax
	Other HTTP Methods:	HTTP Options
	output file:	Logging and Input File Options
	overview:	Overview

P
	page requisites:	Recursive Retrieval Options
	passive ftp:	FTP Options
	password:	Download Options
	pause:	Download Options
	Persistent Connections, disabling:	HTTP Options
	portability:	Portability
	POST:	HTTP Options
	preferred-location:	Logging and Input File Options
	progress indicator:	Download Options
	proxies:	Proxies
	proxy:	Download Options
	proxy:	HTTP Options
	proxy authentication:	HTTP Options
	proxy filling:	Recursive Retrieval Options
	proxy password:	HTTP Options
	proxy user:	HTTP Options

Q
	quiet:	Logging and Input File Options
	quota:	Download Options

R
	random wait:	Download Options
	randomness, specifying source of:	HTTPS (SSL/TLS) Options
	rate, limit:	Download Options
	read timeout:	Download Options
	recursion:	Recursive Download
	recursive download:	Recursive Download
	redirect:	HTTP Options
	redirecting output:	Advanced Usage
	referer, http:	HTTP Options
	reject directories:	Directory-Based Limits
	reject suffixes:	Types of Files
	reject wildcards:	Types of Files
	relative links:	Relative Links
	remote encoding:	Download Options
	reporting bugs:	Reporting Bugs
	required images, downloading:	Recursive Retrieval Options
	resume download:	Download Options
	retries:	Download Options
	retries, waiting between:	Download Options
	retrieving:	Recursive Download
	robot exclusion:	Robot Exclusion
	robots.txt:	Robot Exclusion

S
	sample wgetrc:	Sample Wgetrc
	saving cookies:	HTTP Options
	security:	Security Considerations
	server maintenance:	Robot Exclusion
	server response, print:	Download Options
	server response, save:	HTTP Options
	session cookies:	HTTP Options
	signal handling:	Signals
	spanning hosts:	Spanning Hosts
	specify config:	Logging and Input File Options
	spider:	Download Options
	SSL:	HTTPS (SSL/TLS) Options
	SSL certificate:	HTTPS (SSL/TLS) Options
	SSL certificate authority:	HTTPS (SSL/TLS) Options
	SSL certificate type, specify:	HTTPS (SSL/TLS) Options
	SSL certificate, check:	HTTPS (SSL/TLS) Options
	SSL CRL, certificate revocation list:	HTTPS (SSL/TLS) Options
	SSL protocol, choose:	HTTPS (SSL/TLS) Options
	SSL Public Key Pin:	HTTPS (SSL/TLS) Options
	start position:	Download Options
	startup:	Startup File
	startup file:	Startup File
	suffixes, accept:	Types of Files
	suffixes, reject:	Types of Files
	symbolic links, retrieving:	FTP Options
	syntax of options:	Option Syntax
	syntax of wgetrc:	Wgetrc Syntax

T
	tag-based recursive pruning:	Recursive Accept/Reject Options
	time-stamping:	Time-Stamping
	time-stamping usage:	Time-Stamping Usage
	timeout:	Download Options
	timeout, connect:	Download Options
	timeout, DNS:	Download Options
	timeout, read:	Download Options
	timestamping:	Time-Stamping
	tries:	Download Options
	Trust server names:	HTTP Options
	types of files:	Types of Files

U
	unlink:	Download Options
	updating the archives:	Time-Stamping
	URL:	URL Format
	URL syntax:	URL Format
	usage, time-stamping:	Time-Stamping Usage
	user:	Download Options
	user-agent:	HTTP Options

V
	various:	Various
	verbose:	Logging and Input File Options

W
	wait:	Download Options
	wait, random:	Download Options
	waiting between retries:	Download Options
	WARC:	HTTPS (SSL/TLS) Options
	web site:	Web Site
	Wget as spider:	Download Options
	wgetrc:	Startup File
	wgetrc commands:	Wgetrc Commands
	wgetrc location:	Wgetrc Location
	wgetrc syntax:	Wgetrc Syntax
	wildcards, accept:	Types of Files
	wildcards, reject:	Types of Files
	Windows file names:	Download Options

Table of Contents

Wget 1.20

1 Overview

2 Invoking

2.1 URL Format

2.2 Option Syntax

2.3 Basic Startup Options

2.4 Logging and Input File Options

2.5 Download Options

2.6 Directory Options

2.7 HTTP Options

2.8 HTTPS (SSL/TLS) Options

2.9 FTP Options

2.10 FTPS Options

2.11 Recursive Retrieval Options

2.12 Recursive Accept/Reject Options

2.13 Exit Status

3 Recursive Download

4 Following Links

4.1 Spanning Hosts

4.2 Types of Files

4.3 Directory-Based Limits

4.4 Relative Links

4.5 Following FTP Links

5 Time-Stamping

5.1 Time-Stamping Usage

5.2 HTTP Time-Stamping Internals

5.3 FTP Time-Stamping Internals

6 Startup File

6.1 Wgetrc Location

6.2 Wgetrc Syntax

6.3 Wgetrc Commands

6.4 Sample Wgetrc

7 Examples

7.1 Simple Usage

7.2 Advanced Usage

7.3 Very Advanced Usage

8 Various

8.1 Proxies

8.2 Distribution

8.3 Web Site

8.4 Mailing Lists

Primary List

Obsolete Lists

8.5 Internet Relay Chat

8.6 Reporting Bugs

8.7 Portability

8.8 Signals

9 Appendices

9.1 Robot Exclusion

9.2 Security Considerations

9.3 Contributors

Appendix A Copying this manual

A.1 GNU Free Documentation License

ADDENDUM: How to use this License for your documents

Concept Index
