Wget Wizard

Introduction

Wget is an amazing open-source tool for downloading files from the internet. It's very powerful and configurable, but it's hard to remember all the configuration options!

This form turns some of the common use cases of wget into an easy-to-understand user interface. Choose your options and it will suggest the wget command to run on your own server or computer.

Specify a URL to download. It can be a single file or a whole directory. If you want the contents of the whole directory, choose additional options below.
Great for downloading the contents of "Index of..." directory listings. The contents of the directory in the URL, and of its subdirectories, will be downloaded neatly into your current directory, without the index pages. Make sure the URL ends with a forward slash for best results!
Create a full mirror of the website: wget will do its best to create a local version of the specified website (get all assets, sub-pages, etc. and rewrite links to work locally).
If crawling the site, wget won't go any deeper than this many levels (in the URL path).
Specify a file to read a list of URLs from (one per line).
Only download files with these extensions (separate with spaces or commas). All other file types will be ignored by wget.
Never download files with these extensions (separate with spaces or commas).
What to do with files that already exist on your computer. NOTE: "Only download if file on server is newer" relies on the server providing the "Last-Modified" header; if the server doesn't send it, that choice always overwrites the existing file.
Specify a location on your computer to put downloaded files into
Resume a partial download (if the server supports it)
Disregard what robots.txt on the server specifies as "off-limits"
Some servers have misconfigured TLS certificates, which cause problems; this setting tells wget to ignore those problems.
Include request headers to try to bypass server-side caches
Skip various operating system metadata files (.DS_Store, Thumbs.db, etc.)
Specify a file on your computer to log wget's output to (instead of printing it to the terminal)
If logging activity to a file, this will add new output to the end of that file (instead of overwriting it and creating a new file)
The number of times to retry downloading each file
Add a small wait between requests, to avoid overloading the server.
Introduce a small random delay between requests, based on the specified "Wait between requests" time.
Specify a referrer for the wget requests; useful for pages/files that only work when Referer is set to one of the pages that point to them.
Should the command use the "long syntax" for options (where possible)? For example: --help instead of -h
By submitting this form, you indicate that you have read and agree to our Terms and Conditions, in particular the Wget Wizard section.
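
As a rough sketch of how the choices above translate into a command, the following script assembles a directory download from a handful of the options described, using the long syntax. The URL, crawl depth, wait time and `--cut-dirs` count are placeholder assumptions, and the exact command the wizard suggests may differ. The script only prints the command, so it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch only: the URL and all values below are illustrative assumptions,
# not the wizard's actual output.

build_directory_download() {
    url="$1"      # directory URL; should end with a forward slash
    depth="$2"    # maximum crawl depth
    pause="$3"    # seconds to wait between requests

    # --recursive --no-parent : fetch the directory and its subdirectories only
    # --no-host-directories   : don't create a folder named after the host
    # --cut-dirs=1            : drop one leading path component (adjust to the URL)
    # --reject                : skip index pages and OS metadata files
    # --wait / --random-wait  : pause, with some random variation, between requests
    printf '%s\n' "wget --recursive --no-parent --no-host-directories --cut-dirs=1 --level=$depth --reject='index.html*,.DS_Store,Thumbs.db,thumbcache.db' --wait=$pause --random-wait '$url'"
}

build_directory_download "https://example.com/files/" 3 1
```

The `--cut-dirs` value depends on how many path components precede the directory you want, so it needs adjusting for each URL.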

Comments, questions?

Constructive feedback on this system is always welcome. If you have any ideas for additions or changes, we'd love to hear about them.

Things we'd love your feedback on:

We welcome all feedback, but here are a few items we'd particularly love your thoughts on:

  • We've included some basic "metadata" filenames (.DS_Store, Thumbs.db, thumbcache.db), but are there any more we should include in the sample command? Please suggest some!
  • Are there other command options that you use frequently or find confusing and want added here as well?
  • Is the user interface for this wizard clear enough? Is there anything you were initially confused by, or couldn't figure out?

Is there anything you want to let us know? Get in touch!

Problems with Wget

These are some things we can't seem to get wget to do properly; if you have any ideas, let us know!

  • "Maximum crawl depth" is ignored by wget if "Get complete mirror" is chosen! It seems like it should work: instead of adding the --mirror option (which according to the docs is equivalent to `-r -N -l inf --no-remove-listing`), we set those parameters manually and omit `-l inf`, using whatever value is specified in "Maximum crawl depth" instead. However, wget doesn't respect this value and still crawls many levels deep.
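
The manual expansion described above can be sketched as follows. The URL and depth are placeholders and the function name is made up for illustration. This only shows the flags involved; it is not a confirmed fix for the behaviour described.

```shell
#!/bin/sh
# Sketch only: hypothetical URL and depth value.
# According to the wget documentation, --mirror is equivalent to:
#   -r -N -l inf --no-remove-listing
# The function below prints the same set of flags with "-l inf"
# swapped for an explicit depth limit.

mirror_with_depth() {
    # $1 = URL to mirror, $2 = maximum crawl depth
    printf '%s\n' "wget -r -N -l $2 --no-remove-listing $1"
}

mirror_with_depth "https://example.com/" 2
```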

Usage notes

This isn't meant to be an exhaustive list of every single configuration option or variation of configuration option... it just provides a nice balance of usability without over-complicating the user interface for the "average" user. For example, really advanced options like timeouts, quotas, IP families, encoding, POST data, globbing, content-disposition and so on, are omitted from this wizard. If you use them, you should really read the documentation and understand it enough to build your own command settings.

Only use wget if you know what you're doing. If you have questions, ask! We're always happy to help people who have done as much research as they're able to, and your questions might also help us improve this tool for everyone. Don't use it to pirate stuff or do other illegal things. Be respectful to the sysadmins and developers who run the sites you love to use, and don't overload their servers.

Make sure you have read and agree to our Terms and Conditions, in particular the Wget Wizard section, before using this wizard.

You are encouraged to use the sample command this tool generates as the foundation of the command you actually need... for example, if you want to add other "meta-data" file types to ignore, copy the command and edit it before running it! We hope you find this wizard useful.
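
As one possible way of extending a suggested command, the sketch below appends an extra filename to the list of files to skip. The base list mirrors the filenames mentioned on this page; `desktop.ini` (a Windows folder metadata file) and the URL are examples chosen for illustration, not something the wizard adds. The script prints the resulting command rather than running it.

```shell
#!/bin/sh
# Sketch only: the URL is a placeholder and "desktop.ini" is just an example.
reject_list=".DS_Store,Thumbs.db,thumbcache.db"

add_reject() {
    # Append one more pattern to the comma-separated reject list.
    reject_list="$reject_list,$1"
}

add_reject "desktop.ini"

printf '%s\n' "wget --recursive --no-parent --reject=$reject_list https://example.com/files/"
```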
