Copy a website for offline browsing using the HTTrack website copier.
Install HTTrack.
$ sudo apt-get install -y httrack
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
  libhttrack2
Suggested packages:
  webhttrack httrack-doc
The following NEW packages will be installed:
  httrack libhttrack2
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 302 kB of archives.
After this operation, 798 kB of additional disk space will be used.
Get:1 http://ftp.task.gda.pl/debian stretch/main amd64 libhttrack2 amd64 3.48.24-1+b2 [263 kB]
Get:2 http://ftp.task.gda.pl/debian stretch/main amd64 httrack amd64 3.48.24-1+b2 [39.8 kB]
Fetched 302 kB in 0s (609 kB/s)
Selecting previously unselected package libhttrack2.
(Reading database ... 27239 files and directories currently installed.)
Preparing to unpack .../libhttrack2_3.48.24-1+b2_amd64.deb ...
Unpacking libhttrack2 (3.48.24-1+b2) ...
Selecting previously unselected package httrack.
Preparing to unpack .../httrack_3.48.24-1+b2_amd64.deb ...
Unpacking httrack (3.48.24-1+b2) ...
Processing triggers for libc-bin (2.24-11+deb9u3) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up libhttrack2 (3.48.24-1+b2) ...
Setting up httrack (3.48.24-1+b2) ...
Processing triggers for libc-bin (2.24-11+deb9u3) ...
I will use version 3.48.
$ httrack --version
HTTrack version 3.48-24
Download a single article to the article-x directory, using the --near parameter to also get the files linked from the downloaded page.
$ httrack --mirror --ext-depth=0 --depth=1 --near --stay-on-same-address --keep-links=0 --path article-x --quiet https://example.com/article-x/
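When downloading several articles this way, the directory name can be derived from the URL instead of being typed twice. The following is a minimal sketch that only prints the command (a dry run); `article_dir` and `print_article_command` are hypothetical helper names, not part of HTTrack.

```shell
#!/usr/bin/env bash
# Sketch: derive the target directory from the article URL and print the
# httrack command instead of running it (dry run).
# article_dir and print_article_command are hypothetical helper names.

article_dir() {
  # basename ignores the trailing slash, so .../article-x/ yields article-x
  basename "$1"
}

print_article_command() {
  local url="$1"
  local dir
  dir="$(article_dir "$url")"
  printf 'httrack --mirror --ext-depth=0 --depth=1 --near --stay-on-same-address --keep-links=0 --path %s --quiet %s\n' "$dir" "$url"
}

print_article_command "https://example.com/article-x/"
```

Remove the `printf` wrapper and call `httrack` directly once the printed command looks right.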
Mirror the whole website to the example.org directory, using filters to limit the downloaded files.
$ httrack --mirror --robots=0 --stay-on-same-domain --keep-links=0 --path example.org --quiet https://example.org/ -* +example.org/*
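The filters contain a wildcard, so the shell may try to expand them before httrack sees them if a matching file name happens to exist in the current directory. A cautious variant keeps them quoted, for example in a bash array. This is only a sketch; `site` and `filters` are hypothetical variable names, and the script prints the arguments instead of running httrack.

```shell
#!/usr/bin/env bash
# Sketch: keep the scan rules in a bash array so each one is quoted and
# reaches httrack unchanged, even if the shell could expand the wildcard.
# site and filters are hypothetical variable names.

site="example.org"
filters=( "-*" "+${site}/*" )

# Print each argument on its own line to inspect what httrack would receive.
printf '%s\n' httrack --mirror --robots=0 --stay-on-same-domain --keep-links=0 \
  --path "$site" --quiet "https://${site}/" "${filters[@]}"
```

The first rule excludes everything and the second one re-includes the chosen site, which matches the order used in the command above.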
Continue an interrupted download located in the example.org directory.
$ httrack --continue --path example.org
Update the website located in the article-x directory.
$ httrack --update --path article-x
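A script that runs repeatedly can choose between a fresh mirror and an update on its own. The sketch below assumes that HTTrack keeps its cache in an hts-cache directory inside the project path; `httrack_mode` is a hypothetical helper name, and the script only prints the chosen option.

```shell
#!/usr/bin/env bash
# Sketch: pick the httrack mode depending on whether the project directory
# already holds a cache from an earlier run.
# httrack_mode is a hypothetical helper; hts-cache is assumed to be the
# cache directory that HTTrack keeps inside the project path.

httrack_mode() {
  local dir="$1"
  if [ -d "$dir/hts-cache" ]; then
    echo "--update"
  else
    echo "--mirror"
  fi
}

# Dry run: show the mode that would be used for the article-x directory.
httrack_mode article-x
```

A fresh mirror still needs the start URL on the command line, while the update shown above reads it from the existing project.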
Mirror the whole website to the example.net directory, using filters to limit the downloaded files, 8 concurrent connections, a 400 KiB/s transfer rate limit, and a maximum of 4 connections per second.
$ httrack --mirror --robots=0 --stay-on-same-domain --keep-links=0 --path example.net --max-rate=409600 --connection-per-second=4 --sockets=8 --quiet https://example.net/ -* +example.net/*
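The --max-rate value is given in bytes per second, so the 400 KiB/s limit becomes 400 * 1024 = 409600. A small sketch of that conversion follows; `kib_to_bytes` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: convert a limit given in KiB/s to the bytes-per-second value
# passed to --max-rate. kib_to_bytes is a hypothetical helper name.

kib_to_bytes() {
  echo $(( $1 * 1024 ))
}

max_rate="$(kib_to_bytes 400)"
echo "--max-rate=${max_rate}"   # prints --max-rate=409600
```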
Mirror a single article to the article-y directory using the Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0 user agent, the https://example.com/list/ referer, and pl as the preferred language.
$ httrack --mirror --ext-depth=0 --depth=1 --near --stay-on-same-address --keep-links=0 --user-agent "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0" --referer "https://example.com/list/" --language "pl" --path article-y --quiet https://example.net/article-y/