How to install ArchiveBox on Raspberry Pi

Install the beloved ArchiveBox on a Raspberry Pi 3.

Operating system

I will use the DietPi operating system, as it suits my personal preference.

Perform the initial setup and choose a minimal installation.

I will use the built-in dietpi user to run the archivebox service.

Storage

I will store the data on an external USB disk, as it is very easy to outgrow the SD card.

Install btrfs-progs.

$ sudo apt install btrfs-progs

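Format the external USB disk. The -f flag overwrites any existing filesystem, so confirm that /dev/sda is the right device first.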

$ sudo mkfs.btrfs -f /dev/sda

Ensure that it will be mounted at boot.

$ cat <<EOF | sudo tee -a /etc/fstab
/dev/sda /mnt/dietpi_userdata btrfs compress 0 0
EOF
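
Device names such as /dev/sda can change between boots when more than one USB disk is attached, so a UUID-based entry may be more robust. A minimal sketch, assuming a hypothetical helper named fstab_line; the UUID itself would come from blkid on the real device:

```shell
# Hypothetical helper: print an fstab entry that identifies the
# filesystem by UUID instead of by device name.
# Usage: fstab_line <uuid> <mount point>
fstab_line() {
    printf 'UUID=%s %s btrfs compress 0 0\n' "$1" "$2"
}

# On the Raspberry Pi itself (not executed here):
#   uuid=$(sudo blkid -s UUID -o value /dev/sda)
#   fstab_line "$uuid" /mnt/dietpi_userdata | sudo tee -a /etc/fstab
```

The mount options are kept identical to the device-name entry, so only the first field differs.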

Mount it.

$ sudo mount -a
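
It may be worth confirming that the disk is really mounted before writing to it, since otherwise the data would end up on the SD card. A minimal sketch, assuming a hypothetical helper named is_mounted that scans the kernel mount table:

```shell
# Hypothetical helper: succeed if <mount point> is listed in a mounts table.
# The table defaults to /proc/mounts; a second argument overrides it.
# Usage: is_mounted <mount point> [mounts file]
is_mounted() {
    awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# On the Raspberry Pi itself (not executed here):
#   is_mounted /mnt/dietpi_userdata && echo "USB disk is mounted"
```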

Fix permissions.

$ sudo chown dietpi:dietpi /mnt/dietpi_userdata

Install NPM packages

Install prerequisites.

$ sudo apt install nodejs npm git

Create a directory for npm packages.

$ mkdir /mnt/dietpi_userdata/.npm-global/

Configure npm to use the created directory.

$ npm config set prefix '/mnt/dietpi_userdata/.npm-global/'

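Ensure that executable files will be available in PATH (also for Python packages). The single quotes keep $PATH from being expanded when the line is written, so it is evaluated at each login instead.

$ echo 'export PATH=/mnt/dietpi_userdata/.npm-global/bin:/home/dietpi/.local/bin:$PATH' | tee -a ~/.profile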

Update PATH in the current shell.

$ source ~/.profile
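
A quick way to confirm that both directories ended up in PATH is sketched below. The helper name path_contains is hypothetical; it only compares strings and does not check that the directories exist.

```shell
# Hypothetical helper: succeed if <dir> is one of the colon-separated
# entries of <path value> (defaults to the current $PATH).
# Usage: path_contains <dir> [path value]
path_contains() {
    case ":${2:-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether each directory from ~/.profile is on PATH.
for dir in /mnt/dietpi_userdata/.npm-global/bin /home/dietpi/.local/bin; do
    if path_contains "$dir"; then
        echo "ok: $dir"
    else
        echo "missing: $dir"
    fi
done
```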

Install required packages.

$ npm install -g git+https://github.com/postlight/mercury-parser.git
$ npm install -g git+https://github.com/gildas-lormeau/SingleFile.git
$ npm install -g git+https://github.com/pirate/readability-extractor.git

You will see warnings about missing peer dependencies.

npm WARN jsdom@16.5.3 requires a peer of canvas@^2.5.0 but none is installed. You must install peer dependencies yourself.
npm WARN ws@7.4.5 requires a peer of bufferutil@^4.0.1 but none is installed. You must install peer dependencies yourself.
npm WARN ws@7.4.5 requires a peer of utf-8-validate@^5.0.2 but none is installed. You must install peer dependencies yourself.

The canvas package needs additional system libraries before it can be built and installed.

$ sudo apt-get install build-essential libcairo2-dev libpango1.0-dev libjpeg-dev libgif-dev librsvg2-dev

Install these peer dependencies.

$ npm install -g utf-8-validate@5.0.2
$ npm install -g bufferutil@4.0.1
$ npm install -g canvas@2.5.0

Do not worry if you missed these warnings, as missing peer dependencies are also reported when you list installed packages.

$ npm list -g
[...]
npm ERR! peer dep missing: canvas@^2.5.0, required by jsdom@16.5.3
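
To pull just the package names out of that listing, the output can be filtered. This is a minimal sketch, assuming the "peer dep missing" message format shown above (other npm versions may word it differently); the helper name missing_peer_deps is hypothetical.

```shell
# Hypothetical helper: read `npm list -g` output on stdin and print
# each missing peer dependency (name@range) once, one per line.
missing_peer_deps() {
    sed -n 's/^npm ERR! peer dep missing: \([^,]*\),.*$/\1/p' | sort -u
}

# On the Raspberry Pi itself (not executed here):
#   npm list -g 2>&1 | missing_peer_deps
```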

Install ArchiveBox

Create a dedicated directory for ArchiveBox.

$ mkdir /mnt/dietpi_userdata/archivebox

Install package dependencies.

$ sudo apt install python3 python3-pip python3-distutils wget curl youtube-dl ffmpeg ripgrep lsb-release chromium-browser

Install the archivebox Python package.

$ pip3 install --user archivebox

Configure

Change working directory to the base of all operations.

$ cd /mnt/dietpi_userdata/archivebox/

Initialize ArchiveBox.

$ archivebox init

Do not submit pages to archive.org, at least for now.

$ archivebox config --set SAVE_ARCHIVE_DOT_ORG=False

Define the browser binary.

$ archivebox config --set CHROME_BINARY=chromium-browser

Increase the timeout (in seconds), as the Raspberry Pi 3 sometimes needs more time to process a page.

$ archivebox config --set TIMEOUT=1200

Increase the screenshot resolution (width,height in pixels).

$ archivebox config --set RESOLUTION=1440,4320
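
The settings above can also be applied in one pass, which makes the configuration easier to repeat on a fresh install. A minimal sketch, assuming a hypothetical helper named apply_settings; it is shown here as a dry run with echo standing in for the real command:

```shell
# Hypothetical helper: run "<command> config --set KEY=VALUE" once per
# setting and stop at the first failure.
# Usage: apply_settings <command> KEY=VALUE...
apply_settings() {
    cmd=$1
    shift
    for setting in "$@"; do
        "$cmd" config --set "$setting" || return 1
    done
}

# Dry run: with echo as the command, each call is printed, not executed.
apply_settings echo \
    SAVE_ARCHIVE_DOT_ORG=False \
    CHROME_BINARY=chromium-browser \
    TIMEOUT=1200 \
    RESOLUTION=1440,4320
```

Replacing echo with archivebox, while inside /mnt/dietpi_userdata/archivebox/, would apply the values for real.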

Create a superuser for the web interface.

$ archivebox manage createsuperuser

Create a systemd service.

$ cat <<EOF | sudo tee /etc/systemd/system/archivebox.service
[Unit]
Description=ArchiveBox (DietPi)

[Service]
User=dietpi
WorkingDirectory=/mnt/dietpi_userdata/archivebox/
Environment=PATH=/mnt/dietpi_userdata/.npm-global/bin:/home/dietpi/.local/bin:/usr/local/bin:/usr/bin:/bin
ExecStart=/home/dietpi/.local/bin/archivebox server --quick-init 0.0.0.0:8000

[Install]
WantedBy=multi-user.target
EOF
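
Because the data lives on the USB disk, the service may start before that disk is mounted. One possible safeguard is a drop-in with systemd's RequiresMountsFor= directive. This is a sketch only: the helper name write_mount_dropin and the file name 10-mount.conf are hypothetical.

```shell
# Hypothetical helper: write a drop-in that makes the unit wait for
# the mount point holding its data.
# Usage: write_mount_dropin <drop-in directory> <mount point>
write_mount_dropin() {
    mkdir -p "$1" || return 1
    cat > "$1/10-mount.conf" <<EOF
[Unit]
RequiresMountsFor=$2
EOF
}

# On the Raspberry Pi itself (run as root, not executed here):
#   write_mount_dropin /etc/systemd/system/archivebox.service.d /mnt/dietpi_userdata
```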

Reload systemd configuration.

$ sudo systemctl daemon-reload

Enable and start the archivebox service.

$ sudo systemctl enable --now archivebox
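
On a Raspberry Pi 3 the web interface can take a while to come up after the service starts. A minimal sketch of a retry loop that could be used to wait for it; the helper name retry is hypothetical, and the curl call in the comment assumes the port from the unit file above.

```shell
# Hypothetical helper: run a command up to <tries> times, one second
# apart, and succeed as soon as the command succeeds.
# Usage: retry <tries> <command> [args...]
retry() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}

# On the Raspberry Pi itself (not executed here):
#   retry 30 curl -fsS -o /dev/null http://127.0.0.1:8000/ && echo "ArchiveBox is up"
```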

Archive the first website.

$ archivebox add 'https://sleeplessbeastie.eu'
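
ArchiveBox also accepts URLs on standard input, so a whole list can be added at once. A minimal sketch that tidies such a list first; the helper name clean_urls and the file name urls.txt are hypothetical.

```shell
# Hypothetical helper: read a URL list on stdin, drop blank lines and
# "#" comments, and print each remaining line once (sorted).
clean_urls() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' | sort -u
}

# On the Raspberry Pi itself (not executed here):
#   cd /mnt/dietpi_userdata/archivebox/
#   clean_urls < urls.txt | archivebox add
```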

List archived websites.

$ archivebox list

Display help.

$ archivebox help

Additional notes

You can ignore the following message.

dpkg-query: no packages found matching bluealsa

Create a dummy package if the message bothers you.

You will see the following message when the installed npm is not compatible with the current Node.js version.

npm WARN npm npm does not support Node.js v10.24.0
npm WARN npm You should probably upgrade to a newer version of node as we
npm WARN npm can't make any promises that npm will work with this version.
npm WARN npm Supported releases of Node.js are the latest release of 4, 6, 7, 8, 9.
npm WARN npm You can find the latest version at https://nodejs.org/

In this specific case, everything will work as expected despite this warning. Update npm to the latest version to get rid of this message.

$ npm install -g npm@latest