Using tar, rclone, bash, the AWS CLI, and systemd services and timers to schedule a website backup to an S3 bucket

Using AWS CLI to create an S3 bucket and to attach IAM policies
Configuring rclone to access the AWS S3 bucket
Creating a bash script to back up the website folder and upload it to an S3 bucket
Creating a systemd service and timer to schedule website backup

Using AWS CLI to create an S3 bucket and to attach IAM policies

In this tutorial, we will back up the website to an archive file and copy the resulting backup to an AWS S3 bucket with rclone, every day at 3 AM, via a systemd timer and service.

We will use the AWS CLI to list existing buckets and to create a new bucket for the backup archive.
You can add temporary credentials to the file ~/.aws/credentials; make sure they allow creating a bucket and attaching IAM policies:

[default]
aws_access_key_id=xx
aws_secret_access_key=xx
aws_session_token=xxxx
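
To confirm the credentials are picked up, we can ask AWS which identity we are using (an optional quick check):

aws sts get-caller-identity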

After that, you can run commands, for example listing the existing buckets:

$ aws s3 ls
2024-11-20 20:05:58 cf-templates-1gphkm0byc8yz-eu-west-2
2024-11-18 20:18:16 elasticbeanstalk-eu-west-2-590184137142
2024-11-13 05:10:36 george-aws1-bucket-1
2024-11-20 22:23:03 s3bucket-yaml-mys3bucket-teepjyjfkdjb

Let’s create a bucket (bucket names have to be globally unique across AWS, as they are accessed via URLs, like websites):

$ aws s3api create-bucket --bucket georgetech-backup --region eu-west-2 --create-bucket-configuration LocationConstraint=eu-west-2
{
    "Location": "http://georgetech-backup.s3.amazonaws.com/"
}
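
We can quickly check that the new bucket is reachable with our credentials; head-bucket exits without an error if it is (an extra verification step):

aws s3api head-bucket --bucket georgetech-backup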

Run the following command to create a new IAM user with programmatic access:

$ aws iam create-user --user-name rclone-s3-bucket
{
    "User": {
        "Path": "/",
        "UserName": "rclone-s3-bucket",
        "UserId": "xx",
        "Arn": "arn:aws:iam::590184137142:user/rclone-s3-bucket",
        "CreateDate": "2025-03-30T12:43:16+00:00"
    }
}

Generate access keys (Access Key ID and Secret Access Key) for the user rclone-s3-bucket:

$ aws iam create-access-key --user-name rclone-s3-bucket
{
    "AccessKey": {
        "UserName": "rclone-s3-bucket",
        "AccessKeyId": "xxx",
        "Status": "Active",
        "SecretAccessKey": "t/xx+JHV9ptOY",
        "CreateDate": "2025-03-30T12:47:40+00:00"
    }
}

Next, we will create a policy that defines the permissions needed by rclone to access the georgetech-backup bucket and to list all buckets in the account (the JSON policy document is passed as a single quoted argument):

$ aws iam create-policy --policy-name RcloneS3BucketAccess --policy-document '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:ListAllMyBuckets",
            "Resource": "arn:aws:s3:::*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:CreateBucket"
            ],
            "Resource": [
                "arn:aws:s3:::georgetech-backup",
                "arn:aws:s3:::georgetech-backup/*"
            ]
        }
    ]
}'

The output of the command will be similar to the following:

{
    "Policy": {
        "PolicyName": "RcloneS3BucketAccess",
        "PolicyId": "xx",
        "Arn": "arn:aws:iam::account-id:policy/RcloneS3BucketAccess",
        "Path": "/",
        "DefaultVersionId": "v1",
        "AttachmentCount": 0,
        "PermissionsBoundaryUsageCount": 0,
        "IsAttachable": true,
        "CreateDate": "2025-03-30T12:50:49+00:00",
        "UpdateDate": "2025-03-30T12:50:49+00:00"
    }
}

We will attach the policy to the user rclone-s3-bucket. You need the <account-id>, which you can get from the AWS console (IAM > Users > rclone-s3-bucket) or from the user's ARN in the create-user output above:

$ aws iam attach-user-policy --user-name rclone-s3-bucket --policy-arn arn:aws:iam::<account-id>:policy/RcloneS3BucketAccess
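
To confirm the attachment, we can list the policies attached to the user:

aws iam list-attached-user-policies --user-name rclone-s3-bucket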

Configuring rclone to access the AWS S3 bucket

We will install rclone:

sudo -v ; curl https://rclone.org/install.sh | sudo bash
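
We can check that the installation worked:

rclone version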

Next, we will add the AccessKeyId and SecretAccessKey to rclone via rclone config:

$ rclone config
Configuration complete.
Options:
- type: s3
- provider: AWS
- access_key_id: xx
- secret_access_key: t/xx+xx
- region: eu-west-2
- location_constraint: eu-west-2
- acl: private

The corresponding ~/.config/rclone/rclone.conf file will look like this:

[s3]
type = s3
provider = AWS
access_key_id = xx
secret_access_key = xxx
region = eu-west-2
location_constraint = eu-west-2
acl = private
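
As an alternative to the interactive prompts, recent rclone versions can create the same remote non-interactively, which is convenient on servers; a sketch with the same values (substitute the real keys):

rclone config create s3 s3 provider=AWS access_key_id=xx secret_access_key=xxx region=eu-west-2 location_constraint=eu-west-2 acl=private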

We can list the files in the backup bucket:

$ rclone ls s3:georgetech-backup/
      548 .bashrc
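
We can also confirm write access by copying a small test file, for example ~/.bashrc (presumably how the entry listed above got there):

rclone copy ~/.bashrc s3:georgetech-backup/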

Creating a bash script to back up the website folder and upload it to an S3 bucket

Let’s create a bash script to automatically back up the Hugo site with tar and copy it to the AWS S3 bucket with rclone.
The backup_date variable holds the current date and time; like backuppath, it is used more than once in the script, so defining them as variables avoids mistyping.
We will also check for errors returned by tar and rclone and add log entries with logger.
The rclone --config /path/to/rclone.conf option is needed because systemd will run the commands as the root user, while the rclone configuration lives in the admin user's home directory. Don’t forget to make the file executable with chmod +x backup.sh:

#!/usr/bin/env bash

backup_date=$(date '+%d%m%Y_%H%M')
filename=hugo-site-$backup_date.tgz
backuppath=/home/admin/backups/
hugopath=/home/admin/quickstart/
rcloneconfigpath=/home/admin/.config/rclone/rclone.conf

tar czf "$backuppath$filename" "$hugopath"
if [ $? -eq 0 ]; then
        logger -p user.info "Hugo backup - $filename - successfully created."
else
        logger -p user.err "Hugo backup - $filename - creation failed."
fi

rclone --config "$rcloneconfigpath" copy "$backuppath$filename" s3:georgetech-backup/
if [ $? -eq 0 ]; then
        logger -p user.info "Hugo backup - $filename - successfully copied to s3 bucket."
else
        logger -p user.err "Hugo backup - $filename - copy to s3 bucket failed."
fi
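
Before wiring the script into systemd, we can run it once by hand to make sure it works (assuming it was saved as /home/admin/backups/backup.sh):

/home/admin/backups/backup.sh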

We can check the result in the systemd journal via journalctl -r, which lists the latest log entries first:

Mar 30 14:35:48 ip-172-31-34-177 admin[5520]: Hugo backup - hugo-site-30032025_1435.tgz - successfully copied to s3 bucket.
Mar 30 14:35:48 ip-172-31-34-177 admin[5512]: Hugo backup - hugo-site-30032025_1435.tgz - successfully created.

We can also use the AWS CLI to list files in the bucket:

$ aws s3 ls s3://georgetech-backup/
2025-03-30 15:27:08   11971349 hugo-site-30032025_1427.tgz

Creating a systemd service and timer to schedule website backup

We will create a systemd service to run the backup script, which will:

  • archive the /home/admin/quickstart/ folder and its files to a tar archive whose name includes the date and time
  • use the logger command to add a system log entry that the site has been archived and uploaded to the bucket; an error log entry will be added upon failure.

We can use the editor vim to create the file:

sudo vim /etc/systemd/system/hugo-site-backup.service

The contents of the systemd unit:

  • oneshot is used for a series of sequential tasks that terminate after running
  • we specify each task to run with ExecStart
  • we use bash -c to start the shell, and between quotes we use absolute paths (/home/admin/backups instead of ~/backups), because the unit will be run as a different user, root:

[Unit]
Description=Backup Hugo Site

[Service]
Type=oneshot
ExecStart=/bin/bash -c "/home/admin/backups/backup.sh"

We will create a timer unit named /etc/systemd/system/hugo-site-backup.timer, to run the service above every day at 3:00 AM:

[Unit]
Description=Backup Hugo Site

[Timer]
OnCalendar=*-*-* 03:00:00
Persistent=true

[Install]
WantedBy=timers.target
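
After creating or editing the unit files, systemd should reload them; we can also check that the OnCalendar expression triggers when we expect (both are optional verification steps):

sudo systemctl daemon-reload
systemd-analyze calendar "*-*-* 03:00:00"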

We will then enable the timer; it will only run when scheduled:

sudo systemctl enable hugo-site-backup.timer
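
Note that enable on its own only arms the timer for future boots; if it does not show as active yet, we can also start it right away (or use enable --now to do both in one step):

sudo systemctl start hugo-site-backup.timer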

We can get its status; it is waiting for the trigger (3 AM) to start the hugo-site-backup.service:

$ systemctl status hugo-site-backup.timer
● hugo-site-backup.timer - Backup Hugo Site
     Loaded: loaded (/etc/systemd/system/hugo-site-backup.timer; enabled; preset: enabled)
     Active: active (waiting) since Sun 2025-03-30 14:45:39 UTC; 11min ago
    Trigger: Mon 2025-03-31 03:00:00 UTC; 12h left
   Triggers: ● hugo-site-backup.service

If we want to manually start the associated service to test its functionality, we can do so:

sudo systemctl start hugo-site-backup.service

The journalctl -r command shows that the hugo-site-backup.service ran successfully:

Mar 30 14:52:31 host systemd[1]: Finished hugo-site-backup.service - Backup Hugo Site.
Mar 30 14:52:31 host systemd[1]: hugo-site-backup.service: Deactivated successfully.
Mar 30 14:52:31 host root[5806]: Hugo backup - hugo-site-30032025_1452.tgz - successfully copied to s3 bucket.
Mar 30 14:52:30 host root[5796]: Hugo backup - hugo-site-30032025_1452.tgz - successfully created.
Mar 30 14:52:30 host bash[5793]: tar: Removing leading `/' from member names
Mar 30 14:52:30 host systemd[1]: Starting hugo-site-backup.service - Backup Hugo Site...
Mar 30 14:52:30 host sudo[5787]: pam_unix(sudo:session): session opened for user root(uid=0) by admin(uid=1000)
Mar 30 14:52:30 host sudo[5787]:    admin : TTY=pts/0 ; PWD=/home/admin/backups ; USER=root ; COMMAND=/usr/bin/systemctl start hugo-site-backup.service

We can list the systemd timers; ours is in the list, waiting to be triggered at the scheduled time:

$ systemctl list-timers
NEXT                        LEFT          LAST                        PASSED       UNIT                         ACTIVATES
Mon 2025-03-31 02:19:03 BST 4h 39min left Sun 2025-03-30 07:05:51 BST 14h ago      apt-daily.timer              apt-daily.service
[..]
Mon 2025-03-31 03:00:00 BST 5h 20min left -                           -            hugo-site-backup.timer       hugo-site-backup.service
[..]