
Data archiving with AWS S3 Glacier storage class


In this tutorial I’m going to use an AWS S3 bucket to archive data and set the storage class to S3 Glacier. This approach is a lot easier than using an AWS Glacier vault.

Prerequisites
#

IAM User & Access Keys
#

AWS Console Link: https://us-east-1.console.aws.amazon.com/iamv2/

  • Create an IAM User and Access Keys for the User
  • Add IAM Permission to user

Necessary IAM Permission: AmazonS3FullAccess

Install AWS CLIv2
#

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"  # Download the installer
sudo apt install unzip  # Install the unzip tool
unzip awscliv2.zip  # Unpack the installer
sudo ./aws/install  # Install AWS CLIv2
/usr/local/bin/aws --version  # Check installation / version
aws configure  # Add access keys and default region

The configuration should look like this. The default region name defines where the bucket will be created:

AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: eu-west-1
Default output format [None]:
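
To confirm that the keys and region were stored correctly, a small check along the following lines may help. It is only a sketch: the function name `check_aws_setup` is my own and not part of the AWS CLI, while `aws configure get` and `aws sts get-caller-identity` are existing CLI commands.

```shell
# Hypothetical helper (the name is an assumption, not an AWS CLI command).
check_aws_setup() {
    # Print the default region stored by "aws configure".
    echo "Region: $(aws configure get region)"
    # Ask AWS which IAM identity the configured access keys belong to.
    aws sts get-caller-identity --query 'Arn' --output text
}

# Usage: check_aws_setup
```

If the second line prints the ARN of the IAM user created above, the CLI is using the intended credentials.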

AWS S3 Glacier
#

Storage Classes Overview
#

Here are what I consider the most useful storage classes:

GLACIER = S3 Glacier Flexible Retrieval
DEEP_ARCHIVE = S3 Glacier Deep Archive


Create S3 Bucket
#

Create new S3 Bucket to upload the files:
aws s3 mb s3://jklug.work-glacier

Shell output:

make_bucket: jklug.work-glacier

Optional commands:

aws s3 ls  # List available S3 Buckets
aws s3 rb --force s3://jklug.work-glacier  # Remove S3 Bucket

Upload File
#

Example: S3 Glacier Flexible Retrieval

Copy file to S3 Bucket and define the storage class as “S3 Glacier Flexible Retrieval”:

aws s3 cp --storage-class GLACIER \
/path/to/file \
s3://jklug.work-glacier/

Example: S3 Glacier Deep Archive

Copy file to S3 Bucket and define the storage class as “S3 Glacier Deep Archive”:

aws s3 cp --storage-class DEEP_ARCHIVE \
/path/to/file \
s3://jklug.work-glacier/

Shell output:

upload: ./jklug.work_data1 to s3://jklug.work-glacier/jklug.work_data1
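
When a whole directory needs to be archived, it is common to bundle it into a single archive file first, so that only one object ends up in the Glacier storage class. The following is a minimal sketch under that assumption; the function name `archive_dir`, the temporary archive location and the date-based file name are my own choices, not something prescribed by AWS.

```shell
# Hypothetical helper: bundle a directory into one tar.gz file and
# upload it to a bucket with the given storage class.
archive_dir() {
    src_dir="$1"                 # directory to archive
    bucket="$2"                  # target bucket, e.g. jklug.work-glacier
    class="${3:-DEEP_ARCHIVE}"   # GLACIER or DEEP_ARCHIVE
    archive="/tmp/$(basename "$src_dir")-$(date +%Y%m%d).tar.gz"

    # Pack the directory; -C keeps the paths relative to its parent.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    # Upload the single archive file with the chosen storage class.
    aws s3 cp --storage-class "$class" "$archive" "s3://$bucket/"
}

# Usage: archive_dir /path/to/directory jklug.work-glacier DEEP_ARCHIVE
```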

List files
#

After the file is uploaded, use the following command to list the files in the S3 Bucket:
aws s3api list-objects --bucket jklug.work-glacier --query 'Contents[].{Key: Key, Size: Size, StorageClass: StorageClass}' --output table

Shell output:

--------------------------------------------------
|                   ListObjects                  |
+-------------------+-----------+----------------+
|        Key        |   Size    | StorageClass   |
+-------------------+-----------+----------------+
|  jklug.work_data1 |  10485760 |  DEEP_ARCHIVE  |
+-------------------+-----------+----------------+

The file is also visible in the AWS Webconsole.
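
In a bucket that mixes storage classes, the same `--query` mechanism can be used to show only the archived objects. This is a sketch rather than a fixed recipe: the function name `list_by_class` is hypothetical, and it uses `list-objects-v2`, the newer variant of the listing call used above.

```shell
# Hypothetical helper: print the keys of all objects in a bucket that
# are stored in the given storage class (e.g. GLACIER or DEEP_ARCHIVE).
list_by_class() {
    bucket="$1"
    class="$2"
    # The JMESPath filter keeps only entries with a matching StorageClass.
    aws s3api list-objects-v2 --bucket "$bucket" \
        --query "Contents[?StorageClass=='$class'].Key" \
        --output text
}

# Usage: list_by_class jklug.work-glacier DEEP_ARCHIVE
```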


Retrieve File
#

To download a file from a Glacier storage class, the file must first be restored. Depending on the storage class and the retrieval tier, this can take up to 48 hours.

Expedited Retrieval

Works only with “S3 Glacier Flexible Retrieval”:

aws s3api restore-object --bucket jklug.work-glacier \
--key jklug.work_data11 \
--restore-request Days=1,GlacierJobParameters={"Tier"="Expedited"}

Standard Retrieval

Works with both “S3 Glacier Flexible Retrieval” and “S3 Glacier Deep Archive”:

aws s3api restore-object --bucket jklug.work-glacier \
--key jklug.work_data1 \
--restore-request Days=1,GlacierJobParameters={"Tier"="Standard"}

AWS Webconsole

The retrieval process can also be started via the AWS Webconsole.

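Retrieval Parameters

The Days parameter: Specifies the number of days that the restored copy of the object stays available for download.

The Tier parameter: Specifies the retrieval option, which can be “Expedited”, “Standard”, or “Bulk”. Typical retrieval times:

Expedited: 1-5 min (S3 Glacier Flexible Retrieval only)
Standard: 3-5 hours (Flexible Retrieval), up to 12 hours (Deep Archive)
Bulk: 5-12 hours (Flexible Retrieval), up to 48 hours (Deep Archive); cheapest option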
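
To avoid sending a tier that the storage class does not support, the request can be assembled by a small helper first. The sketch below is an assumption on my part: `build_restore_request` is a hypothetical function name, and it prints the JSON form of `--restore-request`, which the CLI accepts as an alternative to the shorthand used in the examples above.

```shell
# Hypothetical helper: print the JSON value for --restore-request.
# Rejects unknown tiers and the Expedited tier for DEEP_ARCHIVE objects.
build_restore_request() {
    days="$1"               # number of days the restored copy stays available
    tier="$2"               # Expedited, Standard or Bulk
    class="${3:-GLACIER}"   # storage class of the object
    case "$tier" in
        Expedited|Standard|Bulk) ;;
        *) echo "unknown tier: $tier" >&2; return 1 ;;
    esac
    if [ "$class" = "DEEP_ARCHIVE" ] && [ "$tier" = "Expedited" ]; then
        echo "Expedited is not available for DEEP_ARCHIVE" >&2
        return 1
    fi
    printf '{"Days":%d,"GlacierJobParameters":{"Tier":"%s"}}\n' "$days" "$tier"
}

# Example call (bucket and key taken from this tutorial):
# aws s3api restore-object --bucket jklug.work-glacier \
#     --key jklug.work_data1 \
#     --restore-request "$(build_restore_request 3 Bulk DEEP_ARCHIVE)"
```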


Check Retrieval Status
#

aws s3api head-object --bucket jklug.work-glacier \
 --key jklug.work_data1

Shell output:

{
    "AcceptRanges": "bytes",
    "Restore": "ongoing-request=\"true\"",
    "LastModified": "2023-05-07T14:16:21+00:00",
    "ContentLength": 10485760,
    "ETag": "\"669fdad9e309b552f1e9cf7b489c1f73-2\"",
    "ContentType": "binary/octet-stream",
    "ServerSideEncryption": "AES256",
    "Metadata": {},
    "StorageClass": "DEEP_ARCHIVE"
}

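Wait until "Restore": "ongoing-request=\"true\"" changes to "ongoing-request=\"false\"". Once the restore has finished, the value also contains an expiry-date for the restored copy.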
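
Since a restore can take many hours, checking the status by hand gets tedious. A polling loop along these lines could do the waiting instead. Treat it as a sketch: the function name `wait_for_restore` and the default interval of 300 seconds are my own assumptions.

```shell
# Hypothetical helper: poll head-object until the restore has finished.
wait_for_restore() {
    bucket="$1"
    key="$2"
    interval="${3:-300}"   # seconds between checks (assumed default)
    while :; do
        # Read only the Restore field of the head-object response.
        status="$(aws s3api head-object --bucket "$bucket" --key "$key" \
            --query 'Restore' --output text)" || return 1
        case "$status" in
            *'ongoing-request="false"'*)
                echo "Restore finished: $status"
                return 0 ;;
        esac
        echo "Still restoring, checking again in ${interval}s"
        sleep "$interval"
    done
}

# Usage: wait_for_restore jklug.work-glacier jklug.work_data1
```

The loop only makes sense after a restore has been requested. For an object without a restore request the Restore field is absent, so the loop would never end.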


Download File
#

After the file is restored, it’s ready to download:
aws s3 cp s3://jklug.work-glacier/jklug.work_data1 /destination/path

Note that the file can also be downloaded via the AWS Webconsole.
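
For large archives it can be reassuring to compare the size of the downloaded file with the ContentLength that head-object reports. The following sketch does that; the function name `download_and_check` is hypothetical, and a size comparison is only a basic sanity check, not a checksum verification.

```shell
# Hypothetical helper: download an object and compare its size with
# the ContentLength reported by head-object.
download_and_check() {
    bucket="$1"
    key="$2"
    dest="$3"   # destination file path (a file, not a directory)
    aws s3 cp "s3://$bucket/$key" "$dest" || return 1
    expected="$(aws s3api head-object --bucket "$bucket" --key "$key" \
        --query 'ContentLength' --output text)" || return 1
    # wc -c prints the byte count; tr strips padding some systems add.
    actual="$(wc -c < "$dest" | tr -d ' ')"
    if [ "$expected" = "$actual" ]; then
        echo "OK: $actual bytes"
    else
        echo "Size mismatch: expected $expected, got $actual" >&2
        return 1
    fi
}

# Usage: download_and_check jklug.work-glacier jklug.work_data1 /destination/path/jklug.work_data1
```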