I had an S3 bucket holding about 12 TB of data and wanted to get rid of unnecessary objects. I implemented a simple lifecycle policy that transitions objects older than five years to the Glacier storage class.
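As a sketch, a rule like that can be expressed with `put-bucket-lifecycle-configuration`; the bucket name and rule ID below are placeholders, and 5 years is approximated as 1825 days:

```shell
# Hypothetical lifecycle rule: transition objects to GLACIER
# 1825 days (~5 years) after creation. Bucket name is a placeholder.
aws s3api put-bucket-lifecycle-configuration \
  --bucket your_bucket_name \
  --lifecycle-configuration '{
    "Rules": [{
      "ID": "archive-after-5-years",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Transitions": [{"Days": 1825, "StorageClass": "GLACIER"}]
    }]
  }'
```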
Everything was going well for about two days, until we noticed that some of the transitioned objects were still tied to production images, even though we hadn't touched them for five years. The decision was to revert the operation.
I stopped the lifecycle rule immediately to prevent further objects from transitioning into the Glacier storage class.
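Stopping the rule amounts to either re-putting the configuration with the rule's `Status` set to `Disabled`, or, if the bucket has no other rules, removing the lifecycle configuration entirely:

```shell
# Remove the bucket's lifecycle configuration so no further transitions
# are scheduled (bucket name is a placeholder; this deletes ALL rules).
aws s3api delete-bucket-lifecycle --bucket your_bucket_name
```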
I read through the AWS documentation (https://docs.aws.amazon.com/AmazonS3/latest/userguide/restoring-objects.html) and found that a few options are available: the AWS CLI and S3 Batch Operations.
I decided to go with S3 Batch Operations, since that seemed like the reasonable choice, but I did not have the manifest file. Therefore, I used the AWS CLI to get the list of Glacier objects.
I prepared a script to get the list of objects in the GLACIER storage class. However, since the list could contain a very large number of objects, it is better to run it in the background.
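The listing script could look roughly like this (the bucket name and file names are placeholders; the AWS CLI paginates `list-objects-v2` automatically, and the `--query` expression filters on the storage class):

```shell
#!/usr/bin/env bash
# create_manifest.sh (hypothetical name): list every object currently in
# the GLACIER storage class and write the keys to glacier_keys.txt.
BUCKET="your_bucket_name"   # placeholder

aws s3api list-objects-v2 \
  --bucket "$BUCKET" \
  --query 'Contents[?StorageClass==`GLACIER`].Key' \
  --output text | tr '\t' '\n' > glacier_keys.txt
```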
Two things to note:
Make the script executable:
chmod +x create_manifest.sh
Run it in the background and redirect its output to a log file:
./create_manifest.sh > manifest_log.txt 2>&1 &   # run in the background; the shell prints the process ID
disown   # detach the job from the shell so it survives logout
tail -f manifest_log.txt   # follow the progress in the log file
Once that finished, I prepared the manifest file in the CSV format that AWS suggests:
your_bucket_name,{object_key}
Make sure your object keys are URL-encoded, or you will have problems when running the batch jobs on keys containing spaces or special characters.
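A minimal sketch of that step, assuming the key list was saved to a file named `glacier_keys.txt` (both file names and the bucket name are made up for illustration). The encoder leaves `/` as-is, which matches the AWS manifest examples but is worth double-checking against your own keys:

```shell
#!/usr/bin/env bash
# urlencode: percent-encode an object key for the Batch Operations CSV
# manifest. Unreserved characters and '/' pass through unchanged.
urlencode() {
  local s="$1" out="" c i
  for ((i = 0; i < ${#s}; i++)); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9./_~-]) out+="$c" ;;
      *) printf -v c '%%%02X' "'$c"; out+="$c" ;;
    esac
  done
  printf '%s' "$out"
}

# Build manifest.csv (one "bucket,encoded-key" line per object) from the
# key list, if it exists. BUCKET is a placeholder.
BUCKET="your_bucket_name"
if [ -f glacier_keys.txt ]; then
  while IFS= read -r key; do
    printf '%s,%s\n' "$BUCKET" "$(urlencode "$key")"
  done < glacier_keys.txt > manifest.csv
fi
```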
Once you have the manifest file, all you need is to follow the steps in the documentation to restore the Glacier objects and copy them back to the STANDARD storage class.
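For reference, the restore job can be created with `aws s3control create-job`; this is only a sketch, and the account ID, ARNs, ETag, and role name are all placeholders you must replace with your own values. A second job using the `S3PutObjectCopy` operation then copies the restored objects back to STANDARD:

```shell
# Sketch of the S3 Batch Operations restore job (all IDs/ARNs below are
# placeholders). ExpirationInDays controls how long the restored copy
# stays available; BULK is the cheapest (and slowest) retrieval tier.
aws s3control create-job \
  --account-id 111122223333 \
  --operation '{"S3InitiateRestoreObject": {"ExpirationInDays": 7, "GlacierJobTier": "BULK"}}' \
  --manifest '{"Spec": {"Format": "S3BatchOperations_CSV_20180820", "Fields": ["Bucket", "Key"]}, "Location": {"ObjectArn": "arn:aws:s3:::your_bucket_name/manifest.csv", "ETag": "manifest-etag"}}' \
  --report '{"Bucket": "arn:aws:s3:::your_bucket_name", "Format": "Report_CSV_20180820", "Enabled": true, "ReportScope": "AllTasks", "Prefix": "batch-reports"}' \
  --priority 10 \
  --role-arn arn:aws:iam::111122223333:role/batch-operations-role
```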
This took me about 6 to 8 hours of back-and-forth debugging and checking, so I hope it can help someone.