How to delete all Glacier data?
I was using a tool on Mac OS X called Arq to back up my data, but I found it too hard to upload everything, since I don't have (and can't get) an internet connection that is fast enough for it.
So I decided to delete all my backups, but whenever I try from the software itself, it does nothing.
I also tried FastGlacier on my other Windows machine, but it hangs and uses too many resources.
I was wondering if there is an easy way to do this.
P.S. My Glacier has ~450 GB in 341907 archives.
The purge-vault from this project works nicely: https://github.com/vsespb/mt-aws-glacier
Install, then run these commands (replace vault-name with the name of your vault):
mtglacier retrieve-inventory --config glacier.cfg --vault vault-name
Wait about 2 hours, then run:
mtglacier download-inventory --config glacier.cfg --vault vault-name --new-journal vault-name.log
mtglacier purge-vault --config glacier.cfg --vault vault-name --journal vault-name.log
https://github.com/leeroybrun/glacier-vault-remove was created for this exact purpose.
To remove a vault, first install the dependencies:
$ git clone https://github.com/leeroybrun/glacier-vault-remove.git
$ cd glacier-vault-remove
$ python setup.py install
Then create a credentials file, credentials.json, in the same directory:
{
"AWSAccessKeyId": "YOURACCESSKEY",
"AWSSecretKey": "YOURSECRETKEY"
}
Then run the script like this:
$ python removeVault.py REGION-NAME VAULT-NAME
Example:
$ python removeVault.py us-east-1 my_vault
If you remove a Glacier-backed folder in Arq it goes into Arq's trash. If you select it in Arq's trash and click "Delete Permanently", Arq will delete all the Glacier archives and attempt to delete the Glacier vault. The vault delete might fail because Amazon has to update its "inventory", which it does once/day. The next day, browse under "Other Backup Sets" in Arq, find that vault, select it and click "Delete" to delete it.
If you have a vault that's not associated with any Arq backups, pick "Legacy Glacier Vaults" from Arq's menu, select the vault, and click the button to delete.
You can use a freeware product like CloudBerry Explorer http://www.cloudberrylab.com/free
Note: Glacier data doesn't become available immediately. You need to wait 24 hours for the global inventory to occur on the Amazon side; then click the Get Inventory button and wait another 5 hours to get the inventory for your account.
Thanks
How to delete Vault (AWS Glacier)
This Gist gives some tips on how to remove an AWS Glacier vault with the AWS CLI (i.e. https://aws.amazon.com/en/cli/).
Step 1 / Retrieve inventory
$ aws glacier initiate-job --job-parameters "{\"Type\": \"inventory-retrieval\"}" --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION
Wait 3 to 5 hours… :-(
For the next step you need the JobId. Once the inventory retrieval has finished, you can get it with the following command:
$ aws glacier list-jobs --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION
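If you would rather not read the list-jobs output by eye, the JobId can be picked out with a few lines of Python. This is only a sketch: it assumes the command's JSON output has a JobList array whose entries carry JobId, Action, Completed and StatusCode fields. The function name and the sample data are made up for illustration.

```python
import json


def find_completed_inventory_job(list_jobs_output):
    """Return the JobId of a finished inventory-retrieval job, or None."""
    jobs = json.loads(list_jobs_output).get("JobList", [])
    for job in jobs:
        if (job.get("Action") == "InventoryRetrieval"
                and job.get("Completed")
                and job.get("StatusCode") == "Succeeded"):
            return job["JobId"]
    return None


# Made-up sample shaped like the assumed list-jobs output (not real job IDs).
sample = json.dumps({
    "JobList": [
        {"JobId": "job-archive", "Action": "ArchiveRetrieval",
         "Completed": True, "StatusCode": "Succeeded"},
        {"JobId": "job-pending", "Action": "InventoryRetrieval",
         "Completed": False, "StatusCode": "InProgress"},
        {"JobId": "job-done", "Action": "InventoryRetrieval",
         "Completed": True, "StatusCode": "Succeeded"},
    ]
})

print(find_completed_inventory_job(sample))
```

In practice you would pass in the text printed by the list-jobs command instead of the sample. If the function returns None, the inventory job is most likely still running and you need to wait longer.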
Step 2 / Get the ArchivesIds
$ aws glacier get-job-output --job-id YOUR_JOB_ID --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION ./output.json
See: Downloading a Vault Inventory in Amazon Glacier
You can get all the ArchiveIds from the ./output.json file.
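To pull the ArchiveIds out of the inventory and see how much there is to delete before starting, something like this should work. It is a sketch that assumes the inventory has an ArchiveList array whose entries have ArchiveId and Size (in bytes) fields; summarize_inventory and the sample data are hypothetical.

```python
import json


def summarize_inventory(inventory):
    """Return (list of archive IDs, total size in bytes) for a parsed inventory."""
    archives = inventory.get("ArchiveList", [])
    archive_ids = [archive["ArchiveId"] for archive in archives]
    total_bytes = sum(archive.get("Size", 0) for archive in archives)
    return archive_ids, total_bytes


# Made-up inventory with two archives; a real one comes from ./output.json.
sample_inventory = json.loads("""
{
  "VaultARN": "arn:aws:glacier:us-east-1:000000000000:vaults/my_vault",
  "ArchiveList": [
    {"ArchiveId": "id-1", "Size": 1024},
    {"ArchiveId": "-id-2", "Size": 2048}
  ]
}
""")

ids, total = summarize_inventory(sample_inventory)
print(len(ids), "archives,", total, "bytes")
```

For the real file, replace the sample with json.load() on an open handle to ./output.json. That loads the whole inventory into memory at once, which may be slow for a vault with several hundred thousand archives.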
Step 3 / Delete Archives
PowerShell
from @vinyar
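$input_file_name = 'output.json'
$vault_name = 'my_vault'
# $account_id = 'AFDKFKEKF9EKALD' # not used; "-" is passed as the account ID instead
# Read the whole file as a single string so multi-line JSON parses correctly
$a = Get-Content -Raw $input_file_name | ConvertFrom-Json
$a.ArchiveList.ArchiveId | ForEach-Object {
    Write-Output "executing: aws glacier delete-archive --archive-id=$_ --vault-name $vault_name --account-id -"
    # Quoting the argument makes sure $_ is expanded before it reaches the AWS CLI
    aws glacier delete-archive "--archive-id=$_" --vault-name $vault_name --account-id -
}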
Python
from @robweber
This version uses ijson, which reads the file as a stream instead of loading it all into memory. You can install it with pip.
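import subprocess

import ijson  # third-party: pip install ijson

input_file_name = 'output.json'
vault_name = ''  # set to your vault name
account_id = ''  # set to your account ID

# Open in binary mode and stream the archive entries one at a time
with open(input_file_name, 'rb') as f:
    for archive in ijson.items(f, 'ArchiveList.item'):
        print("Deleting archive " + archive['ArchiveId'])
        # Assumption: the source line was truncated after "--acc";
        # --account-id is used here to match the other snippets.
        command = ("aws glacier delete-archive --archive-id='" + archive['ArchiveId']
                   + "' --vault-name " + vault_name + " --account-id " + account_id)
        subprocess.run(command, shell=True, check=True)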
PHP
from @Remiii
<?php
$file = './output.json' ;
$accountId = 'YOUR_ACCOUNT_ID' ;
$region = 'YOUR_REGION' ;
$vaultName = 'YOUR_VAULT_NAME' ;
$string = file_get_contents ( $file ) ;
$json = json_decode ( $string , true ) ;
foreach ( $json [ 'ArchiveList' ] as $jsonArchives )
{
    echo 'Delete Archive: ' . $jsonArchives [ 'ArchiveId' ] . "\n" ;
    $output = array ( ) ; // exec() appends to $output, so reset it for each archive
    exec ( 'aws glacier delete-archive --archive-id="' . $jsonArchives [ 'ArchiveId' ] . '" --vault-name ' . $vaultName . ' --account-id ' . $accountId . ' --region ' . $region , $output ) ;
    echo implode ( "\n" , $output ) . "\n" ; // $output is an array of lines, so join it before printing
}
Note: After you delete an archive, if you immediately download the vault inventory, it might still include the deleted archive in the list, because Amazon Glacier prepares the vault inventory only about once a day.
See: Deleting an Archive in Amazon Glacier
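With hundreds of thousands of archives, the delete loop will probably get interrupted at some point, and a fresh inventory may still list archives that are already gone. One way around that, sketched here under assumptions, is to keep your own record of deleted IDs and skip them on the next run. The helper names and the idea of a local log are hypothetical, not part of the gist; the command mirrors the delete-archive calls above, with "-" as the account ID as in the PowerShell version.

```python
def pending_archive_ids(inventory_ids, deleted_ids):
    """Return the inventory IDs that are not yet recorded as deleted, in order."""
    done = set(deleted_ids)
    return [archive_id for archive_id in inventory_ids if archive_id not in done]


def delete_command(archive_id, vault_name, account_id="-"):
    """Build the argument list for one delete-archive call.

    The ID is attached with "=" as in the snippets above, so an ID that
    starts with "-" stays part of the same argument.
    """
    return [
        "aws", "glacier", "delete-archive",
        "--archive-id=" + archive_id,
        "--vault-name", vault_name,
        "--account-id", account_id,
    ]


# Made-up IDs: "b" was deleted in an earlier run, so only "a" and "-c" remain.
todo = pending_archive_ids(["a", "b", "-c"], ["b"])
for archive_id in todo:
    print(" ".join(delete_command(archive_id, "my_vault")))
```

To actually delete, pass each list to subprocess.run(cmd, check=True) and append the ID to a local log file only after the call succeeds. On the next run, read that log back in as deleted_ids.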
Step 4 / Delete a Vault
$ aws glacier delete-vault --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION
Gist originally by @Remiii
OK, so a few years ago I closed my account, and I reopened it a few months ago. Guess what: Amazon still has my 3 TB in my account, and I have been billed for it for the last few months.
So I came back to this question and found that:
- mt-aws-glacier is almost impossible to set up on the latest Ubuntu. I then went to 12.04, where awscli is not available, and then to 14.04, where I got an error about my signature...
- The Arq answer is no longer relevant in Arq 5.
- Then I found the above gist and copied it here because it is better for the community.
- I tried CloudBerry and it looks like it should work. I will update here in 4~10 hours.