How to delete all glacier data?

I was using a tool on Mac OS X called Arq to back up my data, but I found it very hard to upload everything because I don't have, and can't get, a fast enough internet connection.

So I decided to delete all my backups, but whenever I try from within the software, nothing happens.

I also tried FastGlacier on my other Windows machine, but it hangs and uses too many resources.

I was wondering if there is an easy way to do this.

P.S. My Glacier vault holds ~450 GB in 341907 archives.


The purge-vault from this project works nicely: https://github.com/vsespb/mt-aws-glacier

Install, then run these commands (replace vault-name with the name of your vault):

mtglacier retrieve-inventory --config glacier.cfg --vault vault-name

Wait about 2 hours, then run:

mtglacier download-inventory --config glacier.cfg --vault vault-name --new-journal vault-name.log
mtglacier purge-vault --config glacier.cfg --vault vault-name --journal vault-name.log
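
If you want to script the sequence above, here is a minimal sketch that only builds the three command lines. It assumes mtglacier is on your PATH and that glacier.cfg is in the working directory; the function name mtglacier_commands is hypothetical, and the flags are just the ones shown above.

```python
def mtglacier_commands(vault, config="glacier.cfg"):
    """Return the three mtglacier invocations as argument lists, in order."""
    journal = vault + ".log"
    common = ["--config", config, "--vault", vault]
    return [
        ["mtglacier", "retrieve-inventory"] + common,
        # Run the next two only after the inventory job has finished.
        ["mtglacier", "download-inventory"] + common + ["--new-journal", journal],
        ["mtglacier", "purge-vault"] + common + ["--journal", journal],
    ]
```

Each list can be passed to subprocess.run(cmd, check=True); the wait between the first and second command still has to happen.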

https://github.com/leeroybrun/glacier-vault-remove was created for this exact purpose.

To remove a vault, first install the dependencies:

$ git clone https://github.com/leeroybrun/glacier-vault-remove.git
$ cd glacier-vault-remove
$ python setup.py install

Then create a credentials file, credentials.json in the same directory:

{
  "AWSAccessKeyId": "YOURACCESSKEY",
  "AWSSecretKey":   "YOURSECRETKEY"
}

Then run the script like this:

$ python removeVault.py REGION-NAME VAULT-NAME

Example:

$ python removeVault.py us-east-1 my_vault

If you remove a Glacier-backed folder in Arq it goes into Arq's trash. If you select it in Arq's trash and click "Delete Permanently", Arq will delete all the Glacier archives and attempt to delete the Glacier vault. The vault delete might fail because Amazon has to update its "inventory", which it does once/day. The next day, browse under "Other Backup Sets" in Arq, find that vault, select it and click "Delete" to delete it.

If you have a vault that's not associated with any Arq backups, pick "Legacy Glacier Vaults" from Arq's menu, select the vault, and click the button to delete.


You can use a freeware product like CloudBerry Explorer: http://www.cloudberrylab.com/free

Note that Glacier data doesn't become available immediately. You need to wait 24 hours for the global inventory to run on the Amazon side, then click the Get Inventory button and wait another 5 hours to get the inventory for your account.

Thanks


How to delete Vault (AWS Glacier)

This gist gives some tips for removing an AWS Glacier vault with the AWS CLI (i.e. https://aws.amazon.com/en/cli/).

Step 1 / Retrieve inventory

$ aws glacier initiate-job --job-parameters "{\"Type\": \"inventory-retrieval\"}" --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION
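
The backslash escaping in --job-parameters is easy to get wrong. If you drive the CLI from Python, a rough sketch like the following lets json.dumps produce the JSON text for you. The function name initiate_inventory_command is hypothetical, and "-" as the account ID means the account that owns the current credentials.

```python
import json


def initiate_inventory_command(vault_name, region, account_id="-"):
    """Build the `aws glacier initiate-job` call as an argument list."""
    # json.dumps produces the JSON text, so no manual escaping is needed.
    job_parameters = json.dumps({"Type": "inventory-retrieval"})
    return [
        "aws", "glacier", "initiate-job",
        "--job-parameters", job_parameters,
        "--vault-name", vault_name,
        "--account-id", account_id,
        "--region", region,
    ]
```

Passing the list to subprocess.run(cmd, check=True) avoids the shell entirely, so the quotes inside the JSON are not reinterpreted.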

Wait 3 to 5 hours… :-(

For the next step you need the JobId. Once the inventory retrieval is done, you can get it with the following command: aws glacier list-jobs --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION
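
Picking the right JobId out of the list-jobs output by eye gets tedious when there are several jobs. A small sketch, assuming you have already parsed the command's JSON output into a dict (for example with json.loads); the function name finished_inventory_job_id is hypothetical, and it relies on the JobList, Action, Completed, StatusCode and CompletionDate fields of that output.

```python
def finished_inventory_job_id(list_jobs_output):
    """Return the JobId of the newest succeeded inventory job, or None."""
    done = [
        job for job in list_jobs_output.get("JobList", [])
        if job.get("Action") == "InventoryRetrieval"
        and job.get("Completed")
        and job.get("StatusCode") == "Succeeded"
    ]
    if not done:
        return None
    # CompletionDate is an ISO 8601 UTC string, so string order is time order.
    return max(done, key=lambda job: job.get("CompletionDate", ""))["JobId"]
```

If it returns None, the inventory job is still running (or failed), so wait and run list-jobs again.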

Step 2 / Get the ArchiveIds

$ aws glacier get-job-output --job-id YOUR_JOB_ID --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION ./output.json

See: Downloading a Vault Inventory in Amazon Glacier

All the ArchiveId values are in the ./output.json file.
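
Before deleting anything, it can be worth checking that the inventory matches what you expect (for instance the archive count and total size). A minimal sketch, assuming the file has the usual ArchiveList array with a Size field in bytes per entry; the function name summarize_inventory is hypothetical. It loads the whole file into memory, so for a very large inventory a streaming parser is the better choice.

```python
import json


def summarize_inventory(path):
    """Return (archive_count, total_bytes) for a downloaded inventory file."""
    with open(path, "r", encoding="utf-8") as f:
        inventory = json.load(f)
    archives = inventory.get("ArchiveList", [])
    # Entries without a Size field are counted as 0 bytes.
    total_bytes = sum(archive.get("Size", 0) for archive in archives)
    return len(archives), total_bytes
```

For example, count, size = summarize_inventory("./output.json") gives you the numbers to compare against what the console shows for the vault.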

Step 3 / Delete Archives

Powershell

from @vinyar

$input_file_name = 'output.json'
$vault_name = 'my_vault'
# $account_id = 'AFDKFKEKF9EKALD' # not used; "-" means the account of the current credentials

# -Raw reads the file as a single string, so multi-line JSON also parses
$a = Get-Content -Raw $input_file_name | ConvertFrom-Json

$a.ArchiveList.ArchiveId | ForEach-Object {
    Write-Output "executing: aws glacier delete-archive --archive-id=$_ --vault-name $vault_name --account-id -"
    aws glacier delete-archive --archive-id=$_ --vault-name $vault_name --account-id -
}

Python

from @robweber

This version uses ijson, which reads the file as a stream instead of loading it all into memory. You can install it with pip.

import subprocess

import ijson

input_file_name = 'output.json'
vault_name = ''
account_id = ''  # '-' means the account that owns the current credentials

# Stream the inventory so a large file is never fully loaded into memory.
with open(input_file_name, 'rb') as f:
    for archive in ijson.items(f, 'ArchiveList.item'):
        print("Deleting archive " + archive['ArchiveId'])
        # Argument list (no shell); --archive-id=VALUE also handles IDs that start with '-'.
        subprocess.run(
            ['aws', 'glacier', 'delete-archive',
             '--archive-id=' + archive['ArchiveId'],
             '--vault-name', vault_name,
             '--account-id', account_id],
            check=True)

PHP

from @Remiii

<?php

$file = './output.json' ;
$accountId = 'YOUR_ACCOUNT_ID' ;
$region = 'YOUR_REGION' ;
$vaultName = 'YOUR_VAULT_NAME' ;

$string = file_get_contents ( $file ) ;
$json = json_decode($string, true ) ;
foreach ( $json [ 'ArchiveList' ] as $jsonArchives )
{
    echo 'Delete Archive: ' . $jsonArchives [ 'ArchiveId' ] . "\n" ;
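    $output = array ( ) ; // exec() appends to $output, so reset it for each archive
    exec ( 'aws glacier delete-archive --archive-id=' . escapeshellarg ( $jsonArchives [ 'ArchiveId' ] ) . ' --vault-name ' . $vaultName . ' --account-id ' . $accountId . ' --region ' . $region , $output ) ;
    echo implode ( "\n" , $output ) . "\n" ; // $output is an array of lines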
}

Note: After you delete an archive, if you immediately download the vault inventory, it might still include the deleted archive in the list, because Amazon Glacier prepares the vault inventory only about once a day.

See: Deleting an Archive in Amazon Glacier
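
Because the inventory can lag behind, a re-run after an interruption will try to delete archives that are already gone. One way around that is to keep your own log of deleted IDs. This is only a sketch: both function names and the log file are hypothetical, and it assumes you call record_deleted after each successful delete-archive.

```python
def remaining_archive_ids(inventory_ids, deleted_log_path):
    """Yield IDs from the inventory that are not yet recorded as deleted."""
    try:
        with open(deleted_log_path, "r", encoding="utf-8") as f:
            already_deleted = {line.strip() for line in f if line.strip()}
    except FileNotFoundError:
        # No log yet: nothing has been deleted by a previous run.
        already_deleted = set()
    for archive_id in inventory_ids:
        if archive_id not in already_deleted:
            yield archive_id


def record_deleted(archive_id, deleted_log_path):
    """Append one successfully deleted ID to the log file."""
    with open(deleted_log_path, "a", encoding="utf-8") as f:
        f.write(archive_id + "\n")
```

With several hundred thousand archives this saves a lot of time when the script has to be restarted.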

Step 4 / Delete a Vault

$ aws glacier delete-vault --vault-name YOUR_VAULT_NAME --account-id YOUR_ACCOUNT_ID --region YOUR_REGION
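
delete-vault fails while the vault's last inventory still lists archives, so it can help to check first. A rough sketch, assuming you pass in the parsed JSON output of aws glacier describe-vault (which reports NumberOfArchives and LastInventoryDate); the function name vault_looks_empty is hypothetical.

```python
def vault_looks_empty(describe_vault_output):
    """True if the last inventory recorded zero archives in the vault."""
    # Without a LastInventoryDate there has been no inventory yet, so the
    # archive count is treated as unknown rather than as zero.
    if not describe_vault_output.get("LastInventoryDate"):
        return False
    return describe_vault_output.get("NumberOfArchives") == 0
```

If it returns False right after the deletions, wait for the next daily inventory and check again before retrying delete-vault.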

Gist originally by @Remiii

OK, so a few years ago I closed my account and reopened it a few months ago, and guess what: Amazon still has my 3 TB on my account, and I have been billed for it for the last few months.

So I came back to this question and found that:

  • mt-aws-glacier is almost impossible to set up on the latest Ubuntu. I then tried 12.04, where awscli is not available, and then 14.04, where I got an error about my signature...
  • The Arq answer is no longer relevant in Arq 5.
  • Then I found the above gist and copied it here because it is better for the community.
  • I tried CloudBerry and it looks like it should work; I will update here in 4 to 10 hours.