I was going through our AWS spending and found that we spend a lot in the Sydney region (?). In detail, most of the bill there is for stored snapshots. There are 7,508 of them, dating back to 2013, for volumes that no longer exist.
The only instances (and volumes) we have in Sydney today are:
* mref1.aps2.stream.centos.org
* mref2.apse2.stream.centos.org
With no tags or description.
I can easily remove the accrued snapshots, but I do not know the details. I can set up a Recycle Bin rule to delete snapshots older than 1-365 days. I can set it to 365. Objections?
And one of the latest snapshots is
snap-04716c8d5d0f23c6e (fedora-coreos-38.20230609.3.0-x86_64) https://ap-southeast-2.console.aws.amazon.com/ec2/home?region=ap-southeast-2#SnapshotDetails:snapshotId=snap-04716c8d5d0f23c6e
which is for volume
vol-ffffffff https://ap-southeast-2.console.aws.amazon.com/ec2/home?region=ap-southeast-2#VolumeDetails:volumeId=vol-ffffffff
that no longer exists. There are more snapshots like this. So whoever runs this process: you are good at deleting volumes, but you are leaving snapshots behind. Likely ones no one needs.
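For reference, this is roughly how such orphaned snapshots can be found with boto3. The helper name is mine, and the fetching part is only a sketch that assumes credentials for our account; the filtering itself is pure Python:

```python
def orphaned_snapshots(snapshots, existing_volume_ids):
    """Return the snapshots whose source volume no longer exists."""
    return [s for s in snapshots if s["VolumeId"] not in existing_volume_ids]


def fetch_orphans(region="ap-southeast-2"):
    # Illustration only: requires boto3 and AWS credentials for the account.
    import boto3
    ec2 = boto3.client("ec2", region_name=region)
    snaps = [s
             for page in ec2.get_paginator("describe_snapshots").paginate(OwnerIds=["self"])
             for s in page["Snapshots"]]
    vols = {v["VolumeId"]
            for page in ec2.get_paginator("describe_volumes").paginate()
            for v in page["Volumes"]}
    return orphaned_snapshots(snaps, vols)
```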
On 03/07/2023 06:39, Miroslav Suchý wrote:
-- Miroslav Suchy, RHCA Red Hat, Manager, Packit and CPT, #brno, #fedora-buildsys
Hi Miroslav,
As you can see, there are indeed two EC2 instances running there, but they are unrelated to your snapshot issue, so it's worth verifying with the CoreOS folks?
On Mon, Jul 03, 2023 at 04:02:02PM +0200, Fabian Arrotin wrote:
Yeah, let's get the Fedora CoreOS folks to look; it seems like those are their images. Perhaps they are making them but not cleaning up the old ones in that region?
kevin
Dne 05. 07. 23 v 21:17 Kevin Fenzi napsal(a):
It has been a week with no response (I know, holiday season...). I will give it one more week. If no one raises a voice, I will create a Recycle Bin rule that will automatically delete **ALL** volume snapshots older than one year, in ALL AWS regions where we have snapshots. I will work on that next Monday.
If you need to preserve some snapshots longer than one year, please let me know.
On 7/11/23 04:21, Miroslav Suchý wrote:
Apologies for not responding sooner. Actually, for some reason this is the first email I've seen in the thread, so maybe I need to check my spam filters. Either way, apologies.
The reason you are seeing snapshots but no volumes is that these snapshots are used as backing storage for AMIs. If a snapshot is still associated with an AMI, AWS won't let you delete it.
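If it helps, checking which of our AMIs still reference a given snapshot can be done client-side; this is an illustrative sketch (the function name is mine). EC2 can also do this server-side via the `block-device-mapping.snapshot-id` filter on `describe_images`:

```python
def amis_backed_by(images, snapshot_id):
    """Return the ImageIds of AMIs whose EBS block device mappings
    reference the given snapshot."""
    hits = []
    for image in images:
        for mapping in image.get("BlockDeviceMappings", []):
            if mapping.get("Ebs", {}).get("SnapshotId") == snapshot_id:
                hits.append(image["ImageId"])
    return hits


# With boto3 the image list would typically come from something like:
#   ec2 = boto3.client("ec2", region_name="ap-southeast-2")
#   images = ec2.describe_images(Owners=["self"])["Images"]
```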
On the Fedora CoreOS side we need to implement garbage collection so that we delete all AMIs and snapshots from our development streams. For our production streams we'll probably take a more conservative approach to garbage collection, but we'll need to start garbage collecting those too.
For Fedora Cloud, that working group will also need to look at their processes and implement garbage collection too. It could either be a separate process or it could be working with you to set a policy directly in AWS to clean up after some time.
For Fedora CoreOS we'd like to implement the GC outside of AWS since we'd like to have the same GC policy for all clouds we create resources in.
Does this make sense?
Dne 11. 07. 23 v 15:53 Dusty Mabe napsal(a):
On the Fedora CoreOS side we need to implement garbage collection so that we delete all AMIs and snapshots from our development streams. For our production streams we'll probably take a more conservative approach to garbage collection, but we'll need to start garbage collecting those too.
For Fedora Cloud, that working group will also need to look at their processes and implement garbage collection too. It could either be a separate process or it could be working with you to set a policy directly in AWS to clean up after some time.
For Fedora CoreOS we'd like to implement the GC outside of AWS since we'd like to have the same GC policy for all clouds we create resources in.
Can you create an issue for each of these cases, so it does not get lost?
Does this make sense?
Sure. Do you expect this soonish? Or should I manually remove the old ones now?
On 7/12/23 16:17, Miroslav Suchý wrote:
Can you create an issue for each of these cases, so it does not get lost?
For Fedora CoreOS we have this issue to track: https://github.com/coreos/fedora-coreos-tracker/issues/99
For the cloud working group we'd need to reach out to them to see how they want to proceed.
Does this make sense?
Sure. Do you expect this soonish? Or should I manually remove the old ones now?
If you say it's an issue then hopefully we can give this some priority and get to it soon.
Removing the "old ones" isn't that easy to do. Our production AMIs and development AMIs are all mixed together, so it would be hard to come up with criteria without implementing the garbage collection I linked to above.
Dusty
Dne 13. 07. 23 v 0:26 Dusty Mabe napsal(a):
If you say it's an issue then hopefully we can give this some priority and get to it soon.
Removing the "old ones" isn't that easy to do. Our production AMIs and development AMIs are all mixed together, so it would be hard to come up with criteria without implementing the garbage collection I linked to above.
But AWS will not allow you to delete a snapshot that is associated with an AMI (you said). So we can delete everything older than one year; the deletions that error out will be the snapshots we still want to keep, and we can just ignore the errors.
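A sketch of that approach (the helper names are mine; with boto3 you would pass `ec2.delete_snapshot` as `delete_fn`, which raises a ClientError with code `InvalidSnapshot.InUse` for AMI-backed snapshots):

```python
from datetime import datetime, timedelta, timezone


def is_older_than(snapshot, days=365, now=None):
    """True if the snapshot's StartTime is more than `days` days in the past."""
    now = now or datetime.now(timezone.utc)
    return snapshot["StartTime"] < now - timedelta(days=days)


def delete_old_snapshots(snapshots, delete_fn, days=365, now=None):
    """Try to delete every snapshot older than `days` days; collect failures
    rather than propagate them, since AMI-backed snapshots refuse deletion."""
    deleted, skipped = [], []
    for snap in snapshots:
        if not is_older_than(snap, days, now):
            continue
        try:
            delete_fn(SnapshotId=snap["SnapshotId"])
            deleted.append(snap["SnapshotId"])
        except Exception as exc:  # e.g. botocore ClientError: InvalidSnapshot.InUse
            skipped.append((snap["SnapshotId"], str(exc)))
    return deleted, skipped
```

The injectable `delete_fn` also makes a dry run easy: pass a function that only logs.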
On 7/12/23 19:03, Miroslav Suchý wrote:
But AWS will not allow you to delete a snapshot that is associated with an AMI (you said). So we can delete everything older than one year; the deletions that error out will be the snapshots we still want to keep, and we can just ignore the errors.
Indeed. Thanks for clarifying. Yes, that should be OK, I think, though it would be good to first test your process against a few snapshots that back AMIs we know we don't care about, to make sure the deletion fails. I tried in the web interface and it wouldn't let me delete one, so I'm pretty sure that's the case.
Dusty
infrastructure@lists.fedoraproject.org