Question: Does the current livecd installer inefficiently write lots of 0's to the destination drive that it doesn't need to?
I think it might. The os.img on the F7 livecd is a 4G sparse file with about 2.3G of data. Anaconda's livecdcopy backend uses Python's os.read/os.write. I would guess that means 4G of data is getting written, when theoretically only 2.3G needs to be.
The solution that comes to mind is this:
In livecd-tools, create the os.img as a 7G (or 700G??) sparse file. Basically just way big. Then take care to make the ext3 filesystem exactly the right size for the data (i.e. 2.3G). Then, in the initramfs, just after mounting it (after snapshotting it), do a resize2fs up to 7G (or 700G).
Then, when anaconda does the copy, copy only the first 2.3G of the sparse file.
It is however late in the 'day' for me, so maybe someone can chime in with confirmation or refutation of my logic here.
-dmc
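To make the "only copy the first 2.3G" idea concrete, here is a minimal sketch of a bounded copy loop built on os.read/os.write. It is not anaconda's actual livecdcopy code; the function name copy_prefix and its parameters are hypothetical, and it assumes the caller already knows how many bytes the filesystem occupies.

```python
import os

def copy_prefix(src_path, dst_path, nbytes, bufsize=1024 * 1024):
    """Copy only the first nbytes of src_path to dst_path (hypothetical helper)."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    copied = 0
    try:
        while copied < nbytes:
            # Never request more than what is left of the prefix.
            buf = os.read(src, min(bufsize, nbytes - copied))
            if not buf:  # source ended before nbytes were read
                break
            view = memoryview(buf)
            while view:  # os.write may write fewer bytes than given
                written = os.write(dst, view)
                view = view[written:]
            copied += len(buf)
    finally:
        os.close(src)
        os.close(dst)
    return copied
```

A plain loop without the nbytes bound would read the holes of a sparse file as zeros and write every one of them to the destination, which is the inefficiency suspected above.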
Douglas McClendon wrote:
To clarify a bit- Clearly the resize2fs should probably happen during boot (long after initramfs). No need to bloat the initramfs with resize2fs.
Also, the mechanism that comes to mind for the ext3fs creation is this-
Take the existing image built as is, but after final install, resize2fs it to the smallest possible (nearly), then truncate the file, then do the dd seek trick to re-sparsify it vastly larger.
Or perhaps just throw in an entire extra tarcopy of the system to a new fs image file created the exact right size from the beginning. This is more work, but will possibly save space on any files that got created and deleted during the installation process.
-dmc
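The truncate-then-re-sparsify step described above can be sketched as follows. This assumes resize2fs has already shrunk the filesystem inside the image to fs_bytes; the function name and parameters are hypothetical, and extending a file with truncate is used here in place of the dd seek trick (both leave a hole on filesystems that support sparse files).

```python
import os

def resparsify(image_path, fs_bytes, sparse_bytes):
    """Cut the image file down to the filesystem, then grow it back as a hole."""
    if sparse_bytes < fs_bytes:
        raise ValueError("sparse size must not be smaller than the filesystem")
    os.truncate(image_path, fs_bytes)      # drop everything past the filesystem
    os.truncate(image_path, sparse_bytes)  # extend; the new tail reads as zeros
    return os.stat(image_path).st_size
```

The filesystem inside the image still has to be grown separately (e.g. with resize2fs) before it can use the added space.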
Douglas McClendon wrote:
Yeah, I really gotta stop posting when I'm sleep deprived. Clearly in initramfs you have access to sysroot thus no bloat.
and...
clearly the resize2fs to minimal will take care of the created/deleted issue.
-dmc
I see someone besides me uses mailing lists to keep public notes for themselves to remember later. :)
--g
-- Fedora-livecd-list mailing list Fedora-livecd-list@redhat.com https://www.redhat.com/mailman/listinfo/fedora-livecd-list
Greg Dekoenigsberg wrote:
I see someone besides me uses mailing lists to keep public notes for themselves to remember later. :)
Well, I would hope you do it for the same reason I do. I.e. they aren't just personal notes, but rather technical ideas. And hopefully leveraging exposure to the community might save you time and effort if someone catches your mental/technical errors before either you catch them yourself, or you go implement something that should have been done differently.
-dmc
Douglas,
so, why don't you try 'em yourself? I guess it would save lots of traffic for you if you send just patch(es) here ;)
Regards, Vladimir
Vladimir Shebordaev wrote:
Douglas,
so, why don't you try 'em yourself? I guess it would save lots of traffic for you if you send just patch(es) here ;)
mail filters (e.g. procmail) are wonderful things. I promise that all patches of mine will have PATCH in the subject line ;)
-dmc
Douglas McClendon wrote:
Assuming my prior logic is valid, there is another aspect to consider-
The above recommendation would add some time to every livecd boot sequence: however long resize2fs takes to run. Also, whatever metadata resize2fs writes to the overlay would be a permanent RAM penalty (albeit hopefully a small one).
An alternate idea I had, is that you could build the minimized ext3fs image as described, but then resize2fs it back to the larger size - during spin composition. Then, at livecd install time, (note people, we are now talking about the normal livecd installer, not my crazy rebootless stuff), you could create a _second_ snapshot of the big read only ext3fs image, and then minimize-resize2fs it, before starting the copy to destination volume.
This is much more complicated, but probably the better solution, as it takes the delay and RAM hit out of every livecd boot and adds them only at install time (and the RAM hit gets freed after installation completes). The livecd build-time resize2fs minimization (and re-maximization) might still be a good idea, if it helps minimize the RAM and processing time required to do the anaconda install-time minimize-resize2fs.
-dmc
Hi, I've lost track of all the details here but ...
On Tue, 2007-07-10 at 06:07 -0500, Douglas McClendon wrote:
Question: Does the current livecd installer inefficiently write lots of 0's to the destination drive that it doesn't need to?
I think it might. The os.img on the F7 livecd is a 4G sparse file with about 2.3G of data. Anaconda's livecdcopy backend uses python's os.read/write. I would guess that that means that 4G of data is getting written, when theoretically only 2.3G needs to.
... how about something like this:
http://www.gnome.org/~markmc/code/e2cp.c
i.e. read the ext3 metadata and copy everything but the unallocated blocks.
(One thing from the above code you don't want to do is the check_all_zeros() bit - if an allocated block is all zeros, you *do* want to copy the zeros to the disk)
Cheers, Mark.
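The "copy everything but the unallocated blocks" idea can be sketched like this. It is not a port of e2cp.c and does not parse ext3 metadata; it assumes the caller has already obtained a per-block allocation bitmap (the allocated argument, a hypothetical list of booleans). Following the caveat above, allocated blocks are copied even when they contain only zeros.

```python
def copy_allocated_blocks(src, dst, allocated, block_size=4096):
    """Copy only blocks flagged as allocated, keeping their offsets.

    src and dst are seekable binary file objects; allocated[i] says whether
    block i is in use. Returns the number of blocks copied.
    """
    copied = 0
    for index, in_use in enumerate(allocated):
        if not in_use:
            continue  # unallocated: leave the destination block untouched
        offset = index * block_size
        src.seek(offset)
        block = src.read(block_size)
        dst.seek(offset)
        dst.write(block)  # copied even if the block is all zeros
        copied += 1
    return copied
```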
I'm not as into low level filesystem internals as you are. Tell me if this paraphrasing is accurate- e2cp is sort of like a dd for filesystems, that understands sparseness and/or unused/unallocated blocks, and handles them efficiently (ignores them).
The main downside I think it has as a solution to the issue is that it doesn't fix the case of a destination volume of, say, 3.0G. I.e. there is no reason why the livecd installer shouldn't be able to install its 2.3G payload onto a 3.0G destination. The solution I outlined covers that* case, while I don't think using e2cp does.
* _IF_ there aren't any flaws with anything I theorized
The most recent solution I was proposing involved taking a second snapshot of the 4.0G (or perhaps now 1T) sparse ext3 os.img file, resize2fs-ing it down to minimal, and then copying it. The big question, which I haven't tested yet, is whether such a resize2fs will happen quickly, with very few actual changes written to the image (i.e. the new overlay in RAM). If it takes less than a minute and uses less than 1MB of RAM, it seems like a good solution to me.
Perhaps with your low level knowledge of ext2(/resize2fs?) you can answer that. I.e. if you take an image, resize it to minimal, then resize it to 1TB, then how long and how many changes will it take to resize it back down to minimal? Of course I can just test it myself, and probably will soon enough.
-dmc
On Wed, 2007-07-11 at 14:50 -0500, Douglas McClendon wrote:
The main downside I think it has as a solution to the issue is that it doesn't fix the case of a destination volume of, say, 3.0G. I.e. there is no reason why the livecd installer shouldn't be able to install its 2.3G payload onto a 3.0G destination.
I thought that was already covered with a resize2fs ?
The most recent solution I was proposing, involved taking a second snapshot of the 4.0G (or perhaps now 1T) sparse ext3 os.img file, and resize2fs-ing it down to minimal, and then copying it.
This sounds like a way of doing the same thing as e2cp ... by resizing it down to its minimal size, you'd be removing all unallocated data blocks and so e2cp would have nothing to ignore.
So, they're different solutions to the "let's not copy unallocated blocks" problem.
Not sure why a snapshot is needed ...
I.e. if you take an image, resize it to minimal, then resize it to 1TB, then how long and how many changes will it take to resize it back down to minimal? Of course I can just test it myself, and probably will soon enough.
I would imagine it would be similar to how long it would take to mke2fs a 1TB sparse file.
Cheers, Mark.
Mark McLoughlin wrote:
On Wed, 2007-07-11 at 14:50 -0500, Douglas McClendon wrote:
The main downside I think it has as a solution to the issue is that it doesn't fix the case of a destination volume of, say, 3.0G. I.e. there is no reason why the livecd installer shouldn't be able to install its 2.3G payload onto a 3.0G destination.
I thought that was already covered with a resize2fs ?
If you mean already as in 'exists in the f7livecd' then the answer is no. The resize2fs that the f7livecd anaconda does is _after_ the fs image copy, and thus only effective for expansion, not shrink.
the f7livecd anaconda installer will red-error-of-death you if you try to install to a 3G destination volume.
The most recent solution I was proposing, involved taking a second snapshot of the 4.0G (or perhaps now 1T) sparse ext3 os.img file, and resize2fs-ing it down to minimal, and then copying it.
This sounds like a way of doing the same thing as e2cp ... by resizing it down to its minimal size, you'd be removing all unallocated data blocks and so e2cp would have nothing to ignore.
So, they're different solutions to the "let's not copy unallocated blocks" problem.
Not sure why a snapshot is needed ...
Because you haven't yet added in-flight-resize2fs support to e2cp.
But if you don't care about fixing the 2.4G->3.9G destination volume problem, then e2cp is sufficient.
-dmc
Hey, Okay, I follow you now. Doing:
- Create snapshot of os.img
- Resize it down to the smallest possible size
- Copy it to disk
- Resize it back up to the size of the disk
has the dual advantages of being able to install to the smallest possible disk size and not copying unallocated blocks.
Sounds like a reasonable plan ... go ahead and give it a shot.
The reservation I'd have is if you're resizing down from 4G to 3G, the 1G of data blocks which have to be moved could potentially be at the end of the image. In that case, you'd need a 1G COW area for the snapshot.
It'd be much better if you could resize the filesystem as you're copying it, and that should be possible, but you'd have to e.g. add support for copying to resize2fs.
Cheers, Mark.
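For the "copy it to disk" step in the plan above, the copy only needs to cover the shrunken filesystem, i.e. block count times block size. One way to get that number is to read the superblock summary printed by `dumpe2fs -h`. The sketch below only parses such text; the function name is hypothetical, and it assumes the output contains "Block count:" and "Block size:" lines.

```python
def fs_size_bytes(dumpe2fs_header):
    """Return filesystem size in bytes from `dumpe2fs -h` style text."""
    fields = {}
    for line in dumpe2fs_header.splitlines():
        # Split on the first colon only; values may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return int(fields["Block count"]) * int(fields["Block size"])
```

The result could then serve as the byte limit for whatever routine copies the image to the destination volume, before the final resize2fs grows the filesystem to fill the disk.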
Mark McLoughlin wrote:
The reservation I'd have is if you're resizing down from 4G to 3G, the 1G of data blocks which have to be moved could potentially be at the end of the image. In that case, you'd need a 1G COW area for the snapshot.
The part of my theory that is supposed to cover that, is an extra minimize/truncate/sparsify_expand/maximize cycle on the image just before it gets burned in the squashfs. Fingers crossed...
It'd be much better if you could resize the filesystem as you're copying it, and that should be possible, but you'd have to e.g. add support for copying to resize2fs.
Sort of another way of saying "add in-flight-resize2fs support to e2cp" ;)
-dmc
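Mark's reservation about COW space can be put in rough numbers. The sketch below counts the allocated blocks that sit at or beyond the new end of the filesystem, since each of those has to be relocated and every relocated block is a write that lands in the snapshot's COW area. It is only a lower bound: metadata rewrites and snapshot chunk granularity are ignored, and the allocated bitmap and function name are hypothetical.

```python
def relocation_bytes(allocated, new_block_count, block_size=4096):
    """Bytes of allocated blocks lying at or beyond the new filesystem end."""
    stranded = sum(1 for in_use in allocated[new_block_count:] if in_use)
    return stranded * block_size
```

If the build-time minimize/maximize cycle leaves all data packed at the front of the image, this figure would be zero for any target at least as large as the minimal size.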
Douglas McClendon wrote:
The part of my theory that is supposed to cover that, is an extra minimize/truncate/sparsify_expand/maximize cycle on the image just before it gets burned in the squashfs. Fingers crossed...
Actually even before testing this, I'm 90% sure it won't work... But I'm 99% sure I figured out something that will. You'll love it, trust me ;)
-dmc
Mark McLoughlin wrote:
Sounds like a reasonable plan ... go ahead and give it a shot.
turboLiveInst has been posted, polished, reposted with real world IMPRESSIVE performance results, and has now been collecting dust for a couple weeks. I don't suppose you'd be kind enough to give it a review for me Mark? Especially since you seem to understand the basic gist of the devicemapper snapshot resize2fs technique.
I'd really like to get it into rawhide/f8t2 ASAP so that if there is some fundamental problem with the approach, we discover and deal with it (or trash it*) sooner rather than later.
* Obviously I'm kidding. There is no problem. It's solid, elegant, and the right answer, and nobody has suggested any reason to the contrary.
Regards,
-dmc
livecd@lists.fedoraproject.org