On 16/11/18 22:07 +0000, Jonathan Dieter wrote:
The core idea behind zchunk is that a file is split into independently compressed chunks and the checksum of each compressed chunk is stored in the zchunk header. When downloading a new version of the file, you download the zchunk header first, check which chunks you already have, and then download the rest.
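To make the quoted scheme concrete, here is a minimal sketch of the "header first, then only the missing chunks" logic. It is not the real zchunk format or API: the fixed chunk size, zlib as the compressor, SHA-256 as the checksum, and all function names are assumptions chosen for illustration.

```python
import hashlib
import zlib


def make_chunks(data: bytes, size: int) -> list[bytes]:
    """Split data into fixed-size pieces and compress each one independently.

    Fixed-size splitting and zlib are assumptions of this sketch.
    """
    return [zlib.compress(data[i:i + size]) for i in range(0, len(data), size)]


def make_header(chunks: list[bytes]) -> list[str]:
    """The 'header' here is just the checksum of every compressed chunk."""
    return [hashlib.sha256(c).hexdigest() for c in chunks]


def chunks_to_fetch(new_header: list[str], local_chunks: list[bytes]) -> list[int]:
    """Return the indices of chunks in the new file that are not held locally."""
    have = {hashlib.sha256(c).hexdigest() for c in local_chunks}
    return [i for i, digest in enumerate(new_header) if digest not in have]


# Old and new versions of a file that differ only in the middle chunk.
old = make_chunks(b"A" * 8 + b"B" * 8 + b"C" * 8, 8)
new = make_chunks(b"A" * 8 + b"X" * 8 + b"C" * 8, 8)

# The client downloads only the new header, then works out what is missing.
missing = chunks_to_fetch(make_header(new), old)
print(missing)
```

Only the chunk whose compressed checksum changed ends up in the fetch list; the other two are reused from the local copy.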
One more thought that I only realized in hindsight: there are already ways to cut down the set of installed files in the RPM ecosystem, currently by requesting that documentation be omitted (%doc-tagged files, see --nodocs/--excludedocs).
Indeed, that's the sort of file you can usually omit without hesitation in containers/VMs. Perhaps there are more bits that are de facto optional and can be dropped without losing any functionality.
So with clever separation, such bits wouldn't even need to be downloaded when they would not make it to the disk anyway. That might make things like customizing a base container image a tiny bit swifter, e.g. in a CI/CD context without many connectivity guarantees (up to the mirrors anyway). But it might not be worth it if the trade-off is already predictably suboptimal in other respects.
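As a rough illustration of what such a separation could look like on the client side, the sketch below tags each header entry with a kind and leaves out unwanted kinds when planning the download. Everything in it is hypothetical: nothing said so far implies that zchunk headers carry such a label, and the `ChunkEntry` type, the `kind` field, and `plan_download` are invented names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkEntry:
    digest: str  # checksum of the compressed chunk
    size: int    # compressed size in bytes
    kind: str    # hypothetical label, e.g. "core" or "doc"


def plan_download(header: list[ChunkEntry], have: set[str],
                  skip_kinds: set[str]) -> list[ChunkEntry]:
    """Pick chunks that are neither held locally nor of an excluded kind."""
    return [e for e in header
            if e.digest not in have and e.kind not in skip_kinds]


header = [
    ChunkEntry("aaa", 4096, "core"),
    ChunkEntry("bbb", 2048, "doc"),   # would be dropped at install time anyway
    ChunkEntry("ccc", 1024, "core"),
]

# The client already holds chunk "ccc" and installs without documentation.
plan = plan_download(header, have={"ccc"}, skip_kinds={"doc"})
to_fetch = sum(e.size for e in plan)
print([e.digest for e in plan], to_fetch)
```

Whether this pays off depends on the payload being chunked along those boundaries in the first place, which is exactly the trade-off in question.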