Hi everyone,
I've spent some time trying to minimize the footprint of the Fedora docker base image. Overall, I managed to reduce its size by 39.9%.
A summary of the work I did can be found here: https://gist.github.com/iamcourtney/1a4af7c4289014f57080
If you're interested, you can find a more detailed version of the above work here: https://gist.github.com/iamcourtney/b8709ed897b7ecc9ac0f
I essentially looked at which packages were being installed to the base image and tried to determine which of those packages could be turned into weak dependencies and which of those packages we could possibly break up.
If possible, I'd like some feedback on the work I did. Comments and criticism are more than welcomed! I realize there may be some controversy in terms of what I chose to remove and what I chose to turn into weak dependencies, but I would like to hear your thoughts either way.
As a side note, I originally posted my preliminary results to the atomic-devel list a few weeks ago, and I have used some of the feedback I got there to make improvements. I was also able to reduce the size of glibc with the help of Carlos O'Donell.
Thanks!
On Mon, 2016-02-22 at 09:54 -0500, Courtney Pacheco wrote:
> If possible, I'd like some feedback on the work I did. Comments and criticism are more than welcomed! I realize there may be some controversy in terms of what I chose to remove and what I chose to turn into weak dependencies, but I would like to hear your thoughts either way.
Removing tzdata seems like a bad idea. I think a small amount of code change could make the cost of keeping tzdata much lower. Virtually all of the tzdata files are less than 4 kilobytes, so most of the on-disk storage cost is block size overhead:
dmt:~% du -s --apparent-size /usr/share/zoneinfo
1720    /usr/share/zoneinfo
dmt:~% du -s /usr/share/zoneinfo
4780    /usr/share/zoneinfo
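The gap between those two du figures can be reproduced with a short script. This is a minimal sketch and not something posted in the thread: the function name `dir_usage` is made up for illustration, it assumes a POSIX system where `st_blocks` is reported in 512-byte units, and unlike du it ignores the directories themselves and counts hard-linked files once per name.

```python
import os

def dir_usage(root):
    """Return (apparent_bytes, allocated_bytes) for non-directory entries under root."""
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are measured as links, not as their targets
            st = os.lstat(os.path.join(dirpath, name))
            apparent += st.st_size
            # st_blocks is in 512-byte units, independent of the filesystem block size
            allocated += st.st_blocks * 512
    return apparent, allocated
```

Calling `dir_usage("/usr/share/zoneinfo")` and dividing both numbers by 1024 should give figures of the same order as the du output above, though the exact values depend on the filesystem.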
Possible options include:
a) Glue all the compiled zone info together in a single file, teach glibc and friends about it
b) Glue the zone info together in a romfs image, mount it from a systemd unit
c) Both of the above: glue them all together in a romfs, add a fuse/overlay fs to expose the individual files
d) Mount a zoneinfo fs exported from the host
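Option (a) could look roughly like the following. This is only a sketch under assumptions: the container layout (a length-prefixed JSON index followed by the raw file data) and the names `pack_tree` and `lookup` are invented here, and nothing in glibc reads such a file today. Symlinks are skipped and hard-linked zones are stored once per name, so a real format would want to deduplicate.

```python
import json
import os
import struct

def pack_tree(root):
    """Concatenate every regular file under root into one blob with a JSON index."""
    index = {}
    chunks = []
    offset = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file (sketch only)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                data = fh.read()
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            index[rel] = [offset, len(data)]
            chunks.append(data)
            offset += len(data)
    header = json.dumps(index, sort_keys=True).encode("utf-8")
    # Layout: 4-byte big-endian header length, JSON index, then the file data
    return struct.pack(">I", len(header)) + header + b"".join(chunks)

def lookup(blob, name):
    """Return the bytes stored for name (e.g. "Europe/Paris"), or None if absent."""
    (header_len,) = struct.unpack_from(">I", blob, 0)
    index = json.loads(blob[4:4 + header_len].decode("utf-8"))
    entry = index.get(name)
    if entry is None:
        return None
    offset, length = entry
    start = 4 + header_len + offset
    return blob[start:start + length]
```

Because the data is stored back to back, the on-disk cost is close to the apparent size plus the index, rather than one filesystem block per zone.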
A somewhat similar criticism applies to removing gconv. Pretending that applications don't have to deal with multiple character encodings is likely to be wrong, and we don't currently have any metadata to track whether a binary calls iconv() so there's no way to express the need for the gconv modules. Here again, most of these libraries are relatively small, and gluing them all together would be a decent size win:
dmt:/usr/lib64/gconv% du -c *.so | tail -1
7352    total
dmt:/usr/lib64/gconv% du --apparent-size -c *.so | tail -1
6448    total
dmt:/usr/lib64/gconv% size -t *.so | tail -1
4778516 161368 2016 4941900 4b684c (TOTALS)
Both things are possible here: we could teach rpm's find-requires to know about iconv, _and_ link all the gconv modules together.
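As a rough illustration of the first half of that idea, a dependency scanner could flag binaries that reference iconv. The sketch below is a crude heuristic and not how rpm's find-requires works: it only looks for the NUL-terminated string "iconv_open" anywhere in the file, and the function name `mentions_iconv` is hypothetical. A real implementation would parse the ELF dynamic symbol table. This version reports false positives (for example a symbol named libiconv_open) and misses binaries that reach iconv only through another library.

```python
def mentions_iconv(path, needle=b"iconv_open\x00"):
    """Heuristic: True if the file contains the NUL-terminated string "iconv_open"."""
    with open(path, "rb") as fh:
        data = fh.read()
    # Dynamic symbol names are stored as NUL-terminated strings, so a byte
    # search is enough for a first approximation.
    return needle in data
```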
- ajax
On Mon, Feb 22, 2016 at 11:04:40AM -0500, Adam Jackson wrote:
> On Mon, 2016-02-22 at 09:54 -0500, Courtney Pacheco wrote:
> > If possible, I'd like some feedback on the work I did. Comments and criticism are more than welcomed! I realize there may be some controversy in terms of what I chose to remove and what I chose to turn into weak dependencies, but I would like to hear your thoughts either way.
> Removing tzdata seems like a bad idea. I think a small amount of code change could make the cost of keeping tzdata much lower. Virtually all of the tzdata files are less than 4 kilobytes, so most of the on-disk storage cost is block size overhead:
> dmt:~% du -s --apparent-size /usr/share/zoneinfo
> 1720    /usr/share/zoneinfo
> dmt:~% du -s /usr/share/zoneinfo
> 4780    /usr/share/zoneinfo
> Possible options include:
> a) Glue all the compiled zone info together in a single file, teach glibc and friends about it
> b) Glue the zone info together in a romfs image, mount it from a systemd unit
> c) Both of the above: glue them all together in a romfs, add a fuse/overlay fs to expose the individual files
> d) Mount a zoneinfo fs exported from the host
The 'posix' and 'right' subdirectories in /usr/share/zoneinfo could be dropped, and whichever one we want could then be installed directly as /usr/share/zoneinfo. The 'posix' collection contains the zones without leap seconds counted; the 'right' collection contains the zones with leap seconds counted.
Courtney Pacheco (cpacheco@redhat.com) said:
> Hi everyone,
> I've spent some time trying to minimize the footprint of the Fedora docker base image. Overall, I managed to reduce its size by 39.9%.
> A summary of the work I did can be found here: https://gist.github.com/iamcourtney/1a4af7c4289014f57080
> If you're interested, you can find a more detailed version of the above work here: https://gist.github.com/iamcourtney/b8709ed897b7ecc9ac0f
May be a dumb question...
If we're excluding DNF, RPM, etc. for a slimmer base image during runtime, how are we describing the best practices for build? Is the intention that you should always be pulling in a separate tool container to assist with the build process?
Bill
On 02/22/2016 11:26 AM, Bill Nottingham wrote:
> Courtney Pacheco (cpacheco@redhat.com) said:
> > Hi everyone,
> > I've spent some time trying to minimize the footprint of the Fedora docker base image. Overall, I managed to reduce its size by 39.9%.
> > A summary of the work I did can be found here: https://gist.github.com/iamcourtney/1a4af7c4289014f57080
> > If you're interested, you can find a more detailed version of the above work here: https://gist.github.com/iamcourtney/b8709ed897b7ecc9ac0f
> May be a dumb question...
> If we're excluding DNF, RPM, etc. for a slimmer base image during runtime, how are we describing the best practices for build? Is the intention that you should always be pulling in a separate tool container to assist with the build process?
> Bill
devel mailing list devel@lists.fedoraproject.org http://lists.fedoraproject.org/admin/lists/devel@lists.fedoraproject.org
Hi Bill,
Ideally, we'd like to remove dnf and other tools that build the image because it's like shipping the "BuildRequires" fields as part of a binary rpm.
In place of dnf, rpm, etc., we'd like to have a "builder tool" that volume mounts dnf or other tools during the build, or builds and produces an image using outside tools. So yes, the intention would be to pull in a separate tool container to assist with the build process.
Hope that helps
Courtney
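One way to picture the "outside tools" approach is a host-side dnf populating the image's root filesystem, so that dnf itself never lands in the image. The thread does not say how the builder tool would work, so the sketch below is an assumption: the helper name `dnf_install_cmd`, the example path, and the example package list are hypothetical. The dnf options it uses (`--installroot`, `--releasever`, `--setopt=install_weak_deps=...`) are existing dnf options. The function only builds the argument list; it does not run anything.

```python
def dnf_install_cmd(rootfs, releasever, packages, weak_deps=False):
    """Build a dnf command line that installs packages into an external root."""
    cmd = [
        "dnf", "-y",
        # Install into the image's root filesystem instead of the host's
        "--installroot=%s" % rootfs,
        "--releasever=%s" % releasever,
        # Leave weak dependencies out unless explicitly requested
        "--setopt=install_weak_deps=%s" % ("True" if weak_deps else "False"),
        "install",
    ]
    return cmd + list(packages)
```

For example, `dnf_install_cmd("/mnt/rootfs", "23", ["bash", "coreutils"])` returns a list that could be handed to `subprocess.run` on a build host or inside a separate tool container.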
On Mon, 2016-02-22 at 11:26 -0500, Bill Nottingham wrote:
> Courtney Pacheco (cpacheco@redhat.com) said:
> > Hi everyone,
> > I've spent some time trying to minimize the footprint of the Fedora docker base image. Overall, I managed to reduce its size by 39.9%.
> > A summary of the work I did can be found here: https://gist.github.com/iamcourtney/1a4af7c4289014f57080
> > If you're interested, you can find a more detailed version of the above work here: https://gist.github.com/iamcourtney/b8709ed897b7ecc9ac0f
> May be a dumb question...
> If we're excluding DNF, RPM, etc. for a slimmer base image during runtime, how are we describing the best practices for build? Is the intention that you should always be pulling in a separate tool container to assist with the build process?
> Bill
I have no problem removing dnf, but removing rpm is going too far. For now we still need rpm for looking at the contents of a container. While an external rpm would probably work, I don't think we are ready for this, nor is the benefit large enough.
On Mon, Feb 22, 2016 at 9:54 AM, Courtney Pacheco <cpacheco@redhat.com> wrote:
> Hi everyone,
> I've spent some time trying to minimize the footprint of the Fedora docker base image. Overall, I managed to reduce its size by 39.9%.
Thanks for doing this. It is great to see someone working on minimization.
> A summary of the work I did can be found here: https://gist.github.com/iamcourtney/1a4af7c4289014f57080
> If you're interested, you can find a more detailed version of the above work here: https://gist.github.com/iamcourtney/b8709ed897b7ecc9ac0f
> I essentially looked at which packages were being installed to the base image and tried to determine which of those packages could be turned into weak dependencies and which of those packages we could possibly break up.
> If possible, I'd like some feedback on the work I did. Comments and criticism are more than welcomed! I realize there may be some controversy in terms of what I chose to remove and what I chose to turn into weak dependencies, but I would like to hear your thoughts either way.
On the "Kernel Packages" section, I tend to agree that kmod and kmod-libs likely don't make sense in a docker container. However, libseccomp should likely remain. The library is there to make use of the in-kernel seccomp functionality, and systemd and other applications use it to limit their syscall interface to the kernel. This reduces the potential attack surface, and in essence at least helps containers actually contain.
josh
On 22/02/16 06:54, Courtney Pacheco wrote:
> Hi everyone,
> I've spent some time trying to minimize the footprint of the Fedora docker base image. Overall, I managed to reduce its size by 39.9%.
Excellent.
You might want to keep in mind the use of "coreutils-single" for Fedora 24 (or backport that packaging change to F23). It reduces coreutils from about 15MB to 1MB.
cheers, Pádraig.