https://bugzilla.redhat.com/show_bug.cgi?id=1140405
Bug ID: 1140405
Summary: systemctl start docker fails because systemd continuously restarts the daemon
Product: Fedora
Version: 20
Component: docker-io
Assignee: lsm5@fedoraproject.org
Reporter: a.badger@gmail.com
QA Contact: extras-qa@fedoraproject.org
CC: admiller@redhat.com, golang@lists.fedoraproject.org, hushan.jia@gmail.com, jperrin@centos.org, lsm5@fedoraproject.org, mattdm@redhat.com, mgoldman@redhat.com, s@shk.io, vbatts@redhat.com
Description of problem:
I've installed docker for the first time and tried to start it with "systemctl start docker". The systemctl command returned successfully, but docker client commands run against the daemon then timed out. After some poking around I discovered that systemd was starting docker, and docker was taking quite a while to do various initialization tasks, including invoking mkfs. systemd decided that docker was unresponsive, terminated it, and then restarted it. Because mkfs hadn't finished, docker had to run mkfs again from the start. This cycle kept repeating and would probably have kept docker from ever fully starting up.
I worked around the problem by telling systemd not to start docker, running the docker daemon manually from a shell, waiting until the mkfs had completed, then shutting down my daemon and rerunning systemctl start docker. After that, the docker service runs fine.
Version-Release number of selected component (if applicable):
docker-io-1.2.0-2.fc20.x86_64
How reproducible: Every time for me until after I ran docker as a daemon manually. I don't know how to reproduce once docker has initialized (probably by removing some file or volume, but I don't know what it would be).
Steps to Reproduce:
1. On a system that hasn't had docker running before
2. yum install docker-io
3. systemctl start docker
4. watch the output of systemctl status docker -l
Actual results:
systemctl status docker -l will report that docker is in state Activating for several minutes, then show that systemd decided docker wasn't responding, terminate it, and restart. The -l output will also show that docker is running mkfs for most of that time and is still running it when docker is terminated.
Expected results:
systemctl status docker -l will show that the state has gone to active (running)
Additional info:
* My filesystem is ext4. The docker initialization is running mkfs.ext4
* I'm using a 4-5 year old laptop with platter HDs. A faster machine or SSD drives might run mkfs quickly enough to not see this issue.
* This might be "fixed" by adding some documentation that says to perform certain steps to initialize docker before running systemctl start docker rather than changing docker code to finish initialization sooner.
--- Comment #1 from Toshio Ernie Kuratomi a.badger@gmail.com ---
Created attachment 936326
  --> https://bugzilla.redhat.com/attachment.cgi?id=936326&action=edit
Output of journalctl -u docker --no-pager -l
Here's output from journalctl. You can see that at first systemd is starting docker, deciding that it timed out, terminating it, and then restarting it.
The eventual successful start by systemd at the bottom of the log comes after I manually ran the daemon so that the mkfs would complete.
--- Comment #2 from Toshio Ernie Kuratomi a.badger@gmail.com ---
Created attachment 936327
  --> https://bugzilla.redhat.com/attachment.cgi?id=936327&action=edit
Some systemctl status -l output
Here's a copy and paste of some runs of systemctl status -l while I was still debugging this. You can see that docker starts up and by 59s it has invoked mkfs.ext4 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 /dev/mapper/docker-253:2-7757935-base
At 1min 26s, the same mkfs is still running. Sometime after that, systemd has terminated that docker daemon and tried to start a new one.
Daniel Walsh dwalsh@redhat.com changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
CC        |                            |dwalsh@redhat.com

--- Comment #3 from Daniel Walsh dwalsh@redhat.com ---
Is it possible to tell systemd to not restart docker? Not sure why we would want this autorestarted.
--- Comment #4 from Toshio Ernie Kuratomi a.badger@gmail.com ---
Just to note: it's okay for systemd to try starting docker if it's not running; we just don't want it to assume docker is hung and kill it (at least during this initialization step).
https://bugzilla.redhat.com/show_bug.cgi?id=1140405
--- Comment #5 from Toshio Ernie Kuratomi a.badger@gmail.com ---
Confirmed that running systemctl start docker for the first time on an SSD machine was fine. So it seems to be related to how quickly the mkfs runs on the specific hardware.
Lokesh Mandvekar lsm5@fedoraproject.org changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
Assignee  |lsm5@fedoraproject.org      |lsm5@redhat.com
Lokesh Mandvekar lsm5@redhat.com changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
Status    |NEW                         |ASSIGNED

--- Comment #6 from Lokesh Mandvekar lsm5@redhat.com ---
Hi Toshio, sorry to get back so late on this, could you please retry this with docker-io-1.4.1-5 ?
--- Comment #7 from Toshio Ernie Kuratomi a.badger@gmail.com ---
Still happening. docker-io-1.4.1-5.fc21.x86_64
I'm guessing there's no way to solve this unless you can do one of the following:
* speed up the mkfs that docker is using in its initial run
* push that initialization into something besides service startup
* Tell systemd that starting docker should have a longer than normal timeout
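For the third option, here is a minimal sketch of what raising the start timeout could look like. It assumes a systemd drop-in for docker.service is acceptable. TimeoutStartSec is the systemd setting that bounds how long a service may take to start. The helper name write_timeout_dropin, the file name timeout.conf, and the 10min value are hypothetical choices for illustration, not something tested in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of docker-io): write a systemd drop-in that
# raises the start timeout for a unit.
# Usage: write_timeout_dropin DROPIN_DIR TIMEOUT
write_timeout_dropin() {
    dropin_dir=$1
    timeout=$2
    mkdir -p "$dropin_dir" || return 1
    # A drop-in only needs the section and the setting being overridden.
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout" > "$dropin_dir/timeout.conf"
}

# Example, run as root (directory and value are assumptions):
#   write_timeout_dropin /etc/systemd/system/docker.service.d 10min
#   systemctl daemon-reload
#   systemctl start docker
```

The value would need to be long enough for the initial mkfs to finish on slow disks. Whether the packaged unit file should carry such a setting by default is a separate question for the maintainers.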
--- Comment #8 from Daniel Walsh dwalsh@redhat.com ---
Lokesh can you see about extending the systemd timeout?
Fedora Admin XMLRPC Client fedora-admin-xmlrpc@redhat.com changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
Assignee  |lsm5@redhat.com             |ichavero@redhat.com

--- Comment #9 from Fedora Admin XMLRPC Client fedora-admin-xmlrpc@redhat.com ---
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
Daniel Walsh dwalsh@redhat.com changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
Assignee  |ichavero@redhat.com         |lsm5@redhat.com
Daniel Walsh dwalsh@redhat.com changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
Priority  |unspecified                 |medium
--- Comment #10 from Fedora End Of Life endoflife@fedoraproject.org ---
This message is a reminder that Fedora 20 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 20. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '20'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.
Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 20 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
Daniel Walsh dwalsh@redhat.com changed:
What      |Removed                     |Added
----------------------------------------------------------------------------
CC        |                            |a.badger@gmail.com
Flags     |                            |needinfo?(a.badger@gmail.com)

--- Comment #11 from Daniel Walsh dwalsh@redhat.com ---
Toshio are you still seeing this problem?
Fedora End Of Life endoflife@fedoraproject.org changed:
What        |Removed                     |Added
----------------------------------------------------------------------------
Status      |ASSIGNED                    |CLOSED
Resolution  |---                         |EOL
Last Closed |                            |2015-06-29 21:08:40

--- Comment #12 from Fedora End Of Life endoflife@fedoraproject.org ---
Fedora 20 changed to end-of-life (EOL) status on 2015-06-23. Fedora 20 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.
Thank you for reporting this bug and we are sorry it could not be fixed.