Updated to this morning's rawhide. Rebooted. Now, I get to mounting my filesystems, which works successfully, and then it just sits there.
If I hit control-alt-delete, some text comes up, ending with "Starting Notify Audit System and Update UTMP about System Shutdown..." and it just hangs there. I can hit ctrl-alt-delete and get more, different output.
If I try to boot to runlevel 1, it gets to "starting Wait for storage scan" and then "Starting Initialize storage subsystems (RAID, LVM, etc.)..." and then hangs. This shouldn't be hard on this system, since, while I am using software raid, I'm not doing anything crazy. If I hit ctrl-alt-delete there, after a long wait, I get some more text ending with "Rebooting..." on the screen alone, but no *actual* rebooting.
I can boot with /bin/bash, but then what?
How do I even go about diagnosing this?
On 02/21/2011 06:38 AM, Matthew Miller wrote:
How do I even go about diagnosing this?
https://fedoraproject.org/wiki/How_to_debug_Systemd_problems
JBG
On Mon, 21.02.11 01:38, Matthew Miller (mattdm@mattdm.org) wrote:
Updated to this morning's rawhide. Rebooted. Now, I get to mounting my filesystems, which works successfully, and then it just sits there.
I need more information:
Is the plymouth screen shown?
Does it react to Esc?
Can you switch to another VT?
Does it timeout after 60s?
Any interesting output if you boot with "systemd.log_level=debug systemd.log_target=kmsg" on the kernel cmdline?
Lennart
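A small sketch for checking whether those two options actually made it onto the kernel command line. It is in Python purely for illustration; the function names are made up here, and the only things taken from the thread are the option names and values. On a running Linux system the string to pass in would typically be the contents of /proc/cmdline.

```python
def systemd_boot_options(cmdline: str) -> dict:
    """Collect the systemd.* key=value options from a kernel command line."""
    options = {}
    for token in cmdline.split():
        if token.startswith("systemd.") and "=" in token:
            # Strip the "systemd." prefix and split only on the first "=".
            key, value = token[len("systemd."):].split("=", 1)
            options[key] = value
    return options


def debug_logging_enabled(cmdline: str) -> bool:
    """True if both debug options suggested in the thread are present."""
    opts = systemd_boot_options(cmdline)
    return opts.get("log_level") == "debug" and opts.get("log_target") == "kmsg"


# Example command line; the root= value is an arbitrary placeholder.
example = "ro root=/dev/md0 systemd.log_level=debug systemd.log_target=kmsg"
print(systemd_boot_options(example))
print(debug_logging_enabled(example))
```

With debug logging going to kmsg, the messages should end up in the kernel log buffer, so they can be read back later (for example with dmesg) even if they scrolled off the console.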
On Mon, Feb 21, 2011 at 03:54:14PM +0100, Lennart Poettering wrote:
Updated to this morning's rawhide. Rebooted. Now, I get to mounting my filesystems, which works successfully, and then it just sits there.
I need more information: Is the plymouth screen shown? Does it react to Esc? Can you switch to another VT? Does it timeout after 60s?
Turns out that it was fscking a filesystem, and the message saying so had been scrolled off the screen by other status messages. I let it run overnight, and it was up in the morning and I could see the output in the log. The old init system used to give some indication of progress while this was happening (via a hack in fsck, I believe). Is it a systemd change that is suppressing that?
Any interesting output if you boot with "systemd.log_level=debug systemd.log_target=kmsg" on the kernel cmdline?
That's useful to know for the future.
On Mon, 21.02.11 22:45, Matthew Miller (mattdm@mattdm.org) wrote:
On Mon, Feb 21, 2011 at 03:54:14PM +0100, Lennart Poettering wrote:
Updated to this morning's rawhide. Rebooted. Now, I get to mounting my filesystems, which works successfully, and then it just sits there.
I need more information: Is the plymouth screen shown? Does it react to Esc? Can you switch to another VT? Does it timeout after 60s?
Turns out that it was fscking a filesystem, and the message saying so had been scrolled off the screen by other status messages. I let it run overnight, and it was up in the morning and I could see the output in the log. The old init system used to give some indication of progress while this was happening (via a hack in fsck, I believe). Is it a systemd change that is suppressing that?
Well, this is a difficult problem, unfortunately.
We already connect stdout/stderr of fsck with syslog and the console at the same time. That should give you a minimal idea of what is going on. However, there is no progress bar. In the syslog output a progress bar would probably not make much sense. fsck (at least in the ext2/3/4 implementation) supports the -C parameter to direct progress bar information (and only the progress bar) to a specific fd. However, that information is intended for applications to parse, not to show on the screen. I am not really sure what to do about this. One option would be to extend -C to show a "human readable" progress bar on the file name passed. Then we could just invoke fsck with "-C /dev/console" and would get a progress bar printed on the console, and the console only. But it keeps me wondering what that would look like if multiple filesystems are handled in parallel.
Another option would be to parse fsck's output and forward it in some form to Plymouth to show in the normal progress bar. But I am not sure if Plymouth can actually do that. (Ray?) Also, this doesn't solve the problem that we might get multiple streams of progress bar information at the same time and for presentation in plymouth we'd need to somehow integrate them into one, and whose responsibility would that be? Plymouth?
I think not having the progress bar for F15 is acceptable, but I am all ears for suggestions how to implement this best post-F15. Especially if somebody wants to do the work... ;-)
Or maybe the whole problem set goes away by doing nothing since btrfs has no fsck? ;-)
Lennart
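For anyone who wants to experiment with the "parse fsck's output" option, here is a rough Python sketch of merging several progress streams into one number. It assumes each progress line has the form "pass current max device", which is how the -C fd output of e2fsck is usually described; that format, the class and function names, and the choice of a plain average are all assumptions made for illustration, not something settled in this thread.

```python
from dataclasses import dataclass


@dataclass
class Progress:
    device: str
    pass_no: int
    current: int
    maximum: int

    @property
    def fraction(self) -> float:
        # Fraction of the *current pass* only; fsck runs several passes,
        # and weighting them is left out of this sketch.
        return self.current / self.maximum if self.maximum else 0.0


def parse_progress_line(line: str):
    """Parse one assumed 'pass current max device' line, or return None."""
    parts = line.split()
    if len(parts) != 4:
        return None
    try:
        pass_no, current, maximum = (int(p) for p in parts[:3])
    except ValueError:
        return None
    return Progress(parts[3], pass_no, current, maximum)


def combined_fraction(latest: dict) -> float:
    """Average the per-device fractions into a single 0.0-1.0 value."""
    if not latest:
        return 0.0
    return sum(p.fraction for p in latest.values()) / len(latest)


# Interleaved lines from two hypothetical devices checked in parallel.
latest = {}
for line in ["1 50 100 /dev/md0", "1 25 100 /dev/md1", "not a progress line",
             "2 100 100 /dev/md0"]:
    progress = parse_progress_line(line)
    if progress is not None:
        latest[progress.device] = progress  # keep only the newest per device

print(f"{combined_fraction(latest):.3f}")  # prints 0.625
```

Whether this merging belongs in the init system or in Plymouth is exactly the open question; the sketch only shows that keeping the newest reading per device and averaging is enough to get one value out of several streams.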
On Tue, 2011-02-22 at 13:40 +0100, Lennart Poettering wrote:
Or maybe the whole problem set goes away by doing nothing since btrfs has no fsck? ;-)
When I spoke with Josef a couple of weeks ago he said btrfs would be getting a fsck very soon.
That is borne out by this comment from Chris: http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg08383.html
-sv
On Tue, Feb 22, 2011 at 01:40:51PM +0100, Lennart Poettering wrote:
the same time. That should give you a minimal idea of what is going on. However, there is no progress bar. In the syslog output a progress bar would probably not make much sense. fsck (at least in the ext2/3/4 implementation) supports the -C parameter to direct progress bar information (and only the progress bar) to a specific fd. However, that information is intended for applications to parse, not to show on the screen. I am not really sure what to do about this. One option would be to extend -C to show a "human readable" progress bar on the file name passed. Then we could just invoke fsck with "-C /dev/console" and would get a progress bar printed on the console, and the console only. But it keeps me wondering what that would look like if multiple filesystems are handled in parallel.
It would look pretty ugly. But having no output at all is very undesirable, because it makes the system look hung, and particularly because (as in this case) the most recently-printed messages may have nothing at all to do with what is blocking the boot.
Another option would be to parse fsck's output and forward it in some form to Plymouth to show in the normal progress bar. But I am not sure if Plymouth can actually do that. (Ray?) Also, this doesn't solve the
Also, it doesn't solve the problem in cases where one isn't running plymouth, which I hope is a use case that systemd intends to take seriously.
I think not having the progress bar for F15 is acceptable, but I am all ears for suggestions how to implement this best post-F15. Especially if somebody wants to do the work... ;-)
I don't think it's a blocker, but it should probably get put into the release notes. Things like this cause end-user and support-center pain.
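For the Plymouth-less console, one rough idea is a plain-text status block with one line per filesystem being checked, which at least shows that something is still moving. The Python below only sketches the rendering; the function names and layout are invented here, and the per-device fractions are assumed to come from whatever ends up reading fsck's progress information.

```python
def render_bar(device: str, fraction: float, width: int = 20) -> str:
    """Render one text progress line for a single device."""
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the 0..1 range
    filled = int(fraction * width)
    bar = "#" * filled + "." * (width - filled)
    return f"{device:<12} [{bar}] {fraction * 100:5.1f}%"


def render_status(progress: dict) -> str:
    """Render all devices, sorted by name so lines keep a stable order."""
    return "\n".join(render_bar(dev, frac) for dev, frac in sorted(progress.items()))


# Hypothetical state of two checks running in parallel.
print(render_status({"/dev/md1": 0.25, "/dev/md0": 0.5}))
```

A fixed, sorted order keeps parallel checks from jumping around on screen, though redrawing the block in place on a real console would still need cursor handling that is not shown here.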
On Tue, 22.02.11 12:55, Matthew Miller (mattdm@mattdm.org) wrote:
On Tue, Feb 22, 2011 at 01:40:51PM +0100, Lennart Poettering wrote:
the same time. That should give you a minimal idea of what is going on. However, there is no progress bar. In the syslog output a progress bar would probably not make much sense. fsck (at least in the ext2/3/4 implementation) supports the -C parameter to direct progress bar information (and only the progress bar) to a specific fd. However, that information is intended for applications to parse, not to show on the screen. I am not really sure what to do about this. One option would be to extend -C to show a "human readable" progress bar on the file name passed. Then we could just invoke fsck with "-C /dev/console" and would get a progress bar printed on the console, and the console only. But it keeps me wondering what that would look like if multiple filesystems are handled in parallel.
It would look pretty ugly. But having no output at all is very undesirable, because it makes the system look hung, and particularly because (as in this case) the most recently-printed messages may have nothing at all to do with what is blocking the boot.
What we could do is start fsck always with "-V", i.e. verbose mode. Should we?
Another option would be to parse fsck's output and forward it in some form to Plymouth to show in the normal progress bar. But I am not sure if Plymouth can actually do that. (Ray?) Also, this doesn't solve the
Also, it doesn't solve the problem in cases where one isn't running plymouth, which I hope is a use case that systemd intends to take seriously.
Well, I think it's completely OK if we don't show progress bars when Plymouth isn't used. Also, AFAIK we don't officially support Plymouth-less boots on Fedora right now, even though they should actually work quite well, and in systemd in general we definitely want to support boots both with and without Plymouth.
Lennart
Hi,
On Tue, Feb 22, 2011 at 7:40 AM, Lennart Poettering mzerqung@0pointer.de wrote:
Another option would be to parse fsck's output and forward it in some form to Plymouth to show in the normal progress bar. But I am not sure if Plymouth can actually do that. (Ray?) Also, this doesn't solve the problem that we might get multiple streams of progress bar information at the same time and for presentation in plymouth we'd need to somehow integrate them into one, and whose responsibility would that be? Plymouth?
Yeah, fsck integration is something I've wanted to add to plymouth for a long time.
--Ray
On 02/24/2011 01:43 AM, Ray Strode wrote:
Yeah, fsck integration is something I've wanted to add to plymouth for a long time.
Yeah I might say it's rather needed....
The novice end user behaviour I saw when fsck was running in the background was *interesting* but not *surprising*..
The novice end user waited a little longer than it normally took him to reach GDM, then pressed the reboot button. He repeated this about 3 to 5 times, depending on his patience, then started powering the machine off and on until he finally gave up and called for help.
I think poor fsck has never suffered as much as it did then, and all it wanted was to do the job it was programmed for: check the consistency of a file system for its overlord, the end user.
So please, Ray, please save fsck from the abuse of novice end users; all it wants is to perform the task it is programmed for *uninterrupted*...
he he =)
JBG