Fedora 16 beta vice Knoppix

Wed Oct 5 13:56:39 UTC 2011

On Wed, Oct 5, 2011 at 15:28, Lennart Poettering <mzerqung at 0pointer.de> wrote:
> On Tue, 04.10.11 19:40, Adam Williamson (awilliam at redhat.com) wrote:
>> On Tue, 2011-10-04 at 16:55 -0800, Jef Spaleta wrote:
>> > On Tue, Oct 4, 2011 at 3:32 PM, JB <jb.1234abcd at gmail.com> wrote:
>> > >  13837ms udev-settle.service
>> > >  11392ms plymouth-start.service
>> >
>> >
>> > if you use the plot option instead of blame option and produce the svg
>> > of the service timing you get a better feel for what Lennart was
>> > talking about with regard to the udev settle being problematic.
>> >
>> > I'll try to break it down for you. Keep the following in mind when you
>> > look over the svgs produced in subsequent testing.
>> >
>> > udev-settle.service essentially calls udevadm settle and that's all it does.
>> > udevadm settle takes FOREVER (15 seconds) to return during boot up on
>> > my live media run. But it returns more quickly on an F15 install (3
>> > seconds). I'll check a full F16 beta install soonish.
>>
>> And remember that all udevadm settle does is wait for the udev event
>> queue to empty.
>>
>> So essentially all that's going on here is 'wait for udev to be done',
>> which is a fairly sensible prerequisite for all manner of other bits of
>> boot.
>
> Nah, this is not a sensible prerequisite. User code should *not* have to
> wait for udev to be settled.
>
> The key message Kay and I and everybody else involved in udev/systemd
> and related technologies want to get into everybody's head is that
> applications should never ever expect that "everything is settled",
> since that is simply not possible (i.e. USB init times are unbounded --
> so how do you know that the usb disk fully initialized when you settled
> the udev queue?) and all attempts to fake that are major sources of
> slowness at boot (to deal with USB and stuff people basically just wait
> a couple of seconds, which doesn't fix the problem, just tapes over it).
>
> Or in other words: "udev settle" is a hack and is not part of our boot
> anymore -- unless LVM pulls it in. And the fact that it pulls it in is
> sad, and has been a constant source of complaining from us to the LVM
> folks over the years.
>
> The major point to make here is that all components of the system
> should wait exactly as long as they have to and not longer. More
> specifically: they should wait for the specific hardware they are
> needing but not any longer. Example: when mounting the file systems
> systemd will wait exactly until the point all devices listed in
> /etc/fstab have shown up -- but not any longer before continuing.
>
> And again, in short words:
>
> "udev settle" is a hack. Only broken code needs it. It has no place in a
> modern system.
>
>> The reasons why udev takes a while to be 'done' are more interesting and
>> Lennart went into some of them.
>
> It is completely fine if some probing done by udev rules takes a long
> time. It's not just fine, it's even expected. For example, I have a 3G
> card in my laptop I don't use. Since it has no SIM card it takes about
> 8 seconds to probe (i.e. the firmware finds it funny to reply with an
> 8s delay to AT commands if no SIM card is in the card). Now, there's no
> sensible way around this, since the hw just takes that long to
> probe. As long as these 8s are spent in the background they shouldn't
> matter at all. Except that LVM requires settling of all devices, and
> hence simply enabling LVM means my boot is delayed for a whole 8s. Now
> thankfully, I opted out of LVM when I installed Fedora on my
> machine. That way the 8s probing of the modem continues in the
> background long after gdm is already up.
>
> That's why I mentioned in that earlier mail to ajax that I am not
> concerned that EDID takes so long: because it is OK. What isn't OK is
> that LVM has to wait for EDID and for my 3G modem probing to complete,
> and thus delays our entire boot.
>
> LVM needs to be ported to listen to hotplug events, and make use of
> devices as they show up, instead of expecting that all hardware has
> already shown up and has been probed before LVM is started. For a number
> of reasons: to not slow down the boot artificially, to fix the
> enumeration race and become fully compatible with today's storage
> technology that is much more dynamic than 10 years ago, and to become
> robust.
>
> Please, don't claim that "udev settle" was a sensible prerequisite. It
> isn't. It has no place in today's dynamic hardware.

Just to make sure that the message is clearly understood: there is
nothing sensible in ever making assumptions like 'all devices are
there' or 'we have settled'. That can never be true on today's
systems.

Any system service that today relies at its core on 'udevadm settle',
the scsi-wait-scan module, or any of the other bad hacks in that
category, that is, anything that uses these barriers as a checkpoint
to block on before doing its synchronous actions, should be considered
not hotplug-capable: it is either legacy or needs to be fixed. The
Fedora storage assembly shell scripts really need to be replaced by
something that fits into today's reality.
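
To illustrate what hotplug-capable means here, a minimal Python sketch
of the alternative to blocking on a settle barrier: react to each
device as it is announced, and proceed as soon as the specific devices
a volume needs are present. The class name, the device names and the
way events are delivered are all hypothetical; a real service would
receive these events from udev rather than from direct method calls.

```python
class VolumeAssembler:
    """Tracks the devices one volume needs (hypothetical helper)."""

    def __init__(self, required_devices):
        self.required = set(required_devices)
        self.present = set()

    def device_added(self, name):
        """Handle one 'add' event; True once every required device is present."""
        if name in self.required:
            self.present.add(name)
        return self.present == self.required

    def device_removed(self, name):
        """Handle one 'remove' event; devices may vanish again at any time."""
        self.present.discard(name)
        return self.present == self.required


# Simulated event stream: unrelated devices show up in between.
assembler = VolumeAssembler(["sda2", "sdb1"])
events = ["sr0", "sda2", "ttyUSB0", "sdb1", "card0"]
ready_after = None
for count, name in enumerate(events, start=1):
    if assembler.device_added(name) and ready_after is None:
        ready_after = count  # activate the volume here, do not wait for the rest
```

The volume becomes usable as soon as its own members have arrived;
slow devices it does not depend on never enter the picture.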

There are a few valid cases though, like if tools manually
re-partition a disk from a script or the command line, 'udevadm
settle' can be a useful tool to block until the async events of udev
have created all the symlinks for a partition. There are a few other
cases, like when shutting down the udev daemon from initramfs, where
the use of 'udevadm settle' is justified.
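
A sketch of that re-partitioning case in Python, assuming the script
has just rewritten a partition table and now needs the new partition
node. The function names and the example device path are hypothetical;
--timeout and --exit-if-exists are existing 'udevadm settle' options.

```python
import subprocess


def settle_command(timeout_seconds=30, wait_for=None):
    """Build the udevadm settle command line as an argument list."""
    cmd = ["udevadm", "settle", f"--timeout={timeout_seconds}"]
    if wait_for is not None:
        # Stop waiting early once this specific node exists.
        cmd.append(f"--exit-if-exists={wait_for}")
    return cmd


def settle_after_repartition(wait_for=None, timeout_seconds=30, run=subprocess.run):
    """Block until udev has handled the queued events; True on exit status 0."""
    result = run(settle_command(timeout_seconds, wait_for))
    return result.returncode == 0


# Example call after re-partitioning (hypothetical device path):
# settle_after_repartition(wait_for="/dev/sdb1", timeout_seconds=10)
```

Naming the one node the script needs keeps the wait as narrow as
possible, even in a case where settling is legitimate.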

There is no general rule, but anything that calls 'udevadm settle' is
suspicious and should be checked carefully to see whether it relies on
assumptions that just bet on luck and can't work reliably in hotplug
setups.
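
As a starting point for such a check, a small Python sketch that lists
the lines in a script that call 'udevadm settle'. The function name
and the sample script are made up for illustration, and a hit only
marks a line for manual review; it says nothing about whether that
particular use is justified.

```python
import re

SETTLE_CALL = re.compile(r"\budevadm\s+settle\b")


def find_settle_calls(script_text):
    """Return (line_number, line) pairs for non-comment lines calling udevadm settle."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip whole-line shell comments
        if SETTLE_CALL.search(stripped):
            hits.append((number, stripped))
    return hits


# Made-up sample script, not taken from any real package.
sample = (
    "#!/bin/sh\n"
    "# udevadm settle used to live here\n"
    "modprobe dm-mod\n"
    "udevadm settle --timeout=30\n"
    "lvm vgchange -a y\n"
)
hits = find_settle_calls(sample)
```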

Kay