Chris Murphy writes:
I suggested earlier in the thread that it automatically put itself in
its own scope. But I don't know if that solves this problem, and even
if it does it's only one part of a much bigger set of problems that
can cause updates to implode. In this case, sure maybe dnf survives,
but X is gone, the user has no idea what just happened, they have no
idea dnf is still applying their updates in the background, so they
don't know they shouldn't reboot.
Well, perhaps I misunderstood the situation at hand. But I'm under the
impression that, right now, dnf just blows up and leaves things in an
inconsistent state.
Maybe this is too much of a radical idea, but it makes sense to me to
address this first, and then maybe worry about what is the best way for dnf
to forge ahead when it realizes that its terminal is gone.
Right now, if I'm in the middle of a dnf update, and I try to start another
dnf operation, the second operation will fail. So dnf is smart enough to
exclusive-lock itself. I also have a dim recollection of some plugin that
said something about blocking system shutdowns, in some context. I'm a bit
hazy on that, but in any case, this is not something that must be solved on
day 1. A non-sophisticated user is not going to drop to a terminal prompt,
and execute dnf manually. I think that it's a reasonable proposition that
someone who does do that, and ends up killing X for some reason, will be
able to figure out if dnf is still running, or not, in the background.
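To illustrate the exclusive-lock behaviour described above, here is a minimal sketch in Python. It is not dnf's actual locking code; the `try_lock` helper and the lock file name are made up for the example, and it assumes a POSIX system where `fcntl.flock()` is available.

```python
import fcntl
import os
import tempfile


def try_lock(path):
    """Return a file descriptor holding an exclusive lock on path,
    or None if some other open file already holds the lock.
    (Hypothetical helper, not taken from dnf.)"""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB makes the call fail at once instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        os.close(fd)
        return None
    return fd


if __name__ == "__main__":
    lock_path = os.path.join(tempfile.mkdtemp(), "updater.lock")
    first = try_lock(lock_path)    # the running "update"
    second = try_lock(lock_path)   # a second operation started meanwhile
    print("first got lock:", first is not None)
    print("second refused:", second is None)
```

The kernel drops the lock when the holder exits, even on SIGKILL, so a stale lock file on its own does not block the next run.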
When you start making the list of things that have to work exactly
right for system updates in a GUI, you get basically two things:
1. fuck it, we're making the user log out and do the update offline, and;
2. fuck it, we're doing something completely different ala OSTree, ala
Snapper, so the user can do a rollback if things go wrong.
The first is what happens on the other 90% of the world's computers,
including Android system updates.
But ordinary regular app updates will happily run on cruise control, without
bringing the system down into single user mode. If Android can do that, I
see no reason why Fedora can't, either. The only time you need to reboot an
Android device is for a kernel-level update.
Strictly speaking Fedora doesn't make you do the first one, but it's
*well* understood for a long time how fragile this is which is why
offline updates was created.
Well, this is a surprise to me. I guess my faith in dnf was misplaced.
I spent my time in the trenches. I spent a lot of time writing system
daemons that I expected to recover from SIGKILL automatically. It wasn't
exactly easy, but it was not an impossible task either. I'm not expecting
any medals for that, just mentioning the fact that it's not, and should not
be, considered to be a lost art. As I'm told it is now, apparently.
>> It's why openSUSE has spent a ton of resources, and a few bloody
>> noses, getting completely atomic updates working with Btrfs and
>> snapper, with very fine rollback capabilities.
>
>
> You do not need atomic updates to install a signal handler for SIGHUP or
> SIGTERM. And maybe issue a setsid() call, beforehand.
>
> This shouldn't be rocket science.
OK you clearly don't understand the complexity. So go ahead and do
Yes, I do not. But I am more than willing to be educated. Please explain to
me the complexity in `setsid()`, and a few calls to `sigaction()`, and maybe
an occasional `fork()`, here and there. Granted, when bleep hits the fan,
you wouldn't immediately know where things stand without further digging.
But, at least you are not going to end up with an unbootable brick, as long
as you don't panic, and take things one step at a time.
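For reference, the `fork()` / `setsid()` / signal-handler combination can be sketched roughly as follows. This is in Python rather than C (Python's `signal.signal()` stands in for `sigaction()`), and the names `run_detached`, `sample_work` and `note_stop_request` are invented for the example. It only shows that the child survives SIGHUP and SIGTERM in its own session; it says nothing about what dnf does or should do.

```python
import os
import signal

stop_requests = []


def note_stop_request(signum, frame):
    # Record the request; real code would finish the current step, then stop.
    stop_requests.append(signum)


def run_detached(work):
    """Run work() in a forked child that leads its own session.
    Returns the child's exit code, or minus the signal that killed it."""
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            os.setsid()  # new session, no controlling terminal
            signal.signal(signal.SIGHUP, signal.SIG_IGN)
            signal.signal(signal.SIGTERM, note_stop_request)
            status = work()
        finally:
            os._exit(status)  # never fall back into the parent's code
    _, raw = os.waitpid(pid, 0)
    if os.WIFEXITED(raw):
        return os.WEXITSTATUS(raw)
    return -os.WTERMSIG(raw)


def sample_work():
    # Both signals would kill a process with default dispositions.
    os.kill(os.getpid(), signal.SIGHUP)
    os.kill(os.getpid(), signal.SIGTERM)
    # A session leader's session id equals its own pid.
    return 0 if os.getsid(0) == os.getpid() else 1


if __name__ == "__main__":
    print("child exit code:", run_detached(sample_work))
```

Here the parent waits for the child only so the example can report a result; a real detached worker would be left running. SIGKILL cannot be caught at all, which is why recovery on restart is a separate matter.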
things your way, and when you don't like the result from hitting the
Hurt Me Button, complain and criticize people all you want, wait for
the silence from those who could not give two shits, which means: stop
doing things wrong, start doing them correctly, or fix it all yourself
Mr. Genius. And then go write your own updater.
You know, actually I did, about ten years ago, I think – don't recall the
exact timeframe. Nothing earth-shattering; just the same basic functionality
as rpm: basic install/update/delete, and some dependency tracking.
And, by the way, an atomic package commit. If the process was SIGKILL-ed in
the middle of a transaction – which might've involved a multi-package
update – the install would finish automatically, when it got restarted. I
spent some time on that because I recall that at the time rpm would do that
too. I am certain that at least at some point, when you started rpm after
something blew up, it would yell at you that a previous transaction hadn't
been finished, and offer to clean it up for you.
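One common way to get that "finish on restart" behaviour is a roll-forward journal; a minimal sketch follows. It is an assumption for illustration only, not how rpm or the updater mentioned above worked. The journal file name and the `begin` / `finish` functions are hypothetical, and fsync of the containing directory is left out for brevity.

```python
import json
import os
import tempfile

JOURNAL = "transaction.journal"  # hypothetical file name


def begin(root, files):
    """Stage the new contents next to their targets, then record the
    intended renames in a journal. Nothing live is touched yet."""
    steps = []
    for name, data in files.items():
        staged = os.path.join(root, name + ".new")
        with open(staged, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        steps.append([staged, os.path.join(root, name)])
    tmp = os.path.join(root, JOURNAL + ".tmp")
    with open(tmp, "w") as f:
        json.dump(steps, f)
        f.flush()
        os.fsync(f.fileno())
    # The rename makes the complete journal appear in a single step.
    os.replace(tmp, os.path.join(root, JOURNAL))


def finish(root):
    """Apply, or re-apply after a crash, every journaled rename.
    Returns True if a pending transaction was found and completed."""
    journal = os.path.join(root, JOURNAL)
    if not os.path.exists(journal):
        return False
    with open(journal) as f:
        steps = json.load(f)
    for staged, final in steps:
        if os.path.exists(staged):  # steps already done are skipped
            os.replace(staged, final)
    os.remove(journal)
    return True


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    begin(root, {"a": b"new-a", "b": b"new-b"})
    # Pretend we were killed after the first rename only.
    os.replace(os.path.join(root, "a.new"), os.path.join(root, "a"))
    print("recovered on restart:", finish(root))
    print("files now present:", sorted(os.listdir(root)))
```

If the process dies before the journal exists, only stray `.new` files are left and the installed files are untouched. If it dies afterwards, calling `finish()` at the next start completes the renames, and repeating it is harmless.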
So, it seems that this doesn't happen anymore. I guess that's the price of
progress.