A fuse based initfs

Douglas McClendon dmc.fedora at filteredperception.org
Thu Aug 23 02:20:25 UTC 2007


Jon Nettleton wrote:
> This all started with a somewhat simple task.  I wanted to start rhgb/gdm
> as early in the boot process as possible.  Basically kernel->disk->gui.  How
> hard could it be?  Well, not fun really.  My final solution, which is
> unacceptable for Fedora right now, is patching the kernel with unionfs and
> using that as an overlay for /var and /tmp.  That gave me the transparent
> filesystem overlay I needed to be able to start up a nice gui and allow
> things like fsck to happen underneath without disturbing things.  Even with
> this solution I still don't have init restarting gdm if it dies.
> 
> So I thought, and discarded and thought some more.  Now I want anyone
> willing to comment on my thoughts.


Using unionfs for your rootfs sounds like a feature that is a bad idea 
today, but ultimately very desirable.  I would love to have my rootfs 
have a dozen different layers, some of them coming from peer-to-peer 
distributed filesystems.

But at the moment, using the current unionfs and/or fuse for the rootfs 
seems like a bad idea.  Admittedly, all I can do is wave my hands and 
point to vague memories of lots of issues, but I suspect the issues are 
real.

In general, it sounds like you are outlining several different 
problem/solutions, but I don't think they all need to be as tightly 
integrated as you suggest.

For instance, take gdm in the initramfs (or very, very early).  Why do 
you need this copy-on-write rootfs stuff?  Why not just have a tmpfs, 
and a gdm configuration that looks there?  Likewise for the early 
logging stuff.  Later during boot, the early-boot logfiles in tmpfs can 
be copied to /var/log.  This isn't as nice as the automagic 
unionfs/dm-snapshot-merge merging, but I don't think that is necessary, 
or worth the steps you are taking to get it.
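
As a rough sketch of the tmpfs approach above: once the real /var is 
writable, something could append each early-boot logfile to the file of 
the same name under /var/log and then drop the tmpfs copy.  The function 
name and the directory layout are assumptions for illustration only, not 
anything gdm, syslog or initscripts actually ship.

```python
import shutil
from pathlib import Path


def merge_early_logs(early_dir, log_dir):
    """Append every regular file in early_dir to the same-named file in
    log_dir, then delete the early copy.  Returns the merged file names.

    early_dir stands in for a tmpfs mount used during early boot and
    log_dir for /var/log; both are hypothetical paths chosen by the caller.
    """
    early_dir, log_dir = Path(early_dir), Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    merged = []
    for src in sorted(early_dir.iterdir()):
        if not src.is_file():
            continue  # skip sockets, pipes, subdirectories
        dst = log_dir / src.name
        # Open the destination in append mode so entries from previous
        # boots already in the log are preserved.
        with src.open("rb") as fin, dst.open("ab") as fout:
            shutil.copyfileobj(fin, fout)
        src.unlink()
        merged.append(src.name)
    return merged
```

This only handles the logfile case.  It does nothing for processes that 
still hold the tmpfs files open, which is the part the unionfs approach 
would get for free.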

On the issue of not starting services that might not be needed 
(bluetooth, networking, smartcards, ...), I think various Fedora wiki 
pages talk about DBus as a solution to that problem.

I think there are some flaws with your strategy (specifically wrt 
unionfs).  E.g. in (4) you mention flushing the fs down to lower layers 
and disappearing.  Can you actually make the unionfs disappear?  Aren't 
there some obscure limitations of unionfs (even when only containing a 
single layer) that will make it unpalatable for the general case?  (I 
have vague memories of something called sendfile and apache, and some 
types of symlinks).

Of course, if you are talking about devicemapper snapshot merging a la 
markmc's patchset, then I am all for it (just because I have other plans 
for that functionality).  But then I am also confused about why you 
would use both that and unionfs, and what exactly you are using each for.

But I definitely get the feeling that you are moving in the right 
general direction towards things that I agree should be improved.

-dmc


> 
> My proposal is a userland filesystem that is specifically built to
> work with sysvinit to give it more functionality without changing it.  If
> you want a standard sysvinit Unix boot, just don't pass the parameter on
> the kernel command line, no problemo.  However, with it enabled you would
> "theoretically" get the following.
> 
> 1)  Basic cached ram overlay.  This could possibly be used to replace our
> readahead scripts for disk caching.  The more immediate need is a temporary
> ram file-system to allow system processes to write logs, status, pipes to
> before we have had a chance to verify disk integrity.  This should get us
> the ability to provide nice X based gui tools for first boot, system
> recovery, and possibly encryption unlocking.
> 
> 2)  Better init logging.  With /var writable (at least in RAM) we can
> start syslog nice and early.
> 
> Just those two things give us a nicer gui boot screen and possibly cut the
> time of launching X twice off our boot sequence.  Now we go one step
> further.
> 
> 3)  We use the abstraction layer to manipulate the startup scripts that init
> sees in /etc/rcX.d.  This would require:
>      A)  Netlink support.  Do we or don't we have a network interface?  If
>          we don't, then automatically remove all network-dependent
>          services from init.  If the network comes up later in the process
>          and init is still running (we know that because we can keep
>          track of /var/lock/subsys), the filesystem re-adds them later in
>          the process.
>      B)  General dependencies.  As I mentioned, we can keep track of what
>          has started using /var/lock/subsys or /var/run.  If something
>          fails, move the dependent scripts out of the way so init doesn't
>          try to start them.
>      C)  Ability to maximize IO throughput.  Well, this is just a thought.
>          Right now we see one of the major bottlenecks in our init process
>          as overloading the IO subsystem.  With an intelligent read-only
>          overlay we could do basic metrics and possibly wait a second
>          longer to start the next process, knowing it will shorten the
>          time to launch the next service by 2 seconds.  I have no proof
>          this will work, but after looking at those bootchart graphs
>          enough, some crazy ideas cross your mind.
> 
> 4)  After the init process is done, the filesystem flushes itself to the
> lower-layer writable disks and disappears.
> 
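
A rough sketch of the bookkeeping in 3B, leaving the filesystem itself 
aside: treat the lock files in /var/lock/subsys as the set of services 
that started, treat anything attempted without a lock file as failed, and 
hide every pending script that depends on a failure, directly or through 
another script.  The dependency table and all function names here are 
hypothetical; plain sysvinit does not hand you a table in this form.

```python
from pathlib import Path


def started_services(subsys_dir):
    """Names of services with a lock file in subsys_dir (e.g. /var/lock/subsys)."""
    return {p.name for p in Path(subsys_dir).iterdir() if p.is_file()}


def blocked_by_failure(deps, failed):
    """Services that depend, directly or transitively, on a failed service.

    deps maps a service name to the list of services it needs; this table
    is an assumption made for the example.
    """
    blocked = set()
    changed = True
    while changed:  # repeat until no new service becomes blocked
        changed = False
        for svc, needs in deps.items():
            if svc in blocked or svc in failed:
                continue
            if any(n in failed or n in blocked for n in needs):
                blocked.add(svc)
                changed = True
    return blocked


def visible_scripts(pending, deps, attempted, started):
    """Pending scripts init should still see, in their original order."""
    failed = set(attempted) - set(started)
    blocked = blocked_by_failure(deps, failed)
    return [s for s in pending if s not in failed and s not in blocked]
```

For example, with nfs needing network and autofs needing nfs, a failed 
network start would hide both nfs and autofs and leave unrelated scripts 
alone.
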
> First, sorry if this is wrapped horribly.  I am using gmail and it doesn't
> lend itself to formatting long mails like this.
> Second, let's talk about it.  As I said, this just came to my mind as
> something that doesn't exist, and might possibly help us build a better
> system around what we already have.
> 
> Jon
> 
> 



