[CC'd to fedora-virt]
On Wed, Jun 17, 2009 at 05:25:26PM -0400, Christopher Johnston wrote:
I had a question about febootstrap and a specific use case of mine at my company, where I am implementing a stateless solution for our grid. I wanted to take the initramfs image that febootstrap produces and put a custom linuxrc in it that takes the contents of the initramfs, copies them directly into a tmpfs filesystem that's mounted as /sysroot, does a pivot_root into it, and then does a full execution of init.
I'm a bit confused why you'd want to do this, but maybe I'm missing something. Why copy the initramfs image into a tmpfs? The initramfs is already loaded into kernel memory at boot time and so it has all the same properties / benefits of tmpfs.
I have been using nash to do this, and I have run into a few issues when the system boots up: once init forks off, it segfaults. Below is my custom linuxrc:
I would tend to avoid using nash. If it's possible to add bash to the image, just use bash.
You might find it helpful to look at what we do in libguestfs, here:
http://git.et.redhat.com/?p=libguestfs.git;a=blob;f=appliance/make.sh.in;hb=...
#!/sbin/nash
mount -t proc /proc /proc
echo Mounting proc filesystem
echo Mounting sysfs filesystem
mount -t sysfs /sys /sys
echo Creating /dev
mount -o mode=0755 -t tmpfs /dev /dev
mkdir /dev/pts
mount -t devpts -o gid=5,mode=620 /dev/pts /dev/pts
mkdir /dev/shm
mkdir /dev/mapper
echo Creating initial device nodes
mknod /dev/null c 1 3
mknod /dev/zero c 1 5
mknod /dev/systty c 4 0
mknod /dev/tty c 5 0
mknod /dev/console c 5 1
mknod /dev/ptmx c 5 2
mknod /dev/fb c 29 0
mknod /dev/tty0 c 4 0
mknod /dev/tty1 c 4 1
mknod /dev/tty12 c 4 12
mknod /dev/ttyS0 c 4 64
mknod /dev/ttyS1 c 4 65
mknod /dev/ttyS2 c 4 66
mknod /dev/ttyS3 c 4 67
/lib/udev/console_init tty0
mkchardevs
mkblkdevs
echo Creating tmpfs filesystem
mkdir -p /sysroot
mkrootdev -t tmpfs -o defaults,ro /dev/root
mount -o mode=0755 -t tmpfs /dev/root /sysroot
mkdir -p /sysroot/proc
mkdir -p /sysroot/sys
mkdir -p /sysroot/.oldroot
echo Copying rootfs->tmpfs
cp -a bin /sysroot
cp -a dev /sysroot
cp -a etc /sysroot
cp -a home /sysroot
cp -a lib /sysroot
cp -a lib64 /sysroot
cp -a mnt /sysroot
cp -a sbin /sysroot
cp -a tmp /sysroot
cp -a usr /sysroot
cp -a var /sysroot
cp -a root /sysroot
echo Setting up the new root tmpfs filesystem
setuproot
echo Switching from rootfs to tmpfs
switchroot
-Chris
Rich.
On Wed, Jun 17, 2009 at 5:38 PM, Richard W.M. Jones <rjones@redhat.com> wrote:
[...]
--
Richard Jones, Emerging Technologies, Red Hat http://et.redhat.com/~rjones
virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows.
http://et.redhat.com/~rjones/virt-df/
Rich,
The properties may be the same, but AFAIK you cannot limit the size of rootfs, so it can use all of memory. With tmpfs you can specify how large the filesystem will be, which keeps the root filesystem at a fixed size. There are also some newer options that let you specify which NUMA node to take memory from, or to interleave across nodes. Those have some benefits for our grid/HPC workloads.
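As a rough illustration of the tmpfs options mentioned above, here is a minimal sketch that builds a mount option string with a size cap and an optional NUMA memory policy. The helper name tmpfs_opts and the values 2g and interleave are assumptions made up for the example; size= and mpol= are standard tmpfs mount options, though mpol= needs a NUMA-enabled kernel.

```shell
#!/bin/sh
# Sketch: build the option string for a size-limited tmpfs root.
# tmpfs_opts is a hypothetical helper; the values passed below are examples.

tmpfs_opts() {
    size="$1"        # e.g. 2g or 512m; tmpfs also accepts a percentage of RAM
    mpol="${2:-}"    # optional NUMA policy, e.g. interleave or bind:0
    opts="mode=0755,size=${size}"
    if [ -n "$mpol" ]; then
        opts="${opts},mpol=${mpol}"
    fi
    printf '%s\n' "$opts"
}

# Print the command rather than running it, since mounting needs root.
echo "mount -t tmpfs -o $(tmpfs_opts 2g interleave) tmpfs /sysroot"
```

Leaving out the second argument gives a size-limited tmpfs with the default memory policy.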
Now, to get this going for my testing I am comfortable using rootfs (pivotroot and switchroot don't seem to work in nash).
What I have done in my testing here is generate a linuxrc, which I posted. I attempted to use bash instead of nash and have it just fork /sbin/init, but that also did not work (Upstart's init does not seem to work well here).
I looked over the script you are using here for your virtual machine, but what are you actually doing to start init so the usual RC stuff can start running?
-Chris
On Fri, Jun 19, 2009 at 05:44:52PM -0400, Christopher Johnston wrote:
Now, to get this going for my testing I am comfortable using rootfs (pivotroot and switchroot don't seem to work in nash).
I'm a bit surprised about that, because obviously switchroot is used all the time in mkinitrd-built initrds, eg. look at the following lines in mkinitrd:
emit "echo Switching to new root and running init."
emit "switchroot"
emit "echo Booting has failed."
emit "sleep -1"
When you say "don't seem to work", is there any diagnostic?
What I have done in my testing here is generate a linuxrc, which I posted. I attempted to use bash instead of nash and have it just fork /sbin/init, but that also did not work (Upstart's init does not seem to work well here).
I looked over the script you are using here for your virtual machine, but what are you actually doing to start init so the usual RC stuff can start running?
Well in the init script we use, we _don't_ run any rc stuff. It's just for a small appliance, so we set up what we need (ifconfig lo and eth0), modprobe a few modules, and then run our daemon directly.
Having said that, it should easily be possible to use febootstrap to run the ordinary scripts, but the problem probably isn't febootstrap, but some bug/difference/sensitivity in /init, nash, permissions etc.
Is the upstart %post script failing? febootstrap runs all %post scripts under fakeroot+fakechroot, and they can sometimes behave differently.
Are you booting these on real hardware? I suggest testing it under qemu - it's much more controllable, and allows you to test things faster.
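For the qemu suggestion, a minimal sketch of a direct kernel boot might look like the following. The file names vmlinuz and initramfs.img, the memory size, and the serial console argument are placeholders, and the qemu binary name varies between distributions and versions. The helper only prints the command, so it can be inspected before being run.

```shell
#!/bin/sh
# Sketch: boot a kernel and initramfs directly under qemu.
# qemu_cmd is a hypothetical helper; paths and sizes are placeholders.

qemu_cmd() {
    kernel="$1"
    initrd="$2"
    printf '%s\n' "qemu-system-x86_64 -m 1024 -nographic -kernel $kernel -initrd $initrd -append console=ttyS0"
}

# Print the command line; run it by hand once the paths are right.
qemu_cmd vmlinuz initramfs.img
```

With -nographic and console=ttyS0, kernel and init messages appear on the terminal, which makes it easier to capture any diagnostic from a failing switchroot.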
Rich.
Rich,
I wanted to update you on what I have been working on with febootstrap. I managed to get a fully booted F10 system that can be PXE booted and pivots the rootfs filesystem into tmpfs. It then forks init off to start the usual boot process. I am very happy so far with how the project is coming along. The source of the problem turned out to be the upx binary minimization step; once I disabled that, it worked great.
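The copy-and-switch sequence described here could be sketched in plain shell roughly as follows. This is not the script actually used: it assumes a switch_root binary (from busybox or util-linux) is present in the image, and the 2g size and the directory list are illustrative. It defaults to a dry run that only prints each command.

```shell
#!/bin/sh
# Sketch of an /init that copies rootfs into a size-limited tmpfs and
# switches to it. Assumes switch_root exists in the image; the size and
# directory list are examples, not taken from the real setup.

DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

copy_root_to_tmpfs() {
    run mkdir -p /sysroot
    run mount -t tmpfs -o mode=0755,size=2g tmpfs /sysroot
    for d in bin dev etc home lib lib64 mnt root sbin tmp usr var; do
        run cp -a "/$d" /sysroot/
    done
    run mkdir -p /sysroot/proc /sysroot/sys
}

copy_root_to_tmpfs
# exec replaces this script, so the real init keeps PID 1.
run exec switch_root /sysroot /sbin/init
```

Setting DRY_RUN=0 makes the same script execute the commands, which only makes sense as PID 1 inside the initramfs.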
I am running into another smaller issue where the rpmdb gets corrupted after the system boots. I am also not running the minimizer to remove the rpmdb as I need this at boot time when the system is up and running to install some required RPMs.
Here is the chroot environment for the image. Strangely, rpm -qa doesn't report anything, yet the files are all intact (strings Packages even shows data). Have you seen this, where the rpmdb gets corrupted at boot time?
CHROOT:
[root@ns4 /]# cd /var/lib/rpm/
[root@ns4 rpm]# ls
Basenames     Dirnames     Group       Name      Providename     Pubkeys      Requireversion  Sigmd5       __db.000  __db.002  __db.004
Conflictname  Filedigests  Installtid  Packages  Provideversion  Requirename  Sha1header      Triggername  __db.001  __db.003
[root@ns4 rpm]# rm -f __db.00*
[root@ns4 rpm]# rpm --rebuilddb
[root@ns4 rpm]# rpm -qa
[root@ns4 rpm]# ls -ltr
total 5920
-rw-r--r-- 1 7103 users   12288 Jul 2 17:01 Pubkeys
-rw-r--r-- 1 7103 users   12288 Jul 2 17:02 Conflictname
-rw-r--r-- 1 7103 users   12288 Jul 2 17:02 Triggername
-rw-r--r-- 1 7103 users 4526080 Jul 2 17:02 Packages
-rw-r--r-- 1 7103 users   12288 Jul 2 17:02 Name
-rw-r--r-- 1 7103 users  684032 Jul 2 17:02 Basenames
-rw-r--r-- 1 7103 users   12288 Jul 2 17:02 Group
-rw-r--r-- 1 7103 users   65536 Jul 2 17:02 Requirename
-rw-r--r-- 1 7103 users   86016 Jul 2 17:02 Providename
-rw-r--r-- 1 7103 users  118784 Jul 2 17:02 Dirnames
-rw-r--r-- 1 7103 users   45056 Jul 2 17:02 Requireversion
-rw-r--r-- 1 7103 users   32768 Jul 2 17:02 Provideversion
-rw-r--r-- 1 7103 users   12288 Jul 2 17:02 Installtid
-rw-r--r-- 1 7103 users   12288 Jul 2 17:02 Sigmd5
-rw-r--r-- 1 7103 users   24576 Jul 2 17:02 Sha1header
-rw-r--r-- 1 7103 users  684032 Jul 2 17:02 Filedigests
BOOTED SYSTEM:
[root@core021 yum.repos.d]# rpm -qa
rpmdb: /var/lib/rpm/Packages: unsupported hash version: 9
error: cannot open Packages index using db3 - Invalid argument (22)
error: cannot open Packages database in /var/lib/rpm
rpmdb: /var/lib/rpm/Packages: unsupported hash version: 9
error: cannot open Packages database in /var/lib/rpm
On Fri, Jun 19, 2009 at 6:57 PM, Richard W.M. Jones <rjones@redhat.com> wrote:
[...]
--
Richard Jones, Emerging Technologies, Red Hat http://et.redhat.com/~rjones
Read my programming blog: http://rwmj.wordpress.com
Fedora now supports 75 OCaml packages (the OPEN alternative to F#)
http://cocan.org/getting_started_with_ocaml_on_red_hat_and_fedora
On Tue, Jul 07, 2009 at 10:58:32AM -0400, Christopher Johnston wrote:
I wanted to update you on what I have been working on with febootstrap. I managed to get a fully booted F10 system that can be PXE booted and pivots the rootfs filesystem into tmpfs. It then forks init off to start the usual boot process. I am very happy so far with how the project is coming along. The source of the problem turned out to be the upx binary minimization step; once I disabled that, it worked great.
Yes I should probably just remove upx support, since it's nothing but trouble and saves very little space.
I am running into another smaller issue where the rpmdb gets corrupted after the system boots. I am also not running the minimizer to remove the rpmdb as I need this at boot time when the system is up and running to install some required RPMs.
For febootstrap, remember that you are using the host yum/rpm to initially create the initramfs. When you later boot the system you're using the installed yum/rpm, which could well be different versions. (Same thing happens with mock BTW).
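One way to catch this early is to compare the rpm version on the build host with the one installed in the image before relying on the image's rpmdb. The sketch below only compares two version strings supplied by the caller; check_rpm_match is a hypothetical helper and the version numbers are made-up examples. The rpm version is used here as a rough proxy, on the assumption that a different rpm usually comes with a different database library.

```shell
#!/bin/sh
# Sketch: warn when the rpm that built the image differs from the rpm
# that will read its database after boot. check_rpm_match is hypothetical
# and the versions passed below are invented for illustration.

check_rpm_match() {
    host_ver="$1"
    image_ver="$2"
    if [ "$host_ver" = "$image_ver" ]; then
        echo "match: rpm $host_ver"
    else
        echo "mismatch: host rpm $host_ver, image rpm $image_ver"
        return 1
    fi
}

# On a real host the first argument could be taken from: rpm --version
check_rpm_match 4.7.0 4.6.1 || echo "warning: rpmdb may be unreadable after boot"
```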
Here is the chroot environment for the image. Strangely, rpm -qa doesn't report anything, yet the files are all intact (strings Packages even shows data). Have you seen this, where the rpmdb gets corrupted at boot time?
CHROOT:
[root@ns4 /]# cd /var/lib/rpm/
[root@ns4 rpm]# ls
Basenames     Dirnames     Group       Name      Providename     Pubkeys      Requireversion  Sigmd5       __db.000  __db.002  __db.004
Conflictname  Filedigests  Installtid  Packages  Provideversion  Requirename  Sha1header      Triggername  __db.001  __db.003
[root@ns4 rpm]# rm -f __db.00*
Is this running using 'febootstrap-run' or using a plain fakechroot or are you really running this as root?
Using 'rm' directly on files in the chroot is usually a bad idea. See the discussion here:
http://git.et.redhat.com/?p=febootstrap.git;a=blob;f=febootstrap.pod;h=e2086...
[root@ns4 rpm]# rpm --rebuilddb
Similarly, it's only safe to use 'febootstrap-run' to run commands. Or run the commands only in a booted system.
[...]
BOOTED SYSTEM:
[root@core021 yum.repos.d]# rpm -qa
rpmdb: /var/lib/rpm/Packages: unsupported hash version: 9
error: cannot open Packages index using db3 - Invalid argument (22)
error: cannot open Packages database in /var/lib/rpm
rpmdb: /var/lib/rpm/Packages: unsupported hash version: 9
error: cannot open Packages database in /var/lib/rpm
See above on difference between host and installed RPM.
Rich.
On Tue, Jul 07, 2009 at 04:26:25PM +0100, Richard W.M. Jones wrote:
Using 'rm' directly on files in the chroot is usually a bad idea. See the discussion here:
http://git.et.redhat.com/?p=febootstrap.git;a=blob;f=febootstrap.pod;h=e2086...
[root@ns4 rpm]# rpm --rebuilddb
Similarly, it's only safe to use 'febootstrap-run' to run commands.
You might also want to consider the hoops that libguestfs goes through so it only runs commands safely:
http://git.et.redhat.com/?p=libguestfs.git;a=blob;f=appliance/make.sh.in;hb=...
Every time we run a command on the initramfs, it's wrapped in a call to febootstrap-run or febootstrap-install (the latter to install extra files).
If you don't do that you can end up with a corrupted/mismatching fakeroot.log which causes all sorts of subtle and unpredictable problems. eg. We had one where "find" would segfault at random times, and we traced it back to a single use where we had copied a file into the fakeroot without using febootstrap-install.
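The wrapping pattern described above might be sketched as two small helpers. The argument order shown for febootstrap-run and febootstrap-install is assumed from the febootstrap man pages and should be checked against the installed version; the directory name initramfs and the file names are placeholders. Both helpers only print the command they would run.

```shell
#!/bin/sh
# Sketch: route every change to the image through the febootstrap helpers
# so that fakeroot.log stays consistent. in_image and install_file are
# hypothetical wrappers; the argument order is an assumption to verify.

ROOT=initramfs   # placeholder for the febootstrap-created directory

in_image() {
    # Would run a command inside the image under fakeroot+fakechroot.
    echo febootstrap-run "$ROOT" -- "$@"
}

install_file() {
    # Args: local file, target path, mode, owner.group
    echo febootstrap-install "$ROOT" "$1" "$2" "$3" "$4"
}

in_image rm -f /var/lib/rpm/__db.001
install_file ./init /init 0755 root.root
```

Removing the echo from each helper would execute the commands instead of printing them.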
Rich.