Proposed new document: Kexec/Kdump Usage Guide

Jarod Wilson jwilson at redhat.com
Thu Aug 24 15:25:11 UTC 2006


I've got a basic kexec/kdump usage guide slapped together, just itching
to see the light of day. Sounds like I need to propose it here and get
approval before publishing it, as well as get added to the
DocWritersGroup (or is that necessary for a draft?).

First draft of kexec/kdump usage guide attached in text form.

-- 
Jarod Wilson
jwilson at redhat.com

-------------- next part --------------
Kexec/Kdump HOWTO

Introduction

Kexec and kdump are new features in the 2.6 mainline kernel. Major portions of both are now in Fedora Core 5 and later releases. Their purpose is to provide faster reboots and reliable kernel vmcores for diagnostic purposes.

Overview

Kexec

Kexec is a fastboot mechanism that allows booting a Linux kernel from the context of an already running kernel, without going through the BIOS. BIOS initialization can be very time consuming, especially on big servers with lots of peripherals, so skipping it can save a lot of time for developers who end up rebooting a machine numerous times.

Kdump

Kdump is a new kernel crash dumping mechanism. It is very reliable because the crash dump is captured from the context of a freshly booted kernel rather than from the context of the crashed kernel. Kdump uses kexec to boot into a second kernel whenever the system crashes. This second kernel, often called a capture kernel, boots with very little memory and captures the dump image.

The first kernel reserves a section of memory that the second kernel uses to boot. Because kexec boots the capture kernel without going through the BIOS, the contents of the first kernel's memory are preserved, and that preserved memory is essentially the kernel crash dump.

At this point in time, the standard kernel and capture kernel (kernel-kdump) are two different entities, but work is underway to make the standard kernel relocatable (within memory), and thus usable as a capture kernel, eliminating the need for a separate kdump kernel. This feature is currently targeted for delivery as part of Fedora Core 6's General Availability release.


How to configure kexec:

First up, install kexec-tools:

    # yum install kexec-tools

Now load a kernel with kexec:

    # kver=`uname -r`
    # kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img \
        --command-line="`cat /proc/cmdline`"

NOTE: The above will boot you back into the kernel you're currently running. If you want to load a different kernel, substitute its version in place of `uname -r`.
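
As a rough illustration of substituting a different version, the sketch below builds the same kexec command for any given kernel version. The helper name build_kexec_cmd is hypothetical (it is not part of kexec-tools), and it only prints the command rather than running it, so the result can be reviewed first. It assumes the /boot/vmlinuz-<version> and /boot/initrd-<version>.img naming used above.

```shell
#!/bin/sh
# build_kexec_cmd is a hypothetical helper, not part of kexec-tools.
# It prints (does not execute) the kexec load command for a kernel version.
#   $1: kernel version (defaults to the running kernel)
#   $2: kernel command line (defaults to the running kernel's)
build_kexec_cmd() {
    kver="${1:-$(uname -r)}"
    cmdline="${2:-$(cat /proc/cmdline)}"
    printf 'kexec -l /boot/vmlinuz-%s --initrd=/boot/initrd-%s.img --command-line="%s"\n' \
        "$kver" "$kver" "$cmdline"
}

# Example: show the command for a specific installed kernel.
build_kexec_cmd 2.6.17-1.2570.fc6 "ro root=/dev/VolGroup00/root"
```

Once the printed command looks right, it can be run as root exactly as in the manual steps above.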

Now reboot your system, taking note that it should bypass the BIOS:

    # reboot


How to configure kdump:

To start off, make sure kernel-kdump, kexec-tools and crash are installed:

    # yum install kernel-kdump kexec-tools crash

To be able to do much of anything interesting in the way of debug analysis, you'll want to install the kernel-debuginfo package:

    # yum --enablerepo=\*debuginfo install kernel-debuginfo

Next up, we need to modify some boot parameters to reserve a chunk of memory for the capture kernel. For i386 and x86_64, edit /etc/grub.conf and append "crashkernel=128M@16M" to the end of your kernel line. For ppc64, add the same parameter to the append line in /etc/yaboot.conf, then run /sbin/ybin to install the new configuration (this extra step is not needed for grub).

Examples:
  # cat /etc/grub.conf
# grub.conf generated by anaconda
#
# Note that you do not have to rerun grub after making changes to this file
# NOTICE:  You have a /boot partition.  This means that
#          all kernel and initrd paths are relative to /boot/, eg.
#          root (hd0,0)
#          kernel /vmlinuz-version ro root=/dev/VolGroup00/root
#          initrd /initrd-version.img
#boot=/dev/hda
default=0
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title Fedora Core (2.6.17-1.2570.fc6)
        root (hd0,0)
        kernel /vmlinuz-2.6.17-1.2570.fc6 ro root=/dev/VolGroup00/root crashkernel=128M@16M
        initrd /initrd-2.6.17-1.2570.fc6.img


  # cat /etc/yaboot.conf
# yaboot.conf generated by anaconda

boot=/dev/sda1
init-message=Welcome to Fedora Core!\nHit <TAB> for boot options

partition=2
timeout=80
install=/usr/lib/yaboot/yaboot
delay=5
enablecdboot
enableofboot
enablenetboot
nonvram
fstype=raw

image=/vmlinuz-2.6.17-1.2570.fc6
        label=linux
        read-only
        initrd=/initrd-2.6.17-1.2570.fc6.img
        append="root=LABEL=/ crashkernel=128M@16M"
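
The grub edit can also be scripted. The following is only a sketch: add_crashkernel is a hypothetical helper that reads a grub.conf on standard input and appends the parameter to each uncommented kernel line that does not already carry one. The kernel parses the value as size@offset, so the default used here is crashkernel=128M@16M. It writes to standard output, so review the result before replacing the real file.

```shell
#!/bin/sh
# add_crashkernel is a hypothetical helper for illustration only.
# Reads grub.conf text on stdin, writes the modified text to stdout.
#   $1: parameter to append (defaults to crashkernel=128M@16M)
add_crashkernel() {
    arg="${1:-crashkernel=128M@16M}"
    # Match only real "kernel ..." lines (comment lines start with '#'),
    # and skip lines that already have a crashkernel= parameter.
    awk -v arg="$arg" '
        /^[ \t]*kernel[ \t]/ && !/crashkernel=/ { $0 = $0 " " arg }
        { print }
    '
}

# Example on a single sample line:
printf '        kernel /vmlinuz-2.6.17-1.2570.fc6 ro root=/dev/VolGroup00/root\n' \
    | add_crashkernel
```

Running the output through the helper a second time leaves it unchanged, since lines that already contain crashkernel= are skipped.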


After making said changes, reboot your system so that the 128M of memory starting at the 16M mark is left untouched by the normal system and reserved for the capture kernel. Take note that the output of 'free -m' will show 128M less memory than without this parameter, which is expected. You may be able to get by with less than 128M, but testing with only 64M has proven unreliable of late.
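
After the reboot, it can be worth confirming that the running kernel actually received the parameter. A minimal sketch, assuming a hypothetical helper named crashkernel_arg that extracts the crashkernel= value from a command-line string:

```shell
#!/bin/sh
# crashkernel_arg is a hypothetical helper for illustration only.
# Prints the value of crashkernel= found in the given kernel command line;
# returns non-zero if the parameter is absent.
crashkernel_arg() {
    for word in $1; do
        case "$word" in
            crashkernel=*)
                printf '%s\n' "${word#crashkernel=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Check the running kernel, if /proc/cmdline is available.
if [ -r /proc/cmdline ]; then
    if val=$(crashkernel_arg "$(cat /proc/cmdline)"); then
        echo "crashkernel reservation requested: $val"
    else
        echo "no crashkernel= parameter on the running kernel's command line"
    fi
fi
```

This only shows that the parameter was passed to the kernel. Whether the reservation succeeded is still best judged from the reduced total in 'free -m'.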

Now that you've got that reserved memory region set up, you want to turn on the kdump init script:

    # chkconfig kdump on

Then, start up kdump as well:

    # service kdump start

This should load your kernel-kdump image via kexec, leaving the system ready to capture a vmcore upon crashing. To test this out, you can force-crash your system by echo'ing a c into /proc/sysrq-trigger:

    # echo c > /proc/sysrq-trigger

You should see some panic output, followed by the system restarting into the kdump kernel. When the boot process gets to the point where it starts the kdump service, your vmcore should be copied out to disk (by default, in /var/crash/<YYYY-MM-DD-HH:MM>/vmcore), then the system rebooted back into your normal kernel.
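
Since each dump lands in its own timestamped directory, a small helper can locate the most recent one. This is a sketch only: latest_vmcore is a hypothetical name, and it relies on the <YYYY-MM-DD-HH:MM> directory names sorting chronologically, which holds for the default layout described above.

```shell
#!/bin/sh
# latest_vmcore is a hypothetical helper for illustration only.
# Prints the vmcore in the newest timestamped directory under a crash
# directory (default /var/crash); returns non-zero if none is found.
latest_vmcore() {
    dir="${1:-/var/crash}"
    newest=""
    # Glob results are sorted by name, so the last match is the newest
    # <YYYY-MM-DD-HH:MM> directory.
    for f in "$dir"/*/vmcore; do
        [ -e "$f" ] && newest="$f"
    done
    [ -n "$newest" ] && printf '%s\n' "$newest"
}

latest_vmcore || echo "no vmcore found under /var/crash"
```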

Once back in your normal kernel, you can use the previously installed crash utility in conjunction with the previously installed kernel-debuginfo to perform postmortem analysis:

    # crash /usr/lib/debug/lib/modules/2.6.17-1.2570.fc6/vmlinux /var/crash/2006-08-23-15:34/vmcore

    crash> bt

and so on...
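
The crash invocation follows a fixed pattern, so it can be assembled from a vmcore path and a kernel version. The sketch below uses a hypothetical helper named crash_cmd and assumes the debuginfo vmlinux lives under /usr/lib/debug/lib/modules/<version>/, as in the example above. The version must be that of the kernel that crashed; defaulting to `uname -r` is only correct if you rebooted into the same kernel.

```shell
#!/bin/sh
# crash_cmd is a hypothetical helper for illustration only.
# Prints (does not execute) the crash command for a vmcore.
#   $1: path to the vmcore
#   $2: version of the kernel that crashed (defaults to the running kernel)
crash_cmd() {
    vmcore="$1"
    kver="${2:-$(uname -r)}"
    vmlinux="/usr/lib/debug/lib/modules/$kver/vmlinux"
    printf 'crash %s %s\n' "$vmlinux" "$vmcore"
}

# Example: reproduces the command line shown above.
crash_cmd /var/crash/2006-08-23-15:34/vmcore 2.6.17-1.2570.fc6
```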


Caveats:

Console frame-buffers and X are not properly supported. If you typically run with something along the lines of "vga=791" in your kernel config line or have X running, console video will be garbled when a kernel is booted via kexec. Note that the kdump kernel should still be able to create a dump, and when the system reboots, video should be restored to normal.