Action plan for koji01 reboots

Mike McGrath mmcgrath at redhat.com
Mon Apr 5 14:12:51 UTC 2010


On Sun, 4 Apr 2010, Jon Stanley wrote:

> So I'd like to put together an action plan to deal with the koji01
> reboot issues. Right now, we're not capturing crash dumps on this
> machine (or any other, but I'm not sure there's value in doing so
> unless we have an active, systemic problem like we're facing here) -
> not saying that there'd be any *to* capture, but there probably are.
> I'd like to set up kdump on this machine after the beta freeze is over,
> but I'd like buy-in from other people before doing it. Here's what I'd
> propose:
>
> 1) Present another LV from bxen02 to koji01 and mount it at /var/crash
> (the rootfs on koji01 is only 10GB, and we'd need more for a crash
> dump or two - I'd say 20GB would be sufficient to hold two crashes,
> since it's an 8GB domain). It looks like VolGroup01 where koji01 lives
> has about 80G free.
> 2) Install kexec-tools and configure appropriately (includes adding
> crashkernel=128M@16M to grub.conf)
> 3) Reboot machine, and wait for it to crash again.
> 4) Analyze the (hopefully) resulting crash dumps :)
> 5) Profit!
>
> Any objections?
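
The sizing in step 1 can be sanity-checked with a quick sketch in Python.
It assumes the worst case, where one unfiltered dump is about as large as
the guest's RAM; a filtered or compressed dump would be smaller. The
function name and parameters are made up for illustration, and the numbers
are the ones quoted above.

```python
def crash_volume_fits(ram_gb, dumps, volume_gb, vg_free_gb):
    """Return True if the volume holds `dumps` worst-case dumps and the VG can supply it."""
    # Assumption: one unfiltered vmcore is roughly the size of guest RAM.
    needed_gb = ram_gb * dumps
    return needed_gb <= volume_gb <= vg_free_gb


# Figures from the proposal: 8GB domain, two dumps, 20GB LV,
# about 80G free in VolGroup01.
print(crash_volume_fits(ram_gb=8, dumps=2, volume_gb=20, vg_free_gb=80))
```

With those figures two dumps need about 16GB, so a 20GB LV leaves a little
headroom and fits comfortably in the free space on the volume group.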

I think this is fine. Are you planning on doing it before or after the freeze?

	-Mike

