Some old mails that should have been sent to the list

-------- Original Message --------
Subject: Re: /bpfs
Date: Wed, 1 Feb 2012 11:04:57 -0800
From: John Hawkes <jhawkes@penguincomputing.com>
To: Adam Young <ayoung@redhat.com>


I figured it out.  Yes, an explicit mount is required.  It doesn't
look like this explicit mount occurs in RHEL 4&5, but ... I press
onward, and I'll figure that out later.  So now I've got the /bpfs to
seemingly work correctly on the RHEL 6 slave.

Now I'll work on why my simple bproc_move() program is segfaulting.
It does execute in usermode -- it can perform a syscall(__NR_bproc) to
read the bproc version number, but a syslog() causes a segfault.  If I
eliminate the syslog(), then a call to bproc_currnode() works.  And if
I add a printf() at the end, it prints the return value from
bproc_currnode().  Alas, running this a 2nd time causes a segfault.
I'm trying to puzzle out why it works sometimes and fails other times.

John

On Wed, Feb 1, 2012 at 10:42 AM, Adam Young <ayoung@redhat.com> wrote:
> I really don't remember much about bpfs at all, except for the feeling that
> it was a mistake, and we should have been using sysfs instead.  I was half
> convinced that we should roll it back to the way it was in the bproc code
> prior to moving to the 2.6 Linux kernel, but thought it might be easier to
> move forward than to move back.
>
> As I recall, an explicit mount was required, but that has really faded in my
> memory.
>
>
>
> On 01/31/2012 09:12 PM, John Hawkes wrote:
>>
>> I changed /bpfs to behave a bit more like /proc, in that after the
>> register_filesystem(), the bpfs code also does a kern_mount_data(),
>> which does the mount, which gets into bpfs_get_sb() and then (because
>> the MS_KERNMOUNT flag is set) does the bpfs_fill_super().  On the
>> master node, /etc/init.d/beowulf does an explicit mount, which I think
>> gets into bpfs_get_sb() a 2nd time, but now the MS_KERNMOUNT flag is
>> off.
>>
>> On the slave, the same MS_KERNMOUNT flag is seen by bpfs_get_sb(),
>> and the same bpfs_fill_super(), and no 2nd mount is seen (at least not
>> until the /etc/beowulf/fstab may do it, which would be unnecessary).
>>
>> But the readdir() still looks bad on the slave.
>>
>> On the master:
>>
>>  readdir('/bpfs') ino:1 name:'.'
>>  readdir('/bpfs') ino:1 name:'..'
>>  readdir('/bpfs') ino:8 name:'self'
>>  readdir('/bpfs') ino:13 name:'-1'
>>  readdir('/bpfs') ino:11 name:'status'
>>
>> which is what I expect.
>> But on the slave (after I've registered various nodes):
>>
>>  readdir('/bpfs') ino:4018 name:.
>>  readdir('/bpfs') ino:3982 name:..
>>  stat('/bpfs/0') fails:2(No such file or directory)
>>  stat('/bpfs/-1') fails:2(No such file or directory)
>>  stat('/bpfs/self') fails:2(No such file or directory)
>>
>> So I've still got a big problem to solve.
>>
>> john
>>
>>
>> On Tue, Jan 31, 2012 at 4:15 PM, John Hawkes
>> <jhawkes@penguincomputing.com>  wrote:
>>>
>>> So I think what the slave problem is ... is that nothing does an
>>> actual mount of /bpfs.  The /bpfs filesystem gets registered, but
>>> nothing triggers a call to get the superblock or (because of that) to
>>> fill the superblock.
>>>
>>> I see that the /proc filesystem calls kern_mount_data().  Perhaps the
>>> slave side needs to do that.  On the master side, the
>>> /etc/init.d/beowulf script issues a mount command.
>>>
>>> john
>>>
>>> I rewrote much of the /bpfs code for RHEL 6.2, by the way, because
>>> there were lots of changes in the vfs layer.
>>>
>>> On Tue, Jan 31, 2012 at 3:53 PM, John Hawkes
>>> <jhawkes@penguincomputing.com>  wrote:
>>>>
>>>> Apparently I have problems with my port of the /bpfs code, too.
>>>>
>>>> Things seem to be working pretty well for the master, but on the slave
>>>> the behavior is strange.  I discovered this while trying to understand
>>>> why /bpfs/self wasn't being seen on the slave.  I've got the
>>>> kmod-bproc bpfs code instrumented to be verbose, and I do see the
>>>> various /bpfs/ namespace entries being created, but they aren't being
>>>> seen by user-level code.
>>>>
>>>> So I added a few lines to bpmaster and to bpslave, at the appropriate
>>>> spots after the /bpfs was set up, to do:
>>>>        opendir("/rootfs")
>>>> then  readdir()
>>>> and these both succeed on master and slave (although I haven't looked
>>>> at what gets returned by readdir).  However, on the master, I see a
>>>> call to bpfs_root_readdir(), as expected, but I do *not* see this on
>>>> the slave, even though both master and slave bpfs code use the same
>>>> struct and the same inode and file operations.
>>>>
>>>> As you might expect, after the readdir() I do:
>>>>         stat("/bpfs/-1")
>>>> etc. on the master, and
>>>>          stat("/bpfs/-1")
>>>> on the slave... the master sees a successful stat() of all the names
>>>> that I expect to be there, but on the slave the stat() calls fail.
>>>> And nothing in the bpfs code seems to be called.  So I conclude that
>>>> the /bpfs on the master gets set up as I expect, but on the slave it
>>>> doesn't get set up correctly... even though I'm doing the same
>>>> operations on both for the /bpfs root and the superblock.
>>>>
>>>> Weird.
>>>>
>>>> I hope this isn't part of something subtle with the /rootfs being used
>>>> as the temporary root during the initial phases of the slave boot.
>>>>
>>>> John
>
>