64-bit stat (or not) in 32-bit Fedora binaries

Eric Sandeen sandeen at redhat.com
Tue Feb 19 15:22:55 UTC 2013


On 2/19/13 4:46 AM, Richard W.M. Jones wrote:
> On Mon, Feb 18, 2013 at 03:33:33PM -0600, Eric Sandeen wrote:
>> XFS recently defaulted to allowing > 32 bit inode numbers, and btrfs
>> can let inode numbers creep past 2^32 as well.
>>
>> While most applications don't care one bit about st_ino returned
>> from a stat() call, the sad fact is that you'll get EOVERFLOW from
>> stat32 if the inode number is too big to fit in 32 bits, even if you
>> just wanted to get the file size.
>>
>> I have a script (http://sandeen.net/misc/summarise_stat.pl) which
>> Greg Banks wrote; it can check a path or list of filenames for
>> binaries which contain non-64bit-safe stat calls.  A quick look over
>> my F18 install finds the situation to be only slightly in favor of
>> executables using 64-bit variants:
>>
>> # ./summarize-stat.pl /usr
>>   270229 91.5% are scripts (shell, perl, whatever)
>>    22633  7.7% don't use any stat() family calls at all
>>      913  0.3% use 32-bit stat() family interfaces only
>>     1335  0.5% use 64-bit stat64() family interfaces only
>>       73  0.0% use both 32-bit and 64-bit stat() family interfaces
>>
>> and it's not just weird obscure packages:
>>
>> # ./summarize-stat.pl `rpm -ql sendmail`
>>      69 78.4% are scripts (shell, perl, whatever)
>>       2  2.3% don't use any stat() family calls at all
>>      17 19.3% use 32-bit stat() family interfaces only
>>
>> Anyway, if you want to check your package(s) and maybe make them
>> 64-bit-stat safe, the perl script above might help.  It's more than
>> just -D_FILE_OFFSET_BITS=64, since you'll need to be sure not to
>> overflow any large values you get back from stat64 etc.
>>
>> Might be nice to get out ahead of this before, say, btrfs comes into
>> wide use.  I don't know if there could be any more of a formal
>> effort in this direction?
> 
> Eric, I've read this email and the summarise_stat script a couple of
> times, and I admit I'm confused.

I think Petr answered, but -

> (1) Just ensuring the code is compiled with -D_FILE_OFFSET_BITS=64 is
> sufficient to ensure the 32 bit stat will never be called, right?

Yep, I think so.

> (2) If my code never mentions st_ino, it's safe?
> 
>   [I assume the answer to this is *no* because you seem to be saying
>   that -EOVERFLOW could be returned from an innocent-looking stat
>   call, even if the code never looks at st_ino.]

Right, it's stat() itself that fails with EOVERFLOW; it doesn't care
whether you look at st_ino or not.
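
Roughly, the failure mode looks like this.  This is only a sketch:
file_size() is a made-up helper name, not anything from a library.
The caller only wants the size, but on a 32-bit build without
_FILE_OFFSET_BITS=64 the stat() call can still fail with EOVERFLOW
when the inode number doesn't fit in 32 bits:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

/* file_size() is a hypothetical helper for illustration only.
 * It only wants st_size, yet it inherits every failure mode of stat(). */
static int file_size(const char *path, long long *size_out)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        int saved = errno;      /* fprintf() may change errno */

        if (saved == EOVERFLOW)
            fprintf(stderr, "%s: a stat field does not fit in struct stat\n",
                    path);
        errno = saved;          /* hand the original error to the caller */
        return -1;
    }

    *size_out = (long long)st.st_size;
    return 0;
}
```

Nothing in there touches st_ino, which is the point: the overflow is
detected while struct stat is being filled in, before the caller sees
any field.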

> (3) For my code that uses st_ino, I need to ensure this is never
> assigned to a 32 bit integer (eg. 'int', 'int32_t', 'long' on 32 bit, etc.)?

To be safe I'd store it in a u64 type (uint64_t in userspace), I guess.
The *internal* kernel stat structure uses u64:

struct kstat {
        u64             ino;
        ...
};
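
In userspace that could look something like the following.  It is a
sketch only, and inode_of() and format_inode() are made-up helper
names.  The idea is to copy st_ino into a uint64_t and print it with
PRIu64 rather than casting it to int or long:

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: fetch the inode number of path into a 64-bit
 * variable.  Returns 0 on success, -1 (with errno set) on failure. */
static int inode_of(const char *path, uint64_t *ino_out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;

    *ino_out = (uint64_t)st.st_ino;   /* widening, never truncating */
    return 0;
}

/* Hypothetical helper: format an inode number without going through
 * 'long', which is only 32 bits wide on 32-bit architectures. */
static int format_inode(char *buf, size_t len, uint64_t ino)
{
    return snprintf(buf, len, "%" PRIu64, ino);
}
```

On a 32-bit build this only helps if st_ino itself is 64 bits wide,
i.e. the code is also compiled with -D_FILE_OFFSET_BITS=64.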

> (4) Is doing (1) & (3) sufficient to fix all stat32-related problems
> in my code?

Yep, I think so.

> (5) With -D_FILE_OFFSET_BITS=64, is st_ino a 64 bit value?

Yes (well, a 64-bit container).  I wish I knew where the
canonical documentation for these interfaces was.

http://opengroup.org/platform/lfs.html is certainly related, as
is http://www.gnu.org/software/libc/manual/html_node/Reading-Attributes.html
although the latter only talks about file sizes. :(

feature_test_macros(7) says:

       _FILE_OFFSET_BITS
              Defining this macro with the value 64 automatically converts
              references to 32-bit functions and data types related to file
              I/O and file system operations into references to their 64-bit
              counterparts.  This is useful for performing I/O on large files
              (> 2 Gigabytes) on 32-bit systems.  (Defining this macro permits
              correctly written programs to use large files with only a
              recompilation being required.)  64-bit systems naturally permit
              file sizes greater than 2 Gigabytes, and on those systems this
              macro has no effect.
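
One way to see what the macro does to the types is to check the field
widths directly.  A small sketch (ino_bits() and off_bits() are
made-up helper names) that can be built on a 32-bit system once with
and once without -D_FILE_OFFSET_BITS=64 on the compiler command line:

```c
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helpers: width in bits of the st_ino and st_size
 * fields as seen by this compilation unit. */
static unsigned ino_bits(void)
{
    return (unsigned)(sizeof(((struct stat *)0)->st_ino) * 8);
}

static unsigned off_bits(void)
{
    return (unsigned)(sizeof(((struct stat *)0)->st_size) * 8);
}
```

Calling these from a test program and printing the results should
show both widths change between the two 32-bit builds, while on a
64-bit system they are 64 either way.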


> Also a note that the current man page for stat(2) doesn't mention this
> problem, doesn't mention that EOVERFLOW could be returned in this
> surprising situation, and also has an example that casts st_ino to a
> long which I assume would be unsafe behaviour on a 32 bit
> architecture.  These are all bugs that the man-pages maintainers would
> no doubt be interested in.

*nod* since good docs seem sparse, the man page could be improved.  I'll
file a bug against that, thanks.

-Eric

> Rich.
> 


