Block size

JD jd1008 at gmail.com
Sun Oct 24 19:09:29 UTC 2010


  On 10/24/2010 03:46 AM, Patrick Dupre wrote:
> On Sat, 23 Oct 2010, JD wrote:
>
>>  On 10/23/2010 02:49 PM, Patrick Dupre wrote:
>>> On Sat, 23 Oct 2010, Patrick Dupre wrote:
>>>
>>>> Hello,
>>>>
>>>> On a logical partition I am unable to set the block size to 1024 !
>>>> I formatted the partition mke2fs /dev/VG1/part1 -b 1024 -t ext4
>>>> and when I do a
>>>> blockdev --getbsz or mke2fs -n
>>>> I get 4096 !
>>>> I also tried 2048 with the same result.
>>> But dumpe2fs gives the right answer !
>>>
>>>>
>>>> Is it related to the logical partition ?
>>>>
>>>> Thank.
>>>>
>>>>
>>>
>> You have to be aware that selecting such a small
>> block size means that the FS will be allocating
>> at least 256 bytes of inode space per 1 KB data block.
>> So, the amount of space for on-disk metadata is about
>> 25% of the space for data blocks. This is OK if you have
>> a lot of small files, about 1 to 4 KB in size. But if most of
>> your files are much larger than 1 KB, then you have
>> wasted a lot of disk space on inodes that may never
>> be used.
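The 25% figure quoted above follows from simple arithmetic; a quick sketch using the numbers stated in the message (256-byte inodes, one inode per 1 KB data block; your mke2fs defaults may differ):

```python
# Figures from the message above, not necessarily your mke2fs defaults:
inode_size = 256        # bytes of inode space per inode
data_per_inode = 1024   # one inode per 1 KB data block, as stated above

overhead = inode_size / data_per_inode
print(f"inode metadata is {overhead:.0%} of the data space")  # 25%
```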
> Is there a tool to optimize the block size?
> How do you determine the average file size?
>
> Thank.
>
A middle-of-the-road block size, combined with a
proper choice of fragment size, would probably
serve you well if your file sizes are spread fairly
evenly between your smallest and largest files.
So, you might run mkfs with a block size
of, say, 4k or 8k, and a fragment size of 1k. Fragments
add a tiny computational overhead when creating
a new file or expanding an existing one, but they let the
filesystem serve both large files and small files well.
The caveat is that the last block of a file has a
50% chance of containing unused fragments.
That is not bad at all, since we are talking only about
the very last block occupied by a file,
and those fragments eventually end up being used anyhow
during file expansions and new file creations.
This scheme has a slight performance overhead, but it makes
maximal use of the disk space. So, unless you have super-high
performance requirements, such as realtime acquisition
and storage of large data from multiple sources, then
I believe this would be a good choice.
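To put numbers on the "last block" argument: if a file's final byte lands uniformly at random within its last allocation unit, the expected slack is half that unit, so 1 KB fragments cut the average per-file tail waste of a 4 KB-block filesystem by a factor of four. A quick sketch of that arithmetic:

```python
def expected_tail_waste(alloc_unit):
    # Expected unused bytes in a file's last allocation unit, assuming
    # the file's final byte is uniformly distributed within that unit.
    return alloc_unit / 2

print(expected_tail_waste(4096))  # 4 KB blocks, no fragments: 2048.0
print(expected_tail_waste(1024))  # 1 KB fragments: 512.0
```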

