SV: Re: Most likely OT: rsync to cifs mount

birger birger at birger.sh
Wed Apr 1 19:10:54 UTC 2015


Deduplication kind of handles sparse files, since all blocks containing only zeros get mapped to the same storage.

As soon as one of the blocks sharing storage is written to, the new data goes to a new block and the usage counter of the shared block is reduced by one. Once the usage count reaches zero, the block is flagged for reuse. At least that is how it seems to work in the NetApp WAFL file system. WAFL never rewrites a block in place; it always writes to a new location. I don't know about OneFS.
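
To make the reference counting concrete, here is a rough Python sketch of the idea. It is a toy model only, not how WAFL or OneFS is actually implemented: the class name DedupStore, the 4 KiB block size and the SHA-256 lookup are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


class DedupStore:
    """Toy block store: identical blocks share storage via a usage counter."""

    def __init__(self):
        self.blocks = {}          # digest -> block contents ("physical" storage)
        self.refcount = {}        # digest -> number of logical blocks mapped to it
        self.reusable_blocks = 0  # physical blocks flagged for reuse so far

    def _acquire(self, data):
        # Map the data to an existing physical block if one matches,
        # otherwise store it in a new one.
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:
            self.blocks[digest] = data
            self.refcount[digest] = 0
        self.refcount[digest] += 1
        return digest

    def _release(self, digest):
        # Drop one user of the block; at zero users it is flagged for reuse.
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.blocks[digest]
            del self.refcount[digest]
            self.reusable_blocks += 1

    def new_file(self, nblocks):
        # A file of all-zero blocks: every logical block maps to one physical block.
        zero = bytes(BLOCK_SIZE)
        return [self._acquire(zero) for _ in range(nblocks)]

    def write_block(self, file_map, index, data):
        # Never overwrite in place: point the logical block at new storage,
        # then reduce the usage counter of the block it used to share.
        old = file_map[index]
        file_map[index] = self._acquire(data)
        self._release(old)


store = DedupStore()
sparse_like = store.new_file(1000)         # 1000 logical blocks of zeros
physical_before = len(store.blocks)        # one shared physical block
store.write_block(sparse_like, 3, b"x" * BLOCK_SIZE)
physical_after = len(store.blocks)         # the zero block plus one new block
zero_users = store.refcount[sparse_like[0]]
```

In this model a file full of zeros costs one physical block no matter how long it is, which is why dedupe ends up behaving a bit like sparse-file support even when the filesystem has no real notion of holes.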

Sent from my Sony Xperia™ smartphone

---- Rick Stevens wrote ----

>On 04/01/2015 10:57 AM, Ranjan Maitra wrote:
>> Thanks!
>>
>>
>>> That's EMC's "OneFS" filesystem (EMC bought out Isilon).
>>>
>>>> On Wed, 1 Apr 2015 08:07:34 -0500 Ranjan Maitra <maitra.mbox.ignored at inbox.com> wrote:
>>>>
>>>>> Thanks to both Cameron and you, Bob!
>>>>>
>>>>> After the transfer, here is what we have, on that filesystem:
>>>>>
>>>>> $ du -sh kmeans --apparent-size
>>>>> 154G	kmeans
>>>>>
>>>>> $ du -sh kmeans
>>>>> 628G	kmeans
>>>>>
>>>>> So, I guess that leaves me (and others) stuck.
>>>
>>> Is "kmeans" on the target or the source filesystem?
>>
>> Sorry, this is on the target (Isilon FS). Locally (on a F21 workstation and ext4 FS) it clocks in at 154G and 159G respectively.
>>
>>> If it's the source,
>>> keep in mind that OneFS can do data dedupes (assuming it's enabled),
>>> but it is a NAS device (NFS and/or SMB). I don't believe it's capable
>>> of sparse files (few NAS are). The data dedupe would reduce the actual
>>> storage on disk on the EMC device, but not report it as a sparse
>>> filesystem.
>>
>>
>> Yes, I have been given this explanation, as well as that the block size is turned up on the Isilon. This means that the minimum size of a single file on disk is probably 16K, rather than the typical 4K on a desktop filesystem. However, I do not have files that are that small where it would make a difference. So, I don't know.
>>
>> I see: the dedupe is supposed to run over weekends but I am not sure what it does.
>
>Deduping is a process by which redundant data on a storage device is
>removed. You can loosely think of it as "gzip" at the block level on
>the storage device itself (although gzip is _compression_, not
>deduping). Everything on the device will _appear_ normal, but the
>redundancies will have been removed and less physical space used.
>
>Here's a good explanation:
>
>	http://www.webopedia.com/TERM/D/data_deduplication.html
>
>----------------------------------------------------------------------
>- Rick Stevens, Systems Engineer, AllDigital    ricks at alldigital.com -
>- AIM/Skype: therps2        ICQ: 22643734            Yahoo: origrps2 -
>-                                                                    -
>-       A squeegee, by any other name, wouldn't sound as funny.      -
>----------------------------------------------------------------------