On Tue, Sep 20, 2016 at 11:55 AM, Ahmad Samir <ahmadsamir3891(a)gmail.com> wrote:
On 20 September 2016 at 13:00, Ahmad Samir
<ahmadsamir3891(a)gmail.com> wrote:
> On 20 September 2016 at 12:34, Ahmad Samir <ahmadsamir3891(a)gmail.com> wrote:
>> On 20 September 2016 at 10:33, Ahmad Samir <ahmadsamir3891(a)gmail.com> wrote:
>>>
>>> Here's a crude way:
>>> $ find /brickA -type f -exec md5sum "{}" + | sort > brickA.txt
>>> $ find /brickB -type f -exec md5sum "{}" + | sort > brickB.txt
>>> $ diff -U 0 brickA.txt brickB.txt | sort -k 1.1,1.1 > A-B.diff
>>>
>>> Ignoring lines beginning with @@, +++ or --- , the lines beginning
>>> with - are in A but not B ... etc
>>>
>>
>> Please disregard that, it won't work...
>>
>
> More experimenting:
> $ find A -exec md5sum '{}' + > a-md5
> $ find B -exec md5sum '{}' + > b-md5
> $ cat a-md5 b-md5 > All
> $ sort -u -k 1,1 All
>
> that should output a list of files that are in one dir but not the other.
>
Doesn't work either, sorry for the noise.
I appreciate the effort. Maybe I'm overestimating how common this
situation must be, or underestimating the difficulty.
Anyway it's not super urgent. Btrfs gets in-band deduplication pretty
soon so the older volume can just have both path structures with
deduped data. The volume is too small to do out-of-band dedup which
requires copying all the data over first, and then deduping it.
--
Chris Murphy
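
For reference, a minimal sketch of what the attempts quoted above seem to be aiming for: list the files in one tree whose content does not occur anywhere in the other tree, regardless of path. Both attempts compare on the wrong thing. Sorting and diffing whole "hash  path" lines flags every line when the paths differ, and `sort -u -k 1,1` prints one line per distinct hash rather than only the hashes that occur in a single tree. The function name `only_in_first` and the temporary list files are hypothetical, not something from the thread. The sketch assumes GNU find, md5sum and a POSIX awk, and it does not handle file names containing newlines or backslashes, which md5sum escapes in its output.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print the "hash  path" lines of
# files under tree $1 whose content (by MD5) occurs nowhere under tree $2.
only_in_first() {
    first_list=$(mktemp) || return 1
    second_list=$(mktemp) || { rm -f "$first_list"; return 1; }

    # One "hash  path" line per regular file in each tree.
    find "$1" -type f -exec md5sum {} + > "$first_list"
    find "$2" -type f -exec md5sum {} + > "$second_list"

    # While reading the second tree's list, remember every hash; then print
    # only those lines of the first tree's list whose hash was never seen.
    # FILENAME is compared (rather than NR==FNR) so an empty list is handled.
    awk 'FILENAME == ARGV[1] { seen[$1]; next } !($1 in seen)' \
        "$second_list" "$first_list"

    rm -f "$first_list" "$second_list"
}

# Usage (paths are placeholders):
#   only_in_first /brickA /brickB    # content present in A but missing from B
#   only_in_first /brickB /brickA    # content present in B but missing from A
```

Matching on the hash column alone is what makes this independent of the directory layout; the path is only carried along for display. MD5 is kept because the thread uses it, and any other checksum tool with the same output layout could presumably be swapped in.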