On Mon, 19 Sep 2016 17:23:39 -0600 Chris Murphy lists@colorremedies.com wrote:
Drives A and B have many overlapping files, but I want to find out which files exist on only one of them. What thwarts this is that the directory structure differs between the two drives, and I'm fairly certain some of the file names differ as well.
Therefore I need something hash based. I started with this:
$ find /brickA -type f -exec md5sum "{}" + > brickA.txt
$ find /brickB -type f -exec md5sum "{}" + > brickB.txt
What I need next is to:
1. Make a copy of each file: brickAcopy.txt and brickBcopy.txt.
2. Loop: extract each md5sum in brickA.txt, grep for it in brickAcopy.txt and brickBcopy.txt, and if it's found in both, delete the line in both files.
What remains in each file are paths to files that don't exist on the other drive. This must be a solved problem, so I'm open to alternative approaches.
Ideas?
Here are some Linux utilities a quick search turned up:
http://www.howtogeek.com/201140/how-to-find-and-remove-duplicate-files-on-li...
http://askubuntu.com/questions/3865/how-to-find-and-delete-duplicate-files
At least fslint and fdupes are in the Fedora repositories; others may be too.
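
If you would rather stay with the two md5sum lists you already have, the copy-and-delete loop can probably be replaced by one awk pass per direction. What follows is only a rough sketch, assuming brickA.txt and brickB.txt are in md5sum's default "hash, two spaces, path" format. The function name only_in is made up for illustration.

```shell
#!/bin/sh
# only_in LIST_A LIST_B   (hypothetical helper name)
# Print every line of LIST_A whose hash (first field) does not occur
# anywhere in LIST_B. Both lists are plain md5sum output.
only_in() {
    awk '
        # First argument to awk is LIST_B: remember each hash, print nothing.
        FILENAME == ARGV[1] { seen[$1] = 1; next }
        # Second argument is LIST_A: print lines whose hash was never seen.
        !($1 in seen)
    ' "$2" "$1"
}

# Intended use with the lists from the original post:
#   only_in brickA.txt brickB.txt > onlyA.txt   # files found only on A
#   only_in brickB.txt brickA.txt > onlyB.txt   # files found only on B
```

The whole line is printed, so paths containing spaces survive intact. md5sum escapes file names that contain a newline or backslash and marks those lines with a leading backslash, so such files would need extra handling. Duplicate files within one drive show up once per path, which may or may not be what you want.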