How to Find Duplicate Files



Let’s say you have a folder with 5000 MP3 files you want to check for duplicates, or a directory containing thousands of EPUB files, all with different names, some of which you suspect are copies of each other. You can cd into that folder in the console and run:

find -not -empty -type f -printf '%s\n' | sort -rn | uniq -d | xargs -I{} find -type f -size {}c -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate

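This outputs a list of duplicate files, grouped by their MD5 hash. The pipeline first collects the file sizes that occur more than once, then hashes only the files of those sizes and prints the groups whose hashes match, separated by blank lines. It relies on GNU find and uniq options (-printf, -w, --all-repeated), so it is meant for a typical Linux system.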
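
If the one-liner is hard to read, the same idea can be wrapped in a small shell function. This is only a sketch under a few assumptions: the name find_dupes is made up for illustration, GNU find, xargs, md5sum and uniq are assumed to be installed, and the directory is passed as an argument instead of relying on the current folder.

```shell
# find_dupes DIR (hypothetical helper name): print groups of files under DIR
# that share the same MD5 hash. Assumes GNU find, xargs, md5sum and uniq.
#
# Stage 1: print the size of every non-empty regular file and keep only the
#          sizes that occur more than once (uniq -d).
# Stage 2: for each repeated size, list the files of exactly that size,
#          NUL-separated so unusual file names survive.
# Stage 3: hash those candidates, sort by hash, and keep the lines whose
#          first 32 characters (the MD5 hex digest) repeat.
find_dupes() {
    local dir="${1:-.}"
    find "$dir" -not -empty -type f -printf '%s\n' |
        sort -rn |
        uniq -d |
        xargs -I{} find "$dir" -type f -size {}c -print0 |
        xargs -0 -r md5sum |
        sort |
        uniq -w32 --all-repeated=separate
}
```

Calling find_dupes ~/Music (or any other folder) should then print one block of "hash  path" lines per set of identical files. Checking sizes first keeps the expensive hashing step limited to files that could possibly be duplicates.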
Another way is to install fdupes and run:

fdupes -r ./folder > duplicates_list.txt

The -r option makes fdupes recurse into subdirectories. Afterwards, open duplicates_list.txt in a text editor to review the list of duplicate files.
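
For a long list, a short helper can show which files are the extra copies. The sketch below assumes the list uses fdupes' default layout, with one path per line and a blank line between sets of duplicates. The function name extra_copies is hypothetical. It only prints paths and deletes nothing.

```shell
# extra_copies FILE (hypothetical helper name): for each blank-line-separated
# group in FILE, skip the first path and print the remaining ones.
# Assumes one path per line, as in the duplicates_list.txt written above.
extra_copies() {
    awk '
        BEGIN { first = 1 }            # the next non-empty line starts a group
        /^$/  { first = 1; next }      # a blank line ends the current group
        first { first = 0; next }      # keep the first file of each group
              { print }                # every other file is an extra copy
    ' "$1"
}
```

Running extra_copies duplicates_list.txt prints every file except the first one in each set, which is a reasonable starting point for deciding what to remove by hand.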
