From the course: Python: Working with Files

Unlock this course with a free trial

Join today to access over 25,300 courses taught by industry experts.

Solution: Find duplicate files

Solution: Find duplicate files - Python Tutorial

From the course: Python: Working with Files

Solution: Find duplicate files

- [Narrator] Let's implement a function that determines if two files are duplicates. In the sample code, we get a few helper functions. We'll use these to create files and test if they're duplicates. We also get a calculate sha256 function. This will compute a hash and we can use that hash to determine if two files are duplicates. If two files are the same, then their hashes are the same. Let's implement this function. We'll start by calculating the hash for each file. Now, what does this calculate function actually do? Let's take a look. First, we create a sha256 hash object with the hash lib module. This module provides a convenient interface to various secure hashes. Then we open up the file in binary read mode. This is required for computing the hash. Then we iterate over the file bite blocks. This reads the file and blocks of four kilobytes repeatedly until it returns an empty bite string. That's the B with the double quotes, and it indicates the end of the file. For each bite…

Contents