A collection of command-line tools for efficient file management and analysis, built with Go.
- dupfind: Find duplicate files in a directory tree by comparing file hashes. Efficiently identifies identical files regardless of filename or location.
- dirstat: Analyze a directory and its subdirectories for file statistics, including sizes, types, and utilization percentages.
- rename: Rename files in a directory using pattern matching and sed-like replacements.
- Multiple Output Formats: Support for text, JSON, XML, and HTML output formats
- File Output: Redirect output to files instead of stdout
- Flexible Hashing: Choose from MD5, SHA1, or SHA256 hash algorithms
- File/Directory Exclusions: Exclude files and directories from processing with pattern matching and file type filtering
- Structured Data: JSON/XML output provides machine-readable duplicate file information with metadata
- Rich HTML Reports: Generate professional HTML reports with styling, statistics, and interactive features
- Comprehensive Metadata: All structured outputs include execution context and branding information
- File organizer
- File size analyzer
- Go 1.24.5 or later
git clone https://github.com/amurru/filetools.git
cd filetools
make build
make test # Run tests
The binary will be created as bin/filetools.
go install github.com/amurru/filetools@latest
make test # Run all tests
make clean # Clean build artifacts
make run # Build and run the application
Find duplicate files in a directory tree with flexible output options.
Find duplicate files in a directory:
filetools dupfind /path/to/directory
If no directory is specified, it uses the current directory:
filetools dupfind
Choose from multiple output formats:
# Text output (default)
filetools dupfind /path/to/directory
# JSON output
filetools dupfind -o json /path/to/directory
filetools dupfind -j /path/to/directory
# XML output
filetools dupfind -o xml /path/to/directory
filetools dupfind -x /path/to/directory
# HTML output (generates a styled web page)
filetools dupfind -o html /path/to/directory
filetools dupfind -w /path/to/directory
Redirect output to a file instead of stdout:
# Save results to a file
filetools dupfind -f results.txt /path/to/directory
filetools dupfind -o json -f duplicates.json /path/to/directory
filetools dupfind -w -f report.html /path/to/directory
Choose the hash algorithm for file comparison:
# Use different hash algorithms (default: md5)
filetools dupfind -H sha256 /path/to/directory
filetools dupfind -H sha1 /path/to/directory
filetools dupfind -H md5 /path/to/directory
Combine multiple options:
# Generate JSON report with SHA256 hashes, save to file
filetools dupfind -H sha256 -o json -f report.json /path/to/directory
# Create HTML report with MD5 hashes
filetools dupfind -H md5 -w -f analysis.html /path/to/directory
Text Output (default):
Generated by filetools dupfind v1.0.0 on 2025-10-27T14:30:45Z (hash: md5, output: text)
Duplicate files found:
- file1.txt (size: 1024 bytes, hash: a1b2c3d4...)
- /path/to/dir1/file1.txt
- /path/to/dir2/file1.txt
- file2.txt (size: 2048 bytes, hash: e5f6g7h8...)
- /path/to/dir3/file2.txt
- /path/to/dir4/file2.txt
JSON Output:
{
"metadata": {
"tool_name": "filetools",
"sub_command": "dupfind",
"flags": [
{
"name": "hash",
"value": "md5"
},
{
"name": "output",
"value": "json"
}
],
"version": "1.0.0",
"generated_at": "2025-10-27T14:30:45Z"
},
"groups": [
{
"hash": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6",
"hash_type": "md5",
"size": 1024,
"files": ["/path/to/dir1/file1.txt", "/path/to/dir2/file1.txt"]
}
],
"found": true
}
XML Output:
<?xml version="1.0" encoding="UTF-8"?>
<DuplicateResult>
<metadata>
<toolName>filetools</toolName>
<subCommand>dupfind</subCommand>
<flags>
<flag>
<name>hash</name>md5
</flag>
<flag>
<name>output</name>xml
</flag>
</flags>
<version>1.0.0</version>
<generatedAt>2025-10-27T14:30:45Z</generatedAt>
</metadata>
<groups>
<hash>a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6</hash>
<hashType>md5</hashType>
<size>1024</size>
<files>/path/to/dir1/file1.txt</files>
<files>/path/to/dir2/file1.txt</files>
</groups>
<found>true</found>
</DuplicateResult>
HTML Output: Generates a complete HTML page with:
- Professional styling and layout
- Summary statistics
- Color-coded file badges (original/duplicate)
- Responsive design
- Interactive features: clickable hashes (copy to clipboard), collapsible duplicate groups
- Program branding footer
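Because the JSON output is machine-readable, it can be consumed by other programs. The following is a minimal sketch of decoding a dupfind JSON report in Go. The field names are taken from the sample JSON output in this README; the Go type names and the reclaimableBytes helper are hypothetical and are not part of filetools.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// reportFlag, reportMetadata, duplicateGroup, and duplicateReport are
// illustrative types. Only the JSON field names come from the sample output.
type reportFlag struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

type reportMetadata struct {
	ToolName    string       `json:"tool_name"`
	SubCommand  string       `json:"sub_command"`
	Flags       []reportFlag `json:"flags"`
	Version     string       `json:"version"`
	GeneratedAt string       `json:"generated_at"`
}

type duplicateGroup struct {
	Hash     string   `json:"hash"`
	HashType string   `json:"hash_type"`
	Size     int64    `json:"size"`
	Files    []string `json:"files"`
}

type duplicateReport struct {
	Metadata reportMetadata   `json:"metadata"`
	Groups   []duplicateGroup `json:"groups"`
	Found    bool             `json:"found"`
}

// parseReport decodes a JSON report produced with -o json.
func parseReport(data []byte) (duplicateReport, error) {
	var r duplicateReport
	err := json.Unmarshal(data, &r)
	return r, err
}

// reclaimableBytes sums the space taken by every copy beyond the first
// in each duplicate group (hypothetical helper).
func reclaimableBytes(r duplicateReport) int64 {
	var total int64
	for _, g := range r.Groups {
		if len(g.Files) > 1 {
			total += g.Size * int64(len(g.Files)-1)
		}
	}
	return total
}

// sample is a shortened copy of the JSON example in this README.
const sample = `{
  "metadata": {"tool_name": "filetools", "sub_command": "dupfind", "version": "1.0.0"},
  "groups": [
    {"hash": "a1b2c3d4", "hash_type": "md5", "size": 1024,
     "files": ["/path/to/dir1/file1.txt", "/path/to/dir2/file1.txt"]}
  ],
  "found": true
}`

func main() {
	r, err := parseReport([]byte(sample))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d group(s), %d reclaimable bytes\n", len(r.Groups), reclaimableBytes(r))
}
```

In practice the input would come from a file written with -f, for example the duplicates.json file from the examples above.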
Exclude files and directories from processing while still reporting them in the output.
Exclude files matching patterns or file types:
# Exclude specific file patterns (globs)
filetools dupfind --exclude-file "*.log,*.tmp,cache/*" /path/to/directory
# Exclude by file type (matches extensions)
filetools dupfind --exclude-file "*.jpg,*.png,*.gif" /path/to/directory
# Combine with other options
filetools dirstat --exclude-file "*.log,*.tmp" -o json /path/to/directory
Exclude entire directories matching patterns:
# Exclude common directories
filetools dupfind --exclude-dir "node_modules,.git,build" /path/to/directory
# Exclude by pattern
filetools dirstat --exclude-dir "temp*,cache*" /path/to/directory
Use both file and directory exclusions together:
filetools dupfind --exclude-file "*.log,*.tmp" --exclude-dir "node_modules,.git" /path/to/directory
Excluded items are listed in the "Exclusions" section of all output formats:
Text Output:
Excluded files and directories:
- node_modules (dir_pattern)
- cache/file.log (file_pattern)
- temp/image.jpg (file_type)
JSON/XML Output:
Excluded items are included in the exclusions array with path and reason fields.
HTML Output: Exclusions are displayed in a sortable table with professional styling.
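To illustrate how comma-separated glob patterns like the ones above can be evaluated, here is a minimal sketch using Go's standard path.Match. It is an assumption about the general technique, not the matching logic filetools actually uses: the helper names splitPatterns and matchesAny are hypothetical, and the sketch assumes each pattern is tried against both the base name and the slash-separated relative path.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// splitPatterns turns "*.log, *.tmp" into ["*.log", "*.tmp"],
// trimming whitespace and dropping empty entries.
func splitPatterns(list string) []string {
	var out []string
	for _, p := range strings.Split(list, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// matchesAny reports whether relPath (slash-separated) matches any pattern,
// either by its base name or by its full relative path.
// Note that "*" in path.Match never matches a "/".
func matchesAny(patterns []string, relPath string) (bool, error) {
	base := path.Base(relPath)
	for _, p := range patterns {
		for _, candidate := range []string{base, relPath} {
			ok, err := path.Match(p, candidate)
			if err != nil {
				return false, err // malformed pattern
			}
			if ok {
				return true, nil
			}
		}
	}
	return false, nil
}

func main() {
	patterns := splitPatterns("*.log,*.tmp,cache/*")
	for _, f := range []string{"cache/file.log", "logs/app.log", "src/main.go"} {
		excluded, err := matchesAny(patterns, f)
		if err != nil {
			fmt.Println("bad pattern:", err)
			return
		}
		fmt.Printf("%s excluded=%v\n", f, excluded)
	}
}
```

Matching on slash-separated paths keeps the behavior the same across operating systems; a caller would typically normalize paths with filepath.ToSlash first.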
Analyze a directory and its subdirectories for comprehensive file statistics.
Analyze a directory for file statistics:
filetools dirstat /path/to/directory
If no directory is specified, it uses the current directory:
filetools dirstat
Choose from multiple output formats:
# Text output (default)
filetools dirstat /path/to/directory
# JSON output
filetools dirstat -o json /path/to/directory
filetools dirstat -j /path/to/directory
# XML output
filetools dirstat -o xml /path/to/directory
filetools dirstat -x /path/to/directory
# HTML output (generates a styled web page)
filetools dirstat -o html /path/to/directory
filetools dirstat -w /path/to/directory
Redirect output to a file instead of stdout:
# Save results to a file
filetools dirstat -f stats.txt /path/to/directory
filetools dirstat -o json -f stats.json /path/to/directory
filetools dirstat -w -f report.html /path/to/directory
Combine multiple options:
# Generate JSON statistics, save to file
filetools dirstat -o json -f stats.json /path/to/directory
# Create HTML report
filetools dirstat -w -f analysis.html /path/to/directory
# Analyze with exclusions
filetools dirstat --exclude-file "*.log,*.tmp" --exclude-dir "node_modules,.git" -w -f clean-report.html /path/to/directory
Text Output (default):
Generated by filetools dirstat v1.0.0 on 2025-10-27T14:30:45Z (output: text)
Directory Statistics
===================
Total Files: 150
Total Size: 25.3 MB
Largest File: large_video.mp4 (15.2 MB)
File Types
----------
Extension Count Size Percentage
------------ -------- --------- ----------
.mp4 5 18.5 MB 73.12%
.jpg 45 4.2 MB 16.60%
.txt 30 1.8 MB 7.11%
.pdf 12 780 KB 3.01%
(no ext) 58 45 KB 0.17%
Subdirectories
--------------
Path Files Size Percentage
----------------------- -------- --------- ----------
videos 5 18.5 MB 73.12%
images 45 4.2 MB 16.60%
documents 42 2.6 MB 10.28%
...
JSON Output:
{
"metadata": {
"tool_name": "filetools",
"sub_command": "dirstat",
"flags": [
{
"name": "output",
"value": "json"
}
],
"version": "1.0.0",
"generated_at": "2025-10-27T14:30:45Z"
},
"total_files": 150,
"total_size": 26528934,
"largest_file": {
"name": "large_video.mp4",
"size": 15920000,
"path": "videos/large_video.mp4"
},
"file_types": [
{
"extension": ".mp4",
"count": 5,
"total_size": 19398656,
"percentage": 73.12
}
],
"directories": [
{
"path": "videos",
"file_count": 5,
"total_size": 19398656,
"percentage": 73.12
}
]
}
HTML Output: Generates a complete HTML page with:
- Professional styling and layout
- Summary statistics dashboard
- Sortable tables for file types and directories
- Visual percentage bars
- Responsive design
- Program branding footer
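The per-extension figures in the dirstat output (count, total size, percentage of the overall size) can be derived with a simple aggregation. The sketch below shows one way to do that; it is an illustration under stated assumptions, not the dirstat implementation. The types fileEntry and typeStat and the function statsByExtension are hypothetical, extensions are lower-cased before grouping, and files without an extension are grouped under "(no ext)" as in the text output above.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// fileEntry is a hypothetical input record: a path and its size in bytes.
type fileEntry struct {
	Path string
	Size int64
}

// typeStat mirrors the fields of a "file_types" entry in the JSON sample.
type typeStat struct {
	Extension  string
	Count      int
	TotalSize  int64
	Percentage float64
}

// statsByExtension groups files by extension and computes each group's
// share of the total size, sorted by size in descending order.
func statsByExtension(files []fileEntry) []typeStat {
	var total int64
	byExt := make(map[string]*typeStat)
	for _, f := range files {
		ext := strings.ToLower(filepath.Ext(f.Path))
		if ext == "" {
			ext = "(no ext)"
		}
		s, ok := byExt[ext]
		if !ok {
			s = &typeStat{Extension: ext}
			byExt[ext] = s
		}
		s.Count++
		s.TotalSize += f.Size
		total += f.Size
	}

	out := make([]typeStat, 0, len(byExt))
	for _, s := range byExt {
		if total > 0 {
			s.Percentage = float64(s.TotalSize) / float64(total) * 100
		}
		out = append(out, *s)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].TotalSize != out[j].TotalSize {
			return out[i].TotalSize > out[j].TotalSize
		}
		return out[i].Extension < out[j].Extension
	})
	return out
}

func main() {
	files := []fileEntry{
		{"videos/clip.mp4", 750},
		{"images/a.jpg", 200},
		{"images/b.JPG", 50},
	}
	for _, s := range statsByExtension(files) {
		fmt.Printf("%-10s %3d %6d %6.2f%%\n", s.Extension, s.Count, s.TotalSize, s.Percentage)
	}
}
```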
Check the version and build information:
filetools version
Output:
version: 1.0.0
date: 2025-10-27T14:30:45Z
filetools/
├── cmd/ # CLI commands
│ ├── dirstat.go # Directory statistics command
│ ├── dupfind.go # Duplicate file finder command
│ ├── dupfind_test.go # Tests for dupfind
│ ├── root.go # Root command and global flags
│ └── version.go # Version command
├── internal/
│ └── output/ # Output formatting module
│ ├── formatter.go # Core interfaces and data structures
│ ├── json.go # JSON formatter
│ ├── xml.go # XML formatter
│ ├── html.go # HTML formatter
│ ├── text.go # Text formatter
│ └── formatter_test.go # Output tests
├── main.go # Application entry point
├── go.mod # Go module definition
├── Makefile # Build automation
└── README.md # This file
The tool is built with a modular architecture:
- CLI Layer: Uses Cobra for command-line interface with persistent flags
- Core Logic: File hashing and duplicate detection algorithms
- Output Layer: Pluggable formatters for different output types
- Data Flow: Structured data flows from detection → formatting → output (stdout/file)
These flags work with all commands:
-o, --output string: Output format (text, json, xml, html) (default "text")
-f, --file string: Output file (default: stdout)
-j, --json: Shortcut for -o json
-x, --xml: Shortcut for -o xml
-w, --html: Shortcut for -o html
-H, --hash string: Hash algorithm (md5, sha1, sha256) (default "md5")
The dirstat command uses only the global flags (no command-specific flags).
Rename files in a directory using pattern matching and sed-like replacements.
Rename files in a directory:
filetools rename --match "*.jpg" --sed "s/^/vacation_/" /photos
If no directory is specified, it uses the current directory:
filetools rename --match "*.txt" --sed "s/draft/final/g"
Choose from multiple output formats:
# Text output (default)
filetools rename --match "*.jpg" --sed "s/old/new/" /path
# JSON output
filetools rename -o json --match "*.jpg" --sed "s/old/new/" /path
# XML output
filetools rename -o xml --match "*.jpg" --sed "s/old/new/" /path
# HTML output
filetools rename -o html --match "*.jpg" --sed "s/old/new/" /path
Add prefix to all JPG files:
filetools rename --match "*.jpg" --sed "s/^/vacation_/" /photos
Remove suffix from files:
filetools rename --match "*_old.jpg" --sed "s/_old//" /photos
General replacement:
filetools rename --match "*.txt" --sed "s/draft/final/g" /docs
By default, the command runs in dry-run mode for safety:
filetools rename --match "*.jpg" --sed "s/old/new/" /photos
# Shows what would be renamed without making changes
To actually perform the renames:
filetools rename --force --match "*.jpg" --sed "s/old/new/" /photos
--match string: File pattern to match (glob, required)
--sed string: Sed-style replacement expression (e.g., s/old/new/g, required)
--dry-run: Preview changes without executing (default: true)
--force: Perform actual renames and overwrite existing files
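To make the sed-style expressions above more concrete, here is a minimal sketch of how an expression such as s/old/new/g could be parsed and applied to a file name in Go. This is an assumption about the technique, not the rename command's actual parser: the sedExpr type and the parseSed and apply functions are hypothetical, the pattern is interpreted with Go's regexp syntax, only "/" is accepted as the delimiter, and escaped slashes are not handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sedExpr is a parsed s/pattern/replacement/[g] expression (illustrative).
type sedExpr struct {
	re     *regexp.Regexp
	repl   string
	global bool
}

// parseSed accepts only the forms s/old/new/ and s/old/new/g.
func parseSed(expr string) (*sedExpr, error) {
	parts := strings.Split(expr, "/")
	if len(parts) != 4 || parts[0] != "s" {
		return nil, fmt.Errorf("unsupported expression %q (want s/old/new/[g])", expr)
	}
	if parts[3] != "" && parts[3] != "g" {
		return nil, fmt.Errorf("unsupported flag %q", parts[3])
	}
	re, err := regexp.Compile(parts[1])
	if err != nil {
		return nil, err
	}
	return &sedExpr{re: re, repl: parts[2], global: parts[3] == "g"}, nil
}

// apply rewrites name. Without the g flag only the first match is replaced.
// Capture groups use Go's $1 syntax rather than sed's \1.
func (s *sedExpr) apply(name string) string {
	if s.global {
		return s.re.ReplaceAllString(name, s.repl)
	}
	loc := s.re.FindStringSubmatchIndex(name)
	if loc == nil {
		return name
	}
	replaced := s.re.ExpandString(nil, s.repl, name, loc)
	return name[:loc[0]] + string(replaced) + name[loc[1]:]
}

func main() {
	examples := [][2]string{
		{"s/^/vacation_/", "beach.jpg"},
		{"s/_old//", "img_old.jpg"},
		{"s/draft/final/g", "draft_of_draft.txt"},
	}
	for _, ex := range examples {
		s, err := parseSed(ex[0])
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %s -> %s\n", ex[0], ex[1], s.apply(ex[1]))
	}
}
```

A dry run in this model simply prints the old and new names; only a forced run would go on to call os.Rename.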
# View help
filetools --help
filetools dupfind --help
filetools dirstat --help
# Different output combinations
filetools dupfind -j -f results.json /path
filetools dupfind -o xml -f report.xml /path
filetools dupfind -w /path > report.html
# Directory statistics
filetools dirstat -j -f stats.json /path
filetools dirstat -w -f analysis.html /path
Run the test suite:
make test
Run specific tests:
go test -run TestCalculateHash ./cmd/
go test ./internal/output/
The project follows Go best practices:
- Uses gofmt for consistent formatting
- Includes comprehensive unit tests
- Follows standard Go naming conventions
- Uses Cobra for CLI framework
- Modular architecture for maintainability
To add a new output format:
- Create a new formatter in internal/output/
- Implement the OutputFormatter interface
- Add the format to the NewFormatter() function
- Add a corresponding flag if needed
- Update tests and documentation
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Write tests for your changes
- Ensure all tests pass (make test)
- Update documentation if needed
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
- Follow the existing code style and architecture
- Add tests for new functionality
- Update README.md for new features
- Ensure backward compatibility
- Use meaningful commit messages
This project is licensed under the MIT License - see the LICENSE file for details.