Script Summary
- Purpose: Identify folders containing MP3 files with bitrates below 320 kbps.
- Features: Multiprocessing for faster scanning, a progress bar for real-time feedback, and initial file counting for accurate progress tracking.
- Requirements: Python 3.x, mutagen (pip install mutagen), tqdm (pip install tqdm).
Usage Instructions
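Install Dependencies
Use one of the following, depending on your setup:
- In a virtual environment (here named myenv):
  source myenv/bin/activate
  pip install mutagen tqdm
- For the current user only:
  pip install --user mutagen tqdm
- From system packages on Debian/Ubuntu:
  sudo apt install python3-mutagen python3-tqdm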
Save and Run the Script:
- Save the script as scan_mp3.py.
Script:
import os
from mutagen.mp3 import MP3
from pathlib import Path
from tqdm import tqdm
import multiprocessing
from concurrent.futures import ProcessPoolExecutor, as_completed


def process_mp3_file(file_path):
    """
    Process a single MP3 file and return its parent folder if bitrate < 320 kbps.
    """
    try:
        audio = MP3(file_path)
        if audio.info.bitrate < 320000:  # Bitrate in bits/sec
            return str(file_path.parent)
        return None
    except Exception:
        return None


def scan_mp3_bitrate(directory, output_file):
    """
    Recursively scan a directory for MP3 files with bitrate < 320 kbps using multiprocessing,
    with initial file counting for accurate progress bar, and save results to a file.
    """
    low_bitrate_folders = set()

    # Count MP3 files for progress bar
    mp3_files = []
    print("Counting MP3 files for progress tracking...")
    for root, _, files in os.walk(directory):
        for file in files:
            if file.lower().endswith('.mp3'):
                mp3_files.append(Path(root) / file)

    total_files = len(mp3_files)
    if total_files == 0:
        print("\nNo MP3 files found in the directory.")
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write("No MP3 files found in the directory.\n")
        return

    # Scan MP3 files with multiprocessing and progress bar
    print(f"\nScanning {total_files} MP3 files using {multiprocessing.cpu_count()} CPU cores...")
    with ProcessPoolExecutor() as executor:
        futures = [executor.submit(process_mp3_file, file_path) for file_path in mp3_files]
        # Process results with progress bar
        for future in tqdm(as_completed(futures), total=total_files, desc="Progress", unit="file"):
            result = future.result()
            if result:
                low_bitrate_folders.add(result)

    # Print and save results
    if low_bitrate_folders:
        print("\nFolders containing MP3 files with bitrate < 320 kbps:")
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write("Folders containing MP3 files with bitrate < 320 kbps:\n")
            for folder in sorted(low_bitrate_folders):
                print(f"- {folder}")
                f.write(f"- {folder}\n")
        print(f"\nResults saved to: {output_file}")
    else:
        print("\nNo folders found with MP3 files under 320 kbps.")
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write("No folders found with MP3 files under 320 kbps.\n")
        print(f"\nResults saved to: {output_file}")


def main():
    # Get directory from user
    directory = input("Enter the directory to scan (or press Enter for current directory): ").strip()
    if not directory:
        directory = os.getcwd()

    # Verify directory exists
    if not os.path.isdir(directory):
        print(f"Error: '{directory}' is not a valid directory.")
        return

    # Get output file path from user
    output_file = input("Enter the output file path (or press Enter for 'low_bitrate_folders.txt'): ").strip()
    if not output_file:
        output_file = 'low_bitrate_folders.txt'

    # Ensure output file has a .txt extension
    if not output_file.lower().endswith('.txt'):
        output_file += '.txt'

    print(f"\nScanning directory: {directory}")
    scan_mp3_bitrate(directory, output_file)


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\nScript terminated by user.")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
- Run the script:
  python scan_mp3.py
Example Output
Enter the directory to scan (or press Enter for current directory): /music
Enter the output file path (or press Enter for 'low_bitrate_folders.txt'):

Scanning directory: /music
Counting MP3 files for progress tracking...

Scanning 15000 MP3 files using 8 CPU cores...
Progress: 100%|██████████████████████████| 15000/15000 [00:50<00:00, 300.00file/s]

Folders containing MP3 files with bitrate < 320 kbps:
- /music/Album1
- /music/Album2/Tracks

Results saved to: low_bitrate_folders.txt
Performance Notes
- Speed: Optimized for large datasets (e.g., 15,000 MP3 files) using multiprocessing, typically completing in 30–60 seconds on a 4–8 core CPU.
- Initial Counting: Adds a few seconds to count files for an accurate progress bar.
- Scalability: Handles large directories efficiently but may require tuning (e.g., limiting CPU cores) for low-memory systems or slow disks.
Troubleshooting
- Slow Performance: If the initial counting or scanning is slow, check disk speed (HDD vs. SSD) or reduce the number of worker processes:
  with ProcessPoolExecutor(max_workers=4) as executor:
- Memory Issues: Reduce max_workers if memory usage is high.
- Errors: Check for corrupted MP3 files or permission issues. Note that process_mp3_file catches every exception and returns None, so unreadable files are skipped silently rather than reported. Share error messages for support.
- Dependencies: Ensure mutagen and tqdm are installed in the same environment that runs the script.
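To check the Dependencies item quickly, a small sketch like the one below can be run with the same interpreter that runs scan_mp3.py. The helper name missing_modules is an assumption made for this example and is not part of the script; it only relies on importlib.util.find_spec from the standard library.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names from `names` that this interpreter cannot import."""
    # find_spec returns None when a top-level module cannot be located.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Showing the interpreter path helps spot a venv/system Python mix-up.
    print(f"Interpreter: {sys.executable}")
    missing = missing_modules(["mutagen", "tqdm"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("mutagen and tqdm are importable.")
```

If the interpreter path printed here is not the one inside your virtual environment, the packages were probably installed somewhere else.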
Customization Options
- Add filters to skip specific folders (e.g., .git).
- Log bitrates of low-bitrate files.
- Switch to threading for I/O-bound tasks (e.g., slow external drives).
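As one possible way to implement the folder filter, the counting loop could prune directories while walking. This is a sketch, not the script's own code: the function name collect_mp3_files and the SKIP_DIRS set are assumptions introduced here. It relies on the documented os.walk behaviour that editing the directory list in place (in the default top-down mode) stops the walk from descending into the removed entries.

```python
import os
from pathlib import Path

# Hypothetical default; list the folder names you want to ignore.
SKIP_DIRS = {".git"}


def collect_mp3_files(directory, skip_dirs=SKIP_DIRS):
    """Walk `directory` and return MP3 paths, skipping folders named in `skip_dirs`."""
    mp3_files = []
    for root, dirs, files in os.walk(directory):
        # In-place edit: os.walk will not descend into the removed folders.
        dirs[:] = [d for d in dirs if d not in skip_dirs]
        for name in files:
            if name.lower().endswith(".mp3"):
                mp3_files.append(Path(root) / name)
    return mp3_files
```

The returned list could replace the mp3_files list that scan_mp3_bitrate builds before it starts the pool.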
Step-by-Step Breakdown
- Imports:
  - os: Provides directory traversal via os.walk(), which scans folders recursively.
  - mutagen.mp3.MP3: Reads MP3 file metadata, specifically the bitrate.
  - pathlib.Path: Handles file paths in a cross-platform way.
  - tqdm: Displays a progress bar for user feedback during file scanning.
  - multiprocessing and concurrent.futures.ProcessPoolExecutor: Enable parallel processing of MP3 files across CPU cores.
  - as_completed: Yields results as they complete, for real-time progress updates.
- process_mp3_file(file_path):
  - Purpose: Processes a single MP3 file to check its bitrate.
  - Input: A file path (as a Path object).
  - Process:
    - Loads the MP3 file using MP3(file_path) to access metadata.
    - Checks if the bitrate is less than 320,000 bits/sec (320 kbps).
    - Returns the parent folder path (as a string) if the bitrate is low, or None otherwise.
  - Error Handling: Catches exceptions (e.g., corrupted files) and returns None to avoid interrupting the scan.
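A variant of this worker could also report the bitrate and any read error, which covers the "log bitrates" idea and makes skipped files visible. The sketch below is an assumption-laden illustration, not the script's code: inspect_mp3_file, THRESHOLD_BPS, and the read_bitrate parameter are names invented here. The parameter exists only so the logic can be tried without real MP3 files; by default it reads the bitrate through mutagen exactly as the script does.

```python
from pathlib import Path

THRESHOLD_BPS = 320_000  # 320 kbps expressed in bits per second


def _read_bitrate_with_mutagen(file_path):
    # Imported lazily so the rest of the sketch works without mutagen installed.
    from mutagen.mp3 import MP3
    return MP3(file_path).info.bitrate


def inspect_mp3_file(file_path, read_bitrate=_read_bitrate_with_mutagen):
    """Return (folder, bitrate, error) for a file needing attention, else None."""
    file_path = Path(file_path)
    try:
        bitrate = read_bitrate(file_path)
    except Exception as exc:  # unreadable or corrupted file
        return (str(file_path.parent), None, f"{type(exc).__name__}: {exc}")
    if bitrate < THRESHOLD_BPS:
        return (str(file_path.parent), bitrate, None)
    return None
```

A caller could then print something like f"{folder}: {bitrate // 1000} kbps" for low-bitrate files and list the error strings separately.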
- scan_mp3_bitrate(directory, output_file):
  - Purpose: Scans the directory for MP3 files, identifies folders with low-bitrate files, and saves the result to the output file.
  - Steps:
    - Initial File Counting:
      - Prints “Counting MP3 files for progress tracking…” to inform the user.
      - Uses os.walk(directory) to recursively traverse the directory.
      - Collects paths of all files with .mp3 extension (case-insensitive) into a list.
      - Stores the total count for the progress bar.
    - Check for Empty Directory:
      - If no MP3 files are found, prints “No MP3 files found in the directory.”, writes the same message to the output file, and returns.
    - Multiprocessing Scan:
      - Initializes a ProcessPoolExecutor to use all available CPU cores.
      - Submits each MP3 file for processing using process_mp3_file.
      - Uses tqdm to display a progress bar, updating as files are processed.
      - Collects parent folder paths for low-bitrate files into a set to avoid duplicates.
    - Output Results:
      - If low-bitrate folders are found, prints them in sorted order and writes the same list to the output file.
      - If none are found, prints “No folders found with MP3 files under 320 kbps.” and writes that message to the output file.
      - Prints the path of the output file.
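The submit-and-collect pattern from the scan step, together with the "switch to threading" idea from the customization list, can be sketched with the executor class as a parameter. The function name collect_flagged_folders and its parameters are assumptions for illustration. Whether threads actually beat processes depends on the disk and on how much time is spent parsing in Python, so treat this as something to measure rather than a guaranteed speed-up.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect_flagged_folders(paths, worker, executor_cls=ThreadPoolExecutor, max_workers=None):
    """Run `worker` over `paths` in a pool and return the set of non-None results."""
    flagged = set()
    with executor_cls(max_workers=max_workers) as executor:
        # One future per file, as in the original scan loop.
        futures = [executor.submit(worker, path) for path in paths]
        # as_completed yields futures in completion order, not submission order.
        for future in as_completed(futures):
            result = future.result()
            if result:
                flagged.add(result)
    return flagged
```

Passing executor_cls=ProcessPoolExecutor (imported from concurrent.futures) would give the process-based behaviour back; in that case the worker must be a module-level function so it can be pickled, and the call should stay behind the `if __name__ == "__main__":` guard.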
- main():
  - Purpose: Handles user input and script execution.
  - Steps:
    - Prompts the user to enter a directory path or press Enter to use the current directory (os.getcwd()).
    - Verifies the directory exists using os.path.isdir(). If invalid, prints an error and exits.
    - Prompts for an output file path, defaulting to low_bitrate_folders.txt, and appends .txt if the name does not already end with it.
    - Prints the directory being scanned for clarity.
    - Calls scan_mp3_bitrate(directory, output_file) to start the scan.
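For unattended runs (cron jobs, shell loops) the interactive prompts can get in the way. A minimal sketch of the same input handling with argparse is shown below; the parse_args helper and the -o/--output flag are hypothetical and not part of the script as written. The defaults and the .txt rule mirror what main() does.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the scan directory and output file from the command line."""
    parser = argparse.ArgumentParser(
        description="Find folders containing MP3 files below 320 kbps."
    )
    parser.add_argument("directory", nargs="?", default=os.getcwd(),
                        help="directory to scan (default: current directory)")
    parser.add_argument("-o", "--output", default="low_bitrate_folders.txt",
                        help="output file path (default: low_bitrate_folders.txt)")
    args = parser.parse_args(argv)  # argv=None means "use sys.argv[1:]"
    if not os.path.isdir(args.directory):
        parser.error(f"'{args.directory}' is not a valid directory.")
    # Same rule as main(): make sure the output file ends in .txt.
    if not args.output.lower().endswith(".txt"):
        args.output += ".txt"
    return args
```

With this in place, main() could call parse_args() and pass args.directory and args.output to scan_mp3_bitrate instead of calling input().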
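- if __name__ == "__main__": block:
  - Purpose: Runs the script safely with error handling.
  - Process:
    - Wraps main() in a try-except block.
    - Catches KeyboardInterrupt (Ctrl+C) and prints “Script terminated by user.”
    - Catches unexpected errors and prints them for debugging.
  - The guard also matters for multiprocessing: on platforms that start worker processes with spawn (e.g., Windows and macOS), each worker re-imports the script, and the guard keeps main() from running again in the workers.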