Python Script to Scan for Low-Bitrate MP3 Files

This Python script recursively scans a specified directory for MP3 files with a bitrate less than 320 kbps, lists the containing folders, and uses multiprocessing for speed and a progress bar for user feedback.
Below is a step-by-step breakdown of how the script works, suitable for users with basic Python knowledge.
Script Summary
  • Purpose: Identify folders containing MP3 files with bitrates below 320 kbps.
  • Features: Multiprocessing for faster scanning, progress bar for real-time feedback, initial file counting for accurate progress tracking.
  • Requirements: Python 3.x, mutagen (pip install mutagen), tqdm (pip install tqdm).
Usage Instructions
Install Dependencies
In a virtual environment:
source myenv/bin/activate
pip install mutagen tqdm
Or for user-level installation, use:
pip install --user mutagen tqdm
Or via system package manager (Ubuntu/Debian):
sudo apt install python3-mutagen python3-tqdm
Save and Run the Script:
  • Save the script as scan_mp3.py.

Script:

import os
from mutagen.mp3 import MP3
from pathlib import Path
from tqdm import tqdm
import multiprocessing
from concurrent.futures import ProcessPoolExecutor, as_completed

def process_mp3_file(file_path):
    """
    Process a single MP3 file and return its parent folder if bitrate < 320 kbps.
    """
    try:
        audio = MP3(file_path)
        if audio.info.bitrate < 320000:  # Bitrate in bits/sec
            return str(file_path.parent)
        return None
    except Exception:
        return None

def scan_mp3_bitrate(directory, output_file):
    """
    Recursively scan a directory for MP3 files with bitrate < 320 kbps using multiprocessing,
    with initial file counting for accurate progress bar, and save results to a file.
    """
    low_bitrate_folders = set()
    
    # Count MP3 files for progress bar
    mp3_files = []
    print("Counting MP3 files for progress tracking...")
    for root, _, files in os.walk(directory):
        for file in files:
            if file.lower().endswith('.mp3'):
                mp3_files.append(Path(root) / file)
    
    total_files = len(mp3_files)
    if total_files == 0:
        print("\nNo MP3 files found in the directory.")
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write("No MP3 files found in the directory.\n")
        return
    
    # Scan MP3 files with multiprocessing and progress bar
    print(f"\nScanning {total_files} MP3 files using {multiprocessing.cpu_count()} CPU cores...")
    with ProcessPoolExecutor() as executor:
        futures = [executor.submit(process_mp3_file, file_path) for file_path in mp3_files]
        
        # Process results with progress bar
        for future in tqdm(as_completed(futures), total=total_files, desc="Progress", unit="file"):
            result = future.result()
            if result:
                low_bitrate_folders.add(result)
    
    # Print and save results
    if low_bitrate_folders:
        print("\nFolders containing MP3 files with bitrate < 320 kbps:")
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write("Folders containing MP3 files with bitrate < 320 kbps:\n")
            for folder in sorted(low_bitrate_folders):
                print(f"- {folder}")
                f.write(f"- {folder}\n")
        print(f"\nResults saved to: {output_file}")
    else:
        print("\nNo folders found with MP3 files under 320 kbps.")
        with open(output_file, 'w', encoding='utf-8') as f:
            f.write("No folders found with MP3 files under 320 kbps.\n")
        print(f"\nResults saved to: {output_file}")

def main():
    # Get directory from user
    directory = input("Enter the directory to scan (or press Enter for current directory): ").strip()
    if not directory:
        directory = os.getcwd()
    
    # Verify directory exists
    if not os.path.isdir(directory):
        print(f"Error: '{directory}' is not a valid directory.")
        return
    
    # Get output file path from user
    output_file = input("Enter the output file path (or press Enter for 'low_bitrate_folders.txt'): ").strip()
    if not output_file:
        output_file = 'low_bitrate_folders.txt'
    
    # Ensure output file has a .txt extension
    if not output_file.lower().endswith('.txt'):
        output_file += '.txt'
    
    print(f"\nScanning directory: {directory}")
    scan_mp3_bitrate(directory, output_file)

if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("\nScript terminated by user.")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
Run the script:
python scan_mp3.py
Enter a directory path or press Enter for the current directory
Example Output
Enter the directory to scan (or press Enter for current directory): /music

Scanning directory: /music
Counting MP3 files for progress tracking...

Scanning 15000 MP3 files using 8 CPU cores...
Progress: 100%|██████████████████████████| 15000/15000 [00:50<00:00, 300.00file/s]

Folders containing MP3 files with bitrate < 320 kbps:
- /music/Album1
- /music/Album2/Tracks
Performance Notes
  • Speed: Optimized for large datasets (e.g., 15,000 MP3 files) using multiprocessing, typically completing in 30–60 seconds on a 4–8 core CPU.
  • Initial Counting: Adds a few seconds to count files for an accurate progress bar.
  • Scalability: Handles large directories efficiently but may require tuning (e.g., limiting CPU cores) for low-memory systems or slow disks.
Troubleshooting
  • Slow Performance: If the initial counting or scanning is slow, check disk speed (HDD vs. SSD) or reduce CPU cores:
with ProcessPoolExecutor(max_workers=4) as executor:
  • Memory Issues: Reduce max_workers if memory usage is high.
  • Errors: Check for corrupted MP3 files or permission issues. Share error messages for support.
  • Dependencies: Ensure mutagen and tqdm are installed in the correct environment.
Customization Options
  • Add filters to skip specific folders (e.g., .git).
  • Log bitrates of low-bitrate files.
  • Switch to threading for I/O-bound tasks (e.g., slow external drives).

Step-by-Step Breakdown

Import Required Libraries:
    • os: Provides directory traversal functionality using os.walk() to recursively scan folders.
    • mutagen.mp3.MP3: Reads MP3 file metadata, specifically bitrate.
    • pathlib.Path: Handles file paths in a cross-platform way.
    • tqdm: Displays a progress bar for user feedback during file scanning.
    • multiprocessing and concurrent.futures.ProcessPoolExecutor: Enable parallel processing of MP3 files across CPU cores.
    • as_completed: Processes results as they complete for real-time progress updates.
Define process_mp3_file Function:
  • Purpose: Processes a single MP3 file to check its bitrate.
  • Input: A file path (as a Path object).
  • Process:
    • Loads the MP3 file using MP3(file_path) to access metadata.
    • Checks if the bitrate is less than 320,000 bits/sec (320 kbps).
    • Returns the parent folder path (as a string) if the bitrate is low, or None otherwise.
  • Error Handling: Catches exceptions (e.g., corrupted files) and returns None to avoid interrupting the scan.
Define scan_mp3_bitrate Function:
  • Purpose: Scans the directory for MP3 files and identifies folders with low-bitrate files.
  • Steps:
    • Initial File Counting:
      • Uses os.walk(directory) to recursively traverse the directory.
      • Collects paths of all files with .mp3 extension (case-insensitive) into a list.
      • Prints “Counting MP3 files for progress tracking…” to inform the user.
      • Stores the total count for the progress bar.
    • Check for Empty Directory:
      • If no MP3 files are found, prints “No MP3 files found in the directory.” and exits.
    • Multiprocessing Scan:
      • Initializes a ProcessPoolExecutor to use all available CPU cores.
      • Submits each MP3 file for processing using process_mp3_file.
      • Uses tqdm to display a progress bar, updating as files are processed.
      • Collects parent folder paths for low-bitrate files into a set to avoid duplicates.
    • Output Results:
      • If low-bitrate folders are found, prints them in sorted order.
      • If none are found, prints “No folders found with MP3 files under 320 kbps.”
Define main Function:
    • Purpose: Handles user input and script execution.
    • Steps:
      • Prompts the user to enter a directory path or press Enter to use the current directory (os.getcwd()).
      • Verifies the directory exists using os.path.isdir().
      • If invalid, prints an error and exits.
      • Calls scan_mp3_bitrate(directory) to start the scan.
      • Prints the directory being scanned for clarity.
Main Execution Block:
    • Purpose: Runs the script safely with error handling.
    • Process:
      • Wraps main() in a try-except block.
      • Catches KeyboardInterrupt (Ctrl+C) and prints “Script terminated by user.”
      • Catches unexpected errors and prints them for debugging.