# Whisper-Powered Subtitle Synchronization **A smart subtitle synchronization tool powered by (OpenAI's) Whisper.** This tool automatically detects and fixes desynchronized subtitles by listening to the audio track of your media. Unlike standard tools that only apply a fixed time shift, this project detects **Non-Linear Drift**, **Framerate Mismatches**, and **Variable Speed** issues, applying an "Elastic" correction map to perfectly align subtitles from start to finish. Designed to work as a standalone CLI tool or a **Bazarr** post-processing script. > [!INFO] > Generative AI has been used during the development of this project. --- ## Installation ### 1. Prerequisites * **Python 3.9+** * **FFmpeg:** Must be installed and accessible in your system PATH. * *Linux:* `sudo apt install ffmpeg` * *Windows:* Download binaries and add to PATH. ### 2. Clone & Install ```bash git clone cd # (Optional) Create a virtual environment python -m venv venv source venv/bin/activate # or venv\Scripts\activate on Windows # Install dependencies pip install -r requirements.txt ``` --- ## Configuration All settings are located in `config.py`. You can tweak these to balance speed vs. accuracy, the most importants being: ```python SYNC_CONFIG = { "device": "cpu", # Use 'cuda' if you have an NVIDIA GPU "compute_type": "int8", # Use 'float16' for GPU "sample_count": 25, # How many points to check (higher = more accurate curve) "scan_duration_sec": 60, # The length of each audio chunk to transcribe (higher = more data, slower) "correction_method": "auto" # "auto", "constant", or "force_elastic" } ``` --- ## How It Works 1. **Extract:** The tool extracts small audio chunks (e.g., 60 seconds) at regular intervals (Checkpoints) throughout the media file. 2. **Transcribe:** It uses Whisper to transcribe the speech in those chunks. 3. **Match:** It fuzzy-matches the transcribed text against the subtitle file to find the *actual* timestamp vs the *subtitle* timestamp. 4. **Analyze:** - If offsets are stable Apply **Global Offset**. - If offsets drift linearly Apply **Linear Regression** (Slope correction). - If offsets are chaotic Generate an **Elastic Map** (Piecewise Interpolation). 5. **Apply:** The subtitles are rewritten with the corrected timings. --- ## Usage ### Command Line (Manual) You can run the script manually by mimicking the Bazarr argument format: ```bash python main.py \ episode="/path/to/movie.mkv" \ episode_name="My Movie" \ subtitles="/path/to/subs.srt" \ episode_language="English" \ subtitles_language="English" ``` ### Integration with Bazarr > [!CAUTION] > Untested yet This tool is designed to be a "Custom Script" in Bazarr. 1. Go to **Bazarr > Settings > Subtitles > Post-Processing**. 2. Enable **"Execute a custom script"**. 3. **Command:** ```bash python /path/to/script/main.py ``` 4. **Arguments:** ```text episode="{{episode}}" episode_name="{{episode_name}}" subtitles="{{subtitles}}" episode_language="{{episode_language}}" subtitles_language="{{subtitles_language}}" ``` *(Note: Bazarr passes these variables automatically).*