Whisper-Powered Subtitle Synchronization

A smart subtitle synchronization tool powered by (OpenAI's) Whisper.

This tool automatically detects and fixes desynchronized subtitles by listening to the audio track of your media. Unlike standard tools that only apply a fixed time shift, this project detects Non-Linear Drift, Framerate Mismatches, and Variable Speed issues, applying an "Elastic" correction map to perfectly align subtitles from start to finish.

Designed to work as a standalone CLI tool or a Bazarr post-processing script.

Note

Generative AI has been used during the development of this project.


Installation

1. Prerequisites

  • Python 3.9+
  • FFmpeg: Must be installed and accessible in your system PATH.
  • Linux: sudo apt install ffmpeg
  • Windows: Download binaries and add to PATH.

2. Clone & Install

git clone <url of this repo>
cd <repo folder>

# (Optional) Create a virtual environment
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

# Install dependencies
pip install -r requirements.txt


Configuration

All settings are located in config.py. You can tweak these to balance speed vs. accuracy, the most importants being:

SYNC_CONFIG = {
    "device": "cpu",            # Use 'cuda' if you have an NVIDIA GPU
    "compute_type": "int8",     # Use 'float16' for GPU
    "sample_count": 25,         # How many points to check (higher = more accurate curve)
    "scan_duration_sec": 60,    # The length of each audio chunk to transcribe (higher = more data, slower)
    "correction_method": "auto" # "auto", "constant", or "force_elastic"
}


How It Works

  1. Extract: The tool extracts small audio chunks (e.g., 60 seconds) at regular intervals (Checkpoints) throughout the media file.

  2. Transcribe: It uses Whisper to transcribe the speech in those chunks.

  3. Match: It fuzzy-matches the transcribed text against the subtitle file to find the actual timestamp vs the subtitle timestamp.

  4. Analyze:

    • If offsets are stable Apply Global Offset.
    • If offsets drift linearly Apply Linear Regression (Slope correction).
    • If offsets are chaotic Generate an Elastic Map (Piecewise Interpolation).
  5. Apply: The subtitles are rewritten with the corrected timings.


Usage

Command Line (Manual)

You can run the script manually by mimicking the Bazarr argument format:

python main.py \
  episode="/path/to/movie.mkv" \
  episode_name="My Movie" \
  subtitles="/path/to/subs.srt" \
  episode_language="English" \
  subtitles_language="English"

Integration with Bazarr

Caution

Untested yet

This tool is designed to be a "Custom Script" in Bazarr.

  1. Go to Bazarr > Settings > Subtitles > Post-Processing.
  2. Enable "Execute a custom script".
  3. Command:
python /path/to/script/main.py

  1. Arguments:
episode="{{episode}}" episode_name="{{episode_name}}" subtitles="{{subtitles}}" episode_language="{{episode_language}}" subtitles_language="{{subtitles_language}}"

(Note: Bazarr passes these variables automatically).

Description
No description provided
Readme 45 KiB
Languages
Python 100%