# Whisper-Powered Subtitle Synchronization
**A smart subtitle synchronization tool powered by OpenAI's Whisper.**
This tool automatically detects and fixes desynchronized subtitles by listening to the audio track of your media. Unlike standard tools that only apply a fixed time shift, this project detects **Non-Linear Drift**, **Framerate Mismatches**, and **Variable Speed** issues, applying an "Elastic" correction map to perfectly align subtitles from start to finish.
Designed to work as a standalone CLI tool or a **Bazarr** post-processing script.
> [!NOTE]
> Generative AI has been used during the development of this project.
---
## Installation
### 1. Prerequisites
* **Python 3.9+**
* **FFmpeg:** Must be installed and accessible in your system PATH.
  * *Linux:* `sudo apt install ffmpeg`
  * *Windows:* Download binaries and add them to PATH.
### 2. Clone & Install
```bash
git clone <url of this repo>
cd <repo folder>

# (Optional) Create a virtual environment
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

# Install dependencies
pip install -r requirements.txt
```
---
## Configuration
All settings live in `config.py`. You can tweak them to balance speed against accuracy; the most important are:
```python
SYNC_CONFIG = {
    "device": "cpu",          # Use "cuda" if you have an NVIDIA GPU
    "compute_type": "int8",   # Use "float16" for GPU
    "sample_count": 25,       # How many points to check (higher = more accurate curve)
    "scan_duration_sec": 60,  # Length of each audio chunk to transcribe (higher = more data, slower)
    "correction_method": "auto",  # "auto", "constant", or "force_elastic"
}
```
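As a rough illustration of how `sample_count` and `scan_duration_sec` interact, the sketch below computes evenly spaced checkpoint start times for a media file. `checkpoint_offsets` is a hypothetical helper for illustration only, not part of the project's code:

```python
def checkpoint_offsets(media_duration_sec: float, sample_count: int,
                       scan_duration_sec: float) -> list[float]:
    """Evenly spaced start times (in seconds) for the audio chunks.

    The last chunk is placed so it still fits inside the file.
    Hypothetical helper, shown only to illustrate the config values.
    """
    usable = max(media_duration_sec - scan_duration_sec, 0.0)
    step = usable / max(sample_count - 1, 1)
    return [round(i * step, 2) for i in range(sample_count)]

# A 1-hour file with the defaults above yields 25 checkpoints,
# roughly one every 2.5 minutes:
offsets = checkpoint_offsets(3600, 25, 60)
# offsets[0] == 0.0, offsets[-1] == 3540.0
```

Raising `sample_count` gives the drift curve more anchor points at the cost of more Whisper transcriptions per file.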
---
## How It Works
1. **Extract:** The tool extracts small audio chunks (e.g., 60 seconds) at regular intervals (checkpoints) throughout the media file.
2. **Transcribe:** It uses Whisper to transcribe the speech in those chunks.
3. **Match:** It fuzzy-matches the transcribed text against the subtitle file to pair each *actual* timestamp with its *subtitle* timestamp.
4. **Analyze:**
   - If offsets are stable, apply a **Global Offset**.
   - If offsets drift linearly, apply **Linear Regression** (slope correction).
   - If offsets are chaotic, generate an **Elastic Map** (piecewise interpolation).
5. **Apply:** The subtitles are rewritten with the corrected timings.
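The analysis step can be sketched as follows. This is an illustrative reimplementation of the idea, not the project's actual code; the function name `build_correction` and the `jitter_threshold` value are assumptions:

```python
import statistics

def build_correction(sub_times, real_times, jitter_threshold=0.2):
    """Pick a correction strategy from (subtitle time, detected time) pairs.

    Returns a function mapping an original subtitle timestamp (seconds)
    to a corrected one. Illustrative sketch only.
    """
    offsets = [r - s for s, r in zip(sub_times, real_times)]
    if max(offsets) - min(offsets) < jitter_threshold:
        # Stable offsets: one global shift.
        shift = statistics.mean(offsets)
        return lambda t: t + shift

    # Check for linear drift with a least-squares fit: real = a*sub + b.
    mean_s, mean_r = statistics.mean(sub_times), statistics.mean(real_times)
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(sub_times, real_times))
    var = sum((s - mean_s) ** 2 for s in sub_times)
    a, b = cov / var, mean_r - (cov / var) * mean_s
    residual = max(abs(a * s + b - r) for s, r in zip(sub_times, real_times))
    if residual < jitter_threshold:
        return lambda t: a * t + b

    # Chaotic offsets: elastic map via piecewise-linear interpolation
    # between checkpoints, extended flatly beyond the first/last one.
    def elastic(t):
        if t <= sub_times[0]:
            return real_times[0] + (t - sub_times[0])
        if t >= sub_times[-1]:
            return real_times[-1] + (t - sub_times[-1])
        for (s0, r0), (s1, r1) in zip(zip(sub_times, real_times),
                                      zip(sub_times[1:], real_times[1:])):
            if s0 <= t <= s1:
                frac = (t - s0) / (s1 - s0)
                return r0 + frac * (r1 - r0)
    return elastic
```

For example, checkpoints at subtitle times `[0, 100, 200]` detected at `[0, 104, 208]` fit a clean 1.04 slope (a framerate-mismatch signature), so the linear branch is chosen and a subtitle at 50 s is moved to 52 s.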
---
## Usage
### Command Line (Manual)
You can run the script manually by mimicking the Bazarr argument format:
```bash
python main.py \
  episode="/path/to/movie.mkv" \
  episode_name="My Movie" \
  subtitles="/path/to/subs.srt" \
  episode_language="English" \
  subtitles_language="English"
```
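Since the arguments arrive as `key="value"` tokens, a minimal way to read them looks like this. This is a sketch of the idea; the shipped `main.py` may parse them differently:

```python
import sys

def parse_bazarr_args(argv):
    """Turn Bazarr-style key="value" tokens into a dict (illustrative sketch)."""
    params = {}
    for token in argv:
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens without an "=" sign
            params[key] = value.strip('"')
    return params

# e.g. python main.py episode="/path/to/movie.mkv" ...
args = parse_bazarr_args(sys.argv[1:])
```

Note that the shell usually strips the double quotes before the script sees them; the `strip('"')` handles the case where they survive (as with some process launchers).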
### Integration with Bazarr
> [!CAUTION]
> Not yet tested.
This tool is designed to be a "Custom Script" in Bazarr.
1. Go to **Bazarr > Settings > Subtitles > Post-Processing**.
2. Enable **"Execute a custom script"**.
3. **Command:**
   ```bash
   python /path/to/script/main.py
   ```
4. **Arguments:**
   ```text
   episode="{{episode}}" episode_name="{{episode_name}}" subtitles="{{subtitles}}" episode_language="{{episode_language}}" subtitles_language="{{subtitles_language}}"
   ```
*(Note: Bazarr passes these variables automatically.)*