diff --git a/README.md b/README.md
new file mode 100644
index 0000000..a0c637a
--- /dev/null
+++ b/README.md
@@ -0,0 +1,109 @@
+# Whisper-Powered Subtitle Synchronization
+
+**A smart subtitle synchronization tool powered by OpenAI's Whisper.**
+
+This tool automatically detects and fixes desynchronized subtitles by listening to the audio track of your media. Unlike tools that only apply a fixed time shift, it also detects **Non-Linear Drift**, **Framerate Mismatches**, and **Variable Speed** issues, and applies an "Elastic" correction map to align subtitles from start to finish.
+
+It is designed to work as a standalone CLI tool or as a **Bazarr** post-processing script.
+
+> [!NOTE]
+> Generative AI has been used during the development of this project.
+
+---
+
+## Installation
+### 1. Prerequisites
+
+* **Python 3.9+**
+* **FFmpeg:** Must be installed and accessible in your system PATH.
+  * *Linux (Debian/Ubuntu):* `sudo apt install ffmpeg`
+  * *Windows:* Download the binaries and add them to PATH.
+
+
+### 2. Clone & Install
+
+```bash
+git clone
+cd
+
+# (Optional) Create a virtual environment
+python -m venv venv
+source venv/bin/activate  # or venv\Scripts\activate on Windows
+
+# Install dependencies
+pip install -r requirements.txt
+
+```
+
+---
+
+## Configuration
+
+All settings are located in `config.py`. You can tweak them to balance speed against accuracy. The most important ones are:
+
+```python
+SYNC_CONFIG = {
+    "device": "cpu",              # Use 'cuda' if you have an NVIDIA GPU
+    "compute_type": "int8",       # Use 'float16' for GPU
+    "sample_count": 25,           # How many points to check (higher = more accurate curve)
+    "scan_duration_sec": 60,      # The length of each audio chunk to transcribe (higher = more data, slower)
+    "correction_method": "auto"   # "auto", "constant", or "force_elastic"
+}
+
+```
+
+---
+
+## How It Works
+
+1. **Extract:** The tool extracts short audio chunks (e.g., 60 seconds) at regular intervals (checkpoints) throughout the media file.
+2. **Transcribe:** It uses Whisper to transcribe the speech in those chunks.
+3. **Match:** It fuzzy-matches the transcribed text against the subtitle file to compare the *actual* timestamp of each matched line with its *subtitle* timestamp.
+4. **Analyze:**
+   - If the offsets are stable: apply a **Global Offset**.
+   - If the offsets drift linearly: apply **Linear Regression** (slope correction).
+   - If the offsets are chaotic: generate an **Elastic Map** (piecewise interpolation).
+
+
+5. **Apply:** The subtitles are rewritten with the corrected timings.
+
+---
+
+## Usage
+### Command Line (Manual)
+
+You can run the script manually by mimicking the Bazarr argument format:
+
+```bash
+python main.py \
+  episode="/path/to/movie.mkv" \
+  episode_name="My Movie" \
+  subtitles="/path/to/subs.srt" \
+  episode_language="English" \
+  subtitles_language="English"
+
+```
+
+### Integration with Bazarr
+
+> [!CAUTION]
+> This integration has not been tested yet.
+
+This tool is designed to be used as a "Custom Script" in Bazarr.
+
+1. Go to **Bazarr > Settings > Subtitles > Post-Processing**.
+2. Enable **"Execute a custom script"**.
+3. **Command:**
+   ```bash
+   python /path/to/script/main.py
+
+   ```
+
+
+4. **Arguments:**
+   ```text
+   episode="{{episode}}" episode_name="{{episode_name}}" subtitles="{{subtitles}}" episode_language="{{episode_language}}" subtitles_language="{{subtitles_language}}"
+
+   ```
+
+*(Note: Bazarr fills in these variables automatically.)*
\ No newline at end of file
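The **Analyze** step listed under "How It Works" could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names `fit_line` and `build_correction`, the `tolerance` threshold, and the convention that an offset means "actual time minus subtitle time" are all hypothetical and do not come from `config.py` or `main.py`.

```python
from bisect import bisect_right
from statistics import median


def fit_line(points):
    """Ordinary least-squares fit of offset = slope * t + intercept."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_o = sum(o for _, o in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    cov = sum((t - mean_t) * (o - mean_o) for t, o in points)
    slope = cov / var_t if var_t else 0.0
    return slope, mean_o - slope * mean_t


def build_correction(points, tolerance=0.3):
    """Pick a correction method and return (method_name, corrector).

    points: (subtitle_time_sec, offset_sec) pairs, one per checkpoint,
            where offset is assumed to be actual time minus subtitle time.
    tolerance: hypothetical threshold in seconds (an assumption of this sketch).
    corrector: function mapping a subtitle time to a corrected time.
    """
    if not points:
        raise ValueError("at least one checkpoint is required")
    points = sorted(points)
    times = [t for t, _ in points]
    offsets = [o for _, o in points]

    # Stable offsets: a single global shift is enough.
    if max(offsets) - min(offsets) <= tolerance:
        shift = median(offsets)
        return "constant", lambda t: t + shift

    # Linear drift: every checkpoint lies close to one fitted line.
    slope, intercept = fit_line(points)
    if all(abs(o - (slope * t + intercept)) <= tolerance for t, o in points):
        return "linear", lambda t: t + slope * t + intercept

    # Otherwise: interpolate the offset linearly between neighbouring checkpoints.
    def elastic(t):
        if t <= times[0]:
            return t + offsets[0]
        if t >= times[-1]:
            return t + offsets[-1]
        i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
        t0, t1 = times[i - 1], times[i]
        o0, o1 = offsets[i - 1], offsets[i]
        return t + o0 + (o1 - o0) * (t - t0) / (t1 - t0)

    return "elastic", elastic
```

Calling `build_correction([(0, 2.0), (600, 2.1), (1200, 1.9)])` would select the constant method under these assumptions, because the offsets vary by less than the tolerance. The returned function can then be applied to every subtitle start and end time. A real implementation would probably also discard outlier checkpoints before fitting.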