chore: add readme

2026-01-30 20:29:49 +01:00
parent 2a7f1a526d
commit c7c4f9a0aa

README.md

@@ -0,0 +1,109 @@
# Whisper-Powered Subtitle Synchronization
**A smart subtitle synchronization tool powered by OpenAI's Whisper.**
This tool automatically detects and fixes desynchronized subtitles by listening to the audio track of your media. Unlike standard tools that only apply a fixed time shift, this project detects **Non-Linear Drift**, **Framerate Mismatches**, and **Variable Speed** issues, applying an "Elastic" correction map to perfectly align subtitles from start to finish.
Designed to work as a standalone CLI tool or a **Bazarr** post-processing script.
> [!NOTE]
> Generative AI has been used during the development of this project.
---
## Installation
### 1. Prerequisites
* **Python 3.9+**
* **FFmpeg:** Must be installed and accessible in your system PATH.
* *Linux:* `sudo apt install ffmpeg`
* *Windows:* Download binaries and add to PATH.
### 2. Clone & Install
```bash
git clone <url of this repo>
cd <repo folder>
# (Optional) Create a virtual environment
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
# Install dependencies
pip install -r requirements.txt
```
---
## Configuration
All settings are located in `config.py`. You can tweak these to balance speed against accuracy; the most important ones are:
```python
SYNC_CONFIG = {
"device": "cpu", # Use 'cuda' if you have an NVIDIA GPU
"compute_type": "int8", # Use 'float16' for GPU
"sample_count": 25, # How many points to check (higher = more accurate curve)
"scan_duration_sec": 60, # The length of each audio chunk to transcribe (higher = more data, slower)
"correction_method": "auto" # "auto", "constant", or "force_elastic"
}
```
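For context, here is a minimal sketch of how `device` and `compute_type` would typically feed the transcription backend. It assumes a `faster-whisper` backend (the `int8`/`float16` compute types match its API); the model size and audio file name are placeholders, not values taken from this repo.

```python
# Sketch only: assumes faster-whisper; model size and file name are placeholders.
from faster_whisper import WhisperModel

from config import SYNC_CONFIG

model = WhisperModel(
    "small",                                   # placeholder model size
    device=SYNC_CONFIG["device"],              # "cpu" or "cuda"
    compute_type=SYNC_CONFIG["compute_type"],  # "int8" on CPU, "float16" on GPU
)

# Transcribe one extracted audio chunk and print timestamped segments.
segments, _info = model.transcribe("chunk.wav")
for seg in segments:
    print(f"[{seg.start:7.2f} -> {seg.end:7.2f}] {seg.text}")
```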
---
## How It Works
1. **Extract:** The tool extracts small audio chunks (e.g., 60 seconds) at regular intervals (Checkpoints) throughout the media file.
2. **Transcribe:** It uses Whisper to transcribe the speech in those chunks.
3. **Match:** It fuzzy-matches the transcribed text against the subtitle file to find the *actual* timestamp vs. the *subtitle* timestamp (a matching sketch follows this list).
4. **Analyze** (see the correction sketch after this list):
   - If offsets are stable → apply a **Global Offset**.
   - If offsets drift linearly → apply **Linear Regression** (slope correction).
   - If offsets are chaotic → generate an **Elastic Map** (piecewise interpolation).
5. **Apply:** The subtitles are rewritten with the corrected timings.
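To make step 3 concrete, here is a hypothetical matching sketch using `difflib` from the standard library; the project's actual matcher, threshold, and data structures may differ.

```python
# Hypothetical sketch of step 3: compare one transcribed checkpoint against
# every subtitle line and record the time offset of the best fuzzy match.
from difflib import SequenceMatcher

def checkpoint_offset(transcribed_text, audio_start, subtitle_lines, min_ratio=0.6):
    """subtitle_lines: list of (start_seconds, text) tuples parsed from the .srt.
    Returns subtitle_start - audio_start for the best match, or None if no line
    is similar enough (min_ratio is an illustrative threshold)."""
    best_ratio, best_start = 0.0, None
    for sub_start, sub_text in subtitle_lines:
        ratio = SequenceMatcher(None, transcribed_text.lower(), sub_text.lower()).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, sub_start
    if best_ratio < min_ratio:
        return None
    return best_start - audio_start
```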
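And a hypothetical sketch of steps 4-5: the collected checkpoints become a correction function that is either a constant shift, a linear fit, or a piecewise-linear (elastic) map. The numbers and thresholds below are illustrative, not output from the tool.

```python
# Hypothetical sketch of steps 4-5 (example data and thresholds, not real output).
import numpy as np

audio_t = np.array([60.0, 600.0, 1200.0, 1800.0, 2400.0])  # checkpoint positions (s)
offset  = np.array([ 1.0,   2.1,    3.0,    4.1,    5.0])  # subtitle minus audio (s)

slope, intercept = np.polyfit(audio_t, offset, 1)           # linear fit of the drift
residual = np.abs(offset - (slope * audio_t + intercept)).max()

if np.ptp(offset) < 0.3:
    correct = lambda t: t - offset.mean()                    # stable -> global offset
elif residual < 0.3:
    correct = lambda t: t - (slope * t + intercept)          # linear drift -> slope fix
else:
    correct = lambda t: t - np.interp(t, audio_t, offset)    # chaotic -> elastic map

# Step 5 would apply correct() to every start/end time in the .srt file.
print(f"{correct(1500.0):.2f}")  # corrected timing for a subtitle event at 25:00
```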
---
## Usage
### Command Line (Manual)
You can run the script manually by mimicking the Bazarr argument format:
```bash
python main.py \
episode="/path/to/movie.mkv" \
episode_name="My Movie" \
subtitles="/path/to/subs.srt" \
episode_language="English" \
subtitles_language="English"
```
### Integration with Bazarr
> [!CAUTION]
> Not yet tested.
This tool is designed to be a "Custom Script" in Bazarr.
1. Go to **Bazarr > Settings > Subtitles > Post-Processing**.
2. Enable **"Execute a custom script"**.
3. **Command:**
```bash
python /path/to/script/main.py
```
4. **Arguments:**
```text
episode="{{episode}}" episode_name="{{episode_name}}" subtitles="{{subtitles}}" episode_language="{{episode_language}}" subtitles_language="{{subtitles_language}}"
```
*(Note: Bazarr passes these variables automatically.)*