# Whisper-Powered Subtitle Synchronization

**A smart subtitle synchronization tool powered by OpenAI's Whisper.**

This tool automatically detects and fixes desynchronized subtitles by listening to the audio track of your media. Unlike standard tools that only apply a fixed time shift, this project detects **Non-Linear Drift**, **Framerate Mismatches**, and **Variable Speed** issues, applying an "Elastic" correction map to align subtitles from start to finish.

Designed to work as a standalone CLI tool or as a **Bazarr** post-processing script.

> [!NOTE]
> Generative AI has been used during the development of this project.

---
## Installation

### 1. Prerequisites

* **Python 3.9+**
* **FFmpeg:** Must be installed and accessible in your system PATH (a quick check is sketched below).
  * *Linux:* `sudo apt install ffmpeg`
  * *Windows:* Download the binaries and add them to your PATH.
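
Since the tool relies on FFmpeg being reachable on your PATH, you can confirm this before running anything. This is a minimal, hypothetical check (not part of the project) using only the standard library:

```python
# Hypothetical sanity check: confirm FFmpeg is reachable on PATH.
import shutil

ffmpeg_path = shutil.which("ffmpeg")
if ffmpeg_path is None:
    raise SystemExit("FFmpeg not found on PATH - please install it first.")
print(f"FFmpeg found at: {ffmpeg_path}")
```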
### 2. Clone & Install

```bash
git clone <url of this repo>
cd <repo folder>

# (Optional) Create a virtual environment
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

# Install dependencies
pip install -r requirements.txt
```

---
## Configuration

All settings are located in `config.py`. You can tweak these to balance speed vs. accuracy, the most important being:

```python
SYNC_CONFIG = {
    "device": "cpu",             # Use 'cuda' if you have an NVIDIA GPU
    "compute_type": "int8",      # Use 'float16' for GPU
    "sample_count": 25,          # How many points to check (higher = more accurate curve)
    "scan_duration_sec": 60,     # The length of each audio chunk to transcribe (higher = more data, slower)
    "correction_method": "auto"  # "auto", "constant", or "force_elastic"
}
```
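
As a rough illustration of what the first two options control, here is a hypothetical sketch assuming the project uses the `faster-whisper` package (suggested by the `compute_type` setting); the model size and wiring are assumptions, not the project's actual code:

```python
# Hypothetical: how "device" / "compute_type" would typically be passed
# to a faster-whisper model. The "small" model size is an assumption.
from faster_whisper import WhisperModel

from config import SYNC_CONFIG

model = WhisperModel(
    "small",
    device=SYNC_CONFIG["device"],              # "cpu" or "cuda"
    compute_type=SYNC_CONFIG["compute_type"],  # "int8" on CPU, "float16" on GPU
)
```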
---
## How It Works

1. **Extract:** The tool extracts small audio chunks (e.g., 60 seconds) at regular intervals (Checkpoints) throughout the media file.
2. **Transcribe:** It uses Whisper to transcribe the speech in those chunks.
3. **Match:** It fuzzy-matches the transcribed text against the subtitle file to find the *actual* timestamp vs. the *subtitle* timestamp.
4. **Analyze** (a toy sketch of these three cases follows this list):
   - If offsets are stable → Apply a **Global Offset**.
   - If offsets drift linearly → Apply **Linear Regression** (Slope correction).
   - If offsets are chaotic → Generate an **Elastic Map** (Piecewise Interpolation).
5. **Apply:** The subtitles are rewritten with the corrected timings.
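
To make the three correction modes concrete, here is a small, self-contained sketch with made-up checkpoint data; it illustrates the idea rather than the project's actual code, and the 0.2-second thresholds are arbitrary:

```python
# Toy illustration of global offset vs. linear drift vs. elastic map.
# Checkpoint data and thresholds are invented for the example.
import numpy as np

sub_t   = np.array([60.0, 600.0, 1200.0, 1800.0])   # subtitle timestamps (s)
audio_t = np.array([62.1, 603.5, 1207.0, 1812.4])   # matched audio timestamps (s)
offsets = audio_t - sub_t

if np.ptp(offsets) < 0.2:
    # Offsets are stable -> one constant shift is enough
    correct = lambda t: t + offsets.mean()
elif np.ptp(offsets - np.polyval(np.polyfit(sub_t, offsets, 1), sub_t)) < 0.2:
    # Offsets grow linearly -> slope correction (framerate mismatch)
    slope, intercept = np.polyfit(sub_t, offsets, 1)
    correct = lambda t: t + slope * t + intercept
else:
    # Offsets are chaotic -> elastic map via piecewise interpolation
    correct = lambda t: float(np.interp(t, sub_t, audio_t))

print(correct(900.0))  # corrected time for a subtitle cue at 15:00
```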
---
## Usage

### Command Line (Manual)

You can run the script manually by mimicking the Bazarr argument format:

```bash
python main.py \
    episode="/path/to/movie.mkv" \
    episode_name="My Movie" \
    subtitles="/path/to/subs.srt" \
    episode_language="English" \
    subtitles_language="English"
```
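
For reference, the `key="value"` pairs above arrive as ordinary positional arguments; a minimal, hypothetical way to read them (the real `main.py` may parse them differently) looks like this:

```python
# Hypothetical parsing of the key="value" argument style shown above.
# The shell strips the quotes, so each argv entry looks like key=value.
import sys

args = dict(arg.split("=", 1) for arg in sys.argv[1:])

media_path    = args.get("episode")             # /path/to/movie.mkv
subtitle_path = args.get("subtitles")           # /path/to/subs.srt
language      = args.get("subtitles_language")  # English
```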
### Integration with Bazarr

> [!CAUTION]
> This integration has not been tested yet.

This tool is designed to be used as a "Custom Script" in Bazarr.

1. Go to **Bazarr > Settings > Subtitles > Post-Processing**.
2. Enable **"Execute a custom script"**.
3. **Command:**
   ```bash
   python /path/to/script/main.py
   ```
4. **Arguments:**
   ```text
   episode="{{episode}}" episode_name="{{episode_name}}" subtitles="{{subtitles}}" episode_language="{{episode_language}}" subtitles_language="{{subtitles_language}}"
   ```

*(Note: Bazarr passes these variables automatically.)*