Show HN: I Added Support for Qwen3-ASR and Qwen3 ForcedAligner in WhisperX

Category: ai-ml

Tags: speech-recognition, forced-alignment, transcription, ai, python

Score: 7.0/10 (Innovation: 5, Technical: 7, Documentation: 8, Utility: 8)

WhisperX is a batched inference engine for OpenAI's Whisper ASR model that adds accurate word-level timestamps via forced phoneme alignment, plus speaker diarization. It makes Whisper more practical for transcription work: the project reports batched inference at up to 70x realtime, the word-level timestamps are precise enough for subtitling, and diarization labels who is speaking, all of which matter in production applications. The linked pull request adds support for Qwen3-ASR and Qwen3 ForcedAligner.
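
To make the subtitling use case concrete, here is a minimal sketch of turning word-level timestamps into SRT cues. It uses only the Python standard library and does not call WhisperX. The input shape (a list of dicts with "word", "start" and "end" in seconds) and the helper names format_srt_time and words_to_srt are assumptions made for illustration, not the project's API.

```python
# Hypothetical word-level output: each word carries start/end times in seconds.
# The field names ("word", "start", "end") are assumptions for illustration.

def format_srt_time(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group consecutive words into cues of at most max_words and render SRT."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        # A cue spans from the first word's start to the last word's end.
        start = format_srt_time(chunk[0]["start"])
        end = format_srt_time(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


words = [
    {"word": "Hello", "start": 0.0, "end": 0.42},
    {"word": "world", "start": 0.5, "end": 0.9},
    {"word": "again", "start": 1.25, "end": 1.7},
]
srt = words_to_srt(words, max_words=2)
print(srt)
```

Grouping by a fixed word count keeps the sketch short; a real subtitle pipeline would more likely break cues on pauses, punctuation, or speaker changes.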

Target audience: data engineers, AI engineers, DevOps engineers, backend developers

Repository: https://github.com/m-bain/whisperX/pull/1401 · Python · BSD-2-Clause · 21289 stars
