Show HN: Overlapping Speaker Transcription Model
Category: ai-ml
Tags: speech-recognition, transcription, whisper, multi-speaker, ai
Score: 7.0/10 (Innovation: 7, Technical: 7, Documentation: 8, Utility: 7)
Chorus-v1 is a fine-tuned Whisper model that transcribes overlapping two-speaker audio into separate, timestamped transcripts, one per speaker. It runs one forward pass per speaker and needs no separate diarization step. It targets a common pain point in meeting transcription, speaker overlap, by using token conditioning to build speaker separation into the ASR model itself.
Target audience: data engineers, backend devs, devops
Repository: https://huggingface.co/Trelis/Chorus-v1
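A rough sketch of the "one conditioned pass per speaker" idea described above, in Python. This is not the Chorus-v1 API: the speaker token strings, the `decode` callable, the `Segment` type, and the stub decoder are all hypothetical names made up for illustration. Check the repository for the actual tokens and inference code.

```python
from typing import Callable, List, NamedTuple, Tuple


class Segment(NamedTuple):
    speaker: str
    start: float  # seconds
    end: float    # seconds
    text: str


# Hypothetical conditioning tokens; the real model's tokens may differ.
SPEAKER_TOKENS = {"A": "<|speaker_a|>", "B": "<|speaker_b|>"}

# A decoder takes (audio, conditioning token) and returns (start, end, text) tuples.
Decoder = Callable[[bytes, str], List[Tuple[float, float, str]]]


def transcribe_overlapping(audio: bytes, decode: Decoder) -> List[Segment]:
    """Run one conditioned pass per speaker, then merge segments by start time."""
    segments: List[Segment] = []
    for speaker, token in SPEAKER_TOKENS.items():
        # Same audio each time; only the conditioning token changes.
        for start, end, text in decode(audio, token):
            segments.append(Segment(speaker, start, end, text))
    # Overlapping segments are kept as-is; sorting only fixes display order.
    return sorted(segments, key=lambda s: (s.start, s.speaker))


def fake_decode(audio: bytes, token: str) -> List[Tuple[float, float, str]]:
    """Stand-in for a real model call, returning canned output per token."""
    canned = {
        "<|speaker_a|>": [(0.0, 2.0, "Shall we start?"), (3.0, 5.0, "Go ahead.")],
        "<|speaker_b|>": [(1.5, 3.5, "Yes, one moment.")],
    }
    return canned[token]


if __name__ == "__main__":
    for seg in transcribe_overlapping(b"", fake_decode):
        print(f"[{seg.start:.1f}-{seg.end:.1f}] {seg.speaker}: {seg.text}")
```

Because each speaker gets its own pass, overlapping speech shows up as segments whose time ranges intersect rather than as one merged line, which is the property a separate diarization step would otherwise have to recover.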