Wav2Vec2-BERT STT

Server status: checking…

REST upload (audio ≤ 15 s)

WebSocket live streaming

disconnected

Voice activity detection (VAD) segments speech into utterances. Partials update every ~1 s while you speak. When you pause (or on Stop & finalize), the utterance audio is run through the denoiser and then transcribed to produce the final.
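The partial/final flow described above can be sketched roughly as follows. This is an illustration only, not the server's actual implementation: the energy-threshold `is_speech` check stands in for whatever VAD the server uses, `transcribe` and `denoise` are hypothetical callables supplied by the caller, and the frame counts (50 frames of 20 ms for ~1 s, 25 frames for a pause) are assumed values.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Samples = List[float]          # PCM samples in [-1.0, 1.0]
Event = Tuple[str, str]        # ("partial" | "final", text)


def is_speech(frame: Samples, threshold: float = 0.01) -> bool:
    """Toy energy-based VAD (an assumption; a real VAD is more robust)."""
    if not frame:
        return False
    energy = sum(s * s for s in frame) / len(frame)
    return energy > threshold


@dataclass
class UtteranceSegmenter:
    transcribe: Callable[[Samples], str]   # hypothetical STT callable
    denoise: Callable[[Samples], Samples]  # hypothetical denoiser callable
    frames_per_partial: int = 50           # ~1 s at 20 ms frames (assumed)
    pause_frames: int = 25                 # ~0.5 s of silence ends an utterance (assumed)
    _buf: Samples = field(default_factory=list, init=False)
    _speech_frames: int = field(default=0, init=False)
    _silence_frames: int = field(default=0, init=False)

    def push(self, frame: Samples) -> List[Event]:
        """Feed one audio frame; return any partial/final events it triggers."""
        events: List[Event] = []
        if is_speech(frame):
            self._buf.extend(frame)
            self._silence_frames = 0
            self._speech_frames += 1
            # Partials are transcribed from the raw (not denoised) buffer.
            if self._speech_frames % self.frames_per_partial == 0:
                events.append(("partial", self.transcribe(self._buf)))
        elif self._buf:
            # Silence inside an utterance is counted but not buffered.
            self._silence_frames += 1
            if self._silence_frames >= self.pause_frames:
                events.extend(self.finalize())
        return events

    def finalize(self) -> List[Event]:
        """End the current utterance: denoise, then transcribe (Stop & finalize)."""
        if not self._buf:
            return []
        text = self.transcribe(self.denoise(self._buf))
        self._buf = []
        self._speech_frames = 0
        self._silence_frames = 0
        return [("final", text)]
```

In this sketch a client would call `push` once per incoming frame and forward the returned events to the Partial and Final fields, and call `finalize` when the user stops the stream. Denoising only the final keeps the once-per-second partial path cheap, which is one plausible reason for the split described above.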

Partial:

Final: