ASR
It seems that JSUT-5000 alone is good enough, since Whisper is already well pre-trained.
audio_path
output