Reddit – The heart of the internet

Published: November 3, 2025 | Source: szf

Hey everyone! I hope you're having a great day.

I recently compared all the open source whisper-based packages that support long-form transcription. Long-form transcription is basically transcribing audio files that are longer than whisper's input limit, which is 30 seconds. This can be useful if you want to chat with a youtube video or podcast etc.

I compared the following packages:

- OpenAI's official whisper package
- Huggingface Transformers
- Huggingface BetterTransformer (aka Insanely-fast-whisper)
- FasterWhisper
- WhisperX
- Whisper.cpp

I compared them in the following areas:

- Accuracy: using word error rate (WER) and character error rate (CER)
- Efficiency: using VRAM usage and latency

I've written a detailed blog post about this. If you just want the results, here they are:

For all metrics, lower is better.

If you have any comments or questions, please leave them below.
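For anyone unfamiliar with the accuracy metrics above, here is a minimal sketch of how WER and CER are commonly defined: the edit distance between the reference transcript and the model output, divided by the reference length (in words for WER, in characters for CER). This is not the code used for the comparison; the function names are made up for illustration, and it skips the text normalization (casing, punctuation) that a real evaluation would likely apply first.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference item missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra item in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    reference = "the quick brown fox"
    hypothesis = "the quick brown box"
    print(f"WER: {wer(reference, hypothesis):.3f}")
    print(f"CER: {cer(reference, hypothesis):.3f}")
```

Both metrics count errors relative to the reference, so lower is better, and either one can exceed 1.0 if the output contains many extra words or characters.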

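Keywords: 30 seconds, FasterWhisper, Huggingface BetterTransformer (aka Insanely-fast-whisper), Huggingface Transformers, OpenAI's official whisper package, Whisper.cpp, WhisperX, Whisper package comparison, Whisper input limit, accuracy (WER and CER), open source Whisper packages, efficiency (VRAM usage and latency), long-form transcription, audio files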
