whisper-large-v3-ivrit-coreml
CoreML encoder (.mlmodelc) conversion of ivrit-ai/whisper-large-v3 for use with whisper.cpp on Apple Silicon. The encoder runs on the Apple Neural Engine (ANE) when present, freeing Metal/CPU for the decoder.
Usage with whisper.cpp
- Download ivrit-ai-whisper-large-v3-encoder.mlmodelc.zip and the original GGML weights (.bin) from ivrit-ai/whisper-large-v3-ggml.
- Place them side-by-side, with the .mlmodelc directory named to match the .bin filename (replacing .bin with -encoder.mlmodelc):
~/Your-Models-Dir/
├── ivrit-ai-whisper-large-v3.bin
└── ivrit-ai-whisper-large-v3-encoder.mlmodelc/ # ← extracted from this repo's zip
- whisper.cpp v1.8.4+ (built with WHISPER_USE_COREML=1, which is the default in the official xcframework) auto-detects the sibling .mlmodelc on whisper_init_from_file_with_params and routes encoder forward passes to CoreML.
On first load, CoreML compiles the model for the current device (~10–15 s); every subsequent load is instant.
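The naming rule from the steps above can be checked before launching whisper.cpp. The following is a minimal sketch in Python, not part of whisper.cpp itself: the helper names expected_encoder_path and layout_is_ready are hypothetical, and the sketch covers only the simple rule stated above (replace .bin with -encoder.mlmodelc in the same directory).

```python
from pathlib import Path


def expected_encoder_path(ggml_path):
    """Return the sibling CoreML encoder path for a GGML .bin file.

    Hypothetical helper: it applies only the naming rule described above,
    "model.bin" -> "model-encoder.mlmodelc" in the same directory.
    """
    ggml_path = Path(ggml_path)
    if ggml_path.suffix != ".bin":
        raise ValueError(f"expected a .bin file, got: {ggml_path.name}")
    return ggml_path.with_name(ggml_path.stem + "-encoder.mlmodelc")


def layout_is_ready(ggml_path):
    """True if the .bin file exists and the encoder directory sits next to it."""
    ggml_path = Path(ggml_path)
    return ggml_path.is_file() and expected_encoder_path(ggml_path).is_dir()


if __name__ == "__main__":
    model = Path.home() / "Your-Models-Dir" / "ivrit-ai-whisper-large-v3.bin"
    print("expected encoder:", expected_encoder_path(model))
    print("layout ready:", layout_is_ready(model))
```

If layout_is_ready returns False, the usual causes are a zip that was extracted into a nested folder or a .mlmodelc directory whose name does not match the .bin stem.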
Conversion details
- Generated via whisper.cpp's models/convert-h5-to-coreml.py --model-name large-v3 --encoder-only True --optimize-ane True against the ivrit-ai HuggingFace checkpoint.
- 100% of encoder weights came from the ivrit-ai fine-tune (no fallback to OpenAI's base large-v3).
- Validated against the HuggingFace reference encoder on random log-mel inputs: median per-position diff ≈ 0.02, p99 diff ≈ 0.31, within FP16 quantization noise.
- Input shape: (1, 128, 3000) (whisper large-v3 standard).
- Output shape: (1, 1500, 1280).
- Compute precision: Mixed FP16/FP32/Int32, ANE-optimized.
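The validation statistics above (median and p99 of the per-position difference) can be computed along these lines. This is a sketch under stated assumptions, not the script used for this conversion: the function name diff_stats is hypothetical, the inputs are assumed to be the two encoder outputs already flattened to equal-length sequences of floats, and p99 uses the nearest-rank definition, which may differ slightly from an interpolating percentile.

```python
import statistics


def diff_stats(reference, candidate):
    """Return (median, p99) of the absolute element-wise difference.

    Both arguments are flat sequences of floats of equal length, for example
    a flattened (1, 1500, 1280) encoder output from each implementation.
    """
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    if len(reference) == 0:
        raise ValueError("inputs must not be empty")

    diffs = sorted(abs(r - c) for r, c in zip(reference, candidate))
    median = statistics.median(diffs)

    # Nearest-rank p99: smallest value with at least 99% of the data at or
    # below it. Integer arithmetic avoids floating-point rounding in the rank.
    rank = -(-99 * len(diffs) // 100)  # ceil(0.99 * n)
    p99 = diffs[max(rank, 1) - 1]
    return median, p99


if __name__ == "__main__":
    # Toy data standing in for real encoder outputs.
    ref = [0.0] * 100
    out = [i / 100 for i in range(100)]
    print(diff_stats(ref, out))
```

For full-size tensors an array library would be faster, but the statistics are the same: flatten both outputs, take absolute differences, and read off the median and the 99th percentile.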
License
Apache 2.0, same as the upstream ivrit-ai/whisper-large-v3 base model. The base model is itself a fine-tune of openai/whisper-large-v3 (MIT).
This conversion is a derivative work. Please retain the attribution chain when redistributing.
Acknowledgements
- ivrit-ai for the Hebrew fine-tuning of Whisper large-v3.
- OpenAI for the original Whisper architecture and weights.
- ggerganov / whisper.cpp for the conversion tooling.