argmaxinc / WhisperKit

On-device Inference of Whisper Speech Recognition Models for Apple Silicon

Home Page: https://takeargmax.com/blog/whisperkit

Initializing models... Error: unknown error

milsun opened this issue

When using the Swift CLI example, I followed the exact commands mentioned and got the error above:

Chip: Apple M3 Max
OS: 14.3.1
Xcode: Version 15.2
Apple Swift version: 5.9.2

Hi @milsun, can you share the full logs from the terminal?

(base) volleedemo@Vollees-MacBook-Pro vollee-demo % git clone https://github.com/argmaxinc/whisperkit.git
cd whisperkit
Cloning into 'whisperkit'...
remote: Enumerating objects: 367, done.
remote: Counting objects: 100% (189/189), done.
remote: Compressing objects: 100% (104/104), done.
remote: Total 367 (delta 88), reused 115 (delta 55), pack-reused 178
Receiving objects: 100% (367/367), 901.56 KiB | 6.39 MiB/s, done.
Resolving deltas: 100% (135/135), done.
(base) volleedemo@Vollees-MacBook-Pro whisperkit % make setup
make download-model MODEL=large-v3
Setting up environment...
/Users/volleedemo/miniconda3/bin/pip3
/Users/volleedemo/miniconda3/bin/python3
Requirement already satisfied: huggingface_hub in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (0.21.2)
Requirement already satisfied: filelock in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from huggingface_hub) (3.13.1)
Requirement already satisfied: fsspec>=2023.5.0 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from huggingface_hub) (2024.2.0)
Requirement already satisfied: requests in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from huggingface_hub) (2.31.0)
Requirement already satisfied: tqdm>=4.42.1 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from huggingface_hub) (4.65.0)
Requirement already satisfied: pyyaml>=5.1 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from huggingface_hub) (6.0.1)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from huggingface_hub) (4.10.0)
Requirement already satisfied: packaging>=20.9 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from huggingface_hub) (23.1)
Requirement already satisfied: charset-normalizer<4,>=2 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from requests->huggingface_hub) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from requests->huggingface_hub) (3.4)
Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from requests->huggingface_hub) (1.26.18)
Requirement already satisfied: certifi>=2017.4.17 in /Users/volleedemo/miniconda3/lib/python3.11/site-packages (from requests->huggingface_hub) (2023.11.17)
Checking for Homebrew...
Homebrew is installed.
Checking for git-lfs...
git-lfs is installed.
Done 🚀
Downloading model large-v3...
Setting up repository...
Repository not found, initializing...
Cloning into './Models/whisperkit-coreml'...
remote: Enumerating objects: 483, done.
remote: Counting objects: 100% (17/17), done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 483 (delta 0), reused 0 (delta 0), pack-reused 466
Receiving objects: 100% (483/483), 1.25 MiB | 1.47 MiB/s, done.
Resolving deltas: 100% (138/138), done.
Fetching model large-v3...
(base) volleedemo@Vollees-MacBook-Pro whisperkit %
(base) volleedemo@Vollees-MacBook-Pro whisperkit %
(base) volleedemo@Vollees-MacBook-Pro whisperkit %
(base) volleedemo@Vollees-MacBook-Pro whisperkit % swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" --stream
Fetching https://github.com/apple/swift-argument-parser.git from cache
Fetching https://github.com/huggingface/swift-transformers.git from cache
Fetched https://github.com/apple/swift-argument-parser.git (1.27s)
Fetched https://github.com/huggingface/swift-transformers.git (1.28s)
Computing version for https://github.com/apple/swift-argument-parser.git
Computed https://github.com/apple/swift-argument-parser.git at 1.3.0 (0.23s)
Computing version for https://github.com/huggingface/swift-transformers.git
Computed https://github.com/huggingface/swift-transformers.git at 0.1.2 (0.26s)
Computed https://github.com/apple/swift-argument-parser.git at 1.3.0 (0.00s)
Computed https://github.com/huggingface/swift-transformers.git at 0.1.2 (0.00s)
Creating working copy for https://github.com/apple/swift-argument-parser.git
Working copy of https://github.com/apple/swift-argument-parser.git resolved at 1.3.0
Creating working copy for https://github.com/huggingface/swift-transformers.git
Working copy of https://github.com/huggingface/swift-transformers.git resolved at 0.1.2
Building for debugging...
/Users/volleedemo/Desktop/vollee-demo/whisperkit/Sources/WhisperKit/Core/TokenSampler.swift:47:29: warning: conformance of 'Float16' to 'BNNSScalar' is unavailable in macOS; this is an error in Swift 6
scalarType: FloatType.self, // FIXME: Float16 here breaks in swift 6
^
Accelerate.Float16:4:11: note: conformance of 'Float16' to 'BNNSScalar' has been explicitly marked unavailable here
extension Float16 : BNNSScalar {
^
/Users/volleedemo/Desktop/vollee-demo/whisperkit/Sources/WhisperKit/Core/TokenSampler.swift:56:33: warning: conformance of 'Float16' to 'BNNSScalar' is unavailable in macOS; this is an error in Swift 6
scalarType: FloatType.self,
^
Accelerate.Float16:4:11: note: conformance of 'Float16' to 'BNNSScalar' has been explicitly marked unavailable here
extension Float16 : BNNSScalar {
^
[96/96] Linking transcribe
Build complete! (8.63s)
Initializing models...
Error: unknown error

I tried all available tags too and got the same error.

Thanks @milsun, could you also share your HW details? Is this an Apple Silicon Mac?

@milsun One more thing that would help us would be adding the --verbose flag:

swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" --stream --verbose

And paste the output of that here.
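To make the verbose output easier to share, one option is to capture it to a file and then pull out the last WhisperKit stage that was logged before the error. This is only a minimal sketch, assuming a POSIX shell; `last_stage` and `transcribe.log` are hypothetical names and not part of WhisperKit:

```shell
#!/bin/sh
# Hypothetical helper (not part of WhisperKit): print the last "[WhisperKit]"
# line found in a log file. That line is usually the stage that was running
# when the error surfaced.
last_stage() {
    grep -F '[WhisperKit]' "$1" | tail -n 1
}

# Example capture (not run here): merge stderr into stdout and keep a copy.
# swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" --stream --verbose 2>&1 | tee transcribe.log
# last_stage transcribe.log
```

If the last stage printed is a "Loading ..." line with no matching "Loaded ..." line, that component is the likely place to look.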

swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" --stream --verbose

Fetching https://github.com/huggingface/swift-transformers.git from cache
Fetching https://github.com/apple/swift-argument-parser.git from cache
Fetched https://github.com/apple/swift-argument-parser.git (1.19s)
Fetched https://github.com/huggingface/swift-transformers.git (1.19s)
Computing version for https://github.com/apple/swift-argument-parser.git
Computed https://github.com/apple/swift-argument-parser.git at 1.3.0 (0.03s)
Computing version for https://github.com/huggingface/swift-transformers.git
Computed https://github.com/huggingface/swift-transformers.git at 0.1.2 (0.02s)
Computed https://github.com/apple/swift-argument-parser.git at 1.3.0 (0.00s)
Computed https://github.com/huggingface/swift-transformers.git at 0.1.2 (0.00s)
Creating working copy for https://github.com/apple/swift-argument-parser.git
Working copy of https://github.com/apple/swift-argument-parser.git resolved at 1.3.0
Creating working copy for https://github.com/huggingface/swift-transformers.git
Working copy of https://github.com/huggingface/swift-transformers.git resolved at 0.1.2
Building for debugging...
/Users/volleedemo/Desktop/vollee-demo/whisperkit/Sources/WhisperKit/Core/TokenSampler.swift:47:29: warning: conformance of 'Float16' to 'BNNSScalar' is unavailable in macOS; this is an error in Swift 6
scalarType: FloatType.self, // FIXME: Float16 here breaks in swift 6
^
Accelerate.Float16:4:11: note: conformance of 'Float16' to 'BNNSScalar' has been explicitly marked unavailable here
extension Float16 : BNNSScalar {
^
/Users/volleedemo/Desktop/vollee-demo/whisperkit/Sources/WhisperKit/Core/TokenSampler.swift:56:33: warning: conformance of 'Float16' to 'BNNSScalar' is unavailable in macOS; this is an error in Swift 6
scalarType: FloatType.self,
^
Accelerate.Float16:4:11: note: conformance of 'Float16' to 'BNNSScalar' has been explicitly marked unavailable here
extension Float16 : BNNSScalar {
^
[96/96] Linking transcribe
Build complete! (8.42s)
Initializing models...
[WhisperKit] Running on arm64
[WhisperKit] Loading models...
[WhisperKit] Loading models from /Users/volleedemo/Desktop/vollee-demo/whisperkit/Models/whisperkit-coreml/openai_whisper-large-v3 with prewarmMode: false
[WhisperKit] Loading feature extractor
[WhisperKit] Loaded feature extractor
[WhisperKit] Loading audio encoder
[WhisperKit] Loaded audio encoder
[WhisperKit] Loading text decoder
[WhisperKit] Loaded text decoder
[WhisperKit] Loading tokenizer for large-v3
Error: unknown error

Thanks @milsun, could you also share your HW details? Is this an Apple Silicon Mac?

My first comment has all the HW details.

Can you share the specific versions for which it works? I need to get this in place for a project I have been working on. Thanks!

Thanks for adding the HW info in your earlier comment. M1 and newer (all Apple Silicon Macs) should definitely work for large-v3.

@milsun I tested this command on M3 Max + macOS 14.4 which is the closest configuration I have to yours and it worked for me.

We think you might have been affected by an issue in the step that pulls the tokenizer assets from the HuggingFace Hub, which had an outage around the time you filed this issue. Could you please try again and report back whether the problem persists?
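A quick way to rule out a network or Hub availability problem is to request the headers of a tokenizer file from the terminal. The sketch below is an assumption-laden illustration: `hub_hint` and `hub_status` are hypothetical helper names, and the URL in the example is a guess at where the tokenizer files live, not something confirmed in this thread:

```shell
#!/bin/sh
# Hypothetical helper: turn an HTTP status code into a short hint.
hub_hint() {
    case "$1" in
        2??|3??) echo "reachable" ;;
        000)     echo "no response (network, DNS, or proxy problem)" ;;
        5??)     echo "server error (possible outage)" ;;
        *)       echo "unexpected status $1" ;;
    esac
}

# Hypothetical helper: request only the headers of a URL and print the
# HTTP status code. curl prints 000 when no HTTP response was received.
hub_status() {
    curl -s -o /dev/null -I -w '%{http_code}' --max-time 10 "$1"
}

# Example (not run here); the URL is an assumption, adjust it as needed:
# hub_hint "$(hub_status https://huggingface.co/openai/whisper-large-v3/resolve/main/tokenizer.json)"
```

A "reachable" result does not prove that tokenizer loading will succeed, but a "no response" or "server error" result would point away from WhisperKit itself.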

Did a fresh clone and install, still the same issue.

Since I am unable to reproduce even when using similar hardware and OS versions, let's try a few different commands on your side to localize the issue. Could you try changing the model version from large-v3 to large-v2 and tiny respectively and report back?
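The model comparison can be scripted so that each run leaves its own log. This is a rough sketch that reuses the `make download-model` and `swift run transcribe` commands that appear in this thread; `model_path` and `try_model` are hypothetical helper names, and the per-model log file name is an assumption:

```shell
#!/bin/sh
# Hypothetical helper: map a model name to the folder layout used in this thread.
model_path() {
    echo "Models/whisperkit-coreml/openai_whisper-$1"
}

# Hypothetical helper: download one model, run the CLI on the bundled test
# clip, and save stdout and stderr to a per-model log file.
try_model() {
    make download-model MODEL="$1" &&
    swift run transcribe --model-path "$(model_path "$1")" \
        --audio-path ./Tests/WhisperKitTests/Resources/es_test_clip.wav \
        --verbose > "transcribe-$1.log" 2>&1
}

# Example (not run here), from the root of the cloned repository:
# for m in tiny large-v2 large-v3; do try_model "$m" || echo "$m failed"; done
```

Keeping one log per model makes it easier to see whether every variant stops at the same stage.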

@atiorh I have the same issue as milsun.

Hardware:

M1 MacBook Pro, 8 GB
macOS 14.1.1

Setup:

1. Cloned the repo
2. Ran make to set up the project and download the models

Logs

  1. tiny
swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-tiny" --audio-path ./Tests/WhisperKitTests/Resources/es_test_clip.wav --verbose
Building for debugging...
[1/1] Write swift-version--58304C5D6DBC2206.txt
Build complete! (0.26s)
Transcribing audio at Tests/WhisperKitTests/Resources/es_test_clip.wav -- file:///Users/nagasaiabhinaydevarinti/Documents/WhisperKit/
Initializing models...
[WhisperKit] Running on arm64
[WhisperKit] Loading models...
[WhisperKit] Loading models from /Users/nagasaiabhinaydevarinti/Documents/WhisperKit/Models/whisperkit-coreml/openai_whisper-tiny with prewarmMode: false
[WhisperKit] Loading feature extractor
[WhisperKit] Loaded feature extractor
[WhisperKit] Loading audio encoder
[WhisperKit] Loaded audio encoder
[WhisperKit] Loading text decoder
[WhisperKit] Loaded text decoder
[WhisperKit] Loading tokenizer for tiny
Error: unknown error
  2. large-v3
swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" --audio-path ./Tests/WhisperKitTests/Resources/es_test_clip.wav --verbose
Building for debugging...
[1/1] Write swift-version--58304C5D6DBC2206.txt
Build complete! (0.15s)
Transcribing audio at Tests/WhisperKitTests/Resources/es_test_clip.wav -- file:///Users/nagasaiabhinaydevarinti/Documents/WhisperKit/
Initializing models...
[WhisperKit] Running on arm64
[WhisperKit] Loading models...
[WhisperKit] Loading models from /Users/nagasaiabhinaydevarinti/Documents/WhisperKit/Models/whisperkit-coreml/openai_whisper-large-v3 with prewarmMode: false
[WhisperKit] Loading feature extractor
[WhisperKit] Loaded feature extractor
[WhisperKit] Loading audio encoder
[WhisperKit] Loaded audio encoder
[WhisperKit] Loading text decoder
[WhisperKit] Loaded text decoder
[WhisperKit] Loading tokenizer for large-v3
Error: unknown error
  3. large-v2
swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v2" --audio-path ./Tests/WhisperKitTests/Resources/es_test_clip.wav --verbose
Building for debugging...
[1/1] Write swift-version--58304C5D6DBC2206.txt
Build complete! (0.24s)
Transcribing audio at Tests/WhisperKitTests/Resources/es_test_clip.wav -- file:///Users/nagasaiabhinaydevarinti/Documents/WhisperKit/
Initializing models...
[WhisperKit] Running on arm64
[WhisperKit] Loading models...
[WhisperKit] Loading models from /Users/nagasaiabhinaydevarinti/Documents/WhisperKit/Models/whisperkit-coreml/openai_whisper-large-v2 with prewarmMode: false
[WhisperKit] Loading feature extractor
[WhisperKit] Loaded feature extractor
[WhisperKit] Loading audio encoder
[WhisperKit] Loaded audio encoder
[WhisperKit] Loading text decoder
[WhisperKit] Loaded text decoder
[WhisperKit] Loading tokenizer for large-v2
Error: unknown error

Also, FYI, I ran make test --filter UnitTests to see if tokenizer loading would fail, but all tests ran successfully.

Update:

  • After running swift test --filter UnitTests, transcribing with the tiny model now works: swift run transcribe --model-path "Models/whisperkit-coreml/openai_whisper-tiny" --audio-path ./Tests/WhisperKitTests/Resources/es_test_clip.wav --verbose
  • large-v2 fails with the same logs as above.
  • large-v3 works too. However, I ran swift test --filter FunctionalTests and killed it after some time. I assume testInitLarge() was executed, and only after that did large-v3 work.

I'll see if I can reproduce this with a fresh install.