rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models

Home Page: https://docs.rs/llm/latest/llm/

WeightedIndex error: invalid weight

andri-jpg opened this issue

When trying to run a Pythia model using gptneox, I got this error. For context, I'm using Termux on Android with Rust installed to run the model.

$ cargo run --release -- gptneox infer -m pythia-160m-q4_0.bin -p "Tell me how cool the Rust programming language is:"
    Finished release [optimized] target(s) in 2.18s
     Running target/release/llm gptneox infer -m pythia-160m-q4_0.bin -p 'Tell me how cool the Rust programming language is:'
✓ Loaded 148 tensors (92.2 MB) after 293ms
<|padding|>Tell me how cool the Rust programming language is:
The application panicked (crashed).
Message:  WeightedIndex error: InvalidWeight
Location: crates/llm-base/src/samplers.rs:157

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
Run with RUST_BACKTRACE=full to include source snippets.
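For reference, the panic itself comes from rand's WeightedIndex, which refuses to build a sampling distribution if any weight is negative or NaN. A minimal sketch of that failure mode (assuming rand 0.8; this is not llm's actual sampler code):

```rust
use rand::distributions::WeightedIndex;

fn main() {
    // A single NaN among the softmaxed logits poisons the whole distribution.
    let weights = vec![0.5_f32, f32::NAN, 0.2];
    match WeightedIndex::new(&weights) {
        Ok(_) => println!("distribution built"),
        // rand rejects NaN because `!(NaN >= 0.0)` is true; this is the
        // `InvalidWeight` seen in the panic message above.
        Err(e) => println!("WeightedIndex error: {e:?}"),
    }
}
```

So NaN (or negative) weights derived from the model's logits would surface as exactly this panic.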

Interesting, can you link the exact model you used, and the model of phone you have? I suspect this is more likely an issue with the execution on the phone (which will be a more complicated issue to diagnose), but we should rule out any issues with the model on a PC first.

Here is the link to the model: https://huggingface.co/rustformers/pythia-ggml/blob/main/pythia-160m-q4_0.bin

FYI, this model runs fine on my PC. On my phone, though, I find that bloom and llama run smoothly.
My device is a Poco M3 Pro 5G with 4 GB of RAM.
[Screenshot of the Termux session]

Ok, I've done some more testing - this model "works" (produces a lot of garbage) on x86-64 Windows, but doesn't work on macOS ARM64.

I think this is an ARM64 issue, or at least it's more obviously broken on ARM64. We'll need to test with upstream GGML GPT-NeoX support to see if this is an issue with GGML or with our implementation.

Yeah, I think so too. Maybe only some models can run on the ARM64 architecture. llama.cpp (officially supported on Android, according to its documentation), alpaca, or vicuna should work fine on Android. When I saw that GPT-J was available in rustformers, I became interested in performing inference with Rust on Android, since llama.cpp previously only supported the larger models. I haven't tested the GPT-J family of models yet, because at the time they could only be run through the Transformers Python library, which requires Torch. Note that Torch cannot be installed in Termux, and the same applies to NumPy.

I got this error a few times while implementing Metal support (#311); there, it happened when a graph was not fully computed or was otherwise misconfigured (leading to garbage output). That was also on ARM64 (M1). So it's either something wrong with graph construction or some ARM64-specific race condition?

Edit: it could also just be the context running out of memory, or would that always lead to an error?
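One cheap way to narrow this down (a sketch, not part of llm's API; `validate_logits` is a hypothetical helper): check the logits for non-finite values right before sampling. If the graph is being miscomputed on ARM64, NaN/Inf should show up here with a much clearer message than the WeightedIndex panic.

```rust
/// Hypothetical pre-sampling guard: reject non-finite logits so a
/// miscomputed graph fails loudly instead of panicking in the sampler.
fn validate_logits(logits: &[f32]) -> Result<(), String> {
    for (i, &l) in logits.iter().enumerate() {
        if !l.is_finite() {
            return Err(format!("logit {i} is not finite: {l}"));
        }
    }
    Ok(())
}

fn main() {
    // Example: one NaN among otherwise valid logits.
    let logits = vec![1.0_f32, f32::NAN, -0.5];
    if let Err(e) = validate_logits(&logits) {
        eprintln!("bad logits: {e}");
    }
}
```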

By the way, what model do you use on ARM64? Which rustformers models are supported on ARM besides llama and bloom?