ggerganov / ggml

Tensor library for machine learning

ggerganov/ggml Issues

Zig build is broken
Updated 6 months ago (2 comments)
gguf : opening an invalid file may cause an out-of-bounds access
Closed 6 months ago
[Question] How do I force a computation of a tensor/force a dependency between 2 tensors?
Closed 6 months ago (2 comments)
the gguf module in ggml.c must have an option to write as a stream.
Updated 6 months ago
Compilation error with make on Linux Lite related to AVX.
Closed 6 months ago (1 comment)
...
Closed 6 months ago
Question about converting an auto-gptq model to ggml format
Updated 6 months ago
[2GPU] Memcpy2D of matrixXmatrix -- src size (and form)
Updated 7 months ago (3 comments)
test-conv-transpose fails when building with sanitizers enabled
Closed 7 months ago
Issue with ggml
Updated 7 months ago (1 comment)
Using memcpy on tensor data when the tensor is not contiguous
Updated 7 months ago (1 comment)
How is the data allocated by ggml_allocr_alloc aligned?
Closed 7 months ago (2 comments)
ggml : improve memory allocation for weights and similar lists of tensors
Closed 4 months ago (2 comments)
Build fails on Centos7 with make: *** [all] Error 2
Closed 7 months ago (5 comments)
ggml : create a complete document
Updated 6 months ago (5 comments)
ggml : remove GGML_MAX_NODES limit
Closed 7 months ago (10 comments)
Question about ggml-alloc assert in CPU ggml-backend version of Sam.cpp
Closed 7 months ago (6 comments)
ggml : replace conv stage_0 and stage_1 with im2col and mul_mat
Closed 6 months ago (10 comments)
Conv2D kernel CuBLAS implementation - need feedback
Closed 6 months ago (11 comments)
ggml_allocr_new null pointer exception
Closed 7 months ago
ggml : expose hash table API from ggml.c and reuse in ggml-alloc
Closed 7 months ago (17 comments)
CUDA implementation of ggml_clamp
Closed 8 months ago (3 comments)
[Question] What is the status of Vulkan backend?
Updated 2 months ago (12 comments)
How about adding "tokenizer.ggml.cls_token_id" for special cls token.
Updated 8 months ago
Issue inferencing HuggingFace's GPT-J 4 bits model
Updated 7 months ago (1 comment)
[Feature Request] any hyper-resolution inference
Updated 7 months ago (2 comments)
Is T5 (mLongT5, FlanT5 etc.) being developed with GGML ?
Updated 7 months ago (1 comment)
prompt is too long (539 tokens, max 508)
Updated 8 months ago
Can't some operations run in parallel?
Updated 7 months ago (1 comment)
Merge HF LoRA adapter with a quantized GPT-J model using ggml
Updated 8 months ago
Port k-quants support from ggerganov/llama.cpp to ggerganov/ggml
Updated 8 months ago
Asserting over nb[0] as type check causes issue for tensors after permuting
Updated 8 months ago (1 comment)
Quantized matmul with CUDA sets the result to zero instead of properly computing it
Closed 8 months ago (5 comments)
Why GGML_F16_STEP is 32?
Updated 7 months ago (3 comments)
Why does GPT-J perform better on Graviton without SIMD than on x86 with SIMD?
Updated 8 months ago
Converting an arbitrary HF Transformer GPT2 to ggml format
Updated 8 months ago (1 comment)
No output from executable when building with MSYS2
Updated 8 months ago
Error when using ggml_scale
Closed 8 months ago (4 comments)
Weights converted directly from bfloat16 are 20x slower than those converted from float32
Updated 8 months ago
sam visual studio
Closed 8 months ago
How to access the ggml_tensor data?
Closed 8 months ago (1 comment)
sam bad result
Closed 8 months ago (8 comments)
How to easily convert Meta's MMS-ASR models to ggml?
Updated 8 months ago (1 comment)
ggml : better way to express implicit node dependencies in a graph
Updated 8 months ago
linesearch_backtracking has confusing logic
Closed 8 months ago (1 comment)
Confirmation about the order of tensor dimensions
Closed 8 months ago (5 comments)
mpt-1b fails with mpt_model_load: unknown tensor 'transformer.blocks.0.attn.k_ln.weight' in model file
Updated 8 months ago (4 comments)
[Feature request] Add support/demo implementation for Qwen-VL GGUF model
Updated 7 months ago (2 comments)
Use custom GPT-J checkpoint
Updated 8 months ago (1 comment)
gpt-j, starcoder, gptneox examples cause "not enough space in the context's memory pool" for batches >32
Updated 9 months ago (2 comments)