hpcaitech/EnergonAI
Large-scale model inference.
Stargazers: 629 · Watchers: 23 · Issues: 50 · Forks: 90
hpcaitech/EnergonAI Issues
- How to run multi-node, multi-GPU inference for BLOOM-176B? (Updated 8 months ago, 2 comments)
- The example code is outdated and cannot be run, among other issues (Updated 8 months ago, 7 comments)
- [bug] The InferenceEngine used in the example for distributed inference cannot be imported. (Updated 9 months ago)
- Support GPT BigCode (bigcode/starcoder, bigcode/gpt_bigcode-santacoder, etc.) (Updated 10 months ago, 1 comment)
- OPT-125m problem (Updated 10 months ago)
- Location of logs (Closed 10 months ago)
- Docker cannot find the parent image defined in `docker/Dockerfile` (Updated 10 months ago)
- Does EnergonAI support accelerated inference for Segment Anything? (Updated a year ago)
- Running the OPT inference example: when a client request arrives, the server blocks and cannot return the result (Updated a year ago, 1 comment)
- Failed to load pre-trained model weights for OPT_125M (Closed a year ago, 2 comments)
- Where is the InferenceEngine definition? (Updated a year ago)
- Question about loading a model state_dict across multiple GPUs (Closed a year ago, 2 comments)
- Is there an example of the HTTP client? (Updated a year ago, 1 comment)
- Concrete documentation for this project (Updated a year ago, 1 comment)
- Can't find server.sh (Updated a year ago, 3 comments)
- _pickle.UnpicklingError: invalid load key, '{'. (Updated a year ago, 4 comments)
- Why is the text generated by OPT-30B inference with EnergonAI unreadable? (Closed a year ago, 11 comments)
- CUDA error: no kernel image is available for execution on the device (Updated a year ago, 3 comments)
- OPT inference (Updated a year ago, 2 comments)
- Where is the InferenceEngine definition? (Updated a year ago, 2 comments)
- An error caused by running the OPT example (Updated a year ago, 4 comments)
- Cannot run OPT-125M examples with the latest energonai Docker images (Updated a year ago, 2 comments)
- How to use the dynamic batching feature (Updated a year ago, 1 comment)
- OPT demo TEST (Updated a year ago, 2 comments)
- Failure to install EnergonAI (Updated a year ago, 3 comments)
- Are there any examples of using the offload feature in GPT/BLOOM/OPT inference? (Updated a year ago, 1 comment)
- Cache-miss error when posting an OPT generation request (Updated a year ago, 2 comments)
- Failure to compile energonai with `python setup.py build` (Updated a year ago, 1 comment)
- GPT inference example doesn't run? (Updated a year ago, 1 comment)
- Does not support CUDA 10.2? (Updated a year ago, 1 comment)
- Not compatible with the latest version of transformers? (4.26.1) (Updated a year ago, 2 comments)
- Cannot start the BLOOM server (Updated a year ago, 3 comments)
- Maybe you should add a license for using OneFlow's LayerNorm kernel implementation? (Closed a year ago, 2 comments)
- Failed to load OPT-30B checkpoint (Closed a year ago, 2 comments)
- Support OPT-IML model (Closed a year ago, 1 comment)
- Detected RRef leaks during shutdown, empty pipe, tests_engine failed (Updated a year ago, 1 comment)
- trpc.rpc_sync consumes most of the time (Updated a year ago)
- RuntimeError('FusedLayerNormAffineFunction requires cuda extensions') (Updated a year ago)
- torch.load() hangs indefinitely when reading OPT pre-trained model weights (Updated 2 years ago, 1 comment)
- Need guidelines on converting an OPT-17B checkpoint (Closed 2 years ago)
- Does EnergonAI support GPT models with int8 quantization under model parallelism? (Updated 2 years ago, 1 comment)
- [RFC] Async engine and pipeline based on RPC (Closed 2 years ago, 1 comment)
- num_beams for beam search (Updated 2 years ago, 1 comment)
- Inference with a pre-trained model (Updated 2 years ago, 1 comment)
- Remove hard-coded directory path (Updated 2 years ago)
- Provide a Docker service (Closed 2 years ago)
- OPT inference generation example (Updated 2 years ago, 1 comment)
- Missing energonai_linear_func in setup.py (Updated 2 years ago, 1 comment)
- Connection refused on Docker-exposed port (Updated 2 years ago, 1 comment)
- [Feature]: Automatic Pipeline Parallelism (Updated 2 years ago)