web_demo.py on Linux is loading indefinitely
mikeleatila opened this issue
Hi @WangRongsheng, I am trying to run your web_demo.py on Linux using:
CUDA_VISIBLE_DEVICES=0 python src/web_demo.py --model_name_or_path mistralai/Mixtral-8x7B-Instruct-v0.1 --checkpoint_dir /mnt347/ddd/test/Aurora/final-checkpoint --finetuning_type lora --quantization_bit 4 --template mistral
I have also set share=True in:
def main():
    demo = create_web_demo()
    demo.queue()
    demo.launch(server_name="0.0.0.0", server_port=7888, share=True, inbrowser=True)
However, when I run the code, it appears to run only in the back end:
01/16/2024 05:01:00 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
01/16/2024 05:01:01 - INFO - llmtuner.model.adapter - Loaded fine-tuned model from checkpoint(s): /mnt347/ddd/test/Aurora/final-checkpoint
01/16/2024 05:01:01 - INFO - llmtuner.model.loader - trainable params: 0 || all params: 46706200576 || trainable%: 0.0000
01/16/2024 05:01:01 - INFO - llmtuner.model.loader - This IS expected that the trainable params is 0 if you are using model for inference only.
01/16/2024 05:01:01 - INFO - llmtuner.data.template - Add pad token:
Running on local URL: http://0.0.0.0:7888
but not in the browser (Chrome): the page at the URL above keeps loading indefinitely (I assume it will eventually time out). Could you please help with this? Many thanks in advance!
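For anyone hitting the same symptom, it can help to tell a backend problem apart from a browser problem by probing the URL outside the browser (e.g. with curl or Python's standard library). The sketch below is self-contained and hypothetical: it spins up a throwaway stdlib HTTP server and probes it; against the real demo you would instead point the request at http://127.0.0.1:7888 (the port used in the launch call above).

```python
# Hypothetical check: confirm an HTTP backend answers, independent of any
# browser cache/cookie state. A throwaway stdlib server stands in for the
# Gradio demo so this snippet runs on its own.
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class _Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence per-request logging
        pass


# Port 0 asks the OS for any free port; with the real demo you would skip
# this server entirely and probe port 7888 directly.
server = HTTPServer(("127.0.0.1", 0), _Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/"
with urllib.request.urlopen(url, timeout=5) as resp:
    status = resp.status

server.shutdown()
print(status)  # 200 → the backend answers; a hang here would point at the server instead
```

If a request like this returns 200 while the browser still spins forever, the server side is fine and the problem is on the client (as turned out to be the case in this issue).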
I will debug it. In the meantime, please try a different machine environment as a test!
Thanks a lot @WangRongsheng. Sure, I have just tested it on another machine with the same settings as above, and the error persists, I'm afraid. Let me know your thoughts.
I managed to get it working after clearing the cache and cookies and restarting the browser. Sorry for the bother, @WangRongsheng, and many thanks again.