OpenNMT / CTranslate2

Fast inference engine for Transformer models

Home Page: https://opennmt.net/CTranslate2

Phi-3 support

Theodotus1243 opened this issue · comments

Powerful model trained on synthetic data, with a high MMLU score.

The 4K-context variant should be the easier one to support, as it doesn't use LongRoPE.

https://huggingface.co/microsoft/Phi-3-mini-4k-instruct
https://arxiv.org/pdf/2404.14219.pdf

I second this. The current phi loader is broken, apparently because of changes Microsoft made to the model after it was initially released. At any rate, adapting the phi loader to the new Phi-3 should be easier than starting from scratch.

For anyone else researching this, phi3 support has been added to the convert_hf_to_gguf.py script in llama.cpp. Perhaps something can be gleaned from there to simplify the implementation of the ct2 converter.
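One thing that can be gleaned from the llama.cpp converter is that a Phi-3 checkpoint stores its attention and MLP projections fused (`qkv_proj`, `gate_up_proj`), whereas Llama-style converters expect them split. Below is a hedged sketch of that splitting step; the function names and toy shapes are illustrative, and the exact slicing order should be verified against the actual checkpoint before reuse:

```python
import numpy as np

def split_fused_qkv(qkv, num_heads, num_kv_heads, head_dim):
    """Split a fused [q; k; v] projection into separate Q, K, V weights.

    Assumes the fused matrix has shape (q_dim + 2 * kv_dim, hidden_size)
    with query rows first, then key rows, then value rows.
    """
    q_dim = num_heads * head_dim
    kv_dim = num_kv_heads * head_dim
    q = qkv[:q_dim]
    k = qkv[q_dim:q_dim + kv_dim]
    v = qkv[q_dim + kv_dim:]
    return q, k, v

def split_gate_up(gate_up):
    """Split a fused MLP gate/up projection into its two halves."""
    half = gate_up.shape[0] // 2
    return gate_up[:half], gate_up[half:]

# Toy shapes: 4 query heads, 2 KV heads, head_dim 8, hidden_size 32.
qkv = np.zeros((4 * 8 + 2 * 2 * 8, 32))
q, k, v = split_fused_qkv(qkv, num_heads=4, num_kv_heads=2, head_dim=8)
print(q.shape, k.shape, v.shape)  # (32, 32) (16, 32) (16, 32)
```

The renaming of the split tensors to the names an existing Llama converter expects is the other half of the work, but that is mostly mechanical.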

No worries, it will be done. It's quite easy for the mini-4k since it uses the Llama 2 architecture throughout.
FYI: https://forum.opennmt.net/t/phi-3-3-8b-llama2-7b-ensemble-just-for-fun/5729
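A quick way to sanity-check the "it's basically Llama 2" claim is to read the model's `config.json` and derive the numbers a converter cares about. The field names below are standard Llama-style config keys; the values match what the Phi-3-mini-4k model card reports, but double-check them against the actual file:

```python
import json

# Llama-2-style config.json fields (values as reported for
# Phi-3-mini-4k -- verify against the real file before relying on them).
config = {
    "hidden_size": 3072,
    "intermediate_size": 8192,
    "num_hidden_layers": 32,
    "num_attention_heads": 32,
    "num_key_value_heads": 32,
    "hidden_act": "silu",
    "max_position_embeddings": 4096,
}

def summarize(cfg):
    """Derive the quantities a converter needs from a config dict."""
    head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
    return {
        "layers": cfg["num_hidden_layers"],
        "heads": cfg["num_attention_heads"],
        "head_dim": head_dim,
        "activation": cfg["hidden_act"],
        "context_length": cfg["max_position_embeddings"],
    }

print(json.dumps(summarize(config), indent=2))
```

If these fields all map cleanly onto what the existing Llama 2 converter expects, the remaining work is mostly tensor naming and layout.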

Is it done yet? I've been waiting patiently for approximately two hours now? ;-)

Hello, I am working on it. Some unexpected problems have come up.

I'm not skilled enough to help directly by implementing the code, but if you want me to do any grunt work or research, let me know, dude. Anything to help speed up the process. Thanks!

I'd like to start learning so I can eventually help. Question: how do I get the actual model architecture to start with? My understanding is that learning a model's structure, which activation functions it uses, and so on, is key to writing additional converters down the road. For example, here's a link:

https://bbycroft.net/llm

Here are some other links I've been gathering, with the goal of eventually contributing a converter, based on first trying to understand the structure of LLMs:

https://github.com/mert-kurttutan/torchview

https://github.com/lutzroeder/netron

Hugging Face model pages sometimes (but not always) include this kind of architecture information.

Basically, is there any good starting point you'd recommend, dude? Thanks!

Remember, you're dealing with an idiot who doesn't do this for a profession and has never taken an LLM 101 class in college, let alone earned a doctoral degree. ;-) I don't even know what "mlp.down" or "layernorm.weight" means, for example, but I'm willing to learn.
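One concrete starting point for seeing names like `mlp.down_proj` or `layernorm.weight`: a `.safetensors` checkpoint begins with an 8-byte little-endian header length followed by a JSON header describing every tensor, so you can list the architecture without loading any weights. A sketch (the tensor names and shapes in the demo are illustrative, not copied from a real checkpoint):

```python
import io
import json
import struct

def list_tensor_shapes(f):
    """List tensor names and shapes from a .safetensors stream.

    Reads only the JSON header at the start of the file; the weight
    data itself is never touched.
    """
    header_len = struct.unpack("<Q", f.read(8))[0]
    header = json.loads(f.read(header_len))
    return {name: info["shape"]
            for name, info in header.items()
            if name != "__metadata__"}

# Build a tiny in-memory file with the same layout to demonstrate
# (names/shapes are made up for illustration).
header = {
    "model.layers.0.self_attn.qkv_proj.weight":
        {"dtype": "F32", "shape": [9216, 3072], "data_offsets": [0, 0]},
    "model.layers.0.mlp.down_proj.weight":
        {"dtype": "F32", "shape": [3072, 8192], "data_offsets": [0, 0]},
}
blob = json.dumps(header).encode()
shapes = list_tensor_shapes(io.BytesIO(struct.pack("<Q", len(blob)) + blob))
for name, shape in shapes.items():
    print(name, shape)
```

Running this against a downloaded `model.safetensors` (open the path with `open(path, "rb")`) prints the full layer-by-layer structure, which is usually enough to start comparing two architectures.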

PR #1680 to add the converter for Phi3