salesforce / CodeGen

CodeGen is a family of open-source models for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.

How to train on custom data — continued

hg0428 opened this issue

I am completely new to AI, and I would like to know how I can train CodeGen to recognize a new language.
I have no idea what to do, and I can find no docs online about this. I have attempted to train it using the Trainer from transformers, but I keep running into errors. Can I have a code example for this?
I have a dict of expected inputs to expected outputs. Should the dict be input:input+output or input:output? I would expect it to be the former.
I have 0 GPUs and I have no idea how to use Jaxformer.

I would like some code example of training or something. I am trying to teach it a new language and teach it other new things. Any examples? Anything I need to know?

Please help me.

We are working on a tutorial on how to fine-tune the models and will update the documentation once it is completed.

Conceptually, you will have to:
(1) pre-process the data (https://github.com/salesforce/jaxformer/tree/main/preprocess)
(2) fine-tune the model on either TPU or GPU (https://github.com/salesforce/jaxformer#fine-tuning)
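
On the question of whether the data should be input:output or input:input+output: for a causal language model such as CodeGen, the common convention is the latter, i.e. the prompt and the expected completion are concatenated into one token sequence. Below is a minimal sketch of that pre-processing step, not the jaxformer pipeline itself. The names `toy_tokenize`, `build_example`, and the sample pairs are hypothetical; in practice you would swap in the real CodeGen tokenizer. Masking prompt tokens with -100 is an optional choice that relies on the default `ignore_index` of PyTorch's cross-entropy loss.

```python
from typing import Callable, Dict, List

# PyTorch's CrossEntropyLoss ignores targets equal to -100 by default,
# so prompt positions labeled this way do not contribute to the loss.
IGNORE_INDEX = -100


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer: one id per character.

    Replace with a real tokenizer when training an actual model.
    """
    return [ord(ch) for ch in text]


def build_example(
    prompt: str,
    completion: str,
    tokenize: Callable[[str], List[int]] = toy_tokenize,
    eos_id: int = 0,  # assumed end-of-sequence id for this sketch
) -> Dict[str, List[int]]:
    """Turn one (input, output) pair into a causal-LM training example."""
    prompt_ids = tokenize(prompt)
    completion_ids = tokenize(completion) + [eos_id]

    # The model sees input + output as a single sequence.
    input_ids = prompt_ids + completion_ids

    # Only the completion (and the end marker) is scored by the loss.
    labels = [IGNORE_INDEX] * len(prompt_ids) + completion_ids

    return {"input_ids": input_ids, "labels": labels}


# Hypothetical input -> output pairs for a made-up language.
pairs = {
    "# add two numbers\n": "fn add(a, b) => a + b\n",
    "# print hello\n": "say 'hello'\n",
}

examples = [build_example(p, c) for p, c in pairs.items()]
```

Each entry in `examples` has `input_ids` and `labels` of equal length, which is the shape the Hugging Face causal language model classes expect (they shift the labels internally). If you would rather have the model also learn to reproduce the prompts, set `labels` to a copy of `input_ids` instead of masking.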

In general, it will be difficult to fine-tune these models without any hardware. You may want to consider Google Colab notebooks.

We cannot provide more guidance at this point in time.

Apologies for the unsatisfactory answer.

I do have hardware, just no NVIDIA GPU, and I heard that is the type of GPU that would be required for something like this.

Please notify me once the tutorial is completed.