Pipeline uploading full model after first run
paulcjh opened this issue
Once a pipeline that contains a model has been run, the full initialised model is uploaded. This is particularly inconvenient for Hugging Face transformer models.
Example:

```python
with Pipeline("HF pipeline") as builder:
    input_str = Variable(str, is_input=True)
    builder.add_variables(input_str)

    hf_model = TransformersModelForCausalLM(
        model_path="EleutherAI/gpt-neo-125M",
        tokenizer_path="EleutherAI/gpt-neo-125M",
    )
    output_str = hf_model.predict(input_str)
    builder.output(output_str)

output_pipeline = Pipeline.get_pipeline("HF pipeline")

# Run the model
output_pipeline.run("Hello there")

print("Now uploading GPTNeo pipeline")

# The next line will upload the full 1GB model
uploaded_pipeline = api.upload_pipeline(output_pipeline)
```
Rather than uploading whatever object is passed to `upload_pipeline`, we could check whether it is an instance or a class. AFAIK only the class is needed rather than the instance: the builder knows the `__init__` arguments, so it can take the class, re-initialise it on the server, and then call the load/predict functions. (This does assume the user calls no other instance methods that mutate `self`, e.g. `hf_model.some_other_method()`.)
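To make the idea concrete, here is a minimal sketch of recording a class plus its `__init__` arguments at build time and deferring construction to the server. All names here (`RemoteClassRef`, `DummyModel`, `rebuild`) are hypothetical illustrations, not part of the library:

```python
class RemoteClassRef:
    """Hypothetical wrapper: records a class and its __init__ arguments so the
    instance can be re-created server-side instead of uploading its state."""

    def __init__(self, cls, *args, **kwargs):
        self.cls = cls        # the class itself (cheap to serialise)
        self.args = args      # recorded positional __init__ arguments
        self.kwargs = kwargs  # recorded keyword __init__ arguments

    def rebuild(self):
        # Server side: re-initialise the model from the recorded arguments.
        return self.cls(*self.args, **self.kwargs)


# Stand-in model class (not the real TransformersModelForCausalLM):
class DummyModel:
    def __init__(self, model_path):
        self.model_path = model_path


ref = RemoteClassRef(DummyModel, model_path="EleutherAI/gpt-neo-125M")
rebuilt = ref.rebuild()
```

The weights are then loaded on the server by the rebuilt instance's own load logic, so only the small class reference and its arguments ever travel over the wire.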
To extract the class from an instance:

```python
import inspect

def class_from_object(instance_or_class):
    if inspect.isclass(instance_or_class):
        return instance_or_class
    else:
        return type(instance_or_class)
```
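A quick usage sketch showing that the helper is a no-op on classes and maps instances back to their class (`Model` here is just a stand-in):

```python
import inspect

def class_from_object(instance_or_class):
    # Return the class itself, whether given a class or an instance of one.
    if inspect.isclass(instance_or_class):
        return instance_or_class
    return type(instance_or_class)

class Model:
    pass

assert class_from_object(Model) is Model    # already a class: returned as-is
assert class_from_object(Model()) is Model  # instance: resolved to its class
```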
Sounds good - we can try this out after the API v1 migration. I guarantee people will upload after a local run, so this will definitely have to be in there post public launch.