ploomber.tasks.Link is not usable
marr75 opened this issue · comments
With a source attribute, ploomber.tasks.Link cannot be instantiated. Without a source, the task fails validation.
It is not currently possible to use ploomber.tasks.Link in a pipeline spec.
I think it's because when we added Link, we only had the Python API (not the pipeline.yaml API), and we never worked on ensuring it'd work with pipeline.yaml. Feel free to open a PR!
@edublancas I will. I could use a little guidance from you, though.
Locally, I've got this signature for Link:
class Link(Task):
    ...
    def __init__(self, source, product, dag, name=None):
        kwargs = dict(hot_reload=dag._params.hot_reload)
        self._source = type(self)._init_source(kwargs)
        super().__init__(product, dag, name, None)
And tasks using Link tend to look like:
# Dummy task to wrap success stories exported from hubspot
- name: success-stories
  source: ""
  product: "{{PRODUCTS_DIR}}/success-stories.csv"
  class: Link
  product_class: File
Which isn't terrible, but the blank source, the class, and the product_class could all be a little confusing.
I don't think I can get around the source issue without quite a bit of rewiring in the spec task validation (which strictly looks for source without OO/protocol-based validation). The product_class issue may be solvable by trying to validate whether product is path-like or URL-like.
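To make the product idea concrete, here is a minimal sketch of a path-like vs. URL-like check. The helper name guess_product_kind and its "url"/"path" return values are hypothetical and not part of ploomber; it only assumes that a URL-like product has a scheme and a network location.

```python
from urllib.parse import urlparse


def guess_product_kind(product: str) -> str:
    """Classify a product string as "url" or "path" (hypothetical helper)."""
    parsed = urlparse(product)
    # Require a scheme longer than one character so Windows drive letters
    # such as "C:\\data\\file.csv" are not mistaken for URLs, and require
    # a network location (the "host" part after "//").
    if len(parsed.scheme) > 1 and parsed.netloc:
        return "url"
    # Everything else, including templated values, is treated as path-like.
    return "path"
```

With this rule, a templated product such as "{{PRODUCTS_DIR}}/success-stories.csv" is classified as path-like, so product_class could default to File in that case. Whether that default is right for every product type is an open question.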
I suppose I could make any string that matches source.lower() == "link" get a class of Link. Maybe that kills two birds with one stone?
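Roughly, the shorthand could be a normalization step applied to each task entry before validation. This is only a sketch: apply_link_shorthand is a hypothetical function operating on a plain dict, not the actual spec-processing code in ploomber.

```python
def apply_link_shorthand(task_spec: dict) -> dict:
    """Return a copy of a task entry with class set to Link when source is "link".

    Hypothetical helper; the real spec validation may be structured differently.
    """
    spec = dict(task_spec)  # shallow copy so the caller's dict is not mutated
    source = spec.get("source")
    if isinstance(source, str) and source.lower() == "link":
        # Keep an explicitly declared class if the user provided one.
        spec.setdefault("class", "Link")
    return spec
```

A task written as source: link would then no longer need a blank source or an explicit class: Link, while entries with any other source pass through unchanged.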
Let me know your thoughts.