Is it possible to specify a Python function as `args` in a YAMLFileCatalog?
dougiesquire opened this issue
I'd like to create a YAMLFileCatalog that includes a source with an argument that is a Python function. Is there a way to do this?
An example using a CSV source:
import ast
import intake
import pandas
# Create the source data
data = {
    "col0": [["a", "b"], ["c", "d", "e"]],
    "col1": [0, 1],
}
pandas.DataFrame(data).to_csv("test.csv", index=False)
# Open using intake with converter function
cat = intake.open_csv("test.csv", csv_kwargs={"converters": {"col0": ast.literal_eval}})
cat.read() # works
# Can we write a YAML catalog with the converter function specified in args?
with open("test.yaml", "w") as f:
    f.write(cat.yaml())
# It seems not (this fails)
intake.open_catalog("test.yaml")
The contents of `test.yaml` look like this:
sources:
  csv:
    args:
      csv_kwargs:
        converters:
          col0: !!python/name:ast.literal_eval ''
      urlpath: test.csv
    description: ''
    driver: intake.source.csv.CSVSource
    metadata: {}
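For reference, the `!!python/name` tag in that catalog is exactly what a safe YAML loader refuses to construct, which is why opening the catalog fails. A minimal sketch using PyYAML directly (not Intake):

```python
import ast
import yaml

doc = "col0: !!python/name:ast.literal_eval ''"

# The safe loader refuses python-object tags and raises ConstructorError.
try:
    yaml.safe_load(doc)
    print("safe_load succeeded")
except yaml.constructor.ConstructorError as err:
    print("safe_load rejected the tag:", err.problem)

# The unsafe loader would resolve the tag to the actual function object,
# which is precisely the behavior Intake avoids for catalog files.
obj = yaml.unsafe_load(doc)
print(obj["col0"] is ast.literal_eval)  # True
```

The same distinction applies inside Intake: catalogs are parsed with a safe loader, so tags that would import or execute Python are never honored.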
YAML does allow specifying Python objects like that, but Intake explicitly "safe loads" YAML to prevent arbitrary code execution when reading a catalog file. In a few places it is possible to specify a function as a string like "package.module.funcname" (or with a ":" instead of the final "."; see https://intake.readthedocs.io/en/latest/transforms.html#functional-example), but such strings are NOT evaluated by loading alone; it is up to the driver in question to resolve them. The CSV driver does not know about any func-type arguments.
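The string-to-function pattern mentioned above boils down to a small amount of `importlib` work. The helper below is a hypothetical illustration of how a driver that accepts such strings could resolve them, not Intake's own implementation:

```python
import importlib

def import_name(path):
    """Resolve 'package.module:funcname' (or 'package.module.funcname')
    to the named object.

    Hypothetical helper sketching the string-based function reference
    pattern; names and behavior are assumptions, not Intake internals.
    """
    if ":" in path:
        module, _, name = path.partition(":")
    else:
        module, _, name = path.rpartition(".")
    return getattr(importlib.import_module(module), name)

# Both spellings resolve to ast.literal_eval
func = import_name("ast:literal_eval")
print(func('["a", "b"]'))  # ['a', 'b']
```

Because the catalog only ever stores a plain string, loading it stays safe; the import happens later, inside driver code that has opted in to this convention.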