Make code more idiomatic, clean, and better enforced
dwreeves opened this issue · comments
I was working on a separate issue and I started cleaning up the code. I realized that the code needs a little more love to be in tip-top shape and it would be inappropriate to combine a PR that both adds a feature and does a lot of cleanup, so I believe this should be a separate issue.
Example issues with existing code style
- Incomplete typing, most notably in
hooks.py
. - Mypy should be part of pre-commit
- pre-commit is not run as part of CI, but should be
- When I substitute some types from the docstrings into Python types, I get a couple type errors in Mypy. Two examples:
FivetranHook.set_schedule_type
says it returns a string, but it clearly returns a dict, and mypy makes this clear. (Upon closer inspection, the return type doesn't actually matter because the return value is never used.)FivetranHook._do_api_call
'sjson
arg says it takes a dict, but it clearly takes a string, or more specifically a string representation of a JSON.
- Some unnecessary calls to the API. For example,
FivetranHook.prep_connector
makes 3 total API calls when it really only needs to make 2. - Some un-idiomatic Airflow patterns. For example,
_get_hook()
instead ofhook
as acached_property
. FivetranHook._do_api_call
has some unnecessary weird behavior. Notably, it has a parameter calledjson
that maps to either thedata=json
orparams=json
kwargs in therequests.request()
call. This seems less desirable than just passingdata=data
,params=params
, orjson=json
. Specifically, thejson
kwarg is confusing from the perspective of therequests
andaiohttp
APIs: in those APIs,json=
expects a dict anddata=
expects a string, but in the FivetranHook API,json=
is a string that gets mapped todata=
.
API Breakage?
It's unclear how much API breakage would be tolerated here. We can discuss in a PR, but TLDR I believe for a minor version bump that:
- absolutely zero breakage should be tolerated for the
FivetranSensor
andFivetranOperator
. - very minor breakage can be tolerated for
FivetranHook
. Specifically for private methods, and methods where the return type is never actually used by anything inFivetranSensor
orFivetranOperator
:_do_api_call
and_do_api_call_async
should be changed to look and feel like therequests.request
andaiohttp.ClientSession.request
APIs.- That said, an error message should clearly state to other users what the issue is by checking whether the first arg to these calls is a
tuple
. I imagine there are some people who are poking under the hood of this code and are calling_do_api_call()
directly.
- That said, an error message should clearly state to other users what the issue is by checking whether the first arg to these calls is a
FivetranHook.check_connector
should return theconnector_details
dict object. (So thatprep_connector
can avoid making an unnecessary 3rd API call.)FivetranHookprep_connector
should return eitherNone
ordict[str, Any] | None
. (Note that the returned value is never used anywhere in the code).