You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I was working on a separate issue and I started cleaning up the code. I realized that the code needs a little more love to be in tip-top shape and it would be inappropriate to combine a PR that both adds a feature and does a lot of cleanup, so I believe this should be a separate issue.
Example issues with existing code style
Incomplete typing, most notably in hooks.py.
Mypy should be part of pre-commit
pre-commit is not run as part of CI, but should be
When I substitute some types from the docstrings into Python types, I get a couple type errors in Mypy. Two examples:
FivetranHook.set_schedule_type says it returns a string, but it clearly returns a dict, and mypy makes this clear. (Upon closer inspection, the return type doesn't actually matter because the return value is never used.)
FivetranHook._do_api_call's json arg says it takes a dict, but it clearly takes a string, or more specifically a string representation of a JSON.
Some unnecessary calls to the API. For example, FivetranHook.prep_connector makes 3 total API calls when it really only needs to make 2.
Some un-idiomatic Airflow patterns. For example, _get_hook() instead of hook as a cached_property.
FivetranHook._do_api_call has some unnecessary weird behavior. Notably, it has a parameter called json that maps to either the data=json or params=json kwargs in the requests.request() call. This seems less desirable than just passing data=data, params=params, or json=json. Specifically, the json kwarg is confusing from the perspective of the requests and aiohttp APIs: in those APIs, json= expects a dict and data= expects a string, but in the FivetranHook API, json= is a string that gets mapped to data=.
API Breakage?
It's unclear how much API breakage would be tolerated here. We can discuss in a PR, but TLDR I believe for a minor version bump that:
absolutely zero breakage should be tolerated for the FivetranSensor and FivetranOperator.
very minor breakage can be tolerated for FivetranHook. Specifically for private methods, and methods where the return type is never actually used by anything in FivetranSensor or FivetranOperator:
_do_api_call and _do_api_call_async should be changed to look and feel like the requests.request and aiohttp.ClientSession.request APIs.
That said, an error message should clearly state to other users what the issue is by checking whether the first arg to these calls is a tuple. I imagine there are some people who are poking under the hood of this code and are calling _do_api_call() directly.
FivetranHook.check_connector should return the connector_details dict object. (So that prep_connector can avoid making an unnecessary 3rd API call.)
FivetranHookprep_connector should return either None or dict[str, Any] | None. (Note that the returned value is never used anywhere in the code).
The text was updated successfully, but these errors were encountered:
I was working on a separate issue and I started cleaning up the code. I realized that the code needs a little more love to be in tip-top shape and it would be inappropriate to combine a PR that both adds a feature and does a lot of cleanup, so I believe this should be a separate issue.
Example issues with existing code style
hooks.py
.FivetranHook.set_schedule_type
says it returns a string, but it clearly returns a dict, and mypy makes this clear. (Upon closer inspection, the return type doesn't actually matter because the return value is never used.)FivetranHook._do_api_call
'sjson
arg says it takes a dict, but it clearly takes a string, or more specifically a string representation of a JSON.FivetranHook.prep_connector
makes 3 total API calls when it really only needs to make 2._get_hook()
instead ofhook
as acached_property
.FivetranHook._do_api_call
has some unnecessary weird behavior. Notably, it has a parameter calledjson
that maps to either thedata=json
orparams=json
kwargs in therequests.request()
call. This seems less desirable than just passingdata=data
,params=params
, orjson=json
. Specifically, thejson
kwarg is confusing from the perspective of therequests
andaiohttp
APIs: in those APIs,json=
expects a dict anddata=
expects a string, but in the FivetranHook API,json=
is a string that gets mapped todata=
.API Breakage?
It's unclear how much API breakage would be tolerated here. We can discuss in a PR, but TLDR I believe for a minor version bump that:
FivetranSensor
andFivetranOperator
.FivetranHook
. Specifically for private methods, and methods where the return type is never actually used by anything inFivetranSensor
orFivetranOperator
:_do_api_call
and_do_api_call_async
should be changed to look and feel like therequests.request
andaiohttp.ClientSession.request
APIs.tuple
. I imagine there are some people who are poking under the hood of this code and are calling_do_api_call()
directly.FivetranHook.check_connector
should return theconnector_details
dict object. (So thatprep_connector
can avoid making an unnecessary 3rd API call.)FivetranHookprep_connector
should return eitherNone
ordict[str, Any] | None
. (Note that the returned value is never used anywhere in the code).The text was updated successfully, but these errors were encountered: