Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make code more idiomatic, clean, and better enforced #53

Closed
dwreeves opened this issue Aug 18, 2023 · 0 comments · Fixed by #54
Closed

Make code more idiomatic, clean, and better enforced #53

dwreeves opened this issue Aug 18, 2023 · 0 comments · Fixed by #54
Assignees

Comments

@dwreeves
Copy link
Contributor

dwreeves commented Aug 18, 2023

I was working on a separate issue and I started cleaning up the code. I realized that the code needs a little more love to be in tip-top shape and it would be inappropriate to combine a PR that both adds a feature and does a lot of cleanup, so I believe this should be a separate issue.

Example issues with existing code style

  • Incomplete typing, most notably in hooks.py.
  • Mypy should be part of pre-commit
  • pre-commit is not run as part of CI, but should be
  • When I substitute some types from the docstrings into Python types, I get a couple type errors in Mypy. Two examples:
    • FivetranHook.set_schedule_type says it returns a string, but it clearly returns a dict, and mypy makes this clear. (Upon closer inspection, the return type doesn't actually matter because the return value is never used.)
    • FivetranHook._do_api_call's json arg says it takes a dict, but it clearly takes a string, or more specifically a string representation of a JSON.
  • Some unnecessary calls to the API. For example, FivetranHook.prep_connector makes 3 total API calls when it really only needs to make 2.
  • Some un-idiomatic Airflow patterns. For example, _get_hook() instead of hook as a cached_property.
  • FivetranHook._do_api_call has some unnecessary weird behavior. Notably, it has a parameter called json that maps to either the data=json or params=json kwargs in the requests.request() call. This seems less desirable than just passing data=data, params=params, or json=json. Specifically, the json kwarg is confusing from the perspective of the requests and aiohttp APIs: in those APIs, json= expects a dict and data= expects a string, but in the FivetranHook API, json= is a string that gets mapped to data=.

API Breakage?

It's unclear how much API breakage would be tolerated here. We can discuss in a PR, but TLDR I believe for a minor version bump that:

  • absolutely zero breakage should be tolerated for the FivetranSensor and FivetranOperator.
  • very minor breakage can be tolerated for FivetranHook. Specifically for private methods, and methods where the return type is never actually used by anything in FivetranSensor or FivetranOperator:
    • _do_api_call and _do_api_call_async should be changed to look and feel like the requests.request and aiohttp.ClientSession.request APIs.
      • That said, an error message should clearly state to other users what the issue is by checking whether the first arg to these calls is a tuple. I imagine there are some people who are poking under the hood of this code and are calling _do_api_call() directly.
    • FivetranHook.check_connector should return the connector_details dict object. (So that prep_connector can avoid making an unnecessary 3rd API call.)
    • FivetranHookprep_connector should return either None or dict[str, Any] | None. (Note that the returned value is never used anywhere in the code).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant