[experiment] Add resource time limit and rate limiting #1485

Draft: wants to merge 1 commit into devel

sh-rp (Collaborator) commented Jun 18, 2024

Description

This PR extends the add_limit function so that resources can be given time limits and rate limits. The approach is still up for discussion, but it is straightforward, easy to test, and works for both sync and async resources.
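As a rough illustration of the time-limit idea, here is a minimal standalone sketch. It is not the code in this PR and not the library's API: `limit_items`, its parameters, and the injectable `clock` are hypothetical names chosen for the example. The wrapper checks both budgets before pulling the next item, so the wrapped source is never advanced once a limit is reached.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def limit_items(
    items: Iterable[T],
    max_items: Optional[int] = None,
    max_time: Optional[float] = None,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[T]:
    """Hypothetical helper: stop after `max_items` items or `max_time` seconds."""
    it = iter(items)
    start = clock()  # taken on the first next() call, because generators are lazy
    count = 0
    while True:
        # check both budgets before touching the source again
        if max_items is not None and count >= max_items:
            return
        if max_time is not None and clock() - start >= max_time:
            return
        try:
            item = next(it)
        except StopIteration:
            return
        yield item
        count += 1
```

Passing a fake `clock` keeps tests deterministic, which is one way to get the "easy to test" property without sleeping in the test suite.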

TODOs (if we go this route):

  • More tests that test pipe iterators with multiple resources (we want to allow a global rate limit for APIs for example)
  • Docs
  • Extend source level add_limit to have the same functionality as the resource level one
  • Improve the logger warning if add_limit is declared on non-incremental resources.
  • Investigate fifo extractor strategy, maybe do not go to round robin if none is yielded...

Other thoughts:

  • We might also want to apply the rate-limit wait once before the original generator is first used, and allow rate limiting on transformers; otherwise global rate limiting for APIs will not work.
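To make the global rate limit idea more concrete, the sketch below shares one limiter object between several generators and checks it before every pull, including the very first one. Everything here is an assumption for illustration: `SharedRateLimit`, `rate_limited`, and `drain` are hypothetical names, and `drain` is only a toy stand-in for a pipe iterator.

```python
import time
from typing import Callable, Iterable, Iterator, List, Optional


class SharedRateLimit:
    """Hypothetical helper: enforces a minimum gap between pulls across all users."""

    def __init__(self, min_wait: float, clock: Callable[[], float] = time.monotonic):
        self.min_wait = min_wait
        self.clock = clock
        self.last: Optional[float] = None

    def ready(self) -> bool:
        """Claim the next slot if enough time has passed since the last claim."""
        now = self.clock()
        if self.last is not None and now - self.last < self.min_wait:
            return False
        self.last = now
        return True


def rate_limited(items: Iterable, limit: SharedRateLimit) -> Iterator:
    it = iter(items)
    while True:
        # the limit is checked before every pull, including the first one
        while not limit.ready():
            yield None  # hand control back to the caller instead of blocking
        try:
            item = next(it)  # note: an exhausted source still uses up one slot
        except StopIteration:
            return
        yield item


def drain(gens: List[Iterator], sleep=time.sleep, idle: float = 0.01) -> Iterator:
    """Toy round-robin driver: skips None and sleeps only when nothing progressed."""
    active = list(gens)
    while active:
        progressed = False
        for gen in list(active):
            try:
                item = next(gen)
            except StopIteration:
                active.remove(gen)
                continue
            if item is not None:
                progressed = True
                yield item
        if not progressed and active:
            sleep(idle)
```

With this naive driver the first generator in the list always gets the next free slot, so a fairness policy would still be needed on top of it.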

netlify bot commented Jun 18, 2024

Deploy Preview for dlt-hub-docs canceled.

Name Link
🔨 Latest commit 5fc2324
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/6734869165314d0008ffccee

if max_items >= 0 or max_time and not self.incremental:
    from dlt.common import logger

    logger.warning(
Collaborator Author

This will need to be improved a bit, but I think it is quite a nice solution.
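One thing that may be worth checking when this condition is revisited is operator precedence: in Python, `and` binds tighter than `or`, so `A or B and not C` is evaluated as `A or (B and not C)`. The sketch below compares the condition as written with an explicitly grouped variant. The grouped form is only an assumption about the intent (warn when any limit is set on a non-incremental resource), and `incremental` is a plain bool standing in for `self.incremental`.

```python
def warns_as_written(max_items, max_time, incremental) -> bool:
    # parsed as: max_items >= 0 or (max_time and not incremental)
    return bool(max_items >= 0 or max_time and not incremental)


def warns_grouped(max_items, max_time, incremental) -> bool:
    # assumed intent: some limit is set AND the resource is not incremental
    return bool((max_items >= 0 or max_time) and not incremental)


# With an item limit on an incremental resource the two forms disagree:
print(warns_as_written(10, None, True))  # True
print(warns_grouped(10, None, True))     # False
```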

Collaborator Author

Also: generally speaking, it would be nice to have a wrapper around log messages so that we can test them. Maybe mocking would be enough; not sure.
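For the mocking option, a minimal sketch using only the standard library could look like this. The logger and the `add_limit_checked` function are stand-ins invented for the example; the project's own logger may need to be patched at a different import path.

```python
import logging
from unittest import mock

logger = logging.getLogger("limits_demo")  # stand-in for the project's logger


def add_limit_checked(max_items: int, incremental: bool) -> None:
    """Hypothetical function that warns when a limit is set without an incremental."""
    if max_items >= 0 and not incremental:
        logger.warning("limit set on resource %s without incremental", "demo")


def captured_warnings(max_items: int, incremental: bool) -> list:
    """Run the function with logger.warning patched and return the recorded calls."""
    with mock.patch.object(logger, "warning") as warn:
        add_limit_checked(max_items, incremental)
    return list(warn.call_args_list)
```

A test can then assert on the number of calls and on the message template, for example `captured_warnings(5, False)[0][0][0]`. With pytest, the built-in `caplog` fixture is another option as long as the messages go through the standard `logging` module.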

while (last_iteration + min_wait) - time.time() > 0:
    # we give control back to the pipe iterator
    yield None
    time.sleep(0.1)
Collaborator Author

We could make this configurable, though I am not sure whether it is needed.
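If the sleep interval were made configurable, the wait loop could take it as a parameter. The following is a standalone sketch under that assumption, not the code in this PR: `with_min_wait` and `poll_interval` are hypothetical names, and `clock` and `sleep` are injectable only so the behaviour can be tested without real waiting.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


def with_min_wait(
    items: Iterable,
    min_wait: float,
    poll_interval: float = 0.1,
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator:
    """Hypothetical wrapper: at least `min_wait` seconds between pulls from `items`."""
    it = iter(items)
    last_iteration: Optional[float] = None
    while True:
        while last_iteration is not None and (last_iteration + min_wait) - clock() > 0:
            # give control back to the caller, then wait one poll interval
            yield None
            sleep(poll_interval)
        try:
            item = next(it)
        except StopIteration:
            return
        last_iteration = clock()
        yield item
```

A smaller `poll_interval` hands control back more often at the cost of more `None` yields. Note that this sketch also waits once more after the last item before it finds out that the source is exhausted.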

@@ -788,52 +788,6 @@ def test_add_transformer_right_pipe() -> None:
iter([1, 2, 3]) | dlt.resource(lambda i: i * 3, name="lambda")


def test_limit_infinite_counter() -> None:
Collaborator Author

These three tests were moved to a new location, where a couple more tests will be added specifically for the limits.

Labels
enhancement New feature or request