Config to keep the spider from crashing when Redis is unusable #108

Open
kingname opened this issue Aug 21, 2017 · 1 comment

When the spider is running and Redis suddenly goes down, the spider crashes. Is there a parameter in settings.py that makes the spider wait for Redis to come back?

kingname commented Aug 21, 2017

Adding a try ... except to spider.RedisSpider's next_requests() method may be one solution. Or is there a more Pythonic way?

    def next_requests(self):
        """Returns a request to be scheduled or none."""
        use_set = self.settings.getbool('REDIS_START_URLS_AS_SET', defaults.START_URLS_AS_SET)
        fetch_one = self.server.spop if use_set else self.server.lpop
        # XXX: Do we need to use a timeout here?
        found = 0
        # TODO: Use redis pipeline execution.
        while found < self.redis_batch_size:
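            try:
                data = fetch_one(self.redis_key)
            except Exception:
                # Decide from settings whether to raise or to wait and retry.
                # REDIS_WAIT_ON_ERROR is a hypothetical setting name, not an
                # existing scrapy-redis option.
                if not self.settings.getbool('REDIS_WAIT_ON_ERROR', False):
                    raise
                self.logger.warning("Could not read '%s' from Redis; will retry later", self.redis_key)
                # Stop this batch; this assumes next_requests() is called
                # again later, which then acts as the retry.
                break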
            if not data:
                # Queue empty.
                break
            req = self.make_request_from_data(data)
            if req:
                yield req
                found += 1
            else:
                self.logger.debug("Request not made from data: %r", data)

        if found:
            self.logger.debug("Read %s requests from '%s'", found, self.redis_key)
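
The "wait and retry" branch could also be factored into a small helper, so the loop body stays unchanged. The following is only a sketch under stated assumptions: `fetch_with_retry`, `retry_on`, `max_retries`, and `delay` are hypothetical names, not part of scrapy-redis, and in a real spider `retry_on` would probably be something like `redis.exceptions.ConnectionError` rather than a bare `Exception`. Note that `time.sleep` blocks the Twisted reactor, so a blocking wait like this pauses the whole crawl while Redis is down.

```python
import time


def fetch_with_retry(fetch_one, key, retry_on=(Exception,), max_retries=None,
                     delay=1.0, sleep=time.sleep):
    """Call fetch_one(key), retrying while it raises one of retry_on.

    All parameter names here are illustrative assumptions.
    max_retries=None retries forever; otherwise the last error is
    re-raised once max_retries retries have failed.
    """
    attempts = 0
    while True:
        try:
            return fetch_one(key)
        except retry_on:
            if max_retries is not None and attempts >= max_retries:
                raise  # give up and let the caller (the spider) fail
            attempts += 1
            sleep(delay)  # blocking wait before the next attempt
```

Inside `next_requests()` the fetch line would then read `data = fetch_with_retry(fetch_one, self.redis_key, ...)`, with the retry count and delay taken from whatever settings the project chooses to define.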
