stream-loader and url-fetcher can become unresponsive #202

Open
edsu opened this issue Mar 2, 2022 · 3 comments

edsu commented Mar 2, 2022

On smoketest I've noticed that after running for a long time (days), the stream-loader and url-fetcher sometimes stop picking up new jobs from Redis even though the processes are still running. I think this happens because the event loop inside each type of job doesn't catch and log exceptions when they occur. We need to look specifically at UrlFetcher, VideoFetcher, StreamLoader and SearchLoader.
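
To illustrate the idea of an event loop that logs and survives exceptions, here is a minimal sketch. It is not the project's code: `runJobLoop`, `fetchJob`, `processJob`, `shouldStop` and `errorDelayMs` are hypothetical names, and `fetchJob` stands in for whatever call pulls the next job from Redis.

```javascript
// Hypothetical sketch: none of these names come from the project's code.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Repeatedly fetch a job and process it. An exception from either step is
 * logged and the loop keeps going, so one failure cannot silently stop the
 * worker from picking up later jobs.
 */
async function runJobLoop({
  fetchJob,                 // assumed: async function returning the next job, or null
  processJob,               // assumed: async function that does the work for one job
  log = console,
  errorDelayMs = 1000,      // pause after a failure to avoid a tight error loop
  shouldStop = () => false, // lets a caller (or a test) end the loop
}) {
  while (!shouldStop()) {
    try {
      const job = await fetchJob();
      if (job !== null && job !== undefined) {
        await processJob(job);
      }
    } catch (err) {
      log.error(`job loop error: ${err.message}`);
      await sleep(errorDelayMs);
    }
  }
}
```

The point of the design is that the `try`/`catch` sits inside the `while` loop rather than around it, so a rejected promise ends only the current iteration. Each of the four classes would need its own equivalent of this.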

edsu self-assigned this Mar 2, 2022
edsu added the bug label Mar 2, 2022

edsu commented Mar 8, 2022

I think I've observed StreamLoader getting stuck while the unit tests are running via GitHub Actions. Both smoketest and the GitHub Actions runs use the same keys, so they may be interfering with each other somehow.

edsu commented Mar 11, 2022

I noticed that SearchLoader ran into a connection reset error and didn't recover from it. From the app.log:

{"code":"ECONNRESET","errno":"ECONNRESET","level":"error","message":"request to https://api.twitter.com/2/tweets/search/all?expansions=author_id%2Cin_reply_to_user_id%2Creferenced_tweets.id%2Creferenced_tweets.id.author_id%2Centities.mentions.username%2Cattachments.poll_ids%2Cattachments.media_keys%2Cgeo.place_id&user.fields=created_at%2Cdescription%2Centities%2Cid%2Clocation%2Cname%2Cpinned_tweet_id%2Cprofile_image_url%2Cprotected%2Cpublic_metrics%2Curl%2Cusername%2Cverified%2Cwithheld&tweet.fields=attachments%2Cauthor_id%2Ccontext_annotations%2Cconversation_id%2Ccreated_at%2Centities%2Cgeo%2Cid%2Cin_reply_to_user_id%2Clang%2Cpublic_metrics%2Ctext%2Cpossibly_sensitive%2Creferenced_tweets%2Creply_settings%2Csource%2Cwithheld&media.fields=alt_text%2Cduration_ms%2Cheight%2Cmedia_key%2Cpreview_image_url%2Ctype%2Curl%2Cwidth%2Cpublic_metrics&poll.fields=duration_minutes%2Cend_datetime%2Cid%2Coptions%2Cvoting_status&place.fields=contained_within%2Ccountry%2Ccountry_code%2Cfull_name%2Cgeo%2Cid%2Cname%2Cplace_type&query=%23blacklivesmatter&max_results=100&start_time=2012-02-01T14%3A59%3A00Z&end_time=2016-12-31T14%3A59%3A00Z&next_token=REDACTED failed, reason: read ECONNRESET","type":"system"}

I think SearchLoader should catch these errors, sleep briefly, and requeue the job.
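
A rough sketch of the catch, sleep, and requeue idea follows. The names `runSearchJob`, `search`, `requeue` and `retryDelayMs` are hypothetical and not taken from SearchLoader. Only `ECONNRESET` comes from the log above; treating `ETIMEDOUT` and `EAI_AGAIN` as transient too is an assumption.

```javascript
// Hypothetical sketch: none of these names come from the project's code.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// ECONNRESET is the code seen in app.log; the other two are assumed to be
// worth retrying as well.
const TRANSIENT_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "EAI_AGAIN"]);

/**
 * Run one search job. On a transient network error, wait and put the job
 * back on the queue instead of letting the error escape. Any other error is
 * rethrown so the caller can decide what to do with it.
 */
async function runSearchJob(job, { search, requeue, log = console, retryDelayMs = 5000 }) {
  try {
    await search(job); // assumed: performs the Twitter API request(s)
    return "done";
  } catch (err) {
    if (!TRANSIENT_CODES.has(err.code)) {
      throw err;
    }
    log.error(`search failed with ${err.code}; requeueing in ${retryDelayMs}ms`);
    await sleep(retryDelayMs);
    await requeue(job); // assumed: pushes the job back onto the Redis queue
    return "requeued";
  }
}
```

Since the failed request above carried a `next_token`, the requeued job would probably need to record the last token it reached so the search resumes where it stopped rather than starting over.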

edsu commented Mar 12, 2022

Another error that needs to be caught, this time in the StreamLoader:

tweet-loader_1  | {"level":"error","message":"stream disconnected with error Stream unresponsive","stack":"Error: Stream unresponsive\n    at Timeout.<anonymous> (/code/node_modules/twitter-v2/build/TwitterStream.js:43:38)\n    at listOnTimeout (internal/timers.js:549:17)\n    at processTimers (internal/timers.js:492:7)"}
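
One way StreamLoader could recover from this is to reconnect with a growing delay whenever the stream throws. The sketch below is hypothetical: `consumeStream`, `openStream`, `onTweet` and the delay parameters are invented for illustration, and `openStream` is assumed to return an async iterable of tweets that rejects with an error such as "Stream unresponsive" when the connection stalls.

```javascript
// Hypothetical sketch: none of these names come from the project's code.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

/**
 * Consume a stream of tweets. When the stream throws, log the error, wait,
 * and open a fresh stream. The wait doubles after each consecutive failure
 * (up to maxDelayMs) and resets once data flows again.
 */
async function consumeStream({
  openStream,               // assumed: returns a new async iterable of tweets
  onTweet,                  // assumed: async handler for one tweet
  log = console,
  baseDelayMs = 1000,
  maxDelayMs = 60000,
  maxReconnects = Infinity, // cap on reconnect attempts; unlimited by default
}) {
  let delayMs = baseDelayMs;
  let reconnects = 0;
  for (;;) {
    try {
      for await (const tweet of openStream()) {
        delayMs = baseDelayMs; // data arrived, so reset the backoff
        await onTweet(tweet);
      }
      return; // the stream ended without an error
    } catch (err) {
      log.error(`stream disconnected with error ${err.message}`);
      if (reconnects >= maxReconnects) {
        throw err;
      }
      reconnects += 1;
      await sleep(delayMs);
      delayMs = Math.min(delayMs * 2, maxDelayMs);
    }
  }
}
```

Note that an error thrown by `onTweet` is handled the same way as a stream error here. A real implementation might want to treat the two cases separately.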
