Twitter 'latest' search fails with non-200 status code (401) #834

Closed
ExtremeSRL opened this issue Apr 15, 2023 · 53 comments
Labels
bug Something isn't working module:twitter upstream

Comments

@ExtremeSRL

ExtremeSRL commented Apr 15, 2023

Describe the bug

Twitter search stopped working.

File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\python311\Scripts\snscrape.exe\__main__.py", line 7, in <module>
  File "C:\Python311\Lib\site-packages\snscrape\_cli.py", line 320, in main
    for i, item in enumerate(scraper.get_items(), start = 1):
  File "C:\Python311\Lib\site-packages\snscrape\modules\twitter.py", line 1659, in get_items
    for obj in self._iter_api_data('https://api.twitter.com/2/search/adaptive.json', _TwitterAPIType.V2, params, paginationParams, cursor = self._cursor):
  File "C:\Python311\Lib\site-packages\snscrape\modules\twitter.py", line 761, in _iter_api_data
    obj = self._get_api_data(endpoint, apiType, reqParams)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\snscrape\modules\twitter.py", line 727, in _get_api_data
    r = self._get(endpoint, params = params, headers = self._apiHeaders, responseOkCallback = self._check_api_response)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\snscrape\base.py", line 251, in _get
    return self._request('GET', *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python311\Lib\site-packages\snscrape\base.py", line 247, in _request
    raise ScraperException(msg)
snscrape.base.ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=lang%3Ait&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe failed, giving up.
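
The query string of the failing URL is hard to read in its encoded form. As a rough sketch, it can be decoded with the standard library. The URL below is a shortened copy that keeps only a few of the original parameters, and `decode_query` is a hypothetical helper name, not part of snscrape.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the failing URL: only a few of the original parameters are kept.
FAILING_URL = (
    "https://api.twitter.com/2/search/adaptive.json"
    "?tweet_mode=extended&q=lang%3Ait&tweet_search_mode=live&count=20"
)


def decode_query(url):
    """Return the query parameters of a URL as a dict of single values."""
    params = parse_qs(urlsplit(url).query)
    # parse_qs returns lists; each parameter occurs once here, so keep the first value.
    return {name: values[0] for name, values in params.items()}


params = decode_query(FAILING_URL)
print(params["q"])                  # lang:it
print(params["tweet_search_mode"])  # live
```

Decoded this way, the request is a search for `lang:it` with `tweet_search_mode=live`.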

How to reproduce

Run the twitter-search scraper.

Expected behaviour

Retrieve Twitter posts.

Operating system

Windows 10

Python version: output of python3 --version

3.7.5

snscrape version: output of snscrape --version

0.6.1.20230315.dev2+gedac5f3

Scraper

twitter-search

How are you using snscrape?

CLI (snscrape ... as a command, e.g. in a terminal)

Backtrace

No response

Additional context

none

ExtremeSRL added the bug (Something isn't working) label on Apr 15, 2023
@NicoBerlo

Actually, the error returned by Twitter is:
{"errors":[{"message":"Bad Authentication data","code":215}]}

@Yomguithereal

Related to medialab/minet#682

@Hesko123

> Related to medialab/minet#682

I don't think it's related to the 'latest' tab, as that's not something new. snscrape was already aware of that.

@JustAnotherArchivist
Owner

> Actually, the error returned by Twitter is:
> {"errors":[{"message":"Bad Authentication data","code":215}]}

No, it isn't. Why do people keep opening the API URLs in browsers and expecting it to work despite lacking the relevant authentication headers?

Yes, looks like Twitter removed the 'latest' search again. Unless they reverse that, it's unlikely that there is a fix for this. Cf. #634 for the previous occurrence of this a couple months ago.

JustAnotherArchivist changed the title from "twitter search stop working" to "Twitter 'latest' search fails with non-200 status code (401)" on Apr 15, 2023
@JustAnotherArchivist
Owner

For the record, the error returned by Twitter is:

{"errors":[{"code":32,"message":"Could not authenticate you."}]}
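
As a small illustration, error bodies of this shape can be parsed with the standard `json` module to pull out the code and message. `error_codes` is a hypothetical helper name used only for this sketch, and it assumes the body always has the `errors` list layout shown in this thread.

```python
import json


def error_codes(body):
    """Return (code, message) pairs from a Twitter-style JSON error body."""
    payload = json.loads(body)
    # A body without an "errors" key yields an empty list instead of raising.
    return [(err.get("code"), err.get("message")) for err in payload.get("errors", [])]


body = '{"errors":[{"code":32,"message":"Could not authenticate you."}]}'
print(error_codes(body))  # [(32, 'Could not authenticate you.')]
```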

@Mr-Freewan

Same error in TwitterUserScraper, not only in search. Moreover, there was a short period when it worked, but after about 10 minutes it broke again.

@JustAnotherArchivist
Owner

It is only the latest search, but twitter-user, twitter-hashtag, and a few more scrapers are simple wrappers around the search, so yes, they're also affected.

@Mr-Freewan

I noticed that it works, but it is very unstable. It fails with a non-200 (401) error on 2 out of 3 requests, but the third works fine.

@Mr-Freewan

Mr-Freewan commented Apr 15, 2023

It's working again =)

@JustAnotherArchivist
Owner

Haven't seen any further interruptions, but I'll keep the issue open and pinned for now in case it returns.

@JustAnotherArchivist
Owner

No more issues since Saturday. :-)

@ExtremeSRL
Author

Twitter is currently changing its business plans, and I guess therefore also the API.
Let's stay tuned, because I'm afraid there will be more problems.
In the meantime, thanks as always for your great work!

@0xTechnician

The problem just came back! 401 on search by query

@projectno3

projectno3 commented Apr 20, 2023

I have issues too, but for me it alternates between working and not working (as if my internet connection was unstable, but that is not the case).

@rmnhg

rmnhg commented Apr 20, 2023

I only have this problem with twitter-user. twitter-search runs fine (if I don't use any parameter like from:USERNAME)

@JustAnotherArchivist
Owner

JustAnotherArchivist commented Apr 20, 2023

@rmnhg No, it happens with both. twitter-user is a very thin wrapper around twitter-search anyway; wouldn't make any sense if they didn't behave the same (unless they were restricting specifically from:X queries, which isn't the case). You probably just got lucky on your twitter-search runs and unlucky on the twitter-user ones.
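
To illustrate the "thin wrapper" idea, here is a minimal sketch with hypothetical class names. It is not snscrape's actual code. The `from:` query form comes from this thread; the `#` form for hashtags is an assumption made for illustration.

```python
class SearchScraperSketch:
    """Hypothetical stand-in for a search scraper; it only records the query."""

    def __init__(self, query):
        self.query = query


class UserScraperSketch(SearchScraperSketch):
    """Hypothetical thin wrapper: a user timeline expressed as a from: search."""

    def __init__(self, username):
        super().__init__(f"from:{username}")


class HashtagScraperSketch(SearchScraperSketch):
    """Hypothetical thin wrapper; the '#' query form is an assumption."""

    def __init__(self, hashtag):
        super().__init__(f"#{hashtag}")


print(UserScraperSketch("USERNAME").query)  # from:USERNAME
```

Because both wrappers end up issuing an ordinary search, any failure of the search request affects them in the same way.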

@Mr-Freewan

It works, but it is very unstable. Apparently Twitter is doing some work on its servers again.

@codilau

codilau commented Apr 20, 2023

What I see is that unauthenticated searches fail even in the browser. "Your account may not be allowed to perform this action. Please refresh the page and try again."

@kooperalan

I set a delay of 1 minute between each tweet and it works.

@Josias-TopicWorx

Josias-TopicWorx commented Apr 20, 2023

The issue seems to persist for me: only every other request returns data.
As @kooperalan said, adding a delay seems to work. For me, adding a 10-second delay has completely removed the problem.
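
The thread does not show how the delay was added. One possible sketch is a small retry helper that pauses between attempts. `call_with_retries` and `flaky_fetch` are hypothetical names, the default of 4 attempts only mirrors the "4 requests ... failed, giving up" message above, and the 10-second default mirrors the delay mentioned here.

```python
import time


def call_with_retries(fetch, attempts=4, delay=10.0, sleep=time.sleep):
    """Call fetch() until it succeeds, pausing `delay` seconds between attempts."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:  # a real caller would catch a narrower exception type
            last_error = exc
            if attempt < attempts:
                sleep(delay)  # no pause after the final failed attempt
    raise last_error


# Demo with a fake request that fails twice and then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("non-200 status code (401)")
    return "data"


# The sleep function is replaced here so the demo does not actually wait.
print(call_with_retries(flaky_fetch, sleep=lambda seconds: None))  # data
```

Passing the sleep function as a parameter keeps the helper easy to test without real waiting.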

@AntoinePaix

I have the same error with my own scraper and a completely different implementation (I use http). The 'top' tab works well but not the 'latest' tab.

But last night the advanced search was working fine with 'latest'...

@Hesko123

> I have the same error with my own scraper and a completely different implementation (I use http). The 'top' tab works well but not the 'latest' tab.
>
> But last night the advanced search was working fine with 'latest'...

What's the advantage of using an http implementation?

@AntoinePaix

Oops, I meant httpx. It's a Python client with nice features such as request/response hooks, HTTP/2, and async capabilities.

@Hesko123

Hesko123 commented Apr 20, 2023

> Oops, I meant httpx. It's a Python client with nice features such as request/response hooks, HTTP/2, and async capabilities.

Oh, async for multithreading? Response hooks for tweet responses?

@AntoinePaix

@Hesko123 Async as in running multiple scrapers inside one thread.

Twitter's problem with the 'latest' search is really episodic. I just ran my personal scraper several times, the first 2 failed but the third passed without issue.
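
As a rough sketch of what "multiple scrapers inside one thread" can look like, the example below uses only the standard `asyncio` module. `fake_scraper` is a hypothetical coroutine whose network wait is simulated with a short sleep; a real scraper would await an HTTP client instead.

```python
import asyncio


async def fake_scraper(name, pages):
    """Hypothetical scraper: each 'page' is simulated with a non-blocking sleep."""
    results = []
    for page in range(pages):
        await asyncio.sleep(0.01)  # stands in for awaiting an HTTP response
        results.append(f"{name}-page{page}")
    return results


async def main():
    # gather() runs both coroutines concurrently on one thread and one event loop,
    # and returns their results in the order the coroutines were passed in.
    return await asyncio.gather(fake_scraper("a", 2), fake_scraper("b", 3))


results = asyncio.run(main())
print(results)
```

While one scraper is waiting for a response, the event loop lets the other make progress, so no extra threads are involved.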

@mc0ps

mc0ps commented Apr 20, 2023

> @Hesko123 Async as in running multiple scrapers inside one thread.
>
> Twitter's problem with the 'latest' search is really episodic. I just ran my personal scraper several times, the first 2 failed but the third passed without issue.

It seems almost random (just worked 1/5 times for me). I'm wondering if it has something to do with the user-agent, because I noticed that it's set randomly.

EDIT: Maybe not. I just tried setting the user-agent to one of the ones that worked, and it seems to fail repeatedly anyway.

@AntoinePaix

It's quite weird: when I copy the request made to the adaptive.json API as a curl command and remove the cookies, I get the authentication error about one time in three.

But if I put back only the "guest_id" cookie, I have the impression that I no longer get the authentication problem...

@AntoinePaix

The guest_id cookie is set when you make a request to the frontend page of the advanced search, for example: https://twitter.com/search?q=ukraine&src=typed_query&f=live
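
For reference, a URL of this form can be assembled with the standard library. `search_page_url` is a hypothetical helper name, and treating `f=live` as the switch for the 'latest' tab is an assumption based on the URLs quoted in this thread.

```python
from urllib.parse import urlencode


def search_page_url(query, latest=True):
    """Build a twitter.com search page URL for the given query."""
    params = {"q": query, "src": "typed_query"}
    if latest:
        params["f"] = "live"  # assumed to select the 'latest' tab
    # urlencode percent-encodes reserved characters such as ':' in the query.
    return "https://twitter.com/search?" + urlencode(params)


print(search_page_url("ukraine"))
# https://twitter.com/search?q=ukraine&src=typed_query&f=live
```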

@dengkefeng

I see that snscrape calls Twitter through the API endpoint "https://api.twitter.com/2/search/adaptive.json", so will it be affected by Twitter's new policy, with its very small free rate limit? Is there a plan to fix this problem, for example by supporting scraping Twitter through the web page (e.g. https://twitter.com/search?q=from%3Aelonmusk&src=typed_query&f=live)?

@JustAnotherArchivist
Owner

@dengkefeng #695

@JustAnotherArchivist
Owner

@AntoinePaix Negative, I'm also seeing failures with the guest_id cookie set.

@dengkefeng

> @dengkefeng #695

Got it, thanks very much, @JustAnotherArchivist. So how do we resolve the main issue in this thread? Just wait for Twitter to come back? Thanks!

@AntoinePaix

@JustAnotherArchivist Ah, darn. In the end it's rather good news, though: it means that this is not necessarily a problem related to a new method of authentication.

@laurent-IA

The search function is no longer accessible if you are not logged in, so I suppose it is the end for snscrape.

@JustAnotherArchivist
Owner

The error is now blocked (403), and all requests seem to be affected. No indication of what's happening on the web interface though, just 'Please refresh the page and try again', so it may well be unintentional.

@pablorm296

Maybe earlier today we were witnessing a canary deploy that directed x% of the traffic to this new version where the search feature and the guest token are no longer available 😞

@lsalvinien

> The error is now blocked (403), and all requests seem to be affected. No indication of what's happening on the web interface though, just 'Please refresh the page and try again', so it may well be unintentional.

Go to twitter.com and log out if you are logged in. You will see that the page has changed: the search feature is gone.

I doubt it is unintentional.

@JustAnotherArchivist
Owner

Closing this issue as this is no longer the problem affecting everyone. See #846 instead.
