Twitter search scrapes in default 'latest' order fail with 401 ('non-200 status code') #634

Closed
JustAnotherArchivist opened this issue Jan 10, 2023 · 37 comments
Labels: bug, module:twitter, upstream

Comments

@JustAnotherArchivist
Owner

JustAnotherArchivist commented Jan 10, 2023

Since today, any twitter-search scrapes that do not include the --top flag fail. This includes twitter-user and twitter-hashtag, which are just wrappers around the search. Twitter removed the 'Latest' tab from the search page this afternoon, and since approximately 22:35 UTC, the API seems to fail as well.

@MazenTayseer

MazenTayseer commented Jan 10, 2023

So I have to add the --top flag for it to work?

@constantin-barbu

constantin-barbu commented Jan 10, 2023

The "Latest" tab seems to be back now on Twitter search pages.

@JustAnotherArchivist
Owner Author

JustAnotherArchivist commented Jan 10, 2023

It's unclear right now whether the functionality is still available at all. If not, it might never get fully solved. You can probably work around it by using --top (e.g. snscrape twitter-search --top 'from:username'), but I'm not sure whether you can get the full feed that way.
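
For anyone scripting this workaround, here is a minimal sketch of building that command line from Python. The helper name build_search_command is hypothetical. Only the `snscrape twitter-search --top 'from:username'` shape comes from the comment above, and it is still unknown whether --top returns the full feed.

```python
import shlex


def build_search_command(query, top=True):
    """Build the argument list for an snscrape twitter-search call.

    Hypothetical helper: only the `snscrape twitter-search --top <query>`
    shape is taken from the comment above.
    """
    command = ["snscrape", "twitter-search"]
    if top:
        # Workaround from this thread: request 'Top' results instead of 'Latest'.
        command.append("--top")
    command.append(query)
    return command


# Shell-quoted form, suitable for logging or pasting into a terminal.
print(shlex.join(build_search_command("from:username")))
```

With snscrape installed, the returned list could be passed to subprocess.run(command, check=True). Keeping it as a list avoids shell quoting problems with queries that contain spaces or quotes.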

@JustAnotherArchivist
Owner Author

@constantin-barbu They might be doing more changes at the minute. I don't see it. I'm also seeing claims that it is only available when logged in (which would be useless for snscrape).

@constantin-barbu

You're right, it's only available when logged in :(.

@AhmetcanFR

AhmetcanFR commented Jan 10, 2023

Is there a way to log in with our access token so we can have access to it?
Would it still be without a rate limit?

@JustAnotherArchivist
Owner Author

@AhmetcanFR No, logging in is not and will not be supported: #270

@AhmetcanFR

What can we do for now?

@JustAnotherArchivist
Owner Author

There's currently no known workaround apart from the one I mentioned above.

@MazenTayseer

@JustAnotherArchivist If I set 'top=False', will it work as before?

@JustAnotherArchivist
Owner Author

@MazenTayseer No, only top = True will work currently.

@Goitsemedi888

Goitsemedi888 commented Jan 11, 2023

Where would top = True be used in this code?

import pandas as pd
import snscrape.modules.twitter as sntwitter

tweets_list = []

# Collect matching tweets, stopping once the index passes the cap
for i, tweet in enumerate(sntwitter.TwitterSearchScraper('#CRM OR #Salesforce, since:2022-1-1 until:2023-1-10').get_items()):
    if i > 250000:
        break
    tweets_list.append([tweet.date, tweet.id, tweet.rawContent, tweet.user.username])

# Creating a dataframe from the tweets list above
CRM_tweets_df = pd.DataFrame(tweets_list, columns=['Datetime', 'Tweet Id', 'Text', 'Username'])

@JustAnotherArchivist
Owner Author

@Goitsemedi888 It's an optional argument to the scraper: TwitterSearchScraper('query', top = True)
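
To make the placement concrete, here is a minimal sketch of the collection loop written so that the scraper is passed in as an argument. The helper name collect_rows is hypothetical. The tweet attributes are the ones used in the snippet earlier in this thread, and the helper only assumes the object exposes get_items().

```python
from itertools import islice


def collect_rows(scraper, limit):
    """Return up to `limit` rows from any object exposing get_items().

    Hypothetical helper; the tweet attributes are the ones used in the
    snippet earlier in this thread.
    """
    rows = []
    # islice stops the generator after `limit` items, so no manual counter is needed.
    for tweet in islice(scraper.get_items(), limit):
        rows.append([tweet.date, tweet.id, tweet.rawContent, tweet.user.username])
    return rows
```

With snscrape installed, the call would look like collect_rows(sntwitter.TwitterSearchScraper('query', top = True), 250000), so top = True goes next to the query string and the rest of the loop stays unchanged.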

JustAnotherArchivist changed the title from "Twitter scrapes fail with 401 ('non-200 status code')" to "Twitter search scrapes in default 'latest' order fail with 401 ('non-200 status code')" on Jan 11, 2023

@JustAnotherArchivist
Owner Author

'Top' searches are now broken as well: #647

@JustAnotherArchivist
Owner Author

JustAnotherArchivist commented Jan 14, 2023

Twitter seems to have reverted this change. The 'latest' search scraping works again on the current snscrape version, even though it's still unavailable on the web interface.

Repository owner unlocked this conversation Jan 14, 2023
@MazenTayseer

MazenTayseer commented Jan 14, 2023

TwitterSearchScraper is still not working for me
UPDATE: I had to change to the latest version for it to work.
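
For anyone unsure which version they are running, here is a minimal sketch of checking it with the standard library. The helper name installed_version is hypothetical, and it assumes the package is installed under the distribution name snscrape.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="snscrape"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


print("snscrape version:", installed_version() or "not installed")
```

If this reports an older release, upgrading (typically with pip install --upgrade snscrape) is worth trying first.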

@JustAnotherArchivist
Owner Author

If you (or anyone else) have a problem on the latest snscrape version and no issue exists yet for it, file a complete bug report please.

@MariaPng

I downloaded the latest version of snscrape, but the error remains. What do I have to do?

@JustAnotherArchivist
Owner Author

@MariaPng You file a complete bug report, as the comment just before yours says.

@Satarupa22-SD

I am still facing this issue. I have set top = True in my code, but the error persists.
