
Error 403, ScraperException: 4 requests #848

Closed
anastasita308 opened this issue Apr 21, 2023 · 1 comment
Labels
duplicate This issue or pull request already exists

Comments

anastasita308 commented Apr 21, 2023

I have been scraping Twitter, and it worked fine until yesterday. My snscrape installation is up to date, but it still fails with the error below after about 8 seconds.
I know this has been raised before, but I could not find a working solution in the earlier issues.

ScraperException: 4 requests to https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&include_ext_has_nft_avatar=1&include_ext_is_blue_verified=1&include_ext_verified_type=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_ext_alt_text=true&include_ext_limited_action_results=false&include_quote_count=true&include_reply_count=1&tweet_mode=extended&include_ext_collab_control=true&include_ext_views=true&include_entities=true&include_user_entities=true&include_ext_media_color=true&include_ext_media_availability=true&include_ext_sensitive_media_warning=true&include_ext_trusted_friends_metadata=true&send_error_codes=true&simple_quoted_tweet=true&q=%28plastic+OR+environment+OR+pollution+OR+packaging+OR+waste+OR+climate+OR+sustainability%29+%28%40Unilever%29+until%3A2019-10-07+since%3A2019-09-29&tweet_search_mode=live&count=20&query_source=spelling_expansion_revert_click&pc=1&spelling_corrections=1&include_ext_edit_control=true&ext=mediaStats%2ChighlightedLabel%2ChasNftAvatar%2CvoiceInfo%2Cenrichments%2CsuperFollowMetadata%2CunmentionInfo%2CeditControl%2Ccollab_control%2Cvibe failed, giving up.

This is my code:

import snscrape.modules.twitter as sntwitter
import pandas as pd
import re
import string
import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
import time

# define the search query
query = "(plastic OR environment OR pollution OR packaging OR waste OR climate OR sustainability) (@Unilever) until:2019-10-07 since:2019-09-29"

# define a list of stopwords
stop_words = set(stopwords.words('english'))

# define the list of tweets
tweets = []
limit = 500

# loop through the search results and clean the text of each tweet
for i, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
    if i == limit:
        break
    else:
        # clean the text of the tweet
        text = tweet.content.lower()
        text = re.sub(r'http\S+', '', text) # remove URLs
        text = re.sub(r'@\w+', '', text) # remove mentions
        text = re.sub(r'#(\w+)', r'\1', text) # strip the '#' but keep the hashtag word
        text = text.translate(str.maketrans('', '', string.punctuation)) # remove punctuation
        words = [word for word in text.split() if word not in stop_words] # remove stop words
        cleaned_text = ' '.join(words)

        # check if the cleaned text is empty, and skip the tweet if it is
        if not cleaned_text:
            continue
        
        # add the cleaned text and other tweet data to the list
        tweets.append([tweet.date, tweet.username, cleaned_text])

        # pause for 3 seconds before processing the next tweet
        time.sleep(3)

# create a dataframe from the list of tweets and save to CSV
df = pd.DataFrame(tweets, columns=['Date', 'User', 'Tweet'])
df.to_csv('unilever_tweets.csv', index=False)
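For what it's worth, the text-cleaning steps in the loop can be exercised in isolation with only the standard library, independent of whether the scraper itself works. This is a minimal sketch: the sample tweet text and the tiny stop-word set are made up for illustration, and hashtags are handled in a single `re.sub` step (stripping the `#` but keeping the word) rather than via a separate hashtag list:

```python
import re
import string

def clean_tweet(text, stop_words):
    """Apply the same cleaning pipeline as the scraping loop above."""
    text = text.lower()
    text = re.sub(r'http\S+', '', text)    # remove URLs
    text = re.sub(r'@\w+', '', text)       # remove mentions
    text = re.sub(r'#(\w+)', r'\1', text)  # strip the '#' but keep the hashtag word
    text = text.translate(str.maketrans('', '', string.punctuation))  # remove punctuation
    return ' '.join(w for w in text.split() if w not in stop_words)   # drop stop words

sample = "Check out this #plastic pledge from @Unilever https://example.com #Sustainability"
print(clean_tweet(sample, {'out', 'this', 'from'}))
# → check plastic pledge sustainability
```

Testing the cleaning on its own like this makes it easier to confirm that an empty result (and the `continue` in the loop) comes from the cleaning itself, not from the scraper.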

Wouze commented Apr 21, 2023

This was already discussed in #846.

@JustAnotherArchivist added the duplicate label Apr 21, 2023
@JustAnotherArchivist closed this as not planned (duplicate) Apr 21, 2023