snscrape multiple iterative search not working #770

Closed
alice23DS opened this issue Mar 15, 2023 · 12 comments
Labels
invalid This doesn't seem right

Comments

@alice23DS

alice23DS commented Mar 15, 2023

Describe the bug

I read a list of keywords from a config file and then run a search for each of those keywords in a loop.

The code has been running for a while and shows neither an error nor any results.

Initially I tried to search over a week of data. To confirm the problem, I then tried to scrape a single day, but the code still does not retrieve anything.

How to reproduce

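import snscrape.modules.twitter as sntwitter

# account_name, datestamp_start and datestamp_end are loaded from the config file
for n in account_name:
    query = f'{n} + near:"United States" + since:{datestamp_start} until:{datestamp_end}'
    for i, tweet in enumerate(sntwitter.TwitterSearchScraper(query).get_items()):
        ...  # loop body was not included in the report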

Expected behaviour

image

Screenshots and recordings

image

image

Operating system

Windows 10

Python version: output of python3 --version

3.8.8

snscrape version: output of snscrape --version

Version 0.6.1

Scraper

TwitterSearchScraper

How are you using snscrape?

Module (import snscrape.modules.something in Python code)

Backtrace

No response

Log output

No response

Dump of locals

No response

Additional context

No response

alice23DS added the bug label Mar 15, 2023
@JustAnotherArchivist
Owner

What concrete query are you running, which doesn't return results? And does the same query give results on Twitter's web interface?

@alice23DS
Author

sntwitter.TwitterSearchScraper(f'{account_name[n]} + near:"United States" + since:{datestamp_start} until:{datestamp_end}').get_items()

It works when I do not loop over account_name and instead give the combination of keywords directly.

But when I put it inside a loop that starts with for n, k in enumerate(account_name): it produces neither results nor an error.

@JustAnotherArchivist
Owner

My only guess would be that account_name isn't what you think it is. snscrape itself has no idea what's happening in your code. It just gets a query string and runs that search.
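One way to check that guess is to look at what the loop actually iterates over. The sketch below is only an illustration: `load_keywords` and the comma-separated config value are hypothetical and are not taken from the reporter's code. It shows how a config value that is still a single string iterates character by character, which would produce very different queries from the intended ones.

```python
def load_keywords(raw):
    """Split a comma-separated config value into a list of keywords.

    Hypothetical helper: the real config format is not shown in this thread.
    """
    return [part.strip() for part in raw.split(",") if part.strip()]


raw_value = "Amex, AmericanExpress"  # assumed config value

# Iterating over the raw string yields single characters, not keywords.
unparsed_terms = [n for n in raw_value]

# Iterating over the parsed list yields the intended keywords.
parsed_terms = [n for n in load_keywords(raw_value)]

print(type(raw_value).__name__, unparsed_terms[:4])
print(type(parsed_terms).__name__, parsed_terms)
```

Printing the type and the first few items before building any query is usually enough to tell whether the variable holds what the loop expects.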

@alice23DS
Author

account_name is a variable that holds the keywords:

image

@JustAnotherArchivist
Owner

Yes, I saw that, and those queries return results for me just fine:

>>> import snscrape.modules.twitter
>>> next(snscrape.modules.twitter.TwitterSearchScraper('Amex near:"United States" since:2022-11-01 until:2022-11-02').get_items())
Tweet(url='https://twitter.com/lifematters32/status/1587591692697337857', date=datetime.datetime(2022, 11, 1, 23, 45, 30, tzinfo=datetime.timezone.utc), rawContent='@Norahlyza @smf_chi @dc5423 @VitalVegas @LasVegasLocally I have all those perk cards (Hilton Honors-Amex/Platinum Amex) so my stay was fantastic - I highly suggest if you don’t have those cards already 💪🏾', renderedContent='@Norahlyza @smf_chi @dc5423 @VitalVegas @LasVegasLocally I have all those perk cards (Hilton Honors-Amex/Platinum Amex) so my stay was fantastic - I highly suggest if you don’t have those cards already 💪🏾', id=1587591692697337857, user=User(username='lifematters32', id=214226801, displayname='Joe “DUG SZN” Green', rawDescription='Born in Michigan. Live in Florida. Love sports and talking shit. #goblue #pistons #lions #tigers and #redwings too', renderedDescription='Born in Michigan. Live in Florida. Love sports and talking shit. #goblue #pistons #lions #tigers and #redwings too', descriptionLinks=None, verified=False, created=datetime.datetime(2010, 11, 10, 22, 2, 47, tzinfo=datetime.timezone.utc), followersCount=117, friendsCount=387, statusesCount=5084, favouritesCount=16340, listedCount=2, mediaCount=370, location='Miami, FL', protected=False, link=None, profileImageUrl='https://pbs.twimg.com/profile_images/1493642122855133192/YOO9VI0e_normal.jpg', profileBannerUrl='https://pbs.twimg.com/profile_banners/214226801/1606962948', label=None), replyCount=0, retweetCount=0, likeCount=3, quoteCount=0, conversationId=1587472439801364480, lang='en', source='<a href="https://twitter.com/download/iphone" rel="">Twitter for iPhone</a>', sourceUrl='https://twitter.com/download/iphone', sourceLabel='Twitter for iPhone', links=None, media=None, retweetedTweet=None, quotedTweet=None, inReplyToTweetId=1587590262674853888, inReplyToUser=User(username='Norahlyza', id=548272062, displayname='Norepi 🍿🐆', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, 
followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), mentionedUsers=[User(username='Norahlyza', id=548272062, displayname='Norepi 🍿🐆', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), User(username='smf_chi', id=635901973, displayname='SMF', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), User(username='dc5423', id=28842469, displayname='Cha cha', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), User(username='VitalVegas', id=514487309, displayname='Vital Vegas', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), User(username='LasVegasLocally', id=1587143948, displayname='Las Vegas Locally 🌴', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, 
favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None)], coordinates=Coordinates(longitude=-80.498527, latitude=25.65479), place=Place(id='7707ad9771781687', fullName='The Hammocks, FL', name='The Hammocks', type='city', country='United States', countryCode='US'), hashtags=None, cashtags=None, card=None, viewCount=None, vibe=None)
>>> next(snscrape.modules.twitter.TwitterSearchScraper('AmericanExpress near:"United States" since:2022-11-01 until:2022-11-02').get_items())
Tweet(url='https://twitter.com/jesusfuel/status/1587589356411572224', date=datetime.datetime(2022, 11, 1, 23, 36, 13, tzinfo=datetime.timezone.utc), rawContent='@Babackd @AmericanExpress @Ticketmaster @F1 Want the link? Tickets were 2000+', renderedContent='@Babackd @AmericanExpress @Ticketmaster @F1 Want the link? Tickets were 2000+', id=1587589356411572224, user=User(username='jesusfuel', id=139676974, displayname='xFuel', rawDescription='Radiation Oncologist, Golden retriever tamer.', renderedDescription='Radiation Oncologist, Golden retriever tamer.', descriptionLinks=None, verified=False, created=datetime.datetime(2010, 5, 3, 10, 20, 37, tzinfo=datetime.timezone.utc), followersCount=1222, friendsCount=4917, statusesCount=37976, favouritesCount=91211, listedCount=27, mediaCount=5366, location='Norte de México', protected=False, link=None, profileImageUrl='https://pbs.twimg.com/profile_images/1528729244884471811/K99-7X1g_normal.jpg', profileBannerUrl='https://pbs.twimg.com/profile_banners/139676974/1648422330', label=None), replyCount=1, retweetCount=0, likeCount=0, quoteCount=0, conversationId=1587574335895527424, lang='en', source='<a href="https://twitter.com/download/android" rel="">Twitter for Android</a>', sourceUrl='https://twitter.com/download/android', sourceLabel='Twitter for Android', links=None, media=None, retweetedTweet=None, quotedTweet=None, inReplyToTweetId=1587574335895527424, inReplyToUser=User(username='Babackd', id=82770008, displayname='Baback Davarnejad', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), mentionedUsers=[User(username='Babackd', id=82770008, displayname='Baback Davarnejad', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, 
created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), User(username='AmericanExpress', id=42712551, displayname='American Express', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), User(username='Ticketmaster', id=27743648, displayname='Ticketmaster', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None), User(username='F1', id=69008563, displayname='Formula 1', rawDescription=None, renderedDescription=None, descriptionLinks=None, verified=None, created=None, followersCount=None, friendsCount=None, statusesCount=None, favouritesCount=None, listedCount=None, mediaCount=None, location=None, protected=None, link=None, profileImageUrl=None, profileBannerUrl=None, label=None)], coordinates=Coordinates(longitude=-100.421037, latitude=25.4805381), place=Place(id='b19e24ce42ccd6aa', fullName='Monterrey, Nuevo León', name='Monterrey', type='city', country='Mexico', countryCode='MX'), hashtags=None, cashtags=None, card=None, viewCount=None, vibe=None)

If you don't get results in your loop, you're probably passing a different string into the scraper than you think you are.

@alice23DS
Author

Can I send you my code?

@JustAnotherArchivist
Owner

I'm sorry, but I don't have time to debug your code.
I just noticed that you have some plus signs in your query, but that doesn't make a difference for me.
Add a print(repr(...)) for your query string in the loop and make sure it matches what I used above.
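A minimal sketch of that check, runnable without network access. The keyword list and dates are assumed values matching the queries shown earlier in the thread, and `build_query` is a hypothetical helper; in the reporter's code these values come from a config file. The sketch only builds and prints the strings, it does not call snscrape.

```python
def build_query(keyword, start, end):
    """Build the search string in the form the maintainer used (no plus signs)."""
    return f'{keyword} near:"United States" since:{start} until:{end}'


# Assumed values; the reporter's real list and dates come from a config file.
account_name = ["Amex", "AmericanExpress"]
datestamp_start = "2022-11-01"
datestamp_end = "2022-11-02"

# The query string that returned results for the maintainer.
known_good = 'Amex near:"United States" since:2022-11-01 until:2022-11-02'

queries = []
for n, keyword in enumerate(account_name):
    query = build_query(keyword, datestamp_start, datestamp_end)
    # repr() makes stray whitespace, quotes, or list brackets visible.
    print(n, repr(query))
    queries.append(query)

print(queries[0] == known_good)
```

Once the printed strings match the known-good query, each one can be passed to TwitterSearchScraper exactly as in the maintainer's example.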

@alice23DS
Author

Sure. Thanks, I'll try that.

@alice23DS
Author

image

Can anyone tell me what this means? "Unsupported unified_card type on tweet 1587186251718959107: 'commerce_drop'"

I get this while running the query.

@JustAnotherArchivist
Owner

It's a warning that snscrape couldn't extract the card on that tweet. I have never seen commerce_drop before, thanks. Apparently Twitter's own web client also has no idea how to render it.
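Since this is a warning rather than an error, it can be filtered with Python's standard logging module if it becomes noisy. The sketch below rests on an assumption: that snscrape emits the message through a logger named "snscrape.modules.twitter". The warning is simulated here so the example runs without snscrape installed, and `ListHandler` is a hypothetical helper used only to show which messages get through.

```python
import logging

# Assumption: snscrape's Twitter module logs through a logger with this name.
LOGGER_NAME = "snscrape.modules.twitter"


class ListHandler(logging.Handler):
    """Collect log messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger(LOGGER_NAME)
handler = ListHandler()
logger.addHandler(handler)

# Simulate the warning from the thread; snscrape itself is not imported here.
logger.warning(
    "Unsupported unified_card type on tweet %d: %r",
    1587186251718959107,
    "commerce_drop",
)

# Raising the level hides warnings while still letting errors through.
logger.setLevel(logging.ERROR)
logger.warning("this warning is filtered out")

print(handler.messages)
```

Hiding the warning does not change the scraped data: the tweet is still returned, only without the card that could not be parsed.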

@JustAnotherArchivist
Owner

I assume this means you got your code working?

@alice23DS
Author

Yes, thank you. There was an issue with my query: it was not passing the value of account_name[n].
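The thread does not show the corrected code, so the following is only a sketch of one common way a value like account_name[n] fails to reach the query: mixing up iteration by value with iteration by index. The keyword list is an assumed example.

```python
account_name = ["Amex", "AmericanExpress"]  # assumed keyword list

# Pattern 1: iterate over the values directly.
by_value = [k for k in account_name]

# Pattern 2: iterate over (index, value) pairs; account_name[n] equals k.
by_index = [account_name[n] for n, k in enumerate(account_name)]

# Mixing the two fails: here n is a keyword string, not an integer index.
try:
    mixed = [account_name[n] for n in account_name]
except TypeError as exc:
    mixed_error = type(exc).__name__
else:
    mixed_error = None

print(by_value == by_index, mixed_error)
```

With enumerate, either k or account_name[n] can go into the f-string; without it, the loop variable is already the keyword and should be used directly.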

JustAnotherArchivist added the invalid label and removed the bug label Mar 15, 2023
JustAnotherArchivist closed this as not planned Mar 15, 2023