ryanswanstrom/Twitter-User-Analysis

Twitter-User-Analysis

Python code to pull user data from Twitter.

Data files

The first two data files were generated on October 8, 2013.

  1. twitter_user_datascience_data.csv: generated by pulling user accounts from Twitter associated with the query 'datascience'. Note that only about 150 users are active; the remaining users are quite sparse.
  2. twitter_user_data.csv: generated by pulling user accounts from Twitter associated with the query 'nfl'. This data is full and complete.
  3. twitter_user_data_data.csv: generated by pulling user accounts from Twitter associated with the query 'data'. This data is full and complete. (Note: this file was generated on October 10, 2013.)

About the Data files

Each row describes a different Twitter user/account. The columns are listed below.

  1. handle - Twitter username
  2. name - full name of the Twitter user
  3. age - number of days the account has existed on Twitter
  4. num_of_tweets - number of tweets this user has created (includes retweets)
  5. has_profile - 1 if the user has created a profile description, 0 otherwise
  6. has_pic - 1 if the user has set up a profile pic, 0 otherwise
  7. num_following - number of other Twitter users this user is following
  8. num_of_favorites - number of tweets the user has favorited
  9. num_of_lists - number of public lists this user has been added to
  10. num_of_followers - number of other users following this user
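
The column layout above can be read with Python's standard csv module. The following is a minimal sketch, not code from this repository: it assumes the columns appear in the order listed, that the numeric columns hold plain integers, and that the caller knows whether the file starts with a header row (the README does not say). The names load_users, tweets_per_day, and SAMPLE are hypothetical, and the sample row is invented purely for illustration.

```python
import csv
import io

# Column order as listed above. Whether the real files carry a header row
# is an assumption, so the caller states it explicitly.
COLUMNS = [
    "handle", "name", "age", "num_of_tweets", "has_profile", "has_pic",
    "num_following", "num_of_favorites", "num_of_lists", "num_of_followers",
]
NUMERIC = COLUMNS[2:]  # every column except handle and name

# Invented sample data for illustration only (not taken from the data files).
SAMPLE = (
    "handle,name,age,num_of_tweets,has_profile,has_pic,"
    "num_following,num_of_favorites,num_of_lists,num_of_followers\n"
    "example_user,Example User,100,250,1,1,40,5,2,80\n"
)


def load_users(stream, has_header=True):
    """Read user rows from an open text stream into a list of dicts."""
    reader = csv.reader(stream)
    if has_header:
        next(reader, None)  # skip the header row
    users = []
    for row in reader:
        if len(row) != len(COLUMNS):
            continue  # skip blank or malformed rows
        user = dict(zip(COLUMNS, row))
        for key in NUMERIC:
            user[key] = int(user[key])  # assumes integer-valued fields
        users.append(user)
    return users


def tweets_per_day(user):
    """Average number of tweets per day over the account's lifetime."""
    return user["num_of_tweets"] / user["age"] if user["age"] > 0 else 0.0


users = load_users(io.StringIO(SAMPLE))
rate = tweets_per_day(users[0])
```

To read one of the real files instead of the sample, open it with `open("twitter_user_data.csv", newline="")` and pass the file object to `load_users`, setting `has_header` to match the file.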

How to run the code

  1. git clone https://github.com/swGooF/Twitter-User-Analysis.git
  2. cd Twitter-User-Analysis
  3. Open getdata.py and enter your Twitter access_token_key, access_token_secret, consumer_key, and consumer_secret
  4. From a command line, run 'python getdata.py datascience 3'

What are the parameters

  1. A query string: 'datascience' in the example above
  2. The number of pages to return: 3 in the example above. Each page returns 20 users and requires a separate API call; the current Twitter API limit is 180 calls per 15 minutes
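
The arithmetic behind the page parameter can be sketched as follows. This is an illustration under stated assumptions, not part of getdata.py: it assumes exactly one API call per page, that nothing else consumes the rate limit, and that a full page always holds 20 users. The function name plan_request is hypothetical.

```python
import math

USERS_PER_PAGE = 20      # users returned per page, as stated above
CALLS_PER_WINDOW = 180   # API calls allowed per rate-limit window, as stated above
WINDOW_MINUTES = 15      # length of one rate-limit window


def plan_request(pages):
    """Estimate the yield and rate-limit cost of requesting `pages` pages."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    # Upper bound: a query may match fewer users than this.
    max_users = pages * USERS_PER_PAGE
    # Assumes one call per page, so pages map directly onto the call budget.
    windows = math.ceil(pages / CALLS_PER_WINDOW)
    # Waiting is only needed between windows, not after the last one.
    min_wait_minutes = max(windows - 1, 0) * WINDOW_MINUTES
    return {
        "max_users": max_users,
        "windows": windows,
        "min_wait_minutes": min_wait_minutes,
    }


example = plan_request(3)       # the 'python getdata.py datascience 3' case
full_window = plan_request(180)
```

Under these assumptions, the example command requests at most 60 users, and a single 15-minute window allows at most 180 pages, or 3,600 users. How getdata.py itself behaves when the limit is exceeded is not described here.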

Some Analysis

R Code For Numerous Models

Basic iPython Analysis
