slides and code for "Getting an Edge with Network Analysis" talk

alonnir/PyCon-US-2021-Talk

PyCon-US-2021-Talk

Slides and code for "Getting an Edge with Network Analysis" talk at PyCon US 2021.

Interested in learning more about network analysis? Check out SNAcks for a highly curated resource list (or a snack size awesome list).

The video of the talk is now available here.


Take-home assignments

  1. Wikipedia surfing challenge
  • Use Wikipedia's data to build a network in which articles are the nodes and internal links from one article to another are the edges. Calculate measures such as betweenness centrality or PageRank to find important and interesting nodes. What interesting things can you learn about Wikipedia's network?

  • Using the network you built, choose 2 articles on Wikipedia for very different topics (e.g. Python Conference and Hot-air balloon) and find all the paths between them. What can you observe by exploring the different paths?

Data can be found here and some inspiration can be found here.

Edit: Thanks to a kind commenter in PyCon's live chat, I learnt about this fantastic site - Six Degrees of Wikipedia.
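
As a starting point for the Wikipedia challenge, here is a minimal sketch in plain Python. The link list is made up for illustration and is not real Wikipedia data; the helper names `build_graph`, `pagerank` and `simple_paths` are hypothetical, as are the damping factor and the path-length cutoff. On real data a library such as NetworkX (`pagerank`, `betweenness_centrality`, `all_simple_paths`) is likely a better fit.

```python
from collections import defaultdict

# Made-up toy links (source article, linked article), NOT real Wikipedia data.
links = [
    ("Python Conference", "Python (programming language)"),
    ("Python (programming language)", "Netherlands"),
    ("Netherlands", "Hot-air balloon"),
    ("Python Conference", "United States"),
    ("United States", "Aviation"),
    ("Aviation", "Hot-air balloon"),
    ("Hot-air balloon", "Aviation"),
]


def build_graph(edges):
    """Return {article: set of linked articles}; every article appears as a key."""
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)
        graph[dst]  # touch the key so articles with no outgoing links are kept
    return dict(graph)


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank; nodes without out-links spread their rank evenly."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in graph if not graph[node])
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in graph}
        for node, successors in graph.items():
            for succ in successors:
                new_rank[succ] += damping * rank[node] / len(successors)
        rank = new_rank
    return rank


def simple_paths(graph, source, target, cutoff=6):
    """Yield every path from source to target with no repeated node and at most `cutoff` edges."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        if len(path) > cutoff:  # path already uses `cutoff` edges
            continue
        for succ in graph[node]:
            if succ not in path:
                stack.append((succ, path + [succ]))


graph = build_graph(links)
ranks = pagerank(graph)
paths = list(simple_paths(graph, "Python Conference", "Hot-air balloon"))

for article, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{score:.3f}  {article}")
for path in paths:
    print(" -> ".join(path))
```

The number of simple paths grows very quickly with graph size, so on the full Wikipedia graph a small cutoff (or shortest paths only) is probably necessary.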

  2. How would you answer the Biz Dev question? In the talk we introduced a dataset of transactions on a peer-to-peer payments platform. How would you use this data to find merchants/businesses using the platform? Hint: what would you expect the network around merchants to look like, and how would it differ from the networks around "standard" users?
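
One possible reading of the hint, sketched under assumptions: a merchant may look like a "star", paid by many distinct users while paying few users itself. The transactions below are made up, and the function name `merchant_candidates` and both thresholds are hypothetical choices, not values from the talk.

```python
from collections import defaultdict

# Made-up transactions (payer, payee, amount), for illustration only.
transactions = [
    ("alice", "bob", 20.0),
    ("bob", "alice", 15.0),
    ("alice", "coffee_cart", 4.5),
    ("bob", "coffee_cart", 3.0),
    ("carol", "coffee_cart", 5.0),
    ("dave", "coffee_cart", 4.0),
    ("carol", "dave", 30.0),
    ("erin", "coffee_cart", 6.0),
]


def merchant_candidates(txns, min_payers=4, max_payee_ratio=0.25):
    """Flag users paid by many distinct users who pay comparatively few users themselves.

    Both thresholds are arbitrary assumptions and would need tuning on real data.
    """
    payers = defaultdict(set)  # user -> distinct users who paid them
    payees = defaultdict(set)  # user -> distinct users they paid
    for payer, payee, _amount in txns:
        payers[payee].add(payer)
        payees[payer].add(payee)

    candidates = []
    for user, incoming in payers.items():
        outgoing = payees.get(user, set())
        if len(incoming) >= min_payers and len(outgoing) <= max_payee_ratio * len(incoming):
            candidates.append(user)
    return sorted(candidates)


candidates = merchant_candidates(transactions)
print(candidates)
```

Distinct counterparties are only one signal. Whether a user's payers also transact with each other (local clustering) is another feature worth comparing between merchants and "standard" users.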

  3. You work for a social network and want to limit the spread of misinformation.

  • How would you use network analysis to do that effectively?
  • Try to think about what a network where (mis)information travels fast looks like, and how it differs from networks where the spread of (mis)information is slower.
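
To make the comparison concrete, here is a toy sketch under a deliberately crude assumption: in each round, every informed user passes the message to all of their neighbours, so the number of rounds needed to reach everyone equals the breadth-first-search depth from the source. The graphs and the helper names `ring_lattice`, `add_edge` and `rounds_to_reach_all` are hypothetical.

```python
from collections import deque


def ring_lattice(n):
    """Each node is linked only to its two immediate neighbours on a ring."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def add_edge(graph, u, v):
    """Add an undirected edge between u and v."""
    graph[u].add(v)
    graph[v].add(u)


def rounds_to_reach_all(graph, source):
    """Rounds until every node is informed, or None if some node is unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    if len(dist) < len(graph):
        return None
    return max(dist.values())


# "Slow" network: purely local ties.
slow = ring_lattice(20)

# "Fast" network: the same ring plus two long-range shortcuts.
fast = ring_lattice(20)
add_edge(fast, 0, 10)
add_edge(fast, 5, 15)

slow_rounds = rounds_to_reach_all(slow, source=0)
fast_rounds = rounds_to_reach_all(fast, source=0)
print(slow_rounds, fast_rounds)
```

In this toy model a couple of long-range ties are enough to cut the number of rounds substantially, which suggests looking at shortcut edges and highly connected accounts when deciding where to intervene.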
  4. I Want Out is a community on Reddit where post titles state the OP's country of origin and desired relocation country. For example: [IWantOut] 23M USA-> Netherlands.
  • Get your hands on a data set of posts from the subreddit. Fortunately, it's relatively easy to collect that data.
  • Use the dataset you obtained to create a directed network. What interesting things can you learn from the network? Which are the top countries users want to leave? Which countries are the most desired locations?
  • Extra points if you add age, gender or occupation (which are often included in the title of the post) as node attributes.
  • Finally, think about the limitations of such an analysis. Is the data biased in any way (or in multiple ways)?
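
A rough sketch of the I Want Out steps, assuming titles follow the pattern of the single example above. Apart from that example, the titles below are made up; the regular expression and the `parse_title` helper are hypothetical, and real titles will be messier (multiple destinations, missing age, free text).

```python
import re
from collections import Counter

# Assumed title pattern: "[IWantOut] <age><gender> <origin> -> <destination>".
TITLE_RE = re.compile(
    r"\[IWantOut\]\s*(?P<age>\d{1,2})(?P<gender>[A-Za-z])?\s+"
    r"(?P<origin>.+?)\s*->\s*(?P<dest>.+)",
    re.IGNORECASE,
)

# The first title is the example from the assignment; the rest are made up.
titles = [
    "[IWantOut] 23M USA-> Netherlands",
    "[IWantOut] 31F Brazil -> Canada",
    "[IWantOut] 27M USA -> Canada",
    "[IWantOut] 19F India -> Netherlands",
    "Weekly discussion thread",  # does not match and is skipped
]


def parse_title(title):
    """Return a dict with age, gender, origin and dest, or None if the title does not match."""
    match = TITLE_RE.search(title)
    if match is None:
        return None
    gender = match.group("gender")
    return {
        "age": int(match.group("age")),
        "gender": gender.upper() if gender else None,
        "origin": match.group("origin").strip(),
        "dest": match.group("dest").strip(),
    }


posts = [p for p in map(parse_title, titles) if p is not None]

# Directed, weighted edges: (origin, destination) -> number of posts.
edges = Counter((p["origin"], p["dest"]) for p in posts)

# Weighted out-degree: how often a country appears as a place to leave.
leaving = Counter()
# Weighted in-degree: how often a country appears as a desired destination.
desired = Counter()
for (origin, dest), count in edges.items():
    leaving[origin] += count
    desired[dest] += count

print(leaving.most_common(3))
print(desired.most_common(3))
```

The share of titles that fail to parse is itself worth reporting, since dropped posts are one source of the bias the last bullet asks about.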
