Analyzing 30 years of academic themes in arXiv research using clustering techniques


paumartinez1/arxiv-clustering


Authors

Paula Martinez, Joshua Simon

Project Description

This project explores the evolution of academic research themes across multiple disciplines over the past 30 years, from 1995 to 2024. By applying unsupervised clustering techniques to a large corpus of preprint articles from the arXiv repository, the study uncovers patterns of thematic convergence, divergence, and interdisciplinary interactions within fields such as mathematics, physics, computer science, and statistics.
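The description above does not say which text representation or clustering algorithm the project uses. As a rough illustration of what "unsupervised clustering of abstracts" can look like, here is a minimal sketch that assumes TF-IDF vectors and k-means (Lloyd's algorithm), written with only the standard library. The toy `DOCS` list, the function names, and the deterministic farthest-first initialisation are all assumptions made for this example, not details taken from the repository.

```python
import math
from collections import Counter

# Toy stand-ins for arXiv abstracts (hypothetical, not project data).
DOCS = [
    "bayesian inference posterior estimation regression",
    "regression estimation variance bayesian sampling",
    "neural network training deep learning gradient",
    "deep learning gradient descent neural optimization",
]


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalised TF-IDF vectors) for whitespace-tokenised docs."""
    tokenised = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenised for t in doc})
    n = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for doc in tokenised for t in set(doc))
    # A common smoothed IDF variant (assumed here for illustration).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in tokenised:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, n_iter=20):
    """Lloyd's algorithm with a deterministic farthest-first initialisation."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest already-chosen centroid.
        far = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(far)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


vocab, vectors = tfidf_vectors(DOCS)
labels = kmeans(vectors, k=2)
print(labels)
```

On this toy input the two statistics-flavoured abstracts land in one cluster and the two deep-learning abstracts in the other. A real run over 30 years of arXiv abstracts would more likely rely on a library implementation and on proper tokenisation and stop-word handling.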

Key Takeaways

  1. The clustering techniques produced thematically coherent but potentially granular subgroups within the broader mathematics, statistics, and computer science themes, as indicated by domain-specific terms.

  2. While the 250-sample case had balanced cluster distributions, the 500-sample case showed an asymmetry, suggesting more detailed subgroups within the larger statistical methods/theory cluster.

  3. Statistics remained a consistently relevant theme, with shifts between theory and applications. Computer science exhibited a more transformative evolution, driven by technological advances, its interdisciplinary nature, and practical applications.
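The first two takeaways rest on two simple diagnostics: which terms dominate each cluster, and how evenly documents are spread across clusters. The sketch below shows one way such checks could be computed once cluster labels exist. The `labels` and `docs` values are made-up illustrations, and `cluster_shares`, `imbalance_ratio`, and `top_terms` are hypothetical helper names, not functions from the repository.

```python
from collections import Counter

# Hypothetical cluster assignments and abstracts, for illustration only.
labels = [0, 0, 0, 1, 1, 2]
docs = [
    "regression variance estimation",
    "regression bayesian posterior",
    "regression estimation sampling",
    "neural network gradient",
    "neural network training",
    "manifold topology",
]


def cluster_shares(labels):
    """Fraction of all documents that falls into each cluster."""
    counts = Counter(labels)
    return {c: n / len(labels) for c, n in sorted(counts.items())}


def imbalance_ratio(labels):
    """Largest cluster size divided by smallest; 1.0 means perfectly balanced."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def top_terms(docs, labels, n=2):
    """The n most frequent terms per cluster, using raw term counts."""
    per_cluster = {}
    for doc, lab in zip(docs, labels):
        per_cluster.setdefault(lab, Counter()).update(doc.split())
    return {
        c: [term for term, _ in counts.most_common(n)]
        for c, counts in sorted(per_cluster.items())
    }


shares = cluster_shares(labels)
ratio = imbalance_ratio(labels)
terms = top_terms(docs, labels)
print(shares, ratio, terms)
```

An imbalance ratio near 1 corresponds to the balanced situation described for the 250-sample case, while a larger ratio would flag the kind of asymmetry reported for the 500-sample case.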
