Short Description
The star count of this package's GitHub repository may have been artificially inflated (by bots, crowdsourcing, etc.).
Suggestion
This could be a sign of spam, fraud, or even a supply chain attack. The package should be carefully reviewed before installing.
Information
Suspicious Stars on GitHub is a high-severity alert in the supply chain risk category. Using the number of GitHub stars as a metric for supply chain security is not always reliable, as this popularity metric can be corrupted. There are multiple GitHub star black markets where people can purchase stars to artificially inflate this metric.
Our research has determined that fake GitHub stars are frequently associated with scams, fraud, and malicious activity. We identified 3,746,538 suspected fake stars in the last five years (July 2019 to July 2024) and 10,155 repositories that appear to have run a fake star campaign. The number of suspected fake stars has grown rapidly in the last six months.
Recommended actions
If you find a dependency flagged with the "Suspicious GitHub Stars" alert from Socket, here are the recommended actions:
- Investigate the Package: Review the package's activity, issues, and community involvement. Look for signs of unusual or automated behavior.
- Check Alternatives: Consider using an alternative package with a more transparent and trustworthy reputation.
- Contact the Maintainer: If you're unsure, reach out to the maintainer for clarification on the legitimacy of the stars.
- Report Suspicious Activity: If you believe the package is compromised, report it to GitHub or the relevant platform.
These steps can help protect your project from potential risks associated with fake stars and malicious activity.
Examples
Packages flagged with this alert link to the package overview page. The alert also gives an estimate of the percentage of suspicious stars.
Detection Method
This alert employs two heuristics:
- Low Activity Heuristic: Some fake star merchants use scripts to mass-register one-time, throwaway accounts to deliver fake stars. Inspired by Dagster's detector, we designed a similar heuristic to find users who registered, starred a repository, and then became inactive on the same day.
- Clustering Heuristic: Fake star merchants usually assure their clients that purchased stars will be delivered within a very short time, and some of them reuse the accounts they already control to star many repositories. This business model leaves “cluster-like” patterns that are extremely hard to hide and extremely rare among real users. Mathematically, this corresponds to a heuristic that finds clusters of N users and M repositories in which each repository received stars from at least P% of the N users within a short time period ∆t. Facebook uses this heuristic to detect fake likes. It is equivalent to the maximal biclique enumeration problem, which is NP-hard. Facebook's algorithm, CopyCatch, is designed to find local optima on an extremely large-scale dataset in a distributed system. The original implementation uses the MapReduce framework and is not open source, but we replicated the same algorithm on the GHArchive dataset stored in Google BigQuery.
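To make the two heuristics more concrete, the following Python sketch shows one way the underlying conditions could be checked on a small in-memory dataset. It is an illustration only, not the detector's actual implementation and not CopyCatch itself: the function names, the data shapes, and the choice to give each repository its own ∆t window are assumptions made for this example. The second function only verifies the (N, M, P, ∆t) condition for a candidate cluster; it does not search for clusters.

```python
import math
from datetime import datetime, timedelta


def is_low_activity(event_times: list[datetime]) -> bool:
    """Hypothetical low-activity check: every recorded event of the account
    (including the star itself) falls on a single calendar day."""
    return len(event_times) > 0 and len({t.date() for t in event_times}) == 1


def satisfies_cluster_condition(
    stars: dict[tuple[str, str], datetime],  # (user, repo) -> time of the star
    users: list[str],                        # candidate cluster of N users
    repos: list[str],                        # candidate cluster of M repositories
    p: float,                                # required fraction of users (P% / 100)
    delta_t: timedelta,                      # length of the time window
) -> bool:
    """Check whether every repo in `repos` was starred by at least a fraction
    `p` of `users` within some window of length `delta_t`.
    Assumption: each repository may have its own window."""
    needed = math.ceil(p * len(users))
    for repo in repos:
        times = sorted(stars[(u, repo)] for u in users if (u, repo) in stars)
        best = 0
        left = 0
        # Sliding window over sorted star times: track the largest number
        # of stars that fit inside any span of length delta_t.
        for right in range(len(times)):
            while times[right] - times[left] > delta_t:
                left += 1
            best = max(best, right - left + 1)
        if best < needed:
            return False
    return True
```

In practice the hard part is finding candidate user and repository sets at scale, which is what the replicated CopyCatch algorithm addresses; a check like this is only useful for validating a cluster once it has been proposed.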
Additional resources
3.7 Million Fake GitHub Stars: A Growing Threat Linked to Scams and Malware
The GitHub Black Market That Helps Coders Cheat the Popularity Contest