📦 Discover the Package on TestPyPI: git_cluster Package
🔍 Dive Deeper in Our GitHub Repository: Git-Clustering GitHub Repo
This repository introduces an enhanced version of the GIT (Graph of Intensity Topology) clustering algorithm. It has been augmented with additional methods and repackaged for ease of use, and it includes benchmarks that demonstrate its performance. 🚀
- Broad Applicability: Tested across a variety of datasets. 🌍 (See the benchmarks in notebooks/.)
- User-friendly Packaging: Simplified integration into your projects. 📦
To get started, explore the notebooks/Quick_Start_with_GIT.ipynb notebook for a step-by-step guide on applying this algorithm to your data.
To validate the installation and functionality of the GIT Clustering package, you can either run the steps manually following the instructions below or click the Open in Colab button to open a Colab notebook where everything is set up for you.
Follow these steps to manually install the GIT Clustering package and test its functionality:
- Install the GIT Clustering package from TestPyPI and upgrade gdown for dataset downloading:

      !pip install -i "https://test.pypi.org/simple/" git_cluster
      !pip install -U gdown
- Download the datasets and prepare them for use:

      !gdown 1yNwCStP3Sdf2lfvNe9h0WIZw2OQ3O2UP && unzip datasets.zip
- Execute a sample clustering process:

      from git_cluster import GIT
      from utils import alignPredictedWithTrueLabels, autoPlot
      from dataloaders import Toy_DataLoader

      # Load the Circles Dataset
      X_circles, Y_circles_true = Toy_DataLoader(name='circles', path="/content/datasets/toy_datasets").load()

      # Create an instance of the GIT clustering
      git = GIT(k=12, target_ratio=[1, 1])

      # Fit the GIT model to the dataset and predict cluster labels.
      Y_circles_pred = git.fit_predict(X_circles)

      # Plot the dataset and highlight the clusters with different colors.
      autoPlot(X_circles, Y_circles_pred)
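The sample above loads the true labels but only plots the predictions. As a rough way to compare the two, the sketch below aligns predicted cluster ids with true labels by majority vote and reports the fraction of matching points. It is a simplified, standard-library stand-in and not the package's `alignPredictedWithTrueLabels`, whose signature and behavior are not documented here. The function names `align_by_majority_vote` and `accuracy` and the toy labels are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def align_by_majority_vote(y_true, y_pred):
    """Map each predicted cluster id to the most common true label inside it.

    Hypothetical helper for illustration; not part of git_cluster.
    """
    members = defaultdict(list)
    for true_label, cluster_id in zip(y_true, y_pred):
        members[cluster_id].append(true_label)
    # For every cluster, pick the true label that occurs most often in it.
    mapping = {
        cluster_id: Counter(labels).most_common(1)[0][0]
        for cluster_id, labels in members.items()
    }
    return [mapping[cluster_id] for cluster_id in y_pred]


def accuracy(y_true, y_pred_aligned):
    """Fraction of points whose aligned label equals the true label."""
    y_true = list(y_true)
    matches = sum(t == p for t, p in zip(y_true, y_pred_aligned))
    return matches / len(y_true)


# Toy labels standing in for Y_circles_true / Y_circles_pred (made up, not GIT output).
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [5, 5, 5, 2, 2, 5]

aligned = align_by_majority_vote(y_true, y_pred)
print(aligned)
print(round(accuracy(y_true, aligned), 3))
```

Assuming the labels from the sample are one-dimensional sequences, the same call could be tried as `align_by_majority_vote(Y_circles_true, Y_circles_pred)`. Note that majority voting may map several clusters to the same true label, so it is more lenient than a one-to-one matching.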
- We extend our thanks to the original authors of the GIT algorithm for their foundational work in "Clustering Based on Graph of Intensity Topology": Gao, Zhangyang and Lin, Haitao and Tan, Cheng and Wu, Lirong and Li, Stan and others.
If you use the GIT Clustering algorithm in your research or project, please consider citing the original work:
@article{gao2021git,
title={Git: Clustering Based on Graph of Intensity Topology},
author={Gao, Zhangyang and Lin, Haitao and Tan, Cheng and Wu, Lirong and Li, Stan and others},
journal={arXiv preprint arXiv:2110.01274},
year={2021}
}