A Python 3 package for running, visualizing, and producing animations of t-distributed Stochastic Neighbor Embedding (t-SNE), implemented in C++. The code is a modified version of bhtsne, taken from Laurens van der Maaten's repository. The package implements t-SNE as a class following the sklearn syntax.
- Clone or download this repository
- Open file
- Compile C++ code and install the package with the following commands:
I suggest you install the code using pip from an Anaconda Python 3 environment. From that environment:
git clone https://github.com/alexandreday/tsne_visual.git
cd tsne_visual
g++ cpp/sptree.cpp cpp/tsne.cpp -o tsne_visual/bh_tsne -O2
pip install .
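After running the commands above, a quick check like the following can confirm that the package is visible to Python. This is only a sketch using the standard library: the helper name `is_installed` is made up for illustration, and the only thing taken from this README is the package name tsne_visual.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if `package_name` can be found on the current import path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "tsne_visual" is the package name used in this README.
    name = "tsne_visual"
    status = "found" if is_installed(name) else "NOT found"
    print(f"{name}: {status}")
```

If the package is reported as not found, the most likely cause is that pip installed it into a different environment than the one running the check.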
That's it, you're good to go! You can now import tsne_visual from anywhere. See the following example for a quick start.
For an example, look at example/example.py. This is an example of t-SNE applied to the MNIST data set (provided in example/MNIST/).
The syntax used is very similar to sklearn syntax.
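For readers unfamiliar with the sklearn convention, here is a minimal sketch of that pattern using scikit-learn's own `sklearn.manifold.TSNE` on random toy data. It does not call tsne_visual: the class name and parameters exposed by this package should be taken from example/example.py, not from this snippet.

```python
import numpy as np
from sklearn.manifold import TSNE

# Toy data: 60 samples with 10 features (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))

# sklearn convention: configure the model in the constructor,
# then call fit_transform to obtain the low-dimensional embedding.
model = TSNE(n_components=2, perplexity=10.0, random_state=0)
X_embedded = model.fit_transform(X)

print(X_embedded.shape)  # one 2D point per input sample
```

The perplexity must be smaller than the number of samples, which is why a small value is used for this toy data set.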
Running example/example.py should produce a figure similar to this:
- g++ compiler (for the C++ code)
- Python 3.x
- ffmpeg software (optional - for producing animations)
- scikit-learn package
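The external tools in the list above can be checked before compiling. The sketch below is an assumption-laden convenience, not part of the package: the function name `check_requirements` is hypothetical, and it only tests whether `g++` and `ffmpeg` are on the `PATH` and whether the running interpreter is Python 3.

```python
import shutil
import sys


def check_requirements() -> dict:
    """Map each requirement to True/False depending on whether it was found."""
    return {
        "python3": sys.version_info.major == 3,
        "g++": shutil.which("g++") is not None,        # needed to compile the C++ code
        "ffmpeg": shutil.which("ffmpeg") is not None,  # optional, only for animations
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name:8s} {'ok' if ok else 'missing'}")
```

A missing `ffmpeg` only matters if you want animations; a missing `g++` means the compile step will fail.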
During the course of a research project I ended up using t-SNE quite a bit for large datasets (N > 20000). I wanted something easy to use (i.e. written in Python, with easy access to all of the t-SNE parameters) but also very fast (i.e. with C/C++ speed). I also wanted to produce animations of the t-SNE embedding as a function of the iterations. To achieve all this I ended up combining code from multiple sources and writing a bit of code myself. I thought this might be useful for other people too.
- t-SNE's original author website
- sklearn t-SNE
- Google's embedding projector
- How to use t-SNE effectively!
pip3 uninstall tsne_visual