Autoencoder dimensionality reduction, EMD-Manhattan metrics comparison, and classifier-based clustering on the MNIST dataset.
Updated Mar 5, 2021 - C++
This repo presents the research paper I worked on during my summer research internship in 2022.
Implementation of the algorithms presented in the course Analiza velikih skupova podataka (AVSP, Analysis of Massive Datasets)
TTAK.KO-12.0276 LSH Recursive Hasher
The assignment comprises two main tasks: implementing LSH to identify similar businesses based on user ratings and developing various collaborative filtering recommendation systems to predict user ratings for businesses.
Coursera's Natural Language Processing specialization
Explored Jaccard distance, Min-Hashing, and LSH for user similarity in a movie rating dataset. Tasks involve dataset preprocessing, exact Jaccard similarity computation, Min-Hash signatures, and LSH implementation. Results and observations are documented in code, output files, and a report.
Software Development for Algorithmic Problems (UoA) Assignments
Scaling Up Nearest Neighbor Search: How Dataset Size and Dimensionality Affect KNN Variants
Implementation of algorithms for big data using Python, NumPy, and pandas.
📈 | Time series: nearest neighbor search and clustering using the LSH and Hypercube algorithms (plus Lloyd's for clustering only), with the L2, discrete Fréchet, and continuous Fréchet metrics.
This repo aims to implement a modular engine for Locality-Sensitive Hashing (LSH).
Projects involving Frequent Itemset Mining and analysis of hierarchical space partitioning techniques
This repository contains simple and fun Data Mining projects in Python.
Finding Similar Items: Textually Similar Documents
MDLE First Assignment - The objective of this project was to implement the A-Priori algorithm to find the most frequent itemsets in a list of conditions for a large set of patients, to extract association rules between those conditions, and to implement and apply LSH to identify similar news articles in a dataset.
Repository for all assignments of the course COL761: Data Mining (Fall 2020), taught at IIT Delhi
Implementing Locality Sensitive Hashing for DNA Sequences.
Finding similar documents using LSH with MapReduce on a multi-node Spark cluster
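Several of the projects listed above combine exact Jaccard similarity, Min-Hash signatures, and LSH banding. The following is a minimal Python sketch of that pipeline, not taken from any of the listed repositories. The function names (`jaccard`, `make_hashes`, `minhash_signature`, `lsh_candidates`), the linear hash family, and the parameter values are illustrative assumptions.

```python
import random
import zlib
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Exact Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def make_hashes(num_hashes, seed=0, prime=2_147_483_647):
    """Draw (a, b, p) parameters for hash functions h(x) = (a*x + b) mod p.

    This linear family is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    return [(rng.randrange(1, prime), rng.randrange(0, prime), prime)
            for _ in range(num_hashes)]


def _stable_id(item):
    # crc32 gives the same integer across runs, unlike str hash().
    return zlib.crc32(str(item).encode("utf-8"))


def minhash_signature(items, hashes):
    """Min-Hash signature of a non-empty set: one minimum per hash function."""
    ids = [_stable_id(x) for x in items]
    return [min((a * i + b) % p for i in ids) for a, b, p in hashes]


def lsh_candidates(signatures, bands, rows):
    """Return pairs of keys whose signatures agree on at least one band."""
    buckets = defaultdict(set)
    for key, sig in signatures.items():
        if len(sig) != bands * rows:
            raise ValueError("signature length must equal bands * rows")
        for band in range(bands):
            chunk = tuple(sig[band * rows:(band + 1) * rows])
            buckets[(band, chunk)].add(key)
    pairs = set()
    for members in buckets.values():
        for u, v in combinations(sorted(members), 2):
            pairs.add((u, v))
    return pairs


if __name__ == "__main__":
    # Hypothetical data: each user maps to the set of movie IDs they rated.
    ratings = {"u1": {1, 2, 3, 4}, "u2": {1, 2, 3, 5}, "u3": {7, 8, 9}}
    hashes = make_hashes(num_hashes=20, seed=42)
    sigs = {u: minhash_signature(m, hashes) for u, m in ratings.items()}
    print(lsh_candidates(sigs, bands=10, rows=2))
    print(jaccard(ratings["u1"], ratings["u2"]))
```

Candidate pairs from the banding step are only probable matches, so a real pipeline would typically verify them with the exact `jaccard` function. More rows per band makes the filter stricter, while more bands makes it more permissive.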