Descriptive Modeling and Clustering of Textual Data

Instructions to run the code (please use the Eclipse IDE):

git clone [email protected]:ssnap03/bds.git

Open Eclipse -> File -> Import project from file system or archive -> browse to the directory named bds where you cloned the repository -> open it and click Finish.
Eclipse will set up the repository as a Maven project and download the dependencies.
Once the dependencies are resolved, navigate to com.bds.textmining.driver.Driver.java in the file explorer in the left pane, right-click the class, and choose Run As -> Java Application.
The output is displayed on the console, a text file called topics.txt containing the extracted topics is generated in the src directory, and the cluster plots are rendered.
Please expect a small delay before the output renders, since computing the TF-IDF matrix and running the clustering are somewhat computationally expensive.
For reference, screenshots of the output are available in the output directory of this repo, along with the generated topics.txt file containing the topic modeling output.
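
For readers unfamiliar with the TF-IDF matrix mentioned above, here is a minimal, self-contained sketch of one common variant: term frequency normalized by document length, multiplied by log(N / document frequency). It is not taken from this repository. The class name TfIdfSketch, the method tfIdf, and the toy documents are hypothetical, and the weighting used by the actual Driver may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Illustrative only: a hypothetical helper, not the repository's implementation.
public class TfIdfSketch {

    // Returns one sparse row (term -> weight) per tokenized input document.
    public static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        // Document frequency: number of documents that contain each term.
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }

        List<Map<String, Double>> rows = new ArrayList<>();
        for (List<String> doc : docs) {
            Map<String, Double> row = new HashMap<>();
            // Raw term counts within this document.
            Map<String, Integer> counts = new HashMap<>();
            for (String term : doc) {
                counts.merge(term, 1, Integer::sum);
            }
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                double tf = (double) e.getValue() / doc.size();
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                row.put(e.getKey(), tf * idf);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Toy, pre-tokenized documents (assumed input shape).
        List<List<String>> docs = Arrays.asList(
                Arrays.asList("text", "mining", "data"),
                Arrays.asList("text", "clustering"),
                Arrays.asList("topic", "modeling", "text"));
        List<Map<String, Double>> rows = tfIdf(docs);
        for (int i = 0; i < rows.size(); i++) {
            System.out.println("doc " + i + " -> " + rows.get(i));
        }
    }
}
```

In this variant a term that appears in every document (here "text") gets weight 0, so it contributes nothing to distances between documents during clustering. Building the document-frequency table requires a full pass over the corpus before any row can be weighted, which is part of why this step takes noticeable time on larger inputs.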

Contributors:

Abhishek Narayanan & Anirudh Nistala
