ssnap03/data_science

Descriptive Modeling and Clustering of Textual Data

Instructions to run the code (please use the Eclipse IDE):

  1. Clone this repository to your local machine:

git clone git@github.com:ssnap03/bds.git

  2. Open Eclipse -> File -> Import project from file system or archive -> browse to the directory named bds where you cloned the repository -> open it and click Finish.

  3. Eclipse will set up the repository as a Maven project and download the dependencies.

  4. Once the dependencies are resolved, navigate to com.bds.textmining.driver.Driver.java in the file explorer in the left pane, right-click the class, and choose Run As -> Java Application.

  5. The output is displayed on the console, a text file called topics.txt containing the extracted topics is generated in the src directory, and the cluster plots are rendered.

  6. Expect a short delay before the output renders, because computing the TF-IDF matrix and running the clustering are somewhat computationally expensive.

  7. Screenshots of the output are available in the output directory of this repo for reference, along with the generated topics.txt file containing the output of the topic modeling.
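For readers wondering why the TF-IDF step mentioned in the list above takes a moment, here is a minimal sketch of how such a matrix can be computed. It is not the code in Driver.java: the class name TfIdfSketch, the method tfIdf, and the weighting used (tf = count / document length, idf = ln(N / document frequency)) are assumptions made for illustration, and the documents are assumed to be tokenized already.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the repository's implementation.
final class TfIdfSketch {

    // Returns one term -> weight map per document, assuming
    // tf = count / document length and idf = ln(N / document frequency).
    static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        // Document frequency: number of documents that contain each term.
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }

        List<Map<String, Double>> weights = new ArrayList<>();
        for (List<String> doc : docs) {
            // Raw term counts for this document.
            Map<String, Integer> counts = new HashMap<>();
            for (String term : doc) {
                counts.merge(term, 1, Integer::sum);
            }
            // One sparse row of the TF-IDF matrix.
            Map<String, Double> row = new HashMap<>();
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                double tf = (double) e.getValue() / doc.size();
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                row.put(e.getKey(), tf * idf);
            }
            weights.add(row);
        }
        return weights;
    }

    public static void main(String[] args) {
        // Two tiny made-up documents, already split into tokens.
        List<List<String>> docs = List.of(
                List.of("data", "mining", "data"),
                List.of("mining", "text"));
        System.out.println(tfIdf(docs));
    }
}
```

Every document is scanned twice and one weight is stored per distinct term per document, so the cost grows with both corpus size and vocabulary size. A term that appears in every document gets a weight of zero under this weighting.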

Contributors:

Abhishek Narayanan & Anirudh Nistala
