Basic text analytics demo using Gutenberg Project data.

mtzmonica/Text-Analytics-Demo

Text Analytics I

Implementation of various basic text analytics techniques and algorithms.

Implemented Techniques/Algorithms:

  • Data Scraping
  • Normalization/Preprocessing
    • Tokenization
  • Vectorization: Feature Extraction
    • Bag of Words
    • TF-IDF
  • Unsupervised Learning:
    • Document Similarity
      • Cosine Similarity
    • Document Clustering Algorithms
  • Supervised Learning:
    • Classification Algorithms
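
To make the tokenization, bag-of-words, TF-IDF, and cosine-similarity steps listed above concrete, here is a minimal standard-library sketch. It is not the repository's code: the function names, the regex tokenizer, the plain `tf * log(N / df)` weighting, and the three sample sentences are all assumptions chosen for illustration. Libraries such as scikit-learn use smoothed IDF variants, so their numbers will differ.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(tokens):
    """Map each term to its raw count in one document."""
    return Counter(tokens)


def tf_idf(docs_tokens):
    """Return one sparse {term: weight} vector per document.

    Assumed weighting: (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))  # count each term once per document

    vectors = []
    for tokens in docs_tokens:
        counts = bag_of_words(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Illustrative sentences, not taken from the repository's data.
docs = [
    "The whale swam across the sea",
    "The whale dived into the deep sea",
    "Alice followed the rabbit down the hole",
]
tokens = [tokenize(d) for d in docs]
vectors = tf_idf(tokens)
sim_whales = cosine_similarity(vectors[0], vectors[1])
sim_mixed = cosine_similarity(vectors[0], vectors[2])
```

Because "the" occurs in every sample sentence, its IDF is zero under this weighting, so the first and third sentences share no weighted terms and `sim_mixed` comes out as zero, while the two whale sentences score above zero.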

Visualizations

  • Word Clouds: simple term-frequency distributions built with NLTK and the word_cloud library (data: Project Gutenberg).
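
A word cloud sizes each word by how often it occurs, so the underlying data is a term-frequency table. The sketch below computes one with the standard library only. The tiny stopword set, the `term_frequencies` helper, and the sample sentence are hypothetical stand-ins, not the repository's NLTK-based code.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword set; a real run would likely
# use a fuller list such as the one shipped with NLTK.
STOPWORDS = {"the", "a", "and", "of", "to", "in", "it", "was"}


def term_frequencies(text, top_n=5):
    """Return the top_n (term, count) pairs after dropping stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)


sample = "It was the best of times, it was the worst of times"
top_terms = term_frequencies(sample)
```

A mapping like `dict(top_terms)` is the kind of input that `WordCloud.generate_from_frequencies` in the word_cloud library accepts, assuming that library is installed.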
