
Houssem-Ousji/Text-Analyzer-using-NLTK


Text-Analyzer-using-NLTK

A Python script (written as a class assignment) made up of two parts:

The indexing phase:

  • word tokenization
  • line tokenization
  • deleting stop words
  • word stemming
  • word lemmatisation
  • word labeling
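
As a rough illustration of the indexing steps listed above, here is a minimal sketch that uses only NLTK components that need no downloaded data. It is not the repository's actual code: the `index_text` function and the tiny `STOP_WORDS` set are hypothetical names introduced for this example. Lemmatisation and labeling are left out, on the assumption that they would rely on NLTK tools such as `WordNetLemmatizer` and `nltk.pos_tag`, which require extra resources to be downloaded first.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import LineTokenizer, RegexpTokenizer

# Hypothetical, deliberately tiny stop-word list for illustration only.
# A real run would more likely use nltk.corpus.stopwords (needs a download).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

word_tokenizer = RegexpTokenizer(r"\w+")  # word tokenization
line_tokenizer = LineTokenizer()          # line tokenization, blank lines dropped
stemmer = PorterStemmer()                 # word stemming


def index_text(text):
    """Return one list of lowercased, stop-word-free, stemmed tokens per line."""
    indexed = []
    for line in line_tokenizer.tokenize(text):
        words = [w.lower() for w in word_tokenizer.tokenize(line)]
        kept = [w for w in words if w not in STOP_WORDS]
        indexed.append([stemmer.stem(w) for w in kept])
    return indexed
```

Calling `index_text("The cats are running\n\nin the garden")` returns one token list for each non-blank line of the input.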

The search phase:

  • getting the list of documents containing a given word
  • getting the number of occurrences of a given word in each returned document
  • getting the weight of a given word in each returned document
  • getting the tf-idf of a given word in each returned document
  • getting the most relevant document for a given word
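
The search operations above could be sketched as follows, using only the standard library. This is an illustration under stated assumptions, not the repository's implementation: `search` and `most_relevant` are hypothetical names, the "weight" is assumed to be the term frequency (occurrences divided by document length), and the idf is assumed to be the common `log(N / df)` variant.

```python
import math
from collections import Counter


def search(word, docs):
    """Look up `word` in `docs`, a dict of document name -> list of indexed tokens.

    Returns a dict with one entry per document containing the word, holding
    its occurrence count, its weight and its tf-idf score.
    """
    counts = {name: Counter(tokens)[word] for name, tokens in docs.items()}
    hits = {name: c for name, c in counts.items() if c > 0}
    if not hits:
        return {}
    # Assumed idf variant: log(total documents / documents containing the word).
    idf = math.log(len(docs) / len(hits))
    results = {}
    for name, count in hits.items():
        weight = count / len(docs[name])  # assumed definition of "weight"
        results[name] = {"count": count, "weight": weight, "tfidf": weight * idf}
    return results


def most_relevant(word, docs):
    """Return the name of the best-scoring document, or None if no document matches."""
    results = search(word, docs)
    if not results:
        return None
    # Ties on tf-idf (e.g. when idf is 0) are broken by the weight.
    return max(results, key=lambda n: (results[n]["tfidf"], results[n]["weight"]))
```

With this idf variant, a word that appears in every document gets a tf-idf of 0 everywhere, which is why the sketch falls back on the weight to rank documents in that case.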
