Awesome NLP Projects for Persian and English

Welcome to the "awesome-nlp" repository! It collects Natural Language Processing (NLP) and Information Retrieval projects that address problems in both Persian and English. Below is a brief overview of each project, along with links to their respective repositories.

Projects

The QA (Question Answering) project classifies news stories and articles into thematic categories: given a document, the model predicts its subject category. The dataset covers seven thematic categories. The task is tackled with Hidden Markov Models (HMMs) and a transformer model. The fine-tuned transformer model is available on Hugging Face.
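One common way to use HMMs for document classification is to keep one HMM per category and pick the category whose model assigns the document the highest likelihood. The sketch below illustrates that idea with the forward algorithm. It is not necessarily how this project does it: the two categories, the hidden states, and every probability are invented for illustration, and the names `forward_likelihood`, `classify`, and `MODELS` are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit, floor=1e-6):
    """Return P(obs | HMM) using the forward algorithm.

    Words missing from a state's emission table get the small
    probability `floor` (an assumption made for this sketch).
    """
    if not obs:
        raise ValueError("obs must contain at least one token")
    states = list(start)
    # Initialisation: start probability times emission of the first word.
    alpha = {s: start[s] * emit[s].get(obs[0], floor) for s in states}
    # Recursion: sum over predecessor states, then emit the next word.
    for word in obs[1:]:
        alpha = {
            s: sum(alpha[p] * trans[p][s] for p in states) * emit[s].get(word, floor)
            for s in states
        }
    return sum(alpha.values())


def classify(tokens, models):
    """Pick the label whose HMM gives the tokens the highest likelihood."""
    return max(models, key=lambda label: forward_likelihood(tokens, *models[label]))


# Hypothetical hand-set parameters; a real system would learn these from data.
START = {"a": 0.6, "b": 0.4}
TRANS = {"a": {"a": 0.7, "b": 0.3}, "b": {"a": 0.4, "b": 0.6}}
MODELS = {
    "sports": (START, TRANS, {
        "a": {"match": 0.5, "team": 0.4, "market": 0.1},
        "b": {"goal": 0.6, "team": 0.3, "price": 0.1},
    }),
    "economy": (START, TRANS, {
        "a": {"market": 0.5, "price": 0.4, "team": 0.1},
        "b": {"bank": 0.6, "price": 0.3, "goal": 0.1},
    }),
}

print(classify("team goal match".split(), MODELS))
print(classify("market price bank".split(), MODELS))
```

For long documents, multiplying raw probabilities underflows, so a practical version would work in log space or rescale `alpha` at each step.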

This project detects and corrects bias in language models for both English and Persian. Bias in a machine learning model can skew its decisions; this project addresses bias related to race and gender. Any language model, such as BERT, can be used for this task.
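A simple way to probe for this kind of bias is to fill the same sentence template with terms for two groups and compare the scores the model assigns. The sketch below shows only the comparison logic and may differ from this project's method. `toy_score` is a hypothetical stand-in with invented numbers so the example runs offline; in practice the score would come from a language model such as BERT. The templates and the names `bias_gap` and `TEMPLATES` are also assumptions.

```python
# Hypothetical probe templates; [TARGET] is replaced by a group term.
TEMPLATES = ["[TARGET] is a doctor", "[TARGET] is a nurse"]


def bias_gap(template, score_fn, term_a="he", term_b="she"):
    """Score of the template filled with term_a minus the score with term_b.

    A gap near zero suggests the model treats both terms alike for this
    template; a large positive or negative gap suggests a skew.
    """
    score_a = score_fn(template.replace("[TARGET]", term_a))
    score_b = score_fn(template.replace("[TARGET]", term_b))
    return score_a - score_b


def toy_score(sentence):
    """Stand-in for a model probability. All numbers are invented."""
    table = {
        "he is a doctor": 0.7,
        "she is a doctor": 0.3,
        "he is a nurse": 0.2,
        "she is a nurse": 0.8,
    }
    return table.get(sentence, 0.5)


for template in TEMPLATES:
    print(template, round(bias_gap(template, toy_score), 2))
```

Keeping the scorer as a plain callable means the same probe can be pointed at different models, or rerun after a correction step to check whether the gap shrinks.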

This project recognizes illegal Persian words that may have been modified, for example by inserting non-Persian characters such as English letters, numbers, and special characters. It aims to improve on existing bots, which may fail to detect illegal words that contain unrelated characters.
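One straightforward way to handle inserted characters is to strip everything that is not a Persian letter from each token and then compare the result against a word list. The sketch below assumes that approach; the project itself may use a different method. The blocklist holds a harmless placeholder word, and the names `normalize`, `find_blocked`, and `BLOCKLIST` are hypothetical.

```python
import re

# Anything outside the Persian alphabet counts as noise: digits, Latin
# letters, punctuation, tatweel, diacritics, zero-width characters.
NON_PERSIAN = re.compile(
    r"[^\u0621-\u063A\u0641-\u064A\u067E\u0686\u0698\u06A9\u06AF\u06CC]"
)

# Map Arabic yeh and kaf to the Persian code points so variants compare equal.
ARABIC_TO_PERSIAN = str.maketrans({"\u064A": "\u06CC", "\u0643": "\u06A9"})

# Placeholder list; "سلام" (hello) stands in for a real blocked word.
BLOCKLIST = {"سلام"}


def normalize(token):
    """Drop non-Persian characters and unify Arabic/Persian letter variants."""
    return NON_PERSIAN.sub("", token).translate(ARABIC_TO_PERSIAN)


def find_blocked(text, blocklist=BLOCKLIST):
    """Return the original tokens whose normalized form is in the blocklist."""
    return [tok for tok in text.split() if normalize(tok) in blocklist]


# The placeholder word with a digit, a symbol, and a Latin letter inserted.
obfuscated = "".join(["س", "1", "ل", "@", "ا", "x", "م"])
print(find_blocked("abc " + obfuscated))
```

Because this sketch splits on whitespace, it will not catch a word that has been broken up with spaces; handling that case would need matching over the whole normalized string.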

This project collects and analyzes user reviews of "The Godfather" trilogy from IMDb. After preprocessing the data, it runs sentiment analysis and compares sentiment across the three movies.
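The comparison step can be sketched as scoring each review and averaging the scores per movie. The example below uses a tiny word-list scorer as a stand-in, since the sentiment model this project actually uses is not described here. The reviews and word lists are invented placeholders, not real IMDb data, and the names `review_score` and `average_sentiment` are hypothetical.

```python
import re
from collections import defaultdict
from statistics import mean

# Tiny illustrative lexicons; a real analysis would use a trained model.
POSITIVE = {"masterpiece", "great", "brilliant"}
NEGATIVE = {"boring", "weak", "disappointing"}

# Invented placeholder reviews as (movie, text) pairs.
REVIEWS = [
    ("The Godfather", "a brilliant masterpiece"),
    ("The Godfather", "great but the ending felt weak"),
    ("The Godfather Part II", "great sequel"),
    ("The Godfather Part III", "weak and disappointing"),
]


def review_score(text):
    """Score in [-1, 1]: (positive hits - negative hits) / total hits."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / max(pos + neg, 1)


def average_sentiment(reviews):
    """Group review scores by movie and return the mean score per movie."""
    by_movie = defaultdict(list)
    for movie, text in reviews:
        by_movie[movie].append(review_score(text))
    return {movie: mean(scores) for movie, scores in by_movie.items()}


for movie, score in average_sentiment(REVIEWS).items():
    print(f"{movie}: {score:+.2f}")
```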

This project preprocesses medical data and provides several information retrieval methods: Boolean retrieval, TF-IDF, transformer-based models, and vector-based retrieval with embeddings such as FastText. Given a topic or illness, it retrieves relevant posts and articles.
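To show how two of these methods differ, here is a minimal sketch of Boolean AND retrieval over an inverted index next to TF-IDF ranking, using only the standard library. It does not reproduce this project's code: the three documents are invented placeholders, and the names `build_index`, `boolean_and`, and `tfidf_rank` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Invented toy corpus; the project's medical data is not reproduced here.
DOCS = {
    "d1": "migraine headache treatment with rest and hydration",
    "d2": "diabetes diet and insulin treatment",
    "d3": "chronic headache causes and diagnosis",
}


def tokenize(text):
    return text.lower().split()


def build_index(docs):
    """Inverted index: term -> set of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def boolean_and(index, query):
    """Ids of documents that contain every query term."""
    postings = [index.get(term, set()) for term in tokenize(query)]
    return set.intersection(*postings) if postings else set()


def tfidf_rank(docs, index, query):
    """Rank documents by the summed tf * idf of the query terms they contain."""
    n_docs = len(docs)
    scores = Counter()
    for doc_id, text in docs.items():
        tf = Counter(tokenize(text))
        for term in tokenize(query):
            if tf[term]:
                idf = math.log(n_docs / len(index[term]))
                scores[doc_id] += tf[term] * idf
    # Drop zero scores, e.g. terms that occur in every document.
    return [doc_id for doc_id, score in scores.most_common() if score > 0]


INDEX = build_index(DOCS)
print(boolean_and(INDEX, "headache treatment"))
print(tfidf_rank(DOCS, INDEX, "headache treatment"))
```

Boolean retrieval returns only exact matches with no ordering, while TF-IDF also returns partial matches and ranks them, which is usually more forgiving for free-text queries about a topic or illness.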