nicolay-r/sentiment-relation-classifiers: Source code for DIALOG-2018 paper: "Extracting Sentiment Attitudes From Analytical Texts"

Description

This project collects experiments on sentiment attitude extraction. Given a Russian mass-media news article and a list of named entities marked in it, we predict sentiment attitudes, i.e. positive or negative relations between entities. Each attitude is classified as positive, negative, or neutral. Using the RuSentRel collection, we apply and compare several feature-based machine learning approaches: SVM, Naive Bayes (nb), Random Forest (rf), and k-nearest neighbours (knn).

When the models are applied to unlabeled articles (the test collection), we are interested only in non-neutral attitudes. As a result, we extract positive and negative attitudes and discard neutral ones.
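The discarding step can be sketched as follows. This is a minimal illustration, not the repository's code: the `(subject, object, label)` tuple layout, the label strings, and the helper name are assumptions made for the example.

```python
# Hypothetical representation: each predicted attitude is a
# (subject, object, label) tuple; the label strings are assumed.
NEUTRAL = "neutral"


def keep_sentiment_attitudes(predicted):
    """Return only the attitudes whose label is not neutral."""
    return [attitude for attitude in predicted if attitude[2] != NEUTRAL]


predicted = [
    ("E1", "E2", "positive"),
    ("E1", "E3", "neutral"),
    ("E3", "E2", "negative"),
]
extracted = keep_sentiment_attitudes(predicted)
print(extracted)
```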

Dataset

We use the RuSentRel 1.0 corpus, which consists of analytical articles from the Internet portal inosmi.ru.

Results

The results are presented in the table below, in comparison with the baseline methods (neg, pos, distr).

Model	Precision	Recall	F1(P,N)
Baseline neg	0.03	0.39	0.05
Baseline pos	0.02	0.40	0.04
Baseline distr	0.05	0.23	0.08
KNN	0.18	0.06	0.09
SVM (GRID)	0.09	0.36	0.15
Random forest (GRID)	0.41	0.21	0.27
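
The README does not define the F1(P,N) column. A minimal sketch, assuming it denotes the mean of the F1 scores of the positive and negative classes (with neutral excluded from the average), is given below. The function names and label strings are illustrative only.

```python
def f1_for_class(gold, pred, label):
    """F1 score of a single class, computed from parallel label lists."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_pos_neg(gold, pred):
    """Assumed reading of F1(P,N): average of the 'pos' and 'neg' F1 scores."""
    return (f1_for_class(gold, pred, "pos") + f1_for_class(gold, pred, "neg")) / 2


# Toy labels, not taken from RuSentRel.
gold = ["pos", "neg", "neu", "pos", "neg"]
pred = ["pos", "neu", "neu", "neg", "neg"]
print(round(f1_pos_neg(gold, pred), 3))
```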

Installation

We use virtualenv. Create a virtual environment (here called my_env) and activate it as follows:

virtualenv my_env
source my_env/bin/activate

Use the Makefile to install the core library and download the RuSentRel 1.0 dataset:

make install

We use a word2vec model taken from RusVectores. Because some features depend on the word embedding vocabulary, the model must be downloaded additionally as follows:

make download_model

Note: This word embedding model stores Russian terms with an additional POS suffix written in mystem notation.
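
To illustrate the note: a minimal sketch of building a vocabulary key for such a model, assuming keys have the form `lemma_POS` with an underscore separator. The helper names `embedding_key` and `lookup` and the toy vocabulary are hypothetical; `S` and `V` are the mystem tags for noun and verb.

```python
def embedding_key(lemma, mystem_pos):
    """Build a key such as 'россия_S' (assumed 'lemma_POS' layout)."""
    return "{}_{}".format(lemma.lower(), mystem_pos)


def lookup(lemma, mystem_pos, vocabulary):
    """Return the vector for a term, or None if the key is not in the vocabulary."""
    return vocabulary.get(embedding_key(lemma, mystem_pos))


# Toy stand-in for the real model; the vectors are made up for the example.
toy_vocabulary = {
    "россия_S": [0.1, 0.2],
    "поддерживать_V": [0.3, 0.4],
}

print(embedding_key("Россия", "S"))
print(lookup("поддерживать", "V", toy_vocabulary))
```

A term that is looked up without its POS suffix, or with the wrong tag, will not be found, which is why features that rely on the embedding vocabulary need the lemma and the tag together.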

Usage

Since we are not interested in neutral attitudes, the dataset does not provide them (i.e. attitudes with 'neutral' labels). To extract positive and negative attitudes, we additionally introduce neutral attitudes (extracted from the news) so that the classifiers can distinguish genuinely sentiment attitudes from neutral ones.

First, we compose a list of neutral relations for each news article of the train and test collections by running:

./neutrals.py
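
The README does not describe how neutrals.py selects neutral pairs. One plausible reading, sketched below purely as an assumption, is that every ordered pair of distinct entities in an article that is not annotated as positive or negative is treated as neutral. All names in the sketch are hypothetical.

```python
from itertools import permutations


def neutral_candidates(entities, sentiment_pairs):
    """Return ordered entity pairs that carry no sentiment annotation.

    entities        -- entity names found in one article
    sentiment_pairs -- dict mapping (subject, object) to 'pos' or 'neg'
    """
    annotated = set(sentiment_pairs)
    return [pair for pair in permutations(entities, 2) if pair not in annotated]


# Toy article with three entities and two annotated attitudes.
entities = ["E1", "E2", "E3"]
sentiment_pairs = {("E1", "E2"): "neg", ("E3", "E1"): "pos"}
neutrals = neutral_candidates(entities, sentiment_pairs)
print(neutrals)
```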

Next, compose a list of feature vectors for each attitude of the test and train collections as follows:

./vectorize.py

Finally, we are ready to apply different models by calling:

./predict_*.py

Here the asterisk * is a placeholder for the group of methods you want. It can be default (which also includes grid search), class_elemination, mfs (model feature selection), rfe (recursive feature elimination), or uv (univariate).
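
The three steps can be chained in a small driver. This is a sketch under assumptions: the helpers `pipeline_commands` and `run_pipeline` are hypothetical, the scripts are assumed to be executable, and the driver is assumed to be started from the repository root. It substitutes a single variant for the asterisk instead of relying on shell globbing.

```python
import subprocess

# Variant names as listed in this README.
VARIANTS = ("default", "class_elemination", "mfs", "rfe", "uv")


def pipeline_commands(variant):
    """Return the commands for the three pipeline steps, in order."""
    if variant not in VARIANTS:
        raise ValueError("unknown variant: {}".format(variant))
    return [
        ["./neutrals.py"],                    # step 1: neutral relations
        ["./vectorize.py"],                   # step 2: feature vectors
        ["./predict_{}.py".format(variant)],  # step 3: chosen model group
    ]


def run_pipeline(variant):
    """Run each step and stop at the first one that fails."""
    for command in pipeline_commands(variant):
        subprocess.run(command, check=True)


print(pipeline_commands("rfe"))
```

Calling `run_pipeline("rfe")` inside the activated virtual environment would then execute the steps one after another.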

