Source code for DIALOG-2018 paper: "Extracting Sentiment Attitudes From Analytical Texts"

nicolay-r/sentiment-relation-classifiers

Description

This project is a collection of studies on sentiment attitude extraction. Given a Russian mass-media news article and a list of the named entities marked in it, the task is to predict sentiment attitudes: positive or negative relations between pairs of entities. Each candidate attitude is classified as positive, negative, or neutral. Using the RuSentRel collection, we apply and compare several feature-based machine learning approaches: SVM, naive Bayes (NB), random forest (RF), and k-nearest neighbors (KNN).

When the models are applied to unlabeled articles (the test collection), we are interested only in non-neutral attitudes. We therefore keep the predicted positive and negative attitudes and discard the neutral ones.
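
A minimal sketch of this filtering step. The (subject, object, label) tuple layout, the label strings, and the entity names are illustrative assumptions, not the repository's actual data format:

```python
# Hypothetical prediction format: (subject_entity, object_entity, label).
NEUTRAL = "neu"

def keep_sentiment_attitudes(predictions):
    """Return only the attitudes whose label is not neutral."""
    return [p for p in predictions if p[2] != NEUTRAL]

# Toy predictions, for illustration only.
predictions = [
    ("Entity A", "Entity B", "neg"),
    ("Entity A", "Entity C", "pos"),
    ("Entity C", "Entity B", "neu"),
]
extracted = keep_sentiment_attitudes(predictions)
```

Here `extracted` holds the two non-neutral attitudes in their original order.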

Dataset

We use the RuSentRel 1.0 corpus, which consists of analytical articles from the internet portal inosmi.ru.

Results

The results are presented in the table below, in comparison with the baseline methods (neg, pos, distr).

Model                 Precision  Recall  F1(P,N)
Baseline neg          0.03       0.39    0.05
Baseline pos          0.02       0.40    0.04
Baseline distr        0.05       0.23    0.08
KNN                   0.18       0.06    0.09
SVM (GRID)            0.09       0.36    0.15
Random forest (GRID)  0.41       0.21    0.27
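
The F1(P,N) column is read here as the average of the F1 scores of the positive and negative classes, with the neutral class left out. That reading is an assumption based on the column name, and the repository's evaluation code may differ in detail. Under that assumption, the metric could be sketched as follows (function names are hypothetical):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def f1_pn(p_pos, r_pos, p_neg, r_neg):
    """Average of the per-class F1 scores of the positive and negative classes."""
    return (f1(p_pos, r_pos) + f1(p_neg, r_neg)) / 2

# Toy per-class values, not taken from the table above.
score = f1_pn(0.5, 0.5, 0.25, 0.75)
```

Because the score averages per-class F1 values, it does not have to equal the harmonic mean of the Precision and Recall columns.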

Installation

The setup uses virtualenv. Create a virtual environment (here called my_env) and activate it as follows:

virtualenv my_env
source my_env/bin/activate

Use the Makefile to install the core library and download the RuSentRel 1.0 dataset:

make install

We use a word2vec model taken from RusVectores. Because some features depend on the word embedding vocabulary, the model has to be downloaded separately:

make download_model

Note: this word embedding model stores Russian terms with an additional POS suffix written in mystem notation.
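
To illustrate what such a vocabulary key might look like, here is a small sketch. The underscore separator, the lowercasing, and the example tags (S for noun, V for verb) are assumptions about the key format, and the dictionary stands in for the real model; check the downloaded model's vocabulary before relying on it:

```python
def embedding_key(lemma, pos_tag):
    """Join a lemma and its POS tag the way the vocabulary is assumed to store them."""
    return "{}_{}".format(lemma.lower(), pos_tag)

def lookup(lemma, pos_tag, vocabulary):
    """Return the stored value for the term, or None when it is out of vocabulary."""
    return vocabulary.get(embedding_key(lemma, pos_tag))

# Toy vocabulary standing in for the real word2vec model.
toy_vocabulary = {"страна_S": 0, "поддерживать_V": 1}

index = lookup("Страна", "S", toy_vocabulary)
missing = lookup("страна", "V", toy_vocabulary)
```

A term looked up without the matching POS suffix is treated as out of vocabulary, which is why features built on the embedding need lemmatized, POS-tagged input.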

Usage

Because the task does not target neutral attitudes, the dataset does not provide them (i.e. there are no attitudes with 'neutral' labels). To extract positive and negative attitudes, we additionally extract neutral attitudes from the news articles, so that the models can distinguish genuinely sentiment-bearing attitudes from neutral ones.

First, compose a list of neutral relations for each news article of the train and test collections by running:

./neutrals.py
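
One plausible way to build such a list is sketched below, assuming that every ordered pair of distinct entities in an article that carries no sentiment label becomes a neutral candidate. The actual neutrals.py may apply further filters, and the function and variable names here are hypothetical:

```python
from itertools import permutations

def neutral_candidates(entities, sentiment_pairs):
    """Return ordered entity pairs that are not annotated with a sentiment label.

    `entities` is assumed to contain distinct entity names.
    """
    labeled = set(sentiment_pairs)
    return [pair for pair in permutations(entities, 2) if pair not in labeled]

# Toy article: three entities, two annotated sentiment attitudes.
entities = ["A", "B", "C"]
sentiment_pairs = [("A", "B"), ("C", "A")]
neutral = neutral_candidates(entities, sentiment_pairs)
```

Pairs are ordered because an attitude has a direction: (A, B) and (B, A) are different relations.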

Next, compose a list of feature vectors for each attitude of the train and test collections as follows:

./vectorize.py

Finally, apply the different models by calling:

./predict_*.py

Here the asterisk * is a wildcard that stands for the group of methods you want to run. It can be one of: default (which also includes grid search), class_elemination, mfs (model-based feature selection), rfe (recursive feature elimination), uv (univariate feature selection).
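
As a small illustration, the sketch below expands the wildcard into concrete script names. It assumes the files follow the predict_*.py pattern literally; the helper name is hypothetical, and the spelling class_elemination is kept as given above:

```python
# Variant names as listed in this README.
VARIANTS = ["default", "class_elemination", "mfs", "rfe", "uv"]

def script_name(variant):
    """Build the script path for one variant; reject names not listed above."""
    if variant not in VARIANTS:
        raise ValueError("unknown variant: {}".format(variant))
    return "./predict_{}.py".format(variant)

commands = [script_name(v) for v in VARIANTS]
```

Each entry of `commands` can then be run from the repository root after the neutrals and vectorization steps have finished.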
