This is an effort to separate sarcasm from satire in English-language sentences. As is well known, there is a fine line between the two; they differ mostly in connotation, with one said to be more brazen and blatant than the other. Separating them by posing the distinction as a classification task is an interesting problem for several reasons.
Go-to techniques such as embeddings do not go very far here, because both sarcasm and satire rely on a certain amount of wordplay, which tends to give the same word different meanings depending on how it is used in a text.
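
To make the limitation concrete, here is a minimal sketch, assuming a static (context-free) embedding: the embedding is a lookup table, so a word gets the same vector no matter how it is used. The table `EMBEDDINGS`, its vectors, and the two example sentences are hypothetical and invented for illustration; they are not taken from this project's data or model.

```python
# Toy static embedding table; the vectors are made up for illustration only.
EMBEDDINGS = {
    "great": [0.9, 0.1],
    "job": [0.2, 0.7],
    "what": [0.1, 0.1],
    "a": [0.0, 0.2],
    "you": [0.3, 0.3],
    "broke": [0.1, 0.9],
    "it": [0.2, 0.2],
    "again": [0.4, 0.5],
}


def embed(sentence):
    """Look up each known token; context plays no part in the lookup."""
    tokens = sentence.lower().replace(",", "").replace(".", "").split()
    return {tok: EMBEDDINGS[tok] for tok in tokens if tok in EMBEDDINGS}


sincere = embed("What a great job.")
ironic = embed("Great job, you broke it again.")

# "great" receives the identical vector in both sentences, even though
# its intended meaning is reversed in the second one.
same_vector = sincere["great"] == ironic["great"]
print(same_vector)
```

Any signal that separates the two uses of "great" therefore has to come from the surrounding words, not from the word's own vector.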
This is an effort to solve the problem in a general manner, so that the approach applies to any English-language text, with class ratios of any order and sentences of various lengths.
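
One common way to cope with arbitrary class ratios is to weight each class inversely to its frequency. The sketch below is only an illustration of that idea, not a description of what this project does; the function name `inverse_frequency_weights` and the example labels are hypothetical. The formula is the same "balanced" heuristic used by scikit-learn's `class_weight="balanced"` option.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return n_samples / (n_classes * class_count) for each class."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical, deliberately imbalanced label set (8 satire, 2 sarcasm).
labels = ["satire"] * 8 + ["sarcasm"] * 2
weights = inverse_frequency_weights(labels)
print(weights)
```

The rarer class receives the larger weight, so errors on it count for more during training regardless of how skewed the corpus is.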
This is still a work in progress, with a test-set accuracy of around 0.73 on a medium-sized corpus. I hope that in the future this might lead to an open-source contribution in the form of some kind of API. With improved accuracy, this might turn out to be an important milestone in affect recognition.
Update: BERT gives an accuracy of around 0.80 on a decently sized validation corpus.
The commit history is in my Bitbucket account.
Any queries, questions, or concerns can be addressed to: [email protected], [email protected]
Author: Abhishek Mukherjee