SemEval2022-Task6-Sarcasm-Detection

URL: https://github.com/AmirAbaskohi/SemEval2022-Task6-Sarcasm-Detection

Description

Sarcasm is commonly used on social media to mock, irritate, or amuse others. Its metaphorical nature poses significant challenges for sentiment analysis systems. This repository presents the results of the UTNLP team in SemEval-2022 Task 6 on sarcasm detection. The study compares machine learning models, transformer-based models, and data augmentation strategies, the latter covering both generative-based and mutation-based methods. The best approach, the BERT-based model, achieved an F1-score of 0.414 after refining the initial model.
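To make the idea of mutation-based augmentation concrete, here is a minimal Python sketch of three word-level mutations: shuffling, elimination, and replacing. The exact procedures used by the UTNLP team are not described on this page, so the function names, the probability parameter `p`, and the `substitutes` dictionary are all hypothetical and chosen only for illustration.

```python
import random


def shuffle_words(text, rng):
    """Return the text with its words in a random order (assumed 'Shuffling')."""
    words = text.split()
    rng.shuffle(words)
    return " ".join(words)


def eliminate_words(text, rng, p=0.2):
    """Drop each word with probability p (assumed 'Elimination')."""
    words = text.split()
    kept = [w for w in words if rng.random() >= p]
    # Keep at least one word so a non-empty input never becomes empty.
    return " ".join(kept if kept else words[:1])


def replace_words(text, rng, substitutes, p=0.2):
    """Swap a word for a listed substitute with probability p (assumed 'Replacing').

    `substitutes` is a hypothetical lookup table, e.g. built from a thesaurus.
    """
    out = []
    for word in text.split():
        options = substitutes.get(word.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    tweet = "oh great another monday morning"
    print(shuffle_words(tweet, rng))
    print(eliminate_words(tweet, rng, p=0.3))
    print(replace_words(tweet, rng, {"great": ["wonderful"]}, p=1.0))
```

Each mutated sentence would keep the label of the original, which is what lets such mutations enlarge a small training set. Whether a mutation preserves the sarcastic reading is not guaranteed, so the augmented data may be noisier than the original.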

Project Overview

Methodology

Dataset

Results

The models were evaluated using F1-score and accuracy, with the following results:

Data Augmentation | F1-Score | Accuracy
Shuffling | 0.305 | 0.7471
Shuffling + Replacing | 0.3011 | 0.7414
Shuffling + Elimination | 0.3064 | 0.7478
Elimination | 0.301 | 0.7478
GPT-2 | 0.2923 | 0.675

Model | F1-Score | Accuracy
SVM | 0.3064 | 0.7478
LSTM-based | 0.2751 | 0.7251
BERT-based | 0.414 | 0.8634
Attention-based | 0.2959 | 0.7793
Google’s T5 | 0.4038 | 0.8124
Electra | 0.2907 | 0.7642
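For readers unfamiliar with the two metrics, the following Python sketch shows how accuracy and F1-score can be computed from binary labels. It assumes that F1 is measured on the positive (sarcastic) class, which this page does not state explicitly; the function name `accuracy_and_f1` and the toy labels are illustrative only and are not taken from the repository.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 of the positive class) for two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, f1


if __name__ == "__main__":
    # Toy data: 2 sarcastic examples (label 1) out of 10.
    y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]
    acc, f1 = accuracy_and_f1(y_true, y_pred)
    print(f"accuracy={acc:.2f} f1={f1:.2f}")  # accuracy=0.80 f1=0.50
```

In the toy data, accuracy is well above F1 because most examples belong to the negative class, so correct negative predictions raise accuracy without affecting F1. A similar gap between the two columns appears in every row of the tables above.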

Implementation & Code

The repository contains scripts for: