SIDATA

ArSarcasm

URL: https://github.com/iabufarha/ArSarcasm

Description: This repository contains the Arabic sarcasm dataset (ArSarcasm).

Project Overview

ArSarcasm is a dataset for sarcasm detection in Arabic tweets. It was built using existing Arabic sentiment analysis datasets (SemEval 2017 and ASTD) and includes annotations for sarcasm and dialect.

Dataset:

The dataset contains 10,547 tweets, of which 1,682 (16%) are labeled as sarcastic. It is available in CSV format with an 80/20 train-test split.
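
As a quick sanity check on the figures above, the sketch below recomputes the sarcastic share and estimates the sizes of the two partitions. The split sizes are only an approximation derived from the 80/20 ratio; the row counts of the published files are not stated here and may differ slightly.

```python
# Figures taken from the dataset description.
TOTAL_TWEETS = 10_547
SARCASTIC_TWEETS = 1_682

# Share of tweets labeled as sarcastic (reported as roughly 16%).
sarcastic_share = SARCASTIC_TWEETS / TOTAL_TWEETS
print(f"sarcastic share: {sarcastic_share:.1%}")

# Approximate partition sizes implied by an 80/20 split.
# Assumption: simple rounding; the real files may be off by a row or two.
approx_train = round(TOTAL_TWEETS * 0.8)
approx_test = TOTAL_TWEETS - approx_train
print(f"approx. train: {approx_train}, approx. test: {approx_test}")
```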

Dataset Fields:

Dataset Usage:

The dataset is intended for sarcasm detection research. Because it also carries sentiment and dialect information, it is useful for broader Arabic NLP tasks.
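
A minimal sketch of how the CSV data might be explored with the Python standard library. The column names (`tweet`, `sarcasm`, `sentiment`, `dialect`), the label values, and the inline sample rows are all assumptions made for illustration; check the header of the actual files in the repository before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the assumed layout; these are NOT real rows
# from ArSarcasm, and the column names are assumptions.
SAMPLE_CSV = """tweet,sarcasm,sentiment,dialect
placeholder tweet one,True,negative,egypt
placeholder tweet two,False,neutral,msa
placeholder tweet three,False,positive,msa
placeholder tweet four,True,negative,gulf
"""


def load_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def sarcastic_share(rows):
    """Return the fraction of rows whose assumed 'sarcasm' column is true."""
    if not rows:
        return 0.0
    flagged = sum(1 for row in rows if row["sarcasm"].strip().lower() == "true")
    return flagged / len(rows)


def count_by(rows, column):
    """Count rows per distinct value of the given column."""
    return Counter(row[column] for row in rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
share = sarcastic_share(rows)
dialect_counts = count_by(rows, "dialect")
print(f"{share:.0%} sarcastic; dialects: {dict(dialect_counts)}")
```

To run this against a real file, replace the `io.StringIO` object with `open(path, newline="", encoding="utf-8")`, where `path` points to one of the CSV files from the repository.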

Dataset Statistics:

Training Methods:

No specific training methods are provided in the repository.

Results:

No specific results or performance metrics are provided in the repository.

Dataset Files: