ArSarcasm-v2
URL: https://github.com/iabufarha/ArSarcasm-v2
Description: ArSarcasm-v2 is an extension of the original ArSarcasm dataset. It was used for the shared task on sarcasm detection and sentiment analysis held as part of WANLP 2021.
Dataset
- Name: ArSarcasm-v2
- Size:
  - Training data: 2.28 MB
  - Testing data: 571 KB
Additional Information
How the datasets were created
ArSarcasm-v2 extends the original ArSarcasm dataset with data from the DAICT corpus and additional tweets. Each tweet is annotated for sarcasm, sentiment (positive, negative, or neutral), and dialect (Modern Standard Arabic or a regional Arabic dialect). The dataset contains 15,548 tweets in total, divided into:
- Training set: 12,548 tweets
- Test set: 3,000 tweets
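The split sizes and the three annotation layers above can be sanity-checked with a short script. The following is a minimal sketch, not the dataset authors' tooling: the column names (tweet, sarcasm, sentiment, dialect), the label spellings, and the sample rows are assumptions made for illustration and should be checked against the header of the actual CSV files in the repository.

```python
import csv
import io
from collections import Counter

# Split sizes as stated in the list above.
TRAIN_SIZE = 12_548
TEST_SIZE = 3_000
TOTAL_SIZE = TRAIN_SIZE + TEST_SIZE  # should equal the stated total of 15,548

# Assumed column names (hypothetical); verify against the real CSV header.
LABEL_COLUMNS = ("sarcasm", "sentiment", "dialect")


def label_counts(csv_file, columns=LABEL_COLUMNS):
    """Count how often each value occurs in each label column of an open CSV file."""
    counts = {name: Counter() for name in columns}
    for row in csv.DictReader(csv_file):
        for name in columns:
            counts[name][row[name]] += 1
    return counts


# Tiny made-up sample standing in for the real file; the tweet text and the
# label spellings are placeholders, not rows from ArSarcasm-v2.
SAMPLE = io.StringIO(
    "tweet,sarcasm,sentiment,dialect\n"
    "example one,False,NEU,msa\n"
    "example two,True,NEG,egypt\n"
    "example three,False,POS,msa\n"
)

sample_counts = label_counts(SAMPLE)
train_share = TRAIN_SIZE / TOTAL_SIZE  # fraction of tweets in the training set

if __name__ == "__main__":
    print(f"total tweets: {TOTAL_SIZE}")
    print(f"training share: {train_share:.1%}")
    for name, counter in sample_counts.items():
        print(name, dict(counter))
```

To run the same count on the downloaded data, open the training file with encoding="utf-8" and newline="" and pass the file object to label_counts in place of SAMPLE. The number of rows read should then match the training set size stated above.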
The data was created for the WANLP 2021 Shared Task on Sarcasm and Sentiment Detection in Arabic. All annotations were performed manually.
Training methods applied
Information not available.
Results obtained
Information not available.