SIDATA

Automatic Sarcasm Detection

URL: https://github.com/EducationalTestingService/sarcasm

Description

This repository contains datasets and research materials related to sarcasm detection, including data from the 2nd FigLang Workshop at ACL 2020. It provides shared tasks and benchmarks for sarcasm detection in Twitter and Reddit conversations.

Dataset

The repository includes training and testing datasets for sarcasm detection in Twitter and Reddit, provided in JSONL format.
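Since the data ships as JSONL (one JSON object per line), a small loader is usually enough to start inspecting it. The following is a minimal sketch, not code from the repository: the helper names `load_jsonl` and `field_counts` are hypothetical, and no field names are assumed, so the second helper simply reports whichever keys the records actually contain.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as err:
                # Report the offending line so a damaged file is easy to fix.
                raise ValueError(f"{path}:{line_number}: invalid JSON") from err
    return records


def field_counts(records):
    """Count how often each top-level key appears across the records."""
    counts = Counter()
    for record in records:
        counts.update(record.keys())
    return counts
```

To use it, call `load_jsonl` with the path to a downloaded training or test file (the exact file names depend on the repository layout and are not assumed here), then pass the result to `field_counts` to see which fields each entry carries and whether any are missing from some records.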

Data Format

Each entry in the dataset contains:

Dataset Statistics

| Platform | Train Size | Test Size |
|----------|------------|-----------|
| Reddit | 4,400 | 1,800 |
| Twitter | 5,000 | 1,800 |

Shared Task & Evaluation

References

Main Paper:

Ethical Considerations

Potentially Controversial Content:

Implementation & Results