SPIRS
URL: https://github.com/bshmueli/SPIRS
Description
SPIRS is a sarcasm dataset of 30,000 tweets: 15,000 sarcastic and 15,000 non-sarcastic. It was collected with a novel data capturing method called reactive supervision, which captures both intended and perceived sarcasm. The same method also records the tweets surrounding each sarcastic tweet, giving sarcasm detection tasks additional context to work with.
Dataset Details
- SPIRS stands for Sarcasm, Perceived and Intended, by Reactive Supervision.
- The dataset includes two files:
  - SPIRS-sarcastic-ids.csv: Contains 15,000 sarcastic tweet IDs (positive samples).
  - SPIRS-non-sarcastic-ids.csv: Contains 15,000 non-sarcastic tweet IDs (negative samples).
- Additional metadata for sarcastic tweets includes:
  - Sarcasm perspective (intended or perceived).
  - Author sequence.
  - Contextual tweet IDs (cue, oblivious, and eliciting tweets).
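The two ID files described above can be combined into a single labeled list for a binary classification task. The following is a minimal sketch, not code from the repository. It assumes each CSV has a header row and that the tweet ID sits in the first column; the function names, the `id_column` and `has_header` parameters, and the 1/0 label encoding are all hypothetical, so check them against the actual files before use.

```python
import csv
import random


def read_tweet_ids(path, id_column=0, has_header=True):
    """Return the values of one CSV column as strings, skipping blank rows.

    Assumption: the tweet ID is in column `id_column` (first column by default).
    """
    ids = []
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the assumed header row
        for row in reader:
            if len(row) > id_column and row[id_column].strip():
                ids.append(row[id_column].strip())
    return ids


def build_labeled_ids(sarcastic_path, non_sarcastic_path, seed=0):
    """Combine both files into a shuffled list of (tweet_id, label) pairs.

    Label 1 marks a sarcastic tweet (positive sample), 0 a non-sarcastic one.
    """
    labeled = [(tweet_id, 1) for tweet_id in read_tweet_ids(sarcastic_path)]
    labeled += [(tweet_id, 0) for tweet_id in read_tweet_ids(non_sarcastic_path)]
    random.Random(seed).shuffle(labeled)  # fixed seed keeps the order reproducible
    return labeled


if __name__ == "__main__":
    pairs = build_labeled_ids(
        "SPIRS-sarcastic-ids.csv", "SPIRS-non-sarcastic-ids.csv"
    )
    print(len(pairs), "labeled tweet IDs")
```

IDs are kept as strings on purpose: tweet IDs are 64-bit integers, and parsing them as floating-point numbers can silently alter the trailing digits. Note that the files contain IDs only, so the tweet text still has to be retrieved separately.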
Key Features
- Reactive Supervision: A method allowing the collection of both intended and perceived sarcasm texts.
- Rich Context: Includes contextual information that can help better understand sarcasm, such as author sequence and related tweets.
- Research: The dataset is explained in detail in the reactive supervision paper, and more insights can be found in the Medium article or the YouTube video.
Dataset Size
- SPIRS-non-sarcastic-ids.csv: 673 KB
- SPIRS-sarcastic-ids.csv: 1.31 MB
Methods
No specific methods information is provided in the repository.
Results
No specific results information is provided in the repository.
Models
No specific models information is provided in the repository.