Datasets

GLUE: The General Language Understanding Evaluation benchmark, a collection of tasks for evaluating natural language understanding models.
WikiText: This dataset contains text extracted from Wikipedia articles.
IMDB: This dataset is suitable for binary text classification (positive or negative sentiment of movie reviews).
Yelp review: This dataset is suitable for multi-class text classification.
Text REtrieval Conference (TREC): This dataset is for question classification.
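AG News: This dataset is suitable for topic classification. It is drawn from the AG corpus of more than 1 million news articles. The commonly used classification version contains 120,000 training and 7,600 test samples, each labeled with its topic. The labels fall into 4 classes: World, Sports, Business, and Sci/Tech.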
DBpedia 14: This dataset contains a subset of the DBpedia dataset, organized into 14 non-overlapping classes.
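
These notes do not say how the datasets are loaded. As a minimal sketch, assuming the Hugging Face `datasets` library and its `load_dataset` function, the list above could be mapped to Hub identifiers as shown here. The identifiers and configuration names are assumptions and may differ between library versions. The short keys, `HUB_IDS`, `load_args`, and `load` are hypothetical names introduced for illustration.

```python
# Assumed Hugging Face Hub identifiers for the datasets listed above.
# Each value is (dataset_id, config_name); config_name is None where
# no configuration has to be chosen.
HUB_IDS = {
    "glue": ("glue", "sst2"),  # GLUE needs one task picked; sst2 is an arbitrary choice
    "wikitext": ("wikitext", "wikitext-2-raw-v1"),  # one of several WikiText configs
    "imdb": ("imdb", None),
    "yelp_review": ("yelp_review_full", None),
    "trec": ("trec", None),
    "ag_news": ("ag_news", None),
    "dbpedia_14": ("dbpedia_14", None),
}


def load_args(key):
    """Return the positional arguments to pass to load_dataset for a listed dataset."""
    dataset_id, config = HUB_IDS[key]
    return (dataset_id,) if config is None else (dataset_id, config)


def load(key, split="train"):
    """Download one listed dataset (needs `pip install datasets` and network access)."""
    # Imported lazily so the mapping above can be used without the library installed.
    from datasets import load_dataset

    return load_dataset(*load_args(key), split=split)
```

For example, `load("ag_news", split="test")` would request the AG News test split, while `load_args("glue")` only returns the arguments without downloading anything.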