Place: DHØ 1.52 Time: 11:40
Yes, we are going back in time to the Presidential Debate in the US 2020 - the time of lots of unhappy Tweeting. It’s just too good a dataset and case to let it go…
Political tweets: https://github.com/SDS-AAU/SDS-master/raw/master/M2/data/pol_tweets.gz
from https://github.com/alexlitel/congresstweets We’ve preprocessed a bit to make things easier. 1: Dems. 0: Rep.
Tweets around the time of the debate in oktober 20 (8000): https://github.com/SDS-AAU/SDS-master/raw/master/M2/data/pres_debate_2020.gz
Both datasets are in JSON format. Task: Build a classifier that can distinguish Dem/Rep tweets. Bonus: 1. Explore discussed topics; 2. find out what drives predictions.