Data Science 2017-2018

Teacher: Suzan Verberne

Course schedule

Each week we have a 45-minute lecture and a 45-minute practical session with exercises in R and Python. The homework is either reading a paper or completing an assignment.

  • Dataset 1: OpenML speeddating data
  • Dataset 2: News and clickbait posts from Facebook

Week | Lecture | Practical session | Homework (literature/assignment)
1 | Introduction | Dataset 1: task definition | Paper 1: “What Educated Citizens Should Know About Statistics and Probability” (2003)
2 | Exploration and visualisation | Dataset 1: data exploration | Paper 2: “The dual frontier: Patented inventions and prior scientific advance” (2017)
3 | Model learning 1 | Dataset 1: model learning | Paper 3: “Machine learning: Trends, perspectives, and prospects” (2015)
4 | Evaluation 1 | Dataset 1: evaluation | Literature assignment
5 | Big data | Dataset 2: data exploration | Paper 4: “The Parable of Google Flu: Traps in Big Data Analysis” (2014)
6 | Data collection | Dataset 2: data annotation | Practical assignment 1: Inter-rater agreement
7 | Pre-processing & Feature extraction 1 | Dataset 2: data pre-processing | Paper 5: “Crawling Facebook for Social Network Analysis Purposes” (2011)
8 | Pre-processing & Feature extraction 2 | Dataset 2: feature extraction | Practical assignment 2: Feature extraction
9 | Model learning 2 | Dataset 2: model learning | Paper 6: “Developing Age and Gender Predictive Lexica over Social Media” (2014)
10 | Evaluation 2 | Dataset 2: evaluation & reporting | Practical assignment 3: Evaluation
11 | Analysis and discussion | Dataset 2: error analysis | Paper 7: “Exploring the Query Halo Effect in Site Search: Leading People to Longer Queries” (2017)
12 | Feature extraction 3 | Dataset 2: reporting | Final assignment
13 | Ethical issues | Q&A |

The assessment of the course consists of a written exam (60% of the course grade) and practical assignments (40% of the course grade). The practical assignments comprise four small tasks (5% each) and one more substantial report (20%). To complete the course, both the grade for the written exam and the average grade for the practical assignments must be 5.5 or higher. If a task is not submitted, the grade for that task is 0.
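
The grading rules can be illustrated with a small Python sketch. It is not an official grade calculator: the function name `course_grade` is hypothetical, and it assumes that the "average grade for the practical assignments" is weighted by each item's share of the 40% practical component (5% per small task, 20% for the report). If the course uses an unweighted average instead, the pass check would need to change.

```python
def course_grade(exam, small_tasks, report):
    """Return (final_grade, passed) under the weights in the course description.

    exam:        grade for the written exam
    small_tasks: list of four grades; use 0 for a task that was not submitted
    report:      grade for the more substantial report
    """
    if len(small_tasks) != 4:
        raise ValueError("expected exactly four small task grades")

    # Weights in percent: 60 for the exam, 5 per small task, 20 for the report.
    practical_points = 5 * sum(small_tasks) + 20 * report
    final = (60 * exam + practical_points) / 100

    # Assumption: the practical average is weighted by each item's share
    # of the 40% practical component.
    practical_average = practical_points / 40

    # Both the exam and the practical average must reach 5.5.
    passed = exam >= 5.5 and practical_average >= 5.5
    return round(final, 2), passed


if __name__ == "__main__":
    # One small task was not submitted, so it counts as 0.
    print(course_grade(7.0, [8.0, 6.0, 0.0, 7.0], 6.5))
```

Note that a missing task lowers the practical average but does not by itself fail the course, as long as that average stays at 5.5 or higher.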