Data Science 2018-2019

Teacher: Suzan Verberne
Teaching assistants: Mohamed Barbouch & Jeroen Rook



Course schedule

Date and time: Wednesdays, 13:30-15:15
Locations: 1st hour: Snellius 312, 2nd hour: Snellius 302-304 & 306-308.

This course is partly combined with the course Data Science and Process Modelling from the I&E program (dr. Frank Takes).

  • Lectures 2 and 5 are taught by dr. Takes;
  • Lecture 7 is joined by the I&E students;
  • Lecture 9 is a common guest lecture;
  • The practical sessions in weeks 2, 3 and 4 are extended to the 7th and 8th hour.

Slides and other materials will be posted on Blackboard weekly (under 'course documents'). Assignments should be submitted through Blackboard as well.


| Week | Topic | Practical session | Homework (literature/assignment) |
| --- | --- | --- | --- |
| 1 (6-2) | Introduction | Dataset 1: Data exploration | Paper: “What Educated Citizens Should Know About Statistics and Probability” (2003) |
| 2 (13-2) | Visual Analytics | Visual Analytics | |
| 3 (20-2) | Model learning 1 | Visual Analytics | Paper: “Machine learning: Trends, perspectives, and prospects” (2015) |
| 4 (27-2) | Evaluation 1 | Visual Analytics | Deadline visual analytics assignment: March 4 |
| 5 (6-3) | Network analytics | | |
| 6 (27-3) | Data collection | Dataset 2: exploration | Paper: “The Parable of Google Flu: Traps in Big Data Analysis” (2014); Practical session instructions week 6 |
| 7 (3-4) | Pre-processing & Feature extraction 1 | Dataset 2: data pre-processing | Paper: “Crawling Facebook for Social Network Analysis Purposes” (2011); Practical session instructions week 7 |
| 8 (10-4) | Pre-processing & Feature extraction 2 | Dataset 2: feature extraction | Deadline Feature extraction assignment: April 15 |
| 9 (17-4) | Event data mining. Guest lecture by Bram Cappers, TU/e and AnalyzeData | | |
| 10 (24-4) | Model learning 2 | Dataset 2: model learning & evaluation | Paper: “Exploring the Query Halo Effect in Site Search: Leading People to Longer Queries” (2017) |
| 11 (1-5) | Analysis and discussion | Dataset 2: error analysis | Deadline Model comparison assignment: May 8 |
| 12 (8-5) | Recap on statistical concepts | Dataset 2: reporting | |
| 13 (15-5) | Big data & Responsible Data Science | Q&A | Deadline Final assignment: June 14 |

N.B. There is no class on March 13 and March 20.


Data

  • Dataset 1: OpenML speeddating data
  • Dataset 2: News and clickbait posts from Facebook


Course grading

The assessment of the course consists of a written exam (60% of the course grade) and a practical part (40% of the course grade). The practical part is subdivided into (1) Visual analytics assignment (10%); (2) Feature extraction assignment (5%); (3) Model comparison assignment (5%); (4) Final assignment (20%). Both the grade for the written exam and the average grade for the practical assignments should be 5.5 or higher in order to complete the course. If one of the tasks is not submitted, the grade for that task is 0.
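
As an illustration of how these weights combine, the following is a minimal sketch in Python, not an official grade calculator. The function name `course_result` and the task keys are hypothetical. It also assumes that the "average grade for the practical assignments" is weighted by the task percentages, which the rules above do not state explicitly, and it applies no rounding.

```python
# Weights taken from the grading rules (fractions of the course grade).
EXAM_WEIGHT = 0.60
PRACTICAL_WEIGHTS = {
    "visual_analytics": 0.10,
    "feature_extraction": 0.05,
    "model_comparison": 0.05,
    "final_assignment": 0.20,
}
PASS_MARK = 5.5


def course_result(exam, practical):
    """Return (course_grade, practical_average, completed).

    `practical` maps task names to grades on a 0-10 scale.
    A task that is absent from the mapping was not submitted and counts as 0.
    ASSUMPTION: the practical average is weighted by the task percentages.
    """
    weighted = sum(
        weight * practical.get(task, 0.0)
        for task, weight in PRACTICAL_WEIGHTS.items()
    )
    practical_average = weighted / sum(PRACTICAL_WEIGHTS.values())
    course_grade = EXAM_WEIGHT * exam + weighted
    # Both thresholds must be met to complete the course.
    completed = exam >= PASS_MARK and practical_average >= PASS_MARK
    return course_grade, practical_average, completed
```

For example, `course_result(7.0, {"visual_analytics": 8, "feature_extraction": 6, "model_comparison": 6, "final_assignment": 7})` yields a course grade and a practical average of about 7.0 and marks the course as completed. Leaving out `"final_assignment"` drops the practical average to about 3.5, so the course is not completed.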


Earlier editions of this course

Link to the course page for this course in 2017-2018