Data Science 2018-2019
Course schedule
Date and time: Wednesdays, 13:30-15:15
Locations: 1st hour: Snellius 312, 2nd hour: Snellius 302-304 & 306-308.
This course is partly combined with the course Data Science and Process Modelling from the I&E program (dr. Frank Takes).
- Lectures 2 and 5 are taught by dr. Takes;
- Lecture 7 is joined by the I&E students;
- Lecture 9 is a common guest lecture;
- The practical sessions in weeks 2, 3 and 4 are extended to the 7th and 8th hour.
Slides and other materials will be posted on Blackboard weekly (under 'course documents'). Assignments should be submitted through Blackboard as well.
Week | Topic | Practical session | Homework (literature/assignment) |
---|---|---|---|
1 (6-2) | Introduction | Dataset 1: Data exploration | Paper: “What Educated Citizens Should Know About Statistics and Probability” (2003) |
2 (13-2) | Visual Analytics | Visual Analytics | |
3 (20-2) | Model learning 1 | Visual Analytics | Paper: “Machine learning: Trends, perspectives, and prospects” (2015) |
4 (27-2) | Evaluation 1 | Visual Analytics | Deadline visual analytics assignment: March 4 |
5 (6-3) | Network analytics | | |
6 (27-3) | Data collection | Dataset 2: exploration | Paper: “The Parable of Google Flu: Traps in Big Data Analysis” (2014); Practical session instructions week 6 |
7 (3-4) | Pre-processing & Feature extraction 1 | Dataset 2: data pre-processing | Paper: “Crawling Facebook for Social Network Analysis Purposes” (2011); Practical session instructions week 7 |
8 (10-4) | Pre-processing & Feature extraction 2 | Dataset 2: feature extraction | Deadline Feature extraction assignment: April 15 |
9 (17-4) | Event data mining. Guest lecture by Bram Cappers, TU/e and AnalyzeData | | |
10 (24-4) | Model learning 2 | Dataset 2: model learning & evaluation | Paper: “Exploring the Query Halo Effect in Site Search: Leading People to Longer Queries” (2017) |
11 (1-5) | Analysis and discussion | Dataset 2: error analysis | Deadline Model comparison assignment: May 8 |
12 (8-5) | Recap on statistical concepts | Dataset 2: reporting | |
13 (15-5) | Big data & Responsible Data Science | Q&A | Deadline Final assignment: June 14 |
N.B. There is no class on March 13 and March 20.
Data
- Dataset 1: OpenML speeddating data
- Dataset 2: News and clickbait posts from Facebook
Course grading
The assessment of the course consists of a written exam (60% of the course grade) and a practical part (40% of the course grade). The practical part is subdivided into: (1) Visual analytics assignment (10%); (2) Feature extraction assignment (5%); (3) Model comparison assignment (5%); (4) Final assignment (20%). Both the grade for the written exam and the average grade for the practical assignments should be 5.5 or higher in order to complete the course. If one of the assignments is not submitted, the grade for that assignment is 0.
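As an illustration of how these weights combine, here is a minimal Python sketch. It is not an official grade calculator: the function name `course_result` and the dictionary keys are made up for this example, and it assumes that the "average grade for the practical assignments" is weighted by each assignment's share of the course grade. The page does not say whether that average is weighted or how grades are rounded, so no rounding is applied.

```python
# Weights taken from the grading paragraph (fractions of the course grade).
EXAM_WEIGHT = 0.60
PRACTICAL_WEIGHTS = {
    "visual_analytics": 0.10,
    "feature_extraction": 0.05,
    "model_comparison": 0.05,
    "final_assignment": 0.20,
}
PASS_MARK = 5.5


def course_result(exam, practical):
    """Return (course_grade, practical_average, passed).

    `practical` maps assignment names to grades on a 0-10 scale.
    An assignment that is missing from the mapping counts as 0.
    Assumption: the practical average is weighted by assignment share.
    """
    total_practical_weight = sum(PRACTICAL_WEIGHTS.values())
    weighted_practical = sum(
        weight * practical.get(name, 0.0)
        for name, weight in PRACTICAL_WEIGHTS.items()
    )
    practical_average = weighted_practical / total_practical_weight
    course_grade = EXAM_WEIGHT * exam + weighted_practical
    # Both the exam and the practical average must reach the pass mark.
    passed = exam >= PASS_MARK and practical_average >= PASS_MARK
    return course_grade, practical_average, passed


if __name__ == "__main__":
    grades = {
        "visual_analytics": 8.0,
        "feature_extraction": 6.0,
        "model_comparison": 6.0,
        "final_assignment": 7.0,
    }
    print(course_result(7.0, grades))
```

Under this reading, skipping the final assignment cannot be compensated by the other three practical grades: even with a 10 on each of them, the weighted practical average is 5.0, which is below 5.5.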
Earlier editions of this course
Link to the course page for this course in 2017-2018