Data and Network Visualization
Prerequisites: Proficiency with Python is required to take this course; see the “To satisfy the prerequisites” section below.
Course schedule: This course will take place twice a week during the second half of the term, starting on November 2, 2017.
Course Level: Master and PhD
Office: 609 Nador 11
Office hours: TBA or by appointment
Brief introduction to the course
The Internet and modern computers have given us vast amounts of data, so it is more important than ever to understand and analyze these data. A picture is worth a thousand words, so visualizations, from scientific plots to interactive data explorers, are crucial to summarize and communicate new discoveries.
The goals of the course
The major goals of this course are to understand how visual representations can help in the analysis and understanding of complex data, how to distinguish a good visualization from a bad one, how to design effective visualizations, and how to create your own visualizations using programming skills and visualization software. We will achieve these goals by evaluating existing visualizations, teaching general theoretical principles of information design, and walking students through the analysis and visualization of real-world datasets during hands-on classes.
Lectures: 12 classes of 100 minutes each. For most classes, we will spend the first 30 minutes introducing the day's concepts and the rest of the class doing lab exercises or evaluating visualizations. A computer will therefore be required during some lectures. Students can form groups and use their own laptops.
- Basics of human perception and cognition of visualization
- Meaningful visualization for research
- Data “munging” or cleaning to process and visualize data
- Visualization of static data with Matplotlib, using Python
- Principles of network visualization (layouts, information reduction)
- Network visualization using Gephi
- Basic interactive visualizations using Plotly
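As a rough illustration of the data “munging” topic listed above, the following is a minimal sketch of cleaning a small table before plotting it. The dataset, the column names, and the helper `clean_rows` are hypothetical and only serve the example; the exercises used in class may look quite different.

```python
import csv
import io

# Hypothetical raw data: inconsistent whitespace, a missing value,
# and a non-numeric entry in the measurement column.
RAW_CSV = """city, temperature_c
Budapest, 11.5
 Vienna ,9.0
Prague,
Bratislava, n/a
Ljubljana,12.25
"""


def clean_rows(raw_text):
    """Return (city, temperature) pairs, skipping rows that cannot be parsed."""
    reader = csv.reader(io.StringIO(raw_text))
    next(reader)  # skip the header row
    cleaned = []
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        city, value = (field.strip() for field in row)
        try:
            temperature = float(value)
        except ValueError:
            continue  # drop missing or non-numeric measurements
        cleaned.append((city, temperature))
    return cleaned


if __name__ == "__main__":
    for city, temperature in clean_rows(RAW_CSV):
        print(f"{city}: {temperature:.1f} C")
```

Only the standard library is used here; the cleaned list of pairs could then be passed to a plotting library such as Matplotlib.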
1st class: Is a picture worth 1000 words? Principles of information design
2nd class: What is a good visualization? Cognitive aspects of visualization.
3rd class: Getting you from a dataset to several plots
4th class: Lots of visualization examples, and practice going from data to chart
5th class: Line charts, pie charts, box plots, violin plots
6th class: Visualizing multivariate data
7th class: More practice with multivariate data
8th class: Visualizing networks with Gephi
9th class: More visualization with Gephi. Visualization with NetworkX
10th class: Interactive visualization with Plotly
11th class: Interactive visualization with Plotly
12th class: Final Project Presentation
We will spend the first 20 minutes of classes 5 to 9 discussing good and bad visualization practices, with examples taken from research papers and from the web (as part of the course assignment).
The Visual Display of Quantitative Information
by Edward Tufte, 2001.
Interactive Data Visualization for the Web
by Scott Murray, O'Reilly Media, 2012.
Online resources and documentation provided during classes
Further information, such as the course website, assessment deadlines, office hours, and contact details, will be given during the course.
The instructor reserves the right to modify this syllabus as deemed necessary at any time during the term. Any modifications to the syllabus will be discussed with students during a class period. Students are responsible for information given in class.
In short: don't do it! You may work with friends to help guide problem solving, or consult Stack Overflow (or similar) to work out a solution, but copying (from friends, previous students, or the Internet) is strictly prohibited. NEVER blindly copy blocks of code; we can tell immediately.
If caught cheating, you will fail this course. Ask questions in recitation and at office hours. If you're really stuck and can't get help, write as much code as you can and write comments within your code explaining where you're stuck.
By the end of the course, students will have acquired the following skills:
- Apply methods for visualization of data from a variety of fields
- Write code to create scientifically reproducible figures
- Distinguish good from bad visualizations
- Use basic principles of human perception and cognition in visualization
- Create some basic web-based interactive visualizations
- Lay out and visualize network data
Students are expected to attend lectures and hands-on sessions, to hand in one assignment during the course and to develop a project during the entire term.
- Attendance at classes and hands-on sessions: 30% of the final grade
- Assignment(s): 30% of the final grade
- Final project: 40% of the final grade
For the final project, students will have to apply, and show proficiency with, the principles and tools used during the course. A few options for projects will be suggested in class.
To satisfy the prerequisites
This course focuses on data visualization for research. As such, we mostly use a programming language, Python, to create reproducible figures and interactive plots. With one exception (Gephi), we make no use of programs with a graphical user interface, such as spreadsheet applications. Since we need to pick one programming language for the course, we require students to prove proficiency with Python before the course starts, in one of the following ways:
a) Take for a grade, or audit, the course MATH 5016 “Scientific Python”, given during the first six weeks of the term. Please note that space in that course is limited and you might not get a place on the regular list. In that case, please consider option b).
b) Take a MOOC on programming with Python and show the certificate.
c) Show and discuss a project you developed in Python. Projects written by someone else (from the web, a friend, or previous students) will not be considered.
If you use option b) or c) and there is a waiting list for the course, the certificate or the project must be shown before the beginning of the term to hold a place among the regular attendees. If there is no waiting list, it is fine to provide the certificate or show your previous project before the course begins (November 2nd). However, the instructor bears no responsibility if you do not satisfy the prerequisite and need to drop the course.