CS 598: Reliability of Cloud-Scale Systems

Fall 2018

Tianyin Xu
4108 Siebel Center

Tu/Th 14:00–15:15
1103 Siebel Center

Teaching Assistant
Ranvit Bommineni

Office Hours
Tu/Th 17:00–18:00 or by appointment
4108 Siebel Center


Course Overview

The purpose of this course is to teach the principles and practices of reliability engineering in modern "cloud-scale" systems, and to expose students to research on software and system reliability. We will look at how large-scale systems fail in the real world, and we will study state-of-the-art reliability techniques and practices, including those widely adopted in industry and new ideas proposed by academia.

We will be going over the following topics:

This is a research-oriented seminar course with a major course project.

Reading List

The course does not have a textbook. Instead, the course material will come from seminal, noteworthy, or representative papers and articles from the literature. Each lecture (except the first) will have two assigned papers to read, typically one from academia and one from industry. You should read these papers before coming to class and be prepared to discuss them. Occasionally I will also list recommended readings; you are encouraged, but not required, to read those.

I highly recommend that you read Griswold's advice on how to read a research paper. The take-home message is that you are not done reading a paper until you can answer a set of questions about it.

I also strongly encourage you to discuss the papers with other students in the class; you may have insights that others do not, and vice versa. Students often form reading groups, which I heartily encourage. Note, however, that group discussion is not an effective substitute for actually reading the paper.

You are required to write reviews for the assigned papers. The review form (which consists of a number of questions) will be posted on Piazza. Reviews are due at 11:59pm on Mon/Wed (the day before each class). The paper reviews account for 10% of your overall grade.

Class Participation

Since this is a discussion-based course, class participation is required. We will discuss the papers and articles that we will all have read before each class. I will lead discussions by asking questions of students at random in class. Note that your answers to these questions form 10% of your overall grade, so it is important that you both show up to class and read the papers.


The best way to learn is by doing. You will undertake your own research project, either individually or in a group of two to three. I will provide a list of ideas to get you started (see the project page), but I highly encourage you to pursue your own ideas. You will write a project report and present it at the end of the course.

You are expected to be aware of UIUC's academic integrity guidelines. Any violation of course or university policies will be treated seriously and could lead to severe repercussions. Don't cheat. It's not worth it.