CS 253 - Distributed Systems
Fall 2010

Overview

Instructor: Harsha V. Madhyastha

Lectures: TTh 11:10 a.m. - 12:30 p.m., INTS 1125

Office hours: Send me an email to set up an appointment.

Grading: Project (50%) + In-class quizzes (20%) + Class participation (30%). Papers presented in class count toward class participation.

Mid-term and Final exam: None.

Synopsis: Distributed systems are ubiquitous today. For example, consider the steps that typically occur when you search on Google. When you first type your request into your browser, the processing necessary to execute your command is performed across a distributed system of CPU cores on your computer. That processing results in a query to the DNS nameserver system, itself distributed across the Internet, to determine which of Google's globally distributed data centers should handle your request. The network traffic exchanged with the nameservers and with Google's web servers is sent over the Internet, which is itself a distributed system of routers. Lastly, to service your request once it arrives at one of its data centers, Google's distributed computing and storage infrastructure has to exchange messages among several servers within that data center.

The design of all the above distributed systems---multicore processors, DNS nameservers, content distribution networks, the Internet, data centers---is based on similar objectives such as performance, fault-tolerance, consistency, cost-efficiency, and efficient resource utilization. In this class, we will examine efforts in designing and implementing distributed systems in several domains to understand a) whether similar techniques are used to achieve similar goals in different domains, and b) if not, what domain-specific characteristics drive the need for domain-specific optimization techniques. In a nutshell, the end-goal of this course is to equip the students with a toolkit of the basic rules of thumb in distributed system design, for use in their future research.

Schedule

Please monitor your account on iLearn.ucr.edu for course announcements.

(You can log on to iLearn with your UCR NetID.)

Date | Topic | Reading | Presenters | Slides | Notes
Sep 23 | Introduction | | | Course logistics |
Sep 28 | Cluster-based Services | Cluster-Based Scalable Network Services, SOSP 1997 | | Overview and SNS |
Sep 30 | | Manageability, Availability and Performance in Porcupine: A Highly Scalable, Cluster-based Mail Service, SOSP 1999 | Nicholas | Porcupine, Overview |
Oct 5 | Distributed Storage Systems | The Google File System, SOSP 2003; Dynamo: Amazon’s Highly Available Key-value Store, SOSP 2007 | Yousra, Ting-Kai | GFS, Dynamo | Project proposal due.
Oct 7 | | Bigtable: A Distributed Storage System for Structured Data, OSDI 2006 | Xuetao | Bigtable, Overview |
Oct 12 | Programming Paradigms for Clusters | MapReduce: Simplified Data Processing on Large Clusters, OSDI 2004; Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks, Eurosys 2007 | Enrique, Huy | MapReduce, Dryad |
Oct 14 | | Quincy: Fair Scheduling for Distributed Computing Clusters, SOSP 2009 | Yousra | Quincy, Overview |
Oct 19 | Content Distribution Networks | Globally Distributed Content Delivery, IEEE Internet Computing Magazine 2002; Democratizing content publication with Coral, NSDI 2004 | Masoud, Indrajeet | Akamai, Coral |
Oct 21 | | Latency-Driven Replica Placement, IPSJ Journal; Optional Reading: Answering “What-If” Deployment and Configuration Questions with WISE, SIGCOMM 2008 | Ting-Kai | HotZone, Overview |
Oct 26 | Peer-to-Peer Systems | Looking Up Data in P2P Systems, Communications of the ACM, 2003; Can Internet Video-on-Demand be Profitable?, SIGCOMM 2007 | Sambit, Nicola | DHTs, VoD | Project intermediate report 1 due: Related work, design, and implementation progress/plan.
Oct 28 | | Network Coordinates in the Wild, NSDI 2007 | Indrajeet | Coordinates, Overview |
Nov 2 | Internet Routing | Consensus Routing: The Internet as a Distributed System, NSDI 2008; Symbiotic Relationships in Internet Routing Overlays, NSDI 2009 | Michael, Nicola | Consensus Routing, PeerWise |
Nov 4 | | Moving Beyond End-to-End Path Information to Optimize CDN Performance, IMC 2009 | Masoud | WhyHigh, Overview |
Nov 9 | Data-center networks | A Scalable, Commodity Data Center Network Architecture, SIGCOMM 2008; ElasticTree: Saving Energy in Data Center Networks, NSDI 2010 | Xuetao, Nicholas | FatTree, ElasticTree |
Nov 11 | No class (Veterans Day). | | | |
Nov 16 | | c-Through: Part-time Optics in Data Centers, SIGCOMM 2010 | Huy | c-Through, Overview |
Nov 18 | Multicore processors | Corey: An Operating System for Many Cores, OSDI 2008; The Multikernel: A new OS architecture for scalable multicore systems, SOSP 2009 | Sambit, Enrique | Corey, Barrelfish | Project intermediate report 2 due: Initial results.
Nov 23 | | An Analysis of Linux Scalability to Many Cores, OSDI 2010 | Michael | Linux Scalability, Overview |
Nov 25 | No class (Thanksgiving). | | | |
Nov 30 | Project presentations: Part 1 | | | |
Dec 2 | Project presentations: Part 2 | | | |

Projects

Each student is expected to complete a research project, write it up in the style of a research paper, and present it in class towards the end of the quarter. A suggested list of project topics is posted below, but students are also encouraged to choose their own topics relevant to the class. A good guideline is to pick a topic that you would be interested in pursuing for publication later if the results obtained by the end of the course turn out to be promising.

A candidate list of topics for the course project is available here. (To access from outside UCR, use WebVPN, which you can access using your NetID.)

Students are strongly encouraged to work on projects in groups of two or more, with expectations commensurately higher for larger groups. Apart from the final project presentation and report, each group will also be required to submit intermediate writeups 3 weeks (project proposal and plan of action) and 7 weeks (methodology and preliminary results) into the class. One quarter can go by in a blur, so these intermediate checkpoints are meant to maximize your project's chances of success.

Requirements for the intermediate project milestones, due on the following days: