Topics for exam
Here are topics that I feel cover the essence of the class. You should be
prepared to answer questions related to them.
- Transparencies and abstractions: What do different distributed system concepts provide with respect to transparency? Given a distributed system technique (e.g., replication), be able to describe how it provides higher-level abstractions over the implementation mechanism to aid programmers, and describe what sorts of transparencies it provides. Questions related to transparencies and abstractions will be along the lines of:
"System Blah is based on [insert technique here]. How does [technique] allow System Blah to provide failure transparency?"
I won't say:
"System Blah is based on [technique]. List the transparencies."
- Networking technologies: Fundamental ideas about TCP, IP, point-to-point messaging, multicast and group messaging.
- Failure models: If I describe a distributed system (e.g., replication via multicast), be prepared to describe the types of failures that you should consider and what mechanisms would be employed to deal with them (e.g., timeouts for failure detection). Understand failure masking (e.g., how a system can turn arbitrary failures into omission failures).
- System and architecture models: Demonstrate understanding of the fundamental properties of, and differences between, client/server, peer-to-peer, remote procedure call, and similar architectural and system models. Understand what differentiates synchronous from asynchronous systems, and what mechanisms are used to constrain asynchronous systems so that they can be treated as approximately synchronous.
- Interprocess communication: Understand the difference between basic message passing and higher-level abstractions like RPC or RMI. Be able to discuss the software layers, such as data marshalling layers, that an RMI system implements to support the abstraction on top of a lower-level messaging layer.
- Clocks, time, global states: Demonstrate understanding of the issues that arise with distributed clocks and the methods for synchronizing them. Be able to discuss how skew and drift impact accuracy, and how protocols can be built to reduce synchronization error (e.g., multiple-message handshaking protocols like NTP). Clocks and time are very important - I expect you to understand this topic. (In other words, you are guaranteed a question on clocks/time.)
- Concurrency control: Locks, semaphores, mutual exclusion, deadlock, livelock, starvation, fairness, sequential consistency, transactions, atomicity. Understand all of those topics (many of them, like fairness and starvation, are just definitions), and be able to identify where they come into play in a real system. I will probably give you pseudocode or describe a system design, and expect you to answer questions related to concurrency control in the context of that pseudocode or system.
- Performance considerations: Understand how relaxing constraints such as sequential consistency, linearizability, and total (versus partial) ordering of events impacts performance. Many systems that we saw were designed to relax constraints while preserving correctness in order to increase performance. For example, transactions can be used instead of mutual exclusion to increase concurrency by allowing concurrent processes to overlap under some circumstances. Be able to describe the serialization of concurrent processes that can result from aggressive locking schemes.
- Fundamental patterns: Many patterns came up in the class to address failure and performance issues, such as replication for fault tolerance. Be able to discuss how these patterns may be implemented, and identify when they are appropriate. These fundamental patterns include replication, coordination methods (e.g., elections), and agreement protocols.
- Specific case studies: Understand how the above topics (patterns, performance, concurrency control, etc.) are present in some key case studies that we looked at. These include DNS, NTP, NFS, the Google File System, etc. I will expect you to understand these systems at a high level - such as the relationships between the components of the system (like how DNS servers relate, or how NTP strata are structured) and how they interact to deal with situations like failure. I will not expect you to understand the intimate details of the actual messages that they pass back and forth - just the concepts. For example, you should be able to explain how time-to-live (TTL) values on cached DNS entries help reduce the overhead of replication by relaxing the expected consistency of DNS entries in the short time after updates are made.
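As a study aid for the clocks and time topic above, here is a minimal sketch of the standard four-timestamp offset and delay estimate that an NTP-style exchange produces. The function name ntp_offset_delay and the sample timestamps are made up for illustration; they are not taken from the class material. The sketch assumes t1 and t4 are read from the client's clock, t2 and t3 from the server's clock, and that all values are in milliseconds.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay from one request/reply pair.

    t1: client clock when the request is sent
    t2: server clock when the request is received
    t3: server clock when the reply is sent
    t4: client clock when the reply is received
    """
    # Round-trip network delay, excluding the server's processing time.
    delay = (t4 - t1) - (t3 - t2)
    # Offset of the server clock relative to the client clock.
    # This is exact only if the two one-way delays are equal.
    offset = ((t2 - t1) + (t3 - t4)) / 2
    return offset, delay


# Hypothetical scenario: the server clock is really 5000 ms ahead,
# the request takes 10 ms, the server spends 10 ms, the reply takes 20 ms.
offset, delay = ntp_offset_delay(t1=1000, t2=6010, t3=6020, t4=1040)
print(offset, delay)  # 4995.0 30
```

In this made-up scenario the estimate (4995 ms) misses the true offset (5000 ms) by 5 ms because the two one-way delays are unequal. The error in such an estimate can never exceed half the measured round-trip delay (here 15 ms), which is why a shorter round trip gives a tighter synchronization bound.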
Send me any questions between now and exam time. I will post the answers here for all to see.