Foundations of Programming and Security

Presented Virtually June 14-26, 2021

Jane Street
Microsoft
Amazon Web Services
ACM Sigplan
Tezos

In 2018, there was an one-week lecture series on the basics of programming language theory for students with little or no prior experience presented by Paul Downen of the University of Oregon and Jan Hoffmann of Carnegie Mellon University. Those videos are available on-line and available on youtube.

2021 Speakers

Stephanie Balzer — Carnegie Melon University

Session-Typed Concurrent Programming

Message-passing is a successful concurrency paradigm that has been adopted by various practical languages such as Erlang, Go, and Rust. The Servo a browser engine developed in Rust, for example, exploits message-passing to heavily parallelize tasks that are executed sequentially in existing web browsers. Messages are exchanged along channels, which are typed with enumeration types. Whereas typing ensures that only messages of the appropriate type are communicated along channels, it fails to guarantee protocol adherence. Session types can be used to type communication channels to statically verify protocol adherence. Session types were introduced in the context of process calculi, but made their way into practical languages as libraries or embedded DSLs. In these lectures, we focus on a particular family of session types that stand in close correspondence with intuitionistic linear logic. We first focus on linear session types, which describe the communication protocols between a providing process and its client. We discover together their grounding in linear logic and the strong properties of protocol adherence and deadlock-freedom they entail. Then we extend the calculus to support shared session types, allowing a process to have multiple clients and accommodating a wider range of programming examples. Without any precaution, the extension puts protocol adherence and deadlock-freedom at risk. We explore these challenges and discuss how they can be addressed.

The plan is to complement the lectures with hands-on sessions during which the students are introduced to Ferrite a session type library in Rust supporting linear and shared session types.

Ugo Dal Lago — University of Bologna

From Program Equivalences to Program Metrics

Program equivalence is the most crucial concept in semantics: programs that are considered indistinguishable are allowed to receive the same semantics, while every way to give semantics to a language implicitly suggests a notion of equivalence between programs (namely the one equating programs having the same meaning). This course aims at introducing and motivating this fundamental concept in the abstract, then showing how it can be spelled out in some standard, concrete calculi. We then argue about its limitations, which become particularly evident in all contexts where program correctness or program indistinguishability are valid only in an approximate sense. Finally, we will show how the theory of equivalence between programs can be rendered quantitative, thus becoming a theory of distances between programs and therefore being based on metrics rather than relations.

Robert Harper — Carnegie Melon University

Principles of Programming Languages

This series of lectures is intended to introduce the student to defining programming languages using operational methods. These methods require a minimum of mathematical machinery, yet are sufficiently powerful to describe a broad range of language concepts. The central tool is that of an inductive definition of a judgment, or assertion. The statics of a language consists of an inductive definition of its abstract syntax, including binding and scope of variables, and an inductive definition of a typing judgment, which expresses context-sensitive constraints on the formation of programs. The dynamics of a language consists of an inductive definition of a labelled transition system that defines how programs are executed. These lectures will explore the systematic use of the methods of process calculus to define the states of the dynamics. A necessary condition for a language to be well-defined is that the statics and the dynamics cohere in that the statics can be viewed as a specification of the behavior of program fragments. The basic form of coherence, called type safety, is expressed by progress and preservation theorems that rule out ill-defined states arising in any execution. More sophisticated forms of coherence make use of type-indexed families of predicates or relations, called logical relations, to express deeper properties such as Reynolds’s parametricity theorem.

Reference: Robert Harper, Practical Foundations for Programming Languages, Cambridge University Press.

Justin Hsu — University of Wisconsin–Madison

Reasoning about Probabilistic Programs

Probabilistic programs and randomized algorithms play key roles in many growing areas of computer science. While the mathematical semantics of such programs is mostly well-understood, techniques for verifying and reasoning about probabilistic programs are an active area of research. Existing tools are limited to simple programs, and have difficulty scaling to verify many properties of interest. In this mini-course, we will present the classical verification techniques for probabilistic programs, and survey some recent methods.

Patricia Johann — Appalachian State University

Semantics of Advanced Data Types

The course will deal with advanced data types in functional languages such as Haskell and Agda. Among these are the generalizations of ADTs known as nested types, the generalizations nested types known as GADTs, and other higher-kinded types.

We will first introduce a formal syntax for a robust class of advanced data types, and construct fully functorial initial algebra semantics for them. For this, we will work in locally presentable categories. Since local presentability is a modest hypothesis, needed to handle even the simplest ADTs, our construction will show that if a category supports fully functorial initial algebra semantics for ADTs, then it does so for nested types, GADTs, and other higher-kinded data types as well.

We will next show how the semantics we construct gives rise to a notion of deep induction for nested types. Standard structural induction for nested types (including ADTs) inducts over only the top-level structure of data, leaving any data internal to the top-level structure untouched. By contrast, deep induction inducts over all of the structured data present. It therefore not only provides principled and practically useful induction rules appropriate for richly structured nested types (including ADTs), but also specializes to solve the long-standing problem of deriving induction rules for bushes and other truly nested types.

Any remaining time in the course will be used to argue that this same semantics gives rise to parametric models for nested types, as well as to discuss parametricity for GADTs.

Alejandro Russo — Chalmers University

Information-flow Control Libraries

Information-Flow Control (IFC) emerges as a promising technology to harden programs against information leakage and corruption. To avoid such problems, IFC restricts programmers from writing code that irresponsibly distributes (modifies) sensitive data. Special purpose IFC-languages have been developed over the years but the impact in practice has been somewhat limited. Rather than producing new languages from scratch, IFC can also be guaranteed via libraries. As long as developers follow the libraries' APIs, it is guaranteed that their code will not reveal sensitive information. We believe that this approach makes IFC technology more likely to be adopted. The course introduces security problems regarding protecting sensitive data, the foundations for IFC, and many IFC libraries' principles and programming languages' techniques. The material presented in the course is based on recent research results.

Alexandra Silva — University College London

Kleene Algebras and Applications

We review material on Kleene algebra and extensions thereof, with a focus on using equivalence proofs as a verification technique.

Nikhil Swamy — Microsoft Research

Proof-oriented Programming in F*

Software systems that provide high assurances of safety, correctness, and security are a critical societal need. This course offers an introduction to the F* programming language and proof assistant, focusing on techniques to imbue programming in a variety of paradigms (including functional, imperative, concurrent, and distributed programming) with mathematical specifications and proofs.

Tarmo Uustalu — Reykjavik University

Monads and Interaction

It is standard in the semantics of programming languages to use monads to model notions of computation that involve effects such as input/output, manipulation of store, nondeterminism. Unlike a purely functional computation, an effectful computation cannot return a value on its own: it issues requests to the outside world and needs these responded to make progress. To run, it thus needs to be paired with an environment that is coeffectful, can service these requests. The two need to understand each other and work together. My message in this course is that environments and interaction are important characters in the act of computations. Notions of coeffectful environment are naturally modelled with comonads. Protocols of communication between computations and environments then admit mathematization by what we have christened interaction laws.

I will give a brief general introduction to monads, comonads and their applications in semantics, but will then focus on interaction specifically. My material involves a fair amount of quite abstract constructions, but all come illustrated with concrete examples and almost all mean something for programming.