Suggested PhD Projects

These projects are merely suggestions and illustrate the broad themes of the research group. Potential research students are encouraged and welcome to produce their own suggestions in these research areas or other research areas in the field. All applicants are invited to contact the academic associated with the project when making an application.

A tool for well-behaved APIs

Contact: Laura Bocchi

In this project you will develop a well founded tool to support the design and development of Web APIs, and REST APIs in particular. Common specifications for Web APIs (e.g., using the OpenAPI standard) only describe which resources are exposed by a server, but not the order in which they need to be executed. In many real-world scenarios, a server may need to establish and enforce an ordering for calls to the APIs it provides. For example one may want to enforce that “payment” happens after “authentication” and before “shipment”. These causalities are, at present, expressed in natural language in the documentation of an API, making their understanding and implementation error prone. This project has a theoretical component and a practical component (the balance is negotiable depending on the applicant’s skills and interests). On the theoretical side, you will extend existing API specification with behavioural specifications. One open problem is to ensure the correct composition of behavioural specifications pertaining different aspects (e.g., security, application logic). On the practical side, you will develop a tool that uses behavioural specifications to automatically generate of stubs and test suites, to ensure and verify that the desired causalities are met.

Omniscient Debugging of Haskell Programs

Contact: Olaf Chitil

An omniscient debugger records a trace during program execution and provides a viewing tool to navigate though the execution backwards and forwards. This project aims to build an omniscient debugger for Haskell programs that provides free navigation through the execution along different dependencies, in particular to follow the origin of any (incorrect) part of a data structure. The new debugger will build on some earlier debuggers for Haskell: Hat, Hoed and HatLight. However, all these debuggers were separate of the compiler. In contrast, the new debugger shall follow the existing lightweight stepping debugger in the Glasgow Haskell interpreter and be implemented as a modification of the interpreter. Thus it can take advantage of accessing the virtual machine and working together with untraced compiled code. In practice it is important for debugged programs to consist of both traced and untraced modules for several reasons: It limits the size of the trace, the untraced (majority) of the code executes at normal speed and it allows inclusion of modules in unsupported languages. The integration into the interpreter will also resolve the problems of Hat, Hoed and HatLight with some Haskell language features and keeping up with continuous language changes, especially the type system.

Delta Debugging of Static Errors in Rust

Contact: Olaf Chitil

Delta debugging is a method for determining parts of a program (or program input) that are most relevant for a given observed bug. So it is a method that helps localising the cause of a bug. Delta debugging systematically checks variants of a program (or program input) for the presence of the bug. Delta debugging has been mostly used to localise in a program the cause of a runtime error; my former PhD student Jo Sharrad applied delta debugging to localise the cause of a type error in a Haskell (or OCaml) program. The system programming language Rust has a rich type system to eliminate many classes of memory-safety and thread-safety bugs at compile-time. Hence its constraint-based type system is complex and actually still evolving. Compiler error messages can be difficult to understand. The hypothesis of this research project is that delta debugging can help the Rust programmer to understand why the compiler rejects a specific program.

Program Optimisations in Functional Reactive Programming

Contact: Marco Paviotti

Functional Reactive Programming (FRP) is a way of writing programs that react to changing values over time—for example animations, user interfaces, simulations,

or controllers. In FRP, programs are often written using streams or signals that describe values evolving step by step. Although FRP programs look high-level and declarative, they often hide complex recursive behaviour. Many FRP programs repeatedly: transform streams, combine signals, or build and consume time-based structures. This PhD project studies how to recognise general recursive patterns in FRP programs and use them to perform non-trivial optimisations, such as: removing

unnecessary intermediate streams, fusing multiple signal transformations into one, improving time and memory efficiency without changing observable behaviour. The aim of the project is to make optimisations safe and mathematically sound, using techniques from programming language theory. In particular, by using

guarded recursion (Birkedal et al. 2011) which is a technique that ensures recursive definitions evolve one time step at a time and provides a precise way to reason about infinite or ongoing reactive programs.

For more info feel free to reach out to me via email or check out my website (https://mpaviotti.github.io).

Fine-grained parallelism in compilation

Contact: Michael Vollmer

Often software engineers do not simply build lone applications, they build entire software ecosystems. Companies like Google and Facebook rely on a single, massive code repository (monorepo) for the company’s source code, with the implication that a single change may trigger all downstream code to rebuild and retest. Similarly, open source software package collections like NixOS rely on build farms that frequently rebuild the full set of packages. In order to make this practical, the compilation must be parallel. The most common way to make software build jobs parallel is to process different source code files in parallel whenever possible (this is what “make -j” does), but this coarse-grained parallelism is limited, as big parallel build jobs end up spending a lot of their time bottlenecked on sequential compilation tasks. Ultimately, we need to go further: we need both coarse-grained (between files) and fine-grained (within a file) parallelism. To achieve the latter, the compiler passes themselves must be made parallel, and that’s what this project is about. Taking on this project will require some background in parallel or concurrent programming, and ideally some experience with compilers or programming language implementation.

Using types to specify precise, byte-level data representation of objects in programming languages

Contact: Michael Vollmer

It is common for high-level programming languages to have some mechanisms for allocating objects in memory, and then for connecting and organising these objects into larger structures. For example, languages like C++ and Java support *classes*, and languages like Haskell and ML support *algebraic data types*. Of those four languages, only C++ offers any significant control over the actual memory layout of objects or how and when they are allocated and de-allocated. Unlike C++, the other languages promise *type safety*, and so, the conventional wisdom goes, they must therefore disallow fine-grained control over details of memory representation. However, safe systems programming languages like Rust have shown that it is practical to guarantee type safety while still exposing some low-level details of memory usage. This project aims to continue along those lines and explore how a high-level programming language might give programmers control over the *exact, byte-level representation* of data in their programs while using types to ensure that the resulting program is *memory-safe*. Taking on this project will require some exposure to low-level systems programming in either C/C++ or Rust, as well as an interest in either programming language design or compiler implementation.

The semantics of concurrent systems

Contact: Mark Batty

The team here at Kent is working to understand the behaviour of modern concurrent systems, and we have exposed — perhaps surprisingly — that humanity does not understand the concurrent behaviour of mainstream computer systems very well at all, yet. We have a broad goal of verifying code written above a typical stack comprising mainstream CPU (ARM, x86), mainstream compiler (GCC, LLVM) and mainstream operating system (Linux). Our work has found and fixed bugs in this stack, in particular making changes to the International Standard Organisation’s C and C++ standards. We have found major problems in optimised concurrent language design (C, C++, Java) that have remained an open for a decade, but we are well on our way to a solution, albeit with many things left to do. We use various methods in this work (typically one person has one or two abilities listed here): (1) We write code to simulate the concurrency designs expressed in specifications. (2) We write mathematical expressions of these designs. (3) We use hand proof and mechanised proof (Coq/HOL/Isabelle) to establish valuable properties of the designs. (4) We reverse engineer parts of the system to understand aspects that are described incompletely by specifications. (5) We build logics for reasoning above the systems we have, rather than above idealised models. (6) We dig into the internals of compilers. (7) We work with prototype hardware designs from project partners like ARM, to help understand their upcoming designs. We are looking for new PhD students who would like to contribute to this project. We have the capacity to build skills that you don’t yet have, but we hope that you will be driven to understand the true behaviour of modern computer systems.

Multi-stage semantics and typing via graded modalities

Contact: Dominic Orchard

Staged computation (otherwise known as meta programming) is a powerful technique for writing programs that generate programs. This might be in a simple macro language, or a more involved compile-time code generator, or even a multi-stage pipeline of multiple languages or sublanguages, including JIT. It turns out that intuitionistic modal S4 logic generates a useful type system for staged computations (through a Curry-Howard like isomorphism), detailed in the work of Davies and Pfenning (A modal analysis of staged computation, 2001). The aim of this project is to generalise their approach to graded modalities, capturing truly multi-stage programs: not just 2 stages, but n-stages, with their interactions controlled and described by a graded modal type system. This approach can then be combined with other kinds of graded modality to explain the role of data across multiple levels of a staged compilation pipeline.

Programming Languages and Systems Research Group

Featured story

A tool for well-behaved APIs

Contact: Laura Bocchi

Omniscient Debugging of Haskell Programs

Contact: Olaf Chitil

Delta Debugging of Static Errors in Rust

Contact: Olaf Chitil

Program Optimisations in Functional Reactive Programming

Contact: Marco Paviotti

Fine-grained parallelism in compilation

Contact: Michael Vollmer

Using types to specify precise, byte-level data representation of objects in programming languages

Contact: Michael Vollmer

The semantics of concurrent systems

Contact: Mark Batty

Multi-stage semantics and typing via graded modalities

Contact: Dominic Orchard