Jump to accessibility statement Jump to content
Group photo of members

Suggested PhD Projects

This projects are merely suggestions and illustrate the broad themes of the research group. Potential research students are encouraged and welcome to produce their own suggestions in these research areas or other research areas in the field. All applicants are invited to contact the academic associated with the project when making an application.

Mathematical specification of C++ concurrency libraries

Contact: Mark Batty

We write rigorous mathematical specifications of messy real-world parts of the computer system, providing a formal basis for reasoning about code, and helping to improve the official definitions of the system. Our fixes for flaws in the low-level concurrency specifications of C and C++ have been adopted by the International Standards Organisation (ISO). We have shown that fundamental flaws remain, but we have a new definition that fixes these. The ISO are developing definitions of higher-level concurrency libraries, e.g. distributed counters and concurrent queues. This project will build mathematical specifications of the new C++ library definitions, incorporate them into our mathematical formalism, and validate implementations of the library either using mathematical proof or automated solvers. This is an ambitious theoretical project with the potential for real-world impact. It will involve developing skills in formal specification, mechanised theorem proving (e.g. HOL, Coq, Isabelle), and the use of automated solvers (e.g. QBF).

Scalable automated solving for concurrency

Contact: Mark Batty

We have a new specification of concurrency for languages like C, C++ and Java that fixes known flaws in their respective designs. Our specification has a property called compositionality: whereas previous specifications defined the behaviour only whole programs, our specification gives meaning to program subcomponents like libraries. Existing tools for executing programs according to the detailed concurrency specifications of C/C++/Java can cope with roughly 20-line programs, leaving real code well out of reach. In our compositional specification, we can carve up the program into parts and use faster solvers for simpler components, reusing results in the manner of Facebook’s Infer. This project will focus on leveraging the compositionality of our specification to build a high-performance executable version of the specification. This project will involve building a tool (e.g. with OCaml), and using various solving techniques drawn from work on C/C++ and Java.

Behavioural APIs for robust communicating systems

Contact: Laura Bocchi

In this project you will develop a theory and tool for design and implementation of robust communicating systems. The project will centre on Behavioural APIs. Behavioural APIs are abstract specifications of application-level protocols; they go beyond traditional APIs in that they specify conversation patterns among components/services (and not just sets of supported operations). Behavioural API have been successfully applied to guarantee safety (e.g., absence of deadlocks) via code generation, verification and monitoring. The project will focus on extending known techniques by gradually dropping ideal-world assumptions and handling unexpected scenarios (like unexpected delays and failures).

From correct real-time models to correct real-time programs

Contact: Laura Bocchi

Verification of real-time models (e.g., timed automata) is supported by well-established theories and tools. Once a real-time model is proved correct, one must make sure that correctness properties do not “get lost in implementation” (i.e., that the implementation of the model preserves its desirable properties). In this project you will develop a formally grounded framework that supports development of correct programs out of correct models. You will work on automata theory (e.g., on novel notions of refinement), programming languages (e.g., on mapping models into APIs for some language, such as Erlang, Go, or Java), and tool development.

Contracts for communicating Erlang processes

Contact: Olaf Chitil

Erlang is a concurrent functional programming language designed by Ericsson with dynamic process creation and asynchronous message passing. The aim of the research project is first to define a language for expressing properties of process creation and communication, for example communication protocols. These properties are expressed as contracts that state which program part to blame if the property does not hold. Contracts shall prevent many concurrency-related runtime errors but still allow common Erlang programming patterns. The second challenge is to devise methods for checking these contracts. Checking could be either statically at compile time or dynamically at run-time. Static type systems have been defined on top of the dynamically typed Erlang language, but these concentrate on the traditional, non-concurrent aspects of the language. The project will build on a wide area of previous research on type systems, tracing, dynamic and static contracts for functional languages and session types.

Tracing real-world functional programs

Contact: Olaf Chitil

The Haskell tracer Hat (www.haskell.org/hat) consists of a trace generation system plus various tools for viewing a trace. Hat demonstrated its enormous value for locating faults in Haskell programs. However, it is not used in practice, because its trace generation system only handles Haskell 98, a small subset of the real-world Haskell language compiled by the Glasgow Haskell compiler ghc. Thus the heart of this project is the design and implementation of a local modification of the Glasgow Haskell compiler such that its executables generate traces at run-time. This work will be followed by detailed studies of using traces of real-world Haskell programs. The experience will directly lead to ideas for improvements. For example, it is already clear that for most purpose only the computations of some program parts have to be traced whereas standard libraries and other parts are trusted to be correct and hence are not traced. Although trusting is a well-known concept in tracing, a useful scheme still needs to be developed.

CRS compilation via nested types

Contact: Stefan Kahrs

Combinatory reduction systems are fairly hard to implement correctly and efficiently, because a simple implementation will be inefficient because it uses repeated renamings of bound variables (de Bruin index adjustments), while an efficient one has a hard task to get the renamings right. However, the de Bruin indices can be expressed through nested types too which would guarantee (a large part) of the correctness of the renaming. The task is to bring these two worlds together.

Coalgebraic invariant verification

Contact: Stefan Kahrs

Certain invariants of rewrite systems, functional programs and the like can be proven by verifying them for (all) finite coalgebras. This requires an induction over finite coalgebras which is very tedious by hand – therefore the aim is to push this into theorem provers/checkers such as Isabelle or Coq.

Generalising type checking, using slicing and symbolic execution

Contact: Stephen Kell

Type-checking is a static program analysis which conservatively rejects programs not provably free of certain run-time errors. It proceeds mostly syntactically, in a way prescribed by the target language semantics. Two usability problems are conservatism and poor error messages. This project is about check for the same run-time errors in a more flexible way, thereby bringing more benefits to more code and more programmers. The outline approach is to reformulate type-checking as a special case of a more general analysis: an underapproximate program exploration (symbolic execution) combined with a program abstraction techniques that can opportunistically recover precise checking (program slicing). Dedicated funding on this or a related project is available from the Institute in Verified Trustworthy Software Systems.

Live programming in a Unix process

Contact: Stephen Kell

The Unix process environment is ubiquitous, but often considered impoverished both semantically (lacking metadata such as type information) and in run-time services (lacking facilities for reflection, garbage collection and dynamic update). For this reason, high-level language runtimes are realised as virtual machines (VMs) that are mostly opaque to the surrounding Unix environment, fragmenting the programming experience and multiplying the relevant tools and APIs. My liballocs system instead transparently extends the Unix process environment with these missing services, working towards a more “open box” interoperable environment. This project is about continuing this work with a particular focus on dynamic software update and other “live” programming facilities. A key challenge is “knowing where the pointers are”, for which Unix rudiments do exist (in debugging and relocation information) but are underdeveloped. By extending these, the goal is to allow reviously infeasible cases of run-time change and/or live development, while maintaining an acceptably low reduction in baseline performance.

Reconciling blocking and nonblocking concurrency abstractions

Contact: Stefan Marr

Blocking concurrency abstractions such as locks, channels, and join operations are designed to ensure deterministic execution. However, their blocking semantics are inherently incompatible with nonblocking programming models such as the communicating event loop model. In such nonblocking models, the benefits of asynchronicity are easily leveraged, but developers need to work around the lack of determinism explicitly. The goal of this project to extend the versatility of blocking and non-blocking programming models by introducing new programming abstractions, without weakening the inherent guarantees of either kind of model. A first task in that direct will be to identify underlying principles of blocking and nonblocking abstractions and the use cases they enable.

Building blocks for dynamic languages with state-of-the-art performance

Contact: Stefan Marr

Dynamic language implementations share many concepts and features, but share rarely implementations. With the Truffle/Graal platform, we get state-of-the-art performance by reusing the existing just-in-time compiler, garbage collectors, memory models, and threading system. However, there is still much duplication of effort to realize basic operations, control-flow structures, AST-level optimizations, debugger support, or concurrency abstractions. While some of is is justified by language differences, there remains a large potential for reuse to simplify language implementation and design exploration. The goal of this project is to identify and design reusable components that provide sufficient flexibility to account for language differences, while enabling reuse and state-of-the-art performance.

Multi-stage semantics and typing via graded modalities

Contact: Dominic Orchard

Staged computation (otherwise known as meta programming) is a powerful technique for writing programs that generate programs. This might be in a simple macro language, or a more involved compile-time code generator, or even a multi-stage pipeline of multiple languages or sublanguages, including JIT. It turns out that intuitionistic modal S4 logic generates a useful type system for staged computations (through a Curry-Howard like isomorphism), detailed in the work of Davies and Pfenning (A modal analysis of staged computation, 2001). The aim of this project is to generalise their approach to graded modalities, capturing truly multi-stage programs: not just 2 stages, but n-stages, with their interactions controlled and described by a graded modal type system. This approach can then be combined with other kinds of graded modality to explain the role of data across multiple levels of a staged compilation pipeline.

Program synthesis from type-based resource-sensitive specifications

Contact: Dominic Orchard

Type systems are one of the most widely deployed verification techniques in programming today. Type systems offer a simple “first defense” against a range of programming mistakes, directly within a language. Since types provide partial specifications of program behaviour, it is sometimes possible to use a type specification (a type signature) to derive parts of a program automatically (program synthesis) therefore speeding up programming. Type-based program synthesis is especially powerful in the context of more advance type theories, such as dependent types, where the synthesis amounts to proof search in logic. Various languages such as Agda and Idris now provide interactive modes where the human and the computer work together to write a program, with a programmer writing a specification and the compiler synthesising as much of a candidate program as possible, and so on. The aim of this project is to take program synthesis to the next level by combining dependent types with linear and graded types. Linear and graded types provide mechanisms for explaining where and how data is used, allowing a view of data as a resource with possible constraints attached. Linear and graded types further restrict the set of possible programs and thus offer a way to improve program synthesis by further constraining a program via “resource sensitive” types. This project will develop both the theory and practice, with a chance to implement the program synthesis / proof search techniques on top of the state-of-the-art language Granule.

Programming tools for journalists

Contact: Tomas Petricek

Increasing number of data analyses and models that journalists work and want to explain to their readers rely on some programming – data analyses need to fetch data from variety of government data sources, clean them and visualize them while reports about budget depend on economic models or simulations. Code that needs to be written and run for such tasks is often much simpler than what professional software developers need to write, but the programming languages that journalists need to use are the full and complex languages used by software developers.

In this project, we will identify an interesting area of “non-expert programming” that is relevant to journalism and we will design and build a programming environment that is focused on being simple, transparent and easy to understand, so that it can be used by non-experts such as journalists. Two possible areas include analysis of public data (e.g. published at http://data.gov.uk) and building simple agent-based economic models (ABMs), but the applicants are also welcome to propose other interesting problems.

History and philosophy of programming paradigms

Contact: Tomas Petricek

Programming languages can be classified into programming paradigms they support, such as functional languages, object-oriented languages, logic languages and so on. Although programmers are eager to make immodest claims about programming paradigms, it is not clear at all what such paradigm actually is. Is a paradigm defined by a set of features? A way of thinking? And is there such a thing as programming paradigm at all?

This research project aims to combine computer science with history and philosophy of science. The idea is to explore the technical nature of programming paradigms from a broader philosophical and historical perspective. We will look at the actual history of such paradigms, beyond the popular stories told by the practitioners and we will contrast the nature of “programming paradigms” with philosophy of science notions such as “research paradigms”. The research will also have a technical side – for example, a small programming language can be developed to shed light on various boundaries between programming paradigms.