Current Research

We carry out research on many topics in protein informatics, including the following areas. Please use the thumbnails to navigate to summaries of the research by group members in each area.

Protein Structure

Image of ocrook

Oliver Crook (Postdoc)

The pharmaceutical industry regularly uses hydrogen-deuterium exchange mass spectrometry (HDX-MS) to inform key decisions in small molecule, antibody, and vaccine R&D. However, the statistical analysis of HDX-MS remains primitive, holding back important, potentially life-changing discoveries. One key complication is that peptide spectra are manually assessed for quality, and peptide masses are frequently corrected by domain experts. Furthermore, excessive amounts of HDX-MS data are discarded, and inappropriate statistical methods are routinely applied. I develop scalable and extensible software methods to improve reproducibility and interpretation in structural mass spectrometry, along with statistical and machine learning tools for analyzing such data.

Image of couteiral

Carlos Outeiral (Postdoc)

Protein engineering — the design of protein variants with desirable properties — is a central pursuit in biotechnology. In therapeutic discovery, after a promising antibody candidate has been found, it is often necessary to reduce immunogenicity, eliminate aggregation or increase plasma half-life while preserving binding affinity. In synthetic biology, engineered enzymes — for example, PETases that can rapidly degrade plastic, or designed enzymes that can catalyse new reactions — can be improved by increasing thermal stability and enhancing expressibility while conserving, or even boosting, catalytic efficiency. These pursuits have traditionally been carried out experimentally, either by rationally designing mutations, or with directed evolution, techniques which are limited to a small number of tested variants. In recent years, novel computational tools have arisen that can screen hundreds of thousands or millions of variants in short times. I am interested in progressing this field by developing multimodal deep learning methods, which incorporate diverse sources of biological information, to deliver the next generation of protein engineering algorithms.

Image of asiddiqui

Alexi (Hussain) Siddiqui (DPhil)

Understanding protein function requires the probing of both structure and dynamics. Traditional methods have limitations when attempting to capture dynamic behaviour. Hydrogen-deuterium exchange mass spectrometry (HDX-MS) quantitatively assesses conformational dynamics, and empirical models have been used to link the data to molecular dynamics (MD) simulations, potentially offering a more complete view of protein behaviour. My research focuses on reliable and robust methods for generating accurate conformations that are relevant to experimental HDX-MS data. This work is also relevant to recently released structure prediction models. Using the wealth of existing data, we want to apply these methods to gain new insights. All code developed will be released and will be readily adaptable to existing analysis pipelines.

Image of q(annie)

Qurat ul ain (Annie) (DPhil)

Current approaches for protein design require multiple iterations of the design-make-test experimental cycle and provide limited control over the properties of the resulting molecules. I’m developing deep learning-based methods for designing de novo proteins with specific physicochemical properties. Computationally, this becomes a multi-objective optimisation problem where the output must be novel, diverse and physically plausible - how exciting! Feel free to reach out if you’re interested in similar topics and would like to have a chat.

Image of gabrahams

Gabriel Abrahams (DPhil)

Proteins are remarkable nano-machines that carry out the myriad functions required for life to exist. Engineering proteins to have novel functions has a vast range of applications, from medical developments such as combating anti-microbial resistance to climate-friendly industrial manufacturing. In my DPhil, I am working to develop a machine learning pipeline to steer directed evolution: a method for utilising the power of natural evolution to produce proteins with desirable capabilities that were not required to survive in nature. These experiments will be performed in the lab, on a massively high-throughput screening platform currently being developed by the Engineered Biotechnology Research Group at Oxford.


Image of mraybould

Matthew Raybould (Postdoc)

My research applies immunoinformatics to improve therapeutic design and to better our understanding of the immune response. During my DPhil, I captured and compared structural representations of therapeutic and natural antibodies, leading to new structure-aware approaches for in silico developability assessment and screening library design. My research is now focused on incorporating structural awareness to improve our ability to identify broader sets of antibodies with functional commonality and to define the functional boundaries of different classes of adaptive immune receptor.

Image of awatson

Alex Greenshields Watson (Postdoc)

I am fascinated by the receptors of the adaptive immune system, specifically TCRs and Antibodies. These proteins hold the key to understanding our response to disease and our entire immune history. My post-doctoral research involves exploring the limitations of deep learning structure predictors in the context of these proteins, as well as developing new deep learning models to facilitate identification of the crucial regions which mediate epitope binding. In addition, I help maintain publicly available sequence and structure databases such as Observed Antibody Space (https://opig.stats.ox.ac.uk/webapps/oas/) and The Structural T-Cell Receptor Database (https://opig.stats.ox.ac.uk/webapps/stcrdab-stcrpred/).

Image of ahummer

Alissa Hummer (DPhil)

Antibodies and alternative antibody molecules (e.g. nanobodies) are increasingly important classes of therapeutics characterized by high binding specificity and affinity for targets. However, developing antibody therapeutics is time-consuming and costly. My research aims to address this by developing machine learning methods to improve computational antibody design against a desired epitope. In particular, I will be focusing on in silico antibody affinity maturation and humanization.

Image of lchinery

Lewis Chinery (DPhil)

Recently, the need for rapid vaccine and antibody-therapeutic development has become widely recognised. Thanks to the vast volume of antibody data now available, computational methods offer perhaps the greatest opportunity to both speed up and reduce the cost of this development process. My DPhil aims to advance this area further by using Machine Learning techniques to identify promising antibody-antigen leads with diverse sequence and structure profiles that can then be taken into the lab.

Image of ggordon

Gemma Gordon (DPhil)

Antibodies are a highly successful class of biotherapeutic; however, their high molecular weight poses some challenges during production and manufacture. Nanobodies offer potential as an alternative, as they are much smaller and can show comparable specificity and affinity. However, developing biotherapeutics is non-trivial; issues can arise during manufacturing that may impede the success of the product. My research will concentrate on developing computational tools to highlight and predict developability issues in potential nanobody therapeutics.

Image of nquast

Nele Quast (DPhil)

I develop and train deep learning models for T-cell receptor structures. I'm interested in training models that retain equivariance, merge sequence and structure information and can be injected with conditions or constraints. I'm also interested in the structure of the interface between TCRs and their pMHC antigen. Beyond my research I'm passionate about improving gender representation in STEM and have acted as president of the Oxford Wom*n in CS society.

Image of oturnbull

Oliver Turnbull (DPhil)

For an antibody to make an effective therapeutic, it must both bind to its target and be free from developability issues, such as aggregation, poly-specificity, and poor expression levels. By either limiting our search space to developable antibodies, or building methods to engineer out developability issues, the success rate of therapeutic antibodies can be increased. My DPhil aims to tackle both problems using generative machine learning methods.

Image of fspoendlin

Fabian Spoendlin (DPhil)

Antibodies are an important component of the immune system and are increasingly used as therapeutics. Recent advances in protein structure modelling make it possible to accurately predict the structure of antibodies from their amino acid sequence. A limitation of current structure prediction tools is that they only predict the structure of a single conformation of an antibody. However, antibodies are flexible molecules that frequently transition between a set of distinct structural conformations and flexibility is key to many functional properties. During my DPhil, I aim to develop antibody structure prediction tools that capture the flexibility of antibodies and predict the structure of multiple conformations.

Image of bmcmaster

Benjamin McMaster (DPhil)

T cells are a key part of our immune system, responsible for fighting pathogens and regulating immune responses. To identify foreign invaders, T cells use their receptors (TCRs) to rapidly screen and identify antigens. Although key to our health and survival, the map between TCR composition and antigens is still poorly understood. I aim to apply newly developed deep learning models in protein structure prediction to TCR data to better understand the rules that govern antigen-specific T cell response.

Image of hcapel

Henriette Capel (DPhil)

Antibodies are an important class of biotherapeutics. The process of engineering a therapeutic antibody is time- and cost-intensive, with many candidates failing due to developability issues such as low expression, low solubility, and high aggregation. My research aims to develop computational methods to improve the outputs of antibody developability workflows. These tools will incorporate experimental data, including negative data, which is essential to guide the antibody engineering process.

Image of iellmen

Isaac Ellmen (DPhil)

Antibodies work by binding to their targets (antigens), and either inhibiting their function or activating other components of the immune system. Predicting the mode by which an antibody binds to its cognate antigen is called antibody-antigen complex modelling or docking. While general protein complex prediction has seen great improvements in recent years, driven by methods such as AlphaFold Multimer, antibody-antigen complexes are still difficult to model because we rarely have useful homologs to provide co-evolutionary information. My DPhil project is focused on developing new machine learning docking models to more accurately predict antibody-antigen complexes.

Small Molecules

Image of gmorris

Garrett Morris (Associate Professor)

My primary focus is on methods development in computer-aided drug discovery, chiefly in high-throughput docking, ligand-based virtual screening, network pharmacology, cheminformatics, bioinformatics, machine learning and, more recently, protein engineering. Current research projects include: addressing the limitations of scoring functions in docking, in particular to improve our understanding of molecular recognition of small molecules; handling receptor flexibility in protein-ligand docking; and fragment-based drug discovery.

Image of rsanchez

Ruben Sanchez (Postdoc)

Fragment-based drug screening is a popular approach for drug discovery in which a set of small chemical compounds is assayed against biological targets in order to identify weakly binding hits that can later be exploited to produce lead compounds. My research focuses on the development of machine learning tools that exploit 3D information about hits and protein targets with the aim of facilitating drug discovery. I am particularly interested in automatic fragment merging and compound scoring using unsupervised machine learning techniques. In collaboration with the XChem team, we aim to implement ready-to-use applications that can help experimentalists with no computational skills to perform better experiments.

Image of mferla

Matteo Ferla (Postdoc)

In fragment-based drug discovery, low molecular weight hits are identified in high-throughput screens, such as in the XChem facility at the Diamond Light Source with which I collaborate, and subsequently elaborated into larger compounds, leveraging the premise that analogues bind in a similar manner. Previously, I developed Fragmenstein, a tool that merges and places compounds by stitching the template molecules together to create a monster conformer requiring energy minimisation. This approach has spawned several derived methods, which I apply in the hit elaboration part of the ASAP consortium (https://asapdiscovery.org/) to create novel antivirals.

Image of acarbery

Anna Carbery (Postdoc)

Fragment-based drug design campaigns rely on initial fragment screens thoroughly exploring the target binding site. However, this is not guaranteed with current fragment libraries, which are designed to be as diverse as possible. In collaboration with XChem at Diamond Light Source, I am using historic fragment screens to further understand protein-fragment interactions. This will provide a basis for the generation of target-specific fragment libraries that can more comprehensively explore the binding site and increase the potential for diverse lead compounds.

Image of fboyles

Fergus Boyles (Research Software Engineer)

Accurately predicting the binding affinity of a small molecule to a protein target is a key problem in both molecular docking and virtual screening. My research involves investigating how machine learning techniques can be used to effectively leverage the increasing abundance of binding affinity and protein structure data to improve the scoring functions used to predict binding affinity. I am particularly interested in identifying which molecular features are most informative of binding activity and how this varies between families of proteins.

Image of mmokaya

Maranga Mokaya (DPhil)

Ideally, in drug discovery, once a target has been identified and characterised, it should be possible to efficiently and exhaustively search chemical space for bioactive molecules with specific physicochemical properties. Recent progress in computational capabilities and machine-learning methods means that de novo molecular generation tools are a step toward this ideal. In collaboration with Exscientia, my research will focus on further developing machine learning tools for molecular design. Subsequent work will involve the elucidation and development of chemical synthesis pathways that will enable us to make new compounds more effectively to tackle disease.

Image of lklarner

Leo Klarner (DPhil)

Accelerating the discovery of novel and more effective therapeutics is an important pharmaceutical problem in which deep learning is playing an increasingly significant role. However, real-world drug discovery tasks are often characterized by a scarcity of labelled data and significant data drift—a setting that poses a challenge to standard deep learning methods. My research focuses on developing new algorithms that are able to leverage scientific prior knowledge and constraints to overcome these challenges and show improved robustness and generalisation in practical, out-of-distribution settings.

Image of swills

Steph Wills (DPhil)

Fragment-based drug discovery (FBDD) involves screening low-molecular-weight compounds against a target of interest; the resulting hits can then be optimized to become larger, more potent lead-like compounds. In collaboration with XChem, I will be exploring how to exploit the rich structural data that result from crystallographic fragment screens to guide fragment-to-lead optimization, primarily using fragment merging approaches. Initial work will focus on improving the efficiency with which we can sample accessible chemical space by identifying fragment merges from commercially available compound libraries, thus overcoming issues with synthetic accessibility. Subsequent work will explore how to prioritize molecules for purchase and/or synthesis and the use of de novo design to generate novel compounds.

Image of lvost

Lucy Vost (DPhil)

Fragment-based drug discovery consists of developing compounds for a target beginning from fragments that are known to weakly bind to it. When elaborating on a fragment in such campaigns, information about known ligands and the protein pocket can both be leveraged to maximise the binding ability of the end result. However, using information about known ligands has been demonstrated to bias the drug design process towards compounds similar to those already in use. In collaboration with IBM Research, I am investigating ways to perform fragment elaboration using exclusively information from the protein pocket.

Image of mbuttenschoen

Martin Buttenschoen (DPhil)

I am interested in machine learning models that predict protein-ligand binding. Currently I am working with graph neural networks and generative models.

Image of gdurant

Guy Durant (DPhil)

Computational tools in drug discovery, especially machine-learning-based tools, are typically tested on clean and often flawed benchmarks, leading to an optimistic characterisation of their ability. My research explores how this gap between reported performance and real-life performance can be accounted for and measured, specifically for predicting the binding affinity between a small molecule drug and a protein target. I hope to build upon this to improve the accuracy and generalisability of these models and then incorporate structural uncertainty into these models' decision-making.

Image of kfieseler

Kate Fieseler (DPhil)

My research aims to increase the efficiency of hit finding and development in small molecule drug discovery. By utilizing statistical and machine learning techniques, we can tackle the difficult task of deconvoluting structural and biophysical data of thousands of compounds, which can reduce the time and improve the accuracy of identifying hits. In my collaboration with XChem, I am working directly with crystallographers and chemists to develop these computational tools that will reduce their workload.

Image of ivalsson

Isak Valsson (DPhil)

My research interests involve developing robust machine learning methods for early drug-discovery problems. Recently, I’ve been working on developing structure-based scoring functions that work better in an out-of-distribution (OOD) setting, i.e. when the training data occupies a different area of chemical space than the test data. Additionally, I’ve been exploring different ways to benchmark the OOD performance of binding affinity predictors, and different ways of featurising bound protein-ligand complexes.

Image of dadlard

Dylan Adlard (DPhil)

Antimicrobial resistance (AMR) is one of the leading public health concerns of the 21st century and is becoming increasingly intractable as the continued overuse of antimicrobials in health and agriculture accelerates the development and spread of resistance. In collaboration with Oracle, my DPhil project therefore focuses on building generalisable machine learning and deep learning models, featurised with structural and physicochemical information about the drug target, to predict AMR in Mycobacterium tuberculosis within a diagnostic framework. I am also a member of the Modernising Medical Microbiology group at the NDM.

Image of yziv

Yael Ziv (DPhil)

My current research interests lie in structure-based drug design, where the primary focus is on developing small molecules that have a strong and specific affinity for a particular 3D protein structure. My research focuses on deep-learning generative models, exploring innovative approaches to integrate information about the target protein into the design process. I am particularly interested in novel deep-learning methods that can effectively incorporate the target protein while prioritizing the generation of molecules that are both chemically and physically plausible.

Image of araja

Arun Raja (DPhil)

My DPhil research involves the development of geometric deep learning methods for small molecule drug discovery grounded in physics and chemistry. Specifically, I am using deep learning for quantum-level representations of molecules as a precursor for tasks in lead molecule optimization such as property prediction and protein-ligand binding affinity prediction.

Image of smoney-kyrle

Sam Money-Kyrle (DPhil)

Accurate prediction of molecular properties, such as protein-ligand affinity, off-target binding, toxicity, and mutagenicity, is an intrinsic component of small molecule drug discovery. My research focuses on the development of robust and generalisable computational tools that provide greater structural understanding of the interactions between small molecules and proteins. I am particularly interested in the application of novel deep learning methods and in evaluating whether these approaches are capable of learning the underlying biochemistry of binding interactions.

Image of cclark

Charlie Clark (DPhil)

The chemokine signalling network drives inflammation and therefore many inflammatory diseases. However, it has evolved to be highly redundant to resist pathogenic shutdown, and so successful anti-chemokine therapeutics must target multiple chemokines simultaneously. During my DPhil, I will apply experimental and computational techniques to this ‘poly-pharmacological’ problem by characterising promiscuous therapeutics that can target multiple chemokines simultaneously and tackle chemokine-driven inflammatory disease.