Learning differential models (and other logic rules) from data with uncertainty (Project #11)

University of Oslo, Oslo Centre for Biostatistics and Epidemiology, Institute of Basic Medical Sciences 

Three-year PhD position

Description

We want to discover, from data, the rules that govern the relationships between factors, functions, or variables. Such rules are useful for prediction, for generalisation, and for understanding processes and systems. We start with rules that can be written as differential equations for systems evolving in time, a setting that belongs to physics-informed machine learning. One example task is to model the evolution of the volume of a patient's tumour. In this case, and in many other real-world situations, few data are available, because measurements are clinically intensive.

In this PhD project, we will develop new methods and algorithms for situations in which data are scarce. One approach will be to combine inference on the differential equation with a stochastic simulation of the system under study, so that synthetic data can be generated and then used for inference. Symbolic regression (SR) is an interpretable approach to learning the time dynamics of the system under study, as it represents the differential equation as a (possibly sparse) tree.

Because the data are always measured with error and the simulation algorithm is stochastic, the estimated differential equation is uncertain. We will find new ways to quantify this uncertainty, for example in terms of distributions over trees of varying dimension. We will study how the uncertainty depends on the data and on the choice of basis functions used in SR. We will also investigate the presence of phase transitions of the system under study, expressed by sudden changes in the closed form of the estimated differential equation.

This PhD project aims to develop new methods with theoretical understanding, but will also test the new approaches on real patient data, with the aim of discovering, for the first time, the differential equations governing breast tumour growth. We will also investigate the estimation of other types of rules beyond differential equations, for example logic relations, from noisy data.

Specific project requirements

  • Master’s degree in statistics, mathematics, machine learning, theoretical computer science or a related quantitative subject with proven competence in statistics and/or machine learning.  

  • Excellent and documented experience in scientific programming is necessary. 

Supervisors

 

Published Jan. 29, 2024 9:35 PM - Last modified Jan. 29, 2024 9:35 PM