Structure-generating models for transition metal complexes (Project #5)

University of Oslo, Department of Mathematics

Three-year PhD position

Description

Current generative artificial intelligence approaches are typically based on Large Language Models (LLMs). LLMs make it possible to estimate the probability of a sequence and to continue a partial sequence. Graph-structured data is relevant in many fields, as it represents knowledge (e.g., in the form of taxonomies, ontologies, knowledge graphs) and arises in disciplines such as chemistry (e.g., molecular graphs, chemical reaction pathways). Although LLMs can to some extent process graph-structured data when it is simplified into sequences, this has drawbacks, as the models then do not operate directly on graphs. Furthermore, these models typically do not take the semantics (e.g., axioms from an ontology) and the dependence structure of the data into account.
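
As an informal illustration of the sequence view described above, the following Python sketch flattens a small graph into a token sequence and scores it with the chain rule, log p(x_1..x_T) = sum over t of log p(x_t | x_1..x_{t-1}). Everything in it is an assumption made for illustration and not part of the project: the toy graph and its labels (Fe, N1, N2, C1), the helper names linearize, fit_bigrams and sequence_log_prob, and the add-one-smoothed bigram model that stands in for an LLM. It shows one drawback of simplifying graphs into sequences: the same graph gives different sequences, and possibly different scores, under different node orderings.

```python
import math
from collections import Counter, defaultdict


def linearize(edges, order):
    """Flatten an undirected graph into a token sequence.

    Each edge is oriented and sorted according to the node ranking in
    `order`, so the result depends on an arbitrary choice of ordering.
    """
    rank = {node: i for i, node in enumerate(order)}
    oriented = [tuple(sorted(edge, key=rank.get)) for edge in edges]
    oriented.sort(key=lambda e: (rank[e[0]], rank[e[1]]))
    return [token for edge in oriented for token in edge]


def fit_bigrams(sequences):
    """Count token transitions; "<s>" marks the start of a sequence."""
    counts = defaultdict(Counter)
    for seq in sequences:
        prev = "<s>"
        for token in seq:
            counts[prev][token] += 1
            prev = token
    return counts


def sequence_log_prob(tokens, counts, vocab_size):
    """Chain rule with a bigram approximation and add-one smoothing."""
    log_prob = 0.0
    prev = "<s>"
    for token in tokens:
        numerator = counts[prev][token] + 1
        denominator = sum(counts[prev].values()) + vocab_size
        log_prob += math.log(numerator / denominator)
        prev = token
    return log_prob


# Hypothetical toy graph: a metal centre bound to two donor atoms.
edges = [("Fe", "N1"), ("Fe", "N2"), ("N1", "C1")]
seq_a = linearize(edges, ["Fe", "N1", "N2", "C1"])
seq_b = linearize(edges, ["C1", "N1", "N2", "Fe"])

# Fit the stand-in model on one linearization, then score both.
model = fit_bigrams([seq_a])
print(seq_a, sequence_log_prob(seq_a, model, vocab_size=4))
print(seq_b, sequence_log_prob(seq_b, model, vocab_size=4))
```

Both sequences encode the same set of edges, yet a sequence model treats them as different inputs. A method that assigns a probability to the graph itself would need to be insensitive to such ordering choices.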

The goal of this project is to investigate the limitations of current approaches and to develop new methods that use LLMs to estimate the probability of a graph and to continue and extend graphs. These approaches will then be applied to graphs that combine descriptions of the molecular structure with logical statements from an ontology related to chemistry. We will focus on Transition Metal Complexes (TMCs). TMCs are of great interest because of their potential uses in a wide variety of material technologies, including as metallodrugs for chemotherapy and as catalysts for industrial chemical processes.

Specific project requirements

  • Master's degree in computer science, data science, mathematics, statistics, or another relevant field

  • A solid background in statistics

  • Good programming skills (e.g., Python) and the ability to work with version control tools (e.g., Git)

  • Experience with (graph) neural networks, graph-structured data, semantic technologies (RDF/OWL/SPARQL), and generative models is an advantage but is not necessary, as these can be learned.

Supervisors

 

Published Jan. 29, 2024 9:34 PM - Last modified Jan. 29, 2024 9:34 PM