Description
Current generative artificial intelligence approaches are typically based on Large Language Models (LLMs). LLMs make it possible to estimate the probability of a sequence and to continue a sequence that has been started. Graph-structured data is relevant in many fields: it is used to represent knowledge (e.g., in the form of taxonomies, ontologies, knowledge graphs) and appears in disciplines such as chemistry (e.g., molecular graphs, chemical reaction pathways). Although LLMs can to some extent process graph-structured data when it is simplified into sequences, this has drawbacks, as the models then do not operate directly on graphs. Furthermore, these models typically do not take the semantics (e.g., axioms from an ontology) and the dependence structure of the data into account.
The goal of this project is to investigate the limitations of current approaches and to develop new LLM-based methods to estimate the probability of a graph and to continue and extend graphs. These approaches will then be applied to graphs that combine descriptions of the molecular structure with logical statements from an ontology related to chemistry. We will focus on Transition Metal Complexes (TMCs). TMCs are of great interest because of their potential uses in a wide variety of material technologies, including as metallodrugs for chemotherapy or as catalysts for industrial chemical processes.
Specific project requirements
- Master's degree in computer science, data science, mathematics, statistics, or another relevant field
- A solid background in statistics
- Good programming skills (e.g., Python) and the ability to work with version control tools (e.g., Git)
- Experience with (graph) neural networks, graph-structured data, semantic technologies (RDF/OWL/SPARQL), and generative models is an advantage but not required, as these can be learned.
Supervisors
- Researcher Basil Ell, basile@ifi.uio.no (contact person for inquiries about the position)
- Associate Professor Johan Pensar
- Associate Professor Riccardo De Bin