Postdoctoral Researcher position in Advancing Reliable LLM-based Data Curation Systems


About the project

We invite applications for a postdoctoral research position in the Foundations of Algorithmic Verification group led by Prof. Joël Ouaknine. The successful candidate will work in close collaboration with an industrial partner on the verification of software programs based on Large Language Models (LLMs), and will help bridge scientific research and applications.

Project Insight: We are embarking on a pioneering project that aims to develop reliable LLM-based data curation systems for data verification and data enrichment tasks such as verifying or discovering entity relationships from textual documents and/or the Web.

An LLM-based data curation system decomposes complex data problems into manageable sub-problems, each addressed using LLMs. However, these models can introduce uncertainties and errors, including hallucinations, which hinder their adoption in industrial production environments where high accuracy is critical.

Consider a knowledge graph enrichment system designed to identify or infer relationships between two entities within a document. Such a system may use a long-context LLM capable of processing the entire document, or it may employ a Retrieval-Augmented Generation (RAG) process, including GraphRAG, to pinpoint and analyze the most relevant information. However, research suggests that both strategies can yield inaccuracies, which makes them difficult to deploy in production environments.

This project aims to propose a verification methodology that ensures the reliability and accuracy of an LLM-based data curation system at both the sub-component and whole-program levels.

Additionally, the project will focus on several critical research areas:
  1. Effective retrieval of pertinent information from documents.
  2. Balanced integration of RAG and long-context LLMs to mitigate trade-offs.
  3. Detection and correction of "hallucinations" or incorrect inferences by LLMs.
  4. Verification of LLM-based reasoning to ensure result accuracy.
  5. Optimization of overall system efficiency.
The postdoctoral researcher will help define this methodology, then develop and refine it, and will assist in building a system optimized for data curation using LLMs.

Focus of the position:
  1. Research and develop innovative verification methods to ensure the reliability and accuracy of LLM-based data curation programs.
  2. Collaborate actively with industrial partners and engage in the creative design and development of an LLM-based data curation system.
The successful candidate will be hired by, and work at, the Max Planck Institute for Software Systems in Saarbrücken. The position nevertheless requires frequent collaboration with, and visits to, research partners, in particular TU Wien (Vienna, Austria), UCL (London), and the University of Calabria (Rende, Cosenza, Italy), as well as industrial partners. In addition, the successful candidate is expected to complete one or more internships in industry. The project will build on methods and software provided by our industrial partners. We are thus looking for a candidate who is keen and able to liaise with industry, and who is interested in transformational research on practical problems of industrial relevance.

Your qualifications and responsibilities

Required:

Beneficial:

For informal enquiries, please contact Prof. Joël Ouaknine (joel@mpi-sws.org).

To apply, please send a cover letter and CV by email to Ms. Lena Schneider (lschneid@mpi-sws.org).

Applications will be reviewed until a suitable candidate is found. To ensure full consideration, please submit your application on or before 25 Nov. 2024. We expect to hold online interviews in early December 2024.

