Job offer (non-permanent)
Post-doctoral position in dissimilarity learning for interactive clustering
Overview
The research performed in SCHISM will focus on interactive pattern-mining
and interactive clustering in a chemoinformatics context. Interactive
data-mining is a recent research direction that breaks with the older paradigm of
specifying parameter settings for an algorithm, letting the algorithm run,
interpreting the results, and, based on this interpretation, adjusting the
parameter settings and restarting the process. Interactive data-mining instead
proposes partial or preliminary results to the user, collects their feedback, and
uses this feedback to bias the mining process going forward.
The overall goal of SCHISM is to develop a robust approach to interactive
data-mining that integrates both pattern-mining and clustering, and to deliver
a prototype that allows users to launch pattern-mining or clustering algorithms,
visualize the results, give feedback, and rerun mining operations that take
this feedback into account.
Mission
The general idea of the research work carried out at the LITIS lab for the SCHISM
project is to rely on dissimilarity representations both to infer relevant
clusterings and to propose tools for visualization, analysis and expert feedback in
the form of constraints on the clusters. The goal is to define a common
representation so that the data and constraints are projected into lower-dimensional
spaces while preserving neighborhood properties. To do so, a preliminary study
has been carried out over the last few months on using Random Forest (RF)
models to measure dissimilarities and then using these dissimilarities to infer
relevant clusterings.
The post-doctoral researcher will be in charge of (i) deepening the use of RF
dissimilarities for deriving a relevant clustering and (ii) proposing mechanisms for
analysis and interaction with the expert. More specifically, this latter task must
meet a two-fold objective: allowing the expert to select and analyze
(sub-)clusterings in order to provide feedback, and translating this feedback into
dissimilarity constraints in order to adjust the result. This research
work will therefore involve becoming familiar with Random Forest methods and
Candidate profile
The successful applicant will:
1. possess or be on track to complete a PhD in computer science or applied
mathematics with a focus on machine learning or data-mining; fluency in
written and spoken English or French is essential;
2. have strong programming skills (Java, Python, etc.) and an in-depth
understanding of statistics and machine learning;
3. have a productive publication record;
4. have a strong work ethic and time management skills along with the ability
to work independently and within a multidisciplinary team as required.
Salary will be in line with European and French guidelines w.r.t. years of
research experience. Additional funding is available for travel.
Required skills
Organization
Applications should be sent to Laurent Heutte (laurent.heutte@univ-rouen.fr) and Simon Bernard (simon.bernard@univ-rouen.fr).
Your application should include:
1. curriculum vitae
2. statement of past research accomplishments, career goals and how this
position will help you achieve those goals
3. two representative publications
4. contact information for three references