Research topics | TRAIL Factory

User-centric XAI and Visualization tools

Objectives:

Enhance the trustworthiness of nonlinear dimensionality reduction algorithms by developing explainability methods for low-dimensional (typically 2 or 3-D) nonlinear embeddings, as analyzed in the context of exploratory data visualization tasks.

Research RoadMap:

- Review of the state of the art: current explainability techniques for nonlinear embeddings.
- Identification of their limitations.
- Design of an explainability framework to address them.

- Development of Python software enabling the use of our framework.
- Data sets: public databases and single-cell transcriptomics.
- Code to be made publicly available on GitHub.

- Writing of a paper detailing our proposed framework, presenting its usage and motivation, and discussing its performance with respect to the state of the art.
- Overleaf project already on track. Software already developed, only requiring fine-tuning.

Expected deliverables (TRAIL FACTORY):

- A first paper to be submitted to an international conference.
- The associated Python software to be deployed on the TRAIL Factory.
- A second paper focusing on a use case of the proposed framework, to be submitted to an international conference, probably in the information visualization community.
- A possible extension of these first two papers into an academic journal submission.

Bias Detection and Mitigation

Objectives:

Detection and/or mitigation of bias in data sets (images or text).

Research RoadMap:

- Review of existing work: types of bias.
- Selection of the types of bias to be detected and/or mitigated.
- Provision of a checklist for detecting biases initially present in a data set.

- Review of existing work: bias mitigation methods for image and text data sets.
- Selection of bias databases on which to apply the methods.
- Application of post-hoc explainability methods to study the detectable biases present.

- Creation of a toolkit gathering a set of bias mitigation methods.
- Application of the methods to the data sets.
- Application of post-hoc explainability methods to study the differences before and after bias mitigation.

Expected deliverables (TRAIL FACTORY):

- Data checklist for detecting initial biases in the data.
- Publication of a paper on the analysis and/or results obtained for bias detection (review, toolkit, method).
- Toolkit for detecting and/or mitigating biases in images and text.

Deep Active Learning

Research RoadMap:

Identify the technical challenges of combining active learning and deep learning

Berlin Proposal

Identify the preprocessing steps and models for Additive Manufacturing images
Deliver scripts for the DAL framework that can be applied to use cases.

Post-Berlin

Experiments on update vs from scratch training
Explore Explainable Active Learning (the idea would be to build a toolbox for it and then adapt it to a type of dataset that we want to explore)
Integrate Deep active learning in ALAMBIC

Expected deliverables (TRAIL FACTORY):

Guidelines for the use of Active learning in practice

Federated Recommender System for the medical field

Abstract

The medical field has always been attracted to the development of information technologies. More recently, recommendation systems (RS) in the health field have received more and more attention. These can be developed for two kinds of users: on the one hand, patients, who can use these systems to become more actively involved in their own healthcare; on the other hand, health professionals, whom such systems can support in their clinical decisions. Health recommendation systems (HRS) aim to recommend diets, physical activities, doctors, and so on. The development of HRS in pathology diagnosis, drug recommendation and other riskier areas is, however, currently challenged by constraints such as the protection of private data and the sensitivity of medical data, which must be considered to guarantee the quality of the recommendations.
The objective of federated learning is to train a single model while keeping the data stored locally on several different devices. In this way, the raw data never leaves the device that holds it, which preserves its privacy. In the hospital context, and more specifically in the context of HRS, the federated approach has significant potential to address the challenges stated above.
The objective of this project is, by the end of the two-week workshop, to develop a functional federated health recommendation system based on real data obtained through CETIC. Moreover, a scientific article will be written describing the developed system. In parallel to this functional brick, our ambition is to develop a second brick using existing ontologies of the medical domain to infer new data from the received database.
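
The federated training principle described above (one shared model, data kept local) can be illustrated with a minimal sketch of federated averaging. Everything here is an assumption made for illustration: the linear model, the squared-error loss, the function names (`local_update`, `federated_average`, `train`) and the toy data are hypothetical and are not taken from the project's actual system.

```python
from typing import List, Tuple

Vector = List[float]
Dataset = List[Tuple[Vector, float]]  # (features, target) pairs held by one site


def local_update(weights: Vector, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Vector:
    """Run a few epochs of gradient descent on one site's private data."""
    w = list(weights)  # copy, so the global model is not modified in place
    for _ in range(epochs):
        for x, y in data:
            error = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * error * xi for wi, xi in zip(w, x)]
    return w


def federated_average(updates: List[Vector], sizes: List[int]) -> Vector:
    """Average the sites' weights, weighted by local data set size."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total
            for i in range(dim)]


def train(sites: List[Dataset], dim: int, rounds: int = 50) -> Vector:
    """Only model weights travel between the sites and the server."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in sites]
        global_w = federated_average(updates, [len(d) for d in sites])
    return global_w


if __name__ == "__main__":
    # Hypothetical toy data: two "hospitals", targets follow y = 2*x + 1
    # (the second feature is a constant 1.0 acting as a bias term).
    hospital_a = [([0.0, 1.0], 1.0), ([1.0, 1.0], 3.0)]
    hospital_b = [([2.0, 1.0], 5.0), ([3.0, 1.0], 7.0)]
    print(train([hospital_a, hospital_b], dim=2))
```

In this sketch the server only ever sees weight vectors; the `(features, target)` pairs stay inside `local_update`. A real deployment would replace the linear model with the recommendation model and delegate the communication to a federated learning framework.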

Research roadmap

- Literature review

- Evaluation of the HRS architecture based on Federated learning with a small database (POC)

- Exploitation of NLP (extraction of relevant content from medical reports)

- Implementation of homomorphic encryption (HE)

- Experimentation with larger structured data sets

- Experimentation with several aggregation methods

- Evaluation of other recommendation engines based on neural networks

Tasks in progress

- Writing a first paper on Overleaf for the international conference in Lisbon (https://healthinf.scitevents.org/CallForPapers.aspx)

- Evaluation of the HRS architecture based on Federated learning

Tasks performed

Evaluation of the HRS architecture based on Federated learning with a small database (POC)

We have built a system, called F-DRS, which aims at recommending the most appropriate drug(s) for a patient according to their clinical profile (e.g. pathology, medical examinations, age, gender). The particularity of the proposed system lies in the integration of federated learning (FL) to build the recommendation model. In this way, our system overcomes the above-mentioned medical data privacy issues. The recommendation algorithm used by F-DRS is Neural Collaborative Filtering, which works on the principle of collaborative filtering. Our FL architecture is based on the Flower framework.
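
To make the recommendation step more concrete, here is a minimal sketch of how a Neural Collaborative Filtering model scores a (patient, drug) pair: each identifier is mapped to an embedding, the two embeddings are concatenated, and a small multilayer perceptron turns them into a score between 0 and 1. This is not the F-DRS implementation. The class name `TinyNCF`, the layer sizes and the random, untrained weights are assumptions for illustration, and only the MLP branch of NCF is shown.

```python
import math
import random
from typing import List


def make_matrix(rows: int, cols: int, rng: random.Random) -> List[List[float]]:
    """Small random matrix standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyNCF:
    """Hypothetical, untrained NCF-style scorer (MLP branch only)."""

    def __init__(self, n_patients: int, n_drugs: int,
                 emb_dim: int = 4, hidden: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.patient_emb = make_matrix(n_patients, emb_dim, rng)
        self.drug_emb = make_matrix(n_drugs, emb_dim, rng)
        self.w1 = make_matrix(hidden, 2 * emb_dim, rng)  # hidden layer
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]  # output layer

    def score(self, patient: int, drug: int) -> float:
        """Predicted relevance of `drug` for `patient`, in (0, 1)."""
        # Concatenate the patient and drug embeddings.
        x = self.patient_emb[patient] + self.drug_emb[drug]
        # Hidden layer with ReLU activation.
        h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in self.w1]
        logit = sum(w * hi for w, hi in zip(self.w2, h))
        return 1.0 / (1.0 + math.exp(-logit))  # sigmoid

    def recommend(self, patient: int, k: int = 3) -> List[int]:
        """Indices of the k highest-scoring drugs for this patient."""
        scored = [(self.score(patient, d), d) for d in range(len(self.drug_emb))]
        return [d for _, d in sorted(scored, reverse=True)[:k]]


if __name__ == "__main__":
    model = TinyNCF(n_patients=5, n_drugs=6)
    print(model.recommend(patient=2, k=3))
```

In a federated setting, the embeddings and layer weights would be the parameters that each site trains locally and the server aggregates.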

Distributed & Secured Artificial Intelligence
