Thesis subject:
Title of the thesis subject:
Stress-Based Modulation of Hippocampal Replay-Inspired Techniques in Reinforcement Learning
Project acronym:
SMaRT-RL
Which AXIS of the DIM C-BRAINS does this thesis subject fall under:
Axis 3
Keywords, separated by commas:
Stress modelling, Reinforcement learning, Hippocampal replay, Computational modeling, Cognitive and affective neuroscience
Team presentation:
Institute or centre:
ETIS lab, CY Cergy Paris Université
Team (Other):
NEUROCYBERNETICS
Team leader:
Nistor Grozavu (Director of ETIS), Alexandre Pitti (Coordinator of the Neurocybernetics Team)
Team email:
direction@etis-lab.fr ; etis_secretariat@etis-lab.fr
Team telephone:
+33134256633, +33134257541, +33611264590
Team website:
https://www.etis-lab.fr/neuro/
Team administrative information:
Managing organisation and its contact details:
CY Cergy Paris Université
e-mail: services.valorisation@cyu.fr
Surname and first name of the administrative manager:
Mahmoud Shakhsian Mohammad
Email of the administrative manager:
contrats-conventions@ml.u-cergy.fr
Telephone of the administrative manager:
Contact details of the person authorised to sign the agreement with the DIM C-BRAINS manager:
Surname and first name of the authorised signatory:
Gatineau Laurent
Status:
President
Full contact details:
address: 33 Blvd du Port, 95011 Cergy-Pontoise
e-mail: presidence@cyu.fr
Thesis supervision:
Full name of the thesis director:
Lola Cañamero
Status of the thesis director:
Full Professor
HDR :
Yes
Date obtained or expected date of the HDR + comment if necessary:
2006 (equivalent), see attached supporting document
Number of PhD students currently supervised by the thesis director + expected defence date:
5 PhD students (12/24, supervision rate: 100%; 11/25, supervision rate: 50%; 11/26, supervision rate: 40%; 12/26, supervision rate: 30%; 11/27, supervision rate: 100%)
Doctoral school of affiliation of the team leader:
STIC (Sciences et Technologies de l'Information et de la Communication) - ED EM2PSI
University:
CY Cergy Paris Université
Email of the thesis director:
lola.canamero@cyu.fr
Full name of the thesis co-supervisor:
Elisa Massi
Email of the thesis co-supervisor:
elisa.massi@ensea.fr
Full address, telephone and email of the partner team (if joint supervision):
Thesis subject:
Summary of the thesis subject (1000 characters):
Animals and humans process vast amounts of information but retain only the most relevant experiences, with emotions playing a key role in memory consolidation. Emotions, particularly those related to stress, enhance attention and arousal and influence brain regions such as the hippocampus. In neuroscience, hippocampal reactivations are known to help recall and consolidate experiences. In artificial intelligence (AI), strategies inspired by these reactivations, especially in reinforcement learning, have been shown to accelerate learning. In emotionally charged circumstances, the relationship between stress, its most prominent hormone (cortisol), and memory functions in the hippocampus is complex. This project aims to explore the link between stress and hippocampal reactivations via a computational model and to test it in artificial agents. This bio-inspired approach could (a) reveal new insights into the generation of hippocampal replay mechanisms and (b) improve the learning efficiency of AI.
Full thesis subject (8000 characters):
Most animals and humans encounter an immense amount of information and experiences throughout their lives, yet they retain and recall only the most relevant ones, or a partial but meaningful representation of them. Not everything experienced is stored in long-term memory; emotions play a key role in this memory consolidation process [Bower (1983) Phil. Trans. Royal Soc. London. B, Bio. Sci. 302.1110:387-402]. Emotions strongly influence cognitive processes by enhancing arousal and attentional resources and by modulating the activity of brain regions such as the amygdala, the prefrontal cortex, and the hippocampus [Tyng et al. (2017) Front. Psychol.]. Since the work of Scoville and Milner [(1957) J. Neurol. Neurosurg. Psychiatry 20.1:11-21], the role of the hippocampus as a cognitive mapping center has been studied in neuroscience, particularly through neurophysiological studies on spatial tasks in rodents. This research showed that certain patterns of sequential activation of hippocampal neurons, observed during task execution, are then replayed during sleep or periods of calm wakefulness in the animal [Pavlides and Winson (1989) Journ. Neurosc. 9.8:2907-2918]. These activities have been called hippocampal reactivations and are recognized as a powerful mechanism used, particularly by place cells, to recall, organize, and consolidate past experiences and to infer future ones.
From an artificial intelligence (AI) perspective, it is also known that offline replay and updating of the values associated with an agent’s actions can accelerate learning after a small number of real interactions with rewarding or punishing events (for example, Lin [(1992) Mach. Learn. 8:293-321]). Computational strategies inspired by hippocampal reactivations are of particular interest in AI for tasks where past experiences and acquired knowledge must be re-evaluated and refined to perform better in future decision-making steps. This is the case in reinforcement learning (RL) paradigms [Sutton and Barto (2018) MIT Press]: initially, when no prior knowledge is usually available, the best strategy is to interact with an environment through trial and error, and only as experience accumulates can the agent exploit its previous knowledge to reach a sequence of actions approaching optimal behavior. In mammals, including rodents, this consolidation of knowledge does not depend solely on the animal performing the same actions in the same situations: memories, particularly targeted recalls of experiences, are fundamental for effective learning from a small set of accumulated real experiences. Since the first RL algorithms that exploited strategies inspired by hippocampal reactivations [Sutton (1990) Mach. Learn. Proceed.], several researchers have proposed RL-based computational models inspired by these neuroscientific findings that are capable of reproducing the major experimental results regarding the quantity and type of reactivations generated [Khamassi and Girard (2020) Bio. Cybern. 114.2:231-248].
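The offline replay idea described above can be illustrated with a minimal sketch, assuming a Dyna-style tabular Q-learning agent on a toy one-dimensional corridor. The environment, the function name `dyna_q`, and all parameter values are hypothetical choices made for illustration; this is not the model proposed in this project.

```python
import random


def dyna_q(n_states=6, episodes=20, n_replay=10, alpha=0.5, gamma=0.9,
           epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D corridor with Dyna-style offline replay.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the last state gives reward 1 and ends the episode.
    Returns the Q-table and the number of real steps taken in each episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    model = {}  # (state, action) -> (reward, next_state), filled by real experience
    steps_per_episode = []

    def step(s, a):
        # Toy deterministic environment (an assumption of this sketch).
        s2 = max(0, s - 1) if a == 0 else s + 1
        return (1.0 if s2 == goal else 0.0), s2

    def update(s, a, r, s2):
        # Standard Q-learning update; the goal state has no future value.
        target = r if s2 == goal else r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(episodes):
        s, steps = 0, 0
        while s != goal:
            # Epsilon-greedy choice with random tie-breaking.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            r, s2 = step(s, a)
            update(s, a, r, s2)        # learn from the real transition
            model[(s, a)] = (r, s2)    # remember it for later replay
            for _ in range(n_replay):  # offline replay of remembered transitions
                (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
                update(ps, pa, pr, ps2)
            s, steps = s2, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Calling `dyna_q(n_replay=0)` disables replay and gives plain Q-learning, so the two settings can be compared on the number of real steps needed per episode.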
One of the major forces that impact the emotional state is stress, induced by the interaction between the animal, other animals, and the environment, both perceived through the animal’s senses. The influences of an external stressful environment and of positive social interactions have recently been modeled by Khan and Cañamero [(2022) Front. Rob. AI 9] as the interaction between two hormones, cortisol and oxytocin, and their homeostatic balance. Concerning the role of emotional states related to stress and cortisol, Cañamero’s research has, for example, modeled their influence on learning and adaptation [Hiolle, Lewis and Cañamero (2014) Front. Neurorob. 8], behavior development [Lones, Lewis and Cañamero (2017) IEEE Trans. Cogn. Devel. Sys. 10.2:445-454], pain perception [L’Haridon and Cañamero (2023) ACII] and the appearance of compulsive behavior in Obsessive-Compulsive Disorder [Lewis, Cañamero and Fineberg (2019) Comp. Psych.; Lewis and Cañamero (2019) ACII]. Regarding the role of anxiety and stress in hippocampal memory mechanisms, many key issues still need to be investigated: various animal and human studies have shown that stress impairs many memory functions at the level of the hippocampus [Kim, Pellman, and Kim (2015) Learn. Mem. 22.9:411-416] but, recently, Sherman et al. [(2023) Journ. Neurosc. 43.43:7198-7212] found that cortisol could also enhance hippocampal associative memory functions.
The thesis project aims at a better understanding of the relationship between stress and the generation of hippocampal reactivations by means of a computational model that can be systematically tested in simulation and, eventually, on a real robot. This brings the great advantage of testing our functional hypotheses about the relationship between stress and hippocampal replay in a very controlled experimental set-up, in which we can repeatedly simulate different emotional profiles and sensitivities in our agents. A further aim of this research is to observe and analyze the proposed model embodied on a robotic platform, in order to study its effects systematically in a controlled context that still presents elements of stochasticity and unpredictability. These elements make a robot interacting with the real world a step closer to animal experiments than pure simulations.
As analyzed in Massi et al. [Front. Neurorob. 16], the adoption of RL techniques inspired by hippocampal reactivations for AI has just begun. After validating a strategy that combines offline reactivation generation through a model-based agent with reactivations generated by a model-free method, the question remains of how to optimize the timing and the quantity of this offline reactivation generation. The proposed project therefore aims to link the generation of offline reactivations to the internal emotional state of an agent. The idea is that, by following the concept of homeostasis and the regulation of key emotion-related hormones, such as oxytocin and cortisol [Avila-Garcia and Cañamero (2004) SAB], in a learning task, the agent will change its internal emotional state in relation to (a) its performance in task completion (e.g., in a spatial navigation task, effectively avoiding punishments to quickly reach a reward state), (b) egocentric external stimuli (e.g., in a spatial navigation task, proximity to walls or other agents approaching), and (c) a combination of the above two elements. This internal emotional state will allow the agent to trigger reactivations during moments of intense stress, for example, rather than at arbitrary moments in the task, when they may prove unnecessary. With this bio-inspired approach to AI, we will test and validate optimal strategies for offline reactivation generation within RL algorithms, with the aim of improving and accelerating artificial learning and of uncovering new possible driving mechanisms for hippocampal reactivations in neuroscience.
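As a purely illustrative sketch of the gating idea described above, the following code assumes a single cortisol-like variable that decays toward a homeostatic setpoint, rises with punishments (task performance) and with proximity to a wall (egocentric stimulus), and allows replay only above a threshold. The class name `StressState`, its fields, and all gains and thresholds are hypothetical placeholders, not the model to be developed in the thesis.

```python
from dataclasses import dataclass


@dataclass
class StressState:
    """Hypothetical cortisol-like internal variable kept in [0, 1]."""
    cortisol: float = 0.0
    setpoint: float = 0.0        # homeostatic resting level
    decay: float = 0.1           # fraction of the deviation removed per step
    task_gain: float = 0.3       # weight of punishments (assumed value)
    proximity_gain: float = 0.1  # weight of wall proximity (assumed value)
    threshold: float = 0.5       # replay is allowed at or above this level

    def update(self, reward: float, wall_distance: float) -> float:
        """Update the level from one real interaction and return it.

        reward: negative values count as punishments.
        wall_distance: normalised distance to the nearest wall, in [0, 1].
        """
        task_stress = max(0.0, -reward)
        proximity_stress = max(0.0, 1.0 - wall_distance)
        # Homeostatic relaxation toward the setpoint, then add the stressors.
        level = self.cortisol + self.decay * (self.setpoint - self.cortisol)
        level += self.task_gain * task_stress
        level += self.proximity_gain * proximity_stress
        self.cortisol = min(1.0, max(0.0, level))
        return self.cortisol

    def replay_budget(self, max_replays: int = 20) -> int:
        """Number of offline replay updates allowed at the current level."""
        if self.cortisol < self.threshold:
            return 0
        return round(max_replays * self.cortisol)
```

In an RL loop, the value returned by `replay_budget()` could replace a fixed number of replay updates per step, so that offline reactivations occur only after stressful events and fade as the internal state returns to its setpoint.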
So far, in RL, many strategies inspired by neuroscientific evidence on the hippocampus have been proposed and tested to obtain a spontaneous and optimal generation of different types of replay-like activities [Mattar and Daw (2018) Nat. Neuro. 21.11:1609-1617; Diekmann and Cheng (2023) ELife 12]. Still, a model that bases the generation of such reactivations on emotions, and more specifically on the internal emotional state of an agent, is lacking. This is why the proposed project aims to theorize and test such a model, which could spontaneously enable the generation of RL-based replay to improve memory consolidation and learning over different tasks, and help shed light on the functional relationship between emotional states (stress in particular) and hippocampal replay. This research objective will also be supported by the validation of our results against experimental behavioural data from rodents that could be provided by our collaborators.
Expected involvement of the student in this subject and desired skills:
The student will work under the supervision of Elisa Massi and Lola Cañamero, and in collaboration with neuroscientists and biologists (e.g., the research team of Karim Benchenane at ESPCI Paris), to develop and test computational models, and eventually validate them against experimental data from rodents. The ideal candidate will hold a master's degree in computational neuroscience, computer science or engineering; have the interest, motivation and aptitude to carry out interdisciplinary research, particularly in applying neuroscientific theories to artificial intelligence; and have appropriate knowledge of mathematics and programming skills. Prior knowledge of the basics of reinforcement learning would be desirable.
Feasibility of the project in 3 years - specify the steps:
The PhD project will be conducted over a 3-year period of full-time research activity, under the supervision of Assistant Prof. Elisa Massi and Full Prof. Lola Cañamero. The supervisory team has complementary expertise:
Elisa Massi (www.etis-lab.fr/massi-elisa/) has expertise in reinforcement learning (RL) and computational modeling of behavior, decision-making, and learning, in particular in spatial navigation scenarios. Her previous work focused on studying and modeling the interactions and functional principles of different areas of the nervous system (e.g., cerebellum, hippocampus) to better control simulated artificial agents and robots. Of particular relevance to this project, she is currently working on a model of emotion-driven spatial exploration to unravel the dynamics between emotions (in particular stress and anxiety) and the global exploration of a novel environment. Following the research she conducted during her postdoc, she collaborates with the research team of Karim Benchenane (www.bio.espci.fr/-Karim-Benchenane-Memoire-Oscillations-et-etat-de-vigilance-) at ESPCI Paris (www.espci.psl.eu/fr/), which conducts experiments in mice studying behavior, sleep states, and memory in relation to fear-related and stressful situations.
Lola Cañamero (www.etis-lab.fr/canamero-lola/) has longstanding expertise in biologically-inspired embodied artificial intelligence, robotics, and artificial life models of affective phenomena (motivation and emotion) grounded in mechanisms of homeostasis and hormonal modulation, and their influence on decision making, learning, development, and social interaction. As part of her research she has investigated, in interdisciplinary collaborations, the roles of stress and the (social) context in behavior development, decision making, learning and adaptation, pain perception, and group dynamics, as well as in mental health disorders such as Obsessive-Compulsive Disorder. In this work, she has used artificial agent simulations and robots to model, implement and test neuroscientific hypotheses.
For this PhD project, we propose the following schedule of milestones and deliverables:
1st year: (a) Literature and state-of-the-art review on the role of stressful stimuli in memory and, in particular, in hippocampal reactivations, and study of past bio-inspired algorithms for the generation of replay in RL. (b) Design and validation of the computational model linking exogenous stressful stimuli to the internal emotional state of the agent, mainly based on hormonal responses and the concept of homeostasis. (c) Possible comparison and further validation of this model against experimental data on rodents from our collaborators. (d) Writing and submission of a first paper about the model and the validation experiments.
2nd year: (a) Extension of the model for the spontaneous generation, guided by the emotional state of the agent, of RL updates inspired by hippocampal replay, for task accomplishment. (b) Experiments to validate the extended version of the model, testing whether and how it could guide and facilitate the accomplishment of a task (e.g., reaching a goal state during navigation, avoiding obstacles). (c) Writing and submission of a paper about the extended model and the subsequent experiments on the dynamics of the internal emotional state of an agent linked to stressful stimuli and task accomplishment.
3rd year: (a) Refinement of the model for the generation of replay-like activity modulated by the agent’s stress, in RL. (b) Simulated or embodied experiments (on a robot) to validate the refined version of the model. (c) Additional cross-analyses with the rodent data from our collaborators. (d) Writing and submission of a third paper about the generation of replay-like activity modulated by the agent’s stress, in RL, including cross-analyses with the rodent data from our collaborators. (e) Writing and submission of the thesis manuscript.