Event
Bandit techniques for Reinforcement Learning and experimental sciences
Odalric-Ambrym Maillard
Chargé de recherche HDR
Inria
Date: 20 January 2026, Tuesday
Time: 3 pm, Singapore
Venue: S16-06-118, Seminar Room
In this talk, I will provide a short overview
of recent results on multi-armed bandits for reinforcement learning theory.
I will show how a novel paradigm from recent
advances in bandit theory is reshaping the exploration-exploitation challenge
at large, yielding improved algorithms, including in full-blown
reinforcement learning.
We will also explore the structure of generic
MDPs.
In the second part, I will discuss a
stimulating example of the sim-to-real gap in experimental sciences (especially
agroecosystems), revealing a fresh set of challenges to be addressed from both
an applied and a methodological standpoint.