Stochastic Halpern iteration in normed spaces and applications to reinforcement learning.

In this seminar, I will present recent results on the oracle complexity of the stochastic Halpern iteration with minibatching, a method designed to approximate fixed points of nonexpansive and contractive operators in finite-dimensional normed spaces. Under the assumption of uniformly
bounded variance from the stochastic oracle, we show that the method achieves an oracle complexity of $\tilde{O}(\varepsilon^{-5})$ to obtain an $\varepsilon$-accurate expected fixed-point residual for nonexpansive operators. This improves upon previously known rates for the stochastic Krasnoselskii-Mann iteration.

We also establish a general lower bound of $\Omega(\varepsilon^{-3})$ that holds for a broad class of methods, including all averaged iterations, even when minibatching is used. In the case of $\gamma$-contractive operators, a suitable modification of the method yields an improved complexity bound of $O(\varepsilon^{-2}(1-\gamma)^{-3})$.

As an application, we propose new model-free algorithms for solving both average and discounted reward Markov Decision Processes (MDPs). Notably, in the average reward setting,our approach applies to weakly communicating MDPs without requiring prior knowledge of system parameters.

Date: May 19, 2025 at 14:30:00 h
Venue: Sala de Seminario John Von Neumann, CMM, Beauchef 851, Torre Norte, Piso 7.
Speaker: Pablo Contreras Fernández
Affiliation: Universidad Diego Portales.
Coordinator: Haritha Cheriyath
More info at:
Event website
Abstract:
PDF

Posted on May 16, 2025 in Seminars, SIPo (Seminario de Investigadores Postdoctorales)