Machine learning and stochastic control

H. Pham

Practical sessions (TP): Samy Mekkaoui and Alexandre Alouadi

This course presents recent developments at the interface between stochastic control and machine learning. It is structured around three main themes:

Part I: Neural network algorithms for PDEs and stochastic control

Approximation by neural networks, combined with the efficiency of gradient-descent algorithms, has recently led to remarkable advances in solving high-dimensional partial differential equations (PDEs), particularly those arising from optimal control in finance. We will present the main methods developed in the literature, based on deterministic or probabilistic formulations:

- Deep Galerkin,
- Deep BSDE,
- Deep backward BSDE,
- Control learning and value function iteration.

These methods will be illustrated by several numerical tests.
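
To give a flavour of these methods, here is a minimal sketch of the Deep Galerkin idea on a toy one-dimensional heat equation: a network u_theta(t, x) is trained to cancel the PDE residual at random collocation points while matching a terminal condition. The architecture, domain and hyperparameters below are illustrative choices, not a reference implementation from the course.

```python
# Minimal Deep Galerkin sketch for the toy PDE  u_t + 0.5 * u_xx = 0
# on [0, T] x [-3, 3] with terminal condition u(T, x) = sin(x).
# All choices (network size, domain, learning rate) are illustrative.
import torch
import torch.nn as nn

T = 1.0
g = lambda x: torch.sin(x)                        # terminal condition u(T, x) = g(x)

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(),  # u_theta(t, x): R^2 -> R
                    nn.Linear(64, 64), nn.Tanh(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for it in range(2000):
    # random collocation points in the time-space domain
    t = T * torch.rand(256, 1, requires_grad=True)
    x = 6.0 * torch.rand(256, 1, requires_grad=True) - 3.0
    u = net(torch.cat([t, x], dim=1))
    u_t = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    u_x = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x.sum(), x, create_graph=True)[0]
    residual = u_t + 0.5 * u_xx                   # PDE residual at the sampled points

    # penalty enforcing the terminal condition at t = T
    xT = 6.0 * torch.rand(256, 1) - 3.0
    uT = net(torch.cat([torch.full_like(xT, T), xT], dim=1))

    loss = (residual ** 2).mean() + ((uT - g(xT)) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```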

Part II: Deep reinforcement learning

When the dynamics of the system are unknown, optimal strategies can be learned directly from interactions with the environment. This is the principle of reinforcement learning (RL), an approach that is increasingly used in stochastic control. We will review the fundamentals of RL as well as its continuous-time extensions: policy gradients, actor-critic methods, Q-learning, and algorithms adapted to continuous state and action spaces.
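
To recall the basic mechanics behind these methods, here is a minimal sketch of tabular Q-learning with an epsilon-greedy behaviour policy on a hypothetical five-state chain environment; the environment, rewards and hyperparameters are illustrative only.

```python
# Minimal tabular Q-learning sketch on a hypothetical 5-state chain:
# the agent moves left/right and gets reward 1 on reaching the last state.
import numpy as np

n_states, n_actions = 5, 2           # actions: 0 = left, 1 = right
gamma, alpha, eps = 0.95, 0.1, 0.1   # discount, learning rate, exploration
Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def step(s, a):
    """One transition of the chain: returns (next state, reward, done)."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    done = (s_next == n_states - 1)
    return s_next, (1.0 if done else 0.0), done

def greedy(q_row):
    """Greedy action with uniform tie-breaking."""
    best = np.flatnonzero(q_row == q_row.max())
    return int(rng.choice(best))

for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        a = int(rng.integers(n_actions)) if rng.random() < eps else greedy(Q[s])
        s_next, r, done = step(s, a)
        # Q-learning temporal-difference update
        target = r + gamma * (0.0 if done else Q[s_next].max())
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next
```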

Part III: Generative diffusion models and applications to sequential data

This part introduces generative modeling methods based on dynamic optimal transport and Schrödinger bridges, which are at the heart of new diffusion-type models. We will detail the theoretical foundations (static vs. dynamic optimal transport, entropic regularization), the associated algorithms (Schrödinger bridges via Sinkhorn, simulation of controlled stochastic trajectories), and their recent applications to time series generation, for instance of financial data.
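
To fix ideas, here is a minimal sketch of the Sinkhorn fixed-point iterations for entropically regularized (static) optimal transport between two one-dimensional empirical samples, the basic building block behind Schrödinger-bridge algorithms; the data and the regularization strength are illustrative choices.

```python
# Minimal Sinkhorn sketch: entropically regularized optimal transport
# between two 1-d empirical samples (illustrative data and parameters).
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=100)       # samples from the source distribution
y = rng.normal(1.0, 0.5, size=100)       # samples from the target distribution
a = np.full(100, 1.0 / 100)              # uniform source weights
b = np.full(100, 1.0 / 100)              # uniform target weights

C = (x[:, None] - y[None, :]) ** 2       # quadratic cost matrix
eps = 0.5                                # entropic regularization strength
K = np.exp(-C / eps)                     # Gibbs kernel

u, v = np.ones(100), np.ones(100)
for _ in range(200):                     # Sinkhorn fixed-point iterations
    u = a / (K @ v)
    v = b / (K.T @ u)

P = u[:, None] * K * v[None, :]          # entropic transport plan coupling x and y
cost = (P * C).sum()                     # regularized transport cost
```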

Practical sessions will be devoted to the implementation and application of the algorithms studied, with concrete cases of optimal control or time series generation.

Bibliography

[1] M. Germain, H. Pham, X. Warin: Neural networks-based algorithms for stochastic control and PDEs in finance, in Machine Learning and Data Sciences for Financial Markets: A Guide to Contemporary Practices (A. Capponi and C.-A. Lehalle, eds.), Cambridge University Press, 2023.

[2] B. Hambly, R. Xu: Recent advances in reinforcement learning in finance, Mathematical Finance, 2023

[3] M. Hamdouche, P. Henry-Labordère, H. Pham: Generative modeling for time series via Schrödinger bridge, 2023.

[4] Y. Jia and X.Y. Zhou: Policy gradient and actor-critic learning in continuous time and space: theory and algorithms, Journal of Machine Learning Research, 2022.

[5] Y. Jia and X.Y. Zhou: q-Learning in continuous time, Journal of Machine Learning Research, 2023.

[6] R. Sutton and A. Barto: Reinforcement Learning: An Introduction, second edition, MIT Press, 2018.

[7] V. De Bortoli, J. Thornton, J. Heng, A. Doucet: Diffusion Schrödinger bridge with applications to score-based generative modeling, NeurIPS, 2021.