We continue to dig deeper into Sutton's great book on RL (1), and here we focus on Monte Carlo (MC) methods. These can learn from experience alone, i.e. they do not require any kind of model of the environment, unlike, for example, the dynamic programming (DP) methods we introduced in the previous post.
This is extremely tempting, because the model is often unknown or the transition probabilities are difficult to specify. Consider the game of Blackjack: although we fully understand the game and its rules, solving it with DP methods would be very tedious, since we would have to calculate all kinds of probabilities, for example, given the cards currently on the table, what is the probability of a "blackjack", or what is the probability of another seven being dealt? With MC methods we don't have to deal with any of this: we just play and learn from the experience.
Because MC methods estimate values from complete sampled returns rather than bootstrapping from other estimates, their value estimates are unbiased. They are conceptually simple and easy to understand, but they suffer from high variance, and they must wait until the end of an episode before updating, i.e. they do not bootstrap.
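To make this concrete, here is a minimal sketch of first-visit MC prediction in the spirit of Chapter 5: the value of each state is simply the average of the full returns observed after first visiting it. The `sample_episode` function and the `(state, reward)` episode format are assumptions for illustration; all that matters is that episodes come from interaction with the environment, not from a model of it.

```python
from collections import defaultdict

def mc_prediction(sample_episode, num_episodes, gamma=1.0):
    """First-visit Monte Carlo prediction (sketch).

    `sample_episode` is assumed to return a list of (state, reward)
    pairs obtained by following some fixed policy, e.g. one hand of
    Blackjack -- no transition probabilities are ever needed.
    """
    returns_sum = defaultdict(float)   # sum of returns observed per state
    returns_count = defaultdict(int)   # number of first visits per state
    V = defaultdict(float)             # value estimates

    for _ in range(num_episodes):
        episode = sample_episode()
        G = 0.0
        # Walk the episode backwards, accumulating the discounted return
        for t in reversed(range(len(episode))):
            state, reward = episode[t]
            G = gamma * G + reward
            # First-visit: only update if this is the first occurrence of the state
            if all(s != state for s, _ in episode[:t]):
                returns_sum[state] += G
                returns_count[state] += 1
                V[state] = returns_sum[state] / returns_count[state]
    return V
```

Note that the update uses the full return G observed until the end of the episode, which is exactly why the estimate is unbiased but has high variance.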
As mentioned, here we will present these methods following Chapter 5 of Sutton's book…