Recent Posts

Transformer

In this blog post, we will be going over the Transformer architecture. Introduced by Google in 2017 to solve language translation, Transformers revolutionize...

Markov Sequence Model

A sequence model is a machine learning model that captures patterns in sequential or chronological data and uses them to make predictions.

Online Monte Carlo and TD learning

In Monte Carlo, we played multiple episodes, accumulated rewards through out and averaged it. But there is a real uncertainity about the episodic length in r...

Bellman Equation using Policy Iteration

In Bellman-based methods, we do not start with a known policy. The idea is to begin with a random policy, evaluate how good that policy is, improve it based ...

Deriving the Bellman equation

The Bellman equation is a fundamental recursive relationship in reinforcement learning, expressing the value of a state in terms of immediate rewards and the...