Tags

RL

Online Monte Carlo and TD learning

In Monte Carlo, we played multiple episodes, accumulated rewards throughout, and averaged them. But there is a real uncertainty about the episodic length in r...

Bellman Equation using Policy Iteration

In Bellman-based methods, we do not start with a known policy. The idea is to begin with a random policy, evaluate how good that policy is, improve it based ...

Deriving the Bellman equation

The Bellman equation is a fundamental recursive relationship in reinforcement learning, expressing the value of a state in terms of immediate rewards and the...


NLP

Transformer

In this blog post, we will be going over the Transformer architecture. Introduced by Google in 2017 to solve language translation, Transformers revolutionize...

Markov Sequence Model

A sequence model is a machine learning model that captures patterns in sequential or chronological data and uses them to make predictions.
