
Markov Decision Processes (MDP) in Machine Learning

Table of Contents

  • Introduction
  • What is Markov Decision Process (MDP)?
  • Key Terms Related to Markov Decision Process in Machine Learning
  • Partially Observable Markov Decision Process
  • Markov Decision Process Formulation
  • Markov Decision Process Example
  • Applications of Markov Decision Process
  • Markov Chain vs Markov Process

Introduction

A Markov Decision Process (MDP) is a mathematical framework for modeling decision-making in situations where outcomes are partly under the decision-maker's control and partly random. This framework underlies most reinforcement learning (RL) problems.

What is Markov Decision Process (MDP)?

The Markov decision process in artificial intelligence is a stochastic framework used to model sequential decision-making in a dynamic system. It applies where outcomes are partly random and partly under the decision-maker's control. An MDP evaluates which actions the decision-maker should take given the current state of the system and its environment.

An MDP in machine learning relies on several variables, including the agent's actions, the environment, and rewards, to decide the system's next optimal action. Based on factors such as the set of states, the set of actions, and the decision-making frequency, MDPs are commonly divided into four types: finite, infinite, discrete, and continuous.

Markov decision processes have been studied since the early 1950s. They are named after the Russian mathematician Andrey Markov, who played a crucial part in shaping the theory of stochastic processes.

Initially, Markov decision processes were used to solve problems in inventory management and control, routing, and queuing optimization. Today, MDPs are also applied in robotics, optimization via dynamic programming, economics, automatic control, manufacturing, and more.

Markov decision processes are also used to design intelligent machines that must operate for long periods in environments where actions can produce uncertain outcomes. They are especially popular in two subareas of artificial intelligence: probabilistic planning and reinforcement learning (RL).

Probabilistic planning uses a known model of the environment to achieve an agent's goals, guiding the machine's decisions toward its objectives. Reinforcement learning, on the other hand, lets the agent learn how to behave from the feedback it receives from the environment.

A Markov Decision Process (MDP) model includes:

  • A set of possible world states.

  • A real-valued reward function.

  • A set of models (transition models).

  • A set of possible actions.

  • A policy.

Key Terms Related to Markov Decision Process in Machine Learning

Here are a few terms used throughout this article that you should understand; a small code sketch tying them together follows the list.

  • State

A state in a Markov decision process is a set of tokens representing the current situation of the agent. It could be the exact position of a robot in a house, its current posture, or the alignment of its legs; what counts as a state depends on how you frame the problem.

  • Model

A model, or transition model, gives the effect of an action in a state. Specifically, T(S, a, S’) is a transition: being in state S and taking action a takes us to state S’ (S and S’ can be the same). For stochastic actions, we define a probability P(S’ | S, a), the probability of reaching state S’ when action a is taken in state S. According to the Markov property, the effect of an action taken in a state depends only on that state and not on the prior history.

  • Actions

Actions are the choices available to the agent at the current time step; together they form the set of all possible actions. For example, a robot can move its right or left leg, lift an object, raise an arm, or turn left or right. The set of actions the agent can take is known in advance.

  • Reward

A reward in the Markov model is given by a real-valued reward function. R(S) is the reward for being in state S, R(S, a) is the reward for being in state S and taking action a, and R(S, a, S’) is the reward for being in state S, taking action a, and reaching state S’.

  • Policy

A policy captures how the agent chooses an action. It is a solution to the Markov decision process: a mapping from states S to actions a, indicating which action to take in state S. Under a stochastic policy, high-reward actions receive a high probability and vice versa; a low-probability action is not excluded entirely, it is just less likely to be picked.

  • Agent

A reinforcement learning agent is the entity we train to make good decisions, for example, a robot being trained to move around a house without scratching anything.

  • Environment

The environment in an MDP is the surroundings the agent interacts with, for example, the room or premises where the robot moves. The agent cannot arbitrarily change the environment but can control its own actions: the robot does not decide where a chair is placed, but it can move around it.
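
To make these terms concrete, here is a minimal Python sketch of how the pieces of a tiny MDP might be represented. The two states, two actions, transition probabilities, and rewards below are made up purely for illustration.

# A tiny, illustrative MDP for a robot that can be "idle" or "moving".
# All states, actions, probabilities, and rewards are invented for this sketch.

states = ["idle", "moving"]
actions = ["wait", "go"]

# Transition model T[state][action] -> {next_state: probability}
T = {
    "idle":   {"wait": {"idle": 1.0},
               "go":   {"moving": 0.8, "idle": 0.2}},
    "moving": {"wait": {"idle": 0.9, "moving": 0.1},
               "go":   {"moving": 1.0}},
}

# Reward model R[state][action] -> immediate reward
R = {
    "idle":   {"wait": 0.0, "go": -0.1},
    "moving": {"wait": 0.0, "go": 1.0},
}

# A deterministic policy: one action per state
policy = {"idle": "go", "moving": "go"}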

Partially Observable Markov Decision Process

A Partially Observable Markov Decision Process (POMDP) is an extension of the standard Markov Decision Process (MDP) that accounts for situations where the agent does not have complete information about the state of the environment. 

In a POMDP, the agent faces uncertainty not only in the environment's dynamics but also in its observations. 
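
Because the state is not directly observable, a POMDP agent typically maintains a belief: a probability distribution over states that it updates after every action and observation. The sketch below shows one possible Bayes-style update in Python; the transition model T and observation model O follow the same dictionary convention as the earlier sketch and are assumptions for illustration.

def update_belief(belief, action, observation, T, O):
    """Bayes update of a belief over states after taking `action`
    and receiving `observation`.

    belief: {state: probability}
    T: T[s][a] -> {next_state: probability}        (transition model)
    O: O[next_state][a] -> {observation: probability}  (observation model)
    """
    new_belief = {}
    for s_next in belief:
        # Predict: probability of landing in s_next under the chosen action
        predicted = sum(belief[s] * T[s][action].get(s_next, 0.0) for s in belief)
        # Correct: weight by how likely the observation is from s_next
        new_belief[s_next] = O[s_next][action].get(observation, 0.0) * predicted
    # Normalize so the belief sums to 1
    total = sum(new_belief.values())
    return {s: p / total for s, p in new_belief.items()} if total > 0 else belief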

POMDPs find applications in various domains where decision-making must account for incomplete or noisy information, such as robotics, autonomous systems, natural language processing, and healthcare, among others. They provide a powerful framework for modeling and solving problems in uncertain environments.

Markov Decision Process Formulation

Formulating a Markov Decision Process (MDP) involves defining the key components and characteristics of the decision-making problem. Here's a step-by-step guide on how to formulate an MDP:

Step 1: Define the Components

Identify the core components of the problem:

  • States (S): Determine the set of possible states that represent different situations or configurations of the environment. States should encapsulate all relevant information about the system.

  • Actions (A): Specify the actions or decisions that the agent can take. Actions influence the state transitions and impact the system's behavior.

  • Rewards (R): Define the rewards associated with state-action pairs. Rewards represent the immediate desirability or cost of taking a specific action in a particular state.

  • Transitions (T): Describe the dynamics of the environment, specifying the probabilities of transitioning from one state to another based on the agent's actions.

Step 2: Define the Objective

Clearly articulate the objective or goal of the decision-making problem. Decide what the agent aims to achieve, whether it's maximizing cumulative rewards, minimizing costs, reaching a specific state, or another desired outcome.

Step 3: Ensure the Markov Property

Ensure that the problem satisfies the Markov property, which states that the next state (and reward) depends only on the current state and action, not on the entire history of states and actions. This memoryless property simplifies modeling and computation.
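
Written as a formula, the Markov property says that the next-state distribution conditioned on the full history equals the distribution conditioned on the current state and action alone:

P(S_{t+1} | S_t, A_t, S_{t-1}, ..., S_0) = P(S_{t+1} | S_t, A_t)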

Step 4: Formulate Policies

Introduce the concept of policies (denoted as π). Policies represent strategies or decision rules for the agent. A policy defines which action to take in each state. Policies can be deterministic (one action per state) or stochastic (a probability distribution over actions).
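
For example, on the toy two-state MDP sketched in the key-terms section, the two kinds of policy could look like this (state and action names are purely illustrative):

# Deterministic policy: exactly one action per state
deterministic_policy = {"idle": "go", "moving": "wait"}

# Stochastic policy: a probability distribution over actions in each state
stochastic_policy = {
    "idle":   {"wait": 0.2, "go": 0.8},
    "moving": {"wait": 0.6, "go": 0.4},
}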

Step 5: Define the Objective Function

Define an objective function or criterion that quantifies the agent's goal. This could be the expected cumulative reward, which the agent aims to maximize over time.
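
A common choice of objective is the expected discounted cumulative reward, where a discount factor gamma between 0 and 1 weighs immediate rewards more heavily than distant ones. A small illustration in Python:

def discounted_return(rewards, gamma=0.9):
    """Sum of gamma**t * r_t over a sequence of rewards."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))

print(discounted_return([1, 0, 0]))   # 1.0: reward received immediately
print(discounted_return([0, 0, 1]))   # roughly 0.81: the same reward, two steps later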

Step 6: Solving the MDP

Depending on the complexity of the problem, select appropriate methods and algorithms to find an optimal policy (π*) that maximizes (or minimizes) the objective function. Common approaches include:

  • Dynamic Programming: Value iteration or policy iteration (see the value-iteration sketch after this list).

  • Monte Carlo methods: Estimate value functions through sampling.

  • Temporal Difference learning: Update value functions based on temporal differences.

  • Reinforcement Learning: Use deep reinforcement learning techniques for complex problems.
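
As a concrete instance of the dynamic-programming approach mentioned above, here is a minimal value-iteration sketch in Python for an MDP stored in the dictionary format used earlier in this article. The discount factor and convergence threshold are arbitrary choices.

def value_iteration(states, actions, T, R, gamma=0.9, theta=1e-6):
    """Minimal value iteration.

    T[s][a] maps to {next_state: probability}; R[s][a] is the immediate reward.
    Returns the converged state values and a greedy policy.
    """
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Q-value of each action: immediate reward plus discounted expected value
            q = [R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a].items())
                 for a in actions]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            break
    # Extract a deterministic policy that is greedy with respect to V
    policy = {}
    for s in states:
        policy[s] = max(actions,
                        key=lambda a: R[s][a] + gamma * sum(p * V[s2]
                                                            for s2, p in T[s][a].items()))
    return V, policy

Called with the states, actions, T, and R dictionaries from the key-terms sketch, this returns both the converged state values and a greedy policy.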

Step 7: Policy Execution

Once an optimal policy (or a good policy approximation) is found, it can be executed in the real system or environment, guiding the agent to make decisions that maximize its expected long-term rewards.

Step 8: Continuous Improvement

Implement a feedback loop to continuously update the policy as the agent interacts with the environment. This allows the agent to adapt to changing conditions and improve its decision-making over time.

Markov Decision Process Example

Let's consider a simple example of a Markov Decision Process (MDP) known as the "Frozen Lake" problem. This problem is often used to illustrate the basic concepts of MDPs and reinforcement learning.

Problem Description:

Imagine a frozen lake represented as a grid. The agent starts at the top-left corner and needs to reach the bottom-right corner while avoiding holes in the ice. The agent can take four possible actions at each grid cell: move up, move down, move left, or move right. The ice is slippery, so the agent may not always move in the intended direction.

Here are the key components of this MDP:

  • States (S): Each grid cell in the frozen lake represents a state; the set of states is simply the set of cells in the grid.

  • Actions (A): The agent can take four actions: "Up," "Down," "Left," and "Right."

  • Rewards (R): The agent receives a reward of +1 for reaching the goal (bottom-right corner) and a reward of -1 for falling into a hole. All other transitions have a reward of 0.

  • Transitions (T): Due to the slippery ice, transitions are probabilistic. If the agent chooses to move in a certain direction, there's a 0.7 probability that it will move in the intended direction and a 0.3 probability that it will move in a random direction.

Objective:

The objective of the agent is to find an optimal policy (a strategy) that maximizes the expected cumulative reward while navigating from the start to the goal.

Example:

Let's look at a simplified 4x4 grid representing a portion of the frozen lake. In this grid, "S" represents the start, "G" represents the goal, "H" represents a hole, and "F" represents a safe frozen cell.

S  F  F  F
F  H  F  H
F  F  F  H
H  F  F  G

In this grid, the agent starts at "S" and needs to find a path to "G" while avoiding holes "H." The agent's actions are uncertain due to the slippery ice.

Solving the MDP:

To solve this MDP and find the optimal policy, various reinforcement learning algorithms can be applied, such as Q-learning or policy iteration. The agent learns to take actions that maximize the expected cumulative reward over time and, over many iterations, discovers the optimal strategy for reaching the goal while avoiding holes.
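
As an illustration, below is one possible tabular Q-learning sketch for the 4x4 grid above, written from scratch in Python rather than with any particular RL library. The slip model follows the 0.7/0.3 rule described earlier, while the learning rate, discount factor, exploration rate, episode cap, and episode count are arbitrary choices.

import random

# 4x4 grid from the example: S = start, F = frozen, H = hole, G = goal
GRID = ["SFFF",
        "FHFH",
        "FFFH",
        "HFFG"]
N = 4
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action with slip: 0.7 intended direction, 0.3 a random direction."""
    move = MOVES[action] if random.random() < 0.7 else random.choice(MOVES)
    row = min(max(state[0] + move[0], 0), N - 1)
    col = min(max(state[1] + move[1], 0), N - 1)
    cell = GRID[row][col]
    if cell == "G":
        return (row, col), 1.0, True    # reached the goal
    if cell == "H":
        return (row, col), -1.0, True   # fell into a hole
    return (row, col), 0.0, False

# Tabular Q-learning: one Q-value per (cell, action) pair
Q = {(r, c): [0.0] * 4 for r in range(N) for c in range(N)}
alpha, gamma, epsilon = 0.1, 0.95, 0.1

for episode in range(5000):
    state, done = (0, 0), False
    for _ in range(100):                      # cap episode length
        if random.random() < epsilon:         # epsilon-greedy exploration
            action = random.randrange(4)
        else:
            action = Q[state].index(max(Q[state]))
        next_state, reward, done = step(state, action)
        target = reward + (0.0 if done else gamma * max(Q[next_state]))
        Q[state][action] += alpha * (target - Q[state][action])
        state = next_state
        if done:
            break

# Greedy policy: index of the best action for each cell
policy = {s: Q[s].index(max(Q[s])) for s in Q}

Run for enough episodes, this typically produces a policy that steers the agent away from the holes and toward the goal, even though individual moves remain uncertain.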

The optimal policy will guide the agent to take actions that increase the chances of reaching the goal and receiving positive rewards.

Applications of Markov Decision Process

Markov Decision Processes (MDPs) find applications in a wide range of fields where decision-making under uncertainty is crucial. Here are some notable applications of MDPs:

  • Reinforcement Learning and Robotics:

MDPs are at the core of reinforcement learning, where agents learn to make decisions by interacting with environments. Robots use MDPs to plan and execute actions, enabling them to navigate, manipulate objects, and perform tasks.

  • Game Playing:

MDPs are used in artificial intelligence for game playing. Game agents, such as chess or Go-playing programs, employ MDPs to make optimal moves and decisions to win games.

  • Finance and Portfolio Management:

In finance, MDPs help optimize portfolio allocation and trading strategies. Traders use MDPs to make decisions on buying or selling financial assets to maximize returns while considering risks.

  • Healthcare and Treatment Planning:

MDPs are applied in healthcare for treatment planning and personalized medicine. They assist in determining optimal treatment paths for patients with chronic diseases, considering various factors like patient history and drug interactions.

  • Energy Management:

MDPs play a role in energy management systems. They help control the operation of smart grids, optimizing energy distribution and consumption while minimizing costs and environmental impact.

  • Autonomous Vehicles:

Self-driving cars and drones use MDPs to make real-time decisions on navigation, obstacle avoidance, and route planning while considering traffic, weather, and safety.

  • Natural Language Processing (NLP):

In NLP, MDPs can be used for dialogue management and chatbot interactions. They help chatbots make decisions about what responses to generate based on user input and conversation history.

  • Supply Chain Management:

MDPs are employed in supply chain optimization. They assist in making decisions about inventory management, demand forecasting, and logistics to minimize costs and improve efficiency.

  • Environmental Management:

Conservationists use MDPs to manage natural resources and wildlife. These models aid in making decisions about habitat preservation, species conservation, and ecosystem management.

  • Video Game Development:

Video game developers use MDPs to create intelligent non-player characters (NPCs) that exhibit complex behaviors and adapt to player actions.

  • Recommendation Systems:

MDPs are used in recommendation systems to decide what products, movies, or content to recommend to users based on their preferences and behaviors.

  • Agriculture and Precision Farming:

MDPs help optimize crop management and irrigation systems in agriculture. They make decisions about when and how much water, fertilizer, or pesticides to apply to maximize yields.

  • Marketing and Advertising:

Marketers use MDPs for optimizing ad campaigns. These models decide which ads to display to users, considering factors like user demographics and ad effectiveness.

  • Pharmaceutical Drug Discovery:

In drug discovery, MDPs are applied to identify potential drug candidates and optimize drug development processes.

  • Security and Anomaly Detection:

MDPs are used in security applications to detect anomalies and make decisions about security protocols and threat responses.

Markov Chain vs Markov Process

  • Definition: A Markov Chain is a mathematical model that describes transitions between a finite set of states over discrete time steps. A Markov Process is a more general term that encompasses Markov Chains and extends to continuous-time processes as well.

  • Time Representation: A Markov Chain is typically discrete-time, with transitions occurring at fixed time intervals. A Markov Process can be discrete-time (like a Markov Chain) or continuous-time, allowing transitions at any point in time.

  • State Space: A Markov Chain has a finite or countable set of states. A Markov Process can have a finite, countable, or continuous state space.

  • Transition Probabilities: A Markov Chain describes the probabilities of moving from one state to another in the next time step. A continuous-time Markov Process uses transition rates to describe how the system moves between states.

  • Memorylessness: Markov Chains are memoryless; future transitions depend only on the current state and not on the past history. Markov Processes can be memoryless (like Markov Chains) or have memory, depending on the specific process.

  • Discrete vs. Continuous Variables: A Markov Chain typically involves discrete variables (states). A Markov Process can involve both discrete and continuous variables, depending on the process.

  • Examples: Markov Chains: board games (e.g., Monopoly), random walks, weather patterns, and discrete-event simulations. Markov Processes: queueing systems, continuous-time financial models, Brownian motion, and continuous-state systems.
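
To make the contrast concrete, here is a minimal Python sketch of a discrete-time Markov Chain, using the weather example from the comparison above with made-up transition probabilities. Note that there are no actions or rewards; that is precisely what an MDP adds on top of a Markov Chain.

import random

# Transition probabilities of a two-state weather chain (illustrative values)
P = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def simulate(chain, start, steps):
    """Sample a trajectory of states from the chain."""
    state, path = start, [start]
    for _ in range(steps):
        r, cumulative = random.random(), 0.0
        for next_state, prob in chain[state].items():
            cumulative += prob
            if r < cumulative:
                state = next_state
                break
        path.append(state)
    return path

print(simulate(P, "sunny", 10))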
