An Overview of the Data Sampling Technique Markov Chain Monte Carlo (MCMC)

17 min read · Nov 14, 2023

Markov Chain Monte Carlo (MCMC) is a specialized technique within the broader class of Monte Carlo methods. It is widely used for sampling from complex probability distributions, particularly when direct sampling is difficult or impossible. MCMC methods construct a Markov chain in which each step is drawn from a distribution that depends only on the previous state. The chain iteratively explores the space of possible values and, under suitable conditions, its samples come to follow the target distribution. Here’s a comprehensive overview of Markov Chain Monte Carlo:

1. Basic Principles:

Markov Chain:

  • A sequence of random variables.
  • The probability distribution of each variable depends only on the immediately preceding one, not on the entire history. This is known as the Markov property.
  • Transitions between states in a Markov chain are determined by transition probabilities.
  • Because of this memoryless property, Markov chains are useful for modeling systems whose future state depends only on the present state.
  • Markov chains can be discrete or continuous, in both time and state space, depending on the nature of the underlying process.
  • For example, in a simple weather model, the state of the weather tomorrow depends only on today’s weather, not on past weather conditions.
  • Some Markov chains converge to a stationary distribution, one that is left unchanged by the transition probabilities, so the probabilities of being in each state remain constant over time.
  • Transition probabilities can be represented in a transition matrix, where each entry specifies the probability of transitioning from one state to another.
  • While Markov chains simplify modeling, they may not capture long-term dependencies or external influences.
  • In MCMC, a Markov chain is constructed to generate correlated samples whose distribution converges to a target distribution, an approach often used in Bayesian statistics.
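
The weather example, transition matrix, and stationary distribution described above can be sketched in a few lines of Python. This is a minimal illustration, not part of any MCMC library: the two states, the probabilities in `P`, and the helper names `simulate` and `stationary_distribution` are all assumptions made up for the example.

```python
import random

# Hypothetical two-state weather model; the probabilities are invented for illustration.
STATES = ["sunny", "rainy"]
# P[i][j] = probability of moving from state i today to state j tomorrow.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def simulate(P, start, n_steps, rng):
    """Simulate the chain; each step looks only at the current state."""
    path = [start]
    for _ in range(n_steps):
        current = path[-1]
        nxt = rng.choices(range(len(P)), weights=P[current])[0]
        path.append(nxt)
    return path

def stationary_distribution(P, n_iter=200):
    """Approximate pi satisfying pi = pi P by repeatedly applying P."""
    n = len(P)
    pi = [1.0 / n] * n  # start from a uniform distribution over states
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

rng = random.Random(0)  # fixed seed so the run is repeatable
path = simulate(P, start=0, n_steps=20_000, rng=rng)
# Fraction of simulated days spent in each state.
empirical = [path.count(s) / len(path) for s in range(len(STATES))]
pi = stationary_distribution(P)

for name, freq, prob in zip(STATES, empirical, pi):
    print(f"{name}: simulated frequency {freq:.3f}, stationary probability {prob:.3f}")
```

For a long enough run, the simulated frequencies should land close to the stationary probabilities. That is the same idea MCMC relies on, except that there the chain is designed so that its stationary distribution is the target distribution.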

Monte Carlo:

  • Monte Carlo methods involve the use of random sampling to obtain numerical results…