What is Inference in Bayesian Networks?


Introduction
Bayesian networks are probabilistic models that represent relationships between random variables through a directed acyclic graph (DAG). Each node in the DAG represents a random variable, and each edge represents a direct dependency between two variables. Because Bayesian networks compactly encode probabilistic relationships, they are useful in a wide range of applications such as diagnosis, prediction, and decision-making. In this article, we focus on one of the essential operations on Bayesian networks, inference: computing the probabilities of target variables given observed evidence about others.
Background
Bayes' rule states that the probability of a hypothesis H given evidence E is proportional to the probability of the evidence given the hypothesis times the prior probability of the hypothesis, i.e., P(H|E) ∝ P(E|H)P(H). In Bayesian networks, inference is the process of evaluating the posterior probability distribution of the query variable(s) X given evidence e. The network's joint distribution factorizes along the graph as P(X1, ..., Xn) = Π_i P(Xi|Pa(Xi)), where Pa(Xi) denotes the parents of node Xi. The posterior is then obtained by summing this joint distribution over the hidden (non-query, non-evidence) variables h and normalizing: P(X|e) = α Σ_h P(X, e, h), where α is the normalization constant. However, computing this sum over all possible configurations of the hidden variables is often computationally infeasible, especially for large networks. Therefore, Bayesian networks rely on inference algorithms that leverage the graph structure to perform inference efficiently.
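To make Bayes' rule concrete, here is a minimal Python sketch with invented numbers (the disease/test scenario and all probabilities are illustrative assumptions): the posterior is the normalized product of likelihood and prior.

```python
# A minimal sketch of Bayes' rule with illustrative numbers: a disease D
# with a 1% prior, and a test E with 95% sensitivity and a 5%
# false-positive rate (all values are assumptions for this example).
p_d = 0.01           # P(D): prior probability of the hypothesis
p_e_given_d = 0.95   # P(E|D): likelihood of the evidence if D is true
p_e_given_nd = 0.05  # P(E|~D): false-positive rate

# Unnormalized posteriors: P(D|E) is proportional to P(E|D) P(D)
unnorm_d = p_e_given_d * p_d
unnorm_nd = p_e_given_nd * (1 - p_d)

alpha = 1.0 / (unnorm_d + unnorm_nd)  # the normalization constant
print(alpha * unnorm_d)               # P(D|E) ~= 0.161: still unlikely
```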
Inference Algorithms
Bayesian network inference algorithms can be broadly classified into two categories: exact and approximate.
Exact Inference Algorithms
Exact inference algorithms compute the exact posterior probability distribution over the query variables given evidence. These algorithms guarantee correct probabilities but may take a significant amount of time on larger networks. Three widely used exact inference algorithms are enumeration, variable elimination, and clique tree propagation.
  • Enumeration: Enumeration (also known as brute force) computes the posterior by summing the joint distribution over every configuration of the hidden variables. This approach always produces the correct answer, but its running time grows exponentially with the number of variables in the network (see the first sketch after this list).
  • Variable Elimination: Variable elimination computes the marginal distribution over the query variables by summing out ("eliminating") the hidden variables one at a time, with the evidence variables held fixed. It represents conditional probability distributions as factors (potential functions) and, when eliminating a variable, multiplies only the factors that mention it, which exploits the network structure to avoid redundant computation (see the second sketch after this list).
  • Clique Tree Propagation: Clique tree propagation (also called junction tree or belief propagation) first compiles the network into a clique tree, an undirected tree of variable clusters that respects the factorization of the joint distribution. Marginals are then computed by passing messages between neighboring cliques while keeping the evidence fixed. Because one round of message passing yields the marginals of all variables at once, it is often more efficient than running variable elimination separately for each query.
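As a concrete illustration of enumeration, the sketch below computes P(Rain | GrassWet = true) on a small example network (Rain → Sprinkler, with both feeding GrassWet); the structure and CPT numbers are assumptions chosen for illustration.

```python
# Inference by enumeration on a small illustrative network:
# Rain -> Sprinkler, and (Rain, Sprinkler) -> GrassWet.
# Structure and CPT numbers are assumptions chosen for the example.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {  # P(Sprinkler | Rain)
    True:  {True: 0.01, False: 0.99},
    False: {True: 0.40, False: 0.60},
}
P_WET = {  # P(GrassWet | Rain, Sprinkler)
    (True,  True):  {True: 0.99, False: 0.01},
    (True,  False): {True: 0.80, False: 0.20},
    (False, True):  {True: 0.90, False: 0.10},
    (False, False): {True: 0.00, False: 1.00},
}

def joint(rain, sprinkler, wet):
    """Joint probability as the product of the CPT entries."""
    return P_RAIN[rain] * P_SPRINKLER[rain][sprinkler] * P_WET[(rain, sprinkler)][wet]

def query_rain_given_wet():
    """P(Rain | GrassWet=True): sum the joint over the hidden Sprinkler."""
    unnorm = {
        rain: sum(joint(rain, s, True) for s in (True, False))
        for rain in (True, False)
    }
    alpha = 1.0 / sum(unnorm.values())
    return {rain: alpha * p for rain, p in unnorm.items()}

print(query_rain_given_wet())  # {True: ~0.36, False: ~0.64}
```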
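The next sketch performs the same query with variable elimination, representing CPTs as factors and summing out Sprinkler. The factor encoding (a tuple of variable names plus a dict table) is an ad hoc choice for this example, not a standard library interface.

```python
# Minimal variable elimination over the same illustrative network.
# A factor is (variables, table); table maps a tuple of values,
# aligned with `variables`, to a probability.
f_rain = (("Rain",), {(True,): 0.2, (False,): 0.8})
f_sprinkler = (("Rain", "Sprinkler"), {
    (True, True): 0.01, (True, False): 0.99,
    (False, True): 0.40, (False, False): 0.60,
})
# GrassWet is observed True, so only the matching CPT rows are kept,
# leaving a factor over (Rain, Sprinkler) alone.
f_wet = (("Rain", "Sprinkler"), {
    (True, True): 0.99, (True, False): 0.80,
    (False, True): 0.90, (False, False): 0.00,
})

def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    out_tab = {}
    for f_key, f_val in f_tab.items():
        assign = dict(zip(f_vars, f_key))
        for g_key, g_val in g_tab.items():
            # Only combine entries that agree on the shared variables.
            if all(assign.get(v, val) == val for v, val in zip(g_vars, g_key)):
                merged = {**assign, **dict(zip(g_vars, g_key))}
                out_tab[tuple(merged[v] for v in out_vars)] = f_val * g_val
    return out_vars, out_tab

def sum_out(f, var):
    """Eliminate `var` by summing the factor's table over its values."""
    f_vars, f_tab = f
    idx = f_vars.index(var)
    out_vars = f_vars[:idx] + f_vars[idx + 1:]
    out_tab = {}
    for key, val in f_tab.items():
        out_key = key[:idx] + key[idx + 1:]
        out_tab[out_key] = out_tab.get(out_key, 0.0) + val
    return out_vars, out_tab

# Multiply all factors mentioning Sprinkler, eliminate it, normalize.
f = sum_out(multiply(multiply(f_rain, f_sprinkler), f_wet), "Sprinkler")
total = sum(f[1].values())
print({k[0]: v / total for k, v in f[1].items()})  # {True: ~0.36, False: ~0.64}
```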
Approximate Inference Algorithms
Approximate inference algorithms trade accuracy for speed: rather than computing the posterior exactly, they estimate it, which makes them practical for networks where exact inference is intractable. The most common approximate methods include Monte Carlo methods, Markov chain Monte Carlo (MCMC), and variational methods.
  • Monte Carlo Methods: Monte Carlo algorithms approximate posterior probabilities by drawing samples from the Bayesian network. Two common subclasses are rejection sampling and importance sampling. Rejection sampling generates samples from the prior distribution and discards those that do not agree with the evidence (see the first sketch after this list). Importance sampling generates samples from a proposal distribution chosen to resemble the posterior and assigns each sample an importance weight to correct for the sampling bias.
  • Markov Chain Monte Carlo (MCMC): MCMC is a popular family of approximation algorithms, especially where exact methods fail due to the size and connectivity of the network. MCMC algorithms construct a Markov chain whose stationary distribution is the posterior; after the chain has converged, the collected samples are used to estimate posterior probabilities. The most common MCMC algorithms are Gibbs sampling and the Metropolis-Hastings (MH) algorithm (see the second sketch after this list).
  • Variational Methods: Variational methods recast inference as optimization: they search a family of simpler distributions for the member closest to the true posterior, typically by minimizing the Kullback-Leibler divergence between the approximating distribution and the target. The mean-field approximation is the classic example, and variational approximations also underpin algorithms such as variational expectation-maximization (EM).
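As an illustration of the simplest Monte Carlo method, the sketch below estimates P(Rain | GrassWet = true) on the same assumed network by rejection sampling: sample from the prior, and keep only samples consistent with the evidence.

```python
import random

# Rejection sampling for the same illustrative Rain/Sprinkler/GrassWet
# network used above (structure and numbers are assumptions).

def sample_network(rng):
    """Draw one full sample by sampling each variable given its parents."""
    rain = rng.random() < 0.2
    sprinkler = rng.random() < (0.01 if rain else 0.40)
    p_wet = {(True, True): 0.99, (True, False): 0.80,
             (False, True): 0.90, (False, False): 0.00}[(rain, sprinkler)]
    wet = rng.random() < p_wet
    return rain, sprinkler, wet

def rejection_sample(n, seed=0):
    rng = random.Random(seed)
    kept = rain_true = 0
    for _ in range(n):
        rain, _, wet = sample_network(rng)
        if not wet:            # sample disagrees with the evidence: reject
            continue
        kept += 1
        rain_true += rain
    return rain_true / kept    # fraction of accepted samples with Rain=True

print(rejection_sample(100_000))  # ~0.36, matching exact inference
```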
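And here is a Gibbs sampling sketch for the same query. Each step resamples one hidden variable from its conditional distribution given the current values of all the others, computed here by two evaluations of the joint; again, the network and its numbers are illustrative assumptions.

```python
import random

# Gibbs sampling (an MCMC method) estimating P(Rain=True | GrassWet=True)
# on the same illustrative network.

def joint(rain, sprinkler, wet=True):
    """Joint probability of a full assignment (GrassWet observed True)."""
    p_rain = 0.2 if rain else 0.8
    p_spr = ((0.01 if sprinkler else 0.99) if rain
             else (0.40 if sprinkler else 0.60))
    p_wet = {(True, True): 0.99, (True, False): 0.80,
             (False, True): 0.90, (False, False): 0.00}[(rain, sprinkler)]
    return p_rain * p_spr * (p_wet if wet else 1.0 - p_wet)

def gibbs(n_steps, burn_in=1_000, seed=0):
    rng = random.Random(seed)
    state = {"rain": False, "sprinkler": True}  # arbitrary starting point
    rain_true = 0
    for step in range(n_steps):
        for var in ("rain", "sprinkler"):
            # Conditional for `var` via two evaluations of the joint.
            p_t = joint(**{**state, var: True})
            p_f = joint(**{**state, var: False})
            state[var] = rng.random() < p_t / (p_t + p_f)
        if step >= burn_in:    # discard samples taken before convergence
            rain_true += state["rain"]
    return rain_true / (n_steps - burn_in)

print(gibbs(50_000))  # ~0.36 after the chain converges
```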
Conclusion
Bayesian networks are powerful probabilistic models that can represent and analyze complex systems that would be difficult to reason about directly. They expose the dependency structure among random variables and support computing posterior distributions over variables of interest given evidence. Exact inference algorithms such as enumeration, variable elimination, and clique tree propagation are practical for smaller networks, while for larger networks approximate algorithms such as Monte Carlo methods, MCMC, and variational methods are usually the only feasible option. The choice of inference algorithm depends on the application, the size and structure of the network, and the acceptable trade-off between time and accuracy.