Revision as of 18:07, 7 March 2008
Importance sampling is a general technique for estimating the properties of a particular distribution, while only having samples generated from a different distribution than the distribution of interest. Depending on the application, the term may refer to the process of sampling from this alternative distribution, the process of inference, or both.
Basic theory
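More formally, let <math>X</math> be a random variable in <math>S</math>. Let <math>p</math> be a probability measure on <math>S</math>, and <math>f</math> some function on <math>S</math>. Then the expectation of <math>f</math> under <math>p</math> can be written as
<math>E[f \mid p] = \int_S f(x) \, p(x) \, dx.</math>
If we have random samples <math>x_1, \ldots, x_n</math>, generated according to <math>p</math>, then the Monte Carlo empirical estimate of <math>E[f \mid p]</math> is
<math>\hat{E}_n[f \mid p] = \frac{1}{n} \sum_{i=1}^n f(x_i).</math>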
Unfortunately, when the samples are generated from a different distribution than the one that we are interested in, we can no longer use this straightforward estimate. However, we may use the importance sampling technique, which places a different importance on each sample, depending on how likely it was to have been generated by the distribution that we are interested in, <math>p</math>, rather than by the actual sampling distribution, <math>q</math>.
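More formally, consider another probability measure, <math>q</math>, with the same support as <math>p</math>. From the definition of the expectation given above, we have
<math>E[f \mid p] = \frac{\int_S f(x) \frac{p(x)}{q(x)} q(x) \, dx}{\int_S \frac{p(x)}{q(x)} q(x) \, dx} = \frac{E[f w \mid q]}{E[w \mid q]}</math>
where <math>w = p/q</math> is known as the importance weight and the distribution <math>q</math> is frequently referred to as the sampling or proposal distribution. Then, if we have random samples <math>x_1, \ldots, x_n</math>, generated according to <math>q</math>, a Monte Carlo estimate of <math>E[f \mid p]</math> follows from the above equation by viewing the problem as that of estimating the expectations <math>E[f w \mid q]</math> and <math>E[w \mid q]</math>:
<math>\hat{E}_n[f \mid p] = \frac{\sum_{i=1}^n f(x_i) w(x_i)}{\sum_{i=1}^n w(x_i)} = \sum_{i=1}^n f(x_i) v(x_i)</math>
where <math>v(x_i) = w(x_i) / \sum_{j=1}^n w(x_j)</math> are the normalised importance weights.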
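The normalised-weight estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `self_normalised_is`, the Gaussian target (mean 1, unit variance) and the wider Gaussian proposal (mean 0, standard deviation 2) are hypothetical choices, not part of the article. Because the weights are normalised, both densities may be given up to a constant factor.

```python
import math
import random

def self_normalised_is(f, log_p, log_q, sample_q, n, rng):
    """Estimate the expectation of f under p from n draws of q,
    using normalised importance weights."""
    xs = [sample_q(rng) for _ in range(n)]
    # log w(x) = log p(x) - log q(x); additive constants cancel after normalising
    log_w = [log_p(x) - log_q(x) for x in xs]
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]  # shift by the max for numerical stability
    total = sum(w)
    return sum(f(x) * wi for x, wi in zip(xs, w)) / total

# Hypothetical example: target p = N(1, 1), proposal q = N(0, 2^2),
# both written without their normalising constants.
def log_p(x):
    return -0.5 * (x - 1.0) ** 2

def log_q(x):
    return -0.5 * (x / 2.0) ** 2

def sample_q(rng):
    return rng.gauss(0.0, 2.0)

rng = random.Random(0)
estimate = self_normalised_is(lambda x: x, log_p, log_q, sample_q, 20000, rng)
print(estimate)  # the mean of the target, which is 1 in this example
```

Working with log-weights is a design choice made here for numerical stability; it does not change the estimate.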
The technique is completely general and the above analysis can be repeated essentially unchanged for other choices of <math>p</math>, for example when it represents a conditional distribution. Note that when <math>p</math> is the uniform distribution, we are just estimating the (scaled) integral of <math>f</math> over <math>S</math>, so the method can also be used for estimating simple integrals.
There are two main applications of importance sampling methods, which, naturally, are interrelated. While the aim of both applications is to estimate statistics of random variables, the field of probabilistic inference focuses more on the estimation of <math>p</math> or related statistics, while the field of simulation focuses more on the choice of the distribution <math>q</math>. Nevertheless, the basic theory and tools are identical.
Application to probabilistic inference
Such methods are frequently used to estimate posterior densities or expectations in state and/or parameter estimation problems in probabilistic models that are too hard to treat analytically.
Application to simulation
Importance sampling (IS) is a variance reduction technique that can be used in the Monte Carlo method. The idea behind IS is that certain values of the input random variables in a simulation have more impact on the parameter being estimated than others. If these "important" values are emphasized by sampling more frequently, then the estimator variance can be reduced. Hence, the basic methodology in IS is to choose a distribution which "encourages" the important values. This use of "biased" distributions will result in a biased estimator if it is applied directly in the simulation. However, the simulation outputs are weighted to correct for the use of the biased distribution, and this ensures that the new IS estimator is unbiased. The weight is given by the likelihood ratio, that is, the Radon-Nikodym derivative of the true underlying distribution with respect to the biased simulation distribution.
The fundamental issue in implementing IS simulation is the choice of the biased distribution which encourages the important regions of the input variables. Choosing or designing a good biased distribution is the "art" of IS. The rewards for a good distribution can be huge run-time savings; the penalty for a bad distribution can be longer run times than for a general Monte Carlo simulation without importance sampling.
Mathematical Approach
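Consider estimating by simulation the probability <math>p_t</math> of an event <math>X \ge t</math>, where <math>X</math> is a random variable with distribution <math>F</math> and probability density function <math>f(x) = F'(x)</math>, where prime denotes derivative. A <math>K</math>-length independent and identically distributed (i.i.d.) sequence <math>X_i</math> is generated from the distribution <math>F</math>, and the number <math>k_t</math> of random variables that lie above the threshold <math>t</math> is counted. The random variable <math>k_t</math> is characterized by the binomial distribution
<math>P(k_t = k) = \binom{K}{k} p_t^k (1 - p_t)^{K-k}, \qquad k = 0, 1, \ldots, K.</math>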
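The counting procedure above is the conventional Monte Carlo estimate, and it can be sketched as follows. The names `mc_tail_probability` and `mc_variance`, the standard normal input and the threshold of 1 are assumptions made for illustration only. Since the count of exceedances is binomial, the variance of the estimate is the tail probability times its complement, divided by the sequence length.

```python
import random

def mc_tail_probability(sample, t, K, rng):
    """Conventional Monte Carlo: fraction of K i.i.d. draws that are >= t."""
    k_t = sum(1 for _ in range(K) if sample(rng) >= t)  # binomially distributed count
    return k_t / K

def mc_variance(p_t, K):
    """Variance of the Monte Carlo estimate, from the binomial law of the count."""
    return p_t * (1.0 - p_t) / K

# Hypothetical example: standard normal input, threshold t = 1.
rng = random.Random(1)
p_hat = mc_tail_probability(lambda r: r.gauss(0.0, 1.0), 1.0, 50000, rng)
print(p_hat, mc_variance(p_hat, 50000))
```

For a rare event the relative error of this estimate grows quickly, which is what motivates the biasing densities discussed next.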
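Importance sampling is concerned with the determination and use of an alternate density function <math>f_*</math> (for <math>X</math>), usually referred to as a biasing density, for the simulation experiment. This density allows the event <math>X \ge t</math> to occur more frequently, so the sequence length <math>K</math> gets smaller for a given estimator variance. Alternatively, for a given <math>K</math>, use of the biasing density results in a variance smaller than that of the conventional Monte Carlo estimate. From the definition of <math>p_t</math>, we can introduce <math>f_*</math> as below.
<math>p_t = E[1(X \ge t)] = \int 1(x \ge t) \frac{f(x)}{f_*(x)} f_*(x) \, dx = E_*[1(X \ge t) W(X)]</math>
where
<math>W(\cdot) \equiv \frac{f(\cdot)}{f_*(\cdot)}</math>
is a likelihood ratio and is referred to as the weighting function. The last equality in the above equation motivates the estimator
<math>\hat{p}_t = \frac{1}{K} \sum_{i=1}^K 1(X_i \ge t) W(X_i), \qquad X_i \sim f_*.</math>
This is the IS estimator of <math>p_t</math> and is unbiased. That is, the estimation procedure is to generate i.i.d. samples from <math>f_*</math> and, for each sample which exceeds <math>t</math>, the estimate is incremented by the weight <math>W</math> evaluated at the sample value. The results are averaged over <math>K</math> trials. The variance of the IS estimator is easily shown to be
<math>\operatorname{var}_* \hat{p}_t = \frac{1}{K} \operatorname{var}_*[1(X \ge t) W(X)] = \frac{1}{K} \left\{ E_*[1(X \ge t) W^2(X)] - p_t^2 \right\} = \frac{1}{K} \left\{ E[1(X \ge t) W(X)] - p_t^2 \right\}.</math>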
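The estimation procedure just described (draw from the biasing density, add the weight for each sample that exceeds the threshold, average over the trials) might look like this in Python. The function name `is_tail_probability` and the example are hypothetical: the original density is taken to be exponential with rate 1, and the biasing density exponential with an assumed smaller rate `lam`, so that the tail probability is known in closed form for comparison.

```python
import math
import random

def is_tail_probability(sample_biased, weight, t, K, rng):
    """Average of 1(x >= t) * W(x) over K draws from the biasing density."""
    total = 0.0
    for _ in range(K):
        x = sample_biased(rng)
        if x >= t:
            total += weight(x)  # likelihood ratio: original density / biasing density
    return total / K

# Hypothetical example: original density exp(-x), biasing density lam * exp(-lam * x).
lam = 0.1
t = 10.0

def weight(x):
    return math.exp(-(1.0 - lam) * x) / lam

rng = random.Random(2)
p_hat = is_tail_probability(lambda r: r.expovariate(lam), weight, t, 20000, rng)
print(p_hat, math.exp(-t))  # IS estimate next to the exact tail probability exp(-t)
```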
The IS problem then focuses on finding a biasing density <math>f_*</math> such that the variance of the IS estimator is less than the variance of the general Monte Carlo estimate. A biasing density that minimizes the variance, and under certain conditions reduces it to zero, is called an optimal biasing density function.
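As a worked check of the zero-variance remark, the following short derivation uses notation that is assumed here rather than taken from the text: the tail probability is written p_t, the threshold t, the original density f, the biasing density f_*, and the weighting function W = f / f_*. Choosing the biasing density as the original density restricted to the event region and renormalised makes every term of the estimator equal to p_t.

```latex
\begin{align*}
f_*^{\mathrm{opt}}(x) &= \frac{1(x \ge t)\, f(x)}{p_t}
  && \text{(original density conditioned on the event)} \\
W(x) &= \frac{f(x)}{f_*^{\mathrm{opt}}(x)} = p_t
  && \text{for } x \ge t \\
1(X_i \ge t)\, W(X_i) &= p_t
  && \text{for every sample } X_i \sim f_*^{\mathrm{opt}} \\
\operatorname{var}_* \hat{p}_t &= 0
\end{align*}
```

This choice is not usable directly, since it requires the very probability being estimated, but it suggests that a good biasing density should concentrate its mass where the event occurs.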
Conventional biasing methods
Although there are many kinds of biasing methods, the following two methods are most widely used in the applications of IS.
Scaling
Shifting probability mass into the event region by positive scaling of the random variable with a number greater than unity has the effect of increasing the variance (mean also) of the density function. This results in a heavier tail of the density, leading to an increase in the event probability. Scaling is probably one of the earliest biasing methods known and has been extensively used in practice. It is simple to implement and usually provides conservative simulation gains as compared to other methods.
In IS by scaling, the simulation density is chosen as the density function of the scaled random variable <math>aX</math>, where usually <math>a > 1</math> for tail probability estimation. By transformation,
<math>f_*(x) = \frac{1}{a} f\left(\frac{x}{a}\right)</math>
and the weighting function is
<math>W(x) = a \frac{f(x)}{f(x/a)}.</math>
While scaling shifts probability mass into the desired event region, it also pushes mass into the complementary region, which is undesirable. If <math>X</math> is a sum of <math>n</math> random variables, the spreading of mass takes place in an <math>n</math>-dimensional space. The consequence of this is a decreasing IS gain for increasing <math>n</math>, and is called the dimensionality effect.
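A minimal sketch of biasing by scaling, assuming a standard normal input and writing the scale factor as `a` and the threshold as `t` (both names, and the function name `scaling_estimate`, are illustrative assumptions). Samples are drawn from the density of the scaled variable, and each sample in the event region contributes the weight a·f(x)/f(x/a).

```python
import math
import random

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def scaling_estimate(t, a, K, rng):
    """IS by scaling for a standard normal input: sample a*X, weight by a*f(x)/f(x/a)."""
    total = 0.0
    for _ in range(K):
        x = a * rng.gauss(0.0, 1.0)  # draw from the density of the scaled variable
        if x >= t:
            total += a * normal_pdf(x) / normal_pdf(x / a)
    return total / K

# Hypothetical example: tail probability beyond t = 4 with scale factor a = 4.
rng = random.Random(3)
p_hat = scaling_estimate(t=4.0, a=4.0, K=100000, rng=rng)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # exact Gaussian tail for comparison
print(p_hat, exact)
```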
Translation
Another simple and effective biasing technique employs translation of the density function (and hence random variable) to place much of its probability mass in the rare event region. Translation does not suffer from a dimensionality effect and has been successfully used in several applications relating to simulation of digital communication systems. It often provides better simulation gains than scaling. In biasing by translation, the simulation density is given by
<math>f_*(x) = f(x - c), \qquad c > 0</math>
where <math>c</math> is the amount of shift and is to be chosen to minimize the variance of the IS estimator.
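Biasing by translation can be sketched in the same way. The example assumes a standard normal input, writes the shift as `c` and the threshold as `t`, and simply sets the shift equal to the threshold; these choices and the function name `translation_estimate` are illustrative assumptions, not a recommendation from the source. For the standard normal density the weight f(x)/f(x - c) simplifies to exp(-c·x + c²/2).

```python
import math
import random

def translation_estimate(t, c, K, rng):
    """IS by translation for a standard normal input: sample X + c,
    weight by f(x) / f(x - c) = exp(-c*x + c*c/2)."""
    total = 0.0
    for _ in range(K):
        x = rng.gauss(0.0, 1.0) + c  # draw from the shifted density f(x - c)
        if x >= t:
            total += math.exp(-c * x + 0.5 * c * c)
    return total / K

# Hypothetical example: tail probability beyond t = 4, shift c = 4.
rng = random.Random(4)
p_hat = translation_estimate(t=4.0, c=4.0, K=20000, rng=rng)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # exact Gaussian tail for comparison
print(p_hat, exact)
```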
Effects of System Complexity
The fundamental problem with IS is that designing good biased distributions becomes more complicated as the system complexity increases. Complex systems here means systems with long memory, since complex processing of a few inputs is much easier to handle. This dimensionality or memory can cause problems in three ways:
- long memory (severe intersymbol interference (ISI))
- unknown memory (Viterbi decoders)
- possibly infinite memory (adaptive equalizers)
In principle, the IS ideas remain the same in these situations, but the design becomes much harder. A successful approach to this problem is to break a simulation down into several smaller, more sharply defined subproblems, and then to use IS strategies to target each of the simpler subproblems. Examples of techniques for breaking the simulation down are conditioning, error-event simulation (EES), and regenerative simulation.
Evaluation of IS
In order to identify successful IS techniques, it is useful to be able to quantify the run-time savings due to the use of the IS approach. The performance measure commonly used is <math>\sigma^2_{MC} / \sigma^2_{IS}</math>, and this can be interpreted as the speed-up factor by which the IS estimator achieves the same precision as the MC estimator. This has to be computed empirically, since the estimator variances are unlikely to be analytically tractable when their mean is intractable. Other useful concepts in quantifying an IS estimator are the variance bounds and the notion of asymptotic efficiency.
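One way to compute this speed-up factor empirically is to collect the per-sample terms of both estimators and compare their sample variances. The sketch below does this for an assumed standard normal input with biasing by translation; the function name `speedup_factor`, the threshold `t` and the shift `c` are hypothetical, and the result is itself a noisy estimate.

```python
import math
import random
import statistics

def speedup_factor(t, c, K, rng):
    """Empirical ratio of the per-sample variances of the MC and IS estimator terms."""
    # Conventional Monte Carlo terms: indicator of the event under the original density.
    mc_terms = [1.0 if rng.gauss(0.0, 1.0) >= t else 0.0 for _ in range(K)]
    # IS terms: indicator times weight under the translated density.
    is_terms = []
    for _ in range(K):
        x = rng.gauss(0.0, 1.0) + c
        is_terms.append(math.exp(-c * x + 0.5 * c * c) if x >= t else 0.0)
    return statistics.variance(mc_terms) / statistics.variance(is_terms)

# Hypothetical example: threshold t = 3 with shift c = 3.
rng = random.Random(5)
gain = speedup_factor(t=3.0, c=3.0, K=50000, rng=rng)
print(gain)  # values well above 1 indicate that IS needs far fewer samples
```

The ratio counts samples only; it does not account for the extra time needed to evaluate the weights.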
Variance Cost Function
Variance is not the only possible cost function for a simulation, and other cost functions, such as the mean absolute deviation, are used in various statistical applications. Nevertheless, the variance is the primary cost function addressed in the literature, probably due to the use of variances in confidence intervals and in the performance measure <math>\sigma^2_{MC} / \sigma^2_{IS}</math>.
An associated issue is the fact that the ratio <math>\sigma^2_{MC} / \sigma^2_{IS}</math> overestimates the run-time savings due to IS, since it does not include the extra computing time required to compute the weight function. Hence, some people evaluate the net run-time improvement by various means. Perhaps a more serious overhead to IS is the time taken to devise and program the technique and analytically derive the desired weight function.
See also
- Monte Carlo Method
- Stratified sampling
- Recursive stratified sampling
- Particle filter - a Sequential Monte Carlo method, which uses importance sampling
External links
- Monte Carlo Methods and Importance Sampling, Eric C. Anderson, Lecture notes for Stat 587C
- Sequential Monte Carlo Methods (Particle Filtering) homepage on University of Cambridge
- Introduction to importance sampling in rare-event simulations, European Journal of Physics. PDF document.
- Adaptive Monte Carlo methods for rare event simulations, Winter Simulation Conference