Adaptive sampling

Adaptive sampling is an approach to sampling that uses heuristics to improve efficiency. The term describes a general approach to the sampling problem rather than a specific method, so it can be combined with other suitable methods.

In many real-world problems, sampling is needed, implicitly or explicitly, to obtain practical solutions. Sampling consumes resources, and using those resources efficiently is usually crucial. This is why many sampling methods exist as alternatives to the brute-force approach.

Let f(x) be a function that is to be sampled. Let C(x, s) be the cost of sampling at x given the set of previous samples s, and let G(x, s) be the gain (anti-cost) from sampling the function at x given s. For simplicity, C(x, s) can be assumed to be constant, since the sampling cost usually depends neither on the previous samples nor on the input x. In time-critical systems, where the cost of each sample is strongly related to computation time, C usually takes further parameters such as the current time. As an example of a gain function, it can be assumed that G(x, s) = 0 if x has already been sampled. The sampling problem is then to maximize the cumulative gain minus the cumulative cost. In practice, this usually comes down to sampling the function repeatedly until the estimated or deterministic cost C(x, s) of the next sample exceeds its gain G(x, s).
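As a minimal sketch of this stopping rule, the following Python snippet greedily picks the candidate with the largest net gain G(x, s) - C(x, s) and stops once no remaining sample is worth its cost. The names greedy_sample, coverage_gain and constant_cost, the candidate grid, and the particular gain function (distance to the nearest previous sample) are assumptions made for illustration, not part of any specific method.

```python
from typing import Callable, List, Sequence, Tuple

Gain = Callable[[float, List[float]], float]
Cost = Callable[[float, List[float]], float]


def coverage_gain(x: float, samples: List[float]) -> float:
    """Hypothetical gain: distance to the nearest previous sample."""
    if x in samples:
        return 0.0  # G(x, s) = 0 if x has already been sampled
    if not samples:
        return 1.0  # arbitrary gain assigned to the very first sample
    return min(abs(x - s) for s in samples)


def constant_cost(x: float, samples: List[float]) -> float:
    """Hypothetical cost: constant, independent of x and of s."""
    return 0.3


def greedy_sample(
    candidates: Sequence[float], gain: Gain, cost: Cost
) -> Tuple[List[float], float]:
    """Sample greedily while the best remaining net gain is positive."""
    samples: List[float] = []
    total_net = 0.0
    while True:
        remaining = [x for x in candidates if x not in samples]
        if not remaining:
            break
        best = max(remaining, key=lambda x: gain(x, samples) - cost(x, samples))
        net = gain(best, samples) - cost(best, samples)
        if net <= 0:
            break  # the next sample would cost at least as much as it gains
        samples.append(best)
        total_net += net
    return samples, total_net


chosen, net_gain = greedy_sample([0.0, 0.25, 0.5, 0.75, 1.0],
                                 coverage_gain, constant_cost)
print(chosen, net_gain)
```

With these assumed functions, sampling stops after three points: the two remaining candidates each lie only 0.25 away from an existing sample, which is less than the constant cost of 0.3.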

Adaptive sampling then assumes that, given the necessary knowledge about the problem, there is a theoretically optimal sequence s of samples that maximizes the information (gain) obtained from the samples, and that this sequence can be estimated using heuristics. Adaptive sampling usually focuses on estimating the next optimal sample input x given the previous set of samples, and is in this sense adaptive to the current knowledge about the function.
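One simple way to illustrate choosing the next sample from the current knowledge of the function is interval bisection in one dimension. The sketch below assumes a particular heuristic, namely that the interval across which f changes most is the most informative one to refine; the function name adaptive_sample_1d and the test function are hypothetical, and other heuristics would be equally valid.

```python
import math
from typing import Callable, List, Tuple


def adaptive_sample_1d(
    f: Callable[[float], float], lo: float, hi: float, n_samples: int
) -> Tuple[List[float], List[float]]:
    """Place n_samples points on [lo, hi], refining where f changes most."""
    xs = [lo, hi]
    ys = [f(lo), f(hi)]
    while len(xs) < n_samples:
        # Heuristic gain estimate: change of f across each current interval.
        i = max(range(len(xs) - 1), key=lambda k: abs(ys[k + 1] - ys[k]))
        x_new = 0.5 * (xs[i] + xs[i + 1])  # next sample: midpoint of that interval
        xs.insert(i + 1, x_new)
        ys.insert(i + 1, f(x_new))
    return xs, ys


def steep_step(x: float) -> float:
    """Hypothetical test function with a sharp transition near x = 0.3."""
    return math.tanh(20.0 * (x - 0.3))


xs, ys = adaptive_sample_1d(steep_step, 0.0, 1.0, 17)
near_step = sum(0.2 <= x <= 0.4 for x in xs)
print(near_step, "of", len(xs), "samples fall in [0.2, 0.4]")
```

A uniform grid of the same size would place only a few points in the steep region; the adaptive version spends most of its budget there because each new sample is chosen from the samples already taken.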

Computational Molecular Biology

In computational molecular biology, adaptive sampling is used to efficiently simulate protein folding when coupled with molecular dynamics simulations.

Background

Proteins spend a large portion – nearly 96% in some cases – of their folding time "waiting" in various thermodynamic free energy minima. Consequently, a straightforward simulation of this process would spend a great deal of computation on these states, while the transitions between the states – the aspects of protein folding of greater scientific interest – would take place only rarely. Adaptive sampling exploits this property to concentrate simulation on the protein's phase space in between these states. Using adaptive sampling, molecular simulations that previously would have taken decades can be performed in a matter of weeks.

Theory

If a protein folds through the metastable states A -> B -> C, researchers can estimate the transition time from A to C by simulating the A -> B transition and the B -> C transition separately. Decomposing the problem in this manner is efficient because each step can be simulated in parallel. The protein may also fold through alternative routes, which may overlap in part with the A -> B -> C pathway.
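The decomposition can be sketched with a toy model. The snippet below assumes a strictly sequential pathway with no back-transitions and memoryless (exponentially distributed) waiting times, in which case the mean A -> C time is the sum of the mean A -> B and B -> C times. The rates and the function names are hypothetical, and real molecular dynamics simulations are far more involved.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def mean_transition_time(rate: float, n_trials: int, seed: int) -> float:
    """Estimate the mean waiting time of one transition (toy exponential model)."""
    rng = random.Random(seed)
    return sum(rng.expovariate(rate) for _ in range(n_trials)) / n_trials


def estimate_a_to_c(rate_ab: float, rate_bc: float, n_trials: int = 20000) -> float:
    """Simulate A -> B and B -> C independently, then add the mean times."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The two segments do not depend on each other, so they are submitted
        # as separate jobs; threads only stand in for separate machines here.
        t_ab = pool.submit(mean_transition_time, rate_ab, n_trials, 1)
        t_bc = pool.submit(mean_transition_time, rate_bc, n_trials, 2)
        return t_ab.result() + t_bc.result()


# Assumed rates: A -> B at 2.0 per unit time, B -> C at 0.5 per unit time.
estimate = estimate_a_to_c(2.0, 0.5)
print(estimate)  # should be close to 1/2.0 + 1/0.5 = 2.5
```

Because neither segment needs the other's result, the same structure extends to many segments distributed across many workers.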

Applications

Adaptive sampling is used by the Folding@home distributed computing project in combination with Markov state models.
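One heuristic often described for combining adaptive sampling with state-based models is to restart short simulations from the discovered states that have been visited least. The sketch below illustrates that idea on a toy one-dimensional random walk. It is an assumption made for illustration, not a description of how Folding@home selects starting points, and all names and parameters are hypothetical.

```python
import random
from collections import Counter
from typing import List, Tuple


def run_short_trajectory(
    start: int, n_steps: int, n_states: int, rng: random.Random,
    p_forward: float = 0.3,
) -> List[int]:
    """Toy dynamics: a biased random walk over states 0 .. n_states - 1."""
    state = start
    visited = [state]
    for _ in range(n_steps):
        if rng.random() < p_forward:
            state = min(state + 1, n_states - 1)
        else:
            state = max(state - 1, 0)
        visited.append(state)
    return visited


def least_counts_sampling(
    n_rounds: int, n_steps: int, n_states: int, seed: int = 0
) -> Tuple[Counter, List[int]]:
    """Run short trajectories, each restarted from the least-visited known state."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    starts: List[int] = []
    start = 0  # the first trajectory begins in state 0
    for _ in range(n_rounds):
        starts.append(start)
        counts.update(run_short_trajectory(start, n_steps, n_states, rng))
        # Adaptive step: pick the discovered state with the fewest visits
        # (ties broken in favour of the higher state index).
        start = min(counts, key=lambda s: (counts[s], -s))
    return counts, starts


counts, starts = least_counts_sampling(n_rounds=50, n_steps=10, n_states=30)
print(sorted(counts.items()))
print(starts)
```

Because the walk is biased toward state 0, trajectories that always start at state 0 rarely travel far; restarting from rarely visited states is intended to spend more of the budget at the frontier of what has been explored.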

Disadvantages

While adaptive sampling is useful when working with many short simulations, longer continuous trajectories may be more helpful for certain types of biochemical problems.

References

Robert B Best (2012). "Atomistic molecular simulations of protein folding". Current Opinion in Structural Biology (review). 22 (1): 52–61. doi:10.1016/j.sbi.2011.12.001. PMID 22257762.
TJ Lane; Gregory Bowman; Robert McGibbon; Christian Schwantes; Vijay Pande; Bruce Borden (September 10, 2012). "Folding@home Simulation FAQ". Folding@home. Stanford University. Archived from the original on 2012-09-13. Retrieved September 10, 2012.
G. Bowman; V. Voelz; V. S. Pande (2011). "Taming the complexity of protein folding". Current Opinion in Structural Biology. 21 (1): 4–11. doi:10.1016/j.sbi.2010.10.006. PMC 3042729. PMID 21081274.
David E. Shaw; Martin M. Deneroff; Ron O. Dror; Jeffrey S. Kuskin; Richard H. Larson; John K. Salmon; Cliff Young; Brannon Batson; Kevin J. Bowers; Jack C. Chao; Michael P. Eastwood; Joseph Gagliardo; J. P. Grossman; C. Richard Ho; Douglas J. Ierardi, Ist (2008). "Anton, A Special-Purpose Machine for Molecular Dynamics Simulation". Communications of the ACM. 51 (7): 91–97. doi:10.1145/1364782.1364802.
Ron O. Dror; Robert M. Dirks; J.P. Grossman; Huafeng Xu; David E. Shaw (2012). "Biomolecular Simulation: A Computational Microscope for Molecular Biology". Annual Review of Biophysics. 41: 429–52. doi:10.1146/annurev-biophys-042910-155245. PMID 22577825.
