Quantum mechanics emerges from classical mechanics by relaxing the requirement, assumed by classical probability theory, that the observables commute. The most immediate and striking consequence of this relaxation is that real-valued probability distributions are insufficient to encode the interference phenomena observed in experiment. In response, we generalize the notion of a probability distribution: from real-valued distributions over the set of possible outcomes, which combine convexly, to complex-valued distributions over the set of possible outcomes, which combine linearly. These complex-valued probabilities encode the observed interference patterns in their relative phases. Such quantum probability distributions do not describe mutually exclusive outcomes, exactly one of which exists prior to measurement; rather, they describe situations in which all possible outcomes exist simultaneously prior to measurement and interfere in a wave-like manner.
The increase in predictive power offered by quantum mechanics came at the price of computational difficulty. Unlike classical systems, whose state-space dimensionality grows additively with the number of subsystems, the dimensionality of a composite quantum system grows multiplicatively: a lattice of $n$ spin-$\frac{1}{2}$ particles, for example, requires a Hilbert space of dimension $2^n$. Thus, even modest systems quickly become intractable without approximation techniques. Fortunately, it is rarely the case that knowledge of the full state space is required to accurately model a given system, as most of the relevant information is often contained in a comparatively small subspace. Many of the most successful approximation techniques of the last century, such as the Born–Oppenheimer approximation and variational methods like Density Functional Theory, rely on this observation for their success. With the rapid development of machine learning, a field that specializes in dimensionality reduction and feature extraction from very large datasets, it is natural to apply these techniques to the canonical large-data problem of the physical sciences.
The Universal Approximation Theorems are a collection of results concerning the ability of artificial neural networks to approximate functions from broad classes to arbitrary accuracy. In particular, the Restricted Boltzmann Machine (RBM) is a shallow two-layer network consisting of a layer of visible units and a layer of hidden units, with connections only between the two layers.
Let
The RBM is a natural choice for representing wave-functions of systems of spin-$\frac{1}{2}$ particles, since each binary unit of the visible layer can be identified with the two basis states of a single spin.
Letting
The trial state wave-functions are thus parameterized by the weights and biases of the network.
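For reference, a widely used closed form for such an RBM trial state, written here with visible biases $a_i$, hidden biases $b_j$, and weights $W_{ij}$ (our notation, intended only as an illustration), is obtained by summing out the binary hidden units:
\[
\psi(s) \;=\; \sum_{\{h_j=\pm 1\}} \exp\!\Bigg(\sum_{i} a_i s_i + \sum_{j} b_j h_j + \sum_{i,j} W_{ij}\, h_j s_i\Bigg)
\;=\; \exp\!\Bigg(\sum_{i} a_i s_i\Bigg) \prod_{j} 2\cosh\!\Bigg(b_j + \sum_{i} W_{ij}\, s_i\Bigg),
\]
with all parameters allowed to be complex so that the relative phases of the amplitudes can be represented.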
The variational energy functional is the expectation value of the Hamiltonian in the trial state; by the variational principle, it bounds the ground-state energy from above for any choice of the parameters, and minimizing it drives the trial state toward the ground state.
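Concretely, the energy functional can be written as a Rayleigh quotient and recast as a statistical average amenable to Monte Carlo estimation; the notation below is ours and serves only as an illustration:
\[
E(\mathcal{W}) \;=\; \frac{\langle \psi \,|\, \hat{H} \,|\, \psi \rangle}{\langle \psi \,|\, \psi \rangle}
\;=\; \sum_{s} \frac{|\psi(s)|^{2}}{\sum_{s''} |\psi(s'')|^{2}}\, E_{\mathrm{loc}}(s),
\qquad
E_{\mathrm{loc}}(s) \;=\; \sum_{s'} \langle s \,|\, \hat{H} \,|\, s' \rangle\, \frac{\psi(s')}{\psi(s)},
\]
so that $E(\mathcal{W})$ is the expectation of the local energy $E_{\mathrm{loc}}$ over configurations distributed according to $|\psi(s)|^2$.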
In this demonstration, we assume the prototypical Ising spin model on a one-dimensional lattice of spin-$\frac{1}{2}$ particles.
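The couplings are not fixed by the discussion here; purely as an illustration, the transverse-field form of the one-dimensional Ising Hamiltonian with nearest-neighbour interactions reads
\[
\hat{H} \;=\; -J \sum_{i} \hat{\sigma}^{z}_{i}\, \hat{\sigma}^{z}_{i+1} \;-\; h \sum_{i} \hat{\sigma}^{x}_{i},
\]
where $J$ sets the interaction strength and $h$ the transverse field. For a Hamiltonian of this form, the local energy of a configuration involves only that configuration and the $n$ configurations obtained from it by a single spin flip, which keeps its evaluation inexpensive.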
Evaluating the energy functional exactly requires a sum over the exponentially many spin configurations and is therefore intractable for all but the smallest lattices. Instead, we estimate it stochastically as the sample mean of the local energy over configurations drawn from the probability distribution $|\psi(s)|^2$, which we generate with the Metropolis-Hastings algorithm:
SUBROUTINE metropolis_hastings:
    markov_chain(1) ← random_sample(n)
    FOR i ∈ [2, N] DO
        s ← markov_chain(i-1)
        rind ← random_index(lo=1, hi=n)
        s_prop ← invert_spin(config=s, at=rind)
        IF r_unif(0,1) < |ψ(s_prop)/ψ(s)|^2 THEN
            markov_chain(i) ← s_prop
        ELSE
            markov_chain(i) ← s
        END IF
    END FOR
    RETURN markov_chain
END SUBROUTINE metropolis_hastings
In practice, we allow for a thermalization, or "burn-in", period during which the sampling process moves the initial random configuration toward the stationary distribution before we begin recording samples. As we can see, the acceptance probabilities in the Metropolis-Hastings algorithm and the form of the local energy involve only ratios of the wave-function evaluated at pairs of configurations, so the (generally intractable) normalization of the trial state never needs to be computed.
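To make this concrete, the following Python sketch implements single-spin-flip Metropolis-Hastings sampling for an RBM trial state using only differences of log-amplitudes; the function names and the NumPy-based form of the ansatz are assumptions of ours rather than part of the procedure described above.

import numpy as np

def rbm_log_psi(s, a, b, W):
    """Log-amplitude of the (assumed) RBM ansatz:
    log psi(s) = a.s + sum_j log(2 cosh(b_j + (W s)_j)).
    Illustrative form; the parameterization in the text may differ."""
    theta = b + W @ s
    return a @ s + np.sum(np.log(2.0 * np.cosh(theta)))

def metropolis_sample(n_spins, n_samples, a, b, W, n_burn=500, rng=None):
    """Single-spin-flip Metropolis-Hastings sampling from |psi(s)|^2.
    Only ratios psi(s')/psi(s) appear, via differences of log-amplitudes,
    so the normalization of psi is never needed."""
    rng = np.random.default_rng() if rng is None else rng
    s = rng.choice([-1, 1], size=n_spins)          # random initial configuration
    log_psi = rbm_log_psi(s, a, b, W)
    samples = []
    for step in range(n_burn + n_samples):
        i = rng.integers(n_spins)                  # propose flipping one spin
        s_prop = s.copy()
        s_prop[i] *= -1
        log_psi_prop = rbm_log_psi(s_prop, a, b, W)
        # acceptance probability |psi(s_prop)/psi(s)|^2 = exp(2 Re(delta log psi))
        if rng.random() < np.exp(2.0 * (log_psi_prop - log_psi).real):
            s, log_psi = s_prop, log_psi_prop
        if step >= n_burn:                         # discard burn-in samples
            samples.append(s.copy())
    return np.array(samples)

Working with log-amplitudes keeps the arithmetic stable and reduces the acceptance ratio to a single exponential of a difference.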
The stochastic optimization algorithm is a first-order method that proceeds by making infinitesimal variations of the parameters.
Let
Similarly, we define a path
By the time-dependent Schrödinger equation, the state evolves according to $i\hbar\,\partial_t \lvert\psi(t)\rangle = \hat{H}\lvert\psi(t)\rangle$.
To determine the actual form of the tangent vector, we project the exact time evolution generated by the Hamiltonian onto the variational manifold traced out by the parameterized trial states.
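In the stochastic reconfiguration approach commonly used with neural-network quantum states, this projection leads, at each iteration, to a small linear system built from the logarithmic derivatives of the trial state; the symbols $O_k$, $S$, $F$, and $\eta$ below are our own notation and are meant only as a sketch:
\[
O_k(s) \;=\; \frac{\partial \ln \psi(s;\mathcal{W})}{\partial \mathcal{W}_k},
\qquad
S_{kk'} \;=\; \big\langle O_k^{*} O_{k'} \big\rangle - \big\langle O_k^{*} \big\rangle \big\langle O_{k'} \big\rangle,
\qquad
F_k \;=\; \big\langle O_k^{*}\, E_{\mathrm{loc}} \big\rangle - \big\langle O_k^{*} \big\rangle \big\langle E_{\mathrm{loc}} \big\rangle,
\]
with all expectation values estimated over the Metropolis samples and the parameters updated as $\mathcal{W} \to \mathcal{W} - \eta\, S^{-1} F$ for a small step size $\eta$; in practice a small diagonal shift is usually added to $S$ before inversion to keep the system well conditioned.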
It must be noted that the initialization of the parameters can have a dramatic effect on the performance of the algorithm. The initial state implied by a poor choice of parameters may have little overlap with the target state, in which case the optimization converges slowly or becomes trapped far from the true ground state.
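As a minimal sketch of one common choice (not necessarily the one adopted here), the parameters can be initialized as small complex Gaussian random numbers, which places the initial state close to an equal-weight superposition of all configurations while still breaking the symmetry between hidden units:

import numpy as np

def init_rbm_params(n_visible, n_hidden, scale=0.01, seed=0):
    """Small random complex initialization of the (assumed) RBM parameters.
    With weights near zero, psi(s) is nearly constant over configurations,
    i.e. the initial state is close to an equal-weight superposition."""
    rng = np.random.default_rng(seed)
    a = scale * (rng.standard_normal(n_visible) + 1j * rng.standard_normal(n_visible))
    b = scale * (rng.standard_normal(n_hidden) + 1j * rng.standard_normal(n_hidden))
    W = scale * (rng.standard_normal((n_hidden, n_visible))
                 + 1j * rng.standard_normal((n_hidden, n_visible)))
    return a, b, W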