Update README.md

acbbullock · Nov 20, 2022 · 25de144 · 25de144
1 parent b4dd829
commit 25de144
Showing 1 changed file with 20 additions and 16 deletions.
diff --git a/README.md b/README.md
@@ -89,23 +89,25 @@ where $\delta\alpha_l \in \mathbb{C}$ is an infinitesimal variation in the $l$-t
  $$\ket{\psi(\alpha + \delta\alpha)} \approx \ket{\psi(\alpha)} + \sum_l \delta\alpha_l \frac{\partial}{\partial \alpha_l} \ket{\psi(\alpha)} \in \mathcal{H}$$
 on $V$. Here, the $\delta\alpha_l$ represent the amount of change in each direction needed to linearly approximate the function $\ket{\psi}$ at $\alpha + \delta\alpha \in V$ from the neighboring point $\alpha \in \mathcal{M}$, so that the affine approximation becomes exact in the limit $\delta\alpha_l \to 0$.
 
-Similarly, we may define a path $\alpha:[t, t + \delta t] \to \mathcal{M}$ in $\mathcal{M}$ such that $\alpha(t) = \alpha \in \mathcal{M}$ and $\alpha(t + \delta t) = \alpha + \delta\alpha \in V$ along with a unitary propagator $U(\delta t) = \exp(-i \delta t H)$ such that, at some nearby point $\alpha(t + \delta t) \in V$, we have
- $$\ket{\psi(\alpha(t + \delta t))} = U(\delta t)\ket{\psi(\alpha(t))} = \ket{\psi(\alpha(t))} - i \delta t H \ket{\psi(\alpha(t))} + \frac{(-i \delta t)^2}{2} H^2 \ket{\psi(\alpha(t))} + \cdots \in \mathcal{H}$$
-which is approximated to first order as an affine function
- $$\ket{\psi(\alpha(t + \delta t))} \approx \ket{\psi(\alpha(t))} - i \delta t H \ket{\psi(\alpha(t))} \in \mathcal{H}$$
-on $V$. Here, the Hamiltonian $H$ is the infinitesimal generator of the one-parameter unitary group of time translations whose elements are the unitary transformations $U(t_2 - t_1):\mathcal{H} \to \mathcal{H}$ on the state space $\mathcal{H}$ for any $t_1, t_2 \in \mathbb{R}$, so that the affine approximation becomes exact in the limit $\delta t \to 0$. Note that the propagator $U(\delta t)$ is recovered (up to a constant) by solution of the time-dependent Schrödinger equation $i \frac{d}{dt} \ket{\psi(\alpha(t))} = H \ket{\psi(\alpha(t))} \Rightarrow \ket{\psi(\alpha(t_2))} = U(t_2-t_1) \ket{\psi(\alpha(t_1))}$ by taking $t_1 = t$ and $t_2 = t + \delta t$.
-
-By performing a Wick rotation $\tau = it$ and defining a path $\alpha:[\tau, \tau + \delta \tau] \to \mathcal{M}$ in $\mathcal{M}$ such that $\alpha(\tau) = \alpha \in \mathcal{M}$ and $\alpha(\tau + \delta \tau) = \alpha + \delta\alpha \in V$, we may propagate the state $\ket{\psi(\alpha(\tau))}$ similarly with a non-unitary propagator $U(\delta \tau) = \exp(- \delta \tau H)$ such that
- $$\ket{\psi(\alpha(\tau + \delta \tau))} \approx \ket{\psi(\alpha(\tau))} - \delta \tau H \ket{\psi(\alpha(\tau))} \in \mathcal{H}$$
-is an affine function on $V$ which becomes exact in the limit $\delta\tau \to 0$. The propagator $U(\delta \tau)$ is recovered (up to a constant) by solution of the imaginary-time Schrödinger equation $-\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} = H \ket{\psi(\alpha(\tau))} \Rightarrow \ket{\psi(\alpha(\tau_2))} = U(\tau_2-\tau_1) \ket{\psi(\alpha(\tau_1))}$ by taking $\tau_1 = \tau$ and $\tau_2 = \tau + \delta \tau$. Using this fact, we may compare the first order terms of $\ket{\psi(\alpha(\tau + \delta \tau))}$ and $\ket{\psi(\alpha + \delta\alpha)}$ to find that
+Similarly, we define a path $\alpha:[\tau, \tau + \delta \tau] \to \mathcal{M}$ in $\mathcal{M}$ where $\delta\tau > 0$ such that $\alpha(\tau) = \alpha \in \mathcal{M}$ and $\alpha(\tau + \delta \tau) = \alpha + \delta\alpha \in V$, and expand the path about $\tau \in \mathbb{R}$ to see
+ $$\alpha(\tau + \delta \tau) = \alpha(\tau) + \delta\tau \frac{d \alpha}{d\tau}(\tau) + \frac{\delta\tau^2}{2} \frac{d^2 \alpha}{d\tau^2}(\tau) + \cdots \approx \alpha(\tau) + \delta\tau \frac{d \alpha}{d\tau}(\tau) \in \mathcal{M}$$
+which is an affine function on the closed interval $[\tau, \tau + \delta \tau] \subset \mathbb{R}$. Evaluating $\ket{\psi}:\mathcal{M} \to \mathcal{H}$ at $\alpha(\tau + \delta \tau) \in V$, we find that
+ $$\ket{\psi(\alpha(\tau + \delta \tau))} \approx \ket{\psi(\alpha(\tau))} + \delta\tau \frac{d}{d\tau}\ket{\psi(\alpha(\tau))} \in \mathcal{H}$$
+is an affine function on $V$ which becomes exact in the limit $\delta\tau \to 0$. We may compare the first order terms of $\ket{\psi(\alpha(\tau + \delta \tau))}$ and $\ket{\psi(\alpha + \delta\alpha)}$ to find that
  $$\delta \tau \frac{d}{d\tau} \ket{\psi(\alpha(\tau))} = \delta \tau \sum_l \frac{\partial}{\partial \alpha_l} \ket{\psi(\alpha(\tau))} \frac{d \alpha_l}{d\tau}(\tau) = \sum_l \delta\alpha_l(\tau) \frac{\partial}{\partial \alpha_l} \ket{\psi(\alpha(\tau))}$$
-is a linear combination of the tangent vectors $\frac{d \alpha_l}{d\tau}(\tau)$ at $\alpha(\tau) \in \mathcal{M}$, so that the variation $\delta\alpha(\tau) = \delta \tau \frac{d \alpha}{d\tau}(\tau)$ at time $\tau$ is in the direction of the tangent vector $\frac{d \alpha}{d\tau}(\tau)$ at $\alpha(\tau) \in \mathcal{M}$ such that
- $$\alpha(\tau + \delta \tau) = \alpha(\tau) + \delta\alpha(\tau) = \alpha(\tau) + \delta \tau \frac{d \alpha}{d\tau}(\tau)$$
-is the change in the parameters $\alpha(\tau) \in \mathcal{M}$ due to the non-unitary, imaginary-time evolution of the state $\ket{\psi(\alpha(\tau))}$ over the interval $[\tau, \tau + \delta \tau]$.
+is a linear combination of the tangent vectors $\frac{d \alpha_l}{d\tau}(\tau)$ at $\alpha(\tau) \in \mathcal{M}$, so that the total variation $\delta\alpha(\tau) = \delta \tau \frac{d \alpha}{d\tau}(\tau)$ at time $\tau$ is in the direction of the tangent vector $\frac{d \alpha}{d\tau}(\tau)$ at $\alpha(\tau) \in \mathcal{M}$ such that
+ $$\alpha(\tau + \delta \tau) \approx \alpha(\tau) + \delta\alpha(\tau) = \alpha(\tau) + \delta \tau \frac{d \alpha}{d\tau}(\tau)$$
+is the infinitesimal change in the parameters $\alpha(\tau) \in \mathcal{M}$ at time $\tau$ over the interval $[\tau, \tau + \delta \tau]$. Letting $T_\alpha \mathcal{M}$ denote the tangent space of $\mathcal{M}$ at $\alpha(\tau) \in \mathcal{M}$ and $T_{\psi(\alpha)} \mathcal{H}$ denote the tangent space of $\mathcal{H}$ at $\ket{\psi(\alpha(\tau))} \in \mathcal{H}$, we push forward the local vector field $\frac{d \alpha}{d\tau}$ on $V \subset \mathcal{M}$ to the local vector field $\frac{d}{d\tau} \ket{\psi(\alpha)}$ on the image $\tilde{V} = \ket{\psi(V)} \subset \mathcal{H}$ of $V$ under the mapping $\ket{\psi}:\mathcal{M} \to \mathcal{H}$, and note that
+ $$\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \tilde{V}$$
+is the pushforward of the tangent vector $\frac{d \alpha}{d\tau}(\tau) \in T_\alpha \mathcal{M}$ at $\alpha(\tau) \in \mathcal{M}$.
+
+By the time-dependent Schrödinger equation, the state $\ket{\psi(\alpha(t))} \in \mathcal{H}$ at some time $t$ evolves according to $i \frac{d}{dt} \ket{\psi(\alpha(t))} = H \ket{\psi(\alpha(t))}$, whose solution is given (up to a constant) by $\ket{\psi(\alpha(t_2))} = U(t_2 - t_1) \ket{\psi(\alpha(t_1))}$ with the propagator $U(t_2 - t_1) = \exp[-i(t_2-t_1)H]$ for the Hamiltonian $H$. Here, the Hamiltonian $H$ is the infinitesimal generator of the one-parameter unitary group of time translations whose elements are the unitary transformations $U(t_2 - t_1):\mathcal{H} \to \mathcal{H}$ on the state space $\mathcal{H}$ for any $t_1, t_2 \in \mathbb{R}$. After a Wick rotation $\tau = it$, the state $\ket{\psi(\alpha(\tau))} \in \mathcal{H}$ at some imaginary time $\tau$ evolves according to the imaginary-time Schrödinger equation $-\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} = H \ket{\psi(\alpha(\tau))}$, whose solution is given (up to a constant) by $\ket{\psi(\alpha(\tau_2))} = U(\tau_2 - \tau_1) \ket{\psi(\alpha(\tau_1))}$ with the non-unitary propagator $U(\tau_2 - \tau_1) = \exp[-(\tau_2-\tau_1)H]$. Taking $\tau_1 = \tau$ and $\tau_2 = \tau + \delta\tau$, we propagate the state $\ket{\psi(\alpha(\tau))} \in \mathcal{H}$ by
+ $$\ket{\psi(\alpha(\tau + \delta \tau))} = U(\delta\tau) \ket{\psi(\alpha(\tau))} = \ket{\psi(\alpha(\tau))} - \delta \tau H \ket{\psi(\alpha(\tau))} + \frac{(-\delta\tau)^2}{2}H^2 \ket{\psi(\alpha(\tau))} + \cdots \approx \ket{\psi(\alpha(\tau))} - \delta \tau H \ket{\psi(\alpha(\tau))} \in \mathcal{H}$$
+approximated to first order over the interval $[\tau, \tau + \delta \tau]$, which becomes exact in the limit $\delta\tau \to 0$. Enforcing a different normalization, we may also evolve the state $\ket{\psi(\alpha(\tau))} \in \mathcal{H}$ according to the imaginary-time Schrödinger equation $- \Delta\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} = \Delta H \ket{\psi(\alpha(\tau))}$ with the deviations $\Delta \frac{d}{d\tau} = \frac{d}{d\tau} - \left\langle \frac{d}{d\tau} \right\rangle_{\psi(\alpha)}$ and $\Delta H = H - \langle H \rangle_{\psi(\alpha)}$, which is often more advantageous in a stochastic framework.
 
-Let $T_\alpha \mathcal{M}$ denote the tangent space of $\mathcal{M}$ at $\alpha(\tau) \in \mathcal{M}$ and let $T_{\psi(\alpha)} \mathcal{H}$ denote the tangent space of $\mathcal{H}$ at $\ket{\psi(\alpha(\tau))} \in \mathcal{H}$. To determine the actual form of the tangent vector $\frac{d \alpha}{d\tau}(\tau) \in T_\alpha \mathcal{M}$ at time $\tau$, we push forward the local vector field $\frac{d \alpha}{d\tau}$ on $V \subset \mathcal{M}$ to the local vector field $\frac{d}{d\tau} \ket{\psi(\alpha)}$ on the image $\tilde{V} = \ket{\psi(V)} \subset \mathcal{H}$ of $V$ under the mapping $\ket{\psi}:\mathcal{M} \to \mathcal{H}$ and impose the constraint that the projection
+To determine the actual form of the tangent vector $\frac{d \alpha}{d\tau}(\tau) \in T_\alpha \mathcal{M}$ at time $\tau$, we impose the constraint that the projection
  $$\left\langle \frac{d}{d\tau} \psi(\alpha(\tau)), \bigg[ \Delta \frac{d}{d\tau} + \Delta H \bigg] \psi(\alpha(\tau)) \right\rangle = 0$$
-onto the tangent vector $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \tilde{V}$ vanishes, a type of Galerkin condition known as the Dirac-Frenkel-McLachlan variational principle. Here, we require precisely that the tangent vector $[ \Delta \frac{d}{d\tau} + \Delta H ] \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \mathcal{H}$ formed by the difference of deviations $\Delta \frac{d}{d\tau} = \frac{d}{d\tau} - \left\langle \frac{d}{d\tau} \right\rangle_{\psi(\alpha)}$ and $-\Delta H = \langle H \rangle_{\psi(\alpha)} - H$ is orthogonal to the tangent vector $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \tilde{V}$. This condition is motivated by the fact that the subspace $\tilde{V} \subset \mathcal{H}$ is typically of much smaller dimension than $\mathcal{H}$ itself since $\mathcal{M}$ is typically of much smaller dimension than $\mathcal{H}$ as manifolds, resulting in a situation where the tangent vector $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \tilde{V}$ is contained to a low dimensional subspace but $H \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \mathcal{H}$ is not necessarily contained to this subspace. In order for the imaginary-time Schrödinger equation $-\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} = H \ket{\psi(\alpha(\tau))}$ to hold, we must have that the norm of the state $[\frac{d}{d\tau} + H] \ket{\psi(\alpha(\tau))}$ is vanishing, or equivalently that its projection onto the subspace containing $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))}$ is vanishing, i.e. that there is no overlap between $[\frac{d}{d\tau} + H] \ket{\psi(\alpha(\tau))}$ and $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))}$. The same argument holds for the state $[\Delta\frac{d}{d\tau} + \Delta H] \ket{\psi(\alpha(\tau))}$ formed by the deviation operators, which we choose for their advantageous statistical properties. From the overlap condition, we have explicitly
+of the state $[ \Delta \frac{d}{d\tau} + \Delta H ] \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \mathcal{H}$ onto the tangent vector $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \tilde{V}$ vanishes, a type of Galerkin condition known as the Dirac-Frenkel-McLachlan variational principle. This condition is motivated by the fact that the subspace $\tilde{V} \subset \mathcal{H}$ is typically of much smaller dimension than $\mathcal{H}$ itself, since $\mathcal{M}$ is typically of much smaller dimension than $\mathcal{H}$ as manifolds. As a result, the tangent vector $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \tilde{V}$ is confined to a low-dimensional subspace, but $H \ket{\psi(\alpha(\tau))} \in T_{\psi(\alpha)} \mathcal{H}$ is not necessarily confined to this subspace. For the imaginary-time Schrödinger equation $-\frac{d}{d\tau} \ket{\psi(\alpha(\tau))} = H \ket{\psi(\alpha(\tau))}$ to hold exactly, the norm of the state $[\frac{d}{d\tau} + H] \ket{\psi(\alpha(\tau))}$ must vanish. Since this cannot be guaranteed when $H \ket{\psi(\alpha(\tau))}$ leaves the subspace, we instead require the weaker condition that its projection onto the subspace containing $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))}$ vanishes, i.e. that there is no overlap between $[\frac{d}{d\tau} + H] \ket{\psi(\alpha(\tau))}$ and $\frac{d}{d\tau} \ket{\psi(\alpha(\tau))}$. The same argument holds for the state $[\Delta\frac{d}{d\tau} + \Delta H] \ket{\psi(\alpha(\tau))}$ formed by the deviation operators. From the overlap condition, we have explicitly
  $$0 = \sum_k \frac{d \alpha_k^\*}{d\tau}(\tau) \frac{\partial}{\partial \alpha_k} \bra{\psi(\alpha(\tau))} \bigg[ \sum_l \frac{d \alpha_l}{d\tau}(\tau) \frac{\partial}{\partial \alpha_l} \ket{\psi(\alpha(\tau))} - \sum_l \frac{d \alpha_l}{d\tau}(\tau) \langle \partial_l \rangle_{\psi(\alpha)} \ket{\psi(\alpha(\tau))} + H \ket{\psi(\alpha(\tau))} - \langle H \rangle_{\psi(\alpha)} \ket{\psi(\alpha(\tau))} \bigg] = \sum_k \frac{d \alpha_k^\*}{d\tau}(\tau) \bigg[ \sum_l \frac{d \alpha_l}{d\tau}(\tau) \langle \partial_k^\dagger \partial_l \rangle_{\psi(\alpha)} - \sum_l \frac{d \alpha_l}{d\tau}(\tau) \langle \partial_k^\dagger \rangle_{\psi(\alpha)} \langle \partial_l \rangle_{\psi(\alpha)} + \langle \partial_k^\dagger H \rangle_{\psi(\alpha)} - \langle \partial_k^\dagger \rangle_{\psi(\alpha)} \langle H \rangle_{\psi(\alpha)} \bigg] = \sum_k \frac{d \alpha_k^\*}{d\tau}(\tau) \bigg[ \sum_l \frac{d \alpha_l}{d\tau}(\tau) S_{kl}(\alpha) + F_k(\alpha) \bigg]$$
 which is true when each term is identically zero, i.e. when
  $$\sum_l \frac{d \alpha_l}{d\tau}(\tau) S_{kl}(\alpha) = - F_k(\alpha)$$
@@ -114,7 +116,9 @@ is true. This system of linear equations can be written in matrix form as
 whose formal solution is
  $$\frac{d \alpha}{d\tau}(\tau) = - S^{-1}(\alpha) F(\alpha)$$
 such that
- $$\alpha(\tau + \delta \tau) = \alpha(\tau) + \delta\alpha(\tau) = \alpha(\tau) + \delta \tau \frac{d \alpha}{d\tau}(\tau) = \alpha(\tau) - \delta \tau S^{-1}(\alpha) F(\alpha)$$
+ $$\alpha(\tau + \delta \tau) \approx \alpha(\tau) + \delta\alpha(\tau) = \alpha(\tau) + \delta \tau \frac{d \alpha}{d\tau}(\tau) = \alpha(\tau) - \delta \tau S^{-1}(\alpha) F(\alpha)$$
 is the change in the parameters $\alpha(\tau) \in \mathcal{M}$ due to the non-unitary, imaginary-time evolution of the state $\ket{\psi(\alpha(\tau))}$ over the interval $[\tau, \tau + \delta \tau]$.
 
-It must be noted that the initialization of the parameters can have a dramatic effect on the performance of the algorithm. The initial state $\ket{\psi(\alpha(0))}$ must be chosen such that $\langle \psi_0, \psi(\alpha(0)) \rangle \neq 0$, or else learning is not possible. The more overlap there is with the ground state, the more efficient the algorithm will be. With at least some overlap, we will expect that $\ket{\psi(\alpha(\tau))} \to \ket{\psi_0}$ as $\tau \to \infty$ for a sufficiently small time step $\delta\tau$.
+It must be noted that the initialization of the parameters can have a dramatic effect on the performance of the algorithm. The initial state $\ket{\psi(\alpha(0))}$ must be chosen such that $\langle \psi_0, \psi(\alpha(0)) \rangle \neq 0$, or else learning is not possible. The more overlap there is with the ground state, the more efficient the algorithm will be. With at least some overlap, we will expect that $\ket{\psi(\alpha(\tau))} \to \ket{\psi_0}$ as $\tau \to \infty$ for a sufficiently small time step $\delta\tau$. This can be seen by noting the change in the energy functional over the interval $[\tau, \tau + \delta \tau]$, by taking the expectation of $H$ in the state $\ket{\psi(\alpha(\tau + \delta\tau))} \approx \ket{\psi(\alpha(\tau))} - \delta \tau \Delta H \ket{\psi(\alpha(\tau))} = \ket{\psi(\alpha(\tau))} + \delta \tau \Delta \frac{d}{d\tau} \ket{\psi(\alpha(\tau))}$, i.e.
+ $$E[\psi(\alpha(\tau + \delta\tau))] = E[\psi(\alpha(\tau))] - 2\delta\tau F^\dagger(\alpha) S^{-1}(\alpha) F(\alpha) + \mathcal{O}(\delta\tau^2)$$
+where $\mathcal{O}(\delta\tau^2)$ denotes terms of order $\delta\tau^2$ and higher. Since $\delta\tau > 0$ and $S(\alpha)$ is positive-definite, we have $F^\dagger(\alpha) S^{-1}(\alpha) F(\alpha) > 0$ whenever $F(\alpha) \neq 0$, so that the change in energy $E[\psi(\alpha(\tau + \delta\tau))] - E[\psi(\alpha(\tau))] < 0$ for a sufficiently small time step $\delta\tau$.
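
The update $\alpha(\tau + \delta\tau) \approx \alpha(\tau) - \delta\tau S^{-1}(\alpha) F(\alpha)$ and the first-order energy change stated in this diff can be checked numerically on a toy problem. The sketch below is not taken from the repository. It assumes a hypothetical 3-level Hamiltonian, real parameters, and the illustrative parameterization $\psi_i = e^{\alpha_i}$ for $i < n-1$ with $\psi_{n-1} = 1$, so that the log-derivatives are $O_k(i) = \delta_{ik}$ and all expectation values can be summed exactly rather than sampled. The function names (`expectation_values`, `solve`, `step`, `run`) are made up for this example.

```python
import math

def expectation_values(H, alpha):
    """Return (E, S, F) for the toy state psi_i = exp(alpha_i), psi_{n-1} = 1.

    Assumed setup: H is a real symmetric matrix (list of lists) and alpha is
    a list of n-1 real parameters. Expectations are exact sums over the basis.
    """
    n = len(H)
    psi = [math.exp(a) for a in alpha] + [1.0]
    norm = sum(x * x for x in psi)
    p = [x * x / norm for x in psi]  # Born probabilities |psi_i|^2 / <psi|psi>
    # Local energies (H psi)_i / psi_i
    e_loc = [sum(H[i][j] * psi[j] for j in range(n)) / psi[i] for i in range(n)]
    E = sum(p[i] * e_loc[i] for i in range(n))  # <H>
    m = n - 1
    # S_kl = <d_k^dagger d_l> - <d_k^dagger><d_l>, with O_k(i) = delta_{ik}
    S = [[(p[k] if k == l else 0.0) - p[k] * p[l] for l in range(m)]
         for k in range(m)]
    # F_k = <d_k^dagger H> - <d_k^dagger><H>
    F = [p[k] * (e_loc[k] - E) for k in range(m)]
    return E, S, F

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def step(H, alpha, dtau):
    """One Euler step alpha <- alpha - dtau * S^{-1} F; also returns E before the step."""
    E, S, F = expectation_values(H, alpha)
    velocity = [-v for v in solve(S, F)]  # d(alpha)/d(tau) = -S^{-1} F
    return [a + dtau * v for a, v in zip(alpha, velocity)], E

def run(H, alpha, dtau, n_steps):
    """Iterate the update and record the energy before each step."""
    energies = []
    for _ in range(n_steps):
        alpha, E = step(H, alpha, dtau)
        energies.append(E)
    return alpha, energies

if __name__ == "__main__":
    # Hypothetical Hamiltonian: a 3-site tridiagonal matrix chosen for illustration.
    H = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
    alpha, energies = run(H, [0.5, -0.3], dtau=0.05, n_steps=2000)
    print("initial energy:", energies[0])
    print("final energy:  ", energies[-1])
    print("exact ground-state energy:", 2.0 - math.sqrt(2.0))
```

For this Hamiltonian the ground state has strictly positive amplitudes, so it is representable by the assumed parameterization and the initial state overlaps with it. In a stochastic setting the exact sums in `expectation_values` would be replaced by sample averages, and $S$ may need regularization before it is inverted; neither is shown here.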