The six-vertex model was first introduced by Linus Pauling in 1935 [Pau35] to study the thermodynamics of crystalline solids with ferroelectric properties. It has since become one of the most compelling models in statistical mechanics. An illustrative realization of the model is given by the hydrogen-bonding patterns of two-dimensional ice—when water freezes, each oxygen atom must be surrounded by four hydrogen atoms such that two of the hydrogen atoms bond covalently with the oxygen atom and two are farther away. The states of the six-vertex model are orientations of the edges in a finite region of the Cartesian lattice such that each internal vertex has two incoming edges and two outgoing edges (i.e., Eulerian orientations of the underlying lattice graph). The model is most often studied on the square lattice with additional edges so that each vertex has degree 4. We assign Boltzmann weights to the six possible vertex types in Figure 1 and define the partition function as , where is the set of Eulerian orientations of and is the number of type- vertices in the configuration .
Elliott Lieb discovered exact solutions to the six-vertex model with periodic boundary conditions for three different parameter regimes in 1967 [Lie67a, Lie67b, Lie67c]. In particular, he famously showed that if all six vertex weights are , the energy per vertex is (Lieb’s square ice constant). His results were then immediately generalized to allow for all parameter settings and external electric fields [Sut67, Yan67]. An equivalence between periodic and free boundary conditions [BKW73] was established soon after, and since then the primary object of study has been the six-vertex model subject to domain wall boundary conditions [ICK92, KZJ00, BPZ02, BF06, BL09, BL10]. In this line of work, there have been several surprisingly profound connections to enumerative combinatorics. For instance, Zeilberger gave a sophisticated computer-assisted proof of the alternating sign matrix conjecture in 1995 that required 88 referees to verify [Zei96]. A year later, Kuperberg [Kup96]
produced an elegant and significantly shorter proof using analysis of the partition function of the six-vertex model with domain wall boundary conditions. Other important connections of the model to combinatorics and probability include tilings of the Aztex diamond and the arctic circle theorem[CEP96, FS06], sampling lozenge tilings [LRS01, Wil04, BCFR17], and enumerating 3-colorings of lattice graphs [RT00, CR16].
While there has been extraordinary progress in understanding properties of the six-vertex model with periodic or domain wall boundary conditions, remarkably less is known when the model is subject to arbitrary boundary conditions. Sampling configurations using Markov chain Monte Carlo (MCMC) has been one of the primary means for discovering more general mathematical and physical properties of the six-vertex model [AR05, LKV17, LKRV18, KS18], and empirically the model is very sensitive to boundary conditions. Numerical studies have often observed slow behavior of local MCMC algorithms under certain parameter settings (e.g., according to [LKRV18], “it must be stressed that the Metropolis algorithm might be impractical in the antiferromagnetic phase, where the system may be unable to thermalize”). However, there are very few rigorous results in computer science on the computational complexity of sampling from the Boltzmann distribution for various weights and boundary conditions. Therefore, this motivates our study of Glauber dynamics, the most popular MCMC sampling algorithm, for the six-vertex model in the ferroelectric and antiferroelectric phases.
At first glance there are six degrees of freedom in the model, but this conveniently reduces to a two-parameter family due to invariants and standard physical assumptions that relate pairs of vertex types. To see this, it is useful to map configurations of the six-vertex model to sets of intersecting lattice paths by erasing all of the edges that are directed south or west and keeping the others[LRS01]. Using this routing interpretation, it is simple to see that the difference in the number of type-5 and type-6 vertices must equal a fixed constant determined by the boundary conditions. In addition to studying invariants, the lattice path representation of configurations turns out to be exceptionally useful for analyzing the Glauber dynamics. Next, the total weight of a configuration should remain unchanged if all the edge directions are reversed in the absence of an external electric field, so we let , , and . This complementary invariance is known as the zero field assumption. It is often convenient, however, to exploit the conservation laws of the model [BL09] and reparameterize the system so that and . This allows us to ignore empty sites and focus solely on the weighted lattice paths. Furthermore, since our goal is to sample configurations from the Boltzmann distribution, we can normalize the partition function by a factor of and consider the weights instead of . We collectively refer to these properties as the invariance of the Gibbs measure for the six-vertex model.
The phase diagram of the six-vertex model is partitioned into three regions: the disordered (DO) phase, the ferroelectric (FE) phase, and the antiferroelectric (AFE) phase. To establish these regions, introduce the parameter
The disordered phase is the set of parameters that satisfy , and Glauber dynamics is expected to be rapidly mixing in this region because there are no long-range correlations in the system. The ferroelectric phase is defined by , or equivalently when or . We show in this paper that Glauber dynamics can be slow mixing at any point in this region (Figure 1(b)). The antiferroelectric phase is defined by , or equivalently when , and our second result significantly extends the antiferroelectric subregion for which Glauber dynamics is known to be slow mixing. The phase diagram is symmetric over the main positive diagonal, and this follows from the fact that and are interchangeable parameters. To see this, consider the automorphism that rotates each of the six vertex types by ninety degrees clockwise. Under the zero field assumption, this is equivalent to simply rotating the entire model, so we can assume without loss of generality that if a mixing result holds for one point in the phase diagram, it also holds at the point reflected over the main diagonal.
Phase diagram of the six-vertex model with (a) previously known and (b) our current slow mixing regions colored in red. Glauber dynamics is conjectured to be rapidly mixing for the entire disordered phase but has only been shown for the uniform distribution indicated by the green pointin both figures.
Cai, Liu, and Lu [CLL19] recently provided strong evidence supporting conjectures about the approximability of the six-vertex model. In particular, they designed a fully randomized approximation scheme (FPRAS) for a subregion of the disordered phase that works for all 4-regular graphs via the winding framework for Holant problems [McQ13, HLZ16]. They also showed that there cannot exist an FPRAS for 4-regular graphs in the ferroelectric or antiferroelectric phases unless . We note that their hardness result uses nonplanar gadgets and does not imply anything about the six-vertex model on regions of with boundary conditions. A dichotomy theorem for the (exact) computability of the partition function of the six-vertex model on 4-regular graphs was also recently proven in [CFX18]. As for the positive results, Luby, Randall, and Sinclair [LRS01] proved rapid mixing of a Markov chain that leads to a fully polynomial almost uniform sampler for Eulerian orientations on any region of the Cartesian lattice with fixed boundaries (i.e., the unweighted case when ). Randall and Tetali [RT00] then used a comparison technique to argue that Glauber dynamics for Eulerian orientations on lattice graphs is rapidly mixing by relating this Markov chain to the Luby-Randall-Sinclair chain. Goldberg, Martin, and Paterson [GMP04] later extended their approach to show that Glauber dynamics is rapidly mixing on lattice graphs with free boundary conditions.
Recently, Liu [Liu18] gave the first rigorous result that Glauber dynamics for the six-vertex model is slow mixing in an ordered phase by showing that local Markov chains are slow mixing in the antiferroelectric subregion defined by , where is the connective constant for self-avoiding walks on the square lattice (Figure 1(a)). He also showed that the directed loop algorithm is slowly mixing in the same antiferroelectric region and for all of the ferroelectric region, but this has no bearing on the efficiency of Glauber dynamics in the ferroelectric region. We also point out that the partition function of the six-vertex model is exactly computable for all boundary conditions at the free-fermion point when , or equivalently , via a reduction to domino tilings and a Pfaffian computation [FS06]. There is strong evidence that exact counting is unlikely anywhere else for arbitrary boundary conditions [CFX18]. Last, we mention that the approximability of the eight-vertex model has also recently been studied [CLLY18].
1.1 Main Results
In this paper we show that Glauber dynamics for the six-vertex model can be slow mixing in the ferroelectric and antiferroelectric phases. We start by proving that there exist boundary conditions such that for any ferroelectric Boltzmann weights, the mixing time is exponential in the number of vertices in the lattice. This is the first rigorous result for the mixing time of Glauber dynamics in the ferroelectric phase.
Theorem 1.1 (Ferroelectric phase).
For any parameter such that or , there exists boundary conditions for which Glauber dynamics mixes exponentially slowly on .
We note that our proof naturally breaks down at the critical line in a way that reveals a trade-off between the energy and entropy of the system. Additionally, our analysis suggests an underlying combinatorial interpretation for the phase transition between the ferroelectric and disordered phases in terms of the adherence strength of intersecting lattice paths and the momentum parameter of correlated random walks.
Our second mixing result builds on the topological obstruction framework developed in [Ran06] to show that Glauber dynamics with free boundary conditions is slow mixing in most of the antiferroelectric region. Specifically, we generalize the recent antiferroelectric mixing result in [Liu18] with a Peierls argument that uses multivariate generating functions for weighted non-backtracking walks instead of the connectivity constant for (unweighted) self-avoiding walks to better account for the discrepancies in Boltzmann weights.
Theorem 1.2 (Antiferroelectric phase).
For any parameter such that , Glauber dynamics mixes exponentially slowly on with free boundary conditions.
We illustrate the new regions for which Glauber dynamics can be slow mixing in Figure 2. Observe that our antiferroelectric subregion significantly extends Liu’s and pushes towards the conjectured threshold.
We take significantly different approaches in analyzing the ferroelectric and antiferroelectric phases. For the ferroelectric phase, we interpret configurations of the model as intersecting lattice paths and construct boundary conditions that induce polynomially-many paths separated by a critical distance that allows all of the paths to (1) behave independently and (2) simultaneously intersect with their neighbors maximally. From here, we analyze the dynamics of a single path in isolation as an escape probability, which eventually allows us to bound the conductance of the Markov chain. The dynamics of a single lattice path are equivalent to a those of a correlated random walk, so in Section 5 we develop a new tail inequality for correlated random walks that accurately bounds the probability of large deviations from the starting position. We note that decomposing the dynamics of lattice models into one-dimensional random walks has recently been shown to achieve nearly-tight bounds for escape probabilities in different settings [DFGX18].
One of the key technical contributions in this paper is our analysis of the tail behavior of correlated random walks in Section 5. While there is a simple combinatorial expression for the position of a correlated random walk written as a sum of marginals, it is not immediately useful for bounding the displacement from the origin. To achieve an exponentially small tail bound for these walks, we first construct a smooth function that tightly upper bounds the marginals and then optimize this function to analyze the asymptotics of the log of the maximum marginal. Once we obtain an asymptotic equality for the maximum marginal, we can upper bound the deviation of a correlated random walk, and hence the deviation of a lattice path in a configuration. Ultimately, this allows us to show that there exists a balanced cut in the state space that has an exponentially small escape probability, which implies that the Glauber dynamics are slow mixing.
In the antiferroelectric phase, the Boltzmann weights satisfy so type- vertices are preferred. It follows that there are two (arrow-reversal) symmetric ground states of maximum probability containing only type- vertices. To move between configurations that agree predominantly with different ground states, the Markov chain must pass through configurations with a large number of type- or type- vertices. Using the idea of fault lines introduced in [Ran06], we use self-avoiding walks to characterize such configurations and construct a cut set with exponentially small probability mass that separates the ground states. Liu [Liu18] follows this Peierls argument approach and bounds the weight of the cut by separately considering the minimum energy gain of the corresponding inverse map and the number of preimages (i.e., the entropy). Instead, we directly bound the free energy (rather than as a product of the upper bounds for the energy and entropy terms) and are able to show slow mixing for a much larger region of the phase diagram. Our key observation for accurately bounding the free energy is that when a fault lines changes direction, the vertices along it switch from type- to type- or vice versa. Therefore, we introduce the notion of weighted non-backtracking walks and solve their multivariate generating function by diagonalizing a system of linear recurrences to exactly account for disparities between the weights of and along fault lines.
We start with some background on Markov chains, Glauber dynamics, and correlated random walks.
2.1 Markov Chains and Mixing Times
Let be an ergodic, reversible Markov chain with finite state space , transition probability matrix , and stationary distribution . The -step transition probability from states to is denoted as
. The total variation distance between probability distributionsand on is
The mixing time of is . We say that is rapidly mixing if its mixing time is , where is the size of each configuration in the state space. Similarly, we say that is slow mixing if its mixing time is for some constant .
The mixing time of a Markov chain is characterized by its conductance (up to polynomial factors). The conductance of a nonempty set is
and the conductance of the entire Markov chain is . It is often useful to view the conductance of a set as an escape probability—starting from stationarity and conditioned on being in , the conductance is the probability that leaves in one step.
Theorem 2.1 ([Lpw17]).
For an ergodic, reversible Markov chain with conductance , .
To show that a Markov chain is slow mixing, it suffices to show that the conductance is exponentially small.
In this paper we the study single-site Glauber dynamics for the six-vertex model. This Markov chain makes local moves by (1) choosing an internal cell of the lattice uniformly at random and (2) reversing the orientations of the edges that bound the chosen cell if they form a cycle. In the lattice path interpretation of the model, these dynamics correspond to the mountain-valley Markov chain that flips corners. Transitions between states are made according to the Metropolis-Hastings acceptance probability so that the Markov chain converges to the desired stationary distribution.
2.2 Correlated Random Walks
A key tool in our analysis for the ferroelectric phase are correlated random walks, which generalize simple symmetric random walks by accounting for momentum. A one-dimensional correlated random walk with momentum parameter starts at the origin and is defined as follows. Let
be a uniform random variable with support. For all subsequent steps , the direction of the process is correlated with the direction of the previous step and satisfies
We denote the position of the walk at time by . It will often be useful to make the change of variables
when analyzing the six-vertex model. In many cases this also leads to cleaner expressions. We use the following probability density function (PDF) for the position of a correlated random walk to develop a new tail inequality (Lemma 3.4) that holds for all values of .
Lemma 2.2 ([Hf98]).
For any integers and , the PDF of a correlated random walk is
3 Slow Mixing in the Ferroelectric Phase
In this section we use a conductance-based argument to show that Glauber dynamics for the six-vertex model can be slow mixing in all of the ferroelectric region. Specifically, we show that there exist boundary conditions that induce an exponentially small, asymmetric bottleneck in the state space, revealing a natural trade-off between the energy and entropy in the system. Viewing the six-vertex model as an intersecting lattice path model, our approach demonstrates how to plant polynomially-many paths in the grid that can (1) be analyzed independently, while (2) being capable of intersecting maximally. This idea of path independence makes our analysis tractable and allows us to interpret the dynamics of a path as a correlated random walk, for which we develop an exponentially small tail bound in Section 5. Since escape probabilities govern mixing times [PS15], we show how to relate the expected maximum deviation of a correlated walk to the conductance of the Markov chain to prove slow mixing. In addition to showing slow mixing up to the conjectured threshold, a surprising feature of our argument is that it potentially gives a combinatorial explanation for the phase transition from the ferroelectric to disordered phase. In particular, Lemma 3.5 demonstrates how the parameters of the model delicately balance the probability mass of the Markov chain.
Next, we exploit the invariance of the Gibbs measure and the lattice path interpretation of the six-vertex model to conveniently reparameterize the Boltzmann weights. Specifically, we let and so that we can ignore empty sites. Note that . We also let and so that the weight of a configuration only comes from straight segments and intersections of neighboring lattice paths.
3.1 Constructing the Boundary Conditions and Cut
We begin with a few colloquial definitions for lattice paths that allow us to easily construct the boundary conditions and make arguments about the conductance of the Markov chain. We call a -step, north-east lattice path starting from a path of length , and if the path ends at we describe it as tethered. If , we define the deviation of to be . Geometrically, path deviation captures the (normalized) maximum perpendicular distance of the path to the line . We refer to vertices along the path as corners or straights depending on whether or not the path turned. If two paths intersect at a vertex we call this site a cross
. Note that this classifies all vertex types in the six-vertex model.
We consider the following independent paths boundary condition for an six-vertex model for the rest of the section. To construct this boundary condition, we consider its lattice path interpretation. First, place a tethered path that enters horizontally and exits horizontally. Next, place translated tethered paths of varying length above and below the main diagonal, each separated from its neighbors by distance . Specifically, the paths below the main diagonal begin at the vertices and end at the vertices , respectively. The paths above the main diagonal begin at and end at . To complete the boundary condition, we force the paths below the main diagonal to enter vertically and exit horizontally. Symmetrically, we force the paths above the main diagonal to enter horizontally and exit vertically. See Figure 2(a) for an illustration of the construction when all paths have small deviation.111Our mixing result holds more generally for boundary conditions with paths separated by distance whenever we have and . It appears that polynomially many paths are required to push the slow mixing region all the way to the critical line separating the ferroelectric phase from the disordered phase.
Next, we construct an asymmetric cut in the state space induced by this boundary condition in terms of its internal lattice paths. In particular, we analyze a set of configurations such that every path in a configuration has small deviation. Formally, we let
Observe that by our choice of separation distance and the deviation limit for , no paths in any configuration of intersect. It follows that the partition function for factors into a product of partition functions, one for each path with bounded deviation. This intuition is useful when analyzing the conductance as an escape probability from stationarity.
3.2 Lattice Paths as Correlated Random Walks
Now we consider weighting the internal paths according to the six-vertex model. The main result in this subsection is that random tethered paths are exponentially unlikely to deviate past , even if drawn from a Boltzmann distribution that favors straights (Lemma 3.1). Let denote the distribution over tethered paths of length such that
Let and . For sufficiently large and , we have
We defer the proof of Lemma 3.1 to Appendix A. Instead, we sketch its key ideas to (1) demonstrate the connection between biased tethered paths and correlated random walks, and (2) to show how the supporting lemmas interact.
First, observe that there is a natural measure-preserving bijection between biased tethered paths of length and correlated random walks of length that return to the origin. Concretely, for a correlated random walk parameterized by , we have
Now we present an asymptotic equality that generalizes the return probability of simple symmetric random walks. This allows us to relax the condition in Equation 1 that a correlated random walk returns to the origin, and instead we bound at the expense of an additional polynomial factor.
Lemma 3.2 ([Gil55]).
For any constant , the return probability of a correlated random walk is
Another result needed to prove Lemma 3.1 is that the PDF for correlated random walks is unimodal.
For any momentum parameter and sufficiently large, the probability of the position of a correlated random walk is unimodal. Concretely, for all , .
Last, we give an upper bound for the position of a correlated random walk. We fully develop this inequality in Section 5 by analyzing the asymptotic behavior of the PDF in Lemma 2.2. Observe that Lemma 3.4 demonstrates exactly how the tail behavior of simple symmetric random walks generalizes to correlated random walks as a function of .
Let and . For sufficiently large, a correlated random walk satisfies
3.3 Bounding the Conductance and Mixing Time
Now we bound the conductance of the Markov chain by viewing as an escape probability. We start by claiming that (as required by the definition of conductance) if and only if the parameters are in the ferroelectric phase. We defer the proof of Lemma 3.5 to Appendix A. Then we use the correspondence between tethered paths and correlated random walks (Section 3.2) to prove that is exponentially small.
Let and be constants. For sufficiently large, we have .
Our analysis of the escape probability from critically relies on the fact that paths in any state are non-intersecting. Combinatorially, we exploit the factorization of the generating function for states in as a product of independent path generating functions.
Let be constants. For sufficiently large, we have .
The conductance can be understood as the following escape probability. Sample a state from the stationary distribution conditioned on , and run the Markov chain from for one step to get a neighboring state . The definition of conductance implies that is the probability that . Using this interpretation, we can upper bound by the probability mass of states that are near the boundary of in the state space, since the process must escape in one step. Therefore, it follows from the independent paths boundary condition and the definition of that
Next, we use a union bound over the different paths in a configuration and consider the event that a particular path deviates by at least . Because all of the paths in are independent, we only need to consider the behavior of in isolation. This allows us to rephrase the conditional event. Relaxing the conditional probability of each term in the sum gives
For large enough , the length of every path is in the range since we eventually have . Therefore, we can apply Lemma 3.1 with the error to each term and use the universal upper bound
It follows from the union bound and previous inequality that the conductance is bounded by
which completes the proof. ∎
Let and . For sufficiently large, we have .
Last, we restate our main theorem and use Theorem 3.7 to show that Glauber dynamics for the six-vertex model can be slow mixing for all parameters in the ferroelectric phase.
Without loss of generality, we reparameterized the model so that , , and . Therefore, Glauber dynamics with the independent paths boundary condition is slow mixing if by Theorem 3.7. Since the rotational invariance of the six-vertex model implies that and are interchangeable parameters, this mixing time result also holds for . ∎
4 Slow Mixing in the Antiferroelectric Phase
Now we consider the mixing time of Glauber dynamics in the antiferroelectric phase. In this region, the Boltzmann weights satisfy , so configurations tend to favor corner (type-) vertices. The main insight behind our slow mixing proof is that when is sufficiently large, the six-vertex model can behave like the low-temperature hardcore model on where configurations predominantly agree with one of two ground states. Liu recently formalized this argument in [Liu18] and showed that Glauber dynamics for the six-vertex model with free boundary conditions requires exponential time when , where is the connective constant of self-avoiding walks on the square lattice [GC01]. His proof uses a Peierls argument based on topological obstructions introduced by Randall [Ran06] in the context of independent sets. In this section, we extend Liu’s result to the region depicted in Figure 1(b) by computing a closed-form multivariate generating function that upper bounds the number of self-avoiding walks and accounts for disparities in their Boltzmann weights induced by the parameters of the six-vertex model.
4.1 Topological Obstruction Framework
We start with a recap of the definitions and framework laid out in [Liu18]. There are two ground states in the antiferroelectric phase such that every interior vertex is a corner: (Figure 3(a)) and (Figure 3(b)). These configurations are edge reversals of each other, so for any state we can color its edges red if they are oriented as in or green if they are oriented as in . See Figure 3(c) for an example. It follows from case analysis of the six vertex types (Figure 1) that the number of red edges incident to any internal vertex is even, and if there are only two red edges then they must be rotationally adjacent to each other. The same property holds for green edges by symmetry. Note that the four edges bounding a cell of the lattice are monochromatic if and only if they are oriented cyclically, and thus reversible by Glauber dynamics. We say that a simple path from a horizontal edge on the left boundary of to a horizontal edge on the right boundary is a red horizontal bridge if it contains only red edges. We define green horizontal bridges and monochromatic vertical bridges similarly. A configuration has a red cross if it contains both a red horizontal bridge and a red vertical bridge, and we define a green cross likewise. Let be the set of all states with a red cross, and let be the set of all states with a green cross. We have by Lemma 4.1.
Next, we define the dual lattice to describe configurations in . The vertices of are the centers of the cells in , including the cells on the boundary that are partially enclosed, and we connect dual vertices by an edge if their corresponding cells are diagonally adjacent. Note that is a union of two disjoint graphs (Figure 4(a)). For any state there is a corresponding dual subgraph defined as follows: for each interior vertex in , if is incident to two red edges and two green edges, then contains the dual edge passing through that separates the two red edges from the two green edges. This construction is well-defined because the red edges are rotationally adjacent. See Figure 4(b) for an example. For any , we say that has a horizontal fault line if contains a simple path from a left dual boundary vertex to a right dual boundary vertex. We define horizontal fault lines similarly and let be the set of all states containing a horizontal or vertical fault line. Observe that fault lines completely separate red and green edges, and hence are topological obstructions that prohibit monochromatic bridges.
Last, we extend the notion of fault lines to almost fault lines. We say that has a horizontal almost fault line if there is a simple path in connecting a left dual boundary vertex to a right dual boundary vertex such that all edges except for one are in . We define vertical almost fault lines similarly and let the set denote all states containing an almost fault line. Finally, let denote the set of states not in that one move away from in the state space according to the Glauber dynamics.
Lemma 4.1 ([Liu18]).
We can partition the state space into . Furthermore, .
4.2 Weighted Non-Backtracking Walks and a Peierls Argument
In this subsection we show that is an exponentially small bottleneck in the state space . The analysis relies on Lemma 4.1 and a new multivariate upper bound for weighted self-avoiding walks (Lemma 4.2). Our key observation is that when a fault line changes direction, the vertices in its path change from type- to type- or vice versa. Therefore, our goal in this subsection is to generalize the trivial upper bound for the number of self-avoiding walks by accounting for their changes in direction in aggregate. We achieve this by using generating functions to solve a system of linear recurrence relations.
We start by encoding non-backtracking walks that start from the origin and take their first step northward using the characters in , representing straight, left, and right steps. For example, the walk SLRSSL corresponds uniquely to the sequence . If a fault line is the same shape as SLRSSL up to a rotation about the origin, then there are only two possible sequences of vertex types through which it can pass: and . This follows from the fact that once the first vertex type is determined, only turns in the self-avoiding walk (i.e., the L and R characters) cause the vertex type to switch. We define the weight of a fault line to be the product of the vertex types through which it passes. More generally, we define the weight of a non-backtracking walk that initially passes through a fixed vertex type to be the product of the induced vertex types according to the rule that turns toggle the current type. Formally, we let denote the weight of a non-backtracking walk that starts by crossing a type- vertex. We define the function similarly and note that and . Last, observe that a sequence of vertex types can have many different walks in its preimage. The non-backtracking walk SRRSSR also maps to and —in fact, there are such walks in this example since we can interchange L and R characters.
The idea of enumerating the preimages of a binary string corresponding to sequence of vertex types suggests a recursive approach for computing the sum of weighted non-backtracking walks. This naturally leads to the use of generating functions, so overload the variables and to also denote function arguments. For nonempty binary string , let count the number of pairs of adjacent characters that are not equal and let denote the number of ones in (e.g., if then and ). The sum of weighted self-avoiding walks is upper bounded by the sum of weighted non-backtracking walks, so we proceed by analyzing the following function:
Note that recovers the number of non-backtracking walks that initially cross type- or type- vertices. We compute a closed-form solution for in Lemma B.2 by diagonalizing a matrix corresponding to the system of recurrence relations, which allows us to accurately capture the discrepancy between fault lines when the Boltzmann weights and differ. We use the following more convenient upper bound for in our Peierls argument and defer its proof to Appendix B.
Let be the generating function for weighted non-backtracking walks defined in Equation 3. For any integer and , we have
We are now ready to present our Peierls argument to bound , which gives us a bound on the conductance and allows us to prove Theorem 1.2. First, we describe which antiferroelectric parameters cause to decrease exponentially fast. We defer the proof of this computation to Appendix B.
If is antiferroelectric and , then .
If is antiferroelectric and , for free boundary conditions we have
For any self-avoiding walk and dual vertices on the boundary, let be the set of states containing as a fault line or an almost fault line such that starts at and ends at . Without loss of generality, assume that the (almost) fault line is vertical. Reversing the direction of all edges on the left side of defines the injective map such that if is a fault line of , then the weight of its image is amplified by or . See Figure 4(c) for an example. Similarly, if is an almost fault line, decompose into subpaths and separated by a type- vertex such that starts at and ends at . In this case, the weight of the images of almost fault lines is amplified by a factor of for some . Using the fact that is injective and summing over the states containing as a fault line and an almost fault line separately gives us
where the sum is over all decompositions of into and .
Equipped with Equation 4 and Lemma 4.2, we use a union bound over all pairs of terminals and fault line lengths to upper bound in terms of our generating function for weighted non-backtracking walks . Since the antiferroelectric weights satisfy , it follows from Lemma 4.3 that
Note that the convolutions in the first inequality generate all almost weighted non-backtracking walks. ∎
Proof of Theorem 1.2.
Let , , and . It follows from Lemma 4.1 that is a partition with the properties that and . Since the partition is symmetric, Lemma 4.4 implies that , for sufficiently large. Therefore, we can upper bound the conductance by . Using Theorem 2.1 along with Lemma 4.4 and Lemma 4.3 gives the desired mixing time bound. ∎
5 Tail Behavior of Correlated Random Walks
In this section we prove Lemma 3.4, which gives an exponentially small upper bound for the tail of a correlated random walk as a function of its momentum parameter . Our proof builds off of the PDF for the position of a correlated random walk restated below, which is combinatorial in nature and not readily amenable for tail inequalities. Specifically, the probability is a sum of marginals conditioned on the number of turns that the walk makes [RH81].
There are two main ideas in our approach to develop a more useful bound for the position of a correlated random walk . First, we construct a smooth function that upper bounds the marginals as a function of (a continuation of the number of turns in the walk ), and then we determine its maximum value. Next we show that the log of the maximum value is asymptotically equivalent to for , which gives us desirable bounds for sufficiently large values of . We note that our analysis illustrates precisely how correlated random walks generalize simple symmetric random walks and how the momentum parameter controls the exponential decay.
5.1 Upper Bounding the Marginal Probabilities
We start by using Stirling’s approximation to construct a smooth function that upper bounds the marginal terms in the sum of the PDF for correlated random walks. For , let
It can easily be checked that is continuous on all of using the fact that .
For any integer , a correlated random walk satisfies
Consider the probability density function for in Lemma 2.2. If the claim is clearly true, so we focus on the other case. We start by bounding the rightmost polynomial term in the sum. For all , we have
Next, we reparameterize the marginals in terms of , where , and use a more convenient upper bound for the binomial coefficients. Observe that
Stirling’s approximation states that for all we have
so we can bound the products of binomial coefficients up to a polynomial factor by
The proof follows the definition of given in Equation 5. ∎
There are polynomially-many marginal terms in the sum of the PDF, so if the maximum term is exponentially small, then the total probability is exponentially small. Since the marginal terms are bounded above by an expression involving , we proceed by maximizing on its support.
The function is maximized at the critical point
We start by showing that is log-concave on , which implies that it is unimodal. It follows that a local maximum of is a global maximum. Since and are fixed as constants and because the numerator is positive, it is sufficient to show that
is concave. Observe that the first derivative of is
and the second derivative is
Because on , the function is log-concave and hence unimodal.
To identify the critical points of , it suffices to determine where since is increasing. Using the previous expression for , it follows that
Therefore, the critical points are the solutions of , so we have
It remains and suffices to show that is a local maximum since is unimodal. Observing that
using the chain rule, the definition ofgives
We know , so has the same sign as . Therefore, is a local maximum of . Using the continuity of on and log-concavity, is a global maximum. ∎
It is worth noting that for , the asymptotic behavior of the critical point is continuous as a function of . In particular, it follows from Lemma 5.2 that .
5.2 Asymptotic Behavior of the Maximum Log Marginal
Now that we have a formula for , and hence an expression for , we want to show that
for some constant . Because there are polynomially-many marginals in the sum, this leads to an exponentially small upper bound for . Define the maximum log marginal to be
Equivalently, we show that for sufficiently large using asymptotic equivalences.
The maximum log marginal can be symmetrically expressed as
Grouping the terms of by factors of , and gives
Using Equation 6, observe that the last term is
The proof follows by grouping the terms of the desired expression by factors of and . ∎
The following lemma is the crux of our argument, as it presents an asymptotic equality for the maximum log marginal in the PDF for correlated random walks. We remark that we attempted to bound this quantity directly using Taylor expansions instead of an asymptotic equivalence, and while this seems possible, the expressions are unruly. Our asymptotic equivalence demonstrates that second derivative information is needed, which makes the earlier approach even more unmanageable.
For any and , the maximum log marginal satisfies .