A Combined Molecular Simulation-Molecular Theory Method Applied To A Polyatomic Molecule In A Dense Solvent

Simulation of small molecules, polymers, and proteins in dense solvents is an important class of problems both for processing of materials in liquids and for simulation of proteins in physiologically relevent solvent states. However, these simulations are expensive and sampling is inefficient due to the ubiquitous dense solvent. Even in the absence of the dense solvent, rigorous sampling of the configurational space of chain molecules and polypeptides with traditional Metropolis Monte-Carlo, or molecular dynamics is difficult due to long time scales associated with equilibration. In this paper we discuss a series of configurational-bias Monte-Carlo (CBMC) simulations that use a rigorous molecular theory based implicit solvent to achieve efficient sampling of a chain molecule in a dense liquid solvent. The molecular theory captures solvent packing around the chain molecule as well as the energetic effects of solvent-polymer interactions. It also accounts for entropic effects in the solvent.


Introduction
Simulation of molecules with implicit solvent has become an important computational strategy employed to increase the efficiency of calculations by effectively integrating out the solvent degrees of freedom [1][2][3][4][5].These approaches have been detailed extensively elsewhere [6,8], and so we give only a brief summary.
One class of models assumes that the solvation free energy is proportional to the solvent accessible surface area (SASA) of the solute molecule of interest [9][10][11].The Generalized Born (GB) method extends analytic approximations for solvation energies of ions to macromolecules [2,11,12].Similarly, the screened coulomb potential (SCP) models combine estimates for electrostatic energy with SASA based approximation for nonpolar solvation energy [13].Finally, a 3-dimensional solution to Poisson's equation can be obtained in the vicinity of a solute molecule [8,12,14,15].This approach is much more computationally demanding than the heuristic approaches cited above, but has the advantage of producing a unique and detailed 3-dimensional picture of solvation as a function of protein conformation.Even so, this approach is based on a continuum approximation for the dense solvent (water).
This letter presents implicit solvent simulations that focus on the short range structural solvation effects ignored in virtually all of the implicit solvent methods cited above.To this end we consider a simple neutral system consisting of a 10-bead tangent site chain immersed in a solvent.We efficiently sample chain conformations using a configurational-bias Monte Carlo (CBMC) approach.We compare the predictions of explicit solvent and 3 different implicit solvent models including a continuum solvent, a pair solvation potential model, and a 3-dimensional (3D) solvation model based on density functional theory (DFT).

Theory
In this section we briefly review the statistical mechanics underpinnings for implicit solvent approaches.Our derivation was originally developed in the context of colloidal stability [16], and differs from others that have been presented in the literature [6] in that we consider a semi-grand rather than a canonical ensemble.We begin with the partition function for a system that is held at constant volume, V and temperature, T , and that is closed with respect to the number, N, of solute molecules in the system while it is open with respect to the number, n of solvent molecules in the system.In this case, the solvent chemical potential is µ and the semi-grand partition function is where U N denotes the intramolecular interactions of the solute molecule while u n includes both the solvent-solvent and solvent-solute interactions.The term in the curly brackets above can be identified immediately as the grand canonical partition function for a fluid in an external field (Ξ), and Ξ is related to the grand potential of that inhomogeneous fluid system by Ω(r N ; µ, T ) = −β −1 ln Ξ µV T .The partition function in equation 1 can be rewritten as where the configurational integral is now only over the configurations of the solute molecule, and the solvent is captured by the effective potential, Ω.Note that Ω is a free energy and therefore includes both energetic and entropic contributions of the fluid phase.This brief derivation provides a firm statistical mechanical basis for implicit solvation in molecular simulations.
The solvation free energy, Ω is intimately related to the potential of mean force (PMF).Specifically, the PMF is W (r) = Ω(r N ) − Ω ref where Ω ref is the free energy of a suitable reference system [6].If chosen properly, the reference system is irrelevant to the simulation because it is a constant shift to the free energy.While the derivation presented above is based on the statistical mechanics of inhomogeneous fluids, a much older derivation of the PMF can be found in the literature.In its earliest description, McMillian and Mayer derived a PMF for electrolytes from a consideration of solutions [7].In their derivation, the PMF is where g(r) is the radial distribution function.In the McMillian-Mayer treatment, the reference system is two atoms (or ions) separated by r = ∞ in the solvent (electrolyte) of interest.That choice leads to the intuitive limit W (r) → 0 (or g(r) → 1) as r → ∞.While this is the intuitive choice for a reference system when one considers the interaction of two monatomic particles in a solvent or electrolyte, it is not the obvious choice for a solvated polyatomic chain.In this case the PMF describes solvent mediated intramolecular interactions.There is no particular interest in computing these PMFs relative to say the sum of the solvation free energies of the individual atoms that make up the chain.Rather, we find that the convenient reference system for the polyatomic case is a pure solvent.Thus our PMF is a measure of the surface excess free energy, Ω ex .We note that while it is not straightforward to directly apply equation 3 to the simulation of solvated polyatomic molecules, there is a rather obvious approximation that we will consider for comparison with the 3D implicit solvation approach developed here.This approximation is based on interactions of pairs of atoms on the polyatomic chain.To apply this approach, we first compute g(r) for an atomic fluid of Lennard-Jones spheres at the density of the solvent of interest in the chain simulations.We then assume that the total solvation potential for a chain composed of those same Lennard-Jones spheres is where the sum runs over all pairs j = i, j = i + 1, and j = i − 1, ignoring nearest neighbors on the tangent chain.Before presenting results for the polyatomic solute, we briefly consider a simpler system where the PMF for both McMillian-Mayer theory and the 3D solvation approach presented here can be computed.Specifically, consider a simple Lennard-Jones fluid.The McMillian-Mayer PMF was computed from equation 3 where the radial distribution function was computed from a Monte-Carlo simulation of the atomic system.The 3D solvation PMF was computed via a series of 3D-DFT calculations where two atoms from the fluid were treated explicitly as surfaces in the calculation.The one-body external field generated by these two atoms was simply based on the same Lennard-Jones potential used to describe fluid interactions.The distance between these two explicit atoms was varied to find W (r) = Ω ex (r) − Ω ex (∞).The PMF computed using Monte-Carlo simulation (solid line), and the PMF computed using a series of 3-dimensional DFT calculations (symbols) as described in the main text.The fluid system was an atomic Lennard-Jones fluid with a cutoff distance of r c /σ = 2.5, at a temperature of kT / = 0.8, and a density of ρσ 3 = 0.75.

CBMC-DFT simulations
This paper presents results for both implicit and explicit solvent simulations.The specific numerical implementations have been detailed elsewhere [17,18].In all cases, interparticle interactions are described by Lennard-Jones interactions that are cut and shifted at 2σ where σ is the characteristic size of the beads on the tangent chain as well as the size of the solvent particles.Furthermore, the energy parameter for all interactions is = 1.
The implicit solvent simulations were run on the Cplant distributed memory parallel computer located at Sandia National Labs.The simulations were on an average of 32 processors and required approximately 3 days to complete.While these simulations appear expensive, the true cost of a simulation is measured by both the CPU time and the efficiency of sampling.If trial moves are rarely accepted, poor statistical sampling of the configuration space occurs and a fast algorithm is irrelevant.We return to the question of efficiency in the next section.
In the implicit 3D solvation simulations presented here, a 3D-DFT calculation was performed at each Monte-Carlo move attempt, and 5,000 total Configurational-Bias Monte Carlo (CBMC) moves were performed.Each move consisted of an attempted chain regrowth along with a calculation of the DFT based potentials of mean force on which acceptance criteria were based.CBMC generates trial moves in a biased manner that favors lower energy conformations and then removes this bias by utilizing the ratio of the new and old Rosenbluth weights as the acceptance probability.A coupled-decoupled CBMC algorithm [19] was utilized with 10 nonbond trials coupled to 360 dihedrals trials and decoupled from 1000 trials for the bending angles.The implicit solvation term was ignored during the CBMC growth procedure, computed once the growth was complete, and incorporated into the acceptance criteria as where W R is the total Rosenbluth weight for the CBMC growth procedure.Explicit solvent simulations were run for 10,000 CBMC cycles, and required approximately 30 hours on a single processor.An NPT Gibbs ensemble mimic of the semi-grand canonical ensemble was used in these simulations with 2500 solvent molecules.The DFT we use here is a perturbation approach that includes volume exclusion and van der Waals attraction effects.Volume exclusion is treated with Rosenfeld's fundamental measures theory [20,21], and the attractions are treated in a strict mean field approximation.Specifically, we write where the first term is the ideal contribution, the second term is the excess hard sphere term as described by Rosenfeld's FMT theory [21], and the third term gives the attractions where the Weeks-Chandler-Anderson approximation [22] is used to define the attractive potential, u a .The external field, V (r) is given by interactions of a solvent particle with the chain in a given conformation, where the "cs" subscript denotes the chain-solvent interaction.
In the DFT calculation, Ω[ρ(r)] in equation 6 is minimized with respect to the density distribution, ρ(r).Given the equilibrium solution for ρ(r), the grand free energy, Ω is then calculated from equation 6.The numerical methods behind our 3D-DFT implementation have been described elsewhere [23,24].Here we set the cartesian mesh size to σ/4.Furthermore, we assume that some distance from the chain, t, the fluid is uniform.Beyond this distance (for |r| > t) we set ρ(r) = ρ b .At every node, i in this uniform region, we replace the complex DFT residual equations with the discretized residual ρ i − ρ b = 0 when forming the matrix problem.For the calculations presented in this paper, we have taken this distance to be t = 1.25σ from the surface of the molecule.At this distance, we find that the errors in free energy difference with conformation change are less than 30% of the t → ∞ limit as shown in figure 2. The coarse mesh and the rather small t facilitate DFT solution fast enough O(100 seconds) for coupling with CBMC.One example of a 3D-DFT solution is shown in figure 3 where the dark regions indicate positions of the segments on the chain, and the light regions correspond to a density isosurface that indicates regions of adsorption of the solvent on the chain.

Results
We present here simulations that vary the fluid from a low density gas-like state (ρσ 3 = 0.1) to a high density liquid-like state (ρσ 3 = 0.7) all at kT / = 1.While the explicit method is reasonably efficient at generating polymer conformations at low solvent density, the liquid-like solvent densities are much more difficult.Table 1 compares the acceptance statistics of the implicit (Im) and explicit (Ex) solvent methods at the liquid-like density of ρσ 3 = 0.7.The table shows both the acceptance rates and the ratio of those rates as a measure of the increased efficiency of the implicit solvent method from the perspective of sampling alone.In this dense solvent the implicit solvent method is O(100) times slower than the explicit method from a CPU perspective, yet it is O(1000+) more efficient than the explicit solvent simulation with respect to actual sampling of the interior atom conformations of the molecule.A complete demonstration of the method also requires a comparison of Figure 4 shows the radius of gyration of the chain as a function of solvent density.In addition to the results for the explicit solvent and the implicit 3D solvation methods, the figure also shows the results for the pair potential approach of equation 4, and a continuum solvent approach that uses the same CBMC-DFT mechanics, but sets the solvation radius to t = 0 thus assuming that there is bulk solvent everywhere around the chain.This is equivalent to an incompressible fluid approximation [25].At low densities where good acceptance statistics are obtained from all methods we find that the 3D solvation method presented here agrees well with the explicit solvent CBMC calculations.At higher densities, there is a deviation in the predictions of the 3D solvation model from the fully explicit model.However, in the dense solvent, there are large error bars associated with the explicit solvent CBMC calculations due to poor acceptance of chain regrowth.Thus we conclude that the implicit solvent approach yields more accurate results than the explicit solvent approach.The pair potential approach of equation 4 deviates from the 3D solvation approach in underpredicting the radius of gyration at all densities.At ρσ 3 = 0.1 the pair potential approach matches the results from the continuum fluid model.Since the pair potential does not accurately treat the additive effects due to multiple chain segments interacting with a particular region of fluid, deviation from the full 3D solvation model is to be expected.If anything, this approximation seems to work better than might be expected from such a simple approach.It remains to be seen if this is a general result or is specific to this case where the polymer and fluid are composed of identical segments.
In contrast, the continuum solvent model underpredicts the radius of gyration for all the densities considered here.More importantly, the incompressible fluid model is qualitatively wrong in that it predicts decreasing radius of gyration with increasing solvent density.

Conclusions
In this work we present the first combination of CBMC with an accurate 3D-DFT based molecular theory approach where the DFT calculations are performed on-the-fly for every trial configuration generated by the CBMC algorithm.We test the approach with simulations of a 10-mer chain in solvents of varying density.These implicit solvent CBMC-DFT calculations exhibit a significantly improved efficiency with respect to sampling chain conformations over explicit solvent CBMC calculations of the same system.

Figure 1 Figure 1 .
Figure1demonstrates that these two routes produce identical results, and confirms the accuracy of our 3D-DFT numerical implementation.

Figure 2 .
Figure 2. The deviation of the solvation free energy from the t → ∞ limit for three pairs of configurations.

Figure 3 .
Figure 3.A density isosurface at ρσ 3 = 1.5 that indicates regions of adsorption on the chain on particular 3D conformation.

Figure 4 .
Figure 4. CBMC results for radius of gyration as a function of solvent density at kT / = 1.The curves show explicit solvent (inverted triangles, wide error bars), a 3D implicit solvent (circles, narrow error bars), pair potential solvation (open squares), and continuum solvent (diamonds) models.For the last two cases, the error bars are smaller than the symbols in the figure.

Table 1 .
Acceptance statistics for explicit and implicit solvent CBMC calculations of a solvated chain at a high temperature (kT / = 1) and a high solvent density (ρσ 3 = 0.7).Regrowth statistics for various chain lengths attempted in the CBMC simulation are shown.