Comparison of the TIP4P-2005, SWM4-DP and BK3 interaction potentials of liquid water with respect to their consistency with neutron and X-ray diffraction data of pure water

Following a fairly comprehensive study on popular interaction potentials of water (Pusztai et al., J. Chem. Phys., 2008, 129, 184103), here two more recent polarizable potential sets, SWM4-DP (Lamoureux et al., Chem. Phys. Lett., 2006, 418, 245) and BK3 (Kiss et al., J. Chem. Phys., 2013, 138, 204507) are compared to the TIP4P-2005 water potential (Abascal et al., J. Chem. Phys., 2005, 123, 234505) that had previously appeared to be most favoravble. The basis of comparison was the compatibility with the results of neutron and X-ray diffraction experiments on pure water, using the scheme applied by Pusztai et al. (2008). The scheme combines the experimental total scattering structure factors (TSSF) and partial radial distribution functions (PRDF) from molecular dynamics simulations in a single structural model. Goodness-of-fit values to the O-O, O-H and H-H simulated PRDF-s and to the experimental neutron and X-ray TSSF provided a measure that can characterize the level of consistency between interaction potentials and diffraction experiments. Among the sets of partial RDF-s investigated here, the ones corresponding to the SWM4-DP potential parameters have proven to be the most consistent with the particular diffraction results taken for the present study, by a hardly significant margin ahead of BK3. Perhaps more importantly, it is shown that the three sets of potential parameters produce nearly equivalent PRDF-s that may all be made consistent with diffraction data at a very high level. The largest differences can be detected in terms of the O-O partial radial distribution function.


Introduction
When assessing the results of computer simulations that use interaction potentials, comparison with experimental data is essential. For the microscopic structure the proper quantity to compare with is the total scattering structure factor (TSSF), F (Q), that is related to the partial radial distribution functions (PRDF) via:  In the above equations, c i and b i are the molar ratio and the neutron scattering length of species i, G(r) is the total radial distribution function, ρ 0 is the number density and Q is the scattering variable (proportional to the scattering angle); indices i and j run through nuclear species. (For details of the formalism used here, see reference [1]).
Pure liquid water is arguably the system that has been targeted by the greatest number of computer simulation studies of all pure materials (see, e.g., works of Prof. Holovko on the subject [2,3]), resulting in an excessive number of interaction potential parameter sets. The importance of the issue is reflected by the fact that new potential parameters have been introduced even in this very calendar year [4]. Reviews describing many of the available interaction potential models introduced for liquid water are, for instance, references [5,6]. Unfortunately, the overwhelming majority of computer simulation studies show comparisons with the 'experimental' PRDF-s only [4,7,8], whose functions have been shown to be only interpretations of the measured diffraction data [9][10][11][12]. The proper quantity to cross-check with would be the direct experimental information, the total scattering structure factors. In order to simply resolve this (mostly, technical) issue, a simple protocol was introduced some years ago [10], that was later applied to the case of pure water [11]. The main finding of this latter investigation was, in short, that out of the 8 water potential models considered therein, it was TIP4P-2005 [8] that proved to be consistent at the highest level with a given neutron diffraction dataset [13] taken on pure liquid heavy water.
In this work, we wish to extend the investigation described in reference [11], by (a) adding two recent polarizable water potential models, SWM4-DP [7] and BK3 [4]; the reasons for picking exactly these two potential parameter sets was that, on the one hand, SWM4-DP could be very successfully applied to structural studies of various concentrated aqueous electrolyte solutions [14,15], and, on the other hand, the very recent BK3 set is claimed to be capable of outperforming most water potentials in a detailed comparison with a great number of properties [4]; (b) scrutinizing the compatibility not only with neutron-, but also with X-ray diffraction data.
Here, results for SWM4-DP and BK3 will be compared to those obtained for TIP4P-2005; this is the way we wish to find a direct link to our earlier study [11]. As experimental data, the same neutron diffraction results as considered previously are taken from reference [13] on liquid heavy water together with one of the most recent published sets of X-ray data of Fu et al. [16].

Reverse Monte Carlo modelling
The protocol mentioned above [10] is based on a now 25 year old technique of structural modelling, the so-called Reverse Monte Carlo (RMC) method [17] and, therefore, a short description of the method may be appropriate here.
Reverse Monte Carlo [17][18][19][20][21] is a simple tool for constructing large, three-dimensional structural models that are consistent with total scattering structure factors (within the estimated level of their errors) obtained from diffraction experiments. Via random movements of particles, the difference between experimental and model total structure factors (calculated similarly to the χ 2 -statistics) is minimized. As a result, by the end of the calculation, a particle configuration is available that is consistent with the experimental structure factor(s). If the structure is analyzed further, partial radial distribution functions, as well as other structural characteristics (neighbor distributions, cosine distribution of bond angles) can be calculated from the particle configurations.
A possible algorithm that can realize the above features may be outlined as follows [17]: 1. Start with an initial collection of particle coordinates in a cubic box; this may be a crystalline or a random distribution of at least a few thousand of particles, or even the final particle distribution from the previous simulation.
2. Calculate the partial radial distribution functions for the configuration. Compose total radial distribution functions according to the experimental weighting factors. Use Fourier transformation for calculating total scattering structure factors.
3. Calculate differences between model and experimental functions as follows (shown here for one single TSSF): The 'C' and 'E' superscripts refer to 'calculated' and 'experimental' functions, respectively; σ is a control parameter that is related to the assumed level of experimental errors. 4. Move one particle at random.
6. If the χ 2 for the new position is smaller than it was for the old position (i.e., the difference between simulated and measured TSSF-s has become smaller), then accept the move immediately. Otherwise accept the move only with a probability that is proportional to exp(−∆χ 2 ); accepting 'bad' moves with such small but finite probability will prevent calculations from sticking in local minima. If a move is 'accepted' then the 'new' position becomes the 'old' one for the next attempted move.

Continue from step 4.
The most valuable feature of the RMC method concerning the present investigation is that it can incorporate any piece of information that can be calculated directly from particle coordinates. Partial radial distribution functions coming from computer simulations are obviously these types of data as well as the measured total structure factors. If a consistency (i.e., agreement within errors) with all input data is reached, then it may be stated that these input data are mutually consistent. On the other hand, if some of the input data cannot be approached within their uncertainties, it means that a particular part of the input data set is not consistent with the other pieces of input information. In our case, this would mean that some of the input PRDF-s from MD simulations would not be consistent with the experimental input total scattering structure factor(s).
In the RMC calculations that are the basis of the present work, one total scattering structure factor from neutron diffraction [13], one TSSF from X-ray diffraction [16] and three partial radial distribution functions (O-O, O-H and H-H) from references [4,7,8] are used as input data for each RMC calculation. That is, in conjunction with each of the 3 potential parameter sets [4,7,8] RMC is performed with the requirement that 5 input datasets (neutron TSSF, X-ray TSSF, and O-O, O-H and H-H PRDF-s from molecular dynamics simulations described in references [4,7,8]) should be approached within the errors simultaneously. The primary condition assumed that experimental data must be reproduced at the same level as they are approached in the absence of PRDF-s from MD; this goal could be achieved by systematically varying the σ control parameters (see above) for each individual dataset. On average, at least 5 independent calculations were needed to find the proper balance between individual σ parameters; thus, the total number of calculations reported in this study is over 40 (see table 1).
Both experimental and simulation data were obtained at 298 K and 0.1 MPa (ambient conditions); the molecular number density was 0.0334 molecules Å −3 in each case. Molecular dynamics simulations that resulted in the PRDF-s used in this work were performed in the canonical (NVT) ensemble [4,7,8] at ambient conditions. Since the number of particles and the volume remain unchanged during a RMC calculation, the NVT ensemble is a rather natural analogue for our Reverse Monte Carlo computations as well (with the notion that temperature appears only implicitly, via the experimental number density and diffraction data applied). In each Reverse Monte Carlo run, the simulation box contained 2000 water molecules (6000 atoms). Typically, 'equilibration' (i.e., reaching the state where the fits to input data have not improved further and started to fluctuate) last for a few million accepted moves. Shorter refinement calculations (e.g., when small modifications of σ parameters for only one or two datasets were made) were run for about 200000 accepted moves. Goodness-of-fit values, R w -s (which are sums of the squared differences, see below), are reported in a normalised form, so that variations in terms of the number of r and Q points considered would not affect the assessment; additionally, the applied r and Q ranges were kept as uniform as possible.
Definitions of the various R w -s used throughout this contribution are provided below (for PRDF-s, only the example of the O-O g (r ) is shown): where N i and N j are the numbers of Q and r points, respectively, for the experimental TSSF-s and 'experimental' (MD simulated) g (r )-s, respectively. Indices 'C' and 'E' refer to 'RMC calculated' and 'experimental' quantities. In table 1 below, the square roots of the left hand side expressions are shown in the units of '%'. Table 1 summarizes the main findings of the present work, namely the goodness-of-fit values for each calculation reported here. It is instructive to look at figure 1 in parallel, so that the level of consistency with experimental TSSF-s may be appreciated in terms of both numbers and 'visual inspection'. It is clear that each PRDF-containing RMC run produced a full agreement, within the estimated error levels and without any visible sign of any disagreement, with both sets of diffraction data.  [4,7,8], together with neutron diffraction TSSF of heavy water [13] (part a) and X-ray diffraction TSSF of light water [16] (part b). Note that each curve runs together with the experimental results, indicating the agreement without any visible deviations from experiment.

Results and discussion
In table 1, the R w values for the reference RMC runs, containing either PRDF-s only or TSSF-s only, are also quoted. Clearly, the modelling of TSSF-s and PRDF-s together causes the deterioration of the goodnessof-fit; differences between R w -s obtained for 'combined' (TSSF+RSDF) and 'PRDF-only' calculations can give an idea as to how far from the MD results the best match to the experimental TSSF-s lies. In this sense, SWM4-DP PRDF-s seem to suffer the largest deviation from the end results of MD simulations [7], whereas BK3 PRDF-s [4] should have changed the least.
The level of consistency between a given interatomic potential and the two sets of experimental TSSFs is measured by the overall R w values ('R w sum' in table 1) of the 'combined' Reverse Monte Carlo calculations. According to this measure, it is the SWM4-DP potential [7] that comes out as the best (lowest 'R w sum'), but only by a very small gap ahead of BK3 [4]. TIP4P-2005 [8] performs noticeably worsebut still, the lag behind the two polarizable models is quite small. Considering that TIP4P-2005 was the clear 'winner' in the previous study (although using neutron diffraction data only) [11], the improvement of water potentials in terms of the structure is remarkable (even though only ambient pure water has been investigated here).  table 1). This is very different from the findings reported in reference [11], where the O-H partial appeared to be the most problematic. In this respect, the fact that in reference [11] only one experimental neutron TSSF was used makes a huge difference: the relevant contribution ('weighting factor') of the O-O PRDF to the neutron-weighted TSSF of pure heavy water is somewhat less than 10 %. That is, the neutron data considered [13] are rather insensitive to the variations of the O-O PRDF. The appearance of X-ray diffraction data makes a huge difference: such data for water are sometimes evaluated so that purely the O-O contribution (in the form of a 'center-of-mass TSSF' [16,23]) is quoted as final results. That is, the TSSF from X-ray diffraction experiments poses a very strong constraint (and consequently, leaves very little freedom) for the O-O partial. Since there is still a controversy about the most appropriate way of measuring and correcting X-ray diffraction data on water (see, e.g., reference  [13] and X-ray diffraction TSSF of light water [16]. Left hand panels: worst cases; right hand panels: best cases. Note that differences between 'worst' and 'best' are hardly visible.

43604-5
[24]), the issue should be investigated further in more detail, involving old [23] and new [24] results alike. Figure 3 compares PRDF-s resulting from MD simulations of the three water potentials [4,7,8] considered in this work. From a distance (even from a short one), the corresponding partials look remarkably similar to each other, much more so than reported by figure 3 of reference [11]. Therefore, it is quite in order to diagnose that over the past few years, the two-particle level structure of ambient liquid water brought about by various modern interatomic potential parameter sets seems to be converging. However, it is still worthwhile emphasizing some of the clear (although small) differences that may help develop the potentials even further on. The most successful O-O PRDF (see table 1), from BK3, appears to be more structured than its SWM4-DP equivalent but has a clearly weaker first maximum than that from TIP4P-2005. Also note that, unlike for the other two partials, there is no dispute about the position of the main peak here. As concerns the O-H PRDF, SWM4-DP turns out to be most appropriate: this potential produces the longest H-bonding distance (which is the first intermolecular O. . . H distance). This is consistent with the findings of references [9,11]. Also, SWM4-DP causes the most definite oscillations at higher r values.
Finally, for the H-H PRDF SWM4-DP, that seems to be somewhat better than the other two potentials, brings about the longest first intermolecular H. . . H distance, again, with a little better defined ordering at larger distances. In summary, it seems surprising that for all the three PRDF-s, the potential that produced the strongest long range ordering (by however small a margin) always turned out to be the best performer. g OO (r ); mid panel: g OH (r ); bottom panel: g HH (r ). Note that differences between corresponding partials are much smaller than reported in reference [11].

Conclusions and outlook
It has been shown that three modern interaction potential parameter sets for liquid water, TIP4P-2005 [8], SWM4-DP [7] and BK3 [4], are nearly equally consistent with the measured neutron [13] and X-ray [16] diffraction data of liquid water at ambient conditions (see figure 2). Meticulous comparison (see table 1) reveals that a distinction may be made between the 2 polarizable potentials (SWM4-DP and BK3) and the non-polarizable one (TIP4P-2005): in terms of the structure, these up-to-date polarizable models perform somewhat better than the 'champion' of the previous, analogous investigation [11].