Fluid of fused spheres as a model for protein solution

In this work we examine thermodynamics of fluid with"molecules"represented by two fused hard spheres, decorated by the attractive square-well sites. Interactions between these sites are of short-range and cause association between the fused-sphere particles. The model can be used to study the non-spherical (or dimerized) proteins in solution. Thermodynamic quantities of the system are calculated using a modification of Wertheim's thermodynamic perturbation theory and the results compared with new Monte Carlo simulations under isobaric-isothermal conditions. In particular, we are interested in the liquid-liquid phase separation in such systems. The model fluid serves to evaluate the effect of the shape of the molecules, changing from spherical to more elongated (two fused spheres) ones. The results indicate that the effect of the non-spherical shape is to reduce the critical density and temperature. This finding is consistent with experimental observations for the antibodies of non-spherical shape.


Introduction
Aggregation of proteins in solution is both desired and undesired. It represents the first step in the downstream processing, i.e., salting out of the proteins for the purpose of cleaning and application. It is also one of the intermediate steps in the process of protein crystallization. The unwanted, pathological, protein aggregation is known to cause several diseases. Very importantly, bio-pharmaceutical drugs should be free of aggregates, otherwise they may be harmful. To increase the stability of protein in aqueous solutions is, therefore, an important technical problem. For an excellent review of the theoretical and experimental studies of protein solutions see reference [1].
The class of proteins we are interested in here are the so-called globular proteins. A typical representative of this class is lysozyme, which was extensively studied both experimentally and theoretically (see for example [1], Chapter 9). Despite their complicated 3D structure, many properties of protein solutions can be successfully described using relatively simple models [2][3][4][5][6][7][8]. Globular proteins are in their native form (we assume that during the experimental treatment they do not denature) most often pictured as perfectly spherical objects. This naïve representation is in reality never satisfied, it is used merely to simplify the calculations. There is a large list of non-spherical proteins, for example the shape of lysozyme mentioned above is ellipsoidal, including antibodies, lactoferrin, and others, where more complex geometry of particles would need to be taken into account to generate realistic results. This is important because the interactions leading to protein aggregation are directional and of short-range.
The shape of protein molecules influence their mutual interaction directly and indirectly. For example, (i) depletion interaction is largely dependent on the shape of particles [9]. (ii) Experimentally, it is observed that many of proteins with roughly spherical shape exhibit upper critical solution temperature at protein concentration equal to 240 g/L [1,10,11]. In contrast to that, Y-shaped antibodies exhibit the shift toward much lower values, way down to 100 g/L [12,13]. (iii) The hydrodynamic radii of the nonspherical objects are different, therefore, their hydrodynamic and transport coefficients [14], as well as, kinetic parameters [15] are modified. It is also known that classical nucleation theory has difficulties in describing the crystallization of other than spherical (for example ellipsoidal) particles [1,16].
Recently, we used a simple spherical model [8] to analyze experimental results for the cloud-point temperatures in aqueous protein solutions with added salts [10,11]. We modelled the solution as a onecomponent system; the protein molecules were represented as perfect spheres, embedded in the continuum solvent composed of water, buffer, and various simple salts. The attractive short-range interactions between the proteins were due to the square-well sites located on the surface of protein molecules. The model was numerically evaluated using Wertheim's perturbation theory [17][18][19]. The short-range and directional nature of the interactions among proteins led to good agreement with the experimental data for the liquid-liquid phase diagram in case of lysozyme and γ-crystallin solutions [10,11]. With knowledge of the experimental cloud-point temperature as a function of composition of electrolyte present in the system, the model gave predictions for the liquid-liquid coexistence curves, the second virial coefficients, and other similar properties under such experimental conditions.
One weakness of the model presented above was its simplified geometry. Neither lysozyme nor other proteins are spherical, and some of them for example, lactoferrin [20] look more like two fused spheres. The other weakness was that we assume for protein molecules to exist in form of monomers, which is not true. Even in very dilute solutions, proteins can be at least partially dimerized. The purpose of the present work is to investigate how the relaxing of these two basic assumptions influence the liquid-liquid coexistence curve.
The models for the association of spherically symmetric particles into dimer molecules are of considerable interest to science and technology and have been actively studied earlier. Of particular interest for us are the models where there is an inter-penetration ("fusing" of cores) of the spherical particles upon association so that the bonding length L is less than the core diameter σ. The "shielded sticky shell" and the "shielded sticky point" models of Stell and co-workers [21][22][23] and their extensions [24][25][26], belong to this group of models and are the starting point for our work. These types of the models were studied using regular [21,22,24] and multi-density [23,[25][26][27][28][29] integral equation theories.
In the present study, we use spherical particles as building blocks, which are fused together to form a new species. In this way, we compose the molecule with arbitrary spacing L between the centers of spheres. Next, we decorate the surfaces of fused-sphere molecules with the attractive short-range binding sites, which may cause the association of the newly formed molecules. Such an extension of the protein model follows from our previous work [8]. Here, we wish to explore the effects of the non-spherical shape on various thermodynamic properties. Different versions of the model of dimerizing particles, represented by the tangentially bonded chain molecules, have been studied earlier [30,31]. In this type of the model, dimerization occurs due to squarewell bonding site, placed on the surface of one of the hard-sphere terminal monomer of each chain. Theoretical description of the model was carried out using first order thermodynamic perturbation theory (TPT1) of Wertheim [17,18]. There are two major features of our model that set it apart from the models studied earlier, i.e., (i) in our model the molecules are represented by the two hard-sphere monomers fused at a distance L, which is less than the contact distance σ and (ii) the molecules upon association can form a three-dimensional network. Due to the former feature of the model, a straightforward application of Wertheim's multi-density approach fails to produce accurate results [29,32,33]. In the present work, we use a modified version of the TPT1, which takes into account the change of the overall packing fraction of the system due to the association forces [29,34,35]. The accuracy of our modified TPT1 approach is checked by the newly generated Monte Carlo simulation data.

Model
We introduce a one component model of spherical particles, decorated with additional binding sites of two different types, A and B. The binding site A is placed within the sphere, with the displacement where u R is the pair potential for hard spheres, and u AA and u BB are inter-particle site-site potentials. The vector r (r = |r|) connects the centers of hard spheres, and x M M denotes the inter-particle vector connecting two sites of the type M . As mentioned above, u M M is the orientation dependent square-well potential between the sites M ∈ {A,B}, defined as follows: The site A causes inter-penetration of particles (see figure 3). Note that we need the term ξ AA → ∞ to compensate for the hard sphere repulsion. For the periphery sites B, we do not need such a term, therefore ξ BB → 0. To fuse hard cores at separation L, we choose d A = L/2 and take the limit ε AA → ∞.

Theory
An appropriate theoretical approach to be used is the first-order Wertheim's thermodynamic perturbation theory (TPT1) [17,18]. According to this theory, the Helmholtz free energy of the system can be written as a sum of several terms: where A id + A hs = A R is the free energy of the reference system represented by the hard-sphere system [36] and A A-A + A B-B = A ass is the contribution due to A-A and B-B interactions. Following Chapman et al. [19], we have: (2.6)

23801-3
Here, β = (k B T ) −1 and k B is Boltzmann's constant as usual, T is the absolute temperature, and N is the number of spheres. Further, X M defines the average number fraction of particles, which are not bonded through the binding site M . Parameters X A and X B are determined by the mass-action law [19] where ρ = N /V is the number density of spheres and ∆ M N connects the pair distribution function of hard spheres g hs (r ) (reference system) and the binding potential for sites M and N . The corresponding ∆ M N parameters are: Expression for the solid-angle averaged Mayer function was initially derived by Wertheim [37] and further generalized here to bē To suppress the cross interactions A-B, we set ∆ AB = ∆ BA = 0, which finally yields two independent equations, written in a quadratic form (2.13)

Association parameters ∆ AA and ∆ BB
The association parameter ∆ AA is related to X A via equation (2.12) and to the free energy contribution due to A-A binding, by equation (2.5). For the complete association limit, i.e., fusing of hard cores at separation L, no monomer spheres are present, so X A = 0. We re-write the association parameter ∆ AA and introduce the cavity correlation function y hs (r ) to obtain y hs (r )e hs (r )f AA (r )r 2 dr, (2.14) where e hs (r ) = exp[−βu R (r )]. Note that, as already mentioned before, 2d A = L. By applying the sticky limit approximation [37], that is by assuming the constant value of y hs within the integration domain, we The integral given by equation (2.16) is not used in further calculations and, accordingly, it will not be considered in more detail here.
Parameter ∆ BB determines the degree of association of fused spheres and the free energy contribution due to the B-B binding, see equations (2.6) and (2.13). Notice that due to the association between A-type of the sites, the packing fraction of fused spheres η eff is different from the packing fraction originally present (un-fused) hard spheres η. These fractions are related as follows: where η = πρσ 3 /6 is the packing fraction of hard spheres and l * = L/σ is reduced A-A bonding distance.
Using the sticky limit approximation [37] for ∆ BB [equation (2.9)], we have: (2.20) The integral in I BB can be evaluated analytically. We have used the Carnahan-Starling approximation for the contact value of g hs at the effective packing fraction of fused spheres η eff (2.21)

Cavity correlation function y hs
The last unknown quantity in equation (2.15) is the cavity correlation function of hard sphere fluid, y hs . It is calculated by using the Tildesley-Streett expression for pressure of the hard dumbbell fluid [38]. By choosing K B = 0, d A σ/2 and applying sticky limit conditions, i.e., ε AA → ∞, a AA → 0 while keeping the integral in equation (2.14) finite, our model reduces to the hard dumbbell fluid. We modify the mass action law [equation (2.12)], by inserting equation (2.15) with X A = ρ 0 /ρ, where ρ 0 stands for the number density of spheres, not bonded through binding site A (monomers). The result represents a different form of equation (110) of Wertheim's paper [37] ρ = ρ 0 + ρ 2 0 I AA y hs (r = L). (2.22) Following Wertheim [37], we get the expression for the excess pressure in the form: (2.23) We are now in position to obtain the cavity correlation function y hs of hard sphere system. We use the Carnahan-Starling equation of state [36] for the reference system (P R ) and the Tildesley-Streett equation of state [38] for the perturbed hard dumbbell system (P ).
• Carnahan-Starling EOS: (2.24) • Tildesley-Streett EOS: where ρ d = ρ/2 is the number density of hard dumbbells. The set of numerical parameters U , V , W , X , Y , Z is given in table 1.

23801-5
The set of equations which determine a i and b i [D ≡ D(l * )] are as follows: with the arrays

Other thermodynamic properties
Next, we calculate the excess pressure P ass = P − P R and the excess chemical potential µ ass = µ − µ R needed in phase diagram calculations as also excess internal energy E ass = E − E R , due to association.
Starting with the pressure, we have The excess chemical potential µ ass = µ − µ R is obtained through the relation µ ass = A ass N + P ass ρ . (2.36) The logarithmic term ln X A in equation (2.5) is divergent for the complete association limit (∆ AA 1), therefore we re-write this term by using equation ( µ(ρ I , T ) = µ(ρ II , T ), (2.38) P (ρ I , T ) = P (ρ II , T ), (2.39) where ρ I and ρ II are the two coexisting densities. At this step, the phase diagram can be constructed by applying equations (2.38)-(2.39) as it is in more detail explained in the previous work [8].
Another thermodynamic quantity is the excess internal energy E ass = E − E R , obtained as Since E A-A /N is divergent, the only relevant part is E B-B /N . Derivative ∂∆ BB /∂β is obtained analytically from equations (2.19)-(2.20), since g hs is β independent. Thermodynamic functions for the reference system of hard spheres, βA R /N , βP R , u R and E R /N , can be found elsewhere [36].

N , P, T Monte Carlo simulation
To validate the accuracy of the modified TPT1 approach, we performed Monte Carlo computer simulations in the N , P, T ensemble [39]. We assumed fused spheres with one and two binding sites on each sphere, where the prescribed arrangement of sites was preserved during the simulation. Simulated molecules are schematically shown in figure 3. We adopted the sampling method suggested by Tildesley and Streett [40], where a single displacement parameter was needed to describe the translation and rotation of fused spheres. The simulation box contained 250 fused spheres (molecules), which is equivalent to 500 penetrating (original, un-fused) spheres. We defined the cycle with 250 attempts to move the object and by 1 attempt to change the volume box. Next, we defined the block to be equal to 5 × 10 4 cycles.
Initially, we performed 1 block, to equilibrate the system, while 4 independent blocks were needed to calculate thermodynamic properties via the block averaging. Simulations were performed for three l * values: 0.2, 0.6, 1.0, and four different pressures P k B T /σ 3 : 0.5, 1.0, 2.0, and 4.0, for each model object visualized in figure 3. The acceptance rate of trial configurations was between 0.2 and 0.6. In the last example we set α BB = π/2 with perpendicular orientation of lines, connecting sites B on each sphere. Center-to-center separation L (l * = L/σ) and displacement distance d B = σ/2 were fixed.

Thermodynamic properties: Theory against Monte Carlo simulations
To test the accuracy of TPT1 we performed N , P, T Monte Carlo simulations for values of K B equal to 1 and 2 (three l * values for each K B ). We chose to compare the pressure P and the internal energy due to B-B binding, E B-B . Since we used the complete A-A association limit within TPT1, the latter quantity, E B-B , was the one that could be directly compared to computer simulations. We fixed the temperature T * = k B T /ε = 1, while the pair potential characteristics are given in table 2. The comparison between the 0.5σ theory and simulations is presented in figure 4. We found very good agreement for the pressure, while the theoretical predictions for E B-B were less accurate. In case of l * = 1 we obtained very good agreement for K B = 1 and fair agreement for K B = 2 (black lines and corresponding symbols in E B-B /N FS k B T subfigures). If we reduced the l * values (blue and red lines, symbols), the deviations became larger, though the qualitative picture remained correct. Deviations at low l * could be caused by the facts that: (i) fusing of two spheres at small l * is not a small perturbation regarding the reference system of hard spheres,  and (ii) the arrangement of binding sites B is fixed during the simulation, which is not the case in TPT1, where the orientation average over all geometries was assumed.

Effect of protein's shape on the liquid-liquid phase diagram
To illustrate the influence of protein shape on the liquid-liquid phase behavior, we compared the phase diagrams for two versions of the model: model (I) of two fused hard spheres and the model (II) of equivalent hard sphere, which was defined as a limiting case of (I), when L → 0 and σ → d eqv . The latter was chosen in such a way, that the volume of two fused spheres in case (I) was equal to that of the equivalent sphere (II): d eqv = σ In figure 5, we show phase diagrams for the variants (I) and (II) described above, at three l * values and for different numbers of sites B. As observed before [41], an increase of the number of sites B shifts the critical density toward higher values. What is more interesting here is the effect of the separation distance parameter l * on the phase behavior. For a sufficiently small l * , i.e., l * = 0.2 -see figure 5 (a), the difference between the phase diagrams for two versions of the model becomes negligible, regardless of the K B value. If centers of fused spheres are located at larger distance, l * = 0.6 -see figure 5 (b), the difference becomes more pronounced: both critical temperature and density are lowered. Deviations become the strongest for the limiting example of two spheres fused in contact, that is for l * = 1.0, cf.

23801-9
figure 5 (c). In this case, the larger number of sites (larger area available for interaction) additionally affects the liquid-liquid phase diagram. The shift toward lower critical densities (or packing fractions) is consistent with experimental studies of the Y-shaped antibodies [12,13].

Conclusions
Proteins come in many shapes, from ellipsoidal to Y-like and are never perfectly spherical as treated by most theoretical models. Further, even in dilute solutions they have a tendency to form dimers and can be represented by two fused spheres. For dense systems close to precipitation, the actual geometry of the protein molecules is important; the inter-particle interactions are directional and of short-range. In the present study, we modify the first-order thermodynamic perturbation theory for associating fluids to be applicable to the models allowing hard-sphere particles to inter-penetrate. These particles can further aggregate. We confront theoretical predictions for thermodynamic properties of the proposed model with predictions of the corresponding Monte Carlo simulations. We obtain an excellent agreement for the pressure and fair agreement for the excess internal energy. Next, we use this model to predict the liquidliquid phase diagram for protein solutions. We are interested in the effects of protein shape on the phase coexistence curve. We show that the fused hard-sphere model reduces the critical density of the system in comparison with the same quantity calculated for the hard-sphere model. This finding is consistent with experimental observations for Y-shaped antibodies. Using the cloud-point temperature measurements, we currently investigate the influence of various salts on the stability of lactoferrin solutions in water. The latter protein has a shape of two fused spheres, and the hard-sphere model is not a good representation of it. Theoretical approach developed in this paper will be used to analyze experimental data for lactoferrin and some other proteins of non-spherical geometry.