Entropy and Temperature
This worksheet contains calculations that illustrate the statistical mechanics definition of entropy. Two coupled Einstein solids are considered, following the explanations by Thomas A. Moore and Daniel V. Schroeder, Am. J. Phys. 65, 26 (1997).
In macroscopic thermodynamics entropy is defined as the quantity that increases by Q/T when a thermodynamic system receives the amount of energy Q by heating while at temperature T. This is not intuitive, and the supplementary explanation that entropy measures the disorder in the system is also ad hoc. This definition of entropy is referred to as the Clausius definition. Boltzmann is the originator of the statistical mechanics definition. In the Einstein solid the thermodynamic system is simplified to a given number of oscillators at fixed locations. Each atom contributes three degrees of freedom, and the oscillators can take up energy in quantized units. The quantum zero-point energy of the oscillators is irrelevant for the discussion.
The connection between the macroscopic and microscopic descriptions is as follows. Macroscopically the system is described by a few variables (for a tank of an atomic gas: the number of atoms, the volume of the gas, and the total internal energy U; other variables such as pressure and temperature follow from the ideal gas law). Microscopically there can be many states that are consistent with this macrostate. The number of consistent microstates is called the multiplicity. Different macrostates have different multiplicities.
If a system is isolated in a given macrostate it is equally likely to be in any of the consistent microstates: they all have equal probability (the fundamental assumption of statistical mechanics). The probability equals the inverse multiplicity for that macrostate. This assumption leads to the second law of thermodynamics, namely that macroscopic objects exhibit irreversible behaviour. To demonstrate this point we consider two coupled Einstein solids.
First we calculate the multiplicity for an Einstein solid (for a derivation see: H B Callen, Thermodynamics and an Introduction to Thermostatistics , p. 334).
We have a system of N one-dimensional oscillators (the number of atoms is N/3); each oscillator can be in a state where n units of energy are stored, n = 0, 1, 2, ...; the total energy of the system equals q units of energy (q is also a non-negative integer). The unit of energy is E = h f, where f is the natural frequency of the oscillator and h is Planck's constant. The multiplicity for a system with N degrees of freedom and a total of q units of energy can be shown to be Omega(N,q) = (q+N-1)!/(q! (N-1)!) = binomial(q+N-1, q) (try to verify this by counting for low N, q):
> restart; with(plots):
> Omega:=(N,q)->binomial(q+N-1,q);
A system of 3 oscillators sharing 3 units of energy can be in the states (300), (210), (111), (012), (030), ..., a total of 10 combinations.
> Omega(3,3);
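To build confidence in the formula one can also count microstates directly: the number of ways to distribute q quanta over N oscillators is the sum over the number of quanta placed in the first oscillator. A minimal recursive sketch (the procedure ncount is our addition, not part of the original worksheet):
> ncount:=proc(N,q) option remember;
> # one oscillator: exactly one way to store all q quanta
> if N=1 then 1
> else add(ncount(N-1,q-n),n=0..q)  # n quanta in the first oscillator
> fi
> end:
> ncount(3,3), Omega(3,3);  # both give 10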
The two coupled Einstein solids represent an interacting system; we label the two by A and B. We can consider a macrostate where the total energy U is split between A and B as U = U_A + U_B. The reason for being able to specify U_A and U_B lies in the fact that the coupling between the two solids is weak: while within the solids the systems are fluctuating between the microstates consistent with the macrostates (e.g., U_A, N_A), the transfer of energy between A and B happens on a longer timescale and it is possible to know U_A and U_B. One calculates the multiplicities for A and B, and takes the product of these to find the multiplicity for the coupled system. This is true since the two solids are independent of each other, and any microstate in A can combine with any microstate in B (for given macrostates in A and B).
What we need now is a counting scheme: for given N_A, N_B and total number of excitation quanta q_t, we wish to find out how many microstates there are for each split in which q_A and q_B units of energy are stored in A and B respectively (q_A + q_B = q_t).
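In formulas, the multiplicity of the macrostate labelled by q_A is the product discussed above, Omega_t(q_A) = Omega(N_A, q_A) * Omega(N_B, q_t - q_A).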
> N_A:=300; N_B:=200;
> q_t:=100;
We work with lists in Maple (a list is an ordered, immutable sequence), and define a function that appends an element to a list:
> ladd:=(L,e)->[op(L),e];
> ladd([1,2,3],4);
> O_A:=[]: O_B:=[]: O_t:=[]:
> for q_A from 0 to q_t do:
> q_B:=q_t-q_A;
> o_A:=Omega(N_A,q_A);
> o_B:=Omega(N_B,q_B);
> o_t:=o_A*o_B;
> O_A:=ladd(O_A,o_A); O_B:=ladd(O_B,o_B); O_t:=ladd(O_t,o_t);
> od:
The lists can be printed; for large q_t, however, they are very long, so the statements are commented out.
> #O_A;
> #O_B;
> #O_t;
> listplot(O_t);
We want to look at the width of the distribution, and write a procedure to compute the mean and the root-mean-square deviation of the data from the mean.
In reporting the average we subtract one, since the first entry of the list corresponds to q_A=0.
> MSD:=proc(L) local n,i,av,dev,wt;
> n:=nops(L); wt:=add(L[i],i=1..n):       # normalization: sum of all multiplicities
> av:=add(i*L[i],i=1..n)/wt:              # weighted mean of the list index
> dev:=evalf(sqrt(add((i-av)^2*L[i],i=1..n)/wt));  # rms deviation from the mean
> print(`average: `,av-1,` deviation: `,dev);      # av-1 converts the index to q_A
> end:
> MSD(O_t);
We calculated the expectation value of the squared distance from the mean and took the square root to obtain a width; both expectation values are normalized by dividing the sums by the sum over all data points.
For N_A=300, N_B=200 we find from repeated calculations with increasing q_t:
q_t=100: average 60, deviation 5.4
q_t=200: average 120, deviation 8.2
q_t=300: average 180, deviation 10.7
q_t=400: average 240, deviation 13.1
q_t=800: average 480, deviation 22.3
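These numbers can be regenerated in a single loop; a minimal sketch, using a variant MSD2 (our addition, not in the original worksheet) that returns the pair [mean, deviation] instead of printing it:
> MSD2:=proc(L) local n,i,av,dev,wt;
> n:=nops(L); wt:=add(L[i],i=1..n);
> av:=add(i*L[i],i=1..n)/wt;
> dev:=evalf(sqrt(add((i-av)^2*L[i],i=1..n)/wt));
> [evalf(av-1),dev]  # shift by one: list index 1 corresponds to q_A=0
> end:
> for qq in [100,200,400,800] do
> print(qq, MSD2([seq(Omega(300,qA)*Omega(200,qq-qA), qA=0..qq)]));
> od: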
For a fixed number of degrees of freedom we have increased the amount of energy deposited. The distribution of energy quanta between A and B follows a clear pattern: the largest multiplicity occurs at q_A = q_t*N_A/(N_A+N_B) = 6/10*q_t for this choice of N_A and N_B. How does the width grow with q?
This has to be considered carefully. The number of states increases very rapidly with q (for q >> N roughly like q^(N-1)), so the meaningful measure is the relative width, i.e., the deviation divided by the mean.
For the system scaled down by a factor of ten (N_A=30, N_B=20, keeping the same 3:2 ratio) we get
q_t=800: average 480, deviation 56.6
Below, v1, v2, v3 are the relative widths of this small system at q_t=800, 1600, and 2400 (the latter two runs are not shown), while v4 and v8 reuse the N_A=300, N_B=200 values at q_t=400 and 800 from the table above.
> v1:=56.6/480;
> v2:=111.5/960;
> v2/v1;
> evalf(1/sqrt(2));
> v3:=166.3/1440;
> v3/v1;
> evalf(1/sqrt(3));
> v4:=13.1/240;
> v4/v1;
> evalf(1/sqrt(4));
> v8:=22.3/480;
> v8/v1;
> evalf(1/sqrt(8));
The relative width does not decrease as 1/sqrt(q): for the small system (q >> N) it hardly changes at all (v2/v1 and v3/v1 are close to 1, not to 1/sqrt(2) and 1/sqrt(3)). In this regime the relative width saturates at a value proportional to 1/sqrt(N); the N=500 system (v4, v8) is not yet in that limit, which is why its relative width is still falling.
For many degrees of freedom (N approaching 10^24, implying also a large number of deposited energy units) the distribution becomes very narrowly peaked. There is then an essentially unique macrostate whose multiplicity is overwhelmingly large compared to the others. The second law of thermodynamics (which states that entropy increases) can be cast into the form that the system always evolves towards the macrostate with the largest multiplicity: in the course of its fluctuations the system is bound to end up in the most probable macrostate, and since all microstates are assumed equally probable, that is the macrostate with the largest multiplicity.
Entropy is now defined as the logarithm of the multiplicity. This ensures the additivity of entropies in weakly coupled systems: since the combined multiplicity is the product Omega_t = Omega_A*Omega_B, the logarithms add, S_t = S_A + S_B. In addition, the entropy definition contains Boltzmann's constant as a proportionality factor.
S = k ln(Omega)
For plotting purposes we choose k=1 (i.e., we work in a unit system with k=1). We can calculate the entropies for the subsystems and for the total.
> P1:=listplot(map(log,O_A),color=red):
> P2:=listplot(map(log,O_B),color=blue):
> P3:=listplot(map(log,O_t),color=green):
> display(P1,P2,P3);
The graph shows that at equilibrium (which happens at q_A=60) the total entropy is stationary and has a maximum.
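As a quick numerical consistency check (our addition), the additivity S_t = S_A + S_B holds term by term, since O_t was built as the product of O_A and O_B:
> evalf(log(O_t[5])-log(O_A[5])-log(O_B[5]));  # zero up to rounding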
We understand now that the evolution of the macrosystem can be described as consistent with the 2nd law of thermodynamics in the form that the isolated system evolves towards a state with the largest entropy.
Note that the total entropy S_t is fairly flat in the vicinity of the maximum as a function of q_A.
How can we define temperature?
One can adopt a definition based on energy flow and thermal equilibrium: two objects in thermal contact are at the same temperature if they are in thermal equilibrium, in which case there is no spontaneous net flow of energy between them. If there is a spontaneous net flow from A to B, then A loses energy and is at the higher temperature, and B gains energy and is at the lower temperature. To relate temperature to entropy we have to find, at the equilibrium point in the figure above, two quantities that become equal. These are the slopes of the curves S_A(q_A) and S_B(q_A), which are equal in magnitude and opposite in sign; expressed as a function of its own energy, S_B(q_B) has the same slope as S_A(q_A), with the same sign. Thus, temperature has to be related to dS/dq.
On dimensional grounds (given the choice of constant k in the definition of S) it is the inverse temperature that equals the rate of change of entropy with energy.
The above figure is used by Moore and Schroeder to justify this. Assume that we are to the right of the equilibrium point q_A=60, i.e., solid A has more energy than it would have at equilibrium. In this region the curve for S_A (red) has a positive slope that is levelling off, while the curve for S_B (blue) has a negative slope that is increasing in magnitude. What does this mean? If a small amount of energy (e.g., one unit, dq=1) were to pass from A to B in this region (a move to the left in the graph, i.e., towards equilibrium), the increase in S_B would be bigger than the decrease in S_A, and the total entropy (green curve) would increase. The second law of thermodynamics tells us that this process happens spontaneously (there are many more microstates in that direction, so a fluctuation will find this regime easily).
The steeper the slope of the entropy-versus-energy curve, the more the system 'wants' to increase its energy in order to satisfy the second law (B); a shallower entropy-energy curve for the other system (A) means that it does not mind giving up some energy.
To summarize: to the right of the equilibrium point A has more energy than at equilibrium, and energy flows from A to B to reach equilibrium. A thus has the higher temperature there, and its entropy-energy curve has the smaller slope, while the slope of S_B is larger in magnitude. Thus, the inverse of the temperature equals the rate of change of entropy with energy:
1/T = dS/dU
When the volume of the system is allowed to change (as is the case for gases) one has to take a partial derivative at fixed volume, 1/T = (dS/dU)_{V,N}, and one cannot simply rearrange the above equation into dS = dU/T, since S also changes with V.
For the Einstein solid we can use the above equation to calculate the temperature as T = dU/dS, using neighbouring entropy values as the energy is changed by one unit up and down. This is a central-difference formula with spacing dq=2 in U: T(q) = 2/(S(q+1)-S(q-1)) in our units.
We take a single solid with 50 degrees of freedom and vary q between 0 and 100.
> N:=50;
>
> O1:=[]:
> for q from 0 to 100 do:
> o1:=Omega(N,q);
> O1:=ladd(O1,o1);
> od:
> S1:=map(evalf,map(log,O1)):
>
Now we can calculate the temperature:
> T1:=[]:
> for q from 2 to 99 do:
> t1:=2/(S1[q+1]-S1[q-1]);
> T1:=ladd(T1,t1); od:
> listplot(T1);
> S1[13];
> T1[13];
Note that the two lists are out of step by one: T1[13] is the temperature at q=13, while S1[13] is the entropy at q=12 (cf. Fig. 5 in Moore and Schroeder).
The graph shows how the temperature increases with the energy amount stored in the solid. Traditionally one measures the heat capacity as a function of temperature.
The heat capacity is the rate of change of energy with temperature, C = dU/dT. We can calculate it by differencing as well, and we normalize by the number of oscillators to take out the size of the system.
> C1:=[]:
> for q from 2 to 97 do:
> c1:=2/(T1[q+1]-T1[q-1])/N;
> C1:=ladd(C1,c1); od:
The simple listplot graphs the specific heat capacity (at fixed volume) as a function of energy (q).
> listplot(C1);
To plot C(T) we do:
> plot([[T1[j+1],C1[j]] $j=1..95],view=[0..3,0..1]);  # C1[j] belongs to T1[j+1] (index offset)
Note that dimensionful constants such as k and the energy quantum E have been set to unity; the temperature unit is actually E/k. The heat capacity per oscillator is roughly constant above kT=E and drops to zero below; at high temperatures it approaches k, as equipartition requires.
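For comparison (our addition, not part of the original worksheet), the Einstein model has a closed-form heat capacity per oscillator, C/k = x^2*exp(x)/(exp(x)-1)^2 with x = E/(kT), i.e., x = 1/T in our units; plotting it reproduces the differenced curve:
> C_exact:=T->(1/T)^2*exp(1/T)/(exp(1/T)-1)^2:
> plot(C_exact(T),T=0.05..3,view=[0..3,0..1]);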
Can we investigate the low-temperature behaviour? We need many oscillators.
>
> N:=1000000; q_m:=200;
> O1:=[]:
> for q from 0 to q_m do:
> o1:=Omega(N,q);
> O1:=ladd(O1,o1);
> od:
> S1:=map(evalf,map(log,O1)):
Now we can calculate the temperature:
> T1:=[]:
> for q from 2 to q_m-1 do:
> t1:=2/(S1[q+1]-S1[q-1]);
> T1:=ladd(T1,t1); od:
> listplot(T1);
We have clearly reached low temperatures.
> C1:=[]:
> for q from 2 to q_m-3 do:
> c1:=2/(T1[q+1]-T1[q-1])/N;
> C1:=ladd(C1,c1); od:
The simple listplot graphs the specific heat capacity (at fixed volume) as a function of energy (q).
> listplot(C1);
To plot C(T) we do:
> plot([[T1[j+1],C1[j]] $j=1..q_m-5],view=[0..0.2,0..0.02]);  # same index offset as before
The curve for the specific heat C(T) turns over at low T and approaches zero with a small slope (the Einstein model predicts an exponential fall-off, C/k ~ (E/kT)^2*exp(-E/kT)).
The Boltzmann distribution
An important quantity in statistical mechanics is the Boltzmann factor: for a system in thermal contact with a heat bath (a large system) at temperature T, it gives the relative probability to find the system in a microstate of energy E. This relative probability is exp(-E/kT); it is derived in texts on statistical mechanics.
The results from the previous section can be used to demonstrate this behaviour. We consider an N=1000 oscillator Einstein solid coupled to a single oscillator. We vary the temperature by changing the number of energy units shared between the single oscillator and the heat bath.
> restart; with(plots):
> Omega:=(N,q)->binomial(q+N-1,q); ladd:=(L,e)->[op(L),e];
The 'solid' A is the single oscillator, while B represents the heat bath.
> N_A:=1; N_B:=1000;
> q_t:=100;
> O_A:=[]: O_B:=[]: O_t:=[]:
> for q_A from 0 to q_t do:
> q_B:=q_t-q_A;
> o_A:=Omega(N_A,q_A);
> o_B:=Omega(N_B,q_B);
> o_t:=o_A*o_B;
> O_A:=ladd(O_A,o_A); O_B:=ladd(O_B,o_B); O_t:=ladd(O_t,o_t);
> od:
> O_t:=map(evalf@log,O_t):
> P1:=listplot(O_t,view=[0..30,250..340],color=blue): display(P1);
The plot is shifted by one unit on the horizontal axis, i.e., the point at (1,335) corresponds to q_A=0. The graph shows the logarithm of the probability for the microstate (its multiplicity) as a function of the number of units of energy. The logarithm falls off linearly, which means the probability itself falls off exponentially. Thus, we have demonstrated the Boltzmann-factor variation with E. (At the high end of the curve, near q_A=q_t, one can see a deviation due to the finite number q_t.)
Next we observe the change with temperature. We estimate the temperature of the heat bath at the centre of the graph shown above (it varies a little due to the finiteness of the calculation: the removal of 1-30 units of energy out of a bath of 1000 oscillators that share 100 units is not negligible).
> Temp1:=1/(O_t[14]-O_t[15]);
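The variation can be seen directly by comparing the local slope of the curve (which equals -1/kT in our units) near its two ends (an added check):
> O_t[2]-O_t[1], O_t[20]-O_t[19];  # nearly equal; the small drift reflects the finite bath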
Now we quadruple the number of energy units:
> q_t:=400;
> O_A:=[]: O_B:=[]: O_t:=[]:
> for q_A from 0 to q_t do:
> q_B:=q_t-q_A;
> o_A:=Omega(N_A,q_A);
> o_B:=Omega(N_B,q_B);
> o_t:=o_A*o_B;
> O_A:=ladd(O_A,o_A); O_B:=ladd(O_B,o_B); O_t:=ladd(O_t,o_t);
> od:
> O_t:=map(evalf@log,O_t):
> P2:=listplot(O_t,view=[0..30,790..840],color=green): display(P2);
Let us compare the two slopes:
In case 1 (q_t=100): 70 units vertical over 30 horizontal.
In case 2 (q_t=400): 35 units vertical over 30 horizontal.
We quadrupled the number of energy units in the heat bath made up of 1000 oscillators, and the slope in the log(Omega) plot halved. If Omega is proportional to exp(-E/kT) as claimed, the temperature must then have doubled, i.e., it increases like the square root of the number of energy units.
>
> Temp2:=1/(O_t[14]-O_t[15]);
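A one-line check of the ratio (our addition):
> Temp2/Temp1;  # close to 2, i.e., T grows like sqrt(q_t)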
The temperature of the heat bath is indeed approximately twice what it was in the first case. Thus, we have demonstrated that the probability distribution calculated from the multiplicity associated with the heat bath follows the dependence given by the Boltzmann factor exp(-E/kT).
Thermal behaviour of a two-state paramagnet
Moore and Schroeder discuss this problem as an example that leads to further implications of the statistical definition of temperature given above. In conventional thermodynamics temperature is a measure of the amount of kinetic motion (linear motion of atoms in a gas, vibrational motion of diatomic or other molecules, vibrations of stationary atoms in the Einstein solid, etc.). In the case of a magnet the degrees of freedom are contained in the spin orientation of individual atoms with respect to an externally defined direction.
The energy of an atom with an unpaired outer electron in an external magnetic field depends on the orientation of the electron spin, which gives rise to a magnetic moment (if the electron is in an s state, or in a state with zero magnetic quantum number, there is no contribution from orbital angular momentum to the magnetic moment). The magnetic moment vector is opposite to the spin vector due to the negative charge of the electron (see, e.g., Wolfson and Pasachoff, p. 1098).
The energy of a magnetic dipole in an external field is given by U = -mu . B, so the state with the magnetic dipole oriented along the external field has a lower energy than the state with a counteraligned dipole moment. The magnetic dipole moment is defined such that the torque produced by the magnetic field acts to align the dipole with the field vector. (The evidence comes from tracing the field of a big solenoid or bar magnet with a compass: near the north and south poles the needle clearly aligns with the field; at the sides of the 'big' magnet it apparently counteraligns. This is a result of the inhomogeneity of the field: outside the bar magnet the field lines run in the opposite direction to form closed loops, which is also why two permanent magnets side by side snap together counteraligned.)
For our problem this complication does not exist: the external field is homogeneous. The total potential energy of a system of N dipoles depends on how many are aligned (pointing up), N_u, and how many are counteraligned (pointing down), N_d, with N_u + N_d = N. If we choose units such that mu=1 and B=1 we have
The energies of the individual up and down states for the dipoles:
E_u = -mu B, E_d = mu B.
U = N_u E_u + N_d E_d = mu B (N_d - N_u) = mu B (N - 2 N_u)
The magnetization of the ensemble is given as
M = mu (N_u - N_d) = - U / B .
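For example, N = 100 with N_u = 75 gives U = mu B (100 - 150) = -50 mu B and M = mu (75 - 25) = 50 mu.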
These relations imply that the macrostate is specified by the total number N, the total energy U, or equivalently by specifying N and N_u (or N_d).
The multiplicity of such a macrostate is given by the number of ways to choose N_u up-spins out of N.
> restart; with(plots): ladd:=(L,e)->[op(L),e];
> Omega:=(N,N_u)->binomial(N,N_u);
The number of spins
> N:=100;
The energy range is from -N to N (here -100 to 100), since the spins can range from all aligned to all counteraligned. First we calculate the energies, multiplicities and entropies; because a single spin can only be aligned or counteraligned, the loop runs over N_u:
> U1:=[]: O1:=[]: S1:=[]:
> for N_u from N to 0 by -1 do:
> u1:=(N-2*N_u);
> o1:=Omega(N,N_u);
> s1:=log(o1);
> U1:=ladd(U1,u1); O1:=ladd(O1,o1); S1:=ladd(S1,s1);
> od:
> U1;
> plot([[U1[j],S1[j]] $j=1..N+1]);
The magnetization is trivially related to the energy; we are interested in the temperature and the specific heat capacity. The temperature is given as the inverse of the slope of the above graph.
> S1[3];
> T1:=[]:
> for q from 2 to N do: dif:=(S1[q+1]-S1[q-1]);
> if dif<>0 then t1:=evalf((U1[q+1]-U1[q-1])/dif) fi;  # at the entropy maximum dif=0: keep the previous t1
> T1:=ladd(T1,t1): od:
> #listplot(T1);
> T1[10];
> j:='j': plot([[T1[j],-U1[j]/N] $j=1..N-2],style=point,view=[-10..10,-1..1],title=`Magnetization per site M(T)`);
When all spins are aligned the temperature T is at its lowest. At high temperatures the average magnetization drops to zero.
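For comparison (our addition), the two-state paramagnet has the closed-form magnetization per site M/(N mu) = tanh(mu B/(kT)), i.e., tanh(1/T) in our units:
> plot(tanh(1/T),T=0.1..10,view=[0..10,0..1]);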
The definition of temperature allows for both signs, depending on the sign of the rate of change of the energy U with the entropy S.
What is the heat capacity?
>
> C1:=[]:
> for q from 2 to N-2 do:
> c1:=(U1[q+1]-U1[q-1])/(T1[q+1]-T1[q-1])/N;
> C1:=ladd(C1,c1); od:
> j:='j': plot([[T1[j+1],C1[j]] $j=1..N-3],style=point,view=[0..10,0..0.5],title=`Heat Capacity per site C(T)`);
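Again for comparison (our addition), the closed-form (Schottky) heat capacity per site is C/(Nk) = x^2/cosh(x)^2 with x = mu B/(kT):
> plot((1/T)^2/cosh(1/T)^2,T=0.1..10,view=[0..10,0..0.5]);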
> C1[1];
> T1[1];
The disadvantage of using lists instead of tables: the indexing is out of step when finite differencing is used (the loops start at two instead of one; thus T1[1] corresponds to S1[2] and U1[2], C1[1] corresponds to T1[2], etc.).
NOTE: we have simply made the quantities dimensionless for plotting purposes by setting k=1, E=1 (the energy quantum), and mu=1. To check the dimensions one should put the right units back in: where entropy is shown it is really S/k; where the heat capacity is shown it is C/k; where the temperature is shown it is kT/E for the Einstein solid and kT/(mu B) for the paramagnet, etc.
>