Week 8: Electronic Structure and Quantum Chemistry
Overview and Objectives
In this week we will start to do some quantum chemistry calculations on molecules. Roughly speaking, quantum chemistry involves solving for the electronic properties of a system given the 3-D molecular structure, and a very small amount of additional information, usually the total charge and the spin multiplicity. (As a reminder, the spin multiplicity is equal to the number of unpaired electrons plus one, so a typical neutral, non-radical organic molecule has charge 0 and multiplicity 1). The wavefunction and electronic properties of a system come from solving the time-independent Schrodinger equation for its eigenvalues and eigenfunctions. Exact solutions for Schrodinger’s Equation exist only for the simplest systems; the hydrogen atom has been solved analytically, and is covered in CHE 110A and other quantum chemistry courses. Beyond this, no exact analytic solution exists for any atom or molecule containing more than one electron. Therefore, any solutions going beyond one electron have to involve numerical calculations on computers. Because most systems of interest contain dozens of electrons or more, quantum chemistry must involve making approximations of many kinds in order to produce results accurate enough to be useful, while at the same time keeping the calculation simple enough so that the computer can produce a result in a reasonable time. Due to the need to make a compromise between accuracy and complexity of the calculation, there is no “perfect” quantum chemistry calculation. Instead, today there exists a whole ecosystem of hundreds of different quantum chemistry methods, all based on approximating the solution to Schrodinger’s Equation in different ways, and tailored to different types of systems and properties.
The basis set
At the center of nearly all quantum chemistry methods is the concept of molecular orbitals. Molecular orbitals help to simplify the quantum mechanics problem by providing the concept that each orbital is a function in space that can be occupied by up to two electrons (spin-up and spin-down). Molecular orbitals are orthonormal: assuming that they are real functions (which is true for most quantum chemistry methods), orthonormal functions are normalized in that the square of an orbital integrates to one, and orthogonal in that the product of two different orbitals integrates to zero. The square of an orbital also represents the probability density of the electron being located at a position in space.
Because quantum chemistry methods are numerical, we don’t have exact formulas for the molecular orbitals of a given system, and instead they need to be represented numerically, i.e. as a list of numbers. The most efficient way to do this is as a linear combination of atomic orbital basis functions:
(equation)
where phi_i(x) represents an atomic orbital basis function, and there is a fixed set of N of these orbitals in the whole calculation, and c_i represents the linear combination coefficients determined by the calculation. The purpose of the basis functions is to give the molecular orbital a large degree of flexibility to assume nearly any shape in 3-D space as efficiently as possible. The intuition here is that although in principle it should take an infinite number of points to exactly represent the values of a function at every single point in space, the electrons in the molecule should mostly be located around the atomic positions and in the space between atoms where there are chemical bonds. Therefore, it shouldn’t take too many atomic orbital basis functions (i.e. a few hundred to a few thousand for a decent-sized organic molecule) to cover most of the possible orbital shapes. The “basis set” is a dictionary that maps each element in the periodic table to a set of predetermined atomic orbital basis functions. All quantum chemistry codes come with a menu of basis sets that the user may choose from.
The functional form of the atomic orbitals are designed to be similar to the hydrogen atom wavefunctions (though without the radial nodes) while at the same time being mathematically easy to work with. In practice, an atomic orbital typically consists of a product of a radial part that is a sum of one or more Gaussian functions of the distance from the atomic center, times an angular part that is a spherical harmonic function. (Instead of the spherical harmonics one can also use products of the Cartesian directions.) Although the Gaussian functions do not have the exact features of hydrogen atom wavefunctions (i.e. they do not have a cusp at the origin and decay to zero too rapidly), they are much easier to work with mathematically because the product of two Gaussian functions is another Gaussian function. Therefore most modern calculations use a large number of Gaussian functions to make up for these shortcomings.
The smallest possible basis set for an atom involves providing one basis function for each of its shells: that is, a single s function for hydrogen, and two s functions and three p functions (one for each of x, y, and z) for oxygen. Because Gaussian functions don’t have the right shape, and chemical bonding distorts the electron density, a basis set that incorporates additional functions can give more accurate molecular orbitals. Therefore, a larger basis can give a more accurate calculation but also increases the cost of the calculation (i.e. time to reach a solution). A commonly used basis set, called 6-31G*, uses the following prescription: Each core (non-valence) orbital gets one basis function comprised of six Gaussians, and each valence orbital gets two basis functions comprised of three Gaussians and one Gaussian respectively. Furthermore, there is one shell of basis functions added at a higher angular momentum (l value) than the valence shell to account for distortion of the electron density, called polarization functions. In this basis, H atoms have two basis functions for the 1s shell, and O atoms have 14 basis functions (one for the 1s shell, two for the 2s shell, six (two sets of three) for the 2p shell, plus five d functions for polarization).
Although basis sets are conceptually relatively simple, there is a lot of research that goes into optimizing the number of basis functions and their coefficients to make calculations as accurate and efficient as possible. Because of this, there is no single choice of basis set that is optimal for every calculation, and doing some research (or prior experience) is necessary to choose the best basis set for a research project.
The quantum chemistry method
The quantum chemistry method is how to compute the electronic properties for a given molecular structure. (In fact, the quantum chemistry method is conceptually more important than the basis set, but it is more abstract, which is why we talk about it second). Most of these methods involve the use of orbitals. It makes sense to talk about quantum chemistry methods by introducing the most basic of these methods, called Hartree-Fock (HF) theory. In HF, the wavefunction is an antisymmetrized product of occupied molecular orbitals, where “antisymmetric” is the fundamental property of electrons that the wavefunction must change sign if any two electron labels are exchanged. (This is also the reason for the Pauli Exclusion Principle.) The antisymmetrized product wavefunction, which takes a set of orbitals and occupation numbers as input, is called a Slater determinant, or an electron configuration. The energy is calculated from the orbitals by assuming that each electron is attracted to the nucleus and each pair of electrons repels one another as if they were independent electron densities (though the repulsion between same-spin electrons is reduced as a consequence of the wavefunction antisymmetry, called the exchange interaction). The wavefunction is then determined by adjusting the molecular orbitals (in practice, we adjust the linear combination coefficients in the LCAO-MO approximation) so that the energy is minimized; this minimization procedure is called the variational method in QM, which states that any “inexact” normalized wavefunction is guaranteed to be equal or higher in energy to the true ground state. At the end of the calculation, we get a set of molecular orbitals where each one is an array of linear combination coefficients of the basis functions, as well as the final minimized energy (with respect to the wavefunction coefficients, not with respect to the molecular geometry; we haven’t touched that). It is then possible to compute various other properties from the wavefunction, such as the dipole moment of the molecule, or the nuclear gradient of the energy, i.e. the derivative with respect to atomic positions.
So from the above discussion we see that Hartree-Fock is a single-configuration method, which distinguishes it from other more sophisticated methods that involve multiple configurations. It also makes the assumption that individual electrons behave as independent electron densities despite the fact that they repel one another. During the variational minimization, the only way to reduce the electron repulsion (while maximizing electron-nuclear attraction) is to adjust the orbital shapes, but the electrons move independently within those orbitals, even as their repulsion adds to the energy. In reality, electrons avoid one another when they get too close to reduce the repulsion; this effect, called electron correlation, is missing from the HF method. If a quantum chemistry method had the features of HF but also exactly captured the tendency for electrons to avoid one another, then it would be exact. The correlation energy is conceptually defined as the difference between the HF result at the complete basis limit and the exact energy. Even though HF appears to ignore electron repulsion, which makes us think it should be wildly inaccurate, it is actually an incredibly useful theory. It can predict atomic ionization energies to within 0.2 eV (a few percent of experiment), and most molecular geometries as well. However, it is not accurate enough for many chemical applications (such as analysis of a reaction mechanism) and it can even be qualitatively incorrect in some cases.
Hartree-Fock was developed in the 1930s, even before transistors and modern computers were invented. Since then, it has provided the basis for the development of an almost countless number of quantum chemistry / electronic structure methods (the two terms are used interchangeably). Many of these methods have been developed in order to capture the electron correlation effects that are neglected in Hartree-Fock, and some other methods aim to make calculations less expensive by various amounts of empirical fitting to experimental data. We cannot possibly talk about them all here but will introduce a few branches.
Density functional theory (DFT)
Density functional theory has its conceptual roots early in the history of quantum mechanics. Simply speaking, it is an alternate branch of quantum mechanics that seeks to understand the properties of the system not directly from the wavefunction, but from the electron density. The electron density is related to the wavefunction as:
(equation)
Clearly, the density is a simpler object than the wavefunction, as it is a function of only three variables rather than the full 3N. The Hohenberg-Kohn (HK) theorems establish the formal foundations of DFT, and they state that: (1) Given the ground state electron density, it is possible to obtain all other properties of the system, including the energy and the wavefunction itself. (A “functional” is a mapping of a function to a number, for example, taking the integral over all space is a functional that maps the electron density to the number of electrons). (2) By minimizing the energy functional, one obtains the exact ground state electron density. The implication of the HK theorems is that it is possible to solve everything in QM by dealing only with the electron density, but it comes with a hitch: It only states the existence of the energy functional but does not state what it is. To this day, the exact energy functional is unknown and all modern functionals are approximate in some way.
In practice, designing a energy functional that is a pure functional of the density (i.e. without involving orbitals) is a very difficult task. The predominant way of doing DFT today is by using the “Kohn-Sham” (KS) method. KS-DFT introduces a set of molecular orbitals, the “Kohn-Sham” orbitals, that are used for the purpose of calculating the electron kinetic energy terms. The basic reason for adding orbitals into DFT is that the kinetic energy terms are very difficult to calculate from the density directly, so KS-DFT is essentially taking a page out of Hartree-Fock’s book to compute a big chunk of the energy. In principle, the KS orbitals should not be interpreted chemically although in practice people do interpret them. The repulsion between electrons is also computed in the same way as Hartree-Fock as a double integral over the electron density.
Where, then, does KS-DFT differ from HF? The difference is in the exchange and correlation terms. DFT computes the exchange and correlation terms directly from the electron density, using formulas based on some mathematically idealized limits such as the uniform electron gas. Both the exchange and correlation terms are approximated in DFT, as opposed to HF, in which exchange is exact and correlation is zero. All DFT functionals contain “local density approximation” (LDA) energy terms that involve taking the integral of the density, raised to some power, over all space. The early LDA functionals are closest to the founding ideas of DFT but they are not very accurate. In the 90s DFT developers created some functionals with added terms depending on the density gradient; these are called generalized gradient approximations (GGA). Around the same time, they realized that the HF exchange energy could be computed using the KS orbitals, and mixed in a portion of the HF exchange energy, and optimized the coefficient of the mixing to reproduce some experimental data. This led to the “hybrid” functionals, of which the B3LYP functional is the most famous and widely used functional of all time. The development of DFT methods continues today, and has mainly moved in the direction of increasing complexity and empirical parameters to reproduce experimental data more accurately.
Because of its high price-to-performance ratio (B3LYP is significantly more accurate than HF for many properties and costs about the same), DFT is perhaps the most widely used quantum chemistry method of all. B3LYP/6-31G* is a combination of method and basis set that can do a decent job at modeling organic reactions. Some more recent directions in DFT development include range separated hybrids which employ different mixings of HF and DFT exchange depending on the inter-electronic distance, and empirical dispersion corrections that add back the dispersion interaction that is missing from most DFT functionals. It is generally a good idea to use empirical dispersion corrections in modern calculations that employ functionals from the 90s such as B3LYP.
Post Hartree-Fock
The main disadvantage of DFT is that there isn’t a straightforward procedure to systematically improve on the energy functional. The non-DFT methods, collectively called “wavefunction methods”, tend to be more expensive but are also systematically improvable. Whether DFT or a wavefunction method is best for an application depends on many factors.
The central concept in post-HF methods is that in order to go beyond HF, the wavefunction can be written as a linear combination of multiple electron configurations. To make multiple electron configurations, we can remove electrons from occupied orbitals and place them into unoccupied (virtual) ones. If one attempts to solve Schrodinger’s equation using a wavefunction that is a linear combination of all possible electron configurations that could be created by placing N electrons into M orbitals, that is called “full configuration interaction” (full CI). Full CI in the complete basis limit is the numerically exact solution to Schrodinger’s Equation. In practice, because the cost of a full CI calculation is factorial in the number of basis functions and electrons, it can only be used on systems containing a few dozen electrons with a few dozen basis functions. Therefore, full CI is more useful as a conceptual reference in thinking about how approximate post-HF methods can neglect large portions of the electron configurations to capture most of the electron correlation while remaining computationally affordable.
Many post-HF wavefunction methods are based on the assumption that a single electron configuration dominates the wavefunction, called the “reference”. For example, truncated CI is a method that includes only excitations from a reference configuration in the wavefunction, up to a certain limit of excitations; configuration interactions singles and doubles (CISD) is the most common. MP2 (short for second-order Moller-Plesset perturbation theory)