Numerical Integration II

1 Gaussian Quadrature

In previous lecture, we studied the Newton-Cotes methods which employ an evenly spaced grid. In contradistinction to this, what the method(s) of Gaussian quadrature accomplish is both simple and impressive: instead of limiting yourself to equally spaced nodal abscissas, expand your freedom so that both the n abscissas x_i and the n weights c_i are at your disposal. Then, you will be able to integrate all polynomials up to degree 2n - 1 exactly!

It should be straightforward to see why this is impressive: it implies that even using a very small number of points (say n = 5) allows us to integrate all polynomials up to quite high order exactly (up to ninth order for n = 5). As you may have already guessed, the reason this is possible is that we have doubled the number of parameters at our disposal: we can use the 2n parameters (the x_i’s and the c_i’s) to handle polynomials up to degree 2n - 1.

In our discussion of Newton–Cotes methods, we always started from an elementary interval and later generalized to a composite integration rule, which could handle the full integral from a to b. Now, for the discussion of Gaussian quadrature, we will employ it directly in the full interval from a to b.

It is standard to take a = -1 and b = 1 at this stage, since we can transform the integration domain to [-1,1] in most times. We will make the approximation \int_{-1}^{1} f(x)dx \simeq \sum_{k=0}^{n-1}c_k f(x_k). \tag{1} We claim that by an intelligent choice of both \{x_k\} and \{c_k\}, we will be able to integrate polynomials up to degree 2n-1 exactly. We also point out that all Gaussian quadrature methods are open, so the x_ks are not to be identified with -1 and +1. Let’s see how this all works.

1.1 Gauss–Legendre: n = 2 Case

We start from explicitly addressing the exact integrability of polynomials up to degree 2n-1 for the simplest non-trivial case, namely that of two points. We take Equation 1 and apply it to the case of n=2: \int_{-1}^1 f(x)dx = c_0 f(x_0) + c_1f(x_1). \tag{2}

The abscissas x_0 and x_1 along with the weights c_0 and c_1 are the four quantities we need to now determine. Since Gaussian quadrature methods are open, x_0 and x_1 will lie in (a, b).

As advertised, we will determine the four unknown parameters by demanding that all polynomials up to order 2n - 1 = 3 can be integrated exactly.

It is simplest to use monomials, i.e., single powers of x, instead of polynomials (which can later be arrived at as linear combinations of monomials). Thus, we will take f(x) = \begin{cases} x^0\equiv 1 \\ x^1 \\ x^2 \\ x^3 \end{cases} and assume that for all these cases Equation 2 holds. This leads to the following four equations in four unknowns: \begin{gather} \int_{-1}^{1} 1 dx = 2 = c_0 + c_1, \tag{a} \\ \int_{-1}^{1} x dx = 0 = c_0x_0 + c_1 x_1, \tag{b} \\ \int_{-1}^{1} x^2 dx = \frac{2}{3} = c_0 x_0^2 + c_1 x_1^2, \tag{c} \\ \int_{-1}^{1} x^3 dx = 0 = c_0 x_0^3 + c_1 x_1^3 \tag{d}. \end{gather} Using Eq. (b), we have c_0 x_0 = -c_1 x_1. Inserting this into Eq. (d), we have c_0 x_0(x_0^2 - x_1^2) = 0.

Note that the weight c_0\neq 0. Moreover, x_0\neq 0, otherwise we would have x_0 = x_1 = 0! Because of this, we must have x_0^2 = x_1^2. Since x_0 and x_1 should be distinct, this implies x_0 = -x_1. Now, we can use Eqs. (a) and (b) to obtain c_0 = c_1 = 1. We can also use Eqs. (a) and (c) to get 2x_0^2 = 2x_1^2 = \frac{2}{3}, which gives x_{0,1} = \pm \frac{1}{\sqrt{3}}.

To summarize, we have \int_{-1}^1 f(x)dx = f(\frac{1}{\sqrt{3}}) + f(-\frac{1}{\sqrt{3}}).

This employs two points and is exact for polynomials up to third order; it is approximate for other functions. It’s worth pausing for a second to realize that our accomplishment is already noteworthy: Simpson’s rule could integrate polynomials up to third order exactly, but it needed at least three points in order to do that.

2 Gauss–Legendre: General Case

There are many ways to introduce the general case of Gauss-Legendre method. For example, you can repeat the previous section by considering a larger n manually. But this can become very complicated as you can imagine for larger and larger n.

Instead, in this section, we tackle Gaussian quadrature using Hermite interpolation, which will be introduced in the following. This approach will allow us to explain in a unified manner:

how to pick the nodes x_k, thereby justifying the presence of the name “Legendre” in Gauss–Legendre,
how to compute the weights c_k,
the overall error scaling of Gauss–Legendre quadrature.

2.1 Hermite Interpolation

In this section, we solve the problem of interpolating through both the points (x_j, y_j) and the first derivatives (x_j, y'_j) at those points: p(x_j) = y_j, \quad p'(x_j) = y_j', \quad j=0,1,\dots, n-1. This involves 2n conditions gives rise to what is known as the Hermite interpolation.

As you can guess, using a polynomial of degree n−1 won’t work: this would have n undetermined parameters, but we need to satisfy 2n conditions. In other words, our Ansatz for the interpolating polynomial should start from a form containing 2n undetermined parameters, i.e., a polynomial of degree 2n-1.

We start from the following guess: p(x) = \sum_{k=0}^{n-1} y_k\alpha_k(x) + \sum_{k=0}^{n-1} y_k'\beta_k(x) where the \alpha_k(x) are meant to capture the function values and the \beta_k(x) the derivative values.

In other words: \begin{gather*} \alpha_k(x_j) = \delta_{kj}, \quad \beta_k(x_j) = 0 \\ \alpha_k'(x_j) = 0, \quad \beta_k'(x_j) = \delta_{kj}, \end{gather*} \tag{3} for k,j = 0,1, \dots, n-1.

We can then write \alpha_k(x) and \beta_k(x) in terms of L_k(x), the cardinal polynomials. Note that L_k(x) is a polynomial of degree n-1, satisfying L_k(x_j) = \delta_{kj}. In order to get a polynomial of degree 2n-1, we can take L_k(x) which has degree of 2n-2, and then multiply it by a linear polynomial. We thus write \alpha_k(x) = u_k(x) L_k^2(x), \quad \beta_k(x) = v_k(x)L_k^2(x), \quad k = 0, 1, \dots, n-1, where both u_k(x) and v_k(x) are linear.

To see how these are determined, we first write out two properties of the squared cardinal polynomials: \begin{gather*} L_k^2(x_j)=(\delta_{kj})^2 = \delta_{kj} \\ (L_k^2(x_j))' = 2L_k(x_j)L_k'(x_j) = 2\delta_{kj} L_k'(x_j) \end{gather*}

Using conditions in Equation 3, we have \delta_{kj} = \alpha_k(x_j) = u_k(x_j)L^2_k(x_j) = u_k(x_j) \delta_{kj}, implying u_k(x_k)=1.

Similarly, we have 0 = \alpha_k'(x_j) = u_k'(x_j)L_k^2(x_j) + u_k(x_j)(L_k^2(x_j))' = u_k'(x_j)\delta_{kj} + 2u_k(x_j) \delta_{kj} L_k'(x_j), which implies (taking j = k): u_k'(x_k) + 2 u_k(x_k) L_k'(x_k) = 0.

Since u_k(x) is linear and we can write u_k(x) = \gamma x + \delta. This means \gamma x_k + \delta = 1, \quad \gamma + 2L_k'(x_k) = 0. We can solve it \gamma = -2L_k'(x_k), \quad \delta = 1 + 2L_k'(x_k)x_k, and it gives rise to \alpha_k(x) = \left[1+2L_k'(x_k)(x_k - x)\right] L_k^2(x).

A completely analogous derivation allows you to determine \beta_k(x) (left in homework). Putting everything together, our interpolating polynomial is then p(x) = \sum_{k=0}^{n-1} y_k\left[1 + 2L_k'(x_k)(x_k - x)\right]L_k^2(x) + \sum_{k=0}^{n-1} y_k'(x - x_k)L_k^2(x). \tag{4}

2.2 Node Placement

In the previous lecture, we saw how to produce Newton–Cotes quadrature rules starting from Lagrange interpolation, writing the interpolating polynomial p(x) of degree n-1 in terms of the function values at the grid points, as well as the cardinal polynomials. We now write down the analogous formula for the case of Hermite interpolation, since we are interested in studying polynomials of degree up to 2n-1.

We introduce p(x) = \sum_{k=0}^{n-1}f(x_k) \alpha_k(x) + \sum_{k=0}^{n-1} f'(x_k)\beta_k(x) where we have updated the notation to employ f(x_k) and f'(x_k) for our integrand and derivative values at the nodes; the \alpha_k(x) and \beta_k(x) are determined in terms of the cardinal polynomials as in Equation 4.

Note that if we pick f(x) to be a polynomial of degree 2n-1 or less, then p(x)=f(x). We now perform integration \begin{align*} \int_{-1}^{1} f(x) dx &\simeq \int_{-1}^1 p(x) dx = \sum_{k=0}^{n-1} f(x_k) \int_{-1}^{1} \alpha_k(x) dx + \sum_{k=0}^{n-1} f'(x_k) \int_{-1}^{1} \beta_k(x) dx \\ &=\sum_{k=0}^{n-1} c_k f(x_k) + \sum_{k=0}^{n-1} d_k f'(x_k). \end{align*} \tag{5} In the last step we introduced c_k = \int_{-1}^{1}dx \alpha_k(x), \quad d_k = \int_{-1}^1 dx \beta_k(x).

If we can now somehow make the d_ks vanish, we will have accomplished what we set out to. The left-hand side of Equation 5 is the integral we wish to evaluate, and the last step on the right hand side would express it (exactly) as a sum of weights times function values. Also, if the d_ks dropped out of the problem then we would no longer have to worry about the fact that we don’t know the f'(x_k) values.

To see how we could possibly make the d_ks vanish, we express the \beta_k(x) in terms of the cardinal polynomials d_k = \int_{-1}^{1} \beta_k(x) dx = \int_{-1}^1(x - x_k)L_k^2(x) dx. Using the relation (prove in your homework) L_k(x) = \frac{L(x)}{(x - x_k)L'(x_k)} \Leftrightarrow (x - x_k)L_k(x)=\frac{L(x)}{L'(x_k)}, \tag{6} where we introduced the node polynomial, L(x) = \prod_{j=0}^{n-1}(x - x_j), we can rewrite d_k = \frac{1}{L'(x_k)}\int_{-1}^1 L(x)L_k(x) dx.

Now, we want to make d_k vanish. In order to do so, we need some concepts from orthogonal polynomials, which will be introduced in the following.

2.3 Orthogonal polynomials

Here we will generalize the Gram–Schmidt orthonormalization process to work with functions. We realize that we are lacking the concept of the “length” of a function (which would correspond to the norm used in the denominator to normalize) as well as the concept of the “inner product” of two functions (which would be needed to subtract out any component that’s not orthogonal).

Let’s introduce the latter, namely the inner product of the function f(x) with the function g(x). We choose to work with the symmetrical interval -1 \leq x \leq 1 and define: (f,g) \equiv \int_{-1}^{1} f(x)g(x) dx assuming these are real functions. It is then straighforward to define the length of a function as simply \|f\| = \sqrt{(f,f)}.

We are now in a position to start following the Gram-Schmidt steps. We will call q_j the result of orthonormalizing the monomials a_0 = 1, a_1 = x, a_2 = x^2, a_3 = x^3, \dots. Taking them in order we have q_0 = \frac{a_0}{\|a_0 \|} = \frac{1}{\sqrt{\int_{-1}^1 dx}} = \frac{1}{\sqrt{2}}.

Next, we have a_1' = a_1 - (q_0,a_1)q_0 = x - (\frac{1}{\sqrt{2}},x)\frac{1}{\sqrt{2}} = x - \frac{1}{\sqrt{2}}\int_{-1}^1 \frac{1}{\sqrt{2}}x dx = x, and the normalized one q_1= \frac{a_1'}{\|a_1' \|} = \frac{x}{\sqrt{\int_{-1}^1 x^2 dx}} = \sqrt{\frac{3}{2}}x.

Next, we have a_2' = a_2 - (q_0, a_2)q_0 - (q_1, a_2)q_1 = x^2 - (\frac{1}{\sqrt{2}}, x^2) \frac{1}{\sqrt{2}} - (\sqrt{\frac{3}{2}}x, x^2)\sqrt{\frac{3}{2}}x = x^2 - \frac{1}{3}. With normalization, we obtain q_2 = \frac{a_2'}{\|a_2' \|} = \frac{x^2 - 1/3}{\sqrt{\int_{-1}^1 (x^2 - 1/3)^2 dx}} = \sqrt{\frac{5}{2}}(\frac{3}{2}x^2 - \frac{1}{2}).

With this Gram-Schmidt procedure, we can obtain q_n which is a polynomial of degree n, and satisfies the orthonormal condition (q_n, q_m) \equiv \int_{-1}^1 q_n(x) q_m(x) dx = \delta_{nm}. Thus, these q_js are called orthogonal polynomials, and they serve as a basis for polynomials.

Any polynomial of degree n-1 can then be expressed as a linear combination of the orthonormal q_js as follows: r_{n-1}(x) = \sum_{j=0}^{n-1} c_j q_j(x). Because of this, we must have (r_{n-1},q_n) \equiv \int_{-1}^1 r_{n-1}(x) q_n(x) dx = 0. In words, q_n is orthogonal to all polynomials of a lower degree.

2.4 Making d_k vanish

Let us now come back to the expression d_k = \frac{1}{L'(x_k)}\int_{-1}^1 L(x)L_k(x) dx. Here, L(x) is the nodal polynomial of degree n, which vanishes at its nodes x = x_0, x_1, \dots, x_{n-1}. On the other hand, L_k(x) = \frac{\prod_{j = 0, n\neq k}^{n-1} (x - x_j)}{\prod_{j = 0, n\neq k}^{n-1} (x_k - x_j)} is a polynomial in x of degree n-1.

Now, if L(x) is an orthogonal polynomial (of degree n), then d_k = 0 according to the results in the previous section !.

Notice that L(x) = \prod_{j=0}^{n-1} (x - x_j) is a monic polynomial (namely, the coefficient for the highest degree of monomial is 1). If we choose the nodes x_j to be the zeros of the orthogonal polynomial of degree n, then L(x) will be proportional to this orthogonal polynomial (so it is a monic orthogonal polynomial). With this choice, the d_k = 0 and we will have \int_{-1}^{1}f(x) dx \simeq \sum_{k=0}^{n-1} c_k f(x_k). The approximation will become exact if f(x) is a polynomial of degree n.

2.5 Why “Legendre”

In the above discussion, the orthogonal polynomial q_n(x)s were ctually multiples of the Legendre polynomials, P_n(x)s. If we take the nodal abscissas x_js to match the zeros of P_n(x), then we have P_n(x) = \frac{(2n)!}{2^n(n!)^2} L(x). This is known as the Rodrigues’ formula.

Now you see why we have been speaking of Gauss-Legendre quadrature: this is Gaussian quadrature when the nodal abscissas are taken to be the roots of Legendre polynomials. This also explains why we decided to focus on the interval [-1,1] in the first place.

2.6 Weight Computation

In this section, let us compute the weights c_j. We have mentioned that if f(x) is a polynomial of degree less than 2n-1, the integration is exactly. Hence, if we simply take f(x) = L_j(x), we have \int_{-1}^{1}L_j(x)dx = \sum_{k=0}^{n-1} c_k L_j(x_k) = \sum_{k=0}^{n-1}c_k\delta_{kj}.

This implies c_j = \int_{-1}^{1} L_j(x)dx , formally identical to what we had found for Newton-Cotes integration, although the difference here is that the integral is on the interval [-1,1].

Using Equation 6, we have c_j = \int_{-1}^{1} \frac{L(x)}{(x - x_j)L'(x_j)}dx = \frac{1}{P_n'(x_j)}\int_{-1}^{1} \frac{P_n(x)}{x - x_j} dx.

In order to prceed, we have to use some properties of the Legendre polynomials. I will only directly state the result without providing detail derivation, that c_j = \frac{2}{(1-x_j^2)[P_n'(x_j)]^2}. Thus, this requires the value of the Legendre polynomial derivatives at the abscissas.

2.7 Error scaling

Without an explicit error scaling, I will simply state the error scaling for Gauss–Legendre quadrature: \mathcal{E}\simeq \frac{\pi}{4^n}\frac{f^{(2n)}(\xi)}{(2n)!}

Recall that for the simpson rule, the error \mathcal{E}\sim h^4 \sim 1/n^4 has a power law scaling, whereas for the Gauss-Legendre quadrature \mathcal{E}\sim 4^n is exponentially scaling. Thus, it should come as no surprise that Gaussian quadrature does a great job for many well-behaved functions.

2.8 Integrating from a to b

In general we need to integrate from a to b instead of from -1 to 1. The solution is simply to carry out a change of variables t = \frac{b+a}{2} + \frac{b-a}{2}x, \quad dt = \frac{b-a}{2}dx.

Thus, we have \int_{a}^{b} f(t)dt = \frac{b-a}{2}\int_{-1}^{1} f\left(\frac{b+a}{2}+\frac{b-a}{2}x\right)dx \simeq \frac{b-a}{2}\sum_{i=0}^{n-1}c_k f(\frac{b+a}{2}+\frac{b-a}{2}x_k).

2.9 Implementation

In the following implementation, we called a Scipy function scipy.special.roots_legendre(). This takes an input integer corresponding to the order of the quadrature (the degree of the Legendre polynomial), and output two arrays:

a 1Darray for the zeros of the polynomial (abscissas)
weights at these points

# import legendre  method
from scipy.special import roots_legendre
import numpy as np

def f(x):
    return 1/np.sqrt(x**2 + 1)

def gauleg(f,a,b,n):
    xs, cs = roots_legendre(n)
    coeffp = 0.5*(b+a) 
    coeffm = 0.5*(b-a)
    ts = coeffp + coeffm*xs
    contribs = cs*f(ts)
    return coeffm*np.sum(contribs)

if __name__ == '__main__':
    ans = np.log(1 + np.sqrt(2))
    print(ans)
    for n in range(2,10):
        print(n, gauleg(f,0.,1,n))

0.8813735870195429
2 0.8817898064445446
3 0.8813312019379165
4 0.8813752230725129
5 0.8813735706987258
6 0.8813735849145593
7 0.8813735871721438
8 0.8813735870147529
9 0.8813735870195224

3 Other Gaussian Quadratures

In the previous sections, we focused on Gauss–Legendre quadrature only. In the present short section, we will provide a qualitative argument of how one goes about generalizing things.

The general form of the problem at hand is \int_{a}^{b} w(x)f(x) dx \simeq \sum_{i=0}^{n-1} c_k f(x_k) where w(x) is a non-negative weight function. As you can see, the presence of w(x) on the left-hand side (only) is new. As before, the Gaussian quadrature approach allows you to find the exact answer when f(x) is a polynomial of degree up to 2n- 1. On the other hand, the full integrand does not need to be a polynomial. Here are three standard choices for the weight function, which have been worked out in the literature

Gauss-Chebyshev: w(x) = \frac{1}{\sqrt{1-x^2}}
Gauss-Laguerre: w(x) = x^\alpha e^{-x}
Gauss-Hermite: w(x) = e^{-x^2}

Note that the Gauss-Legendre case is equivalent to w(x) = 1.

Here, the weights are c_k = \int_{a}^{b}w(x)\alpha_k(x)dx.

The integration domain for the above three cases are actually [-1,1], [0,\infty), and (-\infty, \infty). The nodal abscissas x_js should be taken as roots of Chebyshev, Laguerre, and Hermite polynomials, respectively.

4 Homework

Please prove the following formula in the lecture note (prime denotes derivative) L_k(x) = \frac{L(x)}{(x - x_k)L'(x_k)} where L_k(k) = \frac{\prod_{j=0, j\neq k}^{n-1} (x - x_j)}{\prod_{j=0, j\neq k}^{n-1} (x_k - x_j)} and L(x) = \prod_{j=0}^{n-1} (x - x_j).
Please derive in the Hermite interpolation (using Equation 3) \beta_k(x) = (x - x_k)L_k^2(x).