manyspikes

Higher-dimensional integration

So far, we have been looking at integrating functions of a single variable. We will now look at how integration works for multivariate functions.

Let's say we have the following function:

\begin{equation} f(x_1, x_2) = e^{- x_1^2 - x_2^2} \end{equation}

This is an instance of a 2-dimensional Gaussian function, and we will see it again in future modules. For now, assume that integrating this function over a given region tells us how likely values in that region are.

Let's say that we want to integrate this function for $x_1 \in [0, 1]$ and $x_2 \in [2, 3]$. We would write the integral as follows:

\begin{equation} \int_{[0, 1] \times [2, 3]} e^{- x_1^2 - x_2^2} \,\textrm{d}(x_1, x_2). \end{equation}

In this particular case, the integral above can be computed by first integrating with respect to $x_2$ and then with respect to $x_1$:

\begin{equation} \int_{0}^{1} \left( \int_{2}^{3} e^{- x_1^2 - x_2^2} \,\textrm{d}x_2 \right) \,\textrm{d}x_1. \end{equation}

The outer integral is concerned with the variable $x_1$ while the inner integral is concerned with the variable $x_2$. In this case, the result is the same if we integrate first with respect to $x_1$ (inner integral) and then with respect to $x_2$ (outer integral):

\begin{equation} \int_{2}^{3} \left( \int_{0}^{1} e^{- x_1^2 - x_2^2} \,\textrm{d}x_1 \right) \,\textrm{d}x_2. \end{equation}

This is called iterated integration. We have included the parentheses to make it clear that we compute the inner integral first, followed by the outer integral. However, the parentheses are usually omitted to avoid clutter, so we can write

\begin{equation} \int_{2}^{3} \int_{0}^{1} e^{- x_1^2 - x_2^2} \,\textrm{d}x_1\,\textrm{d}x_2. \end{equation}

For most functions we're likely to encounter, the order in which we perform multiple integrations does not matter. However, that is not always the case: Fubini's theorem sets out the conditions under which the order of integration is irrelevant. We won't cover this here since it is rather unlikely you will encounter a function that violates Fubini's conditions in ML/AI.
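The equivalence of the two orders can be checked numerically for the example above. The following is a minimal sketch, assuming a simple midpoint rule for each one-dimensional integral; the helper `midpoint` and the number of subintervals are illustrative choices, not part of the text. The closed form relies on the fact that this particular integrand factorises into two one-dimensional Gaussian integrals, each expressible through `math.erf`.

```python
import math

def f(x1, x2):
    return math.exp(-x1**2 - x2**2)

def midpoint(g, a, b, n=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Inner integral over x2 in [2, 3], outer integral over x1 in [0, 1]
x2_then_x1 = midpoint(lambda x1: midpoint(lambda x2: f(x1, x2), 2, 3), 0, 1)

# Inner integral over x1 in [0, 1], outer integral over x2 in [2, 3]
x1_then_x2 = midpoint(lambda x2: midpoint(lambda x1: f(x1, x2), 0, 1), 2, 3)

# Closed form: (sqrt(pi)/2 * erf(1)) * (sqrt(pi)/2 * (erf(3) - erf(2)))
exact = (math.pi / 4) * math.erf(1) * (math.erf(3) - math.erf(2))

print(x2_then_x1, x1_then_x2, exact)
```

Both orders should agree up to floating-point rounding, and both should be close to the closed-form value.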

Rules of integration in multiple dimensions

Just like for univariate integration, multivariate integration follows the rules of linearity. For instance, for a 2-dimensional integration:

\begin{align}
\textrm{Addition/Subtraction}:& \int\int [f(x,y) \pm g(x,y)] \,\textrm{d}(x,y) = \int\int f(x,y) \,\textrm{d}(x,y) \pm \int\int g(x,y) \,\textrm{d}(x,y)\\[3 ex]
\textrm{Scalar multiplication}:& \int\int cf(x,y) \,\textrm{d}(x,y) = c\int\int f(x,y) \,\textrm{d}(x,y)
\end{align}

Integration by parts and the change of variables technique can also be applied in multiple dimensions. The next example demonstrates the latter in action.

Example

Let's try to integrate the function we were working with above:

\begin{equation} A = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} e^{- x_1^2 - x_2^2}\,\textrm{d}(x_1, x_2). \end{equation}

We start by applying the change of variables technique using a polar coordinate transformation where

\begin{align}
x_1 &= r\cos\theta\\[3 ex]
x_2 &= r\sin\theta\\[3 ex]
\textrm{d}(x_1, x_2) &= r\,\textrm{d}(r, \theta)
\end{align}

Quick note: if you are wondering why $\textrm{d}(x_1, x_2) = r\,\textrm{d}(r, \theta)$, think about what is necessary to define an infinitesimal area in 2-d Cartesian coordinates vs in polar coordinates. In 2-d Cartesian coordinates, an infinitesimal area can be defined solely by $\Delta x_1$ and $\Delta x_2$. However, in polar coordinates, $\Delta r$ and $\Delta \theta$ are not sufficient to express an infinitesimal area. For instance, consider a region $S$ defined by $1<r<2$ and another region $T$ defined by $100<r<101$. Even though $\Delta r=1$ for both regions, their areas are quite different.

\begin{align}
A_S &= \pi 2^2 - \pi 1^2 \approx 9.42 \\[3 ex]
A_T &= \pi 101^2 - \pi 100^2 \approx 631.46
\end{align}
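One way to see where the factor $r$ comes from, sketched under the assumption that a polar cell spans radii $r$ to $r + \Delta r$ and angles $\theta$ to $\theta + \Delta\theta$ (the symbol $\Delta A$ for its area is introduced here for illustration only): the cell is a fraction $\Delta\theta / 2\pi$ of a full ring, so

\begin{align}
\Delta A &= \frac{\Delta\theta}{2\pi} \left( \pi (r + \Delta r)^2 - \pi r^2 \right)\\[3 ex]
&= r\,\Delta r\,\Delta\theta + \frac{1}{2} (\Delta r)^2 \Delta\theta\\[3 ex]
&\approx r\,\Delta r\,\Delta\theta.
\end{align}

For small $\Delta r$ the second term is negligible, which leaves the factor $r$. Setting $\Delta\theta = 2\pi$ and $\Delta r = 1$ in the exact expression gives $3\pi$ for $r = 1$ and $201\pi$ for $r = 100$, matching the values of $A_S$ and $A_T$.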

Coming back to our integral, we can now write

\begin{align}
A &= \int_0^{2\pi} \int_0^{+\infty} e^{- (r\cos\theta)^2 - (r\sin\theta)^2}r\,\textrm{d}(r, \theta)\\[3 ex]
&= \int_0^{2\pi} \int_0^{+\infty} e^{- r^2(\cos^2\theta + \sin^2\theta)} r\,\textrm{d}(r, \theta)\\[3 ex]
&= \int_0^{2\pi} \int_0^{+\infty} e^{- r^2} r\,\textrm{d}(r, \theta).
\end{align}

Let's break down what we have done above:

  1. we adjusted the integration intervals to the domains of the new variables, i.e. $r \in \mathbb{R}_0^+$ and $\theta \in [0, 2\pi]$.
  2. we factored out $r^2$ so that we can use the equality $\cos^2\theta + \sin^2\theta = 1$.

Applying Fubini's theorem, we can convert the double integral into an iterated integral, yielding

\begin{align} A &= \int_0^{+\infty} \int_0^{2\pi} e^{- r^2} r\,\textrm{d}\theta\,\textrm{d}r. \end{align}

The integral with respect to $\theta$ is trivial and evaluates to $2\pi$, allowing us to write:

\begin{align} A &= 2\pi \int_0^{+\infty} r e^{- r^2}\,\textrm{d}r. \end{align}

If we move the factor 2 into the integral, you'll notice we can apply the change of variables rule. Here, we take $g(x)=x^2$, and thus $g'(x)=2x$. We also take $f(x)=e^{-x}$, whose primitive is $F(x)=-e^{-x}$. By the change of variables rule, we can thus write:

\begin{align}
A &= \pi \int_0^{+\infty} 2r e^{- r^2}\,\textrm{d}r\\[3 ex]
&= \pi \left( F(\infty) - F(0) \right)\\[3 ex]
&= \pi \left( -e^{-\infty^2} - (-e^0) \right)\\[3 ex]
&= \pi \left( 0 - (-1) \right)\\[3 ex]
&= \pi.
\end{align}
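As a sanity check on the last step, the radial integral can be truncated at a finite upper limit $R$ and compared with $\pi(1 - e^{-R^2})$, which is what the change of variables rule gives on $[0, R]$. This is a minimal sketch using a midpoint rule; the function names `radial_integral` and `closed_form`, the truncation $R$, and the number of subintervals are assumptions made for illustration.

```python
import math

def radial_integral(R, n=20_000):
    """Midpoint-rule approximation of 2*pi * integral from 0 to R of r*exp(-r^2) dr."""
    h = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += r * math.exp(-r**2)
    return 2 * math.pi * h * total

def closed_form(R):
    """pi * (F(R^2) - F(0)) with F(x) = -exp(-x)."""
    return math.pi * (1 - math.exp(-R**2))

for R in (1, 2, 5):
    print(R, radial_integral(R), closed_form(R))
```

As $R$ grows, both columns should approach $\pi$.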

We will see this integral later in the probability course, when we cover the normal distribution.

Approximating higher-dimensional integrals

Just like with univariate integration, we can use Riemann sums to approximate multiple integrals. Let's approximate the integral above over an increasingly large integration region. If we are correct, we should see the approximation tend to $\pi$ as the region grows.
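A minimal sketch of such an approximation, assuming a midpoint Riemann sum on the square $[-L, L] \times [-L, L]$ with grid spacing `eps` (the names `riemann_sum_2d`, `L`, and `eps` are illustrative choices):

```python
import math

def f(x1, x2):
    return math.exp(-x1**2 - x2**2)

def riemann_sum_2d(L, eps=0.05):
    """Midpoint Riemann sum of f over the square [-L, L] x [-L, L].

    Assumes 2 * L is a multiple of eps so the cells tile the square exactly.
    """
    n = round(2 * L / eps)  # number of cells along each axis
    # Midpoints of the cells along one axis
    points = [-L + (i + 0.5) * eps for i in range(n)]
    # Each cell contributes f(midpoint) times its area, eps**2
    return sum(f(x1, x2) for x1 in points for x2 in points) * eps**2

for L in (1, 2, 3, 4):
    print(L, riemann_sum_2d(L))
```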

Note that the Riemann sum scales the values of $f$ by $\epsilon^2$, since each term now covers a small square of area $\epsilon^2$ rather than an interval of length $\epsilon$. As expected, our approximation gets progressively closer to $\pi$ as we increase the range over which we are approximating the integral.

In summary, in this section we introduced multiple integrals by covering a 2-dimensional example. We also saw how 2-dimensional integrals can be approximated by Riemann sums.

It is of course possible to integrate in higher-dimensional spaces, but this naturally becomes trickier. Luckily, a basic understanding of integrals in 1 and 2 dimensions is all we need for the next modules, so we won't go into more advanced integration topics here.