Manyspikes

So far, we have been looking at integrating functions of a single variable. We will now look at how integration works for multivariate functions.

Let's say we have the following function:

\begin{equation} f(x_1, x_2) = e^{- x_1^2 - x_2^2} \end{equation}

This is an instance of a 2-dimensional Gaussian function, and we will see it again in future modules. For now, assume that integrating this function over a given region tells us how likely values in that region are.

Let's say that we want to integrate this function for $x_1 \in [0, 1]$ and $x_2 \in [2, 3]$ . We would write the integral as follows:

\begin{equation} \int_{[0, 1] \times [2, 3]} e^{- x_1^2 - x_2^2} \,\textrm{d}(x_1, x_2). \end{equation}

In this particular case, the integral above can be computed by first integrating with respect to $x_2$ and then with respect to $x_1$ :

\begin{equation} \int_{0}^{1} \left( \int_{2}^{3} e^{- x_1^2 - x_2^2} \,\textrm{d}x_2 \right) \,\textrm{d}x_1. \end{equation}

The outer integral is concerned with the variable $x_1$, while the inner integral is concerned with the variable $x_2$. In this case, the result is the same if we integrate first with respect to $x_1$ (inner integral) and then with respect to $x_2$ (outer integral):

\begin{equation} \int_{2}^{3} \left( \int_{0}^{1} e^{- x_1^2 - x_2^2} \,\textrm{d}x_1 \right) \,\textrm{d}x_2. \end{equation}

This is called iterated integration. We have included the parentheses to make it clear that we compute the inner integral first and the outer integral second. However, the parentheses are usually omitted to avoid clutter, so we can write

\begin{equation} \int_{2}^{3} \int_{0}^{1} e^{- x_1^2 - x_2^2} \,\textrm{d}x_1\textrm{d}x_2. \end{equation}

For most functions we're likely to encounter, the order in which we perform multiple integrations does not matter. However, that is not always the case: Fubini's theorem sets out the conditions under which the order of integration is irrelevant. We won't cover this here since it is rather unlikely you will encounter a function that violates Fubini's conditions in ML/AI.
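As a quick numerical check that both orders agree for the example above, here is a minimal sketch using a midpoint Riemann sum. The grid size `n` and the variable names are illustrative choices, not part of the original material.

```python
import numpy as np

def f(x1, x2):
    # The example function e^{-x1^2 - x2^2}
    return np.exp(-x1**2 - x2**2)

n = 400  # number of midpoint cells per axis (illustrative choice)
d1 = d2 = 1.0 / n  # both intervals, [0, 1] and [2, 3], have length 1

# Midpoints of n equal cells on [0, 1] and on [2, 3]
x1 = 0.0 + (np.arange(n) + 0.5) * d1
x2 = 2.0 + (np.arange(n) + 0.5) * d2

# values[i, j] = f(x1[i], x2[j])
values = f(x1[:, None], x2[None, :])

# Order A: inner integral over x2, then outer integral over x1
inner_over_x2 = values.sum(axis=1) * d2
order_a = inner_over_x2.sum() * d1

# Order B: inner integral over x1, then outer integral over x2
inner_over_x1 = values.sum(axis=0) * d1
order_b = inner_over_x1.sum() * d2

print(order_a, order_b)
```

Both orders add up the same grid of values, so the two results should agree up to floating-point rounding.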

Rules of integration in multiple dimensions

Just like for univariate integration, multivariate integration follows the rules of linearity. For instance, for a 2-dimensional integration:

\begin{align} \textrm{Addition/Subtraction}:& \int\int [f(x,y) \pm g(x,y)] \,\textrm{d}(x,y) = \int\int f(x,y) \,\textrm{d}(x,y) \pm \int\int g(x,y) \,\textrm{d}(x,y)\\[3 ex] \textrm{Scalar multiplication}:& \int\int cf(x,y) \,\textrm{d}(x,y) = c\int\int f(x,y) \,\textrm{d}(x,y) \end{align}
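The two rules can be checked numerically. The sketch below assumes a midpoint Riemann sum over the unit square; the helper `double_integral`, the second function `g(x, y) = xy` and the constant `scale` are hypothetical choices made only for illustration.

```python
import numpy as np

def double_integral(func, x_lo, x_hi, y_lo, y_hi, n=200):
    """Midpoint Riemann sum of func over [x_lo, x_hi] x [y_lo, y_hi]."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    x = x_lo + (np.arange(n) + 0.5) * hx
    y = y_lo + (np.arange(n) + 0.5) * hy
    return func(x[:, None], y[None, :]).sum() * hx * hy

def f(x, y):
    return np.exp(-x**2 - y**2)

def g(x, y):
    return x * y

scale = 3.0  # the constant c in the scalar multiplication rule

# Addition: integral of (f + g) versus sum of the two integrals
lhs_sum = double_integral(lambda x, y: f(x, y) + g(x, y), 0, 1, 0, 1)
rhs_sum = double_integral(f, 0, 1, 0, 1) + double_integral(g, 0, 1, 0, 1)

# Scalar multiplication: integral of c*f versus c times the integral of f
lhs_scaled = double_integral(lambda x, y: scale * f(x, y), 0, 1, 0, 1)
rhs_scaled = scale * double_integral(f, 0, 1, 0, 1)

print(lhs_sum, rhs_sum)
print(lhs_scaled, rhs_scaled)
```

Each pair should match up to rounding, because a Riemann sum is itself a linear operation on the function values.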

Integration by parts and the change of variables technique can also be applied in multiple dimensions. The next example demonstrates the latter in action.

Example

Let's try to integrate the function we were working with above:

\begin{equation} A = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} e^{- x_1^2 - x_2^2}\,\textrm{d}(x_1, x_2). \end{equation}

We start by applying the change of variables technique using a polar coordinate transformation where

\begin{align} x_1 &= r\cos\theta\\[3 ex] x_2 &= r\sin\theta\\[3 ex] \textrm{d}(x_1, x_2) &= r\,\textrm{d}(r, \theta) \end{align}

Quick note: if you are wondering why $\textrm{d}(x_1, x_2) = r\,\textrm{d}(r, \theta)$, think about what is necessary to define an infinitesimal area in 2-d Cartesian coordinates versus polar coordinates. In 2-d Cartesian coordinates, a small area is fully determined by $\Delta x_1$ and $\Delta x_2$. In polar coordinates, however, $\Delta r$ and $\Delta \theta$ are not sufficient: the area also depends on $r$ itself. For instance, consider a ring $S$ defined by $1<r<2$ and another ring $T$ defined by $100<r<101$, both covering the full range of $\theta$. Even though $\Delta r=1$ and $\Delta \theta = 2\pi$ for both regions, their areas are quite different.

\begin{align} A_S &= \pi 2^2 - \pi 1^2 \approx 9.42 \\[3 ex] A_T &= \pi 101^2 - \pi 100^2 \approx 631.46 \end{align}
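To see the role of the factor $r$ concretely, the sketch below computes the two ring areas as Riemann sums over a grid in $(r, \theta)$, once with the factor $r$ and once without it. The function name `ring_area` and the grid resolution are assumptions made for illustration.

```python
import numpy as np

def ring_area(r_min, r_max, n=200, use_r_factor=True):
    """Riemann sum of the polar area element over r_min < r < r_max, 0 <= theta < 2*pi."""
    dr = (r_max - r_min) / n
    dtheta = 2 * np.pi / n
    r = r_min + (np.arange(n) + 0.5) * dr   # midpoints in r
    theta = (np.arange(n) + 0.5) * dtheta   # midpoints in theta
    R, _ = np.meshgrid(r, theta, indexing="ij")
    # With the factor r, each cell contributes r * dr * dtheta;
    # without it, each cell contributes only dr * dtheta.
    weight = R if use_r_factor else np.ones_like(R)
    return weight.sum() * dr * dtheta

area_S = ring_area(1, 2)
area_T = ring_area(100, 101)
naive_S = ring_area(1, 2, use_r_factor=False)
naive_T = ring_area(100, 101, use_r_factor=False)

print(area_S, area_T)    # compare with A_S and A_T above
print(naive_S, naive_T)  # identical for both rings when r is left out
```

Without the factor $r$, both rings would be assigned the same area $\Delta r \cdot 2\pi$, which is why $\Delta r$ and $\Delta \theta$ alone cannot describe area in polar coordinates.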

Coming back to our integral, we can now write

\begin{align} A &= \int_0^{2\pi} \int_0^{+\infty} e^{- (r\cos\theta)^2 - (r\sin\theta)^2}r\,\textrm{d}(r, \theta)\\[3 ex] &= \int_0^{2\pi} \int_0^{+\infty} e^{- r^2(\cos^2\theta + \sin^2\theta)} r\,\textrm{d}(r, \theta)\\[3 ex] &= \int_0^{2\pi} \int_0^{+\infty} e^{- r^2} r\,\textrm{d}(r, \theta). \end{align}

Let's break down what we have done above:

We adjusted the integration intervals to the domains of the new variables, i.e. $r \in \mathbb{R}_0^+$ and $\theta \in [0, 2\pi]$, and replaced $\textrm{d}(x_1, x_2)$ with $r\,\textrm{d}(r, \theta)$.
We factored out $r^2$ in the exponent so that we can use the identity $\cos^2\theta + \sin^2\theta = 1$.

Applying Fubini's theorem, we can convert the double integral into an iterated integral, yielding

\begin{align} A &= \int_0^{+\infty} \int_0^{2\pi} e^{- r^2} r\,\textrm{d}\theta\,\textrm{d}r. \end{align}

The integral with respect to $\theta$ is trivial and evaluates to $2\pi$ , allowing us to write:

\begin{align} A &= 2\pi \int_0^{+\infty} r e^{- r^2}\,\textrm{d}r. \end{align}

If we write $2\pi$ as $\pi \cdot 2$ and move the factor 2 into the integral, you'll notice we can apply the change of variables rule. Here, we take $g(r)=r^2$, and thus $g'(r)=2r$. We also take $f(u)=e^{-u}$, whose primitive is $F(u)=-e^{-u}$. The integrand $2r e^{-r^2}$ is then $f(g(r))\,g'(r)$, which has primitive $F(g(r))$. By the change of variables rule, we can thus write:

\begin{align} A &= \pi \int_0^{+\infty} 2r e^{- r^2}\,\textrm{d}r\\[3 ex] &= \pi \left( \lim_{r \to +\infty} F(g(r)) - F(g(0)) \right)\\[3 ex] &= \pi \left( \lim_{r \to +\infty} \left(-e^{-r^2}\right) - (-e^{0}) \right)\\[3 ex] &= \pi \left( 0 - (-1) \right)\\[3 ex] &= \pi. \end{align}

We will see this integral later in the probability course, when we cover the normal distribution.
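The last step of the example can be sanity-checked numerically. The sketch below compares a midpoint Riemann sum of $2r e^{-r^2}$ with the primitive evaluated at the limits, taking $g(r) = r^2$ and $F(u) = -e^{-u}$. The finite upper limit `R = 6.0`, standing in for $+\infty$, and the helper `midpoint` are assumptions made for illustration.

```python
import math

def g(r):
    return r**2

def F(u):
    # Primitive of f(u) = e^{-u}
    return -math.exp(-u)

def integrand(r):
    # f(g(r)) * g'(r) = 2 r e^{-r^2}
    return 2 * r * math.exp(-r**2)

def midpoint(func, a, b, n=20_000):
    """Midpoint Riemann sum of func over [a, b] with n cells."""
    h = (b - a) / n
    return sum(func(a + (i + 0.5) * h) for i in range(n)) * h

R = 6.0  # finite stand-in for +infinity; e^{-36} is negligible
numeric = midpoint(integrand, 0.0, R)
closed_form = F(g(R)) - F(g(0.0))
A = math.pi * closed_form

print(numeric, closed_form, A)
```

Both `numeric` and `closed_form` should be very close to 1, so `A` should be very close to $\pi$.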

Approximating higher-dimensional integrals

Just like with univariate integration, we can use Riemann sums to approximate multiple integrals. Let's approximate the integral above over an increasingly large integration region. If we are correct, we should see that the approximation tends to $\pi$ as we enlarge the region.
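A minimal sketch of such an approximation is given below, assuming a midpoint Riemann sum over the square $[-L, L] \times [-L, L]$ with cells of width $\epsilon$. The function name `riemann_2d`, the parameter `half_width` (playing the role of $L$) and the particular values tried are illustrative assumptions.

```python
import numpy as np

def riemann_2d(half_width, eps=0.01):
    """Midpoint Riemann sum of exp(-x1^2 - x2^2) over [-half_width, half_width]^2."""
    # Number of cells per axis, then the exact cell width that tiles the interval
    n = int(round(2 * half_width / eps))
    eps = 2 * half_width / n
    x = -half_width + (np.arange(n) + 0.5) * eps
    # values[i, j] = f(x[i], x[j])
    values = np.exp(-x[:, None]**2 - x[None, :]**2)
    # Each cell has area eps^2, so the function values are scaled by eps^2
    return values.sum() * eps**2

for half_width in [0.5, 1, 2, 3, 5]:
    print(half_width, riemann_2d(half_width))

print("pi =", np.pi)
```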

Note that the Riemann sum scales the values of $f$ by $\epsilon^2$, where $\epsilon$ is the width of each grid cell, since each term now covers a small square of area $\epsilon^2$ rather than an interval of length $\epsilon$. As expected, our approximation gets progressively closer to $\pi$ as we increase the range over which we are approximating the integral.

In summary, in this section we introduced multiple integrals by working through a 2-dimensional example. We also saw how 2-dimensional integrals can be approximated by Riemann sums.

It is of course possible to integrate in higher-dimensional spaces, but this naturally becomes trickier. Luckily, a basic understanding of integrals in 1 and 2 dimensions is all we need for the next modules, so we won't go into more advanced integration topics here.
