
Integration as the inverse of differentiation


In the differential calculus course, we learned about derivatives and how they can be interpreted as the rate of change in the value of a function around a particular point. We also saw how derivatives can be approximated by the method of finite differences, where we approximate the derivative by subtracting the values of the function at two different but very close points and dividing the result by the distance between them:

\begin{equation} \frac{df}{dx} \approx \frac{f(x+\epsilon) - f(x)}{\epsilon} \end{equation}

where $\epsilon$ is a very small number.

To make this more concrete, let's look at an example with $f(x) = x^2$. We will start by approximating the derivative of $f$, $\frac{df}{dx}$, via the method of finite differences with $\epsilon = 0.01$.
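The original code cell is not shown here, but a minimal sketch of the computation could look like the following (the grid of $x$ values and its range are assumptions; only $\epsilon = 0.01$ and $f(x) = x^2$ come from the text):

```python
import numpy as np

eps = 0.01
x = np.arange(-2.0, 2.0, eps)  # the interval [-2, 2) is an assumption


def f(x):
    return x**2


# Finite-difference approximation of df/dx: (f(x + eps) - f(x)) / eps
df_approx = (f(x + eps) - f(x)) / eps

# For f(x) = x^2 the exact derivative is 2x, so the error should be tiny (about eps)
print(np.max(np.abs(df_approx - 2 * x)))
```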

As expected, our approximation of $\frac{df}{dx}$ looks pretty much like a straight line with a slope of 2, since $\frac{d(x^2)}{dx} = 2x$.

Now here is something to think about: what is the inverse operation of differentiation? That is, what is the operation that, once applied to $\frac{df}{dx}$, returns the original function $f$?

Before formally defining this operation, let's try to think about it intuitively. As we have seen above, the approximation of the derivative tells us the rate at which the function $f(x)$ is changing. For instance, if we know the value of $f(a)$, then we know that the value of $f(a + \epsilon)$ will be the value of $f(a)$ plus how much the function has changed between $a$ and $a+\epsilon$, which we can approximate as $\epsilon\left.\frac{df}{dx}\right|_{x=a}$. Thus, we can write

\begin{equation} f(a + \epsilon) \approx f(a) + \epsilon\left.\frac{df}{dx}\right|_{x=a}. \end{equation}

Likewise, the value of the function at the point $x = a + 2\epsilon$ could be retrieved by considering how much the function has changed between $a + \epsilon$ and $(a + \epsilon) + \epsilon$:

\begin{align} f(a + 2\epsilon) &\approx f(a + \epsilon) + \epsilon\left.\frac{df}{dx}\right|_{x=a + \epsilon} \\[3ex] &= f(a) + \epsilon\left.\frac{df}{dx}\right|_{x=a} + \epsilon\left.\frac{df}{dx}\right|_{x=a + \epsilon} \\[3ex] &= f(a) + \epsilon \left( \left.\frac{df}{dx}\right|_{x=a} + \left.\frac{df}{dx}\right|_{x=a + \epsilon}\right). \end{align}

Repeating this argument $n$ times, $f(a + n\epsilon)$ is given by

\begin{equation} f(a + n\epsilon) \approx f(a) + \epsilon \sum_{i=1}^n \left.\frac{df}{dx}\right|_{x=a + (i-1)\epsilon}. \end{equation}

If we look at the last term in the previous equation, we see that it represents the cumulative sum of $\frac{df}{dx}$. Thus, all we need to do to reconstruct $f(x)$ is to accumulate our approximation of the derivative, scale it by $\epsilon$ and offset it by $f(a)$.

Let's see this in practice by recovering $f(x)$ starting with $a=-2$ and thus $f(a)=4$:
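Again, the original cell is not reproduced, but a hedged sketch of the reconstruction (reusing eps, x and df_approx from the snippet above, which are assumptions about how the notebook was set up) might look like this:

```python
# Reconstruct f(x) by accumulating the derivative approximation, scaling by eps,
# and offsetting by f(a), with a = -2 and f(a) = (-2)**2 = 4
f_a = 4.0
f_hat = f_a + eps * np.concatenate(([0.0], np.cumsum(df_approx[:-1])))

# Compare against the true function values; the match is essentially exact here,
# because the forward differences of x**2 telescope when accumulated
print(np.max(np.abs(f_hat - x**2)))
```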

Our approximation $\hat{f}(x)$ looks pretty much the same as $f(x)$, which is great! But we sort of cheated a bit, because our approximation of $f(x)$ relies on knowing the value of $f(a)$. Without knowing $f(a)$, the best we can do is to approximate $f(x)$ up to a constant offset.

This is an inevitable consequence of differentiation: the derivative only captures differences between nearby values of the function, not the values themselves. It turns out that recovering the original function up to an offset is the best we can do. For instance, the functions $x^2+5$ and $x^2-1$ have exactly the same derivative, so all we can say is that a function whose derivative is $\frac{df}{dx} = 2x$ is of the form $f(x) = x^2 + C$, where $C$ is a constant.

The process we have been looking at is called integration. Integrating a given function $f(x)$ provides us with a primitive (also known as the antiderivative or indefinite integral) of that function, which can be differentiated to obtain the original function $f(x)$. In this sense, integration is the inverse operation of differentiation.

Formally, we can write that $F(x)$ is a primitive of $f(x)$ if

\begin{equation} F(x) + C = \int f(x) \,\textrm{d}x \end{equation}

where $\int$ represents integration, $C$ is an arbitrary constant, and $\textrm{d}x$ denotes the variable with respect to which we integrate (mirroring the notation used in differentiation).
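As a quick sanity check of this relationship, a symbolic toolkit can recover a primitive for us. This is a small sketch under the assumption that SymPy is available; note that SymPy returns a particular primitive and omits the arbitrary constant $C$:

```python
import sympy as sp

xs = sp.symbols("x")

F = sp.integrate(2 * xs, xs)  # a primitive of f(x) = 2x
print(F)                      # x**2 (SymPy omits the arbitrary constant C)
print(sp.diff(F, xs))         # 2*x: differentiating the primitive recovers f
```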

Definite integrals

The type of integration above is called indefinite because it applies to the function $f(x)$ as a whole, i.e. across its entire domain. However, it is possible to integrate a function over a particular interval of interest; this is known as definite integration. Formally, the definite integral of a function over the interval $[a, b]$ is defined as:

\begin{equation} \int_a^b f(x)\,\textrm{d}x = F(b) - F(a) \end{equation}

where $F(x) = \int f(x)\,\textrm{d}x$ is the indefinite integral of the function $f(x)$.

But what does this represent? As we have seen before, for a very small $\epsilon$, the value of $F(x)$ can be thought of as the cumulative sum of the values of $f(x)$ in increments of $\epsilon$ up to the value $x$. Thus, $F(b)$ can be thought of as the sum of the values of the function $f(x)$ up to the value $b$, and $F(a)$ can be thought of as the sum of the values of $f(x)$ up to the value $a$. Assuming $b > a$, when we subtract $F(a)$ from $F(b)$, the cumulative sum of all the values up to $a$ cancels out and we are left with the cumulative sum of the values of $f(x)$ in the interval $[a, b]$.

To make things a bit more concrete, let's take a look at the function $f(x) = 2x$ in the interval $[1, 3]$.
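The plot itself is not reproduced here, but a small sketch of the underlying computation illustrates the idea (the starting point of the accumulation and the step size are assumptions); the exact answer is $F(3) - F(1) = 9 - 1 = 8$:

```python
import numpy as np

eps = 0.01

# Accumulate f(x) = 2x from 0 up to b = 3 and up to a = 1 (starting point assumed)
x_b = np.arange(0.0, 3.0, eps)
x_a = np.arange(0.0, 1.0, eps)

F_b = eps * np.sum(2 * x_b)  # cumulative sum of f up to b
F_a = eps * np.sum(2 * x_a)  # cumulative sum of f up to a

print(F_b - F_a)             # close to the exact area, F(3) - F(1) = 9 - 1 = 8
```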

The plot above depicts the values of $f(x)$ that are accumulated to compute $F(b)$ and $F(a)$, in orange and green respectively. As you can see, all the values up to $x=3$ are used to calculate $F(b)$. Conversely, all the values up to $x=1$ are used to compute the value of $F(a)$. If we look closely, we can thus say that $F(b) - F(a)$ approximates the area under the curve of $f(x)$ in the interval $[1, 3]$.

This approximation is known as the Riemann sum, which can be formally written as

\begin{equation} \sum_{k=1}^n f(c_k)\,\epsilon_k \end{equation}

where $c_k$ is the coordinate of the $k$-th point, $\epsilon_k$ is the increment size at the $k$-th point, and $n$ is the number of points. If we keep the increment size constant for all $k$, then the Riemann sum becomes:

\begin{equation} \epsilon \sum_{k=1}^n f(c_k). \end{equation}
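To see the effect of the increment size, here is one more hedged sketch (the particular increment sizes are choices made here, not taken from the original) showing the constant-increment Riemann sum for $f(x) = 2x$ on $[1, 3]$ approaching the exact value of 8 as $\epsilon$ shrinks:

```python
import numpy as np


def riemann_sum(f, a, b, eps):
    """Left-endpoint Riemann sum with a constant increment eps."""
    c = np.arange(a, b, eps)  # left endpoint c_k of each increment
    return eps * np.sum(f(c))


for eps in (0.5, 0.1, 0.01, 0.001):
    print(eps, riemann_sum(lambda x: 2 * x, 1.0, 3.0, eps))
# The sums approach the exact value of 8 as eps decreases
```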

Summary

In this section, we have introduced integration as the inverse of differentiation. We have also seen how definite integration can be understood as the area under the curve over a particular interval $[a, b]$. But note that we have been working exclusively with approximations of differentiation and integration! In the next sections, we will look at how to integrate functions analytically whenever possible and we will cover the most important rules of integration.