
Determinants


If you have studied high-school mathematics, you may remember covering a topic called the determinant of a matrix. Determinants are a bit peculiar: at first, they are easier to memorise than to understand. In this section, we will cover the main properties of determinants and try to provide some intuition on how they relate to linear transformations.

Definition of determinant

Consider an $n$-by-$n$ square matrix $\mathbf{A}$:

\begin{equation} \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{bmatrix} \end{equation}

The determinant of such a matrix is given by the formula:

\begin{equation} \det(\mathbf{A}) = \sum_{\sigma \in \mathcal{S}_n} \left[ \text{sgn}(\sigma) \prod_{i=1}^{n} a_{i\sigma(i)} \right] \end{equation}

which is a pain to understand, so let's break it down.

  • $\mathcal{S}_n$ is the set of permutations of the $n$ indices of the columns (or rows) of the matrix $\mathbf{A}$. For a 2-by-2 matrix, this would be $\mathcal{S}_2 = \{(1, 2), (2, 1)\}$. For a 3-by-3 matrix, we would have $\mathcal{S}_3 = \{(1, 2, 3), (1, 3, 2), (3, 2, 1), (3, 1, 2), (2, 1, 3), (2, 3, 1)\}$.
  • $\text{sgn}$ is a function that returns the sign of a permutation. The sign can be defined as $(-1)^{\text{Inv}(\sigma)}$, where $\text{Inv}(\sigma)$ denotes the number of inversions of order between the sequence $\sigma$ and the sequence of $n$ elements ordered ascendingly. For instance, $\text{Inv}((1, 2, 3))$ would return 0 since the elements are sorted in ascending order. Conversely, $\text{Inv}((1, 3, 2))$ would return 1 since there is one inversion relative to the sorted sequence (3 appears before 2). $\text{Inv}((3, 2, 1))$ would return 3 since there are three inversions relative to the sorted sequence (3 appears before 2, 3 appears before 1, 2 appears before 1).
  • If the number of inversions is even, the sign of the term is positive; if the number of inversions is odd, the term is negative. (A short code sketch of this computation follows the list.)
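
To make the formula concrete, here is a minimal Python sketch (standard library only) that computes a determinant directly from the definition: it enumerates every permutation, counts inversions to obtain the sign, and sums the signed products. The names `inversions` and `leibniz_det` are ours, chosen purely for illustration.

```python
from itertools import permutations

def inversions(sigma):
    """Count pairs (i, j) with i < j but sigma[i] > sigma[j]."""
    return sum(
        1
        for i in range(len(sigma))
        for j in range(i + 1, len(sigma))
        if sigma[i] > sigma[j]
    )

def leibniz_det(A):
    """Determinant of a square matrix A (a list of rows), computed
    directly from the permutation formula: sum over all permutations
    sigma of sgn(sigma) * A[0][sigma(0)] * ... * A[n-1][sigma(n-1)]."""
    n = len(A)
    total = 0
    for sigma in permutations(range(n)):
        sign = (-1) ** inversions(sigma)
        product = 1
        for i in range(n):
            product *= A[i][sigma[i]]
        total += sign * product
    return total
```

This runs in $O(n! \cdot n)$ time, so it is only useful for tiny matrices and for mirroring the formula, not as a practical algorithm.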

Example calculation

As an illustrative example, let's analytically calculate the determinant for the matrix defined below:

\begin{equation} \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} 10 & 3 & 1 \\ 1 & 7 & 2 \\ 2 & 0 & 9 \end{bmatrix} \end{equation}

Taking the set $\mathcal{S}_3 = \{(1, 2, 3), (1, 3, 2), (3, 2, 1), (3, 1, 2), (2, 1, 3), (2, 3, 1)\}$ as the set of permutations of indices, let's look at the values and the signs of each term in the formula:

  • $(1, 2, 3)$: $(-1)^0 \times a_{11}a_{22}a_{33} = 10 \times 7 \times 9 = 630$
  • $(1, 3, 2)$: $(-1)^1 \times a_{11}a_{23}a_{32} = -(10 \times 2 \times 0) = 0$
  • $(3, 2, 1)$: $(-1)^3 \times a_{13}a_{22}a_{31} = -(1 \times 7 \times 2) = -14$
  • $(3, 1, 2)$: $(-1)^2 \times a_{13}a_{21}a_{32} = 1 \times 1 \times 0 = 0$
  • $(2, 1, 3)$: $(-1)^1 \times a_{12}a_{21}a_{33} = -(3 \times 1 \times 9) = -27$
  • $(2, 3, 1)$: $(-1)^2 \times a_{12}a_{23}a_{31} = 3 \times 2 \times 2 = 12$

Now we add all terms to get:

\begin{equation} \det(\mathbf{A}) = 630 + 0 - 14 + 0 - 27 + 12 = 601 \end{equation}
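
As a sanity check, we can compare the hand calculation against the sketch above and, assuming `numpy` is available, against `np.linalg.det`:

```python
import numpy as np

A = [[10, 3, 1],
     [1, 7, 2],
     [2, 0, 9]]

print(leibniz_det(A))              # 601 (exact, integer arithmetic)
print(np.linalg.det(np.array(A)))  # ~601.0 (floating point)
```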

Properties of determinants

Now that we've worked through a calculation, let's look at some important properties of determinants.

Property 1: Transposing a matrix has no effect on its determinant

Formally, $\det(\mathbf{A}) = \det(\mathbf{A}^T)$. This results from the fact that the values of the different terms and their signs do not change in the transpose. For instance, for the permutation $(1, 3, 2)$, the product $a_{11}a_{23}a_{32}$ will become $a_{11}a_{32}a_{23}$, which is the same. The sign also remains the same: in general, transposing turns each permutation into its inverse, which has the same number of inversions and therefore the same sign, and the permutation $(1, 3, 2)$ is its own inverse.

Property 2: Swapping two columns (or rows) in a matrix changes the sign of its determinant

Contrary to the transpose, swapping two columns changes the number of inversions in each permutation in a way that flips the sign of that permutation. To see this, first consider swapping two neighbouring columns: in each term of the formula for the determinant, this either introduces or removes exactly one inversion, depending on the order in which the factors of the two columns appear in the term. If they were in order, swapping creates an inversion; if they were inverted, swapping removes that inversion. Either way, the sign of the term flips.

The same holds for swapping any two columns in a determinant. This is due to the fact that any such swap can be expressed as an odd number of swaps between neighbouring columns (moving one column past the $k$ columns in between and then moving the other back takes $2k + 1$ neighbouring swaps), producing the overall effect of flipping the sign of each of the terms in the determinant.

Property 3: A determinant with two identical columns is equal to 0

We know from property 2 that swapping any two columns in a determinant flips the sign of the determinant. However, swapping two columns that are identical leaves the matrix, and hence the determinant, unchanged. The only value that satisfies both $D = -D$ and $D = D$ is zero.
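
These first three properties are easy to check numerically. A quick sketch, assuming `numpy` and reusing our example matrix:

```python
import numpy as np

A = np.array([[10.0, 3.0, 1.0],
              [1.0, 7.0, 2.0],
              [2.0, 0.0, 9.0]])

# Property 1: transposing leaves the determinant unchanged.
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))

# Property 2: swapping two columns flips the sign.
B = A[:, [1, 0, 2]]  # swap the first two columns
assert np.isclose(np.linalg.det(B), -np.linalg.det(A))

# Property 3: two identical columns force a zero determinant.
C = A.copy()
C[:, 2] = C[:, 0]  # make the third column a copy of the first
assert np.isclose(np.linalg.det(C), 0.0)
```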

Property 4: Linearity

If a column can be decomposed as a linear combination $p\mathbf{x} + q\mathbf{y}$ of two column vectors $\mathbf{x}$ and $\mathbf{y}$ with coefficients $p$ and $q$, then the determinant of the matrix can be expressed as $D = pD_{1} + qD_{2}$, where $D_{1}$ and $D_{2}$ are the determinants of matrices identical to the original except that the decomposed column is replaced by $\mathbf{x}$ and $\mathbf{y}$, respectively.

For a 3-by-3 matrix, this property can be written as:

\begin{equation} \det \left( \begin{bmatrix} a_{11} & p \times x_{1} + q \times y_{1} & a_{13} \\ a_{21} & p \times x_{2} + q \times y_{2} & a_{23} \\ a_{31} & p \times x_{3} + q \times y_{3} & a_{33} \end{bmatrix} \right) = p \det \left( \begin{bmatrix} a_{11} & x_{1} & a_{13} \\ a_{21} & x_{2} & a_{23} \\ a_{31} & x_{3} & a_{33} \end{bmatrix} \right) + q \det \left( \begin{bmatrix} a_{11} & y_{1} & a_{13} \\ a_{21} & y_{2} & a_{23} \\ a_{31} & y_{3} & a_{33} \end{bmatrix} \right) \end{equation}

Why is this the case? If every element of a given column $j$ can be written as such a linear combination, each term of the determinant can be rewritten as:

\begin{equation} a_{\alpha_{1}1} a_{\alpha_{2}2} \cdots (p x_{\alpha_{j}} + q y_{\alpha_{j}}) \cdots a_{\alpha_{n}n} \end{equation}

Expanding the equation above, we have:

\begin{equation} p \, a_{\alpha_{1}1} a_{\alpha_{2}2} \cdots x_{\alpha_{j}} \cdots a_{\alpha_{n}n} + q \, a_{\alpha_{1}1} a_{\alpha_{2}2} \cdots y_{\alpha_{j}} \cdots a_{\alpha_{n}n} \end{equation}

The first term is the corresponding term of the determinant of the original matrix with the $j$-th column substituted by $\mathbf{x}$; the second term is the corresponding term with the $j$-th column substituted by $\mathbf{y}$. Summing over all permutations gives $D = pD_{1} + qD_{2}$.
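
A numerical check of this property, with arbitrarily chosen values for $p$, $q$, $\mathbf{x}$ and $\mathbf{y}$ (the helper `with_column` exists only for this illustration):

```python
import numpy as np

p, q = 2.0, -3.0
x = np.array([1.0, 4.0, 2.0])
y = np.array([0.0, 5.0, 1.0])

def with_column(col):
    """The example matrix from above with its second column replaced."""
    M = np.array([[10.0, 0.0, 1.0],
                  [1.0, 0.0, 2.0],
                  [2.0, 0.0, 9.0]])
    M[:, 1] = col
    return M

lhs = np.linalg.det(with_column(p * x + q * y))
rhs = p * np.linalg.det(with_column(x)) + q * np.linalg.det(with_column(y))
assert np.isclose(lhs, rhs)
```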

Property 5: If a column of a matrix is the zero vector, then its determinant is zero

This is easy to see because all the terms of the determinant would have at least one factor equal to 0, which means that all terms would vanish.
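
A one-line check, in the same spirit as the sketches above:

```python
import numpy as np

Z = np.array([[10.0, 0.0, 1.0],
              [1.0, 0.0, 2.0],
              [2.0, 0.0, 9.0]])  # the second column is the zero vector

print(np.linalg.det(Z))  # 0.0: every product in the formula includes
                         # one factor taken from the zero column
```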

Determinants and linear transformations

Now that we have a formal definition of the determinant of a matrix, let's focus on what it can tell us about linear transformations. Consider the unit square defined by the vectors $\mathbf{i} = [1, 0]^T$ and $\mathbf{j} = [0, 1]^T$.

Now let's transform the two vectors using the following transformation matrix:

\begin{equation} \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \end{equation}

We have seen before that the columns of the matrix tell us the destination of the basis vectors, so this matrix moves the basis vector $\mathbf{i} = [1, 0]^T$ to $[a_{11}, a_{21}]^T$ and $\mathbf{j} = [0, 1]^T$ to $[a_{12}, a_{22}]^T$. What the precise transformation looks like depends on the values of $\mathbf{A}$, so for demonstration purposes let's assume the following:

\begin{equation} \mathbf{A} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \end{equation}

This would move the basis vectors to $[2, 1]^T$ and $[1, 2]^T$ respectively, which together span a parallelogram.

Now let's compute the area of the resulting parallelogram. We will use a simple geometric approach that relies on enclosing the parallelogram in a bounding rectangle and decomposing the space around it into simple regions.

The area of the parallelogram can be obtained by subtracting the areas of the surrounding regions (two rectangles and four triangles) from the area of the enclosing region (which in this case is a square, but could also be a rectangle).

  • The total area of the enclosing region is given by $(a_{11} + a_{12})(a_{21} + a_{22})$
  • The area of the top-left rectangle is given by $a_{12}a_{21}$
  • The area of the bottom-right rectangle is given by $a_{12}a_{21}$
  • The area of the left triangle is given by $a_{12}a_{22}/2$
  • The area of the right triangle is given by $a_{12}a_{22}/2$
  • The area of the bottom triangle is given by $a_{11}a_{21}/2$
  • The area of the top triangle is given by $a_{11}a_{21}/2$

Putting it all together:

\begin{align} \text{Area} & = (a_{11} + a_{12})(a_{21} + a_{22}) - 2a_{12}a_{21} - a_{12}a_{22} - a_{11}a_{21} \\ & = a_{11}a_{22} - a_{12}a_{21} \end{align}

The transformation $\mathbf{A}$ thus had the effect of scaling the area of the unit square by $a_{11}a_{22} - a_{12}a_{21}$. You probably noticed that this corresponds to the determinant of the transformation matrix $\mathbf{A}$. Indeed, the determinant of a transformation matrix tells us the scaling factor associated with the transformation, i.e. how much space stretches or shrinks under that transformation.
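
We can confirm the area interpretation numerically: transform the corners of the unit square with $\mathbf{A}$ and measure the image with the shoelace formula (the helper `polygon_area` is our own, written for this check):

```python
import numpy as np

def polygon_area(points):
    """Signed area of a polygon (shoelace formula), vertices in order."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y)

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Corners of the unit square, in counter-clockwise order.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
image = square @ A.T  # apply A to each corner (rows are points)

print(polygon_area(image))  # 3.0
print(np.linalg.det(A))     # 3.0: the area scaling factor
```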

When the determinant is negative, its absolute value still represents the extent to which the transformation stretches or shrinks space, but its sign indicates that the transformation flips the orientation of space.

Finally, what happens if the determinant is zero? In this case, the transformation maps the input space onto a lower-dimensional one. To understand this, remember that a determinant is zero when two of its columns are linearly dependent. That means that the images of the basis vectors under that transformation would be linearly dependent (also called collinear). Let's see this in the 2-D case: consider the transformation:

\begin{equation} \mathbf{A} = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix} \end{equation}

The two columns $[2, 1]^T$ and $[4, 2]^T$ are linearly dependent, since $[4, 2]^T = 2 \times [2, 1]^T$. This transformation maps the unit vectors $\mathbf{i} = [1, 0]^T$ and $\mathbf{j} = [0, 1]^T$ onto $[2, 1]^T$ and $[4, 2]^T$ respectively, so both land on the same line.

With the vectors $[2, 1]^T$ and $[4, 2]^T$ we are no longer able to represent every vector in 2-D space, which means that they are not a basis of that space. Instead, we can only represent vectors that sit along the same line as those vectors. Thus, under the transformation above, we can no longer measure areas (2-D space) but only distances (1-D space). In other words, this transformation maps a higher-dimensional space onto a lower-dimensional one. In the next sections we will see some interesting implications of such transformations.
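
A quick numerical confirmation that this transformation collapses the plane onto a line, again assuming `numpy`:

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [1.0, 2.0]])

print(np.linalg.det(A))          # 0.0: areas collapse to zero
print(np.linalg.matrix_rank(A))  # 1: the image of the plane is a line

# Every output of the transformation is a multiple of [2, 1]:
print(A @ np.array([3.0, -1.0])) # [2. 1.]
```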