Linear Systems Pt.1 : Systems of Linear Equations and Deriving Linear Transformations

Posted by John Hill on Mon 18 November 2019 Updated on Mon 18 November 2019

Most of the time in engineering, we're focused on solving complex problems. But what exactly makes a problem "complex"? Is it that complex problems involve a topic most people lack the fundamentals to understand? Probably not -- you don't need a room full of domain experts to figure out why the car won't start. The average person can deduce that the spark plug died, the engine flooded, the tank is empty, the battery is dead, and/or the alternator quit (among a myriad of other possible simple issues).

Is it that complex problems are really just a group of simple problems whose solutions need to work in tandem with one another? In my opinion, that's frequently the case: going back to the car-starting problem, each component solves its own simple problem, but all of those solutions must work together to get something as complex as an internal combustion engine to operate effectively.

Mathematics serves as a foundation to understand and manipulate the world around us. And the world is full of problems to be solved, so it stands to reason that mathematics provides a means to understand and manipulate problems. This is the spirit of applied mathematics: its focus is upon the application of reasoning and theory on tangible problems to increase our understanding of those problems, manipulate the problems into something more manageable (easier to solve), and ultimately provide the means to solve them. In a sense, this gives us the tools to solve any problem we can describe as a mathematical relationship.

To that end, complex problems are simply a group of mathematical relationships whose solutions must work with each other! When we have a group of interrelated mathematical relationships, we'll call this a mathematical system. In a sense, a system is any mechanism in which multiple parts work together to achieve a common goal, hence how the word is used in mathematics.

Linear Equations

The simplest form of a mathematical relationship is an equation; we're just stating that one side of the relationship is identical to the other. To make our lives easier, we're going to focus on a specific sort of equation: the linear equation: $a_1 \cdot x_1 + ... + a_n \cdot x_n = b$, which we'll abbreviate as $ax = b$. This has a few important properties that we'll make use of in a bit:

  • Invariance to Scaling: $k \cdot (a_1 \cdot x_1 + ... + a_n \cdot x_n) = k \cdot b = k(ax)$ - This means that scaling/multiplying one side of the equation by a value $k$ is equivalent to scaling/multiplying the other side by that same value
  • Associative Property: $k(ax) = (ka)x$ - This means that it doesn't matter whether we compute $ax$ first or $ka$ first; we'll get the same result
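The two properties above can be checked numerically. Here's a minimal NumPy sketch using hypothetical coefficients $a$ and values $x$ (the specific numbers are just for illustration):

```python
import numpy as np

# A linear equation a.x = b with hypothetical coefficients and values
a = np.array([2.0, -1.0, 3.0])
x = np.array([1.0, 4.0, 2.0])
b = a @ x  # 2 - 4 + 6 = 4.0

k = 5.0

# Invariance to scaling: k * (a.x) equals k * b
assert np.isclose(k * (a @ x), k * b)

# Associativity: computing a.x first or k*a first gives the same result
assert np.isclose(k * (a @ x), (k * a) @ x)

print(b)  # 4.0
```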

When there is a system of linear equations, we're describing multiple equations that should have a relationship with each other (more on that in a moment). Let's consider a toy example that floats around on /r/facepalm:

When I was 6, my sister was half my age. Now I'm 70, how old is my sister?

We could describe it with the following system of equations:

$$ \begin{cases} 2x_1 = 6 \\ 2x_1 + x_2 = 70 \end{cases} $$

As simple as this is, just by reading the question we know that 64 years have elapsed since the first fact, and mentally we can say, "Oh, well your sister was 3 years old when you were 6." The system above looks a little complex compared to that mental reasoning, but what's really cool is that it shows the reasoning has to be consistent:

$$ \text{Let } x_1 = 3, x_2 = 64 \text{ then } \\ \begin{cases} 2(3) = 6 \\ 2(3) + 64 = 70 \end{cases} $$

To make this convenient, an entire branch of algebra is dedicated to this, which is aptly named linear algebra. Given the above properties, we can describe this far more mathematically and introduce some nice operations:

$$ A = \begin{bmatrix} 2 & 0 \\ 2 & 1 \end{bmatrix} \\ x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \\ b = \begin{bmatrix} 6 \\ 70 \end{bmatrix} \\ Ax = b = \begin{bmatrix} 2 & 0 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 6 \\ 70 \end{bmatrix} $$

And we can use a technique from linear algebra to determine $x$ knowing only the values of $A$ and $b$: $x = A^{-1} b$. I won't go over how to invert $A$, but I do want to mention a few things regarding that:
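Here's what that looks like in NumPy for the sister-age system. (In practice, `np.linalg.solve` is preferred over explicitly forming $A^{-1}$; it computes the same answer more stably.)

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [2.0, 1.0]])
b = np.array([6.0, 70.0])

# Solve Ax = b; equivalent to x = inv(A) @ b but more numerically stable
x = np.linalg.solve(A, b)
print(x)  # [ 3. 64.]
```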

  • In order for a solution to exist for $x$, the system of equations must have the following properties:
    • Linear Independence - No equation can be a scaled version (or, more generally, a linear combination) of the others
    • Consistency - The equations cannot contradict each other (e.g., reduce to something like 2 = 3); geometrically, an inconsistent pair of equations corresponds to two parallel lines that never intersect

When those conditions don't hold, the transform matrix $A$ will be singular, and inverting it is analogous to dividing by zero.
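We can see this in code. A sketch with a deliberately singular matrix (the second row is the first scaled by 2, violating linear independence):

```python
import numpy as np

# Singular system: row 2 is row 1 scaled by 2 (linearly dependent)
A_singular = np.array([[1.0, 2.0],
                       [2.0, 4.0]])

# The determinant of a singular matrix is zero
print(np.linalg.det(A_singular))

# Attempting to invert it raises an error, much like dividing by zero
try:
    np.linalg.inv(A_singular)
except np.linalg.LinAlgError as err:
    print(err)  # Singular matrix
```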

Linear Transformations

As hinted earlier, we can treat the matrix $A$ as a transform: if we have values for $x$, we can come up with a new pair of values $b$, and come up with a different puzzle. For example, if we supply $x = [9, 42]$, then that means when you were 18, your sister was half your age, and today you're 60.
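Verifying that forward direction in NumPy, using the same $A$ from the age puzzle:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [2.0, 1.0]])

# Supplying new values for x generates a new puzzle b = Ax
x_new = np.array([9.0, 42.0])
b_new = A @ x_new
print(b_new)  # [18. 60.]
```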

Let's look at a practical problem: we have two images of roughly the same object, but the camera moved a little bit between the two. This has a few effects when comparing the images. Using the first image as the baseline, the second image is shifted (because we moved across), scaled (because we moved toward the object), and rotated (because we don't have good balance).

In [1]:
import numpy as np
import cv2
from matplotlib import pyplot as plt
%matplotlib inline

# Read as grayscale so matplotlib's cmap='gray' actually applies
simA = cv2.imread('../images/simA.jpg', cv2.IMREAD_GRAYSCALE)
simB = cv2.imread('../images/simB.jpg', cv2.IMREAD_GRAYSCALE)

f1, ax1 = plt.subplots(1,2, figsize=(17,7))
ax1[0].imshow(simA, cmap='gray')
ax1[0].axis('off')
ax1[1].imshow(simB, cmap='gray')
ax1[1].axis('off')
f1.suptitle('Similarity Transformed Frames')
plt.show();

The above image pair is a bit of an extreme case of such a change. But speaking in terms of geometry, we say that the second image is a similarity transform of the first: fundamentally, it's an image of the same thing, just warped consistently so that all of the angles and ratios of line lengths are preserved.

The similarity transform is the composition of three separate transforms, $S = S_s S_t S_r$:

  • Scaling $$ S_s = \alpha \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} $$

  • Translation $$ S_t = \begin{bmatrix} 1 & 0 & \Delta_u \\ 0 & 1 & \Delta_v \end{bmatrix} $$

  • Rotation $$ S_r = \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix} $$

Interestingly, there are only four degrees of freedom in this transformation: $\alpha$, $\theta$, $\Delta_u$, and $\Delta_v$. Composing the scaling and rotation gives $a = \alpha \cos \theta$ and $b = \alpha \sin \theta$, with $c = \Delta_u$ and $d = \Delta_v$, so we can summarize the transform as follows:

$$ S = \begin{bmatrix} a & -b & c \\ b & a & d \end{bmatrix} $$

So, for any pixel coordinate $(u,v)$, we can come up with a transformed coordinate $(u',v')$ using just the four values described. To facilitate this, we'll consider a homogeneous image coordinate $(u,v,1)$ so that we can transform a 3x1 vector into the 2x1 final coordinate. This is because translation by itself is not a linear operation in 2D! But if we go to a higher dimension, we can assume the point is locked to a plane (hence the 1 as the last element of the vector), which makes the operation linear.
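A quick sketch of building $S$ from the four degrees of freedom and applying it to a homogeneous coordinate. The parameter values and the pixel coordinate here are hypothetical, chosen only to illustrate the mechanics:

```python
import numpy as np

# Hypothetical similarity parameters: scale, rotation, translation
alpha, theta = 1.2, np.deg2rad(10.0)
du, dv = 5.0, -3.0

# The four degrees of freedom collapse into (a, b, c, d)
a = alpha * np.cos(theta)
b = alpha * np.sin(theta)
c, d = du, dv

S = np.array([[a, -b, c],
              [b,  a, d]])

# Transform a pixel coordinate (u, v) via its homogeneous form (u, v, 1)
u, v = 100.0, 50.0
u_p, v_p = S @ np.array([u, v, 1.0])
print(u_p, v_p)
```

Note that $\sqrt{a^2 + b^2}$ recovers $\alpha$, which is one way to read the scale back out of the compact $(a, b, c, d)$ form.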

Estimating the Transform

Having the transform is awesome. But that's not the problem we're attempting to solve here! Instead, all we have is a set of coordinates from the original image and the corresponding points in the second. We need to recover the transform, because it tells us how the camera moved in the scene.

In [2]:
simFeats = cv2.imread('../images/simFeaturePoints.png')

plt.figure(figsize=(17,7))
plt.imshow(simFeats)
plt.axis('off')
plt.title('Similarity Images with Feature Points')
plt.show();

Let's try making a new system of equations from this; that is

$$ \begin{bmatrix} u' \\ v' \end{bmatrix} = \begin{bmatrix} a & -b & c \\ b & a & d \end{bmatrix} \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} $$

becomes

$$ \begin{cases} u' = ua - vb + c \\ v' = va + ub + d \end{cases} $$

So, for a single point, we have two equations and four unknowns. This is an underdetermined (ill-posed) problem; we can't find a single solution for this system just yet. But if we have two points, we'll have just enough information to determine all of the parameters:

$$ \begin{cases} u_1' = u_1 a - v_1 b + c \\ v_1' = v_1 a + u_1 b + d \\ u_2' = u_2 a - v_2 b + c \\ v_2' = v_2 a + u_2 b + d \end{cases} $$

In order for this to work, the pairs $(u_1, v_1) \rightarrow (u_1', v_1')$ and $(u_2, v_2) \rightarrow (u_2', v_2')$ need to be unique and independent. That is, the same point $(u_1, v_1)$ cannot map to two different transform points (or vice-versa) and that those pairs cannot be a scaled version of each other.

Now you might ask yourself, "Wait, I have four equations, and each equation only has three unknowns." That's OK! The missing unknown simply has a zero coefficient. To better illustrate this, let's recharacterize the system of equations above as a matrix equation:

$$ \begin{bmatrix} u_1' \\ v_1' \\ u_2' \\ v_2' \end{bmatrix} = \begin{bmatrix} u_1 & -v_1 & 1 & 0 \\ v_1 & u_1 & 0 & 1 \\ u_2 & -v_2 & 1 & 0 \\ v_2 & u_2 & 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} $$

Now it's becoming clear: if the pairs of points were not unique and independent, the 4x4 matrix above wouldn't be invertible! Consequently, we can find $(a, b, c, d)$ with the following equation:

$$ \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} u_1 & -v_1 & 1 & 0 \\ v_1 & u_1 & 0 & 1 \\ u_2 & -v_2 & 1 & 0 \\ v_2 & u_2 & 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} u_1' \\ v_1' \\ u_2' \\ v_2' \end{bmatrix} $$
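We can sanity-check this estimation procedure in NumPy by generating two correspondences from a known transform and recovering its parameters. The ground-truth values and point coordinates below are hypothetical, chosen just to exercise the math:

```python
import numpy as np

# Hypothetical ground-truth parameters, used only to generate a test case
a_t, b_t, c_t, d_t = 0.9, 0.3, 12.0, -4.0
S = np.array([[a_t, -b_t, c_t],
              [b_t,  a_t, d_t]])

# Two independent point correspondences (u, v) -> (u', v')
pts = np.array([[10.0, 20.0], [80.0, 35.0]])
pts_p = np.array([S @ np.array([u, v, 1.0]) for u, v in pts])

# Build the 4x4 matrix from the two source points
(u1, v1), (u2, v2) = pts
M = np.array([[u1, -v1, 1.0, 0.0],
              [v1,  u1, 0.0, 1.0],
              [u2, -v2, 1.0, 0.0],
              [v2,  u2, 0.0, 1.0]])
rhs = pts_p.reshape(-1)  # [u1', v1', u2', v2']

# Solve for the transform parameters
a, b, c, d = np.linalg.solve(M, rhs)
print(a, b, c, d)  # approximately (0.9, 0.3, 12.0, -4.0)
```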

Conclusion

Using this sort of problem manipulation allowed us to use one kind of problem to solve a different (though related) problem. That is, by understanding how one image changes into another, we can turn the problem around and deduce how the camera moved between frames!

There are lots of other uses for linear systems which will be covered in future posts. These include:

  • Linearizing systems using state equations
  • Working with over-constrained systems
  • Solving optimization problems with constraints
