🗓️ August 2024👀  loading

Math Classics: The Triangle Inequality

Yet another proof of the triangle inequality.

poster for: math classics: the triangle inequality
linear-algebramath

In this article, we're proving the triangle inequality, which can be understood as follows:

The shortest distance between two points is a straight line.

I like this theorem because it feels intuitive ... like it needs to be true.

So let's state this theorem in mathematical language. For any two vectors u,vRn\vec{u},\vec{v}\in\R^n, the length of their sum vector (the direct path) is always shorter than the sum of their indivdual path lengths (the detour):

u+vu+v||\vec{u} + \vec{v}|| \leq ||\vec{u}|| + ||\vec{v}||

Here, :RnR||\cdot|| : \R^n \to \R denotes a vector's norm, defined by x=xx||\vec{x}|| = \sqrt{\vec{x} \cdot \vec{x}}, where we use the xy\vec{x}\cdot\vec{y} notation for the dot product :Rn×RnR:\R^n\times\R^n\to\R of two vectors.

Lemma: The Cauchy-Schwarz Inequality

First we'll be proving that the absolute value of a dot product is less than the product of the individual vector's norms, which is known as the Cauchy-Schwarz inequality:

uvuv(u,vRn)|\vec{u}\cdot\vec{v}| \leq ||\vec{u}||\cdot||\vec{v}||\:\:\:\:\:\:\Big(\forall\vec{u},\vec{v}\in\R^n\Big)

Given any two vectors u,vRn\vec{u}, \vec{v} \in \R^n, let us define w=utv\vec{w} = \vec{u} - t\vec{v}, for some arbitrary scalar tRt\in\R. By distributivity and scalar multiplication of the dot product, we have

w2=ww=(utv)(utv)=uu2tuv+t2vv\begin{aligned} ||\vec{w}||^2 &= |\vec{w}\cdot\vec{w}|\\ &= |(\vec{u} - t\vec{v})\cdot(\vec{u} - t\vec{v})|\\ &= |\vec{u}\cdot\vec{u} - 2t\vec{u}\cdot\vec{v} + t^2\vec{v}\cdot\vec{v}| \end{aligned}

We're about to use the fact that a+ba+b|a + b| \leq |a| + |b| for any a,bRa,b\in\R.

Why is this true? Note how aa|a|\geq a and bb|b|\geq b, which implies that abab|a|\cdot|b|\geq a\cdot b. It follows that a2+2ab+b2a2+2ab+b2a^2 + 2|a||b| + b^2 \geq a^2 + 2ab + b^2, implying that (a+b)2(a+b)2(|a| + |b|)^2 \geq (a + b)^2, from which it follows that a+ba+b|a| + |b|\geq |a + b|

Continuing from our previous point:

w2=uu2tuv+t2vvuu+2tuv+t2vv=u2+2tuv+t2v2\begin{aligned} ||\vec{w}||^2 &= |\vec{u}\cdot\vec{u} - 2t\vec{u}\cdot\vec{v} + t^2\vec{v}\cdot\vec{v}|\\ &\leq |\vec{u}\cdot\vec{u}| + 2t|\vec{u}\cdot\vec{v}| + t^2|\vec{v}\cdot\vec{v}|\\ &= ||\vec{u}||^2 + 2t|\vec{u}\cdot\vec{v}| + t^2||\vec{v}||^2 \end{aligned}

And, since w20||\vec{w}||^2 \geq 0, it follows that

u2+2tuv+t2v20||\vec{u}||^2 + 2t|\vec{u}\cdot\vec{v}| + t^2||\vec{v}||^2 \geq 0

The key observation here, is that this is a quadratic inequality in the variable tt. Geometrically speaking, this inequality states that a parabola, parameterized by tt, should cross the x-axis at most once.

graph of a red parabola hovering above the x-axis but not quite touching it

Since the well known quadratic formula tells us that a function f(x)=ax2+bx+cf(x) = ax^2 + bx + c intersects with the x-axis at coordinates

x1,x2=12a(b±b24ac)x_1,x_2 = \dfrac{1}{2a}\Bigg(-b \pm \sqrt{b^2 - 4ac}\Bigg)

it follows that, for the above inequality to hold, we must have b24ac0b^2 - 4ac \leq 0, i.e.:

(2uv)24u2v20\big(2|\vec{u}\cdot\vec{v}|\big)^2 - 4||\vec{u}||^2||\vec{v}||^2 \leq 0

By rearranging some terms, we find that (uv)2(uv)2(\vec{u}\cdot\vec{v})^2 \leq (||\vec{u}||\cdot||\vec{v}||)^2, from which the Cauchy-Schwarz inequality follows.

Proof for The Triangle Inequality

By using the Cauchy-Schwarz inequality, the proof for the triangle inequality becomes relatively straightforward.

Let u\vec{u} and v\vec{v} be arbitrary vectors in Rn\R^n. Then:

u+v2=(u+v)(u+v)=uu+2uv+vv=u2+2uv+v2u2+2uv+v2u2+2uv+v2(Cauchy-Schwarz)=(u+v)2\begin{aligned} ||\vec{u} + \vec{v}||^2 &= (\vec{u}+\vec{v}) \cdot (\vec{u}+\vec{v})\\ &= \vec{u}\cdot\vec{u} + 2\vec{u}\cdot\vec{v} + \vec{v}\cdot\vec{v}\\ &= ||\vec{u}||^2 + 2\vec{u}\cdot\vec{v} + ||\vec{v}||^2\\ &\leq ||\vec{u}||^2 + 2|\vec{u}\cdot\vec{v}| + ||\vec{v}||^2\\ &\leq ||\vec{u}||^2 + 2||\vec{u}||||\vec{v}|| + ||\vec{v}||^2\:\:\:\:\text{(Cauchy-Schwarz)}\\ &= \big(||\vec{u}|| + ||\vec{v}||\big)^2 \end{aligned}

From this, the triangle inequality follows: u+vu+v||\vec{u} + \vec{v}|| \leq ||\vec{u}|| + ||\vec{v}||