
Tensor Algebra

A few hours ago, a man named Kevin sat across from me and explained to me the beauty of mathematics and tensor algebra. Since then I’ve hardly been able to get his words out of my mind.

One thing he really liked was the abbreviated notation. I believe there are several systems for tensor notation, but the one I remember best is the Einstein summation convention. As James O’Brien states, it “takes a while to get used to,” but the simplicity should eventually make life much easier. So let’s pay the O(1) upfront cost in order to reap those O(n) benefits. :)

Note: much of the motivation for new notation comes from the fact that tensors of rank 3+ don’t translate well to standard matrix-vector representations. Pioneers of the past have devised several other means for dealing with this, but according to Kevin most of them are overly contrived or “hacky.”

For example, vectorization requires too many conversions, and often involves things like Kronecker products simply to express multiplication. Plus, the identity for row-major vectorization is \(\texttt{vec}(ABC) = (A \otimes C^T)\texttt{vec}(B)\). I rest my case.
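As a quick sanity check of that identity (a NumPy sketch of my own, not something Kevin showed me), note how much Kronecker-product bookkeeping even a simple triple product requires:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = rng.random((2, 3)), rng.random((3, 4)), rng.random((4, 5))

# Row-major vectorization: concatenate the rows (NumPy's default ravel order).
vec = lambda M: M.ravel(order="C")

# vec(A B C) == (A kron C^T) vec(B)
assert np.allclose(vec(A @ B @ C), np.kron(A, C.T) @ vec(B))
```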

In Einstein summation notation, there are both free indices and dummy indices. A free index appears once in a term and implies a loop. (We’ll have an output for every value of the index.) A dummy index appears twice within a term and indicates that we should sum the term over all values of that index.

For example, the dot product \(a \cdot b\) can be written as \(a_ib_i\). The matrix-vector product \(Ab\) can be written as \(A_{ij}b_j\). The product \(RAR^T\) can be written as \(A_{kl}R_{ik}R_{jl}\). Clearly, the number of free indices equates to the order of the final result. (When there are no free indices, it’s a scalar. When there’s one, it’s a vector. And so on…)
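To make those three examples concrete, here’s how they look with NumPy’s einsum (a small sketch; the matrices and vectors are just random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.random(3), rng.random(3)
A, R = rng.random((3, 3)), rng.random((3, 3))

# a_i b_i: i is a dummy index and gets summed; no free indices, so the result is a scalar.
assert np.isclose(np.einsum("i,i->", a, b), a @ b)

# A_ij b_j: j is summed, i is free, so the result is a vector.
assert np.allclose(np.einsum("ij,j->i", A, b), A @ b)

# A_kl R_ik R_jl: k and l are summed, i and j are free, so the result is a matrix (R A R^T).
assert np.allclose(np.einsum("kl,ik,jl->ij", A, R, R), R @ A @ R.T)
```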

The gradient, i.e. the first derivatives of a real-valued function of several variables, can be written as \(\nabla_if = \partial_if = \frac{\partial f}{\partial x_i}\). The Jacobian, i.e. the first derivatives of a vector-valued function, is \(\partial_j f_i\). The Hessian, i.e. the second derivatives of a real-valued function, is \(\partial_i \partial_j f\).
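As a concrete (made-up) example: for the linear map \(f_i = A_{ij}x_j\) the Jacobian is \(\partial_j f_i = A_{ij}\), and for the quadratic form \(f = x_iA_{ij}x_j\) the Hessian is \(\partial_i\partial_j f = A_{ij} + A_{ji}\). A quick finite-difference check of both:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 3))
x = rng.random(3)
eps = 1e-4

# Jacobian of f_i = A_ij x_j via central differences: should recover A_ij.
jac = np.empty((3, 3))
for j in range(3):
    dx = np.zeros(3)
    dx[j] = eps
    jac[:, j] = (A @ (x + dx) - A @ (x - dx)) / (2 * eps)
assert np.allclose(jac, A)

# Hessian of f = x_i A_ij x_j via central differences: should recover A_ij + A_ji.
f = lambda v: v @ A @ v
hess = np.empty((3, 3))
for i in range(3):
    for j in range(3):
        di, dj = np.zeros(3), np.zeros(3)
        di[i], dj[j] = eps, eps
        hess[i, j] = (f(x + di + dj) - f(x + di - dj)
                      - f(x - di + dj) + f(x - di - dj)) / (4 * eps**2)
assert np.allclose(hess, A + A.T, atol=1e-6)
```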

Tensor notation also makes it a lot easier to comprehend identities such as \(\nabla \cdot (\nabla \times v) = 0\) and \(\nabla \times (\nabla s) = 0\). Just plug in the equivalent values in Einstein summation notation!
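To fill in that step: the curl is \((\nabla \times v)_i = \epsilon_{ijk}\partial_j v_k\), where \(\epsilon_{ijk}\) is the Levi-Civita symbol. Then \(\nabla \cdot (\nabla \times v) = \partial_i(\epsilon_{ijk}\partial_j v_k) = \epsilon_{ijk}\partial_i\partial_j v_k = 0\), since \(\epsilon_{ijk}\) is antisymmetric in \(i\) and \(j\) while \(\partial_i\partial_j\) is symmetric. The same argument kills \((\nabla \times \nabla s)_i = \epsilon_{ijk}\partial_j\partial_k s\).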

This came up for Kevin when he was reading A Material Point Method for Snow Simulation, especially with the stress and energy tensors. In section 6, for instance, they take the second derivative of a tensor with respect to a second-order tensor (a matrix), which doesn’t make any sense to me right now. Apparently, tensor algebra helps in making sense of it.

In summary, standard linear algebra is only easy to think about for constructs like matrices and vectors and scalars. For tensors of higher orders, we’ll want to use tensor algebra. In some cases, this avoids having to deal with Hadamard products and the like.
