# Reading Spatial Transforms

This article covers different ways to interpret homogeneous spatial transforms and provides appropriate notational practices for each.

## Homogeneous Matrices

In robot kinematics, coordinate frames are commonly used to keep track of the position and orientation of entities in space. A coordinate frame $\{O, (\hat{i}, \hat{j}, \hat{k})\}$ in 3D Euclidean space is commonly defined by an origin $O$ and orthonormal basis vectors $(\hat{i}, \hat{j}, \hat{k})$.

With two frames $\mathsf{F}$ and $\mathsf{F}'$ defined, one can measure the position and orientation of one frame relative to the other, or one can translate and rotate one frame relative to the other. Either task requires a representation for translation and rotation, for which a variety of formalisms exist.

A common representation used for forward kinematics is the homogeneous matrix. Recall the familiar homogeneous transformation matrix $\mathrm{T}$ as used in forward kinematics,

$$ \mathrm{T} = \begin{bmatrix} \mathrm{R} & \mathbf{d} \\ \mathbf{0} & 1 \end{bmatrix} = \begin{bmatrix} r_{xx'} & r_{xy'} & r_{xz'} & d_x \\ r_{yx'} & r_{yy'} & r_{yz'} & d_y \\ r_{zx'} & r_{zy'} & r_{zz'} & d_z \\ 0 & 0 & 0 & 1 \end{bmatrix} $$

The matrix $\mathrm{T}$ measures the state of frame $\mathsf{F}' = \{o', (\hat{x}', \hat{y}', \hat{z}')\}$ relative to frame $\mathsf{F} = \{o, (\hat{x}, \hat{y}, \hat{z})\}$. Here $\mathsf{F}'$ is the measured frame and $\mathsf{F}$ is the reference frame. Furthermore:

- $\mathrm{R}$ and $\mathbf{d}$ are the rotation and translation components respectively.
- $\mathrm{T} \in \mathsf{SE}(3)$ and $\mathrm{R} \in \mathsf{SO}(3)$, denoting the special Euclidean and special orthogonal groups respectively.
- $r_{uv} = \hat{\mathbf{u}} \cdot \hat{\mathbf{v}} = \cos \measuredangle(\hat{\mathbf{u}}, \hat{\mathbf{v}})$ quantifies the relative orientation between two basis unit vectors through the oriented (typically, anticlockwise) angle $\measuredangle(\hat{\mathbf{u}}, \hat{\mathbf{v}})$ at $\hat{\mathbf{v}}$ measured from $\hat{\mathbf{u}}$.
- $d_u = \overrightarrow{oo'} \cdot \hat{\mathbf{u}}$ is the translation between the two origins, at $o'$ measured from $o$, projected onto the direction of $\hat{\mathbf{u}}$.
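As a minimal sketch of the block structure above, the following pure-Python snippet assembles $\mathrm{T}$ from a rotation and a translation and checks that the rotation block belongs to $\mathsf{SO}(3)$. The helper names `make_transform`, `rot_z` and `is_rotation`, and the numeric values, are illustrative assumptions and not part of any particular library.

```python
import math

def make_transform(R, d):
    """Assemble a 4x4 homogeneous matrix from a 3x3 rotation R and a translation d."""
    return [list(R[i]) + [d[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def rot_z(theta):
    """Rotation by theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def is_rotation(R, tol=1e-9):
    """Check that R R^T = I and det(R) = +1, i.e. that R is in SO(3)."""
    for i in range(3):
        for j in range(3):
            dot = sum(R[i][k] * R[j][k] for k in range(3))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    det = (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
           - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
           + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))
    return abs(det - 1.0) <= tol

R = rot_z(math.pi / 2)    # F' is F rotated 90 degrees about z (assumed example)
d = [1.0, 2.0, 0.0]       # origin o' measured from o (assumed example)
T = make_transform(R, d)
```

With these values the first column of `R` holds the coordinates of $\hat{x}'$ in $\mathsf{F}$, so the entry `T[1][0]` plays the role of $r_{yx'}$.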

## Interpretations

A general way to express the transform is $\mathrm{T}: x \mapsto y$ or $y = \mathrm{T} x$. The object $x$ is typically a point $p$, a vector $v$, or a frame $\mathsf{F}$.

There are two obvious ways to interpret the transformation $\mathrm{T}: x \mapsto y$.

- The scene is __unchanged__, but the object is __measured__ from a different reference.
- The scene has __changed__, the object has been __moved__ in the same reference.

The notation above hides this nuance, so a different notation is written below for each case.

### Case 1: Change of Basis

$${}^\mathsf{target}x_0 = {}^\mathsf{target}\mathrm{T}_\mathsf{source} \; {}^\mathsf{source}x_0$$

The object $x_0$ is unchanged. It was measured in the frame $\mathsf{source}$ and now it is measured in the frame $\mathsf{target}$ by using $\mathrm{T}$.

### Case 2: State Operator

$${}^\mathsf{reference}x_1 = {}^\mathsf{reference}\mathrm{T}_{\mathrm{R}, \mathbf{d}} \; {}^\mathsf{reference}x_0$$

The object $x_0$ has been changed into the object $x_1$ using $\mathrm{T}$ by being rotated by $\mathrm{R}$ and translated by $\mathbf{d}$, all in the same frame $\mathsf{reference}$.

Therefore, $\mathrm{T}$ either converts an object’s representation $({}^{\mathsf{S}}\Box_0 \to {}^{\mathsf{T}}\Box_0)$ or modifies its state $({}^{\mathsf{R}}\Box_0 \to {}^{\mathsf{R}}\Box_1)$.

These two interpretations are further discussed in the next sections.

## Coordinate Transformation

Transform the coordinates of the *same* object $x_0$ from a $\mathsf{source}$ frame to a $\mathsf{target}$ frame.

- This measures the same object $x_0$ from a different location ($\mathsf{target}$), using a known location ($\mathsf{source}$).
- This is a change of basis that re-expresses the coordinates of $x_0$ in a new frame. No objects were moved and no new objects were created.
- In other words, the state of the workspace has not been modified. It is just expressed differently.

### Example: $^{\mathsf{A}}p = {}^{\mathsf{A}} \mathrm{T}_\mathsf{B} {}^{\mathsf{B}}p$

- The object $x$ here is a point $p$.
- There is one point $p$ and two frames $\mathsf{A}$ and $\mathsf{B}$.
- Transforms the coordinates of the **unchanged** point $p$ from frame $\mathsf{B}$ to frame $\mathsf{A}$.
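The change of basis in this example can be sketched numerically. The snippet below is only an illustration under assumed values: the pose `A_T_B`, the point `B_p` and the helper `apply` are hypothetical names chosen to mirror the notation.

```python
import math

def apply(T, p):
    """Multiply a 4x4 homogeneous matrix by a 3D point (implicit w = 1)."""
    ph = list(p) + [1.0]
    return [sum(T[i][k] * ph[k] for k in range(4)) for i in range(3)]

# Assumed pose of frame B measured from frame A:
# rotated 90 degrees about z, with its origin at (1, 2, 0).
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
A_T_B = [[c,  -s,  0.0, 1.0],
         [s,   c,  0.0, 2.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]

B_p = [1.0, 0.0, 0.0]      # coordinates of p in frame B
A_p = apply(A_T_B, B_p)    # coordinates of the same p in frame A
```

Here `A_p` is approximately `[1, 3, 0]`, while `B_p` is left as it was: the point has not moved, only its description has.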

## State Operator

Create a *new* state $\Box_1$ from an input state $\Box_0$, both measured in the *same* frame $\mathsf{reference}$.

- This means either modifying the state of an existing object, or creating a new object using an old one.
- This is an operator, which is actively changing the scene, by moving or creating objects.
- The state of the workspace has been modified, by mutating existing objects or introducing new ones.

For a more programmatic illustration, see this pseudocode,

```
# Create new object
object2 ← transform(object1);
# Modify an existing object
object1 ← transform(object1);
```

- In the first assignment, `object2` is assigned a state representing a transformed `object1`.
- The function `transform` is assumed to read `object1`’s state by copy, modify it, and then return it; it does **not** modify it in-place, nor delete it.
- In the second assignment, `object1`’s state is updated, *i.e.*, it is translated and/or rotated.
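A runnable Python sketch of the same idea follows. It assumes, purely for illustration, that an object is a dictionary with a `position` entry and that `transform` takes the operator matrix as a second argument; neither detail comes from the pseudocode itself.

```python
import copy

def transform(obj, T):
    """Return a transformed copy of obj; the argument itself is left untouched."""
    new_obj = copy.deepcopy(obj)
    ph = list(obj["position"]) + [1.0]
    new_obj["position"] = [sum(T[i][k] * ph[k] for k in range(4)) for i in range(3)]
    return new_obj

# Assumed operator: a pure translation by (0, 0, 1).
T = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 1.0],
     [0.0, 0.0, 0.0, 1.0]]

object1 = {"name": "cube", "position": [1.0, 2.0, 3.0]}
original = object1                 # extra reference, kept only to inspect the old state

# Create new object: object1 keeps its state.
object2 = transform(object1, T)

# Modify an existing object: the name object1 is rebound to the moved state.
object1 = transform(object1, T)
```

After both assignments, `original` still holds the position `[1.0, 2.0, 3.0]`, which shows that `transform` worked on a copy, while `object1` and `object2` both hold `[1.0, 2.0, 4.0]`.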

### Example: ${}^{\mathsf{A}}v'=\mathrm{T}_{\mathrm{R}, \mathbf{d}} \; {}^{\mathsf{A}}v$

- There is one frame $\mathsf{A}$ and two vectors $v$ and $v'$.
- This changes the vector using rotation $\mathrm{R}$ and translation $\mathbf{d}$ into a *new vector* in the **same** frame $\mathsf{A}$.

### Example: ${}^{\mathsf{A}} \mathrm{T}_\mathsf{B'} = \mathrm{T}_{\mathrm{R}, \mathbf{d}} \; {}^{\mathsf{A}} \mathrm{T}_\mathsf{B}$

- There are three frames $\mathsf{A}$, $\mathsf{B}$ and $\mathsf{B'}$.
- This encodes the pose of a transformed output frame $\mathsf{B'}$ measured from the same frame $\mathsf{A}$.
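The frame example can be sketched in the same spirit. The matrices below are assumed values, and `matmul`, `A_T_B`, `T_op` and `A_T_Bprime` are hypothetical names that mirror the symbols in the heading.

```python
def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

# Assumed pose of frame B measured from frame A:
# same orientation as A, origin at (1, 0, 0).
A_T_B = [[1.0, 0.0, 0.0, 1.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]

# Assumed operator: rotate 90 degrees about the z axis of A, then translate by (0, 0, 2).
T_op = [[0.0, -1.0, 0.0, 0.0],
        [1.0,  0.0, 0.0, 0.0],
        [0.0,  0.0, 1.0, 2.0],
        [0.0,  0.0, 0.0, 1.0]]

# Pose of the moved frame B', still measured from A.
A_T_Bprime = matmul(T_op, A_T_B)
```

The last column of `A_T_Bprime` places the origin of $\mathsf{B'}$ at $(0, 1, 2)$ in $\mathsf{A}$, and `A_T_B` itself is not altered, so both poses remain available in the same reference frame.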

## Tasteful Notation

Notation often implies the preferred interpretation. Consider this example,

$${}^0\mathrm{T}_n = \prod_{k=0}^{n-1}{}^{k}\mathrm{T}_{k+1} = {}^\mathrm{0}\mathrm{T}_1 {}^\mathrm{1}\mathrm{T}_2 \cdots {}^{n-2}\mathrm{T}_{n-1} {}^{n-1}\mathrm{T}_n$$

There are $n+1$ different frames (numbered $0$ to $n$), and the pose of the same frame $n$ is simply re-expressed from one frame to the next.

Equivalently, rewrite this with the following notation,

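$${}^0\mathrm{F}_1 = {}^\mathrm{0}\mathrm{T}_1 {}^\mathrm{0}\mathrm{T}_2 \cdots {}^{0}\mathrm{T}_{n-1} {}^{0}\mathrm{T}_{n} {}^0\mathrm{F}_0$$

Here there is one reference frame (numbered $0$) and the frame $\mathrm{F}_0$ is moved $n$ times, once per factor, into its final pose $\mathrm{F}_1$. Each factor is now read as an operator acting in frame $0$ rather than as a relative pose.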

Both equations are perfectly equivalent; only the notation has changed. It is important to remember that the two interpretations are conceptually distinct and useful, but mathematically equivalent.
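This equivalence can be checked numerically with a small sketch. The two matrices `T1` and `T2` are assumed example values; the point is only that chaining them as relative poses and applying them as successive operators to an identity pose yield the same matrix.

```python
from functools import reduce

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Two assumed transforms, written once and read in two ways.
T1 = [[0.0, -1.0, 0.0, 1.0],   # rotate 90 degrees about z, translate by (1, 0, 0)
      [1.0,  0.0, 0.0, 0.0],
      [0.0,  0.0, 1.0, 0.0],
      [0.0,  0.0, 0.0, 1.0]]
T2 = [[1.0, 0.0, 0.0, 0.0],    # pure translation by (0, 2, 0)
      [0.0, 1.0, 0.0, 2.0],
      [0.0, 0.0, 1.0, 0.0],
      [0.0, 0.0, 0.0, 1.0]]

# Reading 1 (change of basis): chain the relative poses left to right.
chained = reduce(matmul, [T1, T2])

# Reading 2 (state operator): start from the pose of frame 0 in itself and
# apply the factors as operators, rightmost factor first.
pose = IDENTITY
for T in reversed([T1, T2]):
    pose = matmul(T, pose)
```

Both `chained` and `pose` end up as the same matrix, which is simply associativity of the matrix product; the choice of reading affects how the factors are described, not the result.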

## References

[1] Mortenson, *Geometric Transformations for 3D Modeling*, 2007.

[2] Spong, *Robot Modeling and Control*, 2020.