Calculus and vectors

Time-dependent vectors can be differentiated in exactly the same way that we differentiate scalar functions. For a time-dependent vector $\vec{a}(t)$, the derivative $\dot{\vec{a}}(t)$ is:

Vector derivative definition.

\[\begin{aligned} \dot{\vec{a}}(t) &= \frac{d}{dt} \vec{a}(t) = \lim_{\Delta t \to 0} \frac{\vec{a}(t + \Delta t) - \vec{a}(t)}{\Delta t}\end{aligned}\]

Note that vector derivatives are a purely geometric concept. They don't rely on any basis or coordinates, but are just defined in terms of the physical actions of adding and scaling vectors.

Vector derivatives shown as functions of $t$ and $\Delta t$. We can hold $t$ fixed and vary $\Delta t$ to see how the approximate derivative $\Delta\vec{a}/\Delta t$ approaches $\dot{\vec{a}}$. Alternatively, we can hold $\Delta t$ fixed and vary $t$ to see how the approximation changes depending on how $\vec{a}$ is changing.
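
The convergence described in this caption can also be checked numerically. The following is a minimal sketch, assuming a hypothetical example vector $\vec{a}(t) = \cos t\,\hat\imath + t^2\,\hat\jmath$ (not taken from the figure) stored as a tuple of fixed-basis components. It holds $t$ fixed and shrinks $\Delta t$:

```python
import math

def a(t):
    """Hypothetical example vector a(t) = cos(t) i + t^2 j, as an (x, y) tuple."""
    return (math.cos(t), t * t)

def a_dot_exact(t):
    """Exact derivative of the example vector, used only for comparison."""
    return (-math.sin(t), 2.0 * t)

def difference_quotient(f, t, dt):
    """Approximate derivative (f(t + dt) - f(t)) / dt using vector subtraction and scaling."""
    f0, f1 = f(t), f(t + dt)
    return tuple((p1 - p0) / dt for p0, p1 in zip(f0, f1))

def error(t, dt):
    """Length of the difference between the approximate and exact derivatives."""
    approx = difference_quotient(a, t, dt)
    exact = a_dot_exact(t)
    return math.hypot(approx[0] - exact[0], approx[1] - exact[1])

t = 1.0
steps = (0.1, 0.01, 0.001)
errors = [error(t, dt) for dt in steps]
for dt, err in zip(steps, errors):
    print(f"dt = {dt:g}: error = {err:.2e}")
```

The printed error should shrink roughly in proportion to $\Delta t$, which is the expected behavior of a one-sided difference quotient.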

We will use either the dot notation $\dot{\vec{a}}(t)$ (also known as Newton notation) or the full derivative notation $\frac{d\vec{a}(t)}{dt}$ (also known as Leibniz notation), depending on which is clearer and more convenient. We will often not write the time dependency explicitly, so we might write just $\dot{\vec{a}}$ or $\frac{d\vec{a}}{dt}$. See below for more details.

Newton versus Leibniz Notation

Most people know who Isaac Newton is, but perhaps fewer have heard of Gottfried Leibniz. Leibniz was a prolific mathematician and a contemporary of Newton. Both of them claimed to have invented calculus independently of each other, and this became the source of a bitter rivalry between the two of them. Each of them had different notation for derivatives, and both notations are commonly used today.

Leibniz notation is meant to be reminiscent of the definition of a derivative:

\[\frac{dy}{dt}=\lim_{\Delta t\rightarrow0}\frac{\Delta y}{\Delta t}.\]

Newton notation is meant to be compact:

\[\dot{y} = \frac{dy}{dt}.\]

Note that a superscribed dot always denotes differentiation with respect to time $t$. A superscribed dot is never used to denote differentiation with respect to any other variable, such as $x$.

But what about primes? A prime is used to denote differentiation with respect to a function's argument. For example, suppose we have a function $f=f(x)$. Then

\[f'(x) = \frac{df}{dx}.\]

Suppose we have another function $g=g(s)$. Then

\[g'(s) = \frac{dg}{ds}.\]

As you can see, while a superscribed dot always denotes differentiation with respect to time $t$, a prime can denote differentiation with respect to any variable; but that variable is always the function's argument.

Sometimes, for convenience, we drop the argument altogether. So, if we know that $y=y(x)$, then $y'$ is understood to be the same as $y'(x)$. This is sloppy, but it is very common in practice.

Each notation has advantages and disadvantages. The main advantage of Newton notation is that it is compact: it does not take a lot of effort to write a dot over a variable or a prime after it. However, the price you pay for convenience is clarity. The main advantage of Leibniz notation is that it is absolutely clear exactly which variable you are differentiating with respect to.

Leibniz notation is also very convenient for remembering the chain rule. Consider the following examples of the chain rule in the two notations:

\[\begin{aligned}&\text{Newton:}&\dot{y}=y'(x)\dot{x} \\ &\text{Leibniz:}&\frac{dy}{dt}=\frac{dy}{dx}\frac{dx}{dt}.\end{aligned}\]

Notice how, with Leibniz notation, you can imagine the $dx$'s "cancelling out" on the right-hand side, leaving you with $dy/dt$.

Derivatives and vector “positions”

When thinking about vector derivatives, it is important to remember that vectors don't have positions. Even if a vector is drawn moving about, this is irrelevant for the derivative. Only changes to length and direction are important.

Vector derivatives for moving vectors. Vector movement is irrelevant when computing vector derivatives.

Derivatives in components

In a fixed basis we differentiate a vector by differentiating each component:

Vector derivative in components

\[\dot{\vec{a}}(t) = \dot{a}_1(t) \,\hat{\imath} + \dot{a}_2(t) \,\hat{\jmath} + \dot{a}_3(t) \,\hat{k}\]

Writing a time-dependent vector expression in a fixed basis gives: \[\vec{a}(t) = a_1(t)\,\hat{\imath} + a_2(t) \,\hat{\jmath}.\] Using the definition #rvc-ed of the vector derivative gives: \[\begin{aligned} \dot{\vec{a}}(t) &= \lim_{\Delta t \to 0} \frac{\vec{a}(t + \Delta t) - \vec{a}(t)}{\Delta t} \\ &= \lim_{\Delta t \to 0} \frac{(a_1(t + \Delta t) \,\hat{\imath} + a_2(t + \Delta t) \,\hat{\jmath}) - (a_1(t) \,\hat{\imath} + a_2(t) \,\hat{\jmath})}{\Delta t} \\ &= \lim_{\Delta t \to 0} \frac{(a_1(t + \Delta t) - a_1(t)) \,\hat{\imath} + (a_2(t + \Delta t) - a_2(t)) \,\hat{\jmath}}{\Delta t} \\ &= \left(\lim_{\Delta t \to 0} \frac{a_1(t + \Delta t) - a_1(t)}{\Delta t} \right) \,\hat{\imath} + \left(\lim_{\Delta t \to 0} \frac{a_2(t + \Delta t) - a_2(t) }{\Delta t}\right) \,\hat{\jmath} \\ &= \dot{a}_1(t) \,\hat{\imath} + \dot{a}_2(t) \,\hat{\jmath} \end{aligned}\] The second-to-last line above is simply the definition of the scalar derivative, giving the scalar derivatives of the component functions $a_1(t)$ and $a_2(t)$.

Warning: Differentiating each component is only valid if the basis is fixed.

When we differentiate a vector by differentiating each component and leaving the basis vectors unchanged, we are assuming that the basis vectors themselves are not changing with time. If they are, then we need to take this into account as well.
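
A small numerical sketch of this warning, assuming a hypothetical rotating unit vector $\hat{u}(t) = \cos t\,\hat\imath + \sin t\,\hat\jmath$ and the vector $\vec{a}(t) = 2\,\hat{u}(t)$. The component of $\vec{a}$ along $\hat{u}$ is the constant 2, so differentiating that component alone would give zero, yet the vector is clearly changing:

```python
import math

def u_hat(t):
    """Hypothetical rotating basis vector, written in the fixed i, j basis."""
    return (math.cos(t), math.sin(t))

def a(t):
    """Example vector a(t) = 2 u_hat(t); its component along u_hat is the constant 2."""
    ux, uy = u_hat(t)
    return (2.0 * ux, 2.0 * uy)

t, h = 0.5, 1e-6

# Naive approach: differentiate the (constant) component in the rotating
# basis and leave u_hat unchanged. This gives the zero vector.
naive = (0.0, 0.0)

# Central-difference derivative computed in the fixed basis.
ax0, ay0 = a(t - h)
ax1, ay1 = a(t + h)
actual = ((ax1 - ax0) / (2 * h), (ay1 - ay0) / (2 * h))
actual_magnitude = math.hypot(actual[0], actual[1])

print("naive:", naive)
print("actual magnitude:", actual_magnitude)  # close to 2, not 0
```

The missing contribution comes entirely from the change of $\hat{u}$ itself, which the naive component-wise derivative ignores.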

The vector derivative decomposed into components. This demonstrates graphically that each component of a vector in a particular basis is simply a scalar function, and the corresponding derivative component is the regular scalar derivative.

Differentiating vector expressions

We can also differentiate more complicated vector expressions, using the sum and product rules. For vectors, the product rule applies to both the dot and cross products:

Product rule for dot-product derivatives.

\[\frac{d}{dt}(\vec{a} \cdot \vec{b}) = \dot{\vec{a}} \cdot \vec{b} + \vec{a} \cdot \dot{\vec{b}}\]

Product rule for cross-product derivatives.

\[\frac{d}{dt}(\vec{a} \times \vec{b}) = \dot{\vec{a}} \times \vec{b} + \vec{a} \times \dot{\vec{b}}\]

Example Problem: Differentiating vector product expressions.

What is $\displaystyle \frac{d}{dt} \left( \frac{\vec{b}}{\ell} \cdot (\vec{a} + \vec{a} \times \vec{c}) \right)$?

\[\begin{aligned} \frac{d}{dt} \left( \frac{\vec{b}}{\ell} \cdot (\vec{a} + \vec{a} \times \vec{c}) \right) &= \left(\frac{d}{dt} \frac{\vec{b}}{\ell}\right) \cdot (\vec{a} + \vec{a} \times \vec{c}) + \frac{\vec{b}}{\ell} \cdot \frac{d}{dt} (\vec{a} + \vec{a} \times \vec{c}) \\ &= \left(\frac{\ell \dot{\vec{b}} - \dot{\ell}\vec{b}}{\ell^2}\right) \cdot (\vec{a} + \vec{a} \times \vec{c}) + \frac{\vec{b}}{\ell} \cdot (\dot{\vec{a}} + \dot{\vec{a}} \times \vec{c} + \vec{a} \times \dot{\vec{c}}) \end{aligned}\]
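
The two product rules above can be sanity-checked numerically. This is a sketch under stated assumptions: the vectors $\vec{a}(t)$ and $\vec{b}(t)$ below are hypothetical examples with hand-computed derivatives, and the helper names are illustrative only. Each left-hand side is approximated with a central difference and compared with the right-hand side of the rule:

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(p + q for p, q in zip(u, v))

# Hypothetical example vectors and their exact derivatives.
def a(t): return (math.cos(t), math.sin(t), t)
def a_dot(t): return (-math.sin(t), math.cos(t), 1.0)
def b(t): return (t * t, 1.0, math.exp(-t))
def b_dot(t): return (2.0 * t, 0.0, -math.exp(-t))

def derivative(f, t, h=1e-5):
    """Central-difference derivative of a tuple-valued function f."""
    return tuple((p - m) / (2 * h) for p, m in zip(f(t + h), f(t - h)))

t = 0.7

# Dot product: wrap the scalar in a 1-tuple so the same helper can be reused.
lhs_dot = derivative(lambda s: (dot(a(s), b(s)),), t)[0]
rhs_dot = dot(a_dot(t), b(t)) + dot(a(t), b_dot(t))

# Cross product: the order of the factors is preserved in each term.
lhs_cross = derivative(lambda s: cross(a(s), b(s)), t)
rhs_cross = add(cross(a_dot(t), b(t)), cross(a(t), b_dot(t)))

print(lhs_dot, rhs_dot)
print(lhs_cross, rhs_cross)
```

Note that for the cross product the order of the factors matters, so each term keeps $\vec{a}$ (or $\dot{\vec{a}}$) on the left.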

The chain rule also applies to vector functions. This is helpful for parameterizing vectors in terms of arc-length $s$ or other quantities other than time $t$.

Chain rule for vectors.

\[\frac{d}{dt} \vec{a} (s(t)) = \frac{d\vec{a}}{ds} (s(t)) \frac{ds}{dt}(t) = \frac{d\vec{a}}{ds} \dot{s}\]

Example Problem: Chain rule.

A vector is defined in terms of an angle $\theta$ by $\vec{r}(\theta) = \cos\theta\,\hat\imath + \sin\theta\,\hat\jmath$. If the angle is given by $\theta(t) = t^3$, what is $\dot{\vec{r}}$?

We can use the chain rule to compute: \[\begin{aligned} \frac{d}{dt} \vec{r} &= \frac{d}{d\theta} \vec{r} \frac{d}{dt} \theta \\ &= \frac{d}{d\theta}\Big( \cos\theta\,\hat\imath + \sin\theta\,\hat\jmath \Big) \frac{d}{dt} (t^3) \\ &= \Big( -\sin\theta\,\hat\imath + \cos\theta\,\hat\jmath \Big) (3 t^2) \\ &= -3 t^2 \sin(t^3)\,\hat\imath + 3 t^2 \cos(t^3)\,\hat\jmath. \end{aligned}\]

Alternatively, we can evaluate $\vec{r}$ as a function of $t$ first and then differentiate it with respect to time, using the scalar chain rule for each component: \[\begin{aligned} \frac{d}{dt} \vec{r} &= \frac{d}{dt} \Big( \cos(t^3)\,\hat\imath + \sin(t^3)\,\hat\jmath \Big) \\ &= -3 t^2 \sin(t^3)\,\hat\imath + 3 t^2 \cos(t^3)\,\hat\jmath. \end{aligned}\]
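
As a quick check of this example, the chain-rule result can be compared with a central-difference derivative of $\vec{r}(\theta(t))$. This is only a sketch; the function names and the sample time are illustrative choices:

```python
import math

def r(theta):
    """r(theta) = cos(theta) i + sin(theta) j, as an (x, y) tuple."""
    return (math.cos(theta), math.sin(theta))

def theta(t):
    return t ** 3

def r_dot_chain_rule(t):
    """(dr/dtheta) * (dtheta/dt), following the worked example."""
    th = theta(t)
    theta_dot = 3.0 * t ** 2
    return (-math.sin(th) * theta_dot, math.cos(th) * theta_dot)

def r_dot_numeric(t, h=1e-6):
    """Central-difference derivative of r(theta(t)) with respect to t."""
    p, m = r(theta(t + h)), r(theta(t - h))
    return tuple((pi - mi) / (2 * h) for pi, mi in zip(p, m))

t = 1.2
exact = r_dot_chain_rule(t)
approx = r_dot_numeric(t)
max_difference = max(abs(e - n) for e, n in zip(exact, approx))
print(exact, approx, max_difference)
```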

Gottfried Leibniz, one of the two co-inventors of calculus, got the product rule wrong [Child, 1920, page 100; Cirillo, 2007]. In modern notation he computed the example \[\frac{d}{dx}\Big[(x^2 + bx)(cx + d)\Big] = (2x + b)c\] and he stated that in general it was obvious that \[\frac{d}{dx} (f g) = \frac{df}{dx} \frac{dg}{dx}.\] He later realized his error and corrected it [Cupillari, 2004], but at least we know that product rules are tricky and not obvious, even for someone smart enough to invent calculus.

References

J. M. Child. The Early Mathematical Manuscripts of Leibniz. Open Court Publishing, 1920.
M. Cirillo. Humanizing Calculus. The Mathematics Teacher, 101(1):23–27, 2007.
A. Cupillari. Another look at the rules of differentiation. Primus: Problems, Resources, and Issues in Mathematics Undergraduate Studies, 14(3):193–200, 2004. DOI: 10.1080/10511970408984087.

Derivative of magnitude versus magnitude of derivative

Because we often denote the magnitude of a vector by dropping the superscribed arrow, i.e.,

\[a = \left\|\vec{a}\right\|,\]

you might be tempted to think that $\dot{a}=\left\|\dot{\vec{a}}\right\|$. However, that is not the case. By definition,

\[\dot{a}=\frac{d}{dt}\left\|\vec{a}\right\|,\]

whereas

\[\left\|\dot{\vec{a}}\right\|=\left\|\frac{d\vec{a}}{dt}\right\|.\]

And these two quantities are not, in general, equal:

\[\left\|\frac{d\vec{a}}{dt}\right\|\neq\frac{d}{dt}\left\|\vec{a}\right\|.\]

This result should be familiar to you. Consider the case of uniform circular motion, in which a particle moves in a circle of radius $R$ with constant speed $v$. We know that, even though the speed of the particle is constant, the acceleration of the particle is not zero. In particular, the acceleration is centripetal (i.e., it points toward the center of the circle) and has magnitude $v^{2}/R$. Hence, even though $\frac{d}{dt}\left\|\vec{v}\right\|=0$, we have that $\left\|\frac{d\vec{v}}{dt}\right\|=v^{2}/R\neq0$. It follows that $\dot{v}\neq a$.

In general, given any vector $\vec{u}$ and any scalar $s$,

\[\left\|\frac{d\vec{u}}{ds}\right\|\neq\frac{d}{ds}\left\|\vec{u}\right\|.\]

In words, “The magnitude of a derivative is not, in general, equal to the derivative of a magnitude.” Informally, you cannot “pull” a derivative out of a magnitude (or vice versa).
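
The uniform circular motion example can be reproduced numerically. The sketch below assumes hypothetical values $R = 2$ and $v = 3$ and approximates both quantities with central differences:

```python
import math

R, v = 2.0, 3.0      # hypothetical radius and constant speed
omega = v / R        # angular rate for uniform circular motion

def velocity(t):
    """Velocity of a particle at position R (cos(omega t) i + sin(omega t) j)."""
    return (-v * math.sin(omega * t), v * math.cos(omega * t))

def speed(t):
    vx, vy = velocity(t)
    return math.hypot(vx, vy)

t, h = 0.4, 1e-5

# Derivative of the magnitude: d/dt ||v||.
d_speed = (speed(t + h) - speed(t - h)) / (2 * h)

# Magnitude of the derivative: ||dv/dt||.
vp, vm = velocity(t + h), velocity(t - h)
accel = tuple((p - m) / (2 * h) for p, m in zip(vp, vm))
accel_magnitude = math.hypot(accel[0], accel[1])

print("d/dt ||v|| =", d_speed)          # close to 0
print("||dv/dt||  =", accel_magnitude)  # close to v**2 / R = 4.5
```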

Changing lengths and directions

Two useful derivatives are the rates of change of a vector's length and direction:

Derivative of vector length.

\[\dot{a} = \dot{\vec{a}} \cdot \hat{a}\]

We start with the dot product expression #rvv-ed for length and differentiate it:

\[\begin{aligned} a &= \sqrt{\vec{a} \cdot \vec{a}} \\ \frac{d}{dt} a &= \frac{d}{dt} \big( (\vec{a} \cdot \vec{a})^{1/2} \big) \\ \dot{a} &= \frac{1}{2} (\vec{a} \cdot \vec{a})^{-1/2} (\dot{\vec{a}} \cdot \vec{a} + \vec{a} \cdot \dot{\vec{a}}) \\ &= \frac{1}{2\sqrt{a^2}} (2 \dot{\vec{a}} \cdot \vec{a}) \\ &= \dot{\vec{a}} \cdot \hat{a}.\end{aligned}\]

Derivative of vector direction.

\[\dot{\hat{a}} = \frac{1}{a} \operatorname{Comp}(\dot{\vec{a}}, \vec{a})\]

We take the definition #rvv-eu for the unit vector and differentiate it:

\[\begin{aligned} \hat{a} &= \frac{\vec{a}}{a} \\ \frac{d}{dt} \hat{a} &= \frac{d}{dt}\left(\frac{\vec{a}}{a}\right) \\ \dot{\hat{a}} &= \frac{\dot{\vec{a}} a - \vec{a} \dot{a}}{a^2} \\ &= \frac{\dot{\vec{a}}}{a} - \frac{\dot{\vec{a}} \cdot \hat{a}}{a^2} \vec{a} \\ &= \frac{1}{a} \big( \dot{\vec{a}} - (\dot{\vec{a}} \cdot \hat{a}) \hat{a} \big)\\ &= \frac{1}{a} \operatorname{Comp}(\dot{\vec{a}}, \vec{a}).\end{aligned}\] Here we observed at the end that we had the expression #rvv-em for the complementary projection of the derivative $\dot{\vec{a}}$ with respect to $\vec{a}$ itself.

An immediate consequence of the derivative of direction formula is that the derivative of a unit vector is always orthogonal to the unit vector:

Derivative of unit vector is orthogonal.

\[\dot{\hat{a}} \cdot \hat{a} = 0\]

From #rvc-eu we know that $\dot{\hat{a}}$ is in the direction of $\operatorname{Comp}(\dot{\vec{a}}, \vec{a})$, and from #rvv-er we know that this is orthogonal to $\vec{a}$ (and also $\hat{a}$).

Recall that we can always write a vector as the product of its length and direction, so $\vec{a} = a \hat{a}$. This gives the following decomposition of the derivative of $\vec{a}$.

Vector derivative decomposition.

\[\begin{aligned} \dot{\vec{a}} &= \underbrace{\dot{a} \hat{a}}_{\operatorname{Proj}(\dot{\vec{a}}, \vec{a})} + \underbrace{a \dot{\hat{a}}}_{\operatorname{Comp}(\dot{\vec{a}}, \vec{a})}\end{aligned}\]

Differentiating $\vec{a} = a \hat{a}$ and substituting in #rvv-el and #rvv-eu gives \[\begin{aligned} \dot{\vec{a}} &= \dot{a} \hat{a} + a \dot{\hat{a}} \\ &= ( \dot{\vec{a}} \cdot \hat{a} ) \hat{a} + a \frac{1}{a} \operatorname{Comp}(\dot{\vec{a}}, \vec{a}) \\ &= \operatorname{Proj}(\dot{\vec{a}}, \vec{a}) + \operatorname{Comp}(\dot{\vec{a}}, \vec{a}). \end{aligned}\]

Vector derivatives can be decomposed into length changes (projection onto $\vec{a}$) and direction changes (complementary projection). Compare to Figure #rvv-fu.
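
The length and direction formulas can be tried on concrete numbers. The following sketch assumes hypothetical values of $\vec{a}$ and $\dot{\vec{a}}$ at a single instant (they are not taken from the figure) and computes $\dot{a}$, $\dot{\hat{a}}$, and the two parts of the decomposition:

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def scale(c, u):
    return tuple(c * p for p in u)

def add(u, v):
    return tuple(p + q for p, q in zip(u, v))

def sub(u, v):
    return tuple(p - q for p, q in zip(u, v))

# Hypothetical sample values of a vector and its derivative at one instant.
a_vec = (3.0, 4.0, 0.0)
a_dot_vec = (1.0, -2.0, 5.0)

a_len = math.sqrt(dot(a_vec, a_vec))   # a = ||a|| = 5
a_hat = scale(1.0 / a_len, a_vec)      # unit vector in the direction of a

a_len_dot = dot(a_dot_vec, a_hat)      # rate of change of length
proj = scale(a_len_dot, a_hat)         # Proj(a_dot, a): length-changing part
comp = sub(a_dot_vec, proj)            # Comp(a_dot, a): direction-changing part
a_hat_dot = scale(1.0 / a_len, comp)   # rate of change of direction

# Recombine: a_dot = a_len_dot * a_hat + a_len * a_hat_dot.
recombined = add(scale(a_len_dot, a_hat), scale(a_len, a_hat_dot))

print("a_len_dot =", a_len_dot)
print("a_hat_dot . a_hat =", dot(a_hat_dot, a_hat))
print("recombined =", recombined)
```

For these sample values the length is shrinking ($\dot{a}$ is negative) while the direction is changing, and the derivative of the unit vector comes out orthogonal to the unit vector up to rounding.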

Integrating vector functions

The Riemann-sum definition of the vector integral is:

Vector integral.

\[ \int_0^t \vec{a}(\tau) \, d\tau = \lim_{N \to \infty} \underbrace{\sum_{i=1}^N \vec{a}(\tau_i) \Delta\tau}_{\vec{S}_N} \qquad \tau_i = \frac{(i - 1)\,t}{N} \qquad \Delta \tau = \frac{t}{N} \]

In the above definition $\vec{S}_N$ is the sum with $N$ intervals, written here using the left-hand edge $\tau_i$ in each interval.

Integral of a vector function $\vec{a}(t)$, together with the approximation using a Riemann sum.
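
A minimal sketch of the left-edge Riemann sum $\vec{S}_N$, assuming the interval $[0, t]$ is split into $N$ equal subintervals of width $t/N$ and using a hypothetical example vector $\vec{a}(\tau) = \cos\tau\,\hat\imath + \tau\,\hat\jmath$, whose integral is known in closed form:

```python
import math

def a(tau):
    """Hypothetical example vector a(tau) = cos(tau) i + tau j."""
    return (math.cos(tau), tau)

def left_riemann_sum(f, t, n):
    """S_N: sum of f(tau_i) * d_tau over n equal intervals of [0, t], using left edges."""
    d_tau = t / n
    total = [0.0, 0.0]
    for i in range(1, n + 1):
        tau_i = (i - 1) * d_tau        # left-hand edge of interval i
        value = f(tau_i)
        total[0] += value[0] * d_tau   # scale the vector and add it to the sum
        total[1] += value[1] * d_tau
    return tuple(total)

t = 2.0
exact = (math.sin(t), t * t / 2.0)     # closed-form integral of the example vector

segment_counts = (10, 100, 1000)
errors = []
for n in segment_counts:
    s_n = left_riemann_sum(a, t, n)
    errors.append(math.hypot(s_n[0] - exact[0], s_n[1] - exact[1]))

for n, err in zip(segment_counts, errors):
    print(f"N = {n}: error = {err:.2e}")
```

The error should fall roughly in proportion to $1/N$, which is typical of a one-sided Riemann sum.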

Just like vector derivatives, vector integrals only use the geometric concepts of scaling and addition, and do not rely on using a basis. If we do write a vector function in terms of a fixed basis, then we can integrate each component:

Vector integral in components.

\[ \int_0^t \vec{a}(\tau) \, d\tau = \left( \int_0^t a_1(\tau) \, d\tau \right) \,\hat\imath + \left( \int_0^t a_2(\tau) \, d\tau \right) \,\hat\jmath + \left( \int_0^t a_3(\tau) \, d\tau \right) \,\hat{k} \]

Consider a time-dependent vector $\vec{a}(t)$ written in components with a fixed basis: \[\vec{a}(t) = a_1(t) \,\hat\imath + a_2(t) \,\hat\jmath.\] Using the definition #rvc-ei of the vector integral gives: \[\begin{aligned} \int_0^t \vec{a}(\tau) \, d\tau &= \lim_{N \to \infty} \sum_{i=1}^N \vec{a}(\tau_i) \Delta\tau \\ &= \lim_{N \to \infty} \sum_{i=1}^N \left( a_1(\tau_i) \,\hat\imath + a_2(\tau_i) \,\hat\jmath \right) \Delta\tau \\ &= \lim_{N \to \infty} \left( \sum_{i=1}^N a_1(\tau_i) \Delta\tau \,\hat\imath + \sum_{i=1}^N a_2(\tau_i) \Delta\tau \,\hat\jmath \right) \\ &= \left( \lim_{N \to \infty} \sum_{i=1}^N a_1(\tau_i) \Delta\tau \right) \,\hat\imath + \left( \lim_{N \to \infty} \sum_{i=1}^N a_2(\tau_i) \Delta\tau \right) \,\hat\jmath \\ &= \left( \int_0^t a_1(\tau) \, d\tau \right) \,\hat\imath + \left( \int_0^t a_2(\tau) \, d\tau \right) \,\hat\jmath. \end{aligned}\] The final step uses the Riemann-sum definition of the regular scalar integrals of $a_1$ and $a_2$, applied to the limits in the second-to-last line.

Warning: Integrating each component is only valid if the basis is fixed.

Integrating a vector function by integrating each component separately is only valid if the basis vectors are not changing with time. If the basis vectors are changing then we must either transform to a fixed basis or otherwise take this change into account.

Example Problem: Integrating a vector function.

The vector $\vec{a}(t)$ is given by \[ \vec{a}(t) = \Big(2 \sin(t + 1) + t^2 \Big) \,\hat\imath + \Big(3 - 3 \cos(2t)\Big) \,\hat\jmath. \] What is $\int_0^t \vec{a}(\tau) \, d\tau$?

\[\begin{aligned} \int_0^t \vec{a}(\tau) \,d\tau &= \left(\int_0^t \Big(2 \sin(\tau + 1) + \tau^2 \Big) \,d\tau\right) \,\hat\imath + \left(\int_0^t \Big(3 - 3 \cos(2\tau)\Big) \,d\tau\right) \,\hat\jmath \\ &= \left[-2 \cos(\tau + 1) + \frac{\tau^3}{3} \right]_{\tau=0}^{\tau=t} \,\hat\imath + \left[3 \tau - \frac{3}{2} \sin(2\tau) \right]_{\tau=0}^{\tau=t} \,\hat\jmath \\ &= \left( -2\cos(t + 1) + 2 \cos(1) + \frac{t^3}{3}\right)\,\hat\imath + \left(3t - \frac{3}{2} \sin(2t)\right) \,\hat\jmath. \end{aligned}\]
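
The closed-form answer above can be cross-checked by integrating each component numerically. This is only a sketch: the midpoint rule, the number of segments, and the sample time $t = 1.5$ are arbitrary choices made for illustration.

```python
import math

def a(tau):
    """Components of a(tau) from the example problem, in the fixed i, j basis."""
    return (2.0 * math.sin(tau + 1.0) + tau ** 2,
            3.0 - 3.0 * math.cos(2.0 * tau))

def closed_form(t):
    """The result derived in the example problem."""
    return (-2.0 * math.cos(t + 1.0) + 2.0 * math.cos(1.0) + t ** 3 / 3.0,
            3.0 * t - 1.5 * math.sin(2.0 * t))

def midpoint_integral(f, t, n=2000):
    """Midpoint-rule approximation of the integral of f from 0 to t, per component."""
    d_tau = t / n
    sx = sy = 0.0
    for i in range(n):
        fx, fy = f((i + 0.5) * d_tau)
        sx += fx * d_tau
        sy += fy * d_tau
    return (sx, sy)

t = 1.5
numeric = midpoint_integral(a, t)
exact = closed_form(t)
max_difference = max(abs(p - q) for p, q in zip(numeric, exact))
print(numeric, exact, max_difference)
```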

Warning: The dummy variable of integration must be different from the limit variable.

In the coordinate integral expression #rvc-ei, it is important that the component expressions $a_1(t)$, $a_2(t)$ are rewritten with a different dummy variable such as $\tau$ when integrating. If we used $\tau$ for integration but kept $a_1(t)$ then we would obtain \[ \int_0^t a_1(t) \,d\tau = \left[a_1(t) \, \tau\right]_{\tau = 0}^{\tau = t} = a_1(t) \, t, \] which is not what we mean by the integral. Alternatively, if we leave everything as $t$ then we would obtain \[ \int_0^t a_1(t) \,dt \] which is a meaningless expression, because $t$ would be both the upper limit and the dummy variable, and a dummy variable must only appear inside the integral.
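
The difference between the intended integral and the mistaken one can be seen with a concrete function. The sketch below assumes a hypothetical component $a_1(s) = s^2$ and an upper limit $t = 3$, for which the intended integral is $t^3/3 = 9$ while the mistaken expression gives $a_1(t)\,t = 27$:

```python
def a1(s):
    """Hypothetical component function a_1(s) = s^2."""
    return s * s

def integrate(g, t, n=1000):
    """Midpoint-rule approximation of the integral of g(tau) for tau from 0 to t."""
    d_tau = t / n
    return sum(g((i + 0.5) * d_tau) for i in range(n)) * d_tau

t = 3.0

# Intended: the integrand is evaluated at the dummy variable tau.
intended = integrate(lambda tau: a1(tau), t)

# Mistaken: the integrand is frozen at the limit t, so it acts as a constant.
mistaken = integrate(lambda tau: a1(t), t)

print("intended:", intended)  # close to 9
print("mistaken:", mistaken)  # close to 27
```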
