In this drama of mathematics and physics, which fertilize each other in the dark, but which prefer to deny and misconstrue each other face to face—I cannot, however, resist playing the role of a messenger, albeit, as I have abundantly learned, often an unwelcome one. ~ Hermann Weyl
In the summer of 2023, I learned some modern differential geometry and started reading about the mathematics of Classical Mechanics. While I was recalling things properly and aiming at a coherent picture of the subject for myself, I was ignited by Tanush’s talk on the Mathematics of General Relativity here at NISER, and decided to write up what I had managed to put together over the summer, i.e. the mathematics of Classical Mechanics, in the form of a website, since it gives me better control of aesthetics and also helps in communication, all while recalling things for myself! So, this page is a result of me indulging in getting creative while seeking coherence. The page needs a little refinement in various places, but its main purpose is served.
Classical mechanics describes systems with finitely many interacting particles. (A particle is a material body whose dimensions may be neglected in describing its motion.) A system is closed if its particles do not interact with the outside material bodies. The position of a system in space is specified by the position of its particles and defines a point in a smooth, finite-dimensional manifold $M$, the configuration space of a system. Coordinates on $M$ are called generalized coordinates of a system, and the dimension $n = \dim M$ is called the number of degrees of freedom. Systems with infinitely many degrees of freedom are described by Classical Field Theory (which is for another summer to probe into).
The above is quoted directly from Chapter-1 of Takhtajan's Quantum Mechanics for Mathematicians.
The state of a system at any instant of time is described by a point $q \in M$ and by a tangent vector $v \in T_qM$ at this point.
The motion of the system in space is captured by the trajectory $\gamma:\mathbb{R}\to M$. The position of the particles in the system at time $t$ is denoted by $q(t) := \gamma(t) \in M$. Nature makes $\ddot{q}(t)$ depend on the state of the system in a certain way, and it is the aim of classical mechanics to discuss this dependence. This is cast in terms of the Principle of Least Action, leading to the Euler-Lagrange equations.
A Lagrangian system $(M,L)$ on a configuration space $M$ is defined by a smooth function $L:TM\times\mathbb{R}\to \mathbb{R}$ called the Lagrangian function. $TM$ is the tangent bundle of the manifold $M$, and the factor $\mathbb{R}$ in the domain corresponds to time.
Definition: A topological space $(M,\mathcal{O})$ is a $d$-dimensional topological manifold if it is paracompact and Hausdorff, and for every point $p \in M$ there exist a neighbourhood $U_p$ of $p$ and a homeomorphism $x:U_p \to x(U_p) \subseteq \mathbb{R}^d.$
If $M$ is a set, a topology on $M$ is a set $\mathcal{O} \subseteq \mathcal{P}(M)$, i.e. a collection of subsets of $M$ such that: (i) $\emptyset \in \mathcal{O}$ and $M \in \mathcal{O}$; (ii) the union of any family of members of $\mathcal{O}$ is again in $\mathcal{O}$; (iii) the intersection of finitely many members of $\mathcal{O}$ is again in $\mathcal{O}$.
A topological space is an ordered pair $(M,\mathcal{O})$ consisting of the set and the topology. We omit $\mathcal{O}$ and simply denote the topological space by its set $M$ from here on. A set $U \subseteq M$ is called an open set of $M$ if $U \in \mathcal{O}.$
By a neighbourhood $U_p$ of $p$, we mean $U_p \ni p$ and $U_p \in \mathcal{O}.$
We do not delve into the Hausdorff (a kind of separation axiom) and paracompactness properties of manifolds for now; the basic game can still be understood without probing them.
Consider a d-dimensional manifold $\left(M,\mathcal{O}\right)$.
Definition: A pair $(U,x)$ where $U \in \mathcal{O}$ and $x: U \rightarrow x(U) \subseteq \mathbb{R}^d$ is a homeomorphism, is said to be a $chart$ of the manifold.
The component functions of the map $x$, i.e. \begin{align} x^i:\ &U\rightarrow\mathbb{R}\\ &p \mapsto \mathrm{proj}_i(x(p)) ; \quad 1 \le i \le d, \end{align} give the coordinates $\left(x^1(p),\dots,x^d(p)\right)$ of the point $p$ on the manifold with respect to the chart $(U,x)$.
Definition: A collection of charts $\mathcal{A} := \{(U_\alpha,x_\alpha)\mid \alpha \in A\}$ is called an $atlas$ if $\bigcup_{\alpha\in A} U_\alpha = M$.
Since $x$ and $y$ are homeomorphisms, so is the chart transition map $y\circ x^{-1}$ on the overlap of their domains; hence any two charts on the manifold are $C^0-compatible$.
Definition: A $C^k-manifold$ is a 3-tuple $\left(M,\mathcal{O},\mathcal{A}\right)$, where $\mathcal{A}$ is a maximal $C^k-atlas$.
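As an illustration (an example added here, not taken from the source text), consider the circle $S^1 = \{(a,b)\in\mathbb{R}^2 \mid a^2+b^2 = 1\}$ with the two stereographic charts $(U,x)$ and $(V,y)$, where $U = S^1\setminus\{(0,1)\}$ and $V = S^1\setminus\{(0,-1)\}$. The names and the choice of charts are assumptions made for this sketch. \begin{align} x(a,b) &= \frac{a}{1-b}, \qquad y(a,b) = \frac{a}{1+b}, \\ x(a,b)\,y(a,b) &= \frac{a^2}{1-b^2} = 1 \quad \text{on } U\cap V, \\ (y\circ x^{-1})(s) &= \frac{1}{s}, \qquad s \in x(U\cap V) = \mathbb{R}\setminus\{0\}. \end{align} Since $U\cup V = S^1$, the two charts form an atlas, and the transition map $s\mapsto 1/s$ is smooth on $\mathbb{R}\setminus\{0\}$ (as is its inverse), so these charts are compatible to every order $k$.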
We develop the abstraction for the tangent space by looking at a specific function of a vector in a Euclidean space. After a formal definition, we will prove a geometric result that forms the basis of our discussion on Classical Mechanics.
In a Euclidean space $\mathbb{R}^n$, one can define a geometric tangent space at a point $p \in \mathbb{R}^n$ as
\begin{align} \mathbb{R}^n_p := \{(p,v)\mid v \in \mathbb{R}^n\} \end{align}
Look at a specific function associated with each element of $\mathbb{R}^n_p$ (call it a geometric tangent vector, say $v_p \in \mathbb{R}^n_p$):
For smooth functions $f: \mathbb{R}^n \rightarrow \mathbb{R}$, the directional derivative along $v$ at $p$ gives the following linear map (summation over the repeated index $i$ is implied), \begin{align} D_v\mid_p:C^\infty(\mathbb{R}^n) \rightarrow &\ \mathbb{R} \\ f \mapsto &\ D_vf(p) \\ &=\frac{d}{dt}\Bigr\vert_{t=0} f(p + tv) \\ &=v^i \frac{\partial f}{\partial x^i}(p) \end{align}
We thus recognize that a geometric tangent vector defines a linear map, more specifically a derivation on $C^\infty(\mathbb{R}^n)$.
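A small worked instance may make the formula concrete. The function, point and vector below are chosen purely for illustration: take $n=2$, $f(x,y) = x^2y$, $p=(1,2)$ and $v=(3,4)$. \begin{align} D_v f(p) &= \frac{d}{dt}\Bigr\vert_{t=0} (1+3t)^2(2+4t) = \left[6(1+3t)(2+4t) + 4(1+3t)^2\right]_{t=0} = 16, \\ v^i\frac{\partial f}{\partial x^i}(p) &= 3\,(2xy)\Bigr\vert_{(1,2)} + 4\,(x^2)\Bigr\vert_{(1,2)} = 12 + 4 = 16. \end{align} Both routes agree, as expected.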
Definition: If $p \in \mathbb{R}^n$, a map $w:C^\infty(\mathbb{R}^n) \rightarrow \mathbb{R}$ is called a derivation at $p$ if it is linear over $\mathbb{R}$ and satisfies the product rule: \begin{equation} w(fg) = f(p)\,wg + g(p)\,wf \label{leib} \end{equation}
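To see that the directional derivative $D_v\mid_p$ from above fits this definition, one can check the product rule directly. This is just the ordinary Leibniz rule for partial derivatives, written out as a sketch: \begin{align} D_v\mid_p(fg) &= v^i\frac{\partial (fg)}{\partial x^i}(p) \\ &= v^i\left(f(p)\frac{\partial g}{\partial x^i}(p) + g(p)\frac{\partial f}{\partial x^i}(p)\right) \\ &= f(p)\,D_v\mid_p g + g(p)\,D_v\mid_p f. \end{align} A useful consequence of the product rule, valid for any derivation $w$ at $p$, is that constants are annihilated: $w(1) = w(1\cdot 1) = 2\,w(1)$ forces $w(1)=0$, and hence $w(c) = c\,w(1) = 0$ by linearity.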
Denote the set of all the derivations at point $p$ by $T_p(\mathbb{R}^n)$, the tangent space at point $p$.
Claim: The set of all derivations at $p$ is a vector space.
Claim: $T_p(\mathbb{R}^n) \cong \mathbb{R}^n_p$.
Claim: The $n$ derivations, \begin{align} \frac{\partial}{\partial x^1} \Bigr\vert_p, \dots , \frac{\partial}{\partial x^n} \Bigr\vert_p \end{align} where $\frac{\partial}{\partial x^i}\Bigr\vert_pf = \frac{\partial f}{\partial x^i}(p)$, form a basis for $T_p(\mathbb{R}^n)$.
We skip the proofs of the above claims and prove similar results for a general smooth manifold $M$.
With the above motivation, we now define the tangent space at a point of a smooth manifold $M$.
Definition: Let $M$ be a smooth manifold and $p$ a point of $M$. A derivation at $p$ is a linear map $w:C^\infty(M)\rightarrow \mathbb{R}$ satisfying the product rule $\eqref{leib}$. The set of all derivations of $C^\infty(M)$ at $p$, denoted by $T_pM$, is a vector space called the tangent space to $M$ at $p$. An element of $T_pM$ is called a tangent vector at $p$.
Definition: If $M$ and $N$ are smooth manifolds and $F:M \rightarrow N$ is a smooth map, then for each $p \in M$ we define a map \begin{align} dF_p:T_pM\rightarrow &\ T_{F(p)}N \\ v \mapsto &\ dF_p(v), \qquad dF_p(v)(f) := v(f\circ F), \end{align} called the differential of $F$ at $p$, where $f \in C^\infty(N)$.
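A quick consequence of this definition, used repeatedly in what follows, is the chain rule. As a sketch, assume a second smooth map $G:N\rightarrow P$ into a smooth manifold $P$ (the names $G$ and $P$ are introduced here only for illustration). For $v\in T_pM$ and $f\in C^\infty(P)$: \begin{align} d(G\circ F)_p(v)(f) &= v(f\circ G\circ F) \\ &= dF_p(v)(f\circ G) \\ &= dG_{F(p)}\left(dF_p(v)\right)(f), \end{align} so $d(G\circ F)_p = dG_{F(p)}\circ dF_p$. In particular, if $F$ is a diffeomorphism, then $dF_p$ is invertible with $(dF_p)^{-1} = d(F^{-1})_{F(p)}$.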
Injection: It suffices to show that $\ker d\imath_p = \{0\}$. Consider $v \in T_pU$ such that $d\imath_p(v) = 0 \in T_pM.$ Let $B$ be a neighbourhood of $p$ such that $\overline{B} \subseteq U$. Let $f \in C^\infty(U)$ be arbitrary. We can extend $f$ to $\tilde{f} \in C^\infty(M)$ with $f = \tilde{f}$ on $\overline{B}$. Since $f$ and $\tilde{f}\Bigr\vert_U$ agree on a neighbourhood of $p$, the previous theorem gives \begin{align} vf = v(\tilde{f}\Bigr\vert_U) = v(\tilde{f}\circ \imath) = d\imath_p(v)\tilde{f} = 0. \end{align} Since this is true for arbitrary $f$, we have $v = 0$. Thus $\ker d\imath_p = \{0\}$.
Surjection: Consider an arbitrary $w \in T_pM$. Define the following map, \begin{align} v:C^\infty(U) \rightarrow &\ \mathbb{R} \\ f \mapsto &\ w\tilde{f} \end{align} where $\tilde{f} \in C^\infty(M)$ is an extension of $f$ to the whole of $M$, with $\tilde{f} = f$ on $\overline{B}$. By the previous theorem, $w\tilde{f}$ does not depend on the choice of extension, because any two such extensions agree on the neighbourhood $B$ of $p$; thus $v$ is well defined, and it is a derivation since $w$ is. Hence for every $w \in T_pM$ we have $v \in T_pU$ such that, for any $g \in C^\infty(M)$, \begin{align} d\imath_p(v)g = v(g\circ\imath) = w(\widetilde{g\circ\imath}) = wg, \end{align} where the last equality holds because $\widetilde{g\circ\imath}$ and $g$ agree on $B$. Thus $d\imath_p$ is surjective.
Hence, $T_pU \cong T_pM$.
We thus recognize that $d\imath_p(v)$ is the same derivation as $v$, acting on functions defined on the larger space but giving the same output due to the locality proved in the previous theorem.
We know that $T_pM \cong T_{\varphi(p)}\mathbb{R}^n$. So, the basis $$\left\{\frac{\partial}{\partial x^1}\Bigr\vert_{\varphi(p)},\dots,\frac{\partial}{\partial x^n}\Bigr\vert_{\varphi(p)}\right\}$$ of $T_{\varphi(p)}\mathbb{R}^n$ gives a basis of $T_pM$ via $(d\varphi_p)^{-1}$, for which we use the same notation, $\left\{\frac{\partial}{\partial x^1}\Bigr\vert_p,\dots,\frac{\partial}{\partial x^n}\Bigr\vert_p\right\}$, as follows: \begin{align} \frac{\partial}{\partial x^i}\Bigr\vert_p &= (d\varphi_p)^{-1} \left(\frac{\partial}{\partial x^i}\Bigr\vert_{\varphi(p)}\right) \\ &= d(\varphi^{-1})_{\varphi(p)} \left(\frac{\partial}{\partial x^i}\Bigr\vert_{\varphi(p)}\right) \\ \implies \frac{\partial}{\partial x^i}\Bigr\vert_p f &= \frac{\partial}{\partial x^i}\Bigr\vert_{\varphi(p)} \left(f\circ\varphi^{-1}\right)\\ &= \frac{\partial \widehat{f}}{\partial x^i}(\widehat{p}) \end{align} where $\widehat{f} \equiv f\circ\varphi^{-1}$ and $\widehat{p} = \varphi(p)$. The vectors $\frac{\partial}{\partial x^i}\Bigr\vert_p$ are called the coordinate vectors at $p$ associated with the given coordinate chart. We can thus write a vector $v \in T_pM$ as $$v = v^i \frac{\partial}{\partial x^i}\Bigr\vert_p,$$ where $v^1,\dots,v^n$ are the components of the vector $v$. Note that $v^j = v(x^j)$, where $x^j$ is the $j$-th coordinate function, \begin{align} x^j: U \rightarrow &\ \mathbb{R}\\ q \mapsto &\ \delta^j_i \varphi(q)^i \end{align}
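The identity $v^j = v(x^j)$ can be checked in one line from the formula for the coordinate vectors. The computation below is a sketch using only the definitions above, noting that $x^j\circ\varphi^{-1}$ is the $j$-th standard coordinate function on $\mathbb{R}^n$: \begin{align} v(x^j) = v^i\frac{\partial}{\partial x^i}\Bigr\vert_p x^j = v^i\frac{\partial \left(x^j\circ\varphi^{-1}\right)}{\partial x^i}(\varphi(p)) = v^i\,\delta^j_i = v^j. \end{align}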
$d\widehat{F}_{\widehat{p}}$, where $\widehat{F} = \psi\circ F\circ\varphi^{-1}$ is the coordinate representation of $F$, is simply given by the Jacobian matrix, which relates vectors under coordinate transformations in Euclidean spaces. (One can check this easily using the definition of the differential and the chain rule.)
Since $F \circ \varphi^{-1} = \psi^{-1}\circ\widehat{F}$, \begin{align} dF_p\left(\frac{\partial}{\partial x^i}\Bigr\vert_p\right) &= dF_p\left(d(\varphi^{-1})_{\widehat{p}}\left(\frac{\partial}{\partial x^i}\Bigr\vert_{\widehat{p}}\right)\right) = d(\psi^{-1})_{\widehat{F}(\widehat{p})}\left(d\widehat{F}_{\widehat{p}}\left(\frac{\partial}{\partial x^i}\Bigr\vert_{\widehat{p}}\right)\right) \\ &=d(\psi^{-1})_{\widehat{F}(\widehat{p})} \left(\frac{\partial \widehat{F}^j}{\partial x^i}(\widehat{p})\frac{\partial}{\partial y^j}\Bigr\vert_{\widehat{F}(\widehat{p})}\right) \\ &= \frac{\partial \widehat{F}^j}{\partial x^i}(\widehat{p})\frac{\partial}{\partial y^j}\Bigr\vert_{F(p)} \end{align} Just like the Euclidean case! We will also call the differential the pushforward, for the intuitive reason that it pushes tangent vectors forward along the smooth map $F$. We will also define the so-called pullback in the next section.
Consider a curve on a smooth $n$-dimensional manifold $M$, $\gamma:J \subseteq \mathbb{R} \rightarrow M$, with the velocity vector at $t_0$, $\gamma'(t_0)$, defined as \begin{align} T_{\gamma(t_0)}M \ni \gamma'(t_0):= d\gamma_{t_0}\left(\frac{\partial}{\partial t}\Bigr\vert_{t_0}\right) \implies \gamma'(t_0)f= \frac{\partial}{\partial t} (f\circ\gamma) \Bigr\vert_{t_0}. \end{align} Let $\widehat{\gamma} = x\circ\gamma\circ id_J = x\circ\gamma =: (\gamma^1,\dots,\gamma^n)$, a curve in $\mathbb{R}^n$, where $(J,id_J)$ is the chart of $J$ onto $J\subseteq \mathbb{R}$, and $(U_{\gamma(t_0)},x)$ is a chart from a neighbourhood $U_{\gamma(t_0)}$ of $\gamma(t_0)$ onto $x(U_{\gamma(t_0)}) \subseteq \mathbb{R}^n$. Sometimes we also abuse notation by writing $\gamma(t) = (\gamma^1(t),\dots,\gamma^n(t))$, which actually stands for $x^{-1}(\gamma^1(t),\dots,\gamma^n(t))$.
Using the coordinate representation of the differential, we have \begin{align} \gamma'(t_0) = \frac{\partial \gamma^i}{\partial t}\Bigr\vert_{t_0}\frac{\partial}{\partial x^i}\Bigr\vert_{\gamma(t_0)} \end{align} The velocity vector at a point on the curve is a tangent vector at that point whose components in the coordinate basis are the derivatives of the component functions of the curve. We also denote this vector by $\frac{d \gamma}{dt}(t_0)$ or $\dot{\gamma}(t_0)$. We will show that every tangent vector is the velocity vector of some curve on $M$.
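For a concrete instance (an illustrative choice, not from the source), take $M=\mathbb{R}^2$ with the identity chart and coordinates $(x^1,x^2)$, and the curve $\gamma(t) = (\cos t,\sin t)$: \begin{align} \gamma'(t_0) &= -\sin t_0\,\frac{\partial}{\partial x^1}\Bigr\vert_{\gamma(t_0)} + \cos t_0\,\frac{\partial}{\partial x^2}\Bigr\vert_{\gamma(t_0)}, \\ \gamma'(t_0)f &= -\sin t_0\,(2\cos t_0) + \cos t_0\,(2\sin t_0) = 0 \quad \text{for } f = (x^1)^2+(x^2)^2, \end{align} in agreement with $\frac{\partial}{\partial t}(f\circ\gamma)\Bigr\vert_{t_0} = 0$, since $f\circ\gamma \equiv 1$ along this curve.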
For two charts $(U,\varphi)$ and $(V,\psi)$ with $p \in U\cap V$, the transition map is \begin{align} \psi\circ\varphi^{-1} : \varphi(U\cap V) \longrightarrow &\ \psi(U \cap V) \\ x \longmapsto &\ \psi\circ\varphi^{-1}(x) =: (\widetilde{x}^1(x),\dots,\widetilde{x}^n(x)) \end{align} The differential between the Euclidean spaces gives the following, \begin{align} d(\psi\circ\varphi^{-1})_{\varphi(p)}\left(\frac{\partial}{\partial x^i}\Bigr\vert_{\varphi(p)}\right) = \frac{\partial \widetilde{x}^j}{\partial x^i}\Bigr\vert_{\varphi(p)}\frac{\partial}{\partial \widetilde{x}^j}\Bigr\vert_{\psi(p)} \end{align} A coordinate vector at $p$ associated with the chart $(U,\varphi)$ can be written as, \begin{align} \frac{\partial}{\partial x^i}\Bigr\vert_p &= d(\varphi^{-1})_{\varphi(p)}\left(\frac{\partial}{\partial x^i}\Bigr\vert_{\varphi(p)}\right) \\ &= d(\psi^{-1})_{\psi(p)}\, d(\psi\circ\varphi^{-1})_{\varphi(p)}\left(\frac{\partial}{\partial x^i}\Bigr\vert_{\varphi(p)}\right) \\ &= \frac{\partial \widetilde{x}^j}{\partial x^i}(\varphi(p))\frac{\partial}{\partial \widetilde{x}^j}\Bigr\vert_p \end{align} Writing $v = v^i \frac{\partial}{\partial x^i}\Bigr\vert_p = \widetilde{v}^j\frac{\partial}{\partial \widetilde{x}^j}\Bigr\vert_p$, we thus have the following transformation law for the components of a tangent vector at $p \in U\cap V$ under a change of coordinates: \begin{align} \widetilde{v}^i = \frac{\partial \widetilde{x}^i}{\partial x^j}\Bigr\vert_{\varphi(p)}v^j. \end{align}
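As a hedged illustration of this law, consider $\mathbb{R}^2$ with the non-negative horizontal axis removed, a polar chart with coordinates $(x^1,x^2) = (r,\theta)$, $r>0$, $\theta\in(0,2\pi)$, and the Cartesian chart with coordinates $(\widetilde{x}^1,\widetilde{x}^2) = (r\cos\theta, r\sin\theta)$. This choice of charts is an assumption made for the example. Differentiating the transition map gives \begin{align} \frac{\partial}{\partial r}\Bigr\vert_p &= \cos\theta\,\frac{\partial}{\partial \widetilde{x}^1}\Bigr\vert_p + \sin\theta\,\frac{\partial}{\partial \widetilde{x}^2}\Bigr\vert_p, \\ \frac{\partial}{\partial \theta}\Bigr\vert_p &= -r\sin\theta\,\frac{\partial}{\partial \widetilde{x}^1}\Bigr\vert_p + r\cos\theta\,\frac{\partial}{\partial \widetilde{x}^2}\Bigr\vert_p, \\ \widetilde{v}^1 &= \cos\theta\, v^r - r\sin\theta\, v^\theta, \qquad \widetilde{v}^2 = \sin\theta\, v^r + r\cos\theta\, v^\theta, \end{align} where $v = v^r \frac{\partial}{\partial r}\Bigr\vert_p + v^\theta \frac{\partial}{\partial \theta}\Bigr\vert_p$ and $(r,\theta)$ are the polar coordinates of $p$.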
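Consider a chart $(U,\varphi)$ centered at $p$ (i.e. $p \in U$ and $\varphi(p) = 0$). Write $v = v^i \frac{\partial}{\partial x^i}\Bigr\vert_p$. Then, since $\varphi(U)$ is open, $\exists\, \epsilon > 0$ such that $(tv^1,\dots,tv^n) \in \varphi(U)$ for $|t| < \epsilon$, and the curve \begin{align} \gamma:(-\epsilon,\epsilon) \rightarrow &\ U \\ t \mapsto &\ \varphi^{-1}(tv^1,\dots,tv^n) \end{align} is smooth with $\gamma(0) = p$ and $\gamma'(0) = \frac{d(tv^i)}{dt}\Bigr\vert_{t=0}\frac{\partial}{\partial x^i}\Bigr\vert_{\gamma(0)} = v^i\frac{\partial}{\partial x^i}\Bigr\vert_{\gamma(0)} = v.$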
A Lagrangian system $(M,L)$ on a configuration space $M$, a smooth $n$-dimensional manifold, is defined by a smooth function $L:TM\times\mathbb{R}\to \mathbb{R}$ called the Lagrangian function or Lagrangian. We now formulate the Principle of Least Action, which describes the dynamics of a Lagrangian system.
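A standard example, stated here only as an illustration: for a single particle of mass $m$ moving in $M = \mathbb{R}^n$ under a potential $V$ (the symbols $m$ and $V$ are assumptions for this example and are not introduced in the text above), one takes \begin{align} L(q,v,t) = \frac{1}{2}m\,|v|^2 - V(q), \qquad (q,v) \in T\mathbb{R}^n \cong \mathbb{R}^n\times\mathbb{R}^n, \end{align} which happens to have no explicit dependence on $t$.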
Consider the space of smooth paths on $M$ with fixed end points,
\begin{align} P(M)^{q_1,t_1}_{q_0,t_0} := \{\gamma:[t_0,t_1] \to M \mid\gamma(t_0) = q_0,\gamma(t_1) = q_1 \}. \end{align} We consider a variation $\Gamma$ of $\gamma \in P(M)$ to be a smooth 1-parameter family of paths on $M$, $\{\gamma_\epsilon\}$ with $\gamma_\epsilon \in P(M)$ and $\gamma_0 = \gamma$. We define the variational derivative of a functional $F: P(M) \rightarrow \mathbb{R}$ on the space of paths in $M$ as \begin{align} \delta F = \frac{d}{d\epsilon} F(\gamma_\epsilon)\Bigr\vert_{\epsilon=0} \end{align} We also have the tangential lift of the path $\gamma$ to $TM$, \begin{align} \gamma': [t_0,t_1] \rightarrow &\ TM \\ t \mapsto &\ (\gamma(t), \dot{\gamma}(t)) \end{align} The notation is familiar from the previous sections, where we denote the velocity vector of the curve $\gamma$ by $\dot{\gamma}(t)$, with $\gamma(t) = (\gamma^1,\dots,\gamma^n)$ and $\dot{\gamma}(t) = \frac{d\gamma^i}{dt} \frac{\partial}{\partial x^i}\Bigr\vert_{\gamma(t)}$ with respect to a coordinate chart around $\gamma(t)$. We have thus lifted the path $\gamma$ to $TM$, where we now have information about both the position $p$ in $M$ and its velocity in $T_pM$ as a function of time, and where the Lagrangian does its magic. Thus, in standard coordinates on $TM$, we have $\gamma'(t) = (\gamma^1,\dots,\gamma^n,\dot{\gamma}^1,\dots,\dot{\gamma}^n)$, where the dots now indicate time derivatives.
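To make the variational derivative concrete, here is a sketch under simplifying assumptions: take $M=\mathbb{R}^n$, the variation $\gamma_\epsilon(t) = \gamma(t) + \epsilon\,\eta(t)$ with $\eta$ smooth and $\eta(t_0)=\eta(t_1)=0$ (so that every $\gamma_\epsilon$ has the same end points), and the illustrative functional $F(\gamma) = \int_{t_0}^{t_1}\frac{1}{2}|\dot{\gamma}(t)|^2\,dt$. Neither $\eta$ nor this particular $F$ comes from the text above. \begin{align} \delta F &= \frac{d}{d\epsilon}\Bigr\vert_{\epsilon=0}\int_{t_0}^{t_1}\frac{1}{2}\left|\dot{\gamma}(t)+\epsilon\,\dot{\eta}(t)\right|^2 dt \\ &= \int_{t_0}^{t_1}\dot{\gamma}(t)\cdot\dot{\eta}(t)\,dt \\ &= \Bigl[\dot{\gamma}\cdot\eta\Bigr]_{t_0}^{t_1} - \int_{t_0}^{t_1}\ddot{\gamma}(t)\cdot\eta(t)\,dt = -\int_{t_0}^{t_1}\ddot{\gamma}(t)\cdot\eta(t)\,dt. \end{align} The boundary term vanishes because the end points are fixed.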